## Problem Statement


In the world of supply chain management, accurately predicting future demand and optimizing inventory levels are critical for efficient operations and customer satisfaction.

The provided dataset contains historical data related to a specific product (Product_ID: P1) over a period of time. The data includes information on the date, product demand, and current inventory levels. This dataset serves as the foundation for analyzing demand forecasting and inventory optimization techniques.

## Process Overview:
1. Preprocess the dataset.
2. Analyze demand and inventory trends.
3. Test for stationarity (ADF Test).
4. Model demand using SARIMAX and forecast.
5. Optimize inventory with Newsvendor formula and reorder point.
6. Calculate safety stock and total cost.
7. Develop inventory management strategy.



## Data:

The dataset contains daily information for a product's demand and inventory levels. Below is a brief overview of the columns:

- Date: The date corresponding to the entry.
- Product_ID: The unique identifier for the product (e.g., P1).
- Demand: The number of units demanded on that particular date.
- Inventory: The number of units available in stock on that date.

**Data Source**: [demand_forecasting](<../../OneDrive/Documents/Project 365/Python/Datasets/demand_inventory.csv>)












## Table of Contents:

1. Import the necessary libraries and load the dataset.
2. Preprocess the dataset by removing unnecessary columns.
3. Visualize demand and inventory over time.
4. Perform a stationarity test (ADF Test).
5. Difference the demand data to make it stationary.
6. Plot ACF and PACF for ARIMA/SARIMA parameter selection.
7. Fit a SARIMAX model and forecast demand.
8. Create future demand predictions for the next 10 days.
9. Calculate the optimal order quantity using the Newsvendor formula.
10. Compute the reorder point considering lead time and variability.
11. Determine safety stock to account for demand uncertainty.
12. Calculate the total cost (holding cost + stockout cost).
13. Conclusion


In [None]:
!pip install plotly

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
%matplotlib inline
import statsmodels.api as sm 
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.statespace.sarimax import SARIMAX


In [None]:
df = pd.read_csv(r'C:\Users\ajayk\Downloads\Demand-Forecasting-and-Inventory-Optimization\demand_inventory.csv')

In [None]:
df = df.drop(columns=['Unnamed: 0'])

VISUALIZATION 

In [None]:
plt.figure(figsize=(16,5))
plt.plot(df['Date'],df['Demand'],color='green', marker='o', linestyle='dashed')
plt.xlabel('Date')
plt.ylabel('Demand')
plt.title('Demand over time')
plt.grid(True)
plt.xticks(rotation=90)
plt.show()

In [None]:
import plotly.express as px

In [None]:


fig_demand = px.line(df, x='Date',
                     y='Demand',
                     title='Demand Over Time')
fig_demand.show()

In [None]:

fig_inventory = px.line(df, x='Date',
                     y='Inventory',
                     title='Inventory Over Time')
fig_inventory.show()

DEMAND FORECASTING 

In [None]:
from statsmodels.tsa.stattools import adfuller

In [None]:
def adf_test(series):
    # Augmented Dickey-Fuller test; null hypothesis: the series has a unit root (non-stationary)
    result = adfuller(series)
    print('ADF statistic: {}'.format(result[0]))
    print('p-value: {}'.format(result[1]))
    if result[1] <= 0.05:
        print("Strong evidence against the null hypothesis: reject it. The data has no unit root and is stationary.")
    else:
        print("Weak evidence against the null hypothesis: the time series has a unit root, indicating it is non-stationary.")

In [None]:
adf_test(df["Demand"])

Differencing to make the data stationary 

In [None]:
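# The first row of the differenced column is NaN; calling dropna() before
# assigning has no effect because pandas re-aligns the result on the index
df['1st_diff_demand'] = df['Demand'].diff()
df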

In [None]:
fig_demand = px.line(df, x='Date',
                     y='1st_diff_demand',
                     title='First-Differenced Demand Over Time')
fig_demand.show()

In [None]:
# Prepare the data for time series modelling (SARIMAX, since the data shows seasonality) and choose p, d, q using the ACF and PACF plots

In [None]:
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')


In [None]:
# 'Date' was already parsed to datetime in the previous cell (format '%Y-%m-%d'),
# so it can be used as the index directly
time_series = df.set_index('Date')['Demand']

differenced_series = time_series.diff().dropna()

# Plot ACF and PACF of differenced time series
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(differenced_series, ax=axes[0])
plot_pacf(differenced_series, ax=axes[1])
plt.show()

Here we first calculated the differenced time series from the original time_series, removed any NaN values resulting from the differencing, and then plotted the ACF and PACF to provide insights into the potential order of the AR and MA components in the time series. These plots are useful for determining the appropriate parameters when using the SARIMA model for time series forecasting.

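Based on the plots, the ACF cuts off at lag 1, suggesting q=1, and the PACF cuts off at lag 3, suggesting p=3. One round of differencing was applied, so d=1. Note that the model fitted in the next cell uses order (1, 1, 1), i.e. p=1, which is a simpler specification than the p=3 read from the PACF plot.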
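To make the idea behind the ACF plot concrete, here is a minimal sketch that computes the sample autocorrelation of a differenced series by hand. The data are synthetic (a linear trend plus a weekly pattern), not the notebook's dataset, and `sample_acf` is a hypothetical helper written for illustration; `plot_acf` remains the tool used in this notebook.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # centre the series
    denom = np.dot(x, x)                  # lag-0 autocovariance (unnormalised)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Hypothetical daily demand: linear trend plus a repeating 7-day pattern
days = np.arange(63)
weekly_pattern = np.array([0, 5, 10, 5, 0, -10, -10])
demand = 100 + 0.5 * days + weekly_pattern[days % 7]

differenced = np.diff(demand)             # first difference removes the linear trend
acf_values = sample_acf(differenced, max_lag=10)

for lag, value in enumerate(acf_values):
    print(f"lag {lag:2d}: {value:+.3f}")
```

In this synthetic series the autocorrelation is large again at lag 7, which is how a weekly cycle would show up in daily data. A spike like that is one way to sanity-check the seasonal period passed to SARIMAX.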

In [None]:
order = (1, 1, 1)
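# Seasonal period s=2 is the author's choice, made because the data spans only 2 months.
# Note that s is the number of observations per seasonal cycle (e.g. 7 for a weekly
# pattern in daily data), so this value is worth revisiting against the ACF plot.
seasonal_order = (1, 1, 1, 2)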
model = SARIMAX(time_series, order=order, seasonal_order=seasonal_order)
model_fit = model.fit(disp=False)

future_steps = 10
predictions = model_fit.predict(len(time_series), len(time_series) + future_steps - 1)
predictions = predictions.astype(int)
print(predictions)

INVENTORY OPTIMIZATION     

Optimize inventory according to the forecasted demand for the next ten days.

In [None]:
# Create date indices for the future predictions
future_dates = pd.date_range(start=time_series.index[-1] + pd.DateOffset(days=1), periods=future_steps, freq='D')

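# Create a pandas Series with the predicted values and date indices.
# np.asarray() passes the values by position; passing the Series itself would
# re-align on its own index and could yield NaN if the labels differ from future_dates.
forecasted_demand = pd.Series(np.asarray(predictions), index=future_dates)
print(forecasted_demand)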
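One detail worth illustrating when attaching future dates to predictions: if a pandas Series is passed to `pd.Series` together with a new index, pandas aligns by label rather than by position. The sketch below uses made-up forecast values and a made-up start date to show the difference; whether this matters in practice depends on the index that the fitted model attaches to its predictions.

```python
import pandas as pd

# Hypothetical forecast values carrying a positional (integer) index
predictions = pd.Series([120, 118, 125], index=[62, 63, 64])
future_dates = pd.date_range(start="2023-08-03", periods=3, freq="D")

# Passing a Series re-aligns by label: none of 62, 63, 64 match the dates,
# so every value becomes NaN
aligned_by_label = pd.Series(predictions, index=future_dates)

# Passing the underlying array assigns the values by position instead
assigned_by_position = pd.Series(predictions.to_numpy(), index=future_dates)

print(aligned_by_label)
print(assigned_by_position)
```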

In [None]:
# Initial inventory level
initial_inventory = 5500

# Lead time (number of days it takes to replenish inventory) 
lead_time = 2 # it's different for every business, 2 days as an example

# Service level (probability of not stocking out)
service_level = 0.95 # it's different for every business, 0.95 is an example



Calculate the optimal order quantity using the Newsvendor formula

The optimal order quantity is the amount of stock to order so that the forecasted demand is covered with the desired probability (the service level). It is calculated here with the Newsvendor model under normally distributed demand:

$$Q^* = \mu + z \cdot \sigma$$

where:

- $Q^*$ is the optimal order quantity.
- $\mu$ is the mean (average) of the forecasted demand.
- $\sigma$ is the standard deviation of the forecasted demand.
- $z$ is the z-score corresponding to the desired service level, obtained using norm.ppf(service_level).

Note that this is a service-level quantity, not the classical Economic Order Quantity (EOQ), which instead balances ordering costs against holding costs.

Service Level Approach: By using the z-score, the formula adjusts the order quantity to achieve the desired service level. The z-score tells us how many standard deviations above the mean demand we need to go to meet demand with the given probability (service level).
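As a worked illustration of the formula above, the following sketch computes the order quantity for a made-up ten-day forecast. The forecast values and the helper name `newsvendor_quantity` are assumptions for illustration only, and `statistics.NormalDist().inv_cdf` from the standard library is used as a stand-in for `norm.ppf`.

```python
import math
from statistics import NormalDist, mean, stdev

def newsvendor_quantity(forecast, service_level):
    """Order quantity mu + z * sigma, rounded up to whole units."""
    mu = mean(forecast)
    sigma = stdev(forecast)                    # sample std (ddof=1), like pandas Series.std()
    z = NormalDist().inv_cdf(service_level)    # z-score for the service level
    return math.ceil(mu + z * sigma)

# Hypothetical 10-day demand forecast (not the notebook's actual predictions)
hypothetical_forecast = [110, 120, 115, 125, 130, 118, 122, 128, 112, 120]

q_star = newsvendor_quantity(hypothetical_forecast, service_level=0.95)
print(f"Order quantity at 95% service level: {q_star}")
```

Raising `service_level` increases z and therefore the order quantity, which is the trade-off between stockout risk and extra inventory.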



In [None]:
from scipy.stats import norm

In [None]:
forecasted_demand_mean = forecasted_demand.mean()
forecasted_demand_std = forecasted_demand.std()

# Calculate the z-score for the desired service level i.e 0.95
z = norm.ppf(service_level)

# Calculate the optimal order quantity using the Newsvendor formula
order_quantity = np.ceil(forecasted_demand_mean + z * forecasted_demand_std).astype(int)

print(f'Optimal Order Quantity: {order_quantity}')

Detailed explanation

Normal Distribution:

The order quantity formula used here assumes that demand follows a normal distribution. This is a common assumption because aggregate demand, being the sum of many independent purchases, tends toward a normal distribution by the Central Limit Theorem.

Service Level:

The service level represents the probability that the demand will not exceed the stock level. For example, a 95% service level means that there is a 95% chance that the demand will be met without stocking out.

Z-Score:

The z-score is a statistical measure that represents the number of standard deviations a data point is from the mean. In this case, it represents how far the optimal stock level is from the mean demand, in terms of standard deviations.

Percent Point Function (ppf):

The ppf is the inverse of the cumulative distribution function (CDF). While the CDF gives the probability that a value is less than or equal to a certain value, the ppf gives the value that corresponds to a given probability.

By using norm.ppf(service_level), we obtain the z-score that corresponds to the desired service level. For instance, for a service level of 95%, norm.ppf(0.95) gives approximately 1.645, the 95th percentile of the standard normal distribution.
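The inverse relationship between the ppf and the CDF can be checked numerically. This sketch uses the standard library's `statistics.NormalDist`, whose `inv_cdf` method plays the same role as `norm.ppf`; the list of service levels is chosen purely for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Illustrative service levels; 0.95 is the one used in this notebook
service_levels = [0.90, 0.95, 0.99]

z_scores = {}
for level in service_levels:
    z = std_normal.inv_cdf(level)       # ppf: probability -> z-score
    recovered = std_normal.cdf(z)       # CDF: z-score -> probability
    z_scores[level] = z
    print(f"service level {level:.2f} -> z = {z:.4f} -> cdf(z) = {recovered:.4f}")
```

Applying the CDF to the z-score returns the original service level, and higher service levels map to larger z-scores, so they require a larger buffer above mean demand.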

In [None]:
std_lead_time_demand = forecasted_demand_std * np.sqrt(lead_time)

# Calculate the reorder point
reorder_point = forecasted_demand_mean * lead_time + z * std_lead_time_demand
print(f'Reorder Point: {reorder_point:.2f} units')

ANALYSIS

- The optimal order quantity determines how much to order so that demand is covered at the desired service level.
- The reorder point (ROP) determines when to place a new order to prevent stockouts.

Optimal Order Quantity (133 units): This is the amount to order each time an order is placed. It is based on the average forecasted demand plus a buffer for variability, so that enough is ordered to meet demand at the desired service level without overstocking.

Reorder Point (258.15 units): This is the inventory level at which a new order should be placed so that stock does not run out before the order arrives. It accounts for the lead time and the variability in demand during that period.

Both metrics work together: when the inventory level hits the reorder point, an order of the optimal order quantity is placed to replenish stock. In this case, when inventory falls to 258 units, an order for 133 units is placed.
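The interplay between the two metrics can be sketched as a small day-by-day simulation. The reorder point (258), order quantity (133) and lead time (2 days) are taken from this notebook; the constant daily demand of 120 units, the starting inventory of 500 units and the function `simulate_reorder_policy` are assumptions made only for illustration.

```python
def simulate_reorder_policy(daily_demand, initial_inventory, reorder_point,
                            order_quantity, lead_time):
    """Return end-of-day on-hand inventory under a reorder-point policy."""
    inventory = initial_inventory
    pending = {}                                   # arrival day -> quantity on order
    history = []
    for day, demand in enumerate(daily_demand):
        inventory += pending.pop(day, 0)           # receive any order due today
        inventory = max(0, inventory - demand)     # unmet demand is treated as lost
        inventory_position = inventory + sum(pending.values())
        if inventory_position <= reorder_point:    # trigger a replenishment order
            arrival = day + lead_time
            pending[arrival] = pending.get(arrival, 0) + order_quantity
        history.append(inventory)
    return history

# Hypothetical demand: 120 units per day for one week
history = simulate_reorder_policy(
    daily_demand=[120] * 7,
    initial_inventory=500,
    reorder_point=258,
    order_quantity=133,
    lead_time=2,
)
print(history)
```

The check uses the inventory position (on hand plus on order) rather than on-hand stock alone, a common design choice that avoids ignoring orders already in transit.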

In [None]:
# Calculate the safety stock: reorder point minus the average demand during lead time
avg_demand_lead_time = forecasted_demand.mean() * lead_time
safety_stock = reorder_point - avg_demand_lead_time
print(f'Safety Stock: {safety_stock:.2f} units')

ANALYSIS 

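Safety Stock: Safety stock is the additional inventory kept on hand to account for uncertainties in demand and supply. It acts as a buffer against unexpected variations in demand or lead time. It is computed as the reorder point minus the expected demand during the lead time, which equals z * std_lead_time_demand. Note that the figure of 258 units corresponds to the reorder point (258.15) rather than to the safety stock: the safety stock is only the part of the reorder point above the expected lead-time demand, so it is smaller than 258 units.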
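To show how the reorder point splits into expected lead-time demand and safety stock, here is a minimal sketch. The mean (120 units/day) and standard deviation (6.5 units/day) are hypothetical inputs, not values computed in this notebook, and `reorder_point_parts` is a helper introduced only for this example.

```python
import math
from statistics import NormalDist

def reorder_point_parts(mean_daily_demand, std_daily_demand, lead_time, service_level):
    """Split the reorder point into expected lead-time demand and safety stock."""
    z = NormalDist().inv_cdf(service_level)
    expected_demand = mean_daily_demand * lead_time
    safety_stock = z * std_daily_demand * math.sqrt(lead_time)
    return expected_demand, safety_stock, expected_demand + safety_stock

# Hypothetical demand statistics; lead time and service level as in the notebook
expected_demand, safety_stock, reorder_point = reorder_point_parts(
    mean_daily_demand=120.0, std_daily_demand=6.5, lead_time=2, service_level=0.95
)
print(f"Expected lead-time demand: {expected_demand:.2f}")
print(f"Safety stock:              {safety_stock:.2f}")
print(f"Reorder point:             {reorder_point:.2f}")
```

With these inputs the safety stock is a small fraction of the reorder point, which is the typical pattern when demand variability is low relative to mean demand.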

In [None]:
# Calculate the total cost (holding cost + stockout cost)
holding_cost = 0.1  # it's different for every business, 0.1 is an example
stockout_cost = 10  # it's different for every business, 10 is an example
total_holding_cost = holding_cost * (initial_inventory + 0.5 * order_quantity)
total_stockout_cost = stockout_cost * np.maximum(0, forecasted_demand.mean() * lead_time - initial_inventory)

# Calculate the total cost
total_cost = total_holding_cost + total_stockout_cost

print("Total Cost:", total_cost)

ANALYSIS 


Total Cost: 556.65. The total cost represents the combined holding and stockout costs of the inventory policy. In this case it is 556.65 cost units: the holding cost is 0.1 × (5500 + 0.5 × 133) = 556.65, and the stockout cost is zero because the initial inventory of 5,500 units far exceeds the expected demand during the lead time.
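The arithmetic behind the total cost can be reproduced with a short sketch. The initial inventory (5500), order quantity (133), lead time (2), holding cost (0.1) and stockout cost (10) come from this notebook; the mean daily demand of 120 units and the low-inventory scenario are hypothetical, and `total_inventory_cost` is a helper named only for this example.

```python
def total_inventory_cost(initial_inventory, order_quantity, mean_daily_demand,
                         lead_time, holding_cost=0.1, stockout_cost=10):
    """Holding cost plus stockout cost, following the notebook's formula."""
    holding = holding_cost * (initial_inventory + 0.5 * order_quantity)
    shortfall = max(0.0, mean_daily_demand * lead_time - initial_inventory)
    return holding + stockout_cost * shortfall

# Notebook values, with a hypothetical mean daily demand of 120 units
cost_well_stocked = total_inventory_cost(5500, 133, 120.0, 2)

# Hypothetical scenario: only 100 units on hand, so lead-time demand is not covered
cost_understocked = total_inventory_cost(100, 133, 120.0, 2)

print(f"Well stocked: {cost_well_stocked:.2f}")
print(f"Understocked: {cost_understocked:.2f}")
```

With 5,500 units on hand the stockout term vanishes and only the holding cost remains; in the understocked scenario the stockout penalty dominates the total.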

CONCLUSION 



Demand Forecasting involves predicting the quantity and pattern of customer orders, which is crucial for businesses to efficiently allocate resources, manage inventory, and plan production. Inventory Optimization aims to strike a balance between having sufficient stock to meet demand without carrying excess inventory that ties up capital and storage space. 