# Capstone: Customer Demand Forecasting

**Notebook 5 - Contents**<br>
[Objective](#Objective)<br>
[Cost and Benefits Analysis](#Cost-and-Benefits-Analysis)<br>
[Conclusion](#Conclusion)<br>
[Recommendation](#Recommendation)<br>

## Objective

Forecast accuracy is a key indicator in demand planning. A robust forecasting process rewards the business by minimizing stock-outs, improving the service rate and reducing supply chain costs. This is crucial to meeting customer expectations, which in turn drives customer satisfaction and loyalty. Ultimately, the business benefits through growth and longevity.

In this case study, RMSE is used to assess model performance, so it is also used as the basis of a cost-benefit analysis that assesses whether the costs of forecasting errors outweigh the benefits of accurate forecasting.
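
As a reminder of what the metric measures, RMSE is the square root of the mean squared difference between actual and predicted demand, so it is expressed in the same units as demand. A minimal sketch, assuming made-up numbers (the arrays and the `rmse` helper below are illustrative and not taken from the dataset or the modelling notebooks):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative values only (not from the dataset)
actual = [100, 150, 200]
predicted = [110, 140, 230]
example_rmse = rmse(actual, predicted)
```

Because the errors are squared before averaging, a single large miss (the 30-unit error here) weighs more heavily on the result than several small ones.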

## Cost and Benefits Analysis

### Determine Costs and Benefits 

Costs associated with demand forecasting: 
- Costs of maintaining and monitoring forecasting model (labour)
- Costs of data collection and processing (labour)
- Costs of software and technology used for forecasting
- Training costs

Benefits of accurate forecasting: 
- Reduction in excess inventory carrying costs
- Reduction in stockouts, lost sales and backorders
- Reduction in wastage and obsolescence
- Improved production planning
- Improved supply chain efficiency (logistics and storage)
- Efficient resource allocation (human resource, finances, labour hours)
- Enhanced customer satisfaction and retention

### Assumptions: 


1. Predictions are based on a 90-day window.
2. The current state uses the historical mean to project demand for the next 3 months.
3. Product margins are not available in the datasets or data sources, so I use the latest quarterly sales data from the biggest smartphone maker in the US, i.e. [Apple](https://finance.yahoo.com/quote/AAPL/financials#:~:text=Annual-,Quarterly,-Income%20Statement). Apple's gross margin is 43%, hence product cost is 57% of price (1 - gross margin).
4. The average price of Mobiles & Tablets in the datasets is \\$712.
5. Storage cost is \\$0.36 per unit per month as per the [Amazon website](https://supplychain.amazon.com/pricing#storage:~:text=%240.13/mo-,%240.36/mo,-%240.09/mo).
6. Fulfillment freight cost is \\$5 per unit as per the [freight price list](https://www.shipbob.com/blog/amazon-fba-fees/#:~:text=Amazon%20FBA%20Fees%20table).

In [1]:
# import libraries 
import numpy as np
import pandas as pd
import pickle

In [2]:
# import data for Mobiles & Tablets
with open('../pkl/mobiles_sales_exog.pkl', 'rb') as f:
    mobiles_sales = pickle.load(f)

### Current State

Use the historical mean to project demand for the next 3 months.

In [3]:
# prediction for next 3 months based on historical daily mean
currentstate_y_pred = round(mobiles_sales[mobiles_sales.index<='2021-06-30']['order_qty'].mean())
currentstate_y_pred

169

Based on the current-state calculation above, using the historical mean to project the next 3 months would vastly over-forecast demand, incurring extra storage, freight and inventory costs, and eventually the cost of disposing of obsolete stock.
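
One way to check this kind of claim is to compare the historical-mean forecast with actual demand over a hold-out window and look at the mean error (bias). The sketch below uses a small synthetic series because the hold-out data is not shown in this notebook. The column name `order_qty` follows the dataset; the dates, the values and the `mean_forecast_bias` helper are hypothetical.

```python
import numpy as np
import pandas as pd

def mean_forecast_bias(series, cutoff):
    """Forecast every day after `cutoff` with the mean up to `cutoff`.

    Returns (forecast, bias, rmse); a positive bias means over-forecasting.
    """
    cutoff = pd.Timestamp(cutoff)
    history = series[series.index <= cutoff]
    holdout = series[series.index > cutoff]
    forecast = history.mean()
    errors = forecast - holdout
    return forecast, errors.mean(), np.sqrt((errors ** 2).mean())

# Synthetic daily demand: 10 days at 200 units, then 5 days at 100 units
idx = pd.date_range("2021-06-21", periods=15, freq="D")
demand = pd.Series([200] * 10 + [100] * 5, index=idx, name="order_qty")

forecast, bias, error = mean_forecast_bias(demand, "2021-06-30")
```

In this toy series the mean forecast is 200 units per day against actual demand of 100, so the bias is positive, which is the signature of over-forecasting.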

In [4]:
# Gross margin from Apple's latest quarterly sales data (gross profit / total revenue)
gross_margin = 166816 / 383933
gross_margin

0.4344924765518984

In [5]:
# Calculate average price of Mobiles & Tablets
avg_price = round(mobiles_sales['price'].mean())
avg_price

712

In [6]:
# Average cost per unit = average price * (1 - gross margin)
avg_cost = round(avg_price * (1 - gross_margin))
avg_cost

403

**Best Model = Random Forest Regressor = RMSE: 186 units**

In [7]:
RMSE = 186

Scenario 1 - over-forecasting

In [8]:
# Product cost of RMSE units of excess inventory
inventory_cost = round(avg_price*(1-gross_margin)*RMSE)
inventory_cost

74891

In [9]:
storage = round(RMSE*30*0.36*3)  # RMSE * 30 days a month * storage cost * 3 months
storage

6026

In [10]:
freight = round(RMSE*30*2)  # RMSE * 30 days a month * $2 freight per unit, one way (the assumptions list $5 per unit)
freight

11160

In [11]:
over_forecast = inventory_cost + storage + freight
over_forecast

92077

Scenario 2 - under-forecasting

In [12]:
# Gross margin lost on RMSE units of demand that cannot be fulfilled
margin_loss = round(avg_price*gross_margin*RMSE)
margin_loss

57541

In [13]:
# Loss of future business, valued here at the same amount as the lost margin
business_loss = round(avg_price*gross_margin*RMSE)
business_loss

57541

In [14]:
# Lost margin plus lost future business, less the storage and freight not incurred
under_forecast = margin_loss - storage - freight + business_loss
under_forecast

97896
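
The two scenarios can be collected into one helper so that the assumptions are explicit and easy to vary. This is a sketch that simply mirrors the cell arithmetic above. The function name `forecast_error_costs` and its parameter names are made up for illustration, and the default rates are the ones used in the cells (storage of \\$0.36 per unit per month and freight of \\$2 per unit).

```python
def forecast_error_costs(rmse, avg_price, gross_margin,
                         storage_rate=0.36, freight_rate=2,
                         days_per_month=30, months=3):
    """Return (over_forecast, under_forecast) costs, mirroring the cells above."""
    # Over-forecasting: cost of unsold stock plus storage and freight
    inventory_cost = round(avg_price * (1 - gross_margin) * rmse)
    storage = round(rmse * days_per_month * storage_rate * months)
    freight = round(rmse * days_per_month * freight_rate)
    over = inventory_cost + storage + freight

    # Under-forecasting: lost margin plus an equal loss of future business,
    # less the storage and freight that are not incurred
    margin_loss = round(avg_price * gross_margin * rmse)
    business_loss = margin_loss
    under = margin_loss + business_loss - storage - freight
    return over, under

over, under = forecast_error_costs(rmse=186, avg_price=712,
                                   gross_margin=166816 / 383933)
```

Calling the helper with a different `rmse` or `freight_rate` shows how sensitive the comparison between the two scenarios is to each assumption.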

## Conclusion

1. Based on the model comparison, the best-fit model is the Random Forest model, with an RMSE of 186. It can handle complex relationships between features and predict seasonality from limited historical data, thanks to its robust algorithm and hyperparameter tuning. It also provides feature importances, with the payment method 'easypay' ranked highest, followed by discount and pricing.
   
2. The feature importances from the Random Forest model complement the EDA, which highlights the relationship of price and discounts with demand.

3. The EDA also shows that Sundays and the month of December play a significant role in driving demand.
 
4. As the time series model overfits, this suggests that demand is less time-dependent and driven more by deliberate pricing efforts, discount programs and the payment methods made available to customers for convenient purchasing.

5. Over-forecasting is preferable to under-forecasting: its estimated cost is lower (\\$92,077 versus \\$97,896), and under-forecasting also risks future revenue loss, which could be far more costly and affect the longevity of the business. With a longer history than just 1 year, the model would be able to learn seasonality and patterns better and produce more accurate forecasts with lower errors.

6. Other external factors that could be the main drivers of demand are not available in the datasets. These include economic conditions, holidays, and festive promotions such as Christmas, Black Friday or Amazon Prime Day, which are initiatives by external parties.



## Recommendation

1. Collect more historical data to train the model. Efforts should also focus on improving the quality of the data used for modeling: identify and address data anomalies, missing values and outliers to optimize model performance.
   
2. Use feature engineering to create interaction terms between features to further fine-tune model performance.

3. Gather more information on product attributes such as color, brand and size. Other information that would help with accurate demand forecasting includes promotion programs, co-purchased products (market basket analysis), holiday seasons, special events (Formula 1 races, the Olympic Games, etc.), weather data and external economic indicators (GDP, unemployment rate, interest rates).

4. Train other advanced models such as XGBoost or Prophet on the datasets and continue to fine-tune them until the results are optimal.
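
To illustrate recommendation 2, an interaction term can be as simple as the product of two existing features. The sketch below uses a toy DataFrame; the column names (`price`, `discount`, `is_sunday`) and the values are hypothetical stand-ins for the engineered features in the modelling notebook.

```python
import pandas as pd

# Toy data; column names and values are illustrative only
features = pd.DataFrame({
    "price": [700.0, 650.0, 800.0],
    "discount": [0.10, 0.20, 0.00],
    "is_sunday": [0, 1, 0],
})

# Interaction terms: products of pairs of features
features["price_x_discount"] = features["price"] * features["discount"]
features["discount_x_sunday"] = features["discount"] * features["is_sunday"]
```

Tree-based models such as Random Forest can learn some interactions on their own, so whether explicit terms like these help is something to confirm by re-checking RMSE on a hold-out set.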