## Demand Forecasting in Kenya’s Second-Hand Retail Sector

---
### Business Understanding

#### Domain Knowledge – Retail Demand Forecasting

In the highly competitive retail landscape, accurate demand planning plays a critical role in driving profitability, enhancing customer satisfaction, and supporting sustainable growth for businesses. As such, demand forecasting is a fundamental element of a successful retail strategy.

#### Industry Context

Retailers must constantly balance two key risks: overstocking and understocking.

- **Overstocking** results in tied-up working capital, elevated storage expenses, and an increased likelihood of product obsolescence.

- **Understocking** can lead to missed sales opportunities, customer dissatisfaction, and a potential decline in market share.



### Challenges in the Retail Domain  

| Challenge            | Description                                                                 |
|-----------------------|-----------------------------------------------------------------------------|
| **Seasonality**       | Demand fluctuates during holidays, school reopening, and seasonal changes.  |
| **Promotions & Discounts** | Short-term spikes caused by campaigns (e.g., holiday offers, back-to-school sales). |
| **External Factors**  | Economic trends, competitor actions, and consumer preference shifts add uncertainty. |
| **Data Quality Issues** | POS data may contain duplicates, missing values, or inconsistent product naming. |

### Forecasting Approaches  

| Approach Type        | Methods/Models                                   | Strengths                                                   |
|-----------------------|-------------------------------------------------|-------------------------------------------------------------|
| **Time-Series Models** | ARIMA, Prophet, Exponential Smoothing           | Capture seasonality, trends, and cyclical patterns well.     |
| **Machine Learning**   | Random Forest, XGBoost, LSTMs                   | Handle non-linear patterns, external predictors, and large datasets. |
| **Hybrid Approaches**  | Combination of Time-Series + ML methods         | Leverage strengths of both for improved accuracy.            |
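
To make the time-series row more concrete, here is a minimal sketch of simple exponential smoothing, the most basic member of the Exponential Smoothing family listed above. It is written in plain Python for illustration only. The function name, the `alpha` value, and the sample sales figures are all hypothetical, and this simple form has no trend or seasonal terms, so it is a baseline rather than the project's model.

```python
def simple_exponential_smoothing(series, alpha=0.3):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}, seeded with the first value.
    The last level doubles as the one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series must not be empty")
    levels = [float(series[0])]
    for y in series[1:]:
        levels.append(alpha * y + (1 - alpha) * levels[-1])
    return levels


# Hypothetical weekly unit sales for one product (illustrative numbers only).
weekly_units = [120, 135, 128, 150, 160, 155, 170]
levels = simple_exponential_smoothing(weekly_units, alpha=0.5)
next_week_forecast = levels[-1]
print(f"Forecast for next week: {next_week_forecast:.1f} units")
```

A higher `alpha` reacts faster to recent sales, while a lower one smooths out short-lived spikes such as promotions.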




---
The primary objective is to develop a reliable demand forecasting model that predicts future sales volumes for the company’s products. Currently, sales and replenishment decisions are largely reactive, leading to either stockouts (missed revenue opportunities and customer dissatisfaction) or overstocking (increased holding costs, wastage, and tied-up capital).
By leveraging historical sales data and external drivers, the goal is to:

+ Forecast daily/weekly/monthly demand.

+ Identify seasonal peaks and long-term trends.

+ Support data-driven planning for procurement, inventory, and sales strategies.

---
### Why Forecasting Demand is Important  

| Area                     | Importance                                                                 |
|---------------------------|----------------------------------------------------------------------------|
| **Operational Efficiency** | Supports workforce planning, supply chain scheduling, and distribution logistics. |
| **Financial Performance** | Enhances budgeting accuracy, cash flow planning, and revenue predictability. |
| **Customer Experience**   | Ensures product availability during peak demand, reducing stockouts.        |
| **Strategic Decision-Making** | Enables data-driven sales target setting, marketing allocation, and supplier negotiations. |

---
### Key Stakeholders and Their Needs  

| Stakeholder                | Needs                                                                 |
|-----------------------------|----------------------------------------------------------------------|
| **Sales & Marketing Team** | Identify peak products/categories to plan promotions and campaigns.   |
| **Procurement & Supply Chain** | Optimize purchase orders, supplier negotiations, and lead-time planning. |
| **Finance & Management**   | Set realistic revenue targets, budgets, and cash flow projections.    |
| **Store/Branch Managers**  | Access localized forecasts (if multi-store) for daily stock control.  |
| **Executive Leadership**   | Understand long-term trends to guide strategic expansion decisions.   |


### Forecasting Granularity & Targets

**Granularity:** The project will assess forecasts at three levels (product, category, and aggregated store) to meet varying stakeholder needs.

**Time Horizon:** Forecasts will be produced for daily, weekly, and monthly intervals.

**Sales Targets:** Forecasts will support both short-term targets (weekly and monthly) and long-term projections (quarterly and yearly).

### Acceptable Accuracy

While perfect forecasts are not realistic, the business aims for models that achieve:

- 70–85% accuracy at the product level, with accuracy taken as 100% − MAPE and RMSE reported alongside as a scale-dependent error measure.

- 90% accuracy at the aggregated store/category level.

This balance reflects the natural variability in retail demand while ensuring forecasts are reliable enough to support decision-making.
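
As a rough illustration of how such targets could be checked, the sketch below computes MAPE and RMSE and derives an accuracy figure as 100% minus MAPE. That definition of accuracy is an assumption; the sample values and function names are hypothetical. RMSE is in sales units, not percent, so it cannot be compared directly with a percentage target.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Periods with zero actual demand are skipped because the ratio is undefined.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def rmse(actual, forecast):
    """Root mean squared error, in the same units as the sales data."""
    if not actual:
        raise ValueError("actual must not be empty")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical actual vs. forecast unit sales for four periods.
actual = [100, 80, 120, 90]
forecast = [110, 76, 108, 99]

error_pct = mape(actual, forecast)
accuracy_pct = 100.0 - error_pct  # assumed definition of "accuracy"
print(f"MAPE: {error_pct:.2f}%  accuracy: {accuracy_pct:.2f}%  RMSE: {rmse(actual, forecast):.2f}")
```

MAPE becomes unstable for slow-moving products with many zero or near-zero sales days, which is one reason product-level targets are looser than aggregated ones.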



---
## Data Understanding & Preparation  

### 1. Data Overview  
The dataset consists of transaction-level sales data with the following key fields:  

- **Date** – Timestamp of sales transactions.  
- **Receipt Number** – Unique identifier for transactions.  
- **Product/Category** – Classification of items sold.  
- **Quantity** – Number of units sold.  
- **Price** – Unit price at the time of sale.  

### 2. Data Quality Assessment  
| Potential Issue             | Action to Address                                              |
|------------------------------|---------------------------------------------------------------|
| **Missing Values**           | Identify and impute (or remove) missing entries.               |
| **Duplicates**               | Drop duplicate rows, matching on *Receipt Number*, *Date*, and product so that multi-item receipts keep all their line items. |
| **Inconsistent Naming**      | Standardize product/category names for consistency.             |
| **Outliers in Quantity/Price** | Detect unrealistic values (e.g., negative quantities).        |
| **Granularity Variations**   | Aggregate to daily/weekly/monthly levels as needed.             |
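
The cleaning actions above could be sketched roughly as follows, in plain Python so that it runs without extra libraries. The field names (`date`, `receipt_no`, `product`, `quantity`, `price`) and the sample rows are hypothetical. The sketch also makes simplifying assumptions: rows with missing values are removed rather than imputed, non-positive quantities are dropped even though they may be returns, and a duplicate is a row that matches on every field after name standardization.

```python
REQUIRED = ("date", "receipt_no", "product", "quantity", "price")


def clean_transactions(rows):
    """Standardize product names, then drop incomplete, invalid, and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Missing values: this sketch removes the row instead of imputing.
        if any(row.get(field) is None for field in REQUIRED):
            continue
        # Inconsistent naming: collapse whitespace and normalize capitalization.
        product = " ".join(str(row["product"]).split()).title()
        # Outliers: treat non-positive quantities and negative prices as invalid.
        if row["quantity"] <= 0 or row["price"] < 0:
            continue
        # Duplicates: the key covers the whole line item, so a receipt with
        # several different products keeps all of them.
        key = (row["receipt_no"], row["date"], product, row["quantity"], row["price"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**row, "product": product})
    return cleaned


# Hypothetical raw rows (illustrative only).
raw = [
    {"date": "2024-01-05", "receipt_no": "R001", "product": "denim  jacket", "quantity": 2, "price": 850.0},
    {"date": "2024-01-05", "receipt_no": "R001", "product": "Denim Jacket", "quantity": 2, "price": 850.0},
    {"date": "2024-01-05", "receipt_no": "R001", "product": "t-shirt", "quantity": 1, "price": 200.0},
    {"date": "2024-01-06", "receipt_no": "R002", "product": "Sneakers", "quantity": -3, "price": 1200.0},
    {"date": "2024-01-06", "receipt_no": "R003", "product": "Sneakers", "quantity": 1, "price": None},
]
cleaned = clean_transactions(raw)
print([r["product"] for r in cleaned])
```

Of the five sample rows, only the first jacket line and the T-shirt line survive: the second jacket row is a duplicate once names are standardized, and the two sneaker rows fail the quantity and missing-value checks.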

### 3. Data Preparation Steps  
| Step                        | Description                                                    |
|------------------------------|----------------------------------------------------------------|
| **Feature Engineering**      | Create *Revenue = Price × Quantity*, moving averages, lag features. |
| **Aggregation**              | Summarize sales by product, category, or store depending on use-case. |
| **Seasonality Features**     | Add variables for day-of-week, month, holidays, school terms.  |
| **Promotions/Discounts**     | Encode promotional periods as binary or categorical features.   |
| **External Data (Optional)** | Incorporate weather, macroeconomic, or competitor data if available. |
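
As an illustration of the lag, moving-average, and calendar features in the table, the sketch below builds them from a daily series in plain Python. The function name, the chosen lags (1 and 7 days), the 7-day window, and the sample data are assumptions. Holiday, school-term, and promotion flags are left out because they need calendars that are not part of the dataset described here.

```python
from datetime import date, timedelta


def add_time_features(daily, lags=(1, 7), window=7):
    """Build features from a list of (date, units) tuples.

    Assumes one entry per consecutive day, sorted by date. The moving average
    uses only the previous `window` days, so the current day's sales never
    leak into its own feature.
    """
    units = [u for _, u in daily]
    features = []
    for i, (day, y) in enumerate(daily):
        row = {
            "date": day,
            "units": y,
            "day_of_week": day.weekday(),  # Monday = 0 ... Sunday = 6
            "month": day.month,
        }
        for lag in lags:
            row[f"lag_{lag}"] = units[i - lag] if i >= lag else None
        if i >= window:
            row[f"ma_{window}"] = sum(units[i - window:i]) / window
        else:
            row[f"ma_{window}"] = None
        features.append(row)
    return features


# Hypothetical daily unit sales: 10, 11, ..., 19 starting Monday 2024-01-01.
start = date(2024, 1, 1)
daily = [(start + timedelta(days=i), 10 + i) for i in range(10)]
features = add_time_features(daily)
print(features[7])
```

Rows at the start of the series have `None` for features that look further back than the data allows; those rows are usually dropped or imputed before model training.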

### 4. Prepared Dataset Outputs  
- **Daily Sales Dataset** – Useful for operational planning.  
- **Weekly Sales Dataset** – Useful for tactical decision-making.  
- **Monthly Sales Dataset** – Useful for long-term strategic planning.
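
One possible way to derive the three datasets from the same transactions is sketched below in plain Python. The function name and the sample data are hypothetical, and the choice to label each week by its Monday and each month by its first day is an assumption rather than something the source specifies.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_units(transactions, level="daily"):
    """Sum units per period from an iterable of (date, units) pairs."""
    totals = defaultdict(int)
    for day, units in transactions:
        if level == "daily":
            key = day
        elif level == "weekly":
            key = day - timedelta(days=day.weekday())  # Monday of that week
        elif level == "monthly":
            key = day.replace(day=1)  # first day of that month
        else:
            raise ValueError(f"unknown level: {level!r}")
        totals[key] += units
    return dict(sorted(totals.items()))


# Hypothetical transactions (illustrative only).
transactions = [
    (date(2024, 1, 1), 3),
    (date(2024, 1, 1), 2),
    (date(2024, 1, 3), 4),
    (date(2024, 1, 8), 1),
    (date(2024, 2, 2), 5),
]
daily_sales = aggregate_units(transactions, "daily")
weekly_sales = aggregate_units(transactions, "weekly")
monthly_sales = aggregate_units(transactions, "monthly")
print(monthly_sales)
```

Days with no transactions do not appear in the output, so a model that expects a regular time index would need those gaps filled with zeros first.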