# Sales Forecasting Analysis

This notebook provides exploratory data analysis and model experimentation for the Sales Forecasting ML project.

## Contents
1. Data Loading and Exploration
2. Data Visualization
3. Feature Engineering
4. Model Training and Evaluation
5. Results and Insights

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette('husl')

## 1. Data Loading and Exploration

Load the sales data and perform initial exploration.

In [None]:
# Load the data
df = pd.read_csv('../data/sample_sales_data.csv')
df['date'] = pd.to_datetime(df['date'])

# Display basic information
print(f"Dataset shape: {df.shape}")
print(f"\nDate range: {df['date'].min()} to {df['date'].max()}")
print(f"\nUnique products: {df['product_id'].nunique()}")

# Display first few rows
df.head()
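
Before moving on, it can help to check the loaded data for gaps. The following is a minimal sketch, not part of the original analysis: the `sample` frame is a hypothetical stand-in for `sample_sales_data.csv` (only the column names `date`, `product_id`, and `revenue` come from this notebook), and `summarize_quality` is an illustrative helper name.

```python
import pandas as pd

# Hypothetical stand-in data; the real file's contents are not assumed here.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"]),
    "product_id": ["A", "A", "A", "B"],
    "revenue": [100.0, None, 120.0, 80.0],
})


def summarize_quality(frame: pd.DataFrame) -> dict:
    """Count missing revenue values and repeated (date, product_id) rows."""
    return {
        "missing_revenue": int(frame["revenue"].isna().sum()),
        "duplicate_keys": int(
            frame.duplicated(subset=["date", "product_id"]).sum()
        ),
    }


quality = summarize_quality(sample)
print(quality)
```

Calling `summarize_quality(df)` on the loaded data would give the same two counts, assuming the file has those three columns.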

## 2. Data Visualization

Plot each product's revenue over time to reveal trends and patterns.

In [None]:
# Plot revenue over time
plt.figure(figsize=(14, 6))
# Sort by date so each product's line is drawn in chronological order
for product, product_data in df.sort_values('date').groupby('product_id'):
    plt.plot(product_data['date'], product_data['revenue'], label=product, marker='o')

plt.title('Revenue Over Time by Product', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Revenue ($)', fontsize=12)
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

## 3. Feature Engineering

Derive calendar features from the `date` column for model training.

In [None]:
# Add time-based features
df['day_of_week'] = df['date'].dt.dayofweek
df['day_of_month'] = df['date'].dt.day
df['month'] = df['date'].dt.month
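# isocalendar() returns a nullable UInt32 column; cast to plain int for
# scikit-learn compatibility (assumes the date column has no missing values)
df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)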

# Display feature statistics
df.describe()
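
Calendar features alone ignore recent sales history, so forecasting work often adds lagged and rolling features as well. The sketch below is one possible extension rather than part of this notebook: `demo` is made-up data, and `add_lag_features`, its `lags` and `window` parameters, and the generated column names are all hypothetical.

```python
import pandas as pd

# Made-up data with the notebook's column names; values are illustrative only.
demo = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "product_id": ["A"] * 3 + ["B"] * 3,
    "revenue": [10.0, 20.0, 30.0, 5.0, 15.0, 25.0],
})


def add_lag_features(frame: pd.DataFrame, lags=(1,), window=2) -> pd.DataFrame:
    """Add per-product lagged revenue and a trailing rolling mean."""
    out = frame.sort_values(["product_id", "date"]).copy()
    grouped = out.groupby("product_id")["revenue"]
    for lag in lags:
        # Revenue observed `lag` rows earlier for the same product
        out[f"revenue_lag_{lag}"] = grouped.shift(lag)
    # Shift by one row before rolling so the current row's revenue
    # never leaks into its own feature
    out[f"revenue_roll_mean_{window}"] = grouped.transform(
        lambda s: s.shift(1).rolling(window).mean()
    )
    return out


featured = add_lag_features(demo)
print(featured)
```

The first rows of each product have no history, so their lag and rolling values are `NaN`; these rows would need to be dropped or imputed before model training.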

## 4. Model Training and Evaluation

Train and evaluate the forecasting model.

In [None]:
# Placeholder for model training code
# This section will be expanded with actual model implementation
print("Model training code to be implemented...")
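
Until the implementation is filled in, here is a minimal sketch of how the libraries imported at the top of this notebook could fit together. It is an assumption about the eventual approach, not the project's model: the data is synthetic, the hyperparameters are arbitrary, and the split is chronological (`shuffle=False`) so the model is evaluated on dates later than those it was trained on.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic daily revenue with a weekday effect plus noise (illustrative only)
rng = np.random.default_rng(42)
synthetic = pd.DataFrame({"date": pd.date_range("2024-01-01", periods=120, freq="D")})
synthetic["day_of_week"] = synthetic["date"].dt.dayofweek
synthetic["day_of_month"] = synthetic["date"].dt.day
synthetic["month"] = synthetic["date"].dt.month
synthetic["revenue"] = (
    100 + 10 * synthetic["day_of_week"] + rng.normal(0, 2, len(synthetic))
)

feature_cols = ["day_of_week", "day_of_month", "month"]
X = synthetic[feature_cols]
y = synthetic["revenue"]

# shuffle=False keeps row order, so the last 20% of dates form the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

mae = mean_absolute_error(y_test, predictions)
rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
r2 = r2_score(y_test, predictions)
print(f"MAE: {mae:.2f}  RMSE: {rmse:.2f}  R^2: {r2:.3f}")
```

To try this on the real data, `synthetic` would be replaced by `df` and `feature_cols` by the columns created in Section 3. The metrics printed here describe the synthetic data only and say nothing about performance on actual sales.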

## 5. Results and Insights

Summary of model performance and business insights.

In [None]:
# Placeholder for results
print("Results and insights to be added...")