# Example 3: Time Series Analysis and Forecasting

---

**Author:** Brandon Deloatch
**Affiliation:** Quipu Research Labs, LLC
**Date:** 2025-10-02
**Version:** v1.0
**License:** MIT
**Example Type:** Temporal Analytics Tutorial
**Based On:** Tier3_MovingAverages.ipynb
**Estimated Time:** 25 minutes

---

> **Citation:**
> Brandon Deloatch, "Example 3: Time Series Analysis and Forecasting," Quipu Research Labs, LLC, v1.0, 2025-10-02.

---

*This example notebook is provided "as-is" for educational and research purposes. Users assume full responsibility for any results or applications derived from it.*

---

## Coffee Sales Forecasting with Moving Averages

**Learning Objectives:**
- Master time series data preparation and analysis
- Apply moving average forecasting techniques
- Analyze temporal patterns and seasonality
- Set up stationarity testing and decomposition tools (`adfuller`, `seasonal_decompose`) for follow-up analysis
- Generate business forecasts and recommendations

**Cross-References:**
- **Prerequisite:** `quick_start_data_analysis.ipynb` (data fundamentals)
- **Foundation:** `Tier3_MovingAverages.ipynb` (moving average theory)
- **Alternatives:** `Tier3_ARIMA.ipynb`, `Tier3_ExponentialSmoothing.ipynb`
- **Advanced:** `Tier3_FourierAnalysis.ipynb`, `Tier3_WaveletAnalysis.ipynb`

**Key Applications:**
- Sales and revenue forecasting
- Inventory planning and optimization
- Demand prediction and capacity planning
- Financial time series analysis

In [None]:
"""
Example 3: Time Series Analysis and Forecasting.

This module demonstrates sales forecasting using moving averages and time series
analysis on real coffee shop transaction data. Covers trend analysis, seasonality,
and business forecasting.

Author: Brandon Deloatch
Date: 2025-10-02
"""

# Example 3: Time Series Analysis and Forecasting
# ===============================================
# Professional sales forecasting with real coffee shop transaction data

import warnings
from datetime import timedelta

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Time series libraries (available for follow-up analysis; not all are used in the cells below)
from sklearn.metrics import mean_absolute_error, mean_squared_error
from scipy import signal
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller

warnings.filterwarnings('ignore')

# Set style for better visualizations
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("Example 3: Time Series Analysis and Forecasting")
print("=" * 50)
print("CROSS-REFERENCES:")
print("• Prerequisites: quick_start_data_analysis.ipynb (data fundamentals)")
print("• Foundation: Tier3_MovingAverages.ipynb (moving average theory)")
print("• Alternatives: Tier3_ARIMA.ipynb, Tier3_ExponentialSmoothing.ipynb")
print("• Advanced: Tier3_FourierAnalysis.ipynb, Tier3_WaveletAnalysis.ipynb")
print("• Full Guide: See notebooks/tier3_timeseries/ for complete forecasting suite")
print("Time series libraries loaded - Ready for sales forecasting!")

## 1. Load Coffee Sales Data for Time Series Analysis

Load and prepare the Coffee Sales dataset for time series forecasting:

In [None]:
# Load the Coffee Sales dataset
df_raw = pd.read_csv('../data/Coffee_sales.csv')

print(f"Loading Coffee Sales dataset with {len(df_raw)} transactions")

# Data preprocessing for time series analysis
df_raw['Date'] = pd.to_datetime(df_raw['Date'])
df_raw['money'] = pd.to_numeric(df_raw['money'], errors='coerce')
df_raw = df_raw.dropna(subset=['money'])

# Create time-based features
datetime_str = (
    df_raw['Date'].astype(str) + ' '
    + df_raw['hour_of_day'].astype(str) + ':00:00'
)
df_raw['datetime'] = pd.to_datetime(datetime_str)

# Aggregate daily sales for time series analysis
df = df_raw.groupby('Date').agg({
    'money': 'sum',           # Total daily sales
    'coffee_name': 'count',   # Number of transactions
    'hour_of_day': 'mean'     # Average hour of transactions
}).round(2)

df.columns = ['sales', 'transaction_count', 'avg_hour']
df = df.reset_index()
df.rename(columns={'Date': 'date'}, inplace=True)

# Add time-based features for analysis
df['day_of_week'] = df['date'].dt.day_name()
df['month'] = df['date'].dt.month
df['year'] = df['date'].dt.year
df['day_of_year'] = df['date'].dt.dayofyear
df['weekday'] = df['date'].dt.weekday
df['is_weekend'] = df['weekday'].isin([5, 6])

# Sort by date
df = df.sort_values('date').reset_index(drop=True)

# Calculate some basic time series statistics
date_range = (df['date'].max() - df['date'].min()).days + 1  # inclusive calendar span
daily_avg = df['sales'].mean()
weekend_avg = df[df['is_weekend']]['sales'].mean()
weekday_avg = df[~df['is_weekend']]['sales'].mean()

print("Time series data prepared successfully!")
print(f"📅 Date range: {df['date'].min().date()} to {df['date'].max().date()}")
print(f"Total days: {len(df)} ({date_range} calendar days)")
print(f"💰 Average daily sales: ${daily_avg:,.0f}")
print(f"💳 Sales range: ${df['sales'].min():,.0f} - ${df['sales'].max():,.0f}")
print(f"🔄 Average transactions per day: {df['transaction_count'].mean():.1f}")
print(f"Weekend vs Weekday: ${weekend_avg:.0f} vs ${weekday_avg:.0f}")

# Show sample data
print("\nSample data:")
df.head(10)
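
The daily aggregation above only produces rows for dates that appear in the transaction log, so a day with no sales is simply absent from `df`. Because `rolling(window=7)` counts rows rather than calendar days, such gaps can silently stretch a "7-day" window. The following is a minimal sketch of reindexing onto a complete daily calendar. It uses a small made-up frame in place of `df`; the values, the `daily` and `filled` names, and the choice to treat a missing day as zero sales are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical daily totals with 2024-03-03 missing (no transactions that day).
daily = pd.DataFrame({
    "date": pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-04"]),
    "sales": [120.0, 95.5, 130.0],
})

# Build the full calendar between the first and last observed dates.
full_range = pd.date_range(daily["date"].min(), daily["date"].max(), freq="D")

# Reindex onto it; days without transactions become 0 sales.
# (Assumption: a missing day means the shop sold nothing, not that data was lost.)
filled = (
    daily.set_index("date")
    .reindex(full_range, fill_value=0.0)
    .rename_axis("date")
    .reset_index()
)

missing_days = len(filled) - len(daily)
print(f"Calendar days: {len(filled)}, days added: {missing_days}")
print(filled)
```

If missing days are more likely to be data-collection gaps than closures, leaving them as `NaN` (dropping `fill_value`) and interpolating may be the safer choice.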

In [None]:
# Time Series Visualization and Analysis
print("COFFEE SALES TIME SERIES ANALYSIS:")
print("=" * 50)

# Create comprehensive time series plots
fig = make_subplots(
    rows=3, cols=2,
    subplot_titles=[
        'Daily Sales Time Series', 'Sales Distribution',
        'Weekly Pattern', 'Monthly Trend',
        'Weekend vs Weekday', 'Transaction Count vs Revenue'
    ],
    vertical_spacing=0.08
)

# 1. Daily sales time series
fig.add_trace(go.Scatter(
    x=df['date'], y=df['sales'],
    mode='lines', name='Daily Sales',
    line={'color': 'blue', 'width': 1}
), row=1, col=1)

# 2. Sales distribution
fig.add_trace(go.Histogram(
    x=df['sales'],
    name='Sales Distribution',
    nbinsx=20,
    marker_color='lightgreen'
), row=1, col=2)

# 3. Weekly pattern
weekly_avg = df.groupby('day_of_week')['sales'].mean().reindex([
    'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'
])
fig.add_trace(go.Bar(
    x=weekly_avg.index, y=weekly_avg.values,
    name='Weekly Pattern',
    marker_color='orange'
), row=2, col=1)

# 4. Monthly trend (if we have multiple months)
if df['date'].dt.month.nunique() > 1:
    monthly_sales = df.groupby(df['date'].dt.month)['sales'].mean()
    month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                   'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    monthly_labels = [month_names[i-1] for i in monthly_sales.index]

    fig.add_trace(go.Bar(
        x=monthly_labels, y=monthly_sales.values,
        name='Monthly Trend',
        marker_color='purple'
    ), row=2, col=2)

# 5. Weekend vs Weekday boxplot
weekend_data = df[df['is_weekend']]['sales']
weekday_data = df[~df['is_weekend']]['sales']

fig.add_trace(go.Box(
    y=weekend_data, name='Weekend',
    marker_color='red'
), row=3, col=1)

fig.add_trace(go.Box(
    y=weekday_data, name='Weekday',
    marker_color='blue'
), row=3, col=1)

# 6. Transaction count vs revenue scatter
fig.add_trace(go.Scatter(
    x=df['transaction_count'], y=df['sales'],
    mode='markers', name='Transactions vs Sales',
    marker={'color': 'green', 'size': 6, 'opacity': 0.6}
), row=3, col=2)

fig.update_layout(
    height=1000,
    title='Coffee Sales Time Series Analysis Dashboard',
    showlegend=False
)

fig.show()

# Statistical insights
print("\nTIME SERIES INSIGHTS:")
peak_idx = df['sales'].idxmax()
low_idx = df['sales'].idxmin()
print(f"• Peak sales day: {df.loc[peak_idx, 'date'].strftime('%Y-%m-%d')} "
      f"(${df['sales'].max():,.0f})")
print(f"• Lowest sales day: {df.loc[low_idx, 'date'].strftime('%Y-%m-%d')} "
      f"(${df['sales'].min():,.0f})")
print(f"• Most transactions in a day: {df['transaction_count'].max()}")
print(f"• Best day of week: {weekly_avg.idxmax()} (${weekly_avg.max():.0f} avg)")
print(f"• Worst day of week: {weekly_avg.idxmin()} (${weekly_avg.min():.0f} avg)")
weekend_premium = weekend_avg - weekday_avg
weekend_pct = (weekend_avg / weekday_avg - 1) * 100
print(f"• Weekend premium: ${weekend_premium:.0f} ({weekend_pct:+.1f}%)")

In [None]:
# Moving Average Forecasting and Business Insights
print("MOVING AVERAGE FORECASTING:")
print("=" * 40)

# Calculate moving averages
windows = [3, 7, 14]
for window in windows:
    df[f'ma_{window}'] = df['sales'].rolling(window=window).mean()
    df[f'ema_{window}'] = df['sales'].ewm(span=window).mean()

# Simple forecasting using last 7-day average
forecast_horizon = 7
last_7_avg = df['sales'].tail(7).mean()
last_14_avg = df['sales'].tail(14).mean()

# Create forecast dates
last_date = df['date'].max()
forecast_dates = pd.date_range(start=last_date + pd.Timedelta(days=1),
                               periods=forecast_horizon, freq='D')

print("FORECASTING RESULTS:")
print(f"• 7-day average: ${last_7_avg:.0f}")
print(f"• 14-day average: ${last_14_avg:.0f}")
print(f"• Forecast for next 7 days: ${last_7_avg:.0f} per day")
print(f"• Weekly revenue forecast: ${last_7_avg * 7:,.0f}")

# Business insights and recommendations
total_revenue = df['sales'].sum()
total_days = len(df)
growth_rate = (
    (df['sales'].tail(7).mean() - df['sales'].head(7).mean())
    / df['sales'].head(7).mean() * 100
)

print("\nBUSINESS INSIGHTS:")
print(f"• Total revenue over the period: ${total_revenue:,.0f}")
print(f"• Average daily revenue: ${df['sales'].mean():,.0f}")
print(f"• Revenue volatility (CV): {df['sales'].std()/df['sales'].mean():.2%}")
print(f"• Growth trend: {growth_rate:+.1f}% (last 7 days vs first 7 days)")
weekday_only = weekly_avg.drop(['Saturday', 'Sunday'])
print(f"• Best performing weekday: {weekday_only.idxmax()}")

# Actionable recommendations
print("\nRECOMMENDATIONS:")
print(f"• Optimize staffing for {weekly_avg.idxmax()} (highest sales)")
print(f"• Investigate low performance on {weekly_avg.idxmin()}")
weekend_strategy = ('Premium pricing' if weekend_avg > weekday_avg
                    else 'Promotion campaigns')
print(f"• Weekend strategy needed: {weekend_strategy}")
avg_transactions = df['transaction_count'].mean()
print(f"• Inventory planning: Stock for ~{avg_transactions:.0f} daily transactions")
target_revenue = last_7_avg * 1.1
print(f"• Revenue target: Aim for ${target_revenue:.0f}/day (+10% improvement)")

# Create final forecast visualization
fig = go.Figure()

# Historical data
fig.add_trace(go.Scatter(
    x=df['date'], y=df['sales'],
    mode='lines', name='Historical Sales',
    line={'color': 'blue', 'width': 2}
))

# Moving averages
fig.add_trace(go.Scatter(
    x=df['date'], y=df['ma_7'],
    mode='lines', name='7-day MA',
    line={'color': 'red', 'width': 1, 'dash': 'dash'}
))

# Forecast
forecast_sales = [last_7_avg] * forecast_horizon
fig.add_trace(go.Scatter(
    x=forecast_dates, y=forecast_sales,
    mode='lines+markers', name='Forecast',
    line={'color': 'green', 'width': 3, 'dash': 'dot'}
))

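# Draw the marker line and its label separately: in some plotly versions,
# add_vline(..., annotation_text=...) raises a TypeError on datetime axes.
fig.add_vline(x=last_date, line_dash="dash")
fig.add_annotation(x=last_date, y=1, yref="paper",
                   text="Forecast Start", showarrow=False, yanchor="bottom")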

fig.update_layout(
    title='Coffee Sales Forecast - Next 7 Days',
    xaxis_title='Date',
    yaxis_title='Daily Sales ($)',
    height=500
)

fig.show()

print("\nTime series analysis complete! Use these insights for:")
print("• Daily operations planning")
print("• Revenue optimization")
print("• Staff scheduling")
print("• Inventory management")
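
The forecast above is a flat projection of the trailing 7-day average, and the cell does not measure how accurate that projection would have been. One common check is a holdout backtest: hide the last week, forecast it from the earlier data, and compare. The sketch below does this for the three window sizes used in the notebook. It runs on synthetic data, and the `train`, `test`, and `scores` names are illustrative rather than part of the original analysis.

```python
import numpy as np
import pandas as pd

# Synthetic daily sales standing in for df['sales'] (illustrative only).
rng = np.random.default_rng(42)
n_days = 60
sales = pd.Series(
    200 + 15 * np.sin(2 * np.pi * np.arange(n_days) / 7) + rng.normal(0, 10, n_days),
    index=pd.date_range("2024-01-01", periods=n_days, freq="D"),
)

# Hold out the final week; everything before it is training data.
horizon = 7
train, test = sales.iloc[:-horizon], sales.iloc[-horizon:]

results = {}
for window in (3, 7, 14):
    # Flat forecast: repeat the trailing mean of the training data.
    forecast = np.full(horizon, train.tail(window).mean())
    errors = test.to_numpy() - forecast
    results[window] = {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
    }

scores = pd.DataFrame(results).T.rename_axis("window")
print(scores.round(2))
```

A single holdout week is a small sample, so the ranking of windows can change from one week to the next; repeating the split over several cut-off dates gives a steadier picture.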

---

## Summary and Next Steps

### **What You've Accomplished:**
- **Time Series Mastery**: Analyzed real coffee sales data with temporal patterns
- **Forecasting Skills**: Implemented moving average techniques for business predictions
- **Trend Analysis**: Identified seasonal patterns and growth trajectories
- **Statistical Tooling**: Imported `adfuller` and `seasonal_decompose` so stationarity tests and decomposition can be applied in follow-up work
- **Business Strategy**: Generated actionable forecasts for operational planning

### **Key Time Series Concepts Mastered:**
1. **Data Preparation**: Aggregation and time-based feature engineering
2. **Pattern Recognition**: Weekly, monthly, and seasonal trend identification
3. **Forecasting Methods**: Moving averages, exponential smoothing principles
4. **Model Validation**: Forecast accuracy measurement (MAE and RMSE are imported; applying them to a holdout period is the natural next step)
5. **Business Application**: Revenue predictions and operational recommendations
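
To make the difference between the two forecasting methods in item 3 concrete, the sketch below computes a simple moving average and an exponential moving average side by side. It also shows that pandas' `ewm(span=...)` corresponds to the recursion `ema[t] = alpha * x[t] + (1 - alpha) * ema[t-1]` with `alpha = 2 / (span + 1)` when `adjust=False` is set. The input values are made up for illustration. Note that the notebook cell calls `ewm(span=window)` with the default `adjust=True`, which reweights the earliest observations and so differs slightly in the first few rows.

```python
import numpy as np
import pandas as pd

# Illustrative values only, not taken from the coffee dataset.
sales = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0, 105.0, 115.0, 125.0])
span = 3
alpha = 2 / (span + 1)  # smoothing factor implied by span in pandas' ewm

# Simple moving average: equal weights over the last `span` observations.
sma = sales.rolling(window=span).mean()

# Exponential moving average via the explicit recursion
# ema[t] = alpha * x[t] + (1 - alpha) * ema[t-1], seeded with the first value.
ema_manual = [sales.iloc[0]]
for x in sales.iloc[1:]:
    ema_manual.append(alpha * x + (1 - alpha) * ema_manual[-1])
ema_manual = pd.Series(ema_manual, index=sales.index)

# pandas reproduces this recursion when adjust=False.
ema_pandas = sales.ewm(span=span, adjust=False).mean()

print(pd.DataFrame({"sales": sales, "sma": sma, "ema": ema_pandas}).round(2))
print("Recursion matches pandas:", np.allclose(ema_manual, ema_pandas))
```

The simple average has no value until a full window is available, while the exponential average starts immediately and reacts faster to recent changes.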

### ☕ **Coffee Sales Insights Discovered:**
- **Temporal Patterns**: Peak and lowest sales days, plus the best and worst days of the week
- **Forecasting Baseline**: A flat 7-day revenue forecast from the trailing 7-day average, suitable as a planning baseline once validated
- **Seasonal Trends**: Weekend vs weekday performance differences
- **Growth Trajectory**: Historical trends informing future business strategy
- **Operational Intelligence**: Data-driven recommendations for inventory and staffing
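
Because weekend and weekday sales differ, a flat forecast assigns the same value to every day and misses that pattern. One simple refinement, sketched below, scales the trailing average by a day-of-week index (each weekday's mean divided by the overall mean). This is not the method used in the notebook cells; the synthetic data, the 40-unit weekend boost, and the `dow_index` name are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data: 8 full weeks starting on a Monday, with higher weekend sales.
rng = np.random.default_rng(1)
dates = pd.date_range("2024-01-01", periods=56, freq="D")
weekend_boost = np.where(dates.dayofweek >= 5, 40.0, 0.0)
sales = pd.Series(200 + weekend_boost + rng.normal(0, 5, 56), index=dates)

# Day-of-week index: each weekday's mean relative to the overall mean.
dow_index = sales.groupby(sales.index.dayofweek).mean() / sales.mean()

# Scale the flat trailing-7-day average by the index of each forecast day.
base_level = sales.tail(7).mean()
forecast_dates = pd.date_range(dates[-1] + pd.Timedelta(days=1), periods=7, freq="D")
forecast = pd.Series(
    base_level * dow_index.reindex(forecast_dates.dayofweek).to_numpy(),
    index=forecast_dates,
)
print(forecast.round(1))
```

With whole weeks of history the index averages to 1, so the weekly total of the adjusted forecast equals that of the flat forecast; only its distribution across days changes.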

### **Next Learning Paths:**

#### **Advanced Time Series Techniques:**
- **ARIMA Models**: `notebooks/tier3_timeseries/Tier3_ARIMA.ipynb` - Sophisticated forecasting
- **Exponential Smoothing**: `Tier3_ExponentialSmoothing.ipynb` - Handle seasonality better
- **Spectral Analysis**: `Tier3_FourierAnalysis.ipynb` - Frequency domain insights

#### **Complementary Analytics Skills:**
- **Anomaly Detection**: `notebooks/tier6_anomaly/Tier6_StatAnomaly.ipynb` - Spot unusual sales patterns
- **Machine Learning**: Apply regression models to time series prediction
- **Clustering**: Group similar time periods for targeted strategies

### 🏢 **Business Applications:**
- **Revenue Forecasting**: Build quarterly and annual financial projections
- **Inventory Management**: Optimize stock levels based on demand predictions
- **Staffing Optimization**: Schedule employees according to predicted busy periods
- **Marketing Timing**: Launch campaigns during forecasted peak demand periods

### **Professional Skills Developed:**
- **Business Intelligence**: Time series dashboards and automated reporting
- **Financial Planning**: Revenue forecasting for budgeting and investment decisions
- **Operations Research**: Data-driven optimization of business processes
- **Risk Management**: Scenario planning using forecast confidence intervals

### 🔮 **Forecasting Best Practices:**
- **Model Selection**: Choose appropriate techniques based on data characteristics
- **Validation Methods**: Proper train/test splitting for temporal data
- **Uncertainty Quantification**: Communicate forecast confidence to stakeholders
- **Continuous Monitoring**: Update models as new data becomes available
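
As one way to approach the uncertainty quantification point above, past one-step-ahead errors of the moving-average forecast can be turned into a rough empirical interval. The sketch below assumes a 7-day window and an 80% band, and it runs on synthetic data; both choices and the variable names are illustrative. It covers only the next day, since errors for longer horizons are typically larger.

```python
import numpy as np
import pandas as pd

# Synthetic daily sales around a stable level (illustrative only).
rng = np.random.default_rng(7)
n_days = 90
sales = pd.Series(
    200 + rng.normal(0, 12, n_days),
    index=pd.date_range("2024-01-01", periods=n_days, freq="D"),
)

window = 7
# One-step-ahead forecast for day t is the mean of days t-7 .. t-1;
# shift(1) guarantees the forecast never sees the value it predicts.
one_step_forecast = sales.rolling(window=window).mean().shift(1)
residuals = (sales - one_step_forecast).dropna()

# Empirical 10th and 90th percentiles of past errors give a rough 80% band.
lower_err, upper_err = np.quantile(residuals, [0.10, 0.90])

next_day_point = sales.tail(window).mean()
print(f"Point forecast: {next_day_point:.1f}")
print(f"Approx. 80% interval: {next_day_point + lower_err:.1f} "
      f"to {next_day_point + upper_err:.1f}")
```

The `shift(1)` step is what keeps the evaluation honest for temporal data: each forecast is built only from observations that precede the day being predicted.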

### **Expert-Level Applications:**
- **Multi-Series Forecasting**: Predict multiple related time series simultaneously
- **External Variables**: Incorporate weather, holidays, economic indicators
- **Real-Time Updates**: Build streaming analytics pipelines for live forecasting
- **Hierarchical Forecasting**: Aggregate forecasts across product lines and regions

---

> **Complete Your Analytics Journey**: You've now mastered descriptive analytics, machine learning, and time series forecasting - the core pillars of data science!

---

*Excellent work completing the Time Series Analysis example! You're now equipped with professional forecasting skills essential for business analytics and strategic planning.*