# Module 04: Moving Averages - The Complete Math

**Difficulty**: ⭐⭐ (Intermediate)

**Estimated Time**: 60 minutes

**Prerequisites**: 
- Module 00: Introduction and Stock Returns
- Module 01: Averages and Central Tendency
- Module 02: Spread and Variation
- Module 03: Percentages, Ratios, and Changes

## Learning Objectives

By the end of this notebook, you will be able to:
1. Calculate **Simple Moving Average (SMA)** step-by-step
2. Understand and compute **Weighted Moving Average (WMA)**
3. Calculate **Exponential Moving Average (EMA)** with smoothing constant
4. Explain **why EMA reacts faster than SMA** mathematically
5. Analyze **lag** and **smoothing** effects of different MA types
6. Apply moving averages to **Malaysian stock trading** decisions

## Why This Matters

**Moving averages are the foundation of trend analysis.**

They appear in virtually every technical indicator:
- **MACD** → Difference between two EMAs
- **Bollinger Bands** → SMA ± 2 standard deviations
- **Stochastic** → SMA of %K to create %D
- **ADX** → Moving averages of directional movement
- **Golden Cross/Death Cross** → SMA crossover signals

Understanding the mathematics of moving averages is essential for:
1. **Choosing the right MA type** for your strategy
2. **Optimizing periods** (10, 20, 50, 200-day)
3. **Understanding lag** and timing of signals
4. **Creating custom indicators** based on MAs

---

## Setup

Let's import libraries and download Malaysian stock data.

In [None]:
# Data manipulation and numerical operations
import pandas as pd
import numpy as np

# Data acquisition
import yfinance as yf

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Display settings
%matplotlib inline
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

# Pandas display options
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 20)
pd.set_option('display.precision', 4)

# Random seed for reproducibility
np.random.seed(42)

print("✓ Libraries imported successfully!")

In [None]:
# Download Malaysian stock data
print("Downloading Malaysian stock data...\n")

# Maybank - stable banking stock (for smooth trends)
maybank = yf.download('1155.KL', start='2023-01-01', end='2024-01-01', progress=False)

# Top Glove - volatile healthcare stock (for testing MA responsiveness)
topglove = yf.download('7113.KL', start='2023-01-01', end='2024-01-01', progress=False)  # 7113.KL is Top Glove (5225.KL is IHH Healthcare)

# Validate data
assert len(maybank) > 0, "Failed to download Maybank data"
assert len(topglove) > 0, "Failed to download Top Glove data"

print(f"✓ Maybank: {len(maybank)} days")
print(f"✓ Top Glove: {len(topglove)} days")
print("\nData ready for analysis!")

---

## Part 1: Simple Moving Average (SMA)

### What is SMA?

**Simple Moving Average (SMA)** is the arithmetic mean of the last $n$ prices.

### Formula

$$
SMA_n = \frac{P_1 + P_2 + P_3 + \cdots + P_n}{n} = \frac{1}{n}\sum_{i=1}^{n}P_i
$$

Where:
- $P_i$ = Price at day $i$
- $n$ = Number of periods (e.g., 10, 20, 50 days)

### Key Properties

1. **Equal weighting** - All prices have the same importance
2. **Lag** - Trails the current price during trends (the cost of smoothing)
3. **Stable** - Less sensitive to short-term fluctuations
4. **Simple** - Easy to calculate and understand

In [None]:
# Simple example: 5-day SMA calculation
prices = np.array([8.50, 8.55, 8.45, 8.60, 8.52])

print("=" * 70)
print("SIMPLE MOVING AVERAGE (SMA) - MANUAL CALCULATION")
print("=" * 70)

print(f"\nPrices: {prices}")
print(f"\nCalculating 5-day SMA:")
print(f"\nStep 1: Sum all prices")
total = 0
for i, price in enumerate(prices, 1):
    total += price
    print(f"  Day {i}: {price:.2f} → Running total: {total:.2f}")

print(f"\nStep 2: Divide by number of periods (n = 5)")
sma = total / len(prices)
print(f"  SMA = {total:.2f} / {len(prices)} = {sma:.4f}")

# Verify with numpy
sma_numpy = np.mean(prices)
print(f"\nVerification with NumPy: {sma_numpy:.4f}")
print(f"Difference: {abs(sma - sma_numpy):.10f}")
print("\n✓ Calculations match!")

### Rolling SMA - How It Updates

As new prices arrive, the SMA "rolls forward" by:
1. Adding the newest price
2. Dropping the oldest price
3. Recalculating the average
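
The three steps above mean the average never has to be recomputed from scratch: the running total changes by one addition and one subtraction per new price. A minimal sketch of that update, where `rolling_sma` is a hypothetical helper name introduced here for illustration:

```python
from collections import deque

def rolling_sma(prices, n):
    """Yield the n-period SMA for each full window, updating in O(1) per price."""
    window = deque()
    total = 0.0
    for price in prices:
        window.append(price)
        total += price                 # 1. add the newest price
        if len(window) > n:
            total -= window.popleft()  # 2. drop the oldest price
        if len(window) == n:
            yield total / n            # 3. recalculate the average

smas = list(rolling_sma([8.50, 8.55, 8.45, 8.60, 8.52, 8.58, 8.62], 5))
print([round(s, 4) for s in smas])
```

For long series a running total can accumulate small floating-point error, which is one reason library implementations such as `rolling().mean()` are usually the safer choice in practice.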

In [None]:
# Demonstrate rolling SMA
prices_extended = np.array([8.50, 8.55, 8.45, 8.60, 8.52, 8.58, 8.62])

print("Rolling 5-Day SMA Demonstration\n")
print("=" * 80)

for i in range(4, len(prices_extended)):
    # Get last 5 prices
    window = prices_extended[i-4:i+1]
    sma = np.mean(window)
    
    print(f"\nDay {i+1}:")
    print(f"  Window: {window}")
    print(f"  SMA = ({' + '.join([f'{p:.2f}' for p in window])}) / 5")
    print(f"  SMA = {sma:.4f}")
    print(f"  Current Price: {prices_extended[i]:.2f}")
    print(f"  Difference: {prices_extended[i] - sma:+.4f} (Price {'above' if prices_extended[i] > sma else 'below'} SMA)")

print("\n" + "=" * 80)
print("Notice: As prices change, the SMA smoothly follows the trend.")

In [None]:
# Calculate SMA for Maybank using different periods
maybank_close = maybank['Close'].squeeze()  # recent yfinance may return a one-column DataFrame; squeeze() yields a Series

# Calculate SMAs with different periods
sma_10 = maybank_close.rolling(window=10).mean()
sma_20 = maybank_close.rolling(window=20).mean()
sma_50 = maybank_close.rolling(window=50).mean()

print("SMA Calculation for Maybank (2023)\n")
print("=" * 70)
print(f"\nCurrent Price: RM {maybank_close.iloc[-1]:.4f}")
print(f"\n10-day SMA: RM {sma_10.iloc[-1]:.4f}")
print(f"20-day SMA: RM {sma_20.iloc[-1]:.4f}")
print(f"50-day SMA: RM {sma_50.iloc[-1]:.4f}")

print(f"\nPrice vs SMAs:")
print(f"  vs 10-day: {maybank_close.iloc[-1] - sma_10.iloc[-1]:+.4f} ({((maybank_close.iloc[-1] / sma_10.iloc[-1] - 1) * 100):+.2f}%)")
print(f"  vs 20-day: {maybank_close.iloc[-1] - sma_20.iloc[-1]:+.4f} ({((maybank_close.iloc[-1] / sma_20.iloc[-1] - 1) * 100):+.2f}%)")
print(f"  vs 50-day: {maybank_close.iloc[-1] - sma_50.iloc[-1]:+.4f} ({((maybank_close.iloc[-1] / sma_50.iloc[-1] - 1) * 100):+.2f}%)")

In [None]:
# Visualize multiple SMAs
plt.figure(figsize=(14, 8))

# Plot price
plt.plot(maybank_close.index, maybank_close.values, 
         linewidth=1.5, label='Price', color='black', alpha=0.7, zorder=5)

# Plot SMAs
plt.plot(sma_10.index, sma_10.values, 
         linewidth=2, label='SMA 10', alpha=0.8)
plt.plot(sma_20.index, sma_20.values, 
         linewidth=2, label='SMA 20', alpha=0.8)
plt.plot(sma_50.index, sma_50.values, 
         linewidth=2, label='SMA 50', alpha=0.8)

plt.title('Maybank - Price with Multiple SMAs (2023)', fontsize=14, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price (RM)', fontsize=12)
plt.legend(loc='best', fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("Observations:")
print("• Shorter SMAs (10-day) follow price more closely")
print("• Longer SMAs (50-day) are smoother but have more lag")
print("• All SMAs are behind the current price (lag effect)")
print("• SMAs act as dynamic support/resistance levels")

---

## Part 2: Weighted Moving Average (WMA)

### What is WMA?

**Weighted Moving Average (WMA)** gives more weight to recent prices.

### Formula

$$
WMA_n = \frac{1 \cdot P_1 + 2 \cdot P_2 + 3 \cdot P_3 + \cdots + n \cdot P_n}{1 + 2 + 3 + \cdots + n}
$$

Simplified:

$$
WMA_n = \frac{\sum_{i=1}^{n} i \cdot P_i}{\sum_{i=1}^{n} i} = \frac{\sum_{i=1}^{n} i \cdot P_i}{\frac{n(n+1)}{2}}
$$

Where:
- $P_1$ = Oldest price in the window (weight 1)
- $P_n$ = Most recent price in the window (weight $n$)

### Key Properties

1. **Linear weighting** - Most recent price gets highest weight
2. **Faster response** than SMA (but still lags)
3. **More complex** to calculate than SMA
4. **Arithmetic decay** - Weights decrease linearly

In [None]:
# Example: 5-day WMA calculation
prices = np.array([8.50, 8.55, 8.45, 8.60, 8.52])

print("=" * 70)
print("WEIGHTED MOVING AVERAGE (WMA) - MANUAL CALCULATION")
print("=" * 70)

print(f"\nPrices (oldest to newest): {prices}")
print(f"\nWeights for 5-day WMA: [1, 2, 3, 4, 5]")
print(f"(Oldest price gets weight 1, newest gets weight 5)")

# Calculate weights
n = len(prices)
weights = np.arange(1, n + 1)

print(f"\nStep 1: Multiply each price by its weight")
weighted_sum = 0
for i, (price, weight) in enumerate(zip(prices, weights), 1):
    weighted_price = price * weight
    weighted_sum += weighted_price
    print(f"  Day {i}: {price:.2f} × {weight} = {weighted_price:.2f}")

print(f"\nSum of weighted prices: {weighted_sum:.2f}")

print(f"\nStep 2: Calculate sum of weights")
sum_weights = np.sum(weights)
print(f"  Sum of weights = {' + '.join(map(str, weights))} = {sum_weights}")
print(f"  Formula: n(n+1)/2 = 5(6)/2 = {n*(n+1)//2}")

print(f"\nStep 3: Divide weighted sum by sum of weights")
wma = weighted_sum / sum_weights
print(f"  WMA = {weighted_sum:.2f} / {sum_weights} = {wma:.4f}")

# Compare with SMA
sma = np.mean(prices)
print(f"\nComparison:")
print(f"  WMA: {wma:.4f}")
print(f"  SMA: {sma:.4f}")
print(f"  Difference: {wma - sma:+.4f}")
print(f"\nNotice: WMA is {'higher' if wma > sma else 'lower'} because recent prices have more weight")

In [None]:
# Implement WMA function
def calculate_wma(prices, period):
    """
    Calculate Weighted Moving Average.
    
    Parameters:
    -----------
    prices : pd.Series or np.array
        Price data
    period : int
        Number of periods
    
    Returns:
    --------
    pd.Series
        WMA values
    """
    weights = np.arange(1, period + 1)
    sum_weights = weights.sum()
    
    def wma_calc(window):
        if len(window) < period:
            return np.nan
        return np.sum(weights * window) / sum_weights
    
    return prices.rolling(window=period).apply(wma_calc, raw=True)

# Calculate WMA for Maybank
wma_10 = calculate_wma(maybank_close, 10)
wma_20 = calculate_wma(maybank_close, 20)

print("WMA vs SMA Comparison for Maybank\n")
print("=" * 70)
print(f"\nCurrent Price: RM {maybank_close.iloc[-1]:.4f}")
print(f"\n10-day WMA: RM {wma_10.iloc[-1]:.4f}")
print(f"10-day SMA: RM {sma_10.iloc[-1]:.4f}")
print(f"Difference: {wma_10.iloc[-1] - sma_10.iloc[-1]:+.4f}")

print(f"\n20-day WMA: RM {wma_20.iloc[-1]:.4f}")
print(f"20-day SMA: RM {sma_20.iloc[-1]:.4f}")
print(f"Difference: {wma_20.iloc[-1] - sma_20.iloc[-1]:+.4f}")

---

## Part 3: Exponential Moving Average (EMA)

### What is EMA?

**Exponential Moving Average (EMA)** gives exponentially decreasing weights to older prices.

### Formula

$$
\begin{aligned}
EMA_{today} &= \alpha \cdot P_{today} + (1 - \alpha) \cdot EMA_{yesterday} \\
\text{where: } \alpha &= \frac{2}{n + 1}
\end{aligned}
$$

Where:
- $\alpha$ = Smoothing factor (0 < α < 1)
- $P_{today}$ = Today's price
- $EMA_{yesterday}$ = Yesterday's EMA
- $n$ = Number of periods

### Key Properties

1. **Exponential weighting** - Recent prices matter more
2. **Fast response** - Reacts to a new price more strongly than an SMA of the same period
3. **Recursive** - Uses previous EMA in calculation
4. **Continuous** - Never completely drops old prices
5. **Most popular** - Used in MACD, many strategies
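
Unrolling the recursion shows where the "exponential" weights come from. The derivation below is the standard expansion; $P_t$ (today's price), $P_{t-k}$ (the price $k$ days ago), and $EMA_0$ (the seed value) are labels introduced here for illustration.

$$
\begin{aligned}
EMA_t &= \alpha P_t + (1-\alpha)\,EMA_{t-1} \\
      &= \alpha P_t + \alpha(1-\alpha) P_{t-1} + (1-\alpha)^2\,EMA_{t-2} \\
      &= \alpha \sum_{k=0}^{t-1} (1-\alpha)^k P_{t-k} + (1-\alpha)^t\,EMA_0
\end{aligned}
$$

The price from $k$ days ago carries weight $\alpha(1-\alpha)^k$, which shrinks geometrically but never reaches zero. That is why the EMA never completely drops an old price.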

In [None]:
# Understanding the smoothing factor (alpha)
print("=" * 70)
print("EXPONENTIAL MOVING AVERAGE (EMA) - SMOOTHING FACTOR")
print("=" * 70)

periods = [10, 20, 50, 100, 200]

print(f"\n{'Period':>8} | {'Alpha (α)':>12} | {'Interpretation':^40}")
print("-" * 70)

for n in periods:
    alpha = 2 / (n + 1)
    weight_today = alpha * 100
    weight_past = (1 - alpha) * 100
    
    interpretation = f"{weight_today:.1f}% today, {weight_past:.1f}% past EMA"
    print(f"{n:8d} | {alpha:12.6f} | {interpretation:^40}")

print("\n" + "=" * 70)
print("Key Insights:")
print("• Smaller α (longer period) = More weight on past → Smoother")
print("• Larger α (shorter period) = More weight on today → More reactive")
print("• 10-day EMA: 18.2% weight on today's price")
print("• 200-day EMA: Only 1.0% weight on today's price")
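
Because the weights form a geometric series, it is possible to ask how much of the total weight the most recent $n$ prices carry. A small sketch, assuming the same α = 2/(n + 1) convention as above (`ema_weight_covered` is a hypothetical helper name):

```python
def ema_weight_covered(n, days):
    """Share of total EMA weight carried by the most recent `days` prices."""
    alpha = 2 / (n + 1)
    # Weights alpha*(1-alpha)^k for k = 0..days-1 sum to 1 - (1-alpha)^days.
    return 1 - (1 - alpha) ** days

for n in [10, 20, 50, 200]:
    share = ema_weight_covered(n, n)
    print(f"{n:>3}-day EMA: last {n} prices carry {share:.1%} of the weight")
```

For these periods the share comes out at roughly 86%, so about 14% of an "n-day" EMA still comes from prices older than n days.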

In [None]:
# Manual EMA calculation step-by-step
prices = np.array([8.50, 8.55, 8.45, 8.60, 8.52, 8.58, 8.62, 8.65, 8.70, 8.68])
period = 5
alpha = 2 / (period + 1)

print("EMA Calculation - Step by Step (5-day EMA)\n")
print("=" * 80)
print(f"\nSmoothing factor (α) = 2 / (5 + 1) = {alpha:.4f}")
print(f"This means: {alpha*100:.2f}% weight on today, {(1-alpha)*100:.2f}% weight on past EMA")

# Start with SMA for first EMA value
ema_values = []
ema = np.mean(prices[:period])
ema_values.append(ema)

print(f"\nInitial EMA (using SMA of first 5 days):")
print(f"  Prices: {prices[:period]}")
print(f"  EMA[0] = SMA = {ema:.4f}")

print(f"\nNow calculate EMA recursively:")
print(f"Formula: EMA[today] = α × Price[today] + (1 - α) × EMA[yesterday]\n")

for i in range(period, len(prices)):
    price_today = prices[i]
    ema_yesterday = ema
    ema = alpha * price_today + (1 - alpha) * ema_yesterday
    ema_values.append(ema)
    
    print(f"Day {i+1}:")
    print(f"  Price today: {price_today:.2f}")
    print(f"  EMA yesterday: {ema_yesterday:.4f}")
    print(f"  EMA = {alpha:.4f} × {price_today:.2f} + {1-alpha:.4f} × {ema_yesterday:.4f}")
    print(f"  EMA = {alpha * price_today:.4f} + {(1-alpha) * ema_yesterday:.4f}")
    print(f"  EMA = {ema:.4f}\n")

print("=" * 80)
print(f"Final EMA: {ema:.4f}")
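
The manual calculation seeds the EMA with an SMA, while pandas' `ewm(..., adjust=False)` starts from the first price. The two therefore disagree at first and then converge. The sketch below compares them on the same ten prices; `demo_prices` and `ema_recursive` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

demo_prices = pd.Series([8.50, 8.55, 8.45, 8.60, 8.52, 8.58, 8.62, 8.65, 8.70, 8.68])
period = 5
alpha = 2 / (period + 1)

def ema_recursive(values, alpha, seed):
    """Apply EMA[t] = alpha*P[t] + (1-alpha)*EMA[t-1], starting from `seed`."""
    out, ema = [], seed
    for p in values:
        ema = alpha * p + (1 - alpha) * ema
        out.append(ema)
    return np.array(out)

# Seed A: SMA of the first `period` prices, then recurse over the rest.
sma_seed = demo_prices.iloc[:period].mean()
ema_sma_seed = ema_recursive(demo_prices.iloc[period:], alpha, sma_seed)

# Seed B: pandas with adjust=False starts from the first price itself.
ema_pandas = demo_prices.ewm(span=period, adjust=False).mean()

# Both follow the same recursion from here on, so the gap decays by (1 - alpha) per day.
gap = np.abs(ema_sma_seed - ema_pandas.iloc[period:].to_numpy())
print(gap.round(5))
```

When comparing EMA values against a charting platform, a mismatch in the first few dozen bars is often just a different seeding convention rather than a calculation error.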

In [None]:
# Calculate EMA for Maybank
ema_10 = maybank_close.ewm(span=10, adjust=False).mean()
ema_20 = maybank_close.ewm(span=20, adjust=False).mean()
ema_50 = maybank_close.ewm(span=50, adjust=False).mean()

print("EMA Calculation for Maybank (2023)\n")
print("=" * 70)
print(f"\nCurrent Price: RM {maybank_close.iloc[-1]:.4f}")
print(f"\n10-day EMA: RM {ema_10.iloc[-1]:.4f}")
print(f"20-day EMA: RM {ema_20.iloc[-1]:.4f}")
print(f"50-day EMA: RM {ema_50.iloc[-1]:.4f}")

print(f"\nComparing EMA vs SMA (10-day):")
print(f"  EMA: RM {ema_10.iloc[-1]:.4f}")
print(f"  SMA: RM {sma_10.iloc[-1]:.4f}")
print(f"  Difference: {ema_10.iloc[-1] - sma_10.iloc[-1]:+.4f}")
print(f"\n  → EMA is {'closer to' if abs(ema_10.iloc[-1] - maybank_close.iloc[-1]) < abs(sma_10.iloc[-1] - maybank_close.iloc[-1]) else 'farther from'} current price")

In [None]:
# Visualize EMA vs SMA
plt.figure(figsize=(14, 8))

# Plot price
plt.plot(maybank_close.index, maybank_close.values, 
         linewidth=1.5, label='Price', color='black', alpha=0.7, zorder=5)

# Plot SMAs
plt.plot(sma_20.index, sma_20.values, 
         linewidth=2, label='SMA 20', linestyle='--', alpha=0.7)
plt.plot(sma_50.index, sma_50.values, 
         linewidth=2, label='SMA 50', linestyle='--', alpha=0.7)

# Plot EMAs
plt.plot(ema_20.index, ema_20.values, 
         linewidth=2, label='EMA 20', alpha=0.9)
plt.plot(ema_50.index, ema_50.values, 
         linewidth=2, label='EMA 50', alpha=0.9)

plt.title('Maybank - EMA vs SMA Comparison (2023)', fontsize=14, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price (RM)', fontsize=12)
plt.legend(loc='best', fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("Observations:")
print("• EMAs (solid lines) react faster to price changes than SMAs (dashed)")
print("• EMAs are closer to current price (less lag)")
print("• During trends, EMAs turn direction sooner than SMAs")
print("• Both types smooth out noise, but EMA is more responsive")

---

## Part 4: Comparing MA Types - Responsiveness

### Why EMA Reacts Faster - The Mathematics

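Let's analyze the **average age** of the prices inside each MA type (the weighted mean of "days ago"), which is a standard measure of lag:

**SMA**: Equal weights → Average age = (n-1)/2 periods

**WMA**: Linear weights → Average age = (n-1)/3 periods

**EMA**: Exponential weights with α = 2/(n+1) → Average age = (1-α)/α = (n-1)/2 periods

So a 20-day EMA has the same average age as a 20-day SMA (9.5 days); the α = 2/(n+1) convention is chosen to make them match. The EMA still reacts faster to a *new* price because it puts α = 2/(n+1) of its weight on today, roughly twice the SMA's 1/n, and it never jumps when an old price drops out of the window.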
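
One way to put a number on lag is the weighted average age of the prices inside each average. The sketch below computes it numerically for n = 20; the 2000-term horizon used to approximate the EMA's infinite tail is an arbitrary choice made here.

```python
import numpy as np

n = 20
age = np.arange(n)                      # 0 = today, n-1 = oldest price in the window

sma_w = np.full(n, 1 / n)               # equal weights
wma_w = (n - age) / (n - age).sum()     # newest price gets weight n, oldest gets 1

alpha = 2 / (n + 1)
long_age = np.arange(2000)              # long horizon to approximate the infinite EMA tail
ema_w = alpha * (1 - alpha) ** long_age

avg_age = {
    "SMA": float(np.sum(age * sma_w)),
    "WMA": float(np.sum(age * wma_w)),
    "EMA": float(np.sum(long_age * ema_w)),
}
for name, value in avg_age.items():
    print(f"{name}: average age = {value:.2f} days")
```

The closed forms are (n-1)/2 for the SMA, (n-1)/3 for the WMA, and (1-α)/α for the EMA, which also equals (n-1)/2 when α = 2/(n+1).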

In [None]:
# Visualize weight distribution for different MA types
period = 20

# Generate weights
days = np.arange(1, period + 1)

# SMA weights (equal)
sma_weights = np.ones(period) / period

# WMA weights (linear)
wma_weights = days / days.sum()

# EMA weights (exponential)
alpha = 2 / (period + 1)
ema_weights = np.array([alpha * (1 - alpha)**i for i in range(period)])
ema_weights = ema_weights / ema_weights.sum()  # Normalize the truncated weights (the true EMA tail is infinite)

# Plot
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# SMA
axes[0].bar(days, sma_weights, color='blue', alpha=0.7, edgecolor='black')
axes[0].set_title('SMA Weights (Equal)', fontsize=12, fontweight='bold')
axes[0].set_xlabel('Days Ago')
axes[0].set_ylabel('Weight')
axes[0].grid(True, alpha=0.3, axis='y')
axes[0].set_ylim(0, max(ema_weights) * 1.1)

# WMA
axes[1].bar(days, wma_weights[::-1], color='orange', alpha=0.7, edgecolor='black')
axes[1].set_title('WMA Weights (Linear)', fontsize=12, fontweight='bold')
axes[1].set_xlabel('Days Ago')
axes[1].set_ylabel('Weight')
axes[1].grid(True, alpha=0.3, axis='y')
axes[1].set_ylim(0, max(ema_weights) * 1.1)

# EMA
axes[2].bar(days, ema_weights, color='green', alpha=0.7, edgecolor='black')
axes[2].set_title('EMA Weights (Exponential)', fontsize=12, fontweight='bold')
axes[2].set_xlabel('Days Ago')
axes[2].set_ylabel('Weight')
axes[2].grid(True, alpha=0.3, axis='y')
axes[2].set_ylim(0, max(ema_weights) * 1.1)

plt.tight_layout()
plt.show()

print("Weight Distribution (20-day MA):")
print(f"\nSMA:")
print(f"  Most recent day: {sma_weights[0]*100:.2f}%")
print(f"  Oldest day: {sma_weights[-1]*100:.2f}%")

print(f"\nWMA:")
print(f"  Most recent day: {wma_weights[-1]*100:.2f}%")
print(f"  Oldest day: {wma_weights[0]*100:.2f}%")

print(f"\nEMA:")
print(f"  Most recent day: {ema_weights[0]*100:.2f}%")
print(f"  10 days ago: {ema_weights[9]*100:.2f}%")
print(f"  20 days ago: {ema_weights[19]*100:.2f}%")

print(f"\nKey Insight:")
print(f"EMA gives {ema_weights[0]/sma_weights[0]:.1f}x more weight to today than SMA!")

In [None]:
# Test responsiveness on volatile stock (Top Glove)
topglove_close = topglove['Close'].squeeze()  # recent yfinance may return a one-column DataFrame; squeeze() yields a Series

# Calculate all three types
tg_sma_20 = topglove_close.rolling(window=20).mean()
tg_wma_20 = calculate_wma(topglove_close, 20)
tg_ema_20 = topglove_close.ewm(span=20, adjust=False).mean()

# Visualize on volatile stock
plt.figure(figsize=(14, 8))

plt.plot(topglove_close.index, topglove_close.values, 
         linewidth=1.5, label='Price', color='black', alpha=0.8, zorder=5)
plt.plot(tg_sma_20.index, tg_sma_20.values, 
         linewidth=2, label='SMA 20', alpha=0.8, linestyle='--')
plt.plot(tg_wma_20.index, tg_wma_20.values, 
         linewidth=2, label='WMA 20', alpha=0.8, linestyle='-.')
plt.plot(tg_ema_20.index, tg_ema_20.values, 
         linewidth=2, label='EMA 20', alpha=0.9)

plt.title('Top Glove - MA Type Comparison on Volatile Stock (2023)', 
          fontsize=14, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price (RM)', fontsize=12)
plt.legend(loc='best', fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("Responsiveness Comparison on Top Glove (volatile stock):\n")
print("Typical order of responsiveness for the same period (check against the chart):")
print("1. WMA - Lowest average lag, about (n-1)/3 periods")
print("2. EMA - Same weight on today as WMA, but a long tail of older prices")
print("3. SMA - Slowest to react to a new price, equal weights, smoothest")
print("\nBest use cases:")
print("• EMA: Short-term trading, quick signals")
print("• WMA: Balanced approach, less common")
print("• SMA: Long-term trends, support/resistance levels")

---

## Part 5: Lag Analysis and Crossover Signals

### Understanding Lag

**Lag** is the delay between a price change and the MA's response.

- **More lag** = Smoother, fewer false signals, but later entries/exits
- **Less lag** = More responsive, earlier signals, but more whipsaws

### Golden Cross and Death Cross

- **Golden Cross**: Short MA crosses above long MA (bullish)
- **Death Cross**: Short MA crosses below long MA (bearish)

Common pairs: 50/200, 20/50, 10/20
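
A crossover is a change of sign in "short MA minus long MA" from one bar to the next. The sketch below shows the detection logic on a small synthetic V-shaped series, so the result can be checked by hand; `find_crossovers` and the toy data are assumptions made for illustration, not part of the notebook's dataset.

```python
import pandas as pd

def find_crossovers(short_ma, long_ma):
    """Return (golden, death) boolean Series marking the bar of each cross."""
    # Comparisons involving NaN evaluate to False, so warm-up bars never trigger.
    golden = (short_ma > long_ma) & (short_ma.shift(1) <= long_ma.shift(1))
    death = (short_ma < long_ma) & (short_ma.shift(1) >= long_ma.shift(1))
    return golden, death

# Synthetic prices: a steady decline followed by a steady rise.
toy = pd.Series([10, 9, 8, 7, 6, 5, 6, 7, 8, 9, 10, 11], dtype=float)
short_ma = toy.rolling(window=2).mean()
long_ma = toy.rolling(window=4).mean()

golden, death = find_crossovers(short_ma, long_ma)
print("Golden cross at index:", golden[golden].index.tolist())
print("Death cross at index: ", death[death].index.tolist())
```

The price bottoms at index 5, but the cross is only flagged two bars later. That delay is the lag the section describes, and it grows with the MA periods.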

In [None]:
# Detect crossover signals
# Using 20-day and 50-day SMAs for Maybank

# Golden Cross: SMA20 crosses above SMA50
golden_cross = (sma_20 > sma_50) & (sma_20.shift(1) <= sma_50.shift(1))

# Death Cross: SMA20 crosses below SMA50
death_cross = (sma_20 < sma_50) & (sma_20.shift(1) >= sma_50.shift(1))

# Visualize crossovers
plt.figure(figsize=(14, 10))

# Price subplot
plt.subplot(2, 1, 1)
plt.plot(maybank_close.index, maybank_close.values, 
         linewidth=1.5, label='Price', color='black', alpha=0.7)
plt.plot(sma_20.index, sma_20.values, 
         linewidth=2, label='SMA 20', alpha=0.8)
plt.plot(sma_50.index, sma_50.values, 
         linewidth=2, label='SMA 50', alpha=0.8)

# Mark crossovers on price chart
for date in maybank_close[golden_cross].index:
    plt.axvline(x=date, color='green', linestyle=':', alpha=0.5, linewidth=1.5)
for date in maybank_close[death_cross].index:
    plt.axvline(x=date, color='red', linestyle=':', alpha=0.5, linewidth=1.5)

plt.title('Maybank - Price with SMA Crossover Signals', fontsize=12, fontweight='bold')
plt.ylabel('Price (RM)')
plt.legend(loc='best')
plt.grid(True, alpha=0.3)

# MA difference subplot
plt.subplot(2, 1, 2)
ma_diff = sma_20 - sma_50
plt.plot(ma_diff.index, ma_diff.values, linewidth=2, color='purple')
plt.axhline(y=0, color='black', linestyle='--', linewidth=2, alpha=0.7)
plt.fill_between(ma_diff.index, 0, ma_diff.values, 
                 where=(ma_diff >= 0), alpha=0.3, color='green', label='Bullish (SMA20 > SMA50)')
plt.fill_between(ma_diff.index, 0, ma_diff.values, 
                 where=(ma_diff < 0), alpha=0.3, color='red', label='Bearish (SMA20 < SMA50)')

# Mark crossovers
plt.scatter(maybank_close[golden_cross].index, ma_diff[golden_cross].values,
           marker='^', s=200, color='green', zorder=5, 
           edgecolors='black', linewidths=2, label='Golden Cross')
plt.scatter(maybank_close[death_cross].index, ma_diff[death_cross].values,
           marker='v', s=200, color='red', zorder=5, 
           edgecolors='black', linewidths=2, label='Death Cross')

plt.title('SMA Difference (SMA20 - SMA50)', fontsize=12, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Difference (RM)')
plt.legend(loc='best')
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print crossover events
print("Crossover Signals for Maybank (2023):\n")
print(f"Golden Crosses (bullish): {golden_cross.sum()}")
if golden_cross.sum() > 0:
    print("Dates:")
    for date in maybank_close[golden_cross].index:
        price = maybank_close.loc[date]
        print(f"  {date.strftime('%Y-%m-%d')}: Price = RM {price:.2f}")

print(f"\nDeath Crosses (bearish): {death_cross.sum()}")
if death_cross.sum() > 0:
    print("Dates:")
    for date in maybank_close[death_cross].index:
        price = maybank_close.loc[date]
        print(f"  {date.strftime('%Y-%m-%d')}: Price = RM {price:.2f}")

---

## Part 6: Exercises

Time to practice! Complete these exercises to master moving averages.

### Exercise 1: Calculate Triple EMA (TEMA)

TEMA is a more responsive moving average that applies EMA three times.

**Formula**:
$$
TEMA = 3 \times EMA_1 - 3 \times EMA_2 + EMA_3
$$

Where:
- $EMA_1$ = EMA of price
- $EMA_2$ = EMA of $EMA_1$
- $EMA_3$ = EMA of $EMA_2$
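
The coefficients 3, -3, 1 are chosen so that the lags cancel on a steady trend: if one EMA pass trails a straight-line price by $L$, the second trails by $2L$ and the third by $3L$, and $3(t-L) - 3(t-2L) + (t-3L) = t$. A quick check on a synthetic ramp (the slope and starting price are arbitrary values assumed here):

```python
import numpy as np
import pandas as pd

slope = 0.01
ramp = pd.Series(8.0 + slope * np.arange(500))   # synthetic price rising RM 0.01 per day
span = 20

ema1 = ramp.ewm(span=span, adjust=False).mean()
ema2 = ema1.ewm(span=span, adjust=False).mean()
ema3 = ema2.ewm(span=span, adjust=False).mean()
tema = 3 * ema1 - 3 * ema2 + ema3

ema_lag = ramp.iloc[-1] - ema1.iloc[-1]
tema_lag = ramp.iloc[-1] - tema.iloc[-1]
print(f"EMA lag:  RM {ema_lag:.4f}")
print(f"TEMA lag: RM {tema_lag:.4f}")
```

On real prices the cancellation is only approximate, and the price paid for less lag is overshoot around sharp turns.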

**Tasks**:
1. Calculate 20-day TEMA for Top Glove
2. Compare TEMA with regular EMA and SMA
3. Visualize all three on the same chart
4. Explain which is most responsive

In [None]:
# Your code here



<details>
<summary><b>Click here for solution</b></summary>

```python
# Calculate TEMA for Top Glove
topglove_close = topglove['Close'].squeeze()  # squeeze() guards against a one-column DataFrame from recent yfinance
period = 20

# Step 1: EMA of price
ema1 = topglove_close.ewm(span=period, adjust=False).mean()

# Step 2: EMA of EMA
ema2 = ema1.ewm(span=period, adjust=False).mean()

# Step 3: EMA of EMA of EMA
ema3 = ema2.ewm(span=period, adjust=False).mean()

# Step 4: Calculate TEMA
tema = 3 * ema1 - 3 * ema2 + ema3

# Also calculate SMA and regular EMA for comparison
sma_20 = topglove_close.rolling(window=period).mean()
ema_20 = topglove_close.ewm(span=period, adjust=False).mean()

# Visualize
plt.figure(figsize=(14, 8))

plt.plot(topglove_close.index, topglove_close.values,
         linewidth=1.5, label='Price', color='black', alpha=0.7, zorder=5)
plt.plot(sma_20.index, sma_20.values,
         linewidth=2, label='SMA 20', linestyle='--', alpha=0.7)
plt.plot(ema_20.index, ema_20.values,
         linewidth=2, label='EMA 20', alpha=0.8)
plt.plot(tema.index, tema.values,
         linewidth=2, label='TEMA 20', alpha=0.9)

plt.title('Top Glove - TEMA vs EMA vs SMA Comparison', fontsize=14, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Price (RM)')
plt.legend(loc='best')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate current distances from price
current_price = topglove_close.iloc[-1]
sma_dist = abs(current_price - sma_20.iloc[-1])
ema_dist = abs(current_price - ema_20.iloc[-1])
tema_dist = abs(current_price - tema.iloc[-1])

print("Responsiveness Comparison:")
print(f"\nCurrent Price: RM {current_price:.4f}")
print(f"\nDistance from current price:")
print(f"  SMA 20:  RM {sma_dist:.4f}")
print(f"  EMA 20:  RM {ema_dist:.4f}")
print(f"  TEMA 20: RM {tema_dist:.4f}")
print(f"\nRanking by distance from the latest price (a rough, single-day proxy for responsiveness):")
mas = [('TEMA', tema_dist), ('EMA', ema_dist), ('SMA', sma_dist)]
mas_sorted = sorted(mas, key=lambda x: x[1])
for i, (name, dist) in enumerate(mas_sorted, 1):
    print(f"  {i}. {name} (distance: RM {dist:.4f})")
```
</details>

### Exercise 2: Optimize MA Period

Find the optimal SMA period for Maybank that minimizes lag while filtering noise.

**Tasks**:
1. Calculate SMAs for periods from 5 to 100 days (step = 5)
2. For each period, calculate mean absolute error vs price
3. Plot error vs period
4. Identify the "sweet spot" period
5. Explain the trade-off between short and long periods

In [None]:
# Your code here



<details>
<summary><b>Click here for solution</b></summary>

```python
# Test different MA periods
maybank_close = maybank['Close'].squeeze()  # squeeze() guards against a one-column DataFrame from recent yfinance
periods = range(5, 101, 5)

results = []

for period in periods:
    # Calculate SMA
    sma = maybank_close.rolling(window=period).mean()
    
    # Calculate mean absolute error (excluding NaN)
    mae = (maybank_close - sma).abs().mean()
    
    # Calculate standard deviation of errors (measure of consistency)
    std_error = (maybank_close - sma).std()
    
    results.append({
        'period': period,
        'mae': mae,
        'std_error': std_error
    })

results_df = pd.DataFrame(results)

# Visualize
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# MAE vs Period
axes[0].plot(results_df['period'], results_df['mae'], 
            linewidth=3, marker='o', markersize=6, color='blue')
axes[0].set_title('Mean Absolute Error vs MA Period', fontsize=12, fontweight='bold')
axes[0].set_xlabel('Period (days)')
axes[0].set_ylabel('Mean Absolute Error (RM)')
axes[0].grid(True, alpha=0.3)

# Std Error vs Period
axes[1].plot(results_df['period'], results_df['std_error'], 
            linewidth=3, marker='o', markersize=6, color='red')
axes[1].set_title('Error Volatility vs MA Period', fontsize=12, fontweight='bold')
axes[1].set_xlabel('Period (days)')
axes[1].set_ylabel('Error Standard Deviation (RM)')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

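# Find the lowest-MAE period (MAE alone almost always favors the shortest period,
# because short SMAs hug price; weigh it against the smoothness trade-off below)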
optimal_idx = results_df['mae'].idxmin()
optimal_period = results_df.loc[optimal_idx, 'period']
optimal_mae = results_df.loc[optimal_idx, 'mae']

print("MA Period Optimization Results:")
print(f"\nOptimal Period: {optimal_period} days")
print(f"Mean Absolute Error: RM {optimal_mae:.4f}")
print(f"\nTrade-offs:")
print(f"  Short periods (5-20):  Low lag, follows price closely, more noise")
print(f"  Medium periods (20-50): Balanced, good for trend following")
print(f"  Long periods (50-100): Very smooth, good support/resistance, high lag")
print(f"\nRecommendation for Maybank:")
print(f"  Short-term trading: 10-20 day MA")
print(f"  Medium-term: 20-50 day MA")
print(f"  Long-term: 50-200 day MA")
```
</details>

### Exercise 3: Build MACD from Scratch

MACD (Moving Average Convergence Divergence) is built entirely from EMAs.

**Formula**:
- MACD Line = EMA(12) - EMA(26)
- Signal Line = EMA(9) of MACD Line
- Histogram = MACD Line - Signal Line

**Tasks**:
1. Calculate MACD components for Maybank
2. Create a proper MACD visualization (price + MACD + histogram)
3. Identify bullish and bearish crossovers
4. Explain what MACD measures

In [None]:
# Your code here



<details>
<summary><b>Click here for solution</b></summary>

```python
# Calculate MACD components
maybank_close = maybank['Close'].squeeze()  # squeeze() guards against a one-column DataFrame from recent yfinance

# Step 1: Calculate fast and slow EMAs
ema_12 = maybank_close.ewm(span=12, adjust=False).mean()
ema_26 = maybank_close.ewm(span=26, adjust=False).mean()

# Step 2: Calculate MACD line
macd_line = ema_12 - ema_26

# Step 3: Calculate Signal line (9-day EMA of MACD)
signal_line = macd_line.ewm(span=9, adjust=False).mean()

# Step 4: Calculate Histogram
histogram = macd_line - signal_line

# Identify crossovers
bullish_cross = (macd_line > signal_line) & (macd_line.shift(1) <= signal_line.shift(1))
bearish_cross = (macd_line < signal_line) & (macd_line.shift(1) >= signal_line.shift(1))

# Visualize
fig, axes = plt.subplots(2, 1, figsize=(14, 10), height_ratios=[2, 1])

# Price chart
axes[0].plot(maybank_close.index, maybank_close.values,
            linewidth=1.5, label='Price', color='black', alpha=0.7)
axes[0].plot(ema_12.index, ema_12.values,
            linewidth=1.5, label='EMA 12', alpha=0.7, linestyle='--')
axes[0].plot(ema_26.index, ema_26.values,
            linewidth=1.5, label='EMA 26', alpha=0.7, linestyle='--')

# Mark crossovers on price chart
for date in maybank_close[bullish_cross].index:
    axes[0].axvline(x=date, color='green', linestyle=':', alpha=0.3, linewidth=1.5)
for date in maybank_close[bearish_cross].index:
    axes[0].axvline(x=date, color='red', linestyle=':', alpha=0.3, linewidth=1.5)

axes[0].set_title('Maybank - Price with EMA 12/26', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Price (RM)')
axes[0].legend(loc='best')
axes[0].grid(True, alpha=0.3)

# MACD chart
axes[1].plot(macd_line.index, macd_line.values,
            linewidth=2, label='MACD Line', color='blue')
axes[1].plot(signal_line.index, signal_line.values,
            linewidth=2, label='Signal Line', color='red')

# Histogram
colors = ['green' if h > 0 else 'red' for h in histogram]
axes[1].bar(histogram.index, histogram.values, 
           width=1, color=colors, alpha=0.3, label='Histogram')

# Mark crossovers
axes[1].scatter(macd_line[bullish_cross].index, macd_line[bullish_cross].values,
               marker='^', s=150, color='green', zorder=5,
               edgecolors='black', linewidths=2, label='Bullish Cross')
axes[1].scatter(macd_line[bearish_cross].index, macd_line[bearish_cross].values,
               marker='v', s=150, color='red', zorder=5,
               edgecolors='black', linewidths=2, label='Bearish Cross')

axes[1].axhline(y=0, color='black', linestyle='-', linewidth=1, alpha=0.5)
axes[1].set_title('MACD Indicator', fontsize=12, fontweight='bold')
axes[1].set_xlabel('Date')
axes[1].set_ylabel('MACD Value')
axes[1].legend(loc='best', fontsize=9)
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("MACD Analysis for Maybank:")
print(f"\nCurrent Values:")
print(f"  MACD Line:   {macd_line.iloc[-1]:.4f}")
print(f"  Signal Line: {signal_line.iloc[-1]:.4f}")
print(f"  Histogram:   {histogram.iloc[-1]:.4f}")

print(f"\nSignals:")
print(f"  Bullish crossovers: {bullish_cross.sum()}")
print(f"  Bearish crossovers: {bearish_cross.sum()}")

print(f"\nWhat MACD Measures:")
print(f"  • MACD Line: Difference between fast (12) and slow (26) EMAs")
print(f"  • Signal Line: Smoothed MACD (9-day EMA)")
print(f"  • Histogram: Momentum strength (MACD - Signal)")
print(f"  • Crossovers: Trend change signals")
print(f"  • Zero line: Trend direction (above = bullish, below = bearish)")
```
</details>

### Exercise 4: Envelope Channels

Create envelope channels using SMAs with percentage bands.

**Formula**:
- Middle Band = SMA(20)
- Upper Band = SMA(20) × (1 + percentage/100)
- Lower Band = SMA(20) × (1 - percentage/100)

**Tasks**:
1. Calculate 20-day SMA envelope with ±5% bands for Top Glove
2. Identify when price touches or breaks bands
3. Compare with Bollinger Bands (from Module 02)
4. Explain the difference between percentage bands and standard deviation bands

In [None]:
# Your code here



<details>
<summary><b>Click here for solution</b></summary>

```python
# Calculate envelope channels
topglove_close = topglove['Close']
period = 20
percentage = 5  # ±5%

# Calculate middle band (SMA)
middle_band = topglove_close.rolling(window=period).mean()

# Calculate envelope bands
upper_band = middle_band * (1 + percentage / 100)
lower_band = middle_band * (1 - percentage / 100)

# Calculate Bollinger Bands for comparison
bb_std = topglove_close.rolling(window=period).std()
bb_upper = middle_band + (2 * bb_std)
bb_lower = middle_band - (2 * bb_std)

# Identify touches
touch_upper = topglove_close >= upper_band
touch_lower = topglove_close <= lower_band

# Visualize
plt.figure(figsize=(14, 10))

# Envelope subplot
plt.subplot(2, 1, 1)
plt.plot(topglove_close.index, topglove_close.values,
        linewidth=1.5, label='Price', color='black', alpha=0.7, zorder=5)
plt.plot(middle_band.index, middle_band.values,
        linewidth=2, label='SMA 20 (Middle)', color='blue', linestyle='--')
plt.plot(upper_band.index, upper_band.values,
        linewidth=1.5, label=f'Upper Band (+{percentage}%)', color='red', linestyle=':')
plt.plot(lower_band.index, lower_band.values,
        linewidth=1.5, label=f'Lower Band (-{percentage}%)', color='green', linestyle=':')
plt.fill_between(middle_band.index, lower_band, upper_band, alpha=0.1, color='gray')

# Mark touches
plt.scatter(topglove_close[touch_upper].index, topglove_close[touch_upper].values,
           color='red', s=30, alpha=0.5, label='Upper band touch')
plt.scatter(topglove_close[touch_lower].index, topglove_close[touch_lower].values,
           color='green', s=30, alpha=0.5, label='Lower band touch')

plt.title(f'Top Glove - Envelope Channels (±{percentage}%)', fontsize=12, fontweight='bold')
plt.ylabel('Price (RM)')
plt.legend(loc='best', fontsize=9)
plt.grid(True, alpha=0.3)

# Bollinger Bands subplot for comparison
plt.subplot(2, 1, 2)
plt.plot(topglove_close.index, topglove_close.values,
        linewidth=1.5, label='Price', color='black', alpha=0.7, zorder=5)
plt.plot(middle_band.index, middle_band.values,
        linewidth=2, label='SMA 20', color='blue', linestyle='--')
plt.plot(bb_upper.index, bb_upper.values,
        linewidth=1.5, label='Upper BB (SMA + 2σ)', color='red', linestyle=':')
plt.plot(bb_lower.index, bb_lower.values,
        linewidth=1.5, label='Lower BB (SMA - 2σ)', color='green', linestyle=':')
plt.fill_between(middle_band.index, bb_lower, bb_upper, alpha=0.1, color='gray')

plt.title('Top Glove - Bollinger Bands (for comparison)', fontsize=12, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Price (RM)')
plt.legend(loc='best', fontsize=9)
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

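# Statistics (percentages count only days where the 20-day SMA is defined,
# so the first period-1 warm-up days do not dilute the touch rate)
valid_days = middle_band.notna().sum()
print("Envelope Channel Analysis:")
print("\nTouches:")
print(f"  Upper band: {touch_upper.sum()} times ({touch_upper.sum()/valid_days*100:.1f}%)")
print(f"  Lower band: {touch_lower.sum()} times ({touch_lower.sum()/valid_days*100:.1f}%)")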

print("\nDifference: Envelope vs Bollinger Bands:")
print(f"  Envelope: Fixed percentage bands (±{percentage}%)")
print("    - Constant width relative to SMA")
print("    - Does NOT adapt to volatility")
print("    - Simpler to calculate")
print("  Bollinger Bands: Standard deviation bands")
print("    - Width varies with volatility")
print("    - Adapts to market conditions")
print("    - Needs a rolling standard deviation in addition to the SMA")
print("\nUse cases:")
print("  Envelopes: Stable markets, mean reversion strategies")
print("  Bollinger: Volatile markets, breakout detection")
```
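
To see the difference in numbers rather than on a chart, the sketch below measures each channel's width as a percentage of the middle band. It uses synthetic prices (an assumption for illustration, not Top Glove data) with a calm first half and a volatile second half, and all variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic prices (assumption, not real Top Glove data):
# 250 calm days followed by 250 volatile days.
rng = np.random.default_rng(0)
daily_returns = np.concatenate([rng.normal(0, 0.005, 250),
                                rng.normal(0, 0.03, 250)])
prices = pd.Series(5.0 * np.exp(np.cumsum(daily_returns)))

period, percentage = 20, 5
sma = prices.rolling(window=period).mean()
std = prices.rolling(window=period).std()

# Envelope: bands are a fixed percentage of the SMA
env_upper = sma * (1 + percentage / 100)
env_lower = sma * (1 - percentage / 100)

# Bollinger: bands are 2 rolling standard deviations from the SMA
bb_upper = sma + 2 * std
bb_lower = sma - 2 * std

# Channel width as a percentage of the middle band
envelope_width_pct = (env_upper - env_lower) / sma * 100
bollinger_width_pct = (bb_upper - bb_lower) / sma * 100

for label, part in [("Calm half", slice(0, 250)), ("Volatile half", slice(250, 500))]:
    print(f"{label}: envelope {envelope_width_pct.iloc[part].mean():.1f}% wide, "
          f"Bollinger {bollinger_width_pct.iloc[part].mean():.1f}% wide")
```

The envelope width stays at 2 × percentage (10% here) in both halves, while the Bollinger width should grow once volatility rises.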
</details>

---

## Summary

Congratulations! You've completed Module 04. Let's review what you mastered:

### Key Concepts Mastered

1. **Simple Moving Average (SMA)**
   - Formula: Average of last N prices
   - Equal weighting for all periods
   - Smoothest, most lag
   - Best for: Long-term trends, support/resistance

2. **Weighted Moving Average (WMA)**
   - Linear weighting (recent prices weighted more)
   - More responsive than SMA
   - Less common in practice
   - Best for: Balanced approach

3. **Exponential Moving Average (EMA)**
   - Exponential weighting with smoothing factor α = 2/(n+1)
   - Most responsive to recent changes
   - Recursive calculation
   - Best for: Short-term trading, quick signals

4. **Mathematical Differences**
   - SMA: All prices equal weight
   - WMA: Linear decay in weights
   - EMA: Exponential decay in weights
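   - With α = 2/(n+1), EMA's average data age is (n-1)/2 periods, the same as an n-period SMA; EMA reacts sooner because its newest-price weight is 2/(n+1) rather than 1/n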

5. **Lag and Responsiveness**
   - Longer periods = More smoothing, more lag
   - Shorter periods = Less lag, more noise
   - At the same period, EMA and WMA respond to a sudden price move sooner than SMA

6. **Practical Applications**
   - Golden Cross / Death Cross (MA crossovers)
   - Dynamic support and resistance
   - MACD (based on EMAs)
   - Bollinger Bands (based on SMA)
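
The weighting schemes in point 4 can be compared directly. The sketch below builds the weight each average places on prices of age 0, 1, 2, ... (age 0 is the newest price) and reports the weighted average age, which is one common way to quantify lag. The function names are illustrative, and the EMA's infinite tail is truncated at 2,000 terms, an assumption that is harmless for typical periods.

```python
import numpy as np

def sma_weights(n):
    """Equal weight 1/n on each of the last n prices."""
    return np.full(n, 1.0 / n)

def wma_weights(n):
    """Linear weights n, n-1, ..., 1 (newest first), normalized to sum to 1."""
    w = np.arange(n, 0, -1, dtype=float)
    return w / w.sum()

def ema_weights(n, terms=2000):
    """Weights alpha * (1 - alpha)^k with alpha = 2/(n+1), truncated after `terms`."""
    alpha = 2.0 / (n + 1)
    return alpha * (1 - alpha) ** np.arange(terms)

def average_age(weights):
    """Weighted mean age of the prices in the average (0 = newest price)."""
    ages = np.arange(len(weights))
    return float(np.sum(ages * weights) / np.sum(weights))

n = 20
for name, w in [("SMA", sma_weights(n)), ("WMA", wma_weights(n)), ("EMA", ema_weights(n))]:
    print(f"{name}({n}): newest-price weight = {w[0]:.4f}, "
          f"average age = {average_age(w):.2f} periods")
```

By this measure the SMA and the EMA both sit at (n-1)/2 periods and the WMA at (n-1)/3. The EMA and WMA each put 2/(n+1) on the newest price, about twice the SMA's 1/n, which is why they move sooner when a new price arrives.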

### How This Connects to Technical Indicators

Moving averages are fundamental to:
- **MACD**: EMA(12) - EMA(26)
- **Bollinger Bands**: SMA(20) ± 2σ
- **Stochastic %D**: SMA of %K
- **ADX**: Multiple moving averages of directional movement
- **Ichimoku Cloud**: Multiple period averages

### What's Next?

In **Module 05: Momentum Mathematics**, you'll learn:
- RSI calculation in depth
- Stochastic Oscillator complete math
- ROC and momentum indicators
- Williams %R
- Combining momentum with moving averages

### Additional Practice

Before Module 05, try:
1. Calculate the Hull Moving Average (HMA): HMA(n) = WMA of (2 × WMA(n/2) - WMA(n)) over √n periods, with n/2 and √n rounded to whole numbers
2. Test different MA crossover strategies
3. Analyze lag for different periods empirically
4. Create adaptive MAs that change period based on volatility
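
As a possible starting point for practice item 3, the sketch below measures lag on a synthetic step: the price jumps from 1.0 to 2.0 and the code counts how many days each average needs to cover half of the move. The helper `days_to_half_step` and the step series are illustrative assumptions, not part of the earlier notebook cells, and real price data will behave less cleanly.

```python
import numpy as np
import pandas as pd

def sma(prices, n):
    return prices.rolling(window=n).mean()

def ema(prices, n):
    # adjust=False gives the recursive form with alpha = 2/(n+1)
    return prices.ewm(span=n, adjust=False).mean()

def days_to_half_step(ma_func, n, warmup=300, horizon=300):
    """Days after a 1.0 -> 2.0 price step until the MA first reaches 1.5.

    A result of 1 means the MA got there on the first day at the new price.
    """
    prices = pd.Series(np.r_[np.ones(warmup), np.full(horizon, 2.0)])
    after_step = ma_func(prices, n).iloc[warmup:].to_numpy()
    return int(np.argmax(after_step >= 1.5)) + 1

for n in (10, 20, 50):
    print(f"n={n}: SMA needs {days_to_half_step(sma, n)} days, "
          f"EMA needs {days_to_half_step(ema, n)} days")
```

To extend it, pass in a WMA function with the same `(prices, n)` signature, or replace the step with a steady ramp to see how the ranking depends on the kind of price move.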

---

## Additional Resources

### Further Reading
- [Investopedia: Moving Average](https://www.investopedia.com/terms/m/movingaverage.asp)
- [Investopedia: EMA](https://www.investopedia.com/terms/e/ema.asp)
- [Investopedia: MACD](https://www.investopedia.com/terms/m/macd.asp)
- [StockCharts: Moving Averages](https://school.stockcharts.com/doku.php?id=technical_indicators:moving_averages)

### Technical Papers
- Appel, Gerald (1979). "The Moving Average Convergence Divergence Method"
- Kaufman, Perry (2013). "Trading Systems and Methods" - Chapter on Moving Averages

### Python Documentation
- [Pandas rolling](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html)
- [Pandas ewm](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.ewm.html)

---

**Excellent work!** You now understand the mathematics behind the most fundamental technical analysis tool. Ready for Module 05?