<a href="https://colab.research.google.com/github/jhenningsen/Equity_Analysis/blob/main/EMA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## EMA
---
### Quantitative Analysis of Dual EMA Crossover Strategies

This notebook implements a quantitative analysis pipeline to evaluate the performance of **Exponential Moving Average (EMA) crossovers** for the input ticker. It begins by downloading historical price data from Yahoo Finance and cleaning it into a "flat" format suitable for time-series manipulation.

The analysis computes **dual EMAs** (a short-term and a long-term average) and identifies "sign changes" where the faster EMA crosses above (**Bullish**) or below (**Bearish**) the slower EMA. Nested loops iterate over a grid of lookback combinations (Short: 3–9 days; Long: 10–21 days), so the short lookback is always strictly less than the long lookback.
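
As a minimal sketch of the sign-change detection described above, the snippet below applies the same logic to a short made-up price series. The prices and the spans (2 and 4) are assumptions chosen to keep the example small; they are not taken from the notebook's data.

```python
import numpy as np
import pandas as pd

# Toy closing prices (illustrative values, not market data)
close = pd.Series([10.0, 10.5, 11.0, 10.0, 9.0, 8.5, 9.5, 11.0, 12.0])

# Fast and slow EMAs; spans 2 and 4 are assumptions for this example only
ema_fast = close.ewm(span=2, adjust=False).mean()
ema_slow = close.ewm(span=4, adjust=False).mean()

# +1 when the fast EMA is above the slow EMA, -1 when below, 0 when equal
sign = np.sign(ema_fast - ema_slow)

# A crossover is a row whose sign differs from the previous row's sign
prev = sign.shift(1)
crossed = (sign != prev) & prev.notna()

events = pd.DataFrame({'Close': close, 'Sign': sign})[crossed]
print(events)
```

With `adjust=False` both EMAs start at the first close, so the initial sign is 0 and the first nonzero sign (row 1 here) is also reported as a change.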



For each crossover event, the script calculates the **percentage return** from the close at that signal to the close at the next signal. Returns are signed by trade direction:
* **Bullish Crossovers:** Calculates returns for a Long position.
* **Bearish Crossovers:** Calculates returns for a Short position.

As written, the yearly aggregation in `analyze_ema_crossover` keeps only bearish crossovers (`Crossover_sign == -1`); change the filter to `1` to track bullish crossovers instead.
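
The direction adjustment can be illustrated on a few hypothetical crossover events. The prices below are invented for the example; the column names follow the ones used later in the notebook.

```python
import pandas as pd

# Hypothetical crossover events: closing price and direction at each signal
events = pd.DataFrame({
    'Close': [100.0, 110.0, 99.0, 108.9],
    'Crossover_sign': [1, -1, 1, -1],
})

# Raw percentage change from each signal's close to the next signal's close
raw = (events['Close'].shift(-1) - events['Close']) / events['Close']

# Multiplying by the sign turns a price drop after a bearish signal into a gain
events['Next_Close_Return'] = raw * events['Crossover_sign']
print(events)
```

Each of the first three events earns roughly 10%: the price rises after the bullish signals and falls after the bearish one. The last event has no following signal, so its return is `NaN`.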

The final output is a pivoted summary table, filtered for the **top 10 performing combinations**, providing a year-by-year comparison of strategy effectiveness. Combinations are ranked by the sum of their per-trade returns across all years (returns are added, not compounded).
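
A small sketch of the pivot-and-rank step, assuming a long-format table with one row per year and lookback pair. The numbers are invented, and only the top 2 combinations are kept so the output stays readable.

```python
import pandas as pd

# Hypothetical long-format results: one row per (year, lookback pair)
toy = pd.DataFrame({
    'Year': [2020, 2020, 2020, 2021, 2021, 2021],
    'EMA_Combo': ['3 / 10', '4 / 12', '5 / 15'] * 2,
    'Next_Close_Return': [0.10, -0.05, 0.02, 0.03, 0.08, -0.04],
})

# Years as rows, lookback pairs as columns
pivot = toy.pivot(index='Year', columns='EMA_Combo', values='Next_Close_Return')

# Sum each column across years, then keep the best-ranked column names
totals = pivot.sum()
top_2 = totals.sort_values(ascending=False).head(2).index

print(pivot[top_2])
```

Keeping `totals` as a separate Series, rather than appending it as a row, leaves the pivot's index purely numeric; the notebook itself appends a `Total_Return` row instead.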

---

In [42]:
# Import necessary libraries
import pandas as pd
import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt

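# Define the ticker, date range, and lookback grids
# Note: the outputs displayed below show SVXY, i.e. they were saved from a run with ticker = 'SVXY'
ticker = 'QQQ'
start_date = '2000-01-01'
end_date = '2025-12-31'
window = 5
short_lookback_range = [3, 9]    # inclusive bounds for the short EMA span
long_lookback_range = [10, 21]   # inclusive bounds for the long EMA span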

# Download historical data from Yahoo Finance for a single ticker.
# This will result in a DataFrame with 'Date' as a simple index.
data = yf.download(ticker, start=start_date, end=end_date, auto_adjust=True)

# Use reset_index() to convert the 'Date' index into a column.
data = data.reset_index()

# Now, to get a new DataFrame with just the 'Price' level, we can use droplevel()
# This removes the 'Ticker' level from the columns, leaving only the 'Price' level.
data = data.droplevel(level='Ticker', axis=1)

# Now, add the 'Ticker' column at position 1 (right after the 'Date' column).
data.insert(1, 'Ticker', ticker)

# The DataFrame is now a flat table with no MultiIndex.
display(data)

[*********************100%***********************]  1 of 1 completed


,Date,Ticker,Close,High,Low,Open,Volume
0,2011-10-04,SVXY,10.525000,10.525000,9.825000,9.872500,81200
1,2011-10-05,SVXY,11.347500,11.410000,10.862500,10.882500,35600
2,2011-10-06,SVXY,11.582500,11.582500,11.200000,11.357500,22400
3,2011-10-07,SVXY,11.672500,11.797500,11.195000,11.797500,72800
4,2011-10-10,SVXY,12.150000,12.150000,11.850000,11.875000,60000
...,...,...,...,...,...,...,...
3574,2025-12-19,SVXY,54.110001,54.169998,53.369999,53.380001,847500
3575,2025-12-22,SVXY,54.990002,55.049999,54.520000,54.520000,863100
3576,2025-12-23,SVXY,54.919998,55.090000,54.849998,54.869999,797100
3577,2025-12-24,SVXY,54.930000,55.110001,54.840000,54.959999,549300


In [43]:
def calculate_ema(df, lookback):
    """
    Calculates an exponential moving average (EMA) of the closing price.

    Args:
        df (pd.DataFrame): The input DataFrame with a 'Close' column.
        lookback (int): The span (number of periods) of the EMA.

    Returns:
        pd.DataFrame: A copy of df with two new columns: 'EMA' and
            'EMA_sign' (+1 when Close is above the EMA, -1 when below, 0 when equal).
    """

    df_ema = df.copy()

    # .ewm() calculates the exponentially weighted moving average
    # span=lookback plays the role of the 'window' in an SMA (alpha = 2 / (span + 1))
    # adjust=False uses the recursive form common in technical analysis
    df_ema['EMA'] = df_ema['Close'].ewm(span=lookback, adjust=False).mean()

    # Sign of the difference between the Close price and the EMA
    df_ema['EMA_sign'] = np.sign(df_ema['Close'] - df_ema['EMA'])

    return df_ema
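
To make the `adjust=False` behaviour concrete, the sketch below compares pandas' result with a hand-written recursion on a few made-up prices. The series and the span of 3 are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Toy closing prices (illustrative values, not market data)
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])
span = 3
alpha = 2 / (span + 1)  # smoothing factor implied by the span

# Recursive EMA: start at the first price, then blend each new price
# with the previous EMA value
manual = [close.iloc[0]]
for price in close.iloc[1:]:
    manual.append(alpha * price + (1 - alpha) * manual[-1])

pandas_ema = close.ewm(span=span, adjust=False).mean()

# True when the manual recursion matches pandas' result
print(np.allclose(manual, pandas_ema.to_numpy()))
```

With `span=3` the smoothing factor is 0.5, so each EMA value is the midpoint between the new price and the previous EMA value.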



In [44]:
def analyze_ema_crossover(df, short_lookback, long_lookback):
    """
    Analyzes the impact of EMA crossovers (Short EMA crossing Long EMA).

    Args:
        df (pd.DataFrame): The input DataFrame with 'Close' and 'Date'.
        short_lookback (int): Period for the faster EMA.
        long_lookback (int): Period for the slower EMA.

    Returns:
        pd.DataFrame: Yearly sum of returns triggered by crossover signals.
    """

    # 1. Validation Check
    if short_lookback >= long_lookback:
        raise ValueError(f"Short lookback ({short_lookback}) must be smaller than long lookback ({long_lookback}).")

    df_ema = df.copy()

    # 2. Calculate both EMAs
    df_ema['EMA_short'] = df_ema['Close'].ewm(span=short_lookback, adjust=False).mean()
    df_ema['EMA_long'] = df_ema['Close'].ewm(span=long_lookback, adjust=False).mean()

    # 3. Define the signal (Short > Long is 1, Short < Long is -1)
    df_ema['Crossover_sign'] = np.sign(df_ema['EMA_short'] - df_ema['EMA_long'])

    # 4. Identify where the sign changes
    previous_sign = df_ema['Crossover_sign'].shift(1)
    sign_changed_mask = (df_ema['Crossover_sign'] != previous_sign) & (~previous_sign.isna())

    sign_changes_only = df_ema[sign_changed_mask].copy()

    # 5. Calculate returns forward to the next crossover signal
    # We multiply by the Crossover_sign so the return reflects trade direction:
    # If sign is 1 (Bullish): (Future - Current) / Current -> Standard Long Profit
    # If sign is -1 (Bearish): (Current - Future) / Current -> Standard Short Profit
    # The last signal has no following crossover, so its return is NaN
    # and is skipped by the yearly sum.
    sign_changes_only.loc[:, 'Next_Close_Return'] = (
        (sign_changes_only['Close'].shift(-1) - sign_changes_only['Close']) / sign_changes_only['Close']
    ) * sign_changes_only['Crossover_sign']
    # 6. Time formatting and Year extraction
    sign_changes_only['Date'] = pd.to_datetime(sign_changes_only['Date'])
    sign_changes_only['Year'] = sign_changes_only['Date'].dt.year

    # 7. Filter for specific signals (e.g., Short crossing BELOW Long)
    # Change to 1 if you want to track returns when the short crosses ABOVE the long
    df_filtered = sign_changes_only[sign_changes_only['Crossover_sign'] == -1].copy()

    # 8. Aggregate by Year
    yearly_results = df_filtered.groupby('Year')['Next_Close_Return'].sum().reset_index()

    # Add metadata for the lookbacks used
    yearly_results['Short_Lookback'] = short_lookback
    yearly_results['Long_Lookback'] = long_lookback

    return yearly_results[['Year', 'Short_Lookback', 'Long_Lookback', 'Next_Close_Return']]
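
The function above reports bearish signals only. If both directions are wanted side by side, one possible approach (an assumption, not part of the notebook) is to group by year and sign instead of filtering. The sketch uses a hypothetical per-event table with the same column names as `sign_changes_only`.

```python
import pandas as pd

# Hypothetical per-event table shaped like sign_changes_only
events = pd.DataFrame({
    'Year': [2020, 2020, 2020, 2021, 2021],
    'Crossover_sign': [1, -1, 1, -1, 1],
    'Next_Close_Return': [0.04, -0.02, 0.01, 0.03, 0.05],
})

# Sum returns per year and direction, then spread the directions into columns
by_direction = (
    events.groupby(['Year', 'Crossover_sign'])['Next_Close_Return']
    .sum()
    .unstack('Crossover_sign')
    .rename(columns={1: 'Bullish', -1: 'Bearish'})
)
print(by_direction)
```

The result has one row per year and one column per direction, which keeps long and short performance separate rather than netting them together.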

In [45]:
# Initialize an empty list to store the results from each lookback value
results_list = []

# Iterate through the short lookback values
for short_val in range(short_lookback_range[0], short_lookback_range[1] + 1):

    # Iterate through the long lookback values
    for long_val in range(long_lookback_range[0], long_lookback_range[1] + 1):

        # Guard against overlapping ranges: the short lookback must be smaller than the long one
        if short_val < long_val:
            # Calculate EMA crossover results for the current pair
            df_ema_result = analyze_ema_crossover(
                df=data.copy(),
                short_lookback=short_val,
                long_lookback=long_val
            )

            # Append the result to the list
            results_list.append(df_ema_result)

# Concatenate all the DataFrames in the list into a single DataFrame
all_ema_results = pd.concat(results_list, ignore_index=True)

# Print the resulting DataFrame
print("DataFrame with EMA Crossover analysis for different lookback pairs:")
display(all_ema_results)

DataFrame with EMA Crossover analysis for different lookback pairs:


,Year,Short_Lookback,Long_Lookback,Next_Close_Return
0,2011,3,10,-0.071435
1,2012,3,10,-0.583890
2,2013,3,10,-0.295879
3,2014,3,10,0.236828
4,2015,3,10,0.125799
...,...,...,...,...
1255,2021,9,21,-0.567508
1256,2022,9,21,-0.015855
1257,2023,9,21,-0.218076
1258,2024,9,21,0.065251


In [46]:
# 1. Create a descriptive column for the EMA combination (e.g., "3 / 10")
# This makes the pivot table headers much easier to read
all_ema_results['EMA_Combo'] = (
    all_ema_results['Short_Lookback'].astype(str) +
    " / " +
    all_ema_results['Long_Lookback'].astype(str)
)

# 2. Pivot the DataFrame
# index: Years as rows
# columns: The EMA combinations
# values: The calculated returns
pivoted_ema_results = all_ema_results.pivot(
    index='Year',
    columns='EMA_Combo',
    values='Next_Close_Return'
)

# 3. Display the results
print("Pivoted DataFrame: Years vs. EMA Combinations (Short / Long)")
display(pivoted_ema_results)

Pivoted DataFrame: Years vs. EMA Combinations (Short / Long)


Year,3 / 10,3 / 11,3 / 12,3 / 13,3 / 14,3 / 15,3 / 16,3 / 17,3 / 18,3 / 19,...,9 / 12,9 / 13,9 / 14,9 / 15,9 / 16,9 / 17,9 / 18,9 / 19,9 / 20,9 / 21
2011,-0.071435,-0.071435,-0.071435,-0.022182,-0.022182,-0.022182,-0.022182,-0.022182,-0.022182,-0.04302,...,-0.014097,0.033085,0.033085,0.033085,0.01371,0.012445,0.012445,0.01886,-0.206326,-0.206326
2012,-0.58389,-0.607496,-0.687267,-0.699551,-0.660467,-0.581925,-0.483743,-0.476293,-0.400618,-0.650017,...,-0.505233,-0.475781,-0.406459,-0.406459,-0.431072,-0.346806,-0.384512,-0.313113,-0.320309,-0.310256
2013,-0.295879,-0.225752,-0.100335,-0.089172,-0.099262,-0.031625,-0.354174,-0.415766,-0.405634,-0.417929,...,-0.49812,-0.517591,-0.56079,-0.569071,-0.543331,-0.613281,-0.613281,-0.669827,-0.666939,-0.647659
2014,0.236828,0.32357,0.190026,0.159581,0.151138,0.137295,0.129665,0.120457,0.103767,0.073375,...,0.080041,0.041165,0.022129,0.008701,0.022775,-0.04634,-0.043347,-0.092629,-0.063518,-0.042112
2015,0.125799,0.12993,0.139274,0.066343,0.199935,0.204853,0.286996,0.403912,0.316755,0.409868,...,0.295713,0.349366,0.394655,0.339172,0.339172,0.339172,0.341409,0.341409,0.293623,0.169126
2016,-0.692335,-0.609333,-0.633602,-0.669635,-0.557389,-0.509949,-0.585266,-0.764331,-0.414146,-0.38132,...,-0.48631,-0.48631,-0.48631,-0.462268,-0.488354,-0.46424,-0.459824,-0.459824,-0.459824,-0.48658
2017,-0.444482,-0.366736,-0.405672,-0.413211,-0.410496,-0.331777,-0.257705,-0.25967,-0.236813,-0.245699,...,-0.508312,-0.550254,-0.550254,-0.550254,-0.520142,-0.545748,-0.513868,-0.398142,-0.336763,-0.336763
2018,0.927643,0.90256,0.90256,0.902363,0.853106,0.84914,0.878641,0.875808,0.903548,0.934642,...,0.831951,0.832108,0.870956,0.844162,0.911926,0.915208,0.907413,0.927836,0.91756,0.914963
2019,-0.205739,-0.205739,-0.192401,-0.195784,-0.203178,-0.208316,-0.094828,-0.133088,-0.133088,-0.094342,...,-0.11823,-0.11823,-0.090324,-0.133716,-0.140144,-0.11388,-0.11669,-0.11669,-0.11669,-0.121763
2020,0.069643,0.082232,0.026458,0.042969,0.026609,0.035798,-0.010272,-0.017619,0.012299,0.002786,...,0.022128,0.105049,0.092896,0.090409,0.022132,-0.007757,-0.009197,0.067398,0.069458,0.050339


In [47]:
# Calculate total return per combination across all years
pivoted_ema_results.loc['Total_Return'] = pivoted_ema_results.sum()
display(pivoted_ema_results)

Year,3 / 10,3 / 11,3 / 12,3 / 13,3 / 14,3 / 15,3 / 16,3 / 17,3 / 18,3 / 19,...,9 / 12,9 / 13,9 / 14,9 / 15,9 / 16,9 / 17,9 / 18,9 / 19,9 / 20,9 / 21
2011,-0.071435,-0.071435,-0.071435,-0.022182,-0.022182,-0.022182,-0.022182,-0.022182,-0.022182,-0.04302,...,-0.014097,0.033085,0.033085,0.033085,0.01371,0.012445,0.012445,0.01886,-0.206326,-0.206326
2012,-0.58389,-0.607496,-0.687267,-0.699551,-0.660467,-0.581925,-0.483743,-0.476293,-0.400618,-0.650017,...,-0.505233,-0.475781,-0.406459,-0.406459,-0.431072,-0.346806,-0.384512,-0.313113,-0.320309,-0.310256
2013,-0.295879,-0.225752,-0.100335,-0.089172,-0.099262,-0.031625,-0.354174,-0.415766,-0.405634,-0.417929,...,-0.49812,-0.517591,-0.56079,-0.569071,-0.543331,-0.613281,-0.613281,-0.669827,-0.666939,-0.647659
2014,0.236828,0.32357,0.190026,0.159581,0.151138,0.137295,0.129665,0.120457,0.103767,0.073375,...,0.080041,0.041165,0.022129,0.008701,0.022775,-0.04634,-0.043347,-0.092629,-0.063518,-0.042112
2015,0.125799,0.12993,0.139274,0.066343,0.199935,0.204853,0.286996,0.403912,0.316755,0.409868,...,0.295713,0.349366,0.394655,0.339172,0.339172,0.339172,0.341409,0.341409,0.293623,0.169126
2016,-0.692335,-0.609333,-0.633602,-0.669635,-0.557389,-0.509949,-0.585266,-0.764331,-0.414146,-0.38132,...,-0.48631,-0.48631,-0.48631,-0.462268,-0.488354,-0.46424,-0.459824,-0.459824,-0.459824,-0.48658
2017,-0.444482,-0.366736,-0.405672,-0.413211,-0.410496,-0.331777,-0.257705,-0.25967,-0.236813,-0.245699,...,-0.508312,-0.550254,-0.550254,-0.550254,-0.520142,-0.545748,-0.513868,-0.398142,-0.336763,-0.336763
2018,0.927643,0.90256,0.90256,0.902363,0.853106,0.84914,0.878641,0.875808,0.903548,0.934642,...,0.831951,0.832108,0.870956,0.844162,0.911926,0.915208,0.907413,0.927836,0.91756,0.914963
2019,-0.205739,-0.205739,-0.192401,-0.195784,-0.203178,-0.208316,-0.094828,-0.133088,-0.133088,-0.094342,...,-0.11823,-0.11823,-0.090324,-0.133716,-0.140144,-0.11388,-0.11669,-0.11669,-0.11669,-0.121763
2020,0.069643,0.082232,0.026458,0.042969,0.026609,0.035798,-0.010272,-0.017619,0.012299,0.002786,...,0.022128,0.105049,0.092896,0.090409,0.022132,-0.007757,-0.009197,0.067398,0.069458,0.050339


In [48]:
# 1. Identify the top 10 column names based on the highest values in 'Total_Return'
# ascending=False ensures:
#   - High positive numbers are at the very top
#   - Small negative numbers (closest to zero) follow them
top_10_combos = pivoted_ema_results.loc['Total_Return'].sort_values(ascending=False).head(10).index

# 2. Create a filtered DataFrame containing only these top 10 strategies
filtered_results = pivoted_ema_results[top_10_combos]

# 3. Display the filtered results
print("Top 10 EMA Combinations (Ranked from Best to Worst):")
display(filtered_results)

Top 10 EMA Combinations (Ranked from Best to Worst):


Year,4 / 21,5 / 18,5 / 10,4 / 20,5 / 19,4 / 15,5 / 15,7 / 21,3 / 18,7 / 11
2011,-0.009542,-0.009542,-0.022182,-0.009542,-0.009542,-0.04302,-0.009542,0.01371,-0.022182,-0.009542
2012,-0.454296,-0.513546,-0.562371,-0.548139,-0.435929,-0.365411,-0.497586,-0.286207,-0.400618,-0.597439
2013,-0.445721,-0.390371,-0.089859,-0.42637,-0.520238,-0.301423,-0.361935,-0.613281,-0.405634,-0.299673
2014,0.200976,0.18042,0.150669,0.228097,0.18042,0.129468,0.017756,0.07641,0.103767,-0.045431
2015,0.495687,0.361246,0.21105,0.443111,0.394671,0.257565,0.353037,0.349504,0.316755,0.317331
2016,-0.306073,-0.324644,-0.354139,-0.30033,-0.324644,-0.476996,-0.369921,-0.488354,-0.414146,-0.397971
2017,-0.493826,-0.481657,-0.398423,-0.451884,-0.523599,-0.318317,-0.342424,-0.511081,-0.236813,-0.346807
2018,0.848953,0.855298,0.911649,0.854594,0.853808,0.923291,0.92149,0.911871,0.903548,0.902435
2019,-0.095951,-0.149325,-0.20658,-0.132869,-0.11823,-0.162548,-0.051418,-0.11388,-0.133088,-0.089875
2020,0.076896,0.142521,0.038341,0.116572,0.131219,0.063492,0.077156,0.102299,0.012299,0.048663


In [53]:
# Ensure 'Date' column in the initial 'data' DataFrame is in datetime format
data['Date'] = pd.to_datetime(data['Date'])

# Extract the year from the 'Date' column
data['Year'] = data['Date'].dt.year

# Group by year and get the first and last close prices
# Note: 'first' is the price at the start of the year, 'last' is the price at the end
yearly_data = data.groupby('Year')['Close'].agg(['first', 'last'])

# Calculate the Yearly Return: (Last - First) / First
yearly_data['Yearly_Return'] = (yearly_data['last'] - yearly_data['first']) / yearly_data['first']

# Create a clean DataFrame with just the Return
yearly_performance = yearly_data[['Yearly_Return']].copy()

# Display the result
print("Yearly Return (percentage change) for each year:")
display(yearly_performance)

Yearly Return (percentage change) for each year:


Year,Yearly_Return
2011,0.241805
2012,1.384769
2013,0.836918
2014,-0.074945
2015,-0.189819
2016,0.925095
2017,1.631298
2018,-0.920237
2019,0.51733
2020,-0.378245


In [54]:
# Join the yearly price returns with the top 10 EMA combinations on the 'Year' index
# The default left join keeps only the year rows, so the 'Total_Return' row is dropped
comparison_df = yearly_performance.join(filtered_results)

# Display the combined DataFrame
display(comparison_df)

Year,Yearly_Return,4 / 21,5 / 18,5 / 10,4 / 20,5 / 19,4 / 15,5 / 15,7 / 21,3 / 18,7 / 11
2011,0.241805,-0.009542,-0.009542,-0.022182,-0.009542,-0.009542,-0.04302,-0.009542,0.01371,-0.022182,-0.009542
2012,1.384769,-0.454296,-0.513546,-0.562371,-0.548139,-0.435929,-0.365411,-0.497586,-0.286207,-0.400618,-0.597439
2013,0.836918,-0.445721,-0.390371,-0.089859,-0.42637,-0.520238,-0.301423,-0.361935,-0.613281,-0.405634,-0.299673
2014,-0.074945,0.200976,0.18042,0.150669,0.228097,0.18042,0.129468,0.017756,0.07641,0.103767,-0.045431
2015,-0.189819,0.495687,0.361246,0.21105,0.443111,0.394671,0.257565,0.353037,0.349504,0.316755,0.317331
2016,0.925095,-0.306073,-0.324644,-0.354139,-0.30033,-0.324644,-0.476996,-0.369921,-0.488354,-0.414146,-0.397971
2017,1.631298,-0.493826,-0.481657,-0.398423,-0.451884,-0.523599,-0.318317,-0.342424,-0.511081,-0.236813,-0.346807
2018,-0.920237,0.848953,0.855298,0.911649,0.854594,0.853808,0.923291,0.92149,0.911871,0.903548,0.902435
2019,0.51733,-0.095951,-0.149325,-0.20658,-0.132869,-0.11823,-0.162548,-0.051418,-0.11388,-0.133088,-0.089875
2020,-0.378245,0.076896,0.142521,0.038341,0.116572,0.131219,0.063492,0.077156,0.102299,0.012299,0.048663


In [55]:
# Calculate the sum and standard deviation of each column in the comparison_df
grand_totals = comparison_df.sum()
standard_deviations = comparison_df.std()

# Calculate the Mean Absolute Deviation for each column
mean_absolute_deviations = comparison_df.apply(lambda x: (x - x.mean()).abs().mean())

# Combine the metrics into a single DataFrame for display
summary_df = pd.DataFrame({
    'Grand Total': grand_totals,
    'Standard Deviation': standard_deviations,
    'Mean Absolute Deviation': mean_absolute_deviations
})

# Display the summary DataFrame
print("Grand Totals, Standard Deviations, and Mean Absolute Deviations for each column:")
display(summary_df)

Grand Totals, Standard Deviations, and Mean Absolute Deviations for each column:


,Grand Total,Standard Deviation,Mean Absolute Deviation
Yearly_Return,5.299524,0.682154,0.550806
4 / 21,-1.270692,0.388674,0.282027
5 / 18,-1.348881,0.372009,0.274384
5 / 10,-1.349798,0.352342,0.236789
4 / 20,-1.360768,0.38564,0.2898
5 / 19,-1.383912,0.386385,0.284707
4 / 15,-1.422384,0.355158,0.246359
5 / 15,-1.4372,0.373616,0.263812
7 / 21,-1.441032,0.394419,0.282479
3 / 18,-1.451686,0.356597,0.241053
