[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tiger-Quant/demos2025/blob/master/september23.ipynb)

# Basic Strategy Tutorial

This tutorial covers the following ideas:
- Using `yfinance` to load historical data
- Implementing a repeatable strategy using moving averages
- Calculating the profits from such a strategy
- Compartmentalization of code for reusability

## Setup

This tutorial requires a few Python packages. The cell below installs them when the notebook runs in Colab.

In [None]:
# ------------------------------------------------------------------
# Setup Cell: Run this first to install required libraries.
# ------------------------------------------------------------------
# The '%pip' magic runs pip inside the notebook's own environment.
# The '-q' flag makes the output "quiet" to keep the notebook clean.

RunningInCOLAB = 'google.colab' in str(get_ipython())  # True when running in Google Colab
if RunningInCOLAB:                                     # install packages only in Colab
    %pip install -q yfinance pandas matplotlib numpy

print("✅ Setup complete. You can now run the rest of the notebook.")

Packages must be imported before they can be used. Short aliases are given for convenience.

In [None]:
import yfinance as yf
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

## Data Loading

`yfinance` is used to download the historical data. For this example, we use `SPY`, one of the [largest exchange-traded funds](https://www.ssga.com/us/en/intermediary/capabilities/spdr-core-equity-etfs/spy-sp-500) (ETFs). Feel free to explore this strategy with other tickers.

In [None]:
data_to_use = 'SPY'

data = yf.download(data_to_use, start='2022-01-01')
data.columns = data.columns.droplevel(1)
data

### Plotting the Data

It's always a good idea to plot the data that will be used. Here, a function is defined for ease of use later on.
Functions are pieces of code that can be called again, allowing for easy reuse instead of copying and pasting. This is called _compartmentalization_.

In [None]:
# Reusable function to plot data
def plot_data(data, data_name='SPY'):
    """Plots the closing prices of the given data."""
    plt.figure(figsize=(12, 6))
    plt.plot(data['Close'], label=data_name +' Close Price')
    plt.title(data_name + ' Closing Prices Since 2022')
    plt.xlabel('Date')
    plt.ylabel('Price (USD)')
    

plot_data(data) # calling the function to plot
plt.show() # show the plot 

Barring some exceptions, `SPY` has mostly gone up since 2022.

## Strategy Implementation: Simple Moving Average

The simple moving average (SMA) strategy is simple yet intuitive. The idea is that tickers typically follow a trend (the moving average). If the price goes above this trend, it is a sign that the ticker is doing well, and one should buy. If it goes below this trend, it is a sign that it might do poorly, and one should sell. Here, we'll describe the steps to do this programmatically.

### Calculate SMA

The simple moving average is the average of the last $n$ data points. It is simple because all the data points are weighted equally. In this example, we use $n=50$.

In [None]:
data['SMA50'] = data['Close'].rolling(window=50).mean()
# take mean of each group of 50 rows, i.e., 
# mean for Day 1-50, mean for Day 2-51 etc.
# or just the 50-day simple moving average
data

`rolling` produces every consecutive window of the requested size. Taking the mean of each window then gives the average price over that window, which here is the last 50 trading days.

**Note**: For the first 49 rows, the moving average has no value (`NaN`). A 50-row window needs the current price plus the 49 prices before it, and these rows do not yet have 49 earlier prices. Moving averages cannot start immediately.
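
To see the warm-up period on a smaller scale, here is a minimal sketch with a 3-row window instead of 50. The prices are made-up numbers chosen for illustration, not real `SPY` data.

```python
import pandas as pd

# Toy closing prices (made-up numbers, not real SPY data)
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# 3-row simple moving average; the first 2 rows have no full window yet
sma3 = prices.rolling(window=3).mean()
print(sma3)  # first two rows are NaN, then 11.0, 12.0, 13.0
```

With a window of 3, the first 2 rows are empty, just as the first 49 rows are empty with a window of 50.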

### Calculate Position

The position is simple. If the ticker is above the average, the position is 1, which means we'd like to hold the ticker. If the ticker is below the average, the position is 0, which means we'd like to hold none of it.

In [None]:
data['Position'] = np.where(data['Close'] > data['SMA50'], 1, 0) # 1 if above SMA, else 0
data

NumPy's (imported as `np`) `where` function evaluates a Boolean (true or false) expression for every element of an array and picks the first value where the expression is true and the second where it is false.

**Note**: For the first 49 rows, the moving average has no value. Any comparison with a missing value evaluates to false, so the condition fails and the position is set to 0. This makes sense, since there is not yet enough data to decide whether we want the ticker under this strategy.
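
The behavior with missing values can be checked on a toy example. This is a small sketch with made-up numbers; `close` and `sma` here are stand-ins for the `Close` and `SMA50` columns.

```python
import numpy as np
import pandas as pd

# Made-up prices and a made-up moving average whose first value is missing
close = pd.Series([10.0, 11.0, 9.0, 12.0])
sma = pd.Series([np.nan, 10.5, 10.0, 10.67])

# A comparison against NaN is False, so the first position falls back to 0
position = np.where(close > sma, 1, 0)
print(position)  # [0 1 0 1]
```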

### Calculate Signal

The signal tells us when to buy or sell. It is generated whenever the position changes. If the signal is 1, the position went from 0 -> 1, or from not wanting any of the ticker to wanting it. Therefore, we should buy. The opposite occurs if the signal is -1. A signal of 0 means the position has not changed, so no action is taken.
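
As a toy illustration of how position changes become signals, consider the following sketch. The positions are made up, and the `trades` filter is just one possible way to keep only the buy and sell rows.

```python
import pandas as pd

# Made-up positions: out, out, in, in, out, in
position = pd.Series([0, 0, 1, 1, 0, 1])

# diff() subtracts the previous row from the current one
signal = position.diff()
print(signal.tolist())  # [nan, 0.0, 1.0, 0.0, -1.0, 1.0]

# Keep only the rows where a buy (1) or sell (-1) happens
trades = signal[signal.isin([1, -1])]
print(trades.tolist())  # [1.0, -1.0, 1.0]
```

The first value is `NaN` because the first row has no previous row to compare against.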

In [None]:
data['Signal'] = data['Position'].diff() # difference between current and previous row (NaN for the first row)
data[data['Signal'].isin([1, -1])].head() # show rows that have a signal (buy or sell)

In [None]:
plot_data(data)
plt.plot(data['SMA50'], label='50-Day SMA', linestyle='--')
plt.legend() # show legend
plt.show()

#### Plot Signals

We can plot the signals to get a visual intuition of the strategy.

In [None]:
plot_data(data)
plt.plot(data['SMA50'], label='50-Day SMA', linestyle='--')

buy_signal = (data['Signal'] == 1.0) # where signal is 1
plt.scatter(data.index[buy_signal], 
            data['Close'][buy_signal],
            label='Buy Signal', marker='^', color='green', s=100)

sell_signal = (data['Signal'] == -1.0) # where signal is -1
plt.scatter(data.index[sell_signal], 
            data['Close'][sell_signal],
            label='Sell Signal', marker='v', color='red', s=100)

plt.legend()
plt.show()

## Calculating Strategy Returns


To calculate how our strategy would do, we need to calculate profit and return on investment (ROI).

### Calculating Profit 

Calculating profit is simple. It's the difference between revenue and expenses. Your revenue is how much money you made selling shares at the sell signal, and your expenses are how much money you spent buying the shares at the buy signal.
This can be expressed mathematically:
$$
P = SQ - BQ
$$

where $P$ is the profit, $S$ is the price per share at the sell signal, $B$ is the price per share at the buy signal, and $Q$ is the quantity of shares. This assumes that all of the shares bought are sold at the next sell signal, which is how the simple moving average strategy above works.

Factoring out $Q$ gives:
$$
P = Q(S - B)
$$

### Calculating ROI

The return on investment is a measure of the profitability of an investment or trade relative to its cost. It can be expressed mathematically as:
$$
ROI = \frac{P}{BQ}
$$

A higher ROI is preferred.
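
As a quick worked example of both formulas, here is a small sketch for a single trade. The prices and quantity are made-up round numbers chosen for easy arithmetic.

```python
# Made-up numbers for one trade
buy_price = 400.0    # B: price per share at the buy signal
sell_price = 410.0   # S: price per share at the sell signal
quantity = 5         # Q: number of shares

profit = quantity * (sell_price - buy_price)  # P = Q(S - B)
roi = profit / (buy_price * quantity)         # ROI = P / (BQ)

print(f"Profit: ${profit:.2f}")  # Profit: $50.00
print(f"ROI: {roi * 100:.2f}%")  # ROI: 2.50%
```

For a single trade, $Q$ cancels out of the ROI, so buying more shares raises the profit but leaves the ROI unchanged.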


In [None]:
def calculate_returns(quantity, data):
    """Calculate the returns of a trading strategy based on buy and sell signals."""
    
    total_profit = 0
    total_money_invested = 0
    buy_signal = (data['Signal'] == 1.0) # where signal is 1
    sell_signal = (data['Signal'] == -1.0) # where signal is -1
    
    for buy_time, buy_row in data[buy_signal].iterrows(): # iterate over every buy signal
        buy_price = buy_row['Close']
        sell_trades = data[sell_signal & (data.index > buy_time)] # all sell signals after buy signal
        
        # check that a sell signal exists, could potentially not have a sell signal, like at the end
        if not sell_trades.empty:
            sell_price = (sell_trades.iloc[0])['Close'] # first sell signal after buy signal

            profit = quantity * (sell_price - buy_price)  # based on formula
            money_invested = quantity * buy_price
            print(f"Bought {quantity} share(s) at ${buy_price:.2f}. Sold {quantity} share(s) at ${sell_price:.2f}, Profit: ${profit:.2f}")
            
            
            total_profit = total_profit + profit
            total_money_invested = total_money_invested + money_invested
        else:
            print(f"Would buy {quantity} share(s) at ${buy_price:.2f}, but no sell signal found.")
    
    roi = total_profit / total_money_invested if total_money_invested else 0.0 # avoid dividing by zero when no trade completed
    print(f"""\nTotal Profit: ${total_profit:.2f},
Money Invested: ${total_money_invested:.2f}, 
Return on Investment: {roi*100:.2f}%""")
    return total_profit, roi


Because this function only relies on the `Close` and `Signal` columns, it can calculate the returns of virtually any strategy that is based on buy and sell signals.
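
In the same spirit of compartmentalization, the signal-generation steps could also be wrapped in a function so that different window sizes are easy to try. The sketch below is one possible way to do this; `add_sma_signals` is a hypothetical helper that is not part of the original notebook, it names its column `SMA` rather than `SMA50`, and the toy prices are made up.

```python
import numpy as np
import pandas as pd

def add_sma_signals(data, window=50):
    """Return a copy of `data` with SMA, Position, and Signal columns added.

    Hypothetical helper for illustration, not part of the original notebook.
    """
    out = data.copy()  # leave the caller's DataFrame untouched
    out['SMA'] = out['Close'].rolling(window=window).mean()
    out['Position'] = np.where(out['Close'] > out['SMA'], 1, 0)
    out['Signal'] = out['Position'].diff()
    return out

# Made-up prices with a short window so the signals are easy to follow
toy = pd.DataFrame({'Close': [10.0, 10.0, 12.0, 13.0, 9.0, 8.0, 11.0]})
toy_signals = add_sma_signals(toy, window=2)
print(toy_signals)
```

The returned DataFrame has the `Close` and `Signal` columns that `calculate_returns` reads, so it should be usable as that function's `data` argument.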

In [27]:
total_profit, roi = calculate_returns(quantity = 1, data=data)

Bought 1 share(s) at $423.03. Sold 1 share(s) at $418.66, Profit: $-4.38
Bought 1 share(s) at $421.88. Sold 1 share(s) at $416.63, Profit: $-5.25
Bought 1 share(s) at $423.53. Sold 1 share(s) at $416.89, Profit: $-6.64
Bought 1 share(s) at $374.92. Sold 1 share(s) at $380.60, Profit: $5.68
Bought 1 share(s) at $388.62. Sold 1 share(s) at $375.71, Profit: $-12.90
Bought 1 share(s) at $373.34. Sold 1 share(s) at $359.76, Profit: $-13.58
Bought 1 share(s) at $364.64. Sold 1 share(s) at $359.05, Profit: $-5.59
Bought 1 share(s) at $378.78. Sold 1 share(s) at $369.51, Profit: $-9.27
Bought 1 share(s) at $372.37. Sold 1 share(s) at $367.05, Profit: $-5.31
Bought 1 share(s) at $376.56. Sold 1 share(s) at $374.69, Profit: $-1.87
Bought 1 share(s) at $381.67. Sold 1 share(s) at $382.15, Profit: $0.48
Bought 1 share(s) at $383.45. Sold 1 share(s) at $382.04, Profit: $-1.42
Bought 1 share(s) at $383.53. Sold 1 share(s) at $383.97, Profit: $0.44
Bought 1 share(s) at $384.60. Sold 1 share(s) at $37

As shown, a profit was technically made, but an ROI of less than one percent is not impressive. Still, the strategy is repeatable and intuitive.