# COGS 188 - Final Project

# Optimizing Stock Market Trading Strategies: A Comparative Analysis

## Group members

- Brooks Ephraim
- Elan Hashem
- Bram Simonnet

# Abstract 
Our project explores AI-driven algorithmic trading strategies to optimize portfolio performance by predicting market trends and making informed trading decisions. We use historical stock data, including price movements, trading volume, and technical indicators, to develop predictive models. Our approach incorporates Neural Networks (NN), Temporal Difference (TD) Learning, and Dynamic Programming (DP) to forecast price changes and execute optimal trades. We evaluate model performance using financial metrics such as the Sharpe Ratio, total return, and average prediction error, comparing AI-driven strategies against the traditional Buy and Hold (BH) approach. Our results show that while BH outperforms most rule-based strategies in stable conditions, optimized AI models can generate higher portfolio returns when hyperparameters like train/test split and trade thresholds are tuned effectively. This study highlights the potential of AI in financial markets while acknowledging the important challenges of market volatility and overfitting.

# Background

Fill in the background and discuss the kind of prior work that has gone on in this research area here. **Use inline citation** to specify which references support which statements.  You can do that through HTML footnotes (demonstrated here). I used to recommend Markdown footnotes (Google is your friend) because they are simpler, but recently I have had some problems getting them to work, whereas HTML ones have always worked so far. So use the method that works for you, but do use inline citations.

Here is an example of inline citation. After government genocide in the 20th century, real birds were replaced with surveillance drones designed to look just like birds<a name="lorenz"></a>[<sup>[1]</sup>](#lorenznote). Use a minimum of 3 to 5 citations, but we prefer more <a name="admonish"></a>[<sup>[2]</sup>](#admonishnote). You need enough citations to fully explain and back up important facts. 

Remember you are trying to explain why someone would want to answer your question or why your hypothesis is in the form that you've stated. 

# Problem Statement

Financial markets are inherently volatile, making it challenging to develop trading strategies that consistently outperform traditional approaches while managing risk. Initially, our goal was to maximize trading opportunities, but our focus shifted toward predicting market trends and making informed trading decisions based on those predictions. This project integrates machine learning techniques, including Temporal Difference (TD) learning, Dynamic Programming (DP), and Neural Networks (NN), to optimize portfolio performance. By leveraging historical market data, we train models to predict price movements and identify optimal buy/sell points through backtracking and reinforcement learning principles. The effectiveness of our strategies is measured using key financial metrics such as the Sharpe ratio, total return, and prediction error. We benchmark our models against the Buy and Hold strategy to evaluate performance across different time periods and market conditions. Through hyperparameter tuning and analysis across companies, we refine our models to maximize profitability while mitigating risk, with the aim of remaining adaptable in dynamic financial environments.

# Data

### **Source and Structure**
Our dataset was obtained from Yahoo Finance via the [Massive Yahoo Finance Dataset](https://www.kaggle.com/datasets/iveeaten3223times/massive-yahoo-finance-dataset) on Kaggle. It includes historical stock data for multiple companies, spanning several years, with daily and minute-level price observations.

Each row in the dataset represents a timestamped stock price entry, typically structured as follows:
- **Company**: Stock ticker symbol (e.g., AAPL, TSLA).
- **Date**: Timestamp for the recorded data.
- **Open, High, Low, Close Prices (OHLC)**: Standard stock price metrics.
- **Adjusted Close**: Adjusted for stock splits and dividends.
- **Volume**: The number of shares traded.

After cleaning and preprocessing, our final dataset contained over **600,000 observations**, with essential financial metrics used for strategy evaluation.

### **Data Cleaning and Preprocessing**
To ensure high-quality inputs for our models, we applied the following preprocessing steps:

#### **1. Handling Missing Data**
- Dropped rows with missing OHLC prices to maintain data integrity.
- Forward-filled missing values due to market closures on weekends and holidays.
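
The two steps above can be sketched as follows. This is a minimal illustration and not the project's actual cleaning script: the helper name `clean_prices`, the toy data, and the choice to drop incomplete OHLC rows before forward-filling the remaining columns within each company are all assumptions.

```python
import numpy as np
import pandas as pd

def clean_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows lacking any OHLC price, then forward-fill other gaps per company.

    Hypothetical helper; column names follow the dataset description above.
    """
    ohlc = ["Open", "High", "Low", "Close"]
    out = df.dropna(subset=ohlc).sort_values(["Company", "Date"]).copy()
    # Forward-fill inside each company so values never leak across tickers
    fill_cols = [c for c in out.columns if c not in ("Company", "Date")]
    out[fill_cols] = out.groupby("Company")[fill_cols].ffill()
    return out.reset_index(drop=True)

# Toy data (made up for illustration): one missing Open, one missing Volume
toy = pd.DataFrame({
    "Company": ["AAPL", "AAPL", "AAPL", "TSLA"],
    "Date": pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04", "2023-01-02"]),
    "Open": [10.0, np.nan, 11.0, 20.0],
    "High": [10.5, 10.8, 11.5, 21.0],
    "Low": [9.5, 9.9, 10.5, 19.0],
    "Close": [10.2, 10.4, 11.2, 20.5],
    "Volume": [100.0, 120.0, np.nan, 300.0],
})
cleaned = clean_prices(toy)
```

Grouping by `Company` before filling keeps one ticker's last known value from being carried into another ticker's rows.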

#### **2. Feature Engineering**
To enhance model performance, we generated additional features:
- Returns Calculation: Daily percentage change in stock price:
  ```python
  # Daily percentage change in closing price, computed separately for each company
  df["Return"] = df.groupby("Company")["Close"].pct_change()
  ```
- **Price Indicators**:
  - Prev_Close: Closing price of the previous day.
  - Price Change: Difference between current and previous close price.
  - Volatility: 5-day rolling standard deviation of closing prices.
- **Technical Indicators**:
  - **Simple Moving Averages (SMA)**:
    ```python
    df["SMA_10"] = df.groupby("Company")["Close"].transform(lambda x: x.rolling(window=10, min_periods=1).mean())
    df["SMA_50"] = df.groupby("Company")["Close"].transform(lambda x: x.rolling(window=50, min_periods=1).mean())
    ```
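
The price indicators listed above (previous close, price change, and 5-day volatility) could be computed along the same lines as the moving averages. The sketch below is illustrative only: the function name `add_price_indicators`, the column names `Price_Change` and `Volatility`, and `min_periods=2` are assumptions, not values taken from the project code.

```python
import pandas as pd

def add_price_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Add Prev_Close, Price_Change, and 5-day Volatility per company (hypothetical helper)."""
    out = df.sort_values(["Company", "Date"]).copy()
    close = out.groupby("Company")["Close"]
    # Previous day's close within the same company
    out["Prev_Close"] = close.shift(1)
    # Difference between the current and previous close
    out["Price_Change"] = out["Close"] - out["Prev_Close"]
    # 5-day rolling standard deviation of closing prices
    # (min_periods=2 is an assumption; a single observation has no sample std)
    out["Volatility"] = close.transform(lambda x: x.rolling(window=5, min_periods=2).std())
    return out

# Toy data, made up for illustration
toy = pd.DataFrame({
    "Company": ["AAPL"] * 6 + ["TSLA"] * 2,
    "Date": list(pd.date_range("2023-01-02", periods=6)) + list(pd.date_range("2023-01-02", periods=2)),
    "Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 50.0, 55.0],
})
indicators = add_price_indicators(toy)
```

The first row of each company has no previous close, so `Prev_Close`, `Price_Change`, and `Volatility` are `NaN` there by construction.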

#### **3. Data Transformation**
- Converted timestamps to `datetime` format for easier time-series processing.
- Sorted data by `Company` and `Date` to maintain chronological integrity.
- Normalized price data where necessary for reinforcement learning models.
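
A compact sketch of these three transformations is shown below. The normalization method is not specified above, so per-company min-max scaling is used purely as an assumption, and `transform_prices` and `Close_Norm` are hypothetical names.

```python
import pandas as pd

def transform_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Parse dates, sort chronologically per company, and min-max scale Close (sketch)."""
    out = df.copy()
    # Convert string timestamps to datetime64 for time-series operations
    out["Date"] = pd.to_datetime(out["Date"])
    # Chronological order within each company
    out = out.sort_values(["Company", "Date"]).reset_index(drop=True)
    # Assumed normalization: scale each company's Close to [0, 1]
    grp = out.groupby("Company")["Close"]
    lo, hi = grp.transform("min"), grp.transform("max")
    out["Close_Norm"] = (out["Close"] - lo) / (hi - lo)
    return out

# Toy data with out-of-order dates, made up for illustration
toy = pd.DataFrame({
    "Company": ["AAPL", "AAPL", "AAPL"],
    "Date": ["2023-01-03", "2023-01-02", "2023-01-04"],
    "Close": [12.0, 10.0, 14.0],
})
transformed = transform_prices(toy)
```

When a model is evaluated on held-out data, the scaling minimum and maximum would normally be taken from the training portion only, so that test-period prices do not influence the inputs.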


### **Final Processed Dataset**
- Companies Tracked: Multiple (e.g., AAPL, TSLA).
- Observations: Over 600,000 records post-cleaning.
- **Key Features Used**:
  - OHLC prices
  - Volume
  - Returns
  - Moving Averages
  - Volatility

This cleaned dataset was then used to train and evaluate different trading strategies, including reinforcement learning models (TD Learning, Dynamic Programming) and benchmarking against Buy-and-Hold strategies.

# Proposed Solution
 
Our approach optimizes algorithmic trading strategies by integrating reinforcement learning (TD Learning) and Dynamic Programming (DP) to make data-driven trading decisions. Instead of relying solely on rule-based or traditional statistical models, we use machine learning techniques to predict market movements and execute optimal buy/sell strategies.  

##### **Problem Formulation: Trading as a Sequential Decision Process**
We model the trading problem using Temporal Difference Learning and Dynamic Programming to determine the best buy, sell, or hold decisions based on historical stock data. Our goal is to maximize portfolio returns by learning from past market movements and optimizing trade execution.

##### **1. State (S): Market Representation**
Each state represents the market at a given time and includes:  
- **OHLCV Data**: Open, High, Low, Close, Volume.  
- **Technical Indicators**: Moving Averages (SMA, EMA), Volatility, Relative Strength Index (RSI), MACD.  
- **Market Trends**: Rolling price changes, trend momentum.  
- **Portfolio Status**: Current holdings, cash balance, previous transactions.

##### **2. Actions (A): Trading Decisions**
At each time step, the algorithm decides between:  
- **Buy**: Purchase stock at the current price.  
- **Sell**: Sell stock at the current price.  
- **Hold**: Maintain the current position.

##### **3. Reward Function (R): Incentivizing Profitability**
- **Profit Maximization**: The reward function is defined as:

$$
R_t = P_{t+1} - P_t
$$

- **Transaction Costs**: Penalize excessive trading to minimize fees.  
- **Holding Time Penalty**: Discourage holding onto a losing position for too long.
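
One way the three reward components above could be combined into a single per-step reward is sketched below. The function `step_reward`, its arguments, and the values of `fee_rate` and `hold_penalty` are hypothetical and chosen only for illustration; the report does not state the actual weights.

```python
def step_reward(price_now, price_next, holding, traded,
                entry_price=None, days_held=0,
                fee_rate=0.001, hold_penalty=0.01):
    """Per-step reward: price change while holding, minus fees and a holding penalty.

    All parameter names and default values are illustrative assumptions.
    """
    # Profit term R_t = P_{t+1} - P_t, earned only while a share is held
    reward = (price_next - price_now) if holding else 0.0
    # Transaction cost: proportional fee charged whenever a trade is executed
    if traded:
        reward -= fee_rate * price_now
    # Holding-time penalty: grows with days held while the position is under water
    if holding and entry_price is not None and price_now < entry_price:
        reward -= hold_penalty * days_held
    return reward

r_plain = step_reward(100.0, 102.0, holding=True, traded=False)
r_trade = step_reward(100.0, 102.0, holding=True, traded=True)
r_losing = step_reward(95.0, 94.0, holding=True, traded=False,
                       entry_price=100.0, days_held=3)
```

With no trade and no losing position, the function reduces to the profit term $P_{t+1} - P_t$ defined above.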

##### **4. Dynamic Programming (DP) for Trade Optimization**
Once TD Learning generates price predictions, Dynamic Programming (DP) is used to backtrack from the final date to identify optimal buy/sell points. By systematically evaluating potential trading decisions over time, DP finds the most profitable sequence of trades across the entire prediction window. Because the plan is optimal only with respect to the predicted prices, realized profit still depends on how accurate those predictions are.
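
The backward pass described here can be illustrated with a small DP over a predicted price series. This is a sketch under simplifying assumptions that are not stated in the report: one share at most, one action per day, an optional flat `fee` per trade, and a share left unsold at the end counts for nothing. The function name `plan_trades` is hypothetical.

```python
def plan_trades(prices, fee=0.0):
    """Return (best_profit, actions) for a predicted price series via backward DP."""
    n = len(prices)
    # best[t][h]: best cash flow obtainable from day t onward; h = 1 if holding a share
    best = [[0.0, 0.0] for _ in range(n + 1)]
    act = [["hold", "hold"] for _ in range(n)]
    # Backward recursion, starting from the final date
    for t in range(n - 1, -1, -1):
        p = prices[t]
        # Not holding: stay flat, or buy today
        stay, buy = best[t + 1][0], -p - fee + best[t + 1][1]
        if buy > stay:
            best[t][0], act[t][0] = buy, "buy"
        else:
            best[t][0] = stay
        # Holding: keep the share, or sell today
        keep, sell = best[t + 1][1], p - fee + best[t + 1][0]
        if sell > keep:
            best[t][1], act[t][1] = sell, "sell"
        else:
            best[t][1] = keep
    # Forward pass: read off the chosen actions, starting with no position
    plan, holding = [], 0
    for t in range(n):
        action = act[t][holding]
        plan.append(action)
        if action == "buy":
            holding = 1
        elif action == "sell":
            holding = 0
    return best[0][0], plan

profit, plan = plan_trades([10.0, 12.0, 11.0, 15.0, 14.0])
# profit is 6.0 and plan is ['buy', 'sell', 'buy', 'sell', 'hold']
```

On this toy series the plan buys at 10, sells at 12, buys again at 11, and sells at 15. A monotonically falling series yields a plan of all `hold` actions and zero profit.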

#### **Implementation Steps**
1. **Data Preprocessing**: Clean and engineer features from Yahoo Finance data.  
2. **TD Learning Prediction**: Train a reinforcement learning model on historical prices.  
3. **DP Optimization**: Apply Dynamic Programming to maximize profit using predicted prices.  
4. **Strategy Backtesting**: Compare TD+DP Trading Strategy against the Buy-and-Hold Benchmark using real stock market data.  
5. **Performance Evaluation**: Measure effectiveness using Sharpe Ratio, Total Return, and Drawdowns.
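
To make step 2 more concrete, the sketch below shows semi-gradient TD(0) with a linear value function, a standard textbook form of Temporal Difference learning. It is not the project's model: the feature choice (a bias term plus the latest return), the hyperparameters, and the name `td0_linear` are assumptions, and the learned value estimates discounted future price changes and is not a direct price forecast.

```python
import numpy as np

def td0_linear(features, rewards, alpha=0.05, gamma=0.9, epochs=50):
    """Semi-gradient TD(0) for V(s) = w . phi(s).

    features: array of shape (T + 1, d), one row per state.
    rewards:  array of shape (T,), rewards[t] received when moving from state t to t + 1.
    The final state is bootstrapped like any other (treated as non-terminal).
    """
    features = np.asarray(features, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        for t in range(len(rewards)):
            v_now = features[t] @ w
            v_next = features[t + 1] @ w
            # TD error: one-step bootstrapped target minus the current estimate
            td_error = rewards[t] + gamma * v_next - v_now
            w += alpha * td_error * features[t]
    return w

# Toy prices, made up for illustration
prices = np.array([100.0, 101.0, 103.0, 102.0, 105.0, 107.0])
rewards = np.diff(prices)  # R_t = P_{t+1} - P_t
returns = np.concatenate(([0.0], np.diff(prices) / prices[:-1]))
features = np.column_stack([np.ones(len(prices)), returns])  # bias + latest return
w = td0_linear(features, rewards)
```

A quick sanity check for an implementation like this is a constant reward of 1 with a single constant feature: the weight should approach $1 / (1 - \gamma)$.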

By combining machine learning, TD Learning, and Dynamic Programming, our approach enhances decision-making in trading and adapts dynamically to changing market conditions.
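
Two of the metrics named in step 5, Total Return and Drawdowns, can be computed from a series of per-period strategy returns as sketched below. The function names and the toy return series are illustrative assumptions; returns are compounded, and the drawdown is measured against the running peak of an equity curve that starts at 1.

```python
import numpy as np

def total_return(returns):
    """Compounded return over the whole period, e.g. 0.25 for +25%."""
    return float(np.prod(1.0 + np.asarray(returns, dtype=float)) - 1.0)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the equity curve, as a positive fraction."""
    equity = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    equity = np.concatenate(([1.0], equity))   # start from an initial value of 1
    peaks = np.maximum.accumulate(equity)      # running maximum so far
    return float(np.max((peaks - equity) / peaks))

# Toy per-period returns, made up for illustration
toy_returns = [0.10, -0.20, 0.05]
tr = total_return(toy_returns)
mdd = max_drawdown(toy_returns)
```

For the toy series the equity curve is 1.0, 1.1, 0.88, 0.924, so the total return is about -7.6% and the maximum drawdown is 20% (from 1.1 down to 0.88).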

# Evaluation Metrics

To assess the effectiveness of our AI-driven trading strategies, we employ multiple evaluation metrics that measure profitability, risk management, and model performance. These metrics allow us to compare our Temporal Difference Learning and Dynamic Programming methods against traditional strategies like Buy & Hold and other benchmark models. Below, we outline the key evaluation criteria used in our analysis.

### **Profitability & Risk-Adjusted Returns**
1. Sharpe Ratio

The Sharpe Ratio is a fundamental measure in finance that evaluates the risk-adjusted return of an investment strategy. A higher Sharpe ratio indicates better returns per unit of risk, making it a critical metric for comparing our AI models to benchmarks.

Mathematically, the Sharpe Ratio is defined as:

$$
\text{Sharpe Ratio} = \frac{R_x - R_f}{\sigma(R_x)}
$$

- $R_x$ = Expected portfolio return
- $R_f$ = Risk-free rate of return
- $\sigma(R_x)$ = Standard deviation of portfolio return (i.e., volatility)

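```python
import numpy as np

def calculate_sharpe_ratio(returns, risk_free_rate=0.01):
    # risk_free_rate must be expressed per period, at the same frequency as `returns`
    excess_returns = np.array(returns) - risk_free_rate
    # Sample standard deviation (ddof=1) of the excess returns
    std_dev = np.std(excess_returns, ddof=1)

    # Zero volatility leaves the ratio undefined; it is reported as infinity here
    return np.inf if std_dev == 0 else np.mean(excess_returns) / std_dev
```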

We use this function in `BENCHMARK_EVAL.py` to compare the Sharpe ratios of:
- Buy & Hold Strategy
- TD Learning Strategy
- DP Strategy
- Other benchmark strategies (e.g., SMA, EMA, MAC)

### Trading Strategies Evaluated
We implemented and backtested 8 different trading strategies:

| **Strategy** | **Description** |
|-------------|----------------|
| **Simple Moving Average (SMA)** | Uses short- and long-term moving averages for buy/sell signals. |
| **Exponential Moving Average (EMA)** | Similar to SMA but gives more weight to recent prices. |
| **Mean Reversion** | Buys undervalued stocks and sells overvalued ones based on deviation from SMA. |
| **Momentum Trading** | Rides trends by buying when prices increase and selling when they decrease. |
| **Moving Average Crossover (MAC)** | Combines SMA and momentum for trade signals. |
| **Scalping Strategy** | Executes quick trades on small price movements. |
| **Swing Trading** | Identifies medium-term price swings for entries/exits. |
| **Buy-and-Hold (Benchmark)** | Holds stocks long-term, serving as the **baseline** strategy. |

Each strategy applies a **buy/sell signal** based on different technical indicators, such as moving averages, price deviations, and momentum calculations.

Example of Simple Moving Average (SMA):

```python
df["SMA_Short"] = df.groupby("Company")["Close"].transform(lambda x: x.rolling(window=5, min_periods=1).mean())
df["SMA_Long"] = df.groupby("Company")["Close"].transform(lambda x: x.rolling(window=100, min_periods=1).mean())

df["SMA_Signal"] = 0
df.loc[df["SMA_Short"] > df["SMA_Long"], "SMA_Signal"] = 1  # Buy
df.loc[df["SMA_Short"] < df["SMA_Long"], "SMA_Signal"] = -1  # Sell
```
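Each strategy was **backtested** by multiplying each period's return by the strategy's trading signal (1 for buy, -1 for sell, 0 for no position), which gives the strategy's per-period return.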
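
A minimal sketch of this signal-times-return backtest is given below. One detail here is an assumption and not something stated above: the signal is lagged by one period within each company, so that a signal computed from day *t*'s close is applied to day *t+1*'s return, which avoids look-ahead bias. The helper name `backtest_signal` and the toy data are hypothetical.

```python
import pandas as pd

def backtest_signal(df: pd.DataFrame, signal_col: str) -> pd.Series:
    """Per-period strategy returns: lagged signal multiplied by the asset return."""
    out = df.sort_values(["Company", "Date"])
    # Lag the signal one period per company (assumption, to avoid look-ahead bias)
    position = out.groupby("Company")[signal_col].shift(1).fillna(0)
    # The first return of each company is NaN from pct_change; treat it as 0
    return position * out["Return"].fillna(0)

# Toy data, made up for illustration
toy = pd.DataFrame({
    "Company": ["AAPL"] * 4,
    "Date": pd.date_range("2023-01-02", periods=4),
    "Return": [float("nan"), 0.10, -0.05, 0.02],
    "SMA_Signal": [1, 1, -1, -1],
})
strategy_returns = backtest_signal(toy, "SMA_Signal")
```

The resulting series can be passed directly to a Sharpe ratio calculation; dropping the one-period lag reproduces a plain same-day multiplication.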



### Results: Sharpe Ratio Comparison
We computed Sharpe Ratios for each strategy to determine the most profitable and risk-efficient approach.

```python
print("Sharpe Ratios for Trading Strategies")
print(f" SMA Strategy Sharpe Ratio: {sma_sharpe:.4f}")
print(f" EMA Strategy Sharpe Ratio: {ema_sharpe:.4f}")
print(f" Mean Reversion Strategy Sharpe Ratio: {mean_reversion_sharpe:.4f}")
print(f" Momentum Strategy Sharpe Ratio: {momentum_sharpe:.4f}")
print(f" MAC Strategy Sharpe Ratio: {mac_sharpe:.4f}")
print(f" Scalping Strategy Sharpe Ratio: {scalping_sharpe:.4f}")
print(f" Swing Trading Strategy Sharpe Ratio: {swing_sharpe:.4f}")
print(f" Buy-and-Hold Strategy Sharpe Ratio: {buy_hold_sharpe:.4f}")
```

# Results

You may have done tons of work on this. Not all of it belongs here. 

Reports should have a __narrative__. Once you've looked through all your results over the quarter, decide on one main point and 2-4 secondary points you want us to understand. Include the detailed code and analysis results of those points only; you should spend more time/code/plots on your main point than the others.

If you went down any blind alleys that you later decided to not pursue, please don't abuse the TAs' time by throwing in 81 lines of code and 4 plots related to something you actually abandoned.  Consider deleting things that are not important to your narrative.  If it's slightly relevant to the narrative or you just want us to know you tried something, you could keep it in by summarizing the result in this report in a sentence or two, moving the actual analysis to another file in your repo, and providing us a link to that file.

### Subsection 1

You will likely have different subsections as you go through your report. For instance you might start with an analysis of the dataset/problem and from there you might be able to draw out the kinds of algorithms that are / aren't appropriate to tackle the solution.  Or something else completely if this isn't the way your project works.

### Subsection 2

Another likely section is if you are doing any feature selection through cross-validation or hand-design/validation of features/transformations of the data

### Subsection 3

Probably you need to describe the base model and demonstrate its performance.  Probably you should include a learning curve to demonstrate how much better the model gets as you increase the number of trials

### Subsection 4

Perhaps some exploration of the model selection (hyper-parameters) or algorithm selection task. Generally reinforcement learning tasks may require a huge amount of training, so extensive grid search is unlikely to be possible. However, exploring a few reasonable hyper-parameters may still be possible.  Validation curves, plots showing the variability of performance across folds of the cross-validation, etc. If you're doing one, the outcome of the null hypothesis test or parsimony principle check to show how you are selecting the best model.

### Subsection 5 

Maybe you do model selection again, but using a different kind of metric than before?  Or you compare a completely different approach/algorithm to the problem? Whatever, this stuff is just serving suggestions.



# Discussion

### Interpreting the result

OK, you've given us quite a bit of tech information above, now it's time to tell us what to pay attention to in all that.  Think clearly about your results, decide on one main point and 2-4 secondary points you want us to understand. Highlight HOW your results support those points.  You probably want 2-5 sentences per point.


### Limitations

Are there any problems with the work?  For instance would more data change the nature of the problem? Would it be good to explore more hyperparams than you had time for?   


### Future work
Looking at the limitations and/or the toughest parts of the problem and/or the situations where the algorithm(s) did the worst... is there something you'd like to try to make these better?

### Ethics & Privacy

If your project has obvious potential concerns with ethics or data privacy discuss that here.  Almost every ML project put into production can have ethical implications if you use your imagination. Use your imagination.

Even if you can't come up with an obvious ethical concern that should be addressed, you should know that a large number of ML projects that go into production have unintended consequences and ethical problems once in production. How will your team address these issues?

Consider a tool to help you address the potential issues such as https://deon.drivendata.org

### Conclusion

Reiterate your main point and in just a few sentences tell us how your results support it. Mention how this work would fit in the background/context of other work in this field if you can. Suggest directions for future work if you want to.

# Footnotes
<a name="lorenznote"></a>1.[^](#lorenz): Lorenz, T. (9 Dec 2021) Birds Aren’t Real, or Are They? Inside a Gen Z Conspiracy Theory. *The New York Times*. https://www.nytimes.com/2021/12/09/technology/birds-arent-real-gen-z-misinformation.html<br> 
<a name="admonishnote"></a>2.[^](#admonish): Also refs should be important to the background, not some randomly chosen vaguely related stuff. Include a web link if possible in refs as above.<br>
<a name="sotanote"></a>3.[^](#sota): Perhaps the current state of the art solution such as you see on [Papers with code](https://paperswithcode.com/sota). Or maybe not SOTA, but rather a standard textbook/Kaggle solution to this kind of problem
