# COGS 188 - Project Proposal

# Project Description

You have the choice of doing either (1) an "AI solves a problem" style project or (2) running a Special Topics class on a topic of your choice. If you want to do (2), you should fill out the _other_ proposal for that. This is the proposal description for (1).

You will design and execute a machine learning project. There are a few constraints on the nature of the allowed project. 
- The problem addressed will not be a "toy problem" or "common training students problem" like 8-Queens or a small Traveling Salesman Problem or similar
- If it's the kind of problem (e.g., RL) that interacts with a simulator or live task, then the problem will have a reasonably complex action space. For instance, a Wumpus World kind of thing with a 9x9 grid is definitely too small. A simulated mountain car with a less complex 2-d road and simplified dynamics seems like a fairly low achievement level. A more complex 3-d mountain car simulation with large extent and realistic dynamics, sure sounds great!
- If it's the kind of problem that uses a dataset, then the dataset will have >1k observations and >5 variables. I'd prefer more like >10k observations and >10 variables. A general rule is that if you have >100x more observations than variables, your solution will likely generalize a lot better. The goal of training an unsupervised machine learning model is to learn the underlying pattern in a dataset in order to generalize well to unseen data, so choosing a large dataset is very important.
- The project must include some elements we talked about in the course
- The project will include a model selection and/or feature selection component where you will be looking for the best setup to maximize the performance of your AI system. Generally RL tasks may require a huge amount of training, so extensive grid search is unlikely to be possible. However, exploring a few reasonable hyper-parameters may still be possible.
- You will evaluate the performance of your AI system using more than one appropriate metric
- You will be writing a report describing and discussing these accomplishments


Feel free to delete this description section when you hand in your proposal.

# Names

- Nicholas Gao: background, problem statement, data 
- Ryan Chen: Expectations, Timeline, Evaluation, proposed solution, name
- Matthew Miyagishima: Project Description, Abstract, Ethics and Privacy

# Abstract 
This section should be short and clearly stated. It should be a single paragraph <200 words.  It should summarize: 
- what your goal/problem is
- what the data used represents and how they are measured
- what you will be doing with the data
- how performance/success will be measured

The goal of our project is to design a stock trading agent that interacts with historical stock data and learns optimal trading strategies using Markov Decision Processes (MDP) and Reinforcement Learning (RL). We will use historical stock data from Yahoo Finance, accessed through the yfinance Python package. The dataset stores key features such as Opening Price, Highest Price, Lowest Price, Closing Price, Trading Volume, and Date, which are measured daily. First, we will prepare the data by cleaning missing values and normalizing key features to ensure consistency. Then we will train an agent to make buy, sell, or hold decisions based on past market trends, using reinforcement learning algorithms such as Q-Learning and Monte Carlo simulations. The performance of the agent will be evaluated using ...

# Background

Fill in the background and discuss the kind of prior work that has gone on in this research area here. **Use inline citation** to specify which references support which statements. You can do that through HTML footnotes (demonstrated here). I used to recommend Markdown footnotes (google is your friend) because they are simpler, but recently I have had some problems with them working for me, whereas HTML ones always work so far. So use the method that works for you, but do use inline citations.

Here is an example of inline citation. After government genocide in the 20th century, real birds were replaced with surveillance drones designed to look just like birds<a name="lorenz"></a>[<sup>[1]</sup>](#lorenznote). Use a minimum of 3 to 5 citations, but we prefer more <a name="admonish"></a>[<sup>[2]</sup>](#admonishnote). You need enough citations to fully explain and back up important facts. 

Remember you are trying to explain why someone would want to answer your question or why your hypothesis is in the form that you've stated.

# Problem Statement

Clearly describe the problem that you are solving. Avoid ambiguous words. The problem described should be well defined and should have at least one ML-relevant potential solution. Additionally, describe the problem thoroughly such that it is clear that the problem is quantifiable (the problem can be expressed in mathematical or logical terms), measurable (the problem can be measured by some metric and clearly observed), and replicable (the problem can be reproduced and occurs more than once).

# Data

You should have a strong idea of what dataset(s) will be used to accomplish this project. 

If you know what (some) of the data you will use, please give the following information for each dataset:
- link/reference to obtain it
- description of the size of the dataset (# of variables, # of observations)
- what an observation consists of
- what some critical variables are, how they are represented
- any special handling, transformations, cleaning, etc. that will be needed

If you don't yet know what your dataset(s) will be, you should describe what you desire in terms of the above bullets.

# Proposed Solution

The solution to the problem statement above will be agents trained on stock trading. Each agent will be trained to buy, hold, or sell stocks in its portfolio to maximize its returns. Using two different reinforcement learning approaches, we will evaluate how differently each trained agent behaves. Agents will be trained on the data mentioned above (price data and technical indicators) to make optimal stock trading decisions. While we are not considering another model as a benchmark, we will benchmark our agents against historical averages of the S&P 500.

**Monte Carlo Methods**

The agent will simulate the entire trading period using historical data of stocks in the training set to calculate reward values for actions taken at different states, as well as generate an optimal policy to take advantage of bullish or bearish markets.
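
As a rough illustration of the idea above, here is a minimal sketch of how the rewards from one simulated trading episode could be turned into action-value estimates. It assumes first-visit Monte Carlo averaging, which is one common variant and not necessarily the one we will end up using. The state labels (`"up"`, `"down"`), the reward numbers, and the helper names `first_visit_mc_update` and `q_estimate` are hypothetical placeholders.

```python
from collections import defaultdict


def first_visit_mc_update(episode, q_sums, q_counts, gamma=1.0):
    """Add the returns from one episode of (state, action, reward) steps."""
    # Record the step at which each (state, action) pair first appears.
    first_visit = {}
    for t, (state, action, _) in enumerate(episode):
        first_visit.setdefault((state, action), t)

    g = 0.0
    # Walk backward so g holds the discounted return from step t onward.
    for t in range(len(episode) - 1, -1, -1):
        state, action, reward = episode[t]
        g = reward + gamma * g
        if first_visit[(state, action)] == t:
            q_sums[(state, action)] += g
            q_counts[(state, action)] += 1


def q_estimate(q_sums, q_counts, state, action):
    """Average return observed so far for a (state, action) pair."""
    n = q_counts[(state, action)]
    return q_sums[(state, action)] / n if n else 0.0


# Hypothetical episode: the state is a made-up label for the recent price trend.
q_sums = defaultdict(float)
q_counts = defaultdict(int)
episode = [("down", "buy", 0.0), ("up", "hold", 2.0), ("up", "sell", 1.0)]
first_visit_mc_update(episode, q_sums, q_counts)
```

A greedy policy could then pick, for each state, the action with the highest averaged return once enough episodes have been simulated.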

**Q-Learning**

Q-Learning is an algorithm that learns the optimal action at each state, so that once trained the model simply needs to follow the selected actions. We will implement this using a hash table where the keys are the trading days and the values are the actions to take.
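
For reference, here is a minimal sketch of a single tabular Q-learning update with epsilon-greedy action selection. Note the assumption: this sketch keys the table on `(state, action)` pairs, which is the textbook tabular layout, rather than on trading days as described above. The state label `"up"`, the reward value, and the names `choose_action` and `q_update` are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ("buy", "hold", "sell")


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy selection over the three trading actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # explore
    return max(ACTIONS, key=lambda a: q_table[(state, a)])  # exploit


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Move Q(s, a) toward the target r + gamma * max_a' Q(s', a')."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])


# Hypothetical single step: buying in an "up" state earned a reward of 1.0.
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
q_update(q_table, "up", "buy", reward=1.0, next_state="up", alpha=0.5, gamma=0.9)
```

Keying on a state description instead of a calendar date is what would let the learned values carry over to days in the test period that the agent has never seen.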

# Evaluation Metrics

Our main evaluation metric will be how much the agent grows or shrinks its portfolio over the test period. We will give the agent a starting portfolio at the beginning of the test period and evaluate the portfolio's worth daily throughout testing to measure how well the agent is doing. We believe this is a good evaluation metric because the agent's main goal is to maximize gains through buying, holding, and selling stocks.

A mathematical representation of this metric would be

$G_T = V_T - V_0$

Where
- $G_T$ is the gain/loss on day T
- $V_T$ is the value of the portfolio on day T
- $V_0$ is the value of the portfolio at the beginning of the test period
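
The metric above could be computed from a list of daily portfolio values roughly as follows. This is a small sketch; the function name `portfolio_gains` and the dollar values are made up for illustration.

```python
def portfolio_gains(values):
    """Return G_t = V_t - V_0 for every day t in a list of daily portfolio values."""
    if not values:
        raise ValueError("values must contain at least the starting value V_0")
    v0 = values[0]
    return [v - v0 for v in values]


# Hypothetical daily portfolio values; the first entry is V_0.
daily_values = [10000.0, 10150.0, 9900.0, 10400.0]
gains = portfolio_gains(daily_values)
print(gains[-1])  # gain or loss on the final day of the test period
```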

# Ethics & Privacy

Developing a stock trading agent raises several ethical and privacy concerns, particularly in fairness, transparency, and security. The first concern is market fairness, because algorithmic trading has the potential to contribute to market manipulation, flash crashes, and unfair trading advantages for those with more computational resources. Furthermore, high-frequency trading firms already exploit small inefficiencies that human traders cannot. The next concern is that training models on historical data must be done cautiously to avoid overfitting to past trends, which could mislead users into making bad financial decisions. Ensuring a transparent and interpretable agent is crucial because black-box reinforcement learning models can make unpredictable trades. Lastly, the societal impact should be considered because automated trading can influence the price of stocks and exacerbate market volatility and systemic risks. If the agent is deployed at scale, it will need safeguards to prevent market spikes and crashes.

# Team Expectations 

* Respond in a timely manner (within the day unless message sent after business hours) to communications via text, calls, or emails.
* Attend all scheduled team meetings unless an absence is communicated and excused beforehand.
* Be punctual in attending team meetings.
* Split work evenly and deliver assigned tasks in a timely manner.
* Communicate openly about any issues/concerns/questions/etc. with other team members.
* Collaborate effectively via the GitHub repository, including descriptive commit messages.

# Project Timeline Proposal

| Meeting Date  | Meeting Time| Completed Before Meeting  | Discuss at Meeting |
|---|---|---|---|
| 2/13  |  8 PM |  Brainstorm topics/questions (all)  | Determine best form of communication; Discuss and decide on final project topic; discuss hypothesis; begin background research | 
| 2/14  |  11 PM |  Edit, finalize, and submit proposal |  |
| 2/20  | 8 AM  | Project Proposal | Discuss how to build the agent and assign tasks for members to lead |
| 2/27  | 8 PM  | Have data cleaning completed | Brainstorm how to start agent training and start programming |
| 3/6  | 8 PM  | Finalize initial agent code | Begin programming for optimization; Discuss/edit project code; Complete project |
| 3/13  | 8 PM  | Complete analysis; Draft results/conclusion/discussion | Discuss/edit full project |
| 3/19  | Before 11:59 PM  | NA | Turn in Final Project  |

# Footnotes
<a name="lorenznote"></a>1.[^](#lorenz): Lorenz, T. (9 Dec 2021) Birds Aren’t Real, or Are They? Inside a Gen Z Conspiracy Theory. *The New York Times*. https://www.nytimes.com/2021/12/09/technology/birds-arent-real-gen-z-misinformation.html<br> 
<a name="admonishnote"></a>2.[^](#admonish): Also refs should be important to the background, not some randomly chosen vaguely related stuff. Include a web link if possible in refs as above.<br>
<a name="sotanote"></a>3.[^](#sota): Perhaps the current state of the art solution such as you see on [Papers with code](https://paperswithcode.com/sota). Or maybe not SOTA, but rather a standard textbook/Kaggle solution to this kind of problem
