# RL Trading Agent Project Documentation

This notebook is an overview and usage guide for the RL Trading Agent project. The project uses reinforcement learning to train an agent that trades stocks based on features engineered from historical price data.

## Project Structure

```
RLTradingAgent/
├── agent/
│   ├── train.py         # Training script
│   ├── eval.py          # Evaluation script
│   ├── features.py      # Feature engineering and data splitting
│   ├── prepare_data.py  # Data preparation utility
│   └── env.py           # Custom trading environment
├── data/
│   ├── MSFT_train.csv   # Training data
│   ├── MSFT_test.csv    # Test data
│   └── MSFT_val.csv     # Validation data
└── Project_Documentation.ipynb  # This documentation
```

## 1. Feature Engineering (`features.py`)

The `features.py` module loads historical stock data and computes a rich set of features for RL training:
- **log_return:** Logarithmic returns
- **sma20, sma50:** Simple moving averages
- **volatility20:** Rolling volatility
- **ema12, ema26:** Exponential moving averages
- **macd, macd_signal:** MACD and its signal line
- **rsi14:** Relative Strength Index
- **bollinger_mid, bollinger_up, bollinger_down:** Bollinger Bands
- **volume_change:** Percentage change in volume
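
The indicators above can be sketched with pandas as follows. This is a minimal illustration, not the code in `features.py`: the function name `add_features`, the input column names `Close` and `Volume`, the MACD signal span of 9, the two-standard-deviation Bollinger width, and the simple-average RSI are all assumptions based on common conventions.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the indicator columns appended.

    Assumes 'Close' and 'Volume' columns (hypothetical names).
    """
    out = df.copy()
    close = out["Close"]

    # Returns, moving averages, and rolling volatility of returns
    out["log_return"] = np.log(close / close.shift(1))
    out["sma20"] = close.rolling(20).mean()
    out["sma50"] = close.rolling(50).mean()
    out["volatility20"] = out["log_return"].rolling(20).std()

    # Exponential moving averages and MACD (signal span of 9 is an assumption)
    out["ema12"] = close.ewm(span=12, adjust=False).mean()
    out["ema26"] = close.ewm(span=26, adjust=False).mean()
    out["macd"] = out["ema12"] - out["ema26"]
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()

    # RSI from simple rolling means of gains and losses
    # (Wilder's smoothing is another common variant)
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["rsi14"] = 100 - 100 / (1 + gain / loss)

    # Bollinger Bands: 20-day mean plus/minus two standard deviations
    std20 = close.rolling(20).std()
    out["bollinger_mid"] = out["sma20"]
    out["bollinger_up"] = out["sma20"] + 2 * std20
    out["bollinger_down"] = out["sma20"] - 2 * std20

    # Day-over-day fractional change in volume
    out["volume_change"] = out["Volume"] / out["Volume"].shift(1) - 1

    # Drop the warm-up rows where the rolling windows are not yet full
    return out.dropna().reset_index(drop=True)
```

Because `sma50` needs 50 rows before it produces a value, the first 49 rows are dropped in this sketch.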

The `split_data` function divides the dataset into train, test, and validation sets based on date ranges.
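
A date-based split could look like the sketch below. The function name `split_by_date`, the `Date` column, the boundary arguments, and the chronological order of the three sets are assumptions for illustration; the actual `split_data` signature and date ranges may differ.

```python
import pandas as pd


def split_by_date(df, first_end, second_end, date_col="Date"):
    """Split df into three chronological, non-overlapping parts.

    Rows up to and including first_end go to the first part, rows after
    first_end up to and including second_end to the second, the rest to
    the third. Which part serves as train, validation, or test is up to
    the caller.
    """
    dates = pd.to_datetime(df[date_col])
    first_end = pd.Timestamp(first_end)
    second_end = pd.Timestamp(second_end)

    first = df[dates <= first_end]
    second = df[(dates > first_end) & (dates <= second_end)]
    third = df[dates > second_end]
    return first, second, third
```

Splitting on dates rather than shuffling rows keeps future prices out of the training set.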

```python
# Example: Load and inspect features
from agent.features import load_data, split_data

df = load_data(ticker="MSFT")
df.head()
```

## 2. Data Preparation (`prepare_data.py`)

This script loads raw data, computes features, splits the data, and saves the resulting CSV files for training, testing, and validation.

Run from terminal:
```bash
python agent/prepare_data.py
```
Output files are saved in the `data/` directory.

## 3. Trading Environment (`env.py`)

`env.py` implements a custom trading environment that follows the Gym-style API (`reset` returns `obs, info`; `step` returns a five-element tuple):
- **Observation:** Flattened window of engineered features for the past 30 days
- **Action space:** Discrete (0=SELL, 1=HOLD, 2=BUY)
- **Reward:** Based on trading actions, profit/loss, and volatility penalty
- **State variables:** Balance, shares held, equity

The environment is compatible with Stable Baselines3 RL algorithms.
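
The pieces of the environment described above can be sketched as plain functions. This is not the code in `env.py`: the helper names, the all-in/all-out execution rule, and the `penalty` weight are hypothetical, chosen only to show how a flattened 30-day window, the three actions, and a volatility-penalized reward might fit together.

```python
import numpy as np

SELL, HOLD, BUY = 0, 1, 2


def make_observation(features: np.ndarray, t: int, window: int = 30) -> np.ndarray:
    """Flatten the `window` feature rows preceding step t into a 1-D vector."""
    return features[t - window:t].astype(np.float32).ravel()


def apply_action(balance: float, shares: float, price: float, action: int):
    """Assumed execution rule: BUY spends the whole balance, SELL liquidates."""
    if action == BUY and balance > 0:
        shares += balance / price
        balance = 0.0
    elif action == SELL and shares > 0:
        balance += shares * price
        shares = 0.0
    return balance, shares


def step_reward(prev_equity: float, equity: float, volatility: float,
                penalty: float = 0.1) -> float:
    """Relative equity change minus a volatility penalty (assumed weighting)."""
    return (equity - prev_equity) / prev_equity - penalty * volatility
```

Equity at any step is `balance + shares * price`, so an observation for a table with `k` feature columns has length `30 * k` under these assumptions.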

```python
# Example: Initialize the environment and take a few random steps
import pandas as pd
from agent.env import TradingEnv

df = pd.read_csv("data/MSFT_train.csv")
env = TradingEnv(df)
obs, info = env.reset()
for _ in range(5):
    action = env.action_space.sample()  # random action: 0=SELL, 1=HOLD, 2=BUY
    obs, reward, terminated, truncated, info = env.step(action)
    print(f"Action: {action}, Reward: {reward:.2f}, Equity: {info['equity']:.2f}")
    if terminated or truncated:  # stop on either end-of-episode signal
        break
```

## 4. Training the RL Agent (`train.py`)

`train.py` trains a PPO agent with Stable Baselines3 on the custom environment.

**Key hyperparameters:**
- Policy: MlpPolicy
- Learning rate: 3e-4
- Batch size: 64
- n_steps: 2048
- Gamma: 0.99
- GAE lambda: 0.95
- Entropy coefficient: 0.03
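
As a rough sketch, these settings map onto the keyword arguments of the Stable Baselines3 `PPO` constructor as shown below. The helper `build_and_train` and its `total_timesteps` argument are hypothetical; the number of training steps used by `train.py` is not stated here, so the caller must supply it.

```python
# Hyperparameters from the list above, keyed by PPO constructor argument names
PPO_KWARGS = {
    "policy": "MlpPolicy",
    "learning_rate": 3e-4,
    "batch_size": 64,
    "n_steps": 2048,
    "gamma": 0.99,
    "gae_lambda": 0.95,
    "ent_coef": 0.03,
}


def build_and_train(env, total_timesteps: int):
    """Hypothetical helper: build a PPO model, train it, and save it."""
    # Imported lazily so the settings can be inspected without the library
    from stable_baselines3 import PPO

    model = PPO(env=env, verbose=1, **PPO_KWARGS)
    model.learn(total_timesteps=total_timesteps)
    model.save("ppo_trader")  # written to disk as ppo_trader.zip
    return model
```

With `n_steps` of 2048 and a batch size of 64, each rollout from a single environment is split into 32 minibatches per optimization epoch.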

Run training from terminal:
```bash
python agent/train.py
```
The trained model is saved as `ppo_trader.zip`.

## 5. Evaluating the Agent (`eval.py`)

`eval.py` loads the trained model, evaluates it on the test set, and compares the agent's equity curve with a Buy & Hold baseline.

Run evaluation from terminal:
```bash
python agent/eval.py
```
A plot is displayed showing RL agent vs Buy & Hold performance.
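
For reference, a Buy & Hold baseline of the kind used in this comparison can be computed in a few lines. The sketch assumes the whole starting balance is invested at the first price and never traded again; the function names and the default `initial_balance` are hypothetical and may not match what `eval.py` does.

```python
def buy_and_hold_equity(prices, initial_balance=10_000.0):
    """Equity curve from investing the full balance at the first price."""
    shares = initial_balance / prices[0]
    return [shares * price for price in prices]


def total_return(equity):
    """Fractional gain or loss between the first and last equity values."""
    return equity[-1] / equity[0] - 1.0
```

Applying `total_return` to both the agent's equity curve and the baseline gives a single number per strategy to read alongside the plot.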

```python
# Example: Evaluate the trained agent
import pandas as pd
from stable_baselines3 import PPO
from agent.eval import evaluate_model

df_test = pd.read_csv("data/MSFT_test.csv")
model = PPO.load("ppo_trader")
agent_equity, actions = evaluate_model(model, df_test, window_size=30)
print("Final equity:", agent_equity[-1])
```

## 6. Extending the Project

- **Try different tickers:** Change the ticker in `prepare_data.py` and retrain.
- **Tune hyperparameters:** Adjust PPO settings in `train.py`.
- **Add features:** Enhance `features.py` with more indicators.
- **Improve reward function:** Refine reward logic in `env.py`.
- **Implement risk management:** Add drawdown or risk metrics.
- **Backtest on validation set:** Use `MSFT_val.csv` for out-of-sample testing.
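
As one possible starting point for the risk-management item, maximum drawdown can be computed directly from an equity curve. The function below is a hypothetical helper, not part of the project, and assumes strictly positive equity values.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest equity seen so far
        worst = max(worst, (peak - value) / peak)  # deepest drop from that peak
    return worst
```

The result could be reported next to final equity after evaluation, or subtracted from the reward in `env.py` as an additional penalty.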

## 7. References
- [Stable Baselines3 Documentation](https://stable-baselines3.readthedocs.io/)
- [OpenAI Gym](https://gym.openai.com/)
- [Pandas Documentation](https://pandas.pydata.org/)
- [yfinance Documentation](https://pypi.org/project/yfinance/)

---
For further questions, see code comments or contact the project maintainer.