# 🤖 AlphaTraderLab v0 - Random Agent Demo

---

## Welcome to AlphaTraderLab!

In this notebook, we'll:
1. Set up the trading environment
2. Download historical Bitcoin price data
3. Test the environment with a **random agent** (no learning yet!)
4. Visualize the results

**What is a Random Agent?**  
It's an agent that chooses actions (FLAT, LONG, or SHORT) uniformly at random, without any intelligence. Think of it as rolling a three-sided die to decide what to trade. It won't trade well, but it's a great way to test that our environment works correctly!

In the next stage of the project (**Step 2**), we'll train a smart RL agent that actually learns from data.

---

## 📦 Step 1: Setup and Installation

First, we need to install all required packages. This cell detects if we're running in Google Colab and installs dependencies if needed.
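
Before installing anything, it can help to see which packages are already importable. The sketch below is one way to check, using only the standard library. The `missing_packages` helper is a hypothetical name, and the list uses import names rather than pip names (for example, `stable-baselines3` is imported as `stable_baselines3`).

```python
import importlib.util

def missing_packages(import_names):
    """Return the import names that cannot be found in this Python environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]

# Import names (not pip names) for the packages this notebook installs.
required = ["numpy", "pandas", "matplotlib", "yfinance", "gymnasium", "stable_baselines3", "scipy"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If the list comes back non-empty when running locally, installing from `requirements.txt` should resolve it.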

In [None]:
# Detect if we're running in Google Colab
import sys

IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    print("🌐 Running in Google Colab - Installing dependencies...")
    
    # Install required packages
    !pip install -q numpy pandas matplotlib yfinance gymnasium stable-baselines3 scipy
    
    print("✅ Installation complete!")
else:
    print("💻 Running locally - Make sure you've installed the packages in requirements.txt")
    print("   Run: pip install -r requirements.txt")

## 📚 Step 2: Import Libraries

Now let's import all the Python libraries we need:
- **numpy**: For numerical operations
- **pandas**: For handling data (price tables)
- **matplotlib**: For creating charts
- **yfinance**: For downloading historical market data
- **gymnasium**: The RL environment framework

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
import gymnasium as gym
from datetime import datetime, timedelta
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

print("✅ All libraries imported successfully!")

## 🔧 Step 3: Load the Trading Environment

If we're in Colab, we need to upload our custom trading environment code.  
If we're running locally, we can just import it directly.

In [None]:
if IN_COLAB:
    print("📁 In Colab: Please upload the trading_env.py file")
    print("   (You can find it in the 'envs' folder of the project)")
    print()
    
    from google.colab import files
    
    # Upload the trading environment file
    print("👉 Click 'Choose Files' and select 'trading_env.py'")
    uploaded = files.upload()
    
    print("\n✅ File uploaded!")
    
    # Import the TradingEnv class
    from trading_env import TradingEnv
    
else:
    print("💻 Running locally - Importing TradingEnv from local files")
    
    # Add parent directory to path so we can import from envs
    import os
    import sys
    
    # Go up one level from notebooks/ to reach the project root
    project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    
    # Now import from the envs package
    from envs.trading_env import TradingEnv

print("✅ TradingEnv loaded successfully!")

## 📊 Step 4: Download Historical Market Data

Let's download Bitcoin (BTC-USD) historical price data from Yahoo Finance.  
We'll get **daily candles** from 2018-01-01 up to, but not including, today's date (yfinance treats the end date as exclusive).

**What is OHLCV data?**
- **O**pen: Price at the start of the day
- **H**igh: Highest price during the day
- **L**ow: Lowest price during the day
- **C**lose: Price at the end of the day
- **V**olume: How much was traded
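
To make the five fields concrete, here is a small sketch of how one day's candle could be built from a list of individual trades. The `Candle` class, the `build_candle` helper, and the sample trades are all made up for illustration; Yahoo Finance delivers the candles already aggregated.

```python
from dataclasses import dataclass

@dataclass
class Candle:
    open: float
    high: float
    low: float
    close: float
    volume: float

def build_candle(trades):
    """Aggregate (price, size) trades, ordered by time, into one OHLCV candle."""
    if not trades:
        raise ValueError("need at least one trade to build a candle")
    prices = [price for price, _ in trades]
    return Candle(
        open=prices[0],                          # first traded price
        high=max(prices),                        # highest traded price
        low=min(prices),                         # lowest traded price
        close=prices[-1],                        # last traded price
        volume=sum(size for _, size in trades),  # total size traded
    )

# Hypothetical trades for a single day: (price in USD, size in BTC)
trades = [(100.0, 0.5), (104.0, 1.0), (98.0, 0.25), (101.0, 2.0)]
candle = build_candle(trades)
print(candle)
```

Note that the high and low always bracket the open and close, which is a quick sanity check for any downloaded candle.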

In [None]:
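print("📡 Downloading BTC-USD historical data from Yahoo Finance...")
print("   This may take a few seconds...")
print()

# Download Bitcoin data (daily candles)
ticker = "BTC-USD"
start_date = "2018-01-01"
end_date = datetime.now().strftime("%Y-%m-%d")  # yfinance treats `end` as exclusive

# Fetch the data
df = yf.download(ticker, start=start_date, end=end_date, interval="1d", progress=False)

# Stop early if nothing came back (yfinance returns an empty DataFrame on failure)
if df.empty:
    raise RuntimeError(f"No data returned for {ticker}; check your connection and try again.")

# Recent yfinance versions may return MultiIndex columns such as ('Close', 'BTC-USD'),
# even for a single ticker. Keep only the field names so that df['Close'] is a Series.
if isinstance(df.columns, pd.MultiIndex):
    df.columns = df.columns.get_level_values(0)

# Display basic info
print(f"✅ Downloaded {len(df)} days of {ticker} data")
print(f"   Date range: {df.index[0].strftime('%Y-%m-%d')} to {df.index[-1].strftime('%Y-%m-%d')}")
print()
print("📋 First 5 rows of the data:")
print(df.head())
print()
print("📋 Last 5 rows of the data:")
print(df.tail())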

## 📈 Step 5: Visualize the Price Data

Let's plot the Bitcoin closing price over time to see what the data looks like.

In [None]:
# Create a figure with a nice size
plt.figure(figsize=(14, 6))

# Plot the closing price
plt.plot(df.index, df['Close'], label='BTC-USD Close Price', color='#FF6B35', linewidth=2)

# Customize the plot
plt.title('📊 Bitcoin (BTC-USD) Historical Price', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price (USD)', fontsize=12)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()

# Show the plot
plt.show()

print(f"💰 Latest BTC close in this dataset: ${df['Close'].iloc[-1]:,.2f}")
print(f"📈 Highest close in this dataset: ${df['Close'].max():,.2f}")
print(f"📉 Lowest close in this dataset: ${df['Close'].min():,.2f}")

## 🎮 Step 6: Create the Trading Environment

Now let's create an instance of our `TradingEnv` with the Bitcoin data we just downloaded.

We'll configure it with:
- **window_size = 30**: The agent sees the last 30 days of price data
- **initial_balance = $10,000**: Starting with $10k
- **transaction_cost = 0.1%**: Small fee when we change positions
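
A 0.1% fee looks small, but it compounds when positions change often. The sketch below assumes the fee is charged as a fraction of current equity every time the position changes; that is an assumption for illustration, and the actual fee logic lives in `trading_env.py`. It also ignores price moves entirely, so it isolates the effect of fees.

```python
def equity_after_fees(initial_balance, transaction_cost, num_position_changes):
    """Equity left after paying a proportional fee on each position change.

    Assumes (for illustration only) that each change costs
    `transaction_cost` times the current equity and that prices never move.
    """
    equity = initial_balance
    for _ in range(num_position_changes):
        equity *= 1.0 - transaction_cost
    return equity

# A uniformly random agent with 3 actions picks a different action than its
# previous one about 2/3 of the time, so roughly 133 changes in 200 steps.
for changes in (10, 133, 200):
    remaining = equity_after_fees(10_000.0, 0.001, changes)
    print(f"{changes:3d} position changes -> ${remaining:,.2f}")
```

Under these assumptions, changing position on every one of 200 steps would cost roughly 18% of the starting balance in fees alone.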

In [None]:
print("🎮 Creating the trading environment...")

# Create the environment
env = TradingEnv(
    df=df,
    window_size=30,          # Agent sees 30 days of history
    initial_balance=10000.0, # Start with $10,000
    transaction_cost=0.001   # 0.1% fee per trade
)

print("✅ Environment created successfully!")
print()
print("📊 Environment Details:")
print(f"   - Observation space: {env.observation_space}")
print(f"   - Action space: {env.action_space}")
print(f"   - Actions: 0=FLAT, 1=LONG, 2=SHORT")
print(f"   - Data length: {len(df)} days")
print(f"   - Window size: {env.window_size} days")

## 🎲 Step 7: Test with a Random Agent

Let's run a simple test: we'll create an agent that takes **random actions** and see how it performs.

This is just to verify that our environment works correctly. The random agent will:
1. Reset the environment
2. Take 200 random steps
3. Track the portfolio value at each step

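**Expected result**: On average, random trading loses money because it has no edge and pays a fee on every position change, although any single 200-step run can still finish above or below the starting balance.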
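
The test loop follows the standard Gymnasium pattern: `reset()` returns `(observation, info)`, and `step(action)` returns `(observation, reward, terminated, truncated, info)`. The sketch below mirrors that pattern with a tiny stand-in environment so the control flow can be seen in isolation. `ToyEnv` and its price series are hypothetical and deliberately simpler than `TradingEnv`: the toy ignores transaction costs and uses Python's `random` module instead of `env.action_space.sample()`.

```python
import random

class ToyEnv:
    """A tiny stand-in that mimics the Gymnasium reset/step signatures."""

    POSITION = {0: 0, 1: 1, 2: -1}  # 0=FLAT, 1=LONG, 2=SHORT

    def __init__(self, prices, initial_balance=10_000.0):
        self.prices = prices
        self.initial_balance = initial_balance

    def reset(self, seed=None):
        self.t = 0
        self.equity = self.initial_balance
        return self.prices[self.t], {"equity": self.equity}

    def step(self, action):
        # The chosen position earns (or loses) the next price change.
        price_return = self.prices[self.t + 1] / self.prices[self.t] - 1.0
        reward = self.POSITION[action] * price_return
        self.equity *= 1.0 + reward
        self.t += 1
        terminated = self.t == len(self.prices) - 1  # ran out of data
        truncated = False                            # no step limit in this toy
        return self.prices[self.t], reward, terminated, truncated, {"equity": self.equity}

rng = random.Random(42)  # dedicated RNG so runs are repeatable
env_demo = ToyEnv(prices=[100.0, 102.0, 101.0, 105.0, 103.0])

obs, step_info = env_demo.reset()
equity_curve = [step_info["equity"]]
finished = False
while not finished:
    action = rng.choice([0, 1, 2])
    obs, reward, terminated, truncated, step_info = env_demo.step(action)
    equity_curve.append(step_info["equity"])
    finished = terminated or truncated  # stop on either signal

print(f"Steps taken: {len(equity_curve) - 1}, final equity: ${equity_curve[-1]:,.2f}")
```

Checking both `terminated` and `truncated` matters because an environment may end an episode for either reason.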

In [None]:
print("🎲 Running a random agent for 200 steps...")
print("   (This agent randomly chooses: FLAT, LONG, or SHORT)")
print()

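# Reset the environment
observation, info = env.reset(seed=42)

# Seed the action space too: action_space.sample() uses its own random generator,
# which np.random.seed() and env.reset(seed=...) do not seed
env.action_space.seed(42)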

# Storage for tracking performance
equity_history = [env.initial_balance]  # Start with initial balance
action_history = []
reward_history = []

# Run for 200 steps (or until done)
num_steps = 200
done = False

for step in range(num_steps):
    if done:
        break
    
    # Take a random action
    action = env.action_space.sample()  # Randomly choose 0, 1, or 2
    
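    # Execute the action; Gymnasium returns separate `terminated` and `truncated` flags
    observation, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # stop on either signal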
    
    # Store the results
    equity_history.append(info['equity'])
    action_history.append(action)
    reward_history.append(reward)
    
    # Print progress every 50 steps
    if (step + 1) % 50 == 0:
        action_names = ['FLAT', 'LONG', 'SHORT']
        print(f"   Step {step + 1:3d}: Action={action_names[action]}, "
              f"Equity=${info['equity']:,.2f}, Reward={reward:.4f}")

print()
print("✅ Random agent test complete!")
print()
print("📊 Final Results:")
print(f"   - Starting balance: ${env.initial_balance:,.2f}")
print(f"   - Final equity: ${equity_history[-1]:,.2f}")
print(f"   - Total return: {((equity_history[-1] / env.initial_balance) - 1) * 100:.2f}%")
print(f"   - Total steps: {len(equity_history) - 1}")
print(f"   - Episode ended: {'Yes' if done else 'No'}")

## 📊 Step 8: Visualize Agent Performance

Let's create some charts to see how the random agent performed.

In [None]:
# Create a figure with multiple subplots
fig, axes = plt.subplots(3, 1, figsize=(14, 10))

# Subplot 1: Equity Curve (Portfolio Value Over Time)
axes[0].plot(equity_history, color='#2E86AB', linewidth=2)
axes[0].axhline(y=env.initial_balance, color='gray', linestyle='--', label='Initial Balance')
axes[0].set_title('💰 Portfolio Equity Over Time', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Step')
axes[0].set_ylabel('Equity (USD)')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Subplot 2: Actions Taken
action_colors = ['#A8DADC', '#457B9D', '#E63946']  # FLAT=light blue, LONG=darker blue, SHORT=red
action_names = ['FLAT', 'LONG', 'SHORT']
for i in range(3):
    # Plot each action on its own row (y = action index) at the steps where it was chosen
    steps_with_action = [t for t, a in enumerate(action_history) if a == i]
    axes[1].scatter(
        steps_with_action,
        [i] * len(steps_with_action),
        c=action_colors[i],
        label=action_names[i],
        alpha=0.6,
        s=20
    )
axes[1].set_title('🎯 Actions Taken (0=FLAT, 1=LONG, 2=SHORT)', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Step')
axes[1].set_ylabel('Action')
axes[1].set_yticks([0, 1, 2])
axes[1].set_yticklabels(['FLAT', 'LONG', 'SHORT'])
axes[1].legend()
axes[1].grid(True, alpha=0.3)

# Subplot 3: Rewards Over Time
axes[2].plot(reward_history, color='#06A77D', linewidth=1, alpha=0.7)
axes[2].axhline(y=0, color='gray', linestyle='--', alpha=0.5)
axes[2].set_title('📈 Rewards per Step', fontsize=14, fontweight='bold')
axes[2].set_xlabel('Step')
axes[2].set_ylabel('Reward')
axes[2].grid(True, alpha=0.3)

# Adjust layout and show
plt.tight_layout()
plt.show()

# Print action distribution
print("\n📊 Action Distribution:")
action_counts = pd.Series(action_history).value_counts().sort_index()
for action_idx, count in action_counts.items():
    action_name = ['FLAT', 'LONG', 'SHORT'][action_idx]
    percentage = (count / len(action_history)) * 100
    print(f"   {action_name}: {count} times ({percentage:.1f}%)")

## 🔍 Step 9: Analyze the Environment

Let's take a closer look at what the agent actually "sees" (the observation).
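
As a rough mental model, the observation can be pictured as the last `window_size` candles flattened row by row, followed by two portfolio values. The sketch below builds such a vector from plain Python lists. The ordering of fields, the lack of normalization, the position encoding, and the `build_observation` helper are all assumptions for illustration; the real layout is defined in `trading_env.py`.

```python
def build_observation(candles, window_size, position, equity, initial_balance):
    """Flatten the last `window_size` OHLCV candles and append portfolio info.

    candles: list of (open, high, low, close, volume) tuples, oldest first.
    position: 0 for FLAT, 1 for LONG, -1 for SHORT (hypothetical encoding).
    """
    if len(candles) < window_size:
        raise ValueError("not enough candles to fill the window")
    window = candles[-window_size:]
    flat = [value for candle in window for value in candle]  # window_size * 5 values
    flat.append(float(position))           # current position
    flat.append(equity / initial_balance)  # equity ratio
    return flat

# Three made-up candles and a window of 2
candles = [
    (100.0, 105.0, 99.0, 104.0, 10.0),
    (104.0, 106.0, 101.0, 102.0, 12.0),
    (102.0, 103.0, 98.0, 99.0, 15.0),
]
obs = build_observation(candles, window_size=2, position=1,
                        equity=10_500.0, initial_balance=10_000.0)
print(len(obs))  # 2 * 5 + 2 = 12
print(obs)
```

With `window_size = 30`, the same layout would give 30 × 5 + 2 = 152 values.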

In [None]:
# Reset the environment to get a fresh observation
observation, info = env.reset(seed=123)

print("🔍 Analyzing the Observation Space")
print("=" * 50)
print()
print(f"📐 Observation shape: {observation.shape}")
print(f"   Total features: {len(observation)}")
print()
print("🧩 Observation breakdown:")
window_features = env.window_size * 5  # OHLCV = 5 features per candle
print(f"   - Market data (OHLCV): {window_features} values")
print(f"     ({env.window_size} candles × 5 features)")
print(f"   - Current position: 1 value")
print(f"   - Equity ratio: 1 value")
print()
print("📊 Sample observation values (first 10 features):")
print(f"   {observation[:10]}")
print()
print("📊 Sample observation values (last 5 features):")
print(f"   {observation[-5:]}")
print()
print("✅ The observation is a flat vector that the RL agent will learn from!")

## 🎯 Step 10: Understanding the Results

### What Did We Learn?

1. **The Environment Works!** ✅  
   We successfully created a trading environment that follows the Gymnasium API.

2. **Random Trading is Bad** 📉  
   The random agent probably lost money (or made very little). This shows that smart decision-making matters!

3. **Observations are Complex** 🧩  
   The agent sees 30 days of OHLCV data plus portfolio info, which is a lot of information to process.

4. **Actions Have Consequences** ⚡  
   Each action (FLAT/LONG/SHORT) affects the portfolio value, and transaction costs add up.

### What's Next?

In **Step 2** of the project, we'll:
- Train a **PPO (Proximal Policy Optimization)** agent
- The agent will learn to recognize profitable patterns
- Compare its performance to the random agent
- Add more sophisticated metrics (Sharpe ratio, max drawdown, etc.)
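
As a preview of those metrics, here is a minimal sketch of how a Sharpe ratio and a maximum drawdown could be computed from an equity curve like `equity_history`. It assumes daily steps, a zero risk-free rate, and 365 periods per year (crypto markets trade every day); the helper names and the sample curve are hypothetical.

```python
import math
import statistics

def step_returns(equity):
    """Simple returns between consecutive equity values."""
    return [curr / prev - 1.0 for prev, curr in zip(equity, equity[1:])]

def sharpe_ratio(equity, periods_per_year=365):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    returns = step_returns(equity)
    if len(returns) < 2:
        return 0.0
    volatility = statistics.stdev(returns)  # sample standard deviation
    if volatility == 0:
        return 0.0
    return statistics.mean(returns) / volatility * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

# Made-up equity curve for illustration
sample_equity = [10_000.0, 10_200.0, 9_900.0, 9_500.0, 10_100.0]
print(f"Sharpe ratio: {sharpe_ratio(sample_equity):.2f}")
print(f"Max drawdown: {max_drawdown(sample_equity):.2%}")
```

Total return alone hides how bumpy the ride was; these two numbers capture risk-adjusted performance and the worst loss from a peak.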

### Important Reminders

⚠️ **This is for learning only!**  
Do NOT use this to trade real money without extensive testing and validation.

🧠 **Learning Takes Time**  
RL agents need thousands (or millions) of steps to learn good strategies.

📚 **Keep Experimenting**  
Try different:
- Assets (stocks, crypto, forex)
- Window sizes
- Reward functions
- Transaction costs

---

## 🎉 Congratulations!

You've successfully set up your first RL trading environment! 🚀

Keep learning, keep experimenting, and remember: the goal is to understand the fundamentals.

**See you in Step 2!** 👋

---

## 🔧 Optional: Save the Results

If you want to save your results for later analysis, run this cell.
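
The file written by the next cell can be read back later without pandas. The sketch below parses CSV text in the same column layout (`step`, `equity`, `action`, `reward`) using only the standard library. The sample rows are made up, and the empty `action` and `reward` fields in the first row stand in for the `None` placeholders; the exact text pandas writes for those columns is an assumption here.

```python
import csv
import io

# Made-up rows in the same layout as random_agent_results.csv
sample_csv = """step,equity,action,reward
0,10000.0,,
1,10050.0,1.0,0.005
2,9949.5,2.0,-0.01
"""

def load_results(text_stream):
    """Parse result rows, turning empty fields into None."""
    rows = []
    for row in csv.DictReader(text_stream):
        rows.append({
            "step": int(row["step"]),
            "equity": float(row["equity"]),
            # Integer actions may be written as "1.0" when the column also holds a missing value
            "action": int(float(row["action"])) if row["action"] else None,
            "reward": float(row["reward"]) if row["reward"] else None,
        })
    return rows

results = load_results(io.StringIO(sample_csv))
total_return = results[-1]["equity"] / results[0]["equity"] - 1.0
print(f"Rows: {len(results)}, total return: {total_return:.2%}")
```

To read the real file, pass an open file handle instead, for example `open("random_agent_results.csv", newline="")`.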

In [None]:
# Save the equity history and action history to a CSV file
results_df = pd.DataFrame({
    'step': range(len(equity_history)),
    'equity': equity_history,
    'action': [None] + action_history,  # First row has no action
    'reward': [None] + reward_history    # First row has no reward
})

# Save to CSV
filename = 'random_agent_results.csv'
results_df.to_csv(filename, index=False)

print(f"💾 Results saved to: {filename}")
print(f"   You can download this file and analyze it later.")

# If in Colab, download the file
if IN_COLAB:
    from google.colab import files  # re-import in case the Step 3 cell was skipped
    files.download(filename)
    print(f"📥 File '{filename}' downloaded to your computer!")