# Data Collection for Reinforcement Learning in Finance

This notebook collects and prepares financial data for reinforcement learning applications. It downloads historical price data for a portfolio of stocks, filters out young assets, and creates technical features for machine learning.

## Overview
- Load portfolio holdings from CSV
- Download 20-year historical data using yfinance
- Filter assets based on data availability (minimum 5 years)
- Calculate technical indicators and features
- Save processed data for training

## Key Features Calculated
- Log returns
- Simple Moving Averages (10-day and 30-day)
- Volatility measures
- Sharpe ratios (60-day and 120-day windows)
- Rolling statistics for returns

## 1. Import Required Libraries

In [3]:
import pandas as pd
import yfinance as yf
import numpy as np
import warnings
from datetime import datetime

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

datetime.now().strftime('%Y-%m-%d %H:%M:%S')

'2025-07-31 20:29:50'

## 2. Load Portfolio and Download Price Data

In [17]:
# Load portfolio holdings from CSV file

portfolio = pd.read_csv("portfolio_holdings.csv")
tickers = portfolio['Ticker'].unique().tolist()
print(f" Portfolio loaded successfully with {len(tickers)} unique tickers")
print(f"Tickers: {tickers}")


# Download historical price data
PERIOD = '20y' # 20 years of data
print(f"\n Downloading {PERIOD} of historical data for {len(tickers)} tickers...")

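# auto_adjust=True returns split- and dividend-adjusted prices.
# Note: rounding to 2 decimals is coarse for low-priced adjusted series
# (e.g. NVDA near 0.21 in 2005) and quantizes their returns.
data = yf.download(tickers, period=PERIOD, auto_adjust=True)['Close'].round(2)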
print(f" Data download completed. Shape: {data.shape}")

 Portfolio loaded successfully with 18 unique tickers
Tickers: ['RDDT', 'NVDA', 'SMR', 'MU', 'MRVL', 'MSFT', 'ASML', 'AEM', 'AMD', 'VERU', 'AI', 'GOOGL', 'INGM', 'PLUG', 'IONQ', 'CHYM', 'RGTI', 'ARBE']

 Downloading 20y of historical data for 18 tickers...


[*********************100%***********************]  18 of 18 completed

 Data download completed. Shape: (5032, 18)





In [18]:
data.head()

Date,AEM,AI,AMD,ARBE,ASML,CHYM,GOOGL,INGM,IONQ,MRVL,MSFT,MU,NVDA,PLUG,RDDT,RGTI,SMR,VERU
2005-08-01,9.38,,20.05,,16.64,,7.25,,,19.19,18.03,11.9,0.21,73.6,,,,1.34
2005-08-02,9.56,,20.42,,16.91,,7.44,,,19.13,18.65,11.93,0.21,72.5,,,,1.31
2005-08-03,10.13,,20.65,,16.77,,7.4,,,19.05,18.96,11.88,0.21,69.7,,,,1.27
2005-08-04,10.16,,20.15,,16.63,,7.41,,,18.36,19.0,11.62,0.21,66.4,,,,1.24
2005-08-05,10.0,,19.91,,16.46,,7.27,,,18.41,19.31,11.58,0.22,63.0,,,,1.35


In [19]:

print(f" RL Data Summary:")
print(f"Shape: {data.shape}")
print(f"Date range: {data.index.min()} to {data.index.max()}")
print(f"Number of trading days: {len(data)}")
print(f"\nLatest prices:")
data.tail()

 RL Data Summary:
Shape: (5032, 18)
Date range: 2005-08-01 00:00:00 to 2025-07-31 00:00:00
Number of trading days: 5032

Latest prices:


Date,AEM,AI,AMD,ARBE,ASML,CHYM,GOOGL,INGM,IONQ,MRVL,MSFT,MU,NVDA,PLUG,RDDT,RGTI,SMR,VERU
2025-07-25,126.85,26.01,166.47,1.56,709.44,34.55,193.18,21.3,43.17,74.21,513.71,111.26,173.5,1.84,149.66,15.44,51.67,0.56
2025-07-28,123.74,25.79,173.66,1.51,728.13,33.43,192.58,21.0,42.34,75.91,512.5,111.25,176.75,1.77,151.6,15.57,50.99,0.53
2025-07-29,126.3,24.48,177.44,1.38,718.49,32.62,195.75,20.8,40.53,76.34,512.57,111.96,175.51,1.59,144.85,14.47,48.97,0.49
2025-07-30,123.37,24.22,179.51,1.38,721.45,33.82,196.53,20.43,39.88,81.74,513.24,114.74,179.27,1.55,149.33,14.17,50.51,0.49
2025-07-31,123.59,23.96,176.93,1.36,697.29,34.94,192.34,19.81,40.57,80.62,534.04,108.23,177.77,1.5,158.26,14.9,51.5,0.49
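
The overview lists a filter on data availability (minimum 5 years), but the cells shown here do not include that step. A minimal sketch of one way to apply such a filter follows. The helper name `filter_by_history`, the 252-trading-days-per-year approximation, and the toy price frame are assumptions for illustration, not this notebook's own code.

```python
import numpy as np
import pandas as pd

TRADING_DAYS_PER_YEAR = 252  # assumption: common approximation for US equities


def filter_by_history(prices: pd.DataFrame, min_years: float = 5.0) -> pd.DataFrame:
    """Keep only columns with at least `min_years` of non-missing prices."""
    min_obs = int(min_years * TRADING_DAYS_PER_YEAR)
    n_obs = prices.notna().sum()          # non-NaN rows per ticker
    keep = n_obs[n_obs >= min_obs].index  # tickers with enough history
    return prices.loc[:, keep]


# Toy wide price frame (hypothetical tickers): one long-lived asset, one recent listing
dates = pd.bdate_range("2015-01-01", periods=2000)
toy = pd.DataFrame(
    {"OLD": np.linspace(10.0, 50.0, len(dates)), "NEW": np.nan},
    index=dates,
)
toy.loc[dates[-300:], "NEW"] = np.linspace(20.0, 30.0, 300)

filtered = filter_by_history(toy, min_years=5)
print(list(filtered.columns))
```

Applied to the downloaded frame, this would look like `data = filter_by_history(data)` followed by `tickers = data.columns.tolist()`, so that later cells only see assets with enough history.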


## 5. Calculate Technical Features

Calculate various technical indicators that will be used as features for the reinforcement learning model:

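- **Log Returns**: Natural logarithm of the ratio of consecutive closing prices, `ln(P_t / P_{t-1})`
- **Simple Moving Averages**: 10-day and 30-day SMAs, expressed relative to the current price (`SMA / price - 1`) so they are comparable across assets
- **Volatility**: 10-day rolling standard deviation of log returns
- **Sharpe Ratios**: Rolling mean log return divided by rolling standard deviation (60-day and 120-day windows), not annualized and without a risk-free rate
- **Rolling Statistics**: Mean and standard deviation of log returns over 60-day and 120-day windows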
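
To make the definitions above concrete, the sketch below computes the same kinds of quantities on a small synthetic price series. The prices and the short 3-period window are made up for illustration. Note that rolling windows yield NaN until a full window of valid values is available.

```python
import numpy as np
import pandas as pd

# Synthetic prices for one asset (illustrative values only)
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])

# Log return: ln(P_t / P_{t-1}); the first value has no predecessor and is NaN
log_ret = np.log(prices / prices.shift(1))

# 3-period SMA expressed relative to the current price
sma3_rel = prices.rolling(3).mean() / prices - 1

# Rolling Sharpe-style ratio: mean / std of log returns (no risk-free rate,
# not annualized); the epsilon guards against division by zero
window = 3
sharpe3 = log_ret.rolling(window).mean() / (log_ret.rolling(window).std() + 1e-8)

# Log returns are additive: their sum equals ln(P_last / P_first)
total = log_ret.sum()
print(round(total, 6), round(np.log(prices.iloc[-1] / prices.iloc[0]), 6))
```

The additivity check in the last lines is one reason log returns are often preferred over simple returns as model inputs.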

In [38]:
# Calculate log returns (natural logarithm of price ratios)
log_returns = np.log(data / data.shift(1)).fillna(0)
# Calculate Simple Moving Averages (SMA) relative to current price
# These are normalized by current price to make them comparable across assets
sma10 = data.rolling(10).mean() / data - 1 # 10-day SMA
sma30 = data.rolling(30).mean() / data - 1 # 30-day SMA

# Calculate volatility (10-day rolling standard deviation of returns)
volatility_10 = log_returns.rolling(10).std().fillna(0)

# Calculate 60-day rolling statistics
log_return_mean_60 = log_returns.rolling(60).mean()
log_return_std_60 = log_returns.rolling(60).std()
# Sharpe ratio = mean return / standard deviation (with small epsilon to avoid division by zero)
sharpe_60 = log_return_mean_60 / (log_return_std_60 + 1e-8)

# Calculate 120-day rolling statistics
log_return_mean_120 = log_returns.rolling(120).mean()
log_return_std_120 = log_returns.rolling(120).std()
sharpe_120 = log_return_mean_120 / (log_return_std_120 + 1e-8)

print(" Technical features calculated successfully")
print(f"Features calculated for {len(tickers)} assets over {len(data)} trading days")

 Technical features calculated successfully
Features calculated for 18 assets over 5032 trading days


## 6. Create Feature Dataset for Machine Learning

Transform the data into a long-form dataset suitable for machine learning and reinforcement learning applications. Each row represents a single asset on a single date with all calculated features.
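
As an alternative to building the long-form frame ticker by ticker, the wide-to-long reshaping can be sketched with `DataFrame.melt`. The helper `to_long`, the tickers `AAA` and `BBB`, and the toy values are assumptions for illustration; the column names mirror those used in this notebook.

```python
import pandas as pd


def to_long(wide_frames: dict) -> pd.DataFrame:
    """Reshape {feature_name: wide DataFrame (Date x Ticker)} into long form."""
    parts = []
    for name, wide in wide_frames.items():
        long = (
            wide.rename_axis(index="Date", columns="Ticker")
            .reset_index()
            .melt(id_vars="Date", var_name="Ticker", value_name=name)
            .set_index(["Date", "Ticker"])
        )
        parts.append(long)
    # Align all features on the (Date, Ticker) index, then flatten it
    out = pd.concat(parts, axis=1).reset_index()
    return out.sort_values(["Ticker", "Date"], ignore_index=True)


# Toy wide frames with hypothetical tickers
dates = pd.to_datetime(["2025-07-29", "2025-07-30"])
log_return = pd.DataFrame({"AAA": [0.01, -0.02], "BBB": [0.00, 0.03]}, index=dates)
volatility = pd.DataFrame({"AAA": [0.10, 0.11], "BBB": [0.20, 0.21]}, index=dates)

features_long = to_long({"log_return": log_return, "volatility_10": volatility})
print(features_long)
```

Each row of `features_long` is one asset on one date, which matches the layout described above.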

In [39]:
# Stack features into a single DataFrame in long form for ML/RL
feature_frames = []

for ticker in data.columns:  # one long-form frame per ticker
    df = pd.DataFrame({
        'Date': log_returns.index,
        'Ticker': ticker,
        'log_return': log_returns[ticker].values,
        'sma10': sma10[ticker].values,
        'sma30': sma30[ticker].values,
        'volatility_10': volatility_10[ticker].values,
        'log_return_mean_60': log_return_mean_60[ticker].values,
        'log_return_std_60': log_return_std_60[ticker].values,
        'sharpe_60': sharpe_60[ticker].values,
        'log_return_mean_120': log_return_mean_120[ticker].values,
        'log_return_std_120': log_return_std_120[ticker].values,
        'sharpe_120': sharpe_120[ticker].values
    })
    feature_frames.append(df.round(4))

# Combine all feature frames
features = pd.concat(feature_frames, ignore_index=True)

# Rolling-window warm-up rows and dates before an asset was listed are NaN; set them to 0
features.fillna(0, inplace=True)

print(f"\n Dataset Summary:")
print(f"Features shape: {features.shape}")
print(f"\n Date range: {features.Date.min()} to {features.Date.max()}")
print(f"\n Sample of features:")
features.head()


 Dataset Summary:
Features shape: (90576, 12)

 Date range: 2005-08-01 00:00:00 to 2025-07-31 00:00:00

 Sample of features:


,Date,Ticker,log_return,sma10,sma30,volatility_10,log_return_mean_60,log_return_std_60,sharpe_60,log_return_mean_120,log_return_std_120,sharpe_120
0,2005-08-01,AEM,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2005-08-02,AEM,0.019,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2005-08-03,AEM,0.0579,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2005-08-04,AEM,0.003,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2005-08-05,AEM,-0.0159,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [40]:
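# Wide-form price table (one column per ticker)
prices = data.copy()
# Missing prices (dates before an asset was listed) are set to 0
prices.fillna(0, inplace=True)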

print(f"Price data shape: {prices.shape}")
print(f"\n Date range: {prices.index.min()} to {prices.index.max()}")

# Display sample of price data
print(f"\n Sample of price data:")
prices.head()

Price data shape: (5032, 18)

 Date range: 2005-08-01 00:00:00 to 2025-07-31 00:00:00

 Sample of price data:


Date,AEM,AI,AMD,ARBE,ASML,CHYM,GOOGL,INGM,IONQ,MRVL,MSFT,MU,NVDA,PLUG,RDDT,RGTI,SMR,VERU
2005-08-01,9.38,0.0,20.05,0.0,16.64,0.0,7.25,0.0,0.0,19.19,18.03,11.9,0.21,73.6,0.0,0.0,0.0,1.34
2005-08-02,9.56,0.0,20.42,0.0,16.91,0.0,7.44,0.0,0.0,19.13,18.65,11.93,0.21,72.5,0.0,0.0,0.0,1.31
2005-08-03,10.13,0.0,20.65,0.0,16.77,0.0,7.4,0.0,0.0,19.05,18.96,11.88,0.21,69.7,0.0,0.0,0.0,1.27
2005-08-04,10.16,0.0,20.15,0.0,16.63,0.0,7.41,0.0,0.0,18.36,19.0,11.62,0.21,66.4,0.0,0.0,0.0,1.24
2005-08-05,10.0,0.0,19.91,0.0,16.46,0.0,7.27,0.0,0.0,18.41,19.31,11.58,0.22,63.0,0.0,0.0,0.0,1.35


## 7. Save Data for Training

Save the processed data to CSV files for use in subsequent notebooks and training steps.
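
A downstream notebook needs to read these files back with matching options. The sketch below writes small stand-in frames to a temporary directory and reloads them. The file names match the ones used here, while the toy data, the hypothetical tickers, and the temporary directory are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in data with hypothetical tickers
dates = pd.to_datetime(["2025-07-30", "2025-07-31"])
prices = pd.DataFrame({"AAA": [10.0, 10.5], "BBB": [20.0, 19.5]}, index=dates)
prices.index.name = "Date"
features = pd.DataFrame(
    {"Date": dates, "Ticker": ["AAA", "AAA"], "log_return": [0.0, 0.0488]}
)

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    features.to_csv(tmp / "features_for_training.csv", index=False)
    prices.to_csv(tmp / "price_data_for_training.csv", index=True)

    # CSV does not preserve dtypes, so dates must be parsed again on load
    features_back = pd.read_csv(
        tmp / "features_for_training.csv", parse_dates=["Date"]
    )
    prices_back = pd.read_csv(
        tmp / "price_data_for_training.csv", index_col="Date", parse_dates=True
    )

print(prices_back.index.dtype, features_back["Date"].dtype)
```

Saving the features with `index=False` and the prices with `index=True` means only the price file carries its dates in the index column, which is why the two reads use different options.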

In [41]:
# Save features and prices for training
features.to_csv("features_for_training.csv", index=False)
prices.to_csv("price_data_for_training.csv", index=True)

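# Save list of tickers used for RL
# Note: `tickers` keeps the portfolio file order, which differs from the
# alphabetical column order of the price data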
with open('rl_tickers.txt', 'w') as f:
    for ticker in tickers:
        f.write(f"{ticker}\n")


print(f"\n Data collection completed successfully at {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")


 Data collection completed successfully at 2025-07-31 20:52:35


## Summary

This notebook successfully:

1. **Loaded portfolio data** from `portfolio_holdings.csv`
2. **Downloaded 20 years** of historical price data for all tickers
3. **Filtered assets** based on data availability (minimum 5 years)
4. **Calculated technical features** including returns, moving averages, volatility, and Sharpe ratios
5. **Created training datasets** in both long-form (features) and wide-form (prices)
6. **Saved processed data** for use in subsequent analysis and training

### Key Statistics:
- **Assets processed**: 18 tickers
- **Time period**: 20 years of data (2005-08-01 to 2025-07-31, 5,032 trading days)
- **Features created**: 10 technical indicators per asset
- **Total observations**: 90,576 feature rows

### Next Steps:
The processed data is now ready for:
- Feature engineering and selection
- Model training and validation
- Reinforcement learning algorithm development
- Portfolio optimization strategies