Made by: Santiago Espinosa Giraldo

https://github.com/espinosacodes


https://www.linkedin.com/in/santiago-espinosa-a80a43287/

# Description:

The model is designed to trade financial data, specifically focusing on the Consumer Price Index (CPI) as it impacts stocks listed on NASDAQ. The core functionality involves analyzing historical CPI data alongside other market indicators to predict future movements. The model leverages machine learning algorithms to identify patterns and trends in the data that can inform trading decisions.

# Objective:

The primary goal is to achieve a competitive edge in trading by accurately predicting market movements based on CPI data, thereby maximizing profits while minimizing risks.




### **This section imports hourly NASDAQ data around each CPI release**

In [None]:
import yfinance as yf
import pandas as pd
import numpy as np
import pytz

# Define the ticker for NASDAQ
ticker = "^IXIC"

# Download the data in 1-hour intervals
data = yf.download(ticker, start="2024-01-01", end="2024-06-09", interval="1h")

# Convert the timezone of the data's index to "America/New_York"
data.index = data.index.tz_convert("America/New_York")

# Assume CPI releases on the second Tuesday of each month at 8:30 AM ET
cpi_release_dates = [
    "2024-01-09 08:30:00",
    "2024-02-13 08:30:00",
    "2024-03-12 08:30:00",
    "2024-04-09 08:30:00",
    "2024-05-14 08:30:00",
    "2024-06-11 08:30:00"
]

# Convert strings to pandas datetime objects and make them timezone-aware
cpi_release_dates = pd.to_datetime(cpi_release_dates).tz_localize("America/New_York")

# Filter data to include only 1 hour before and after the CPI release
filtered_data = pd.concat([data.loc[(data.index >= date - pd.Timedelta(hours=1)) &
                                    (data.index <= date + pd.Timedelta(hours=1))]
                           for date in cpi_release_dates])

print(filtered_data)


[*********************100%%**********************]  1 of 1 completed

                                   Open          High           Low  \
Datetime                                                              
2024-01-09 09:30:00-05:00  14741.954102  14789.148438  14716.916016   
2024-02-13 09:30:00-05:00  15604.243164  15725.561523  15591.212891   
2024-03-12 09:30:00-04:00  16121.386719  16169.294922  15994.012695   
2024-04-09 09:30:00-04:00  16330.896484  16347.339844  16247.916992   
2024-05-14 09:30:00-04:00  16388.925781  16465.736328  16388.804688   

                                  Close     Adj Close  Volume  
Datetime                                                       
2024-01-09 09:30:00-05:00  14769.126953  14769.126953       0  
2024-02-13 09:30:00-05:00  15725.561523  15725.561523       0  
2024-03-12 09:30:00-04:00  16169.294922  16169.294922       0  
2024-04-09 09:30:00-04:00  16249.747070  16249.747070       0  
2024-05-14 09:30:00-04:00  16462.855469  16462.855469       0  
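The release timestamps in the cell above are typed in by hand under the "second Tuesday of each month at 8:30 AM ET" assumption. A minimal sketch of how the same list could be generated is shown here. The helper names `second_tuesday` and `assumed_cpi_release_times` are hypothetical, and the real BLS release calendar does not always follow this rule, so the output should be checked against the published schedule before use.

```python
from datetime import date, timedelta

def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday == 0, Tuesday == 1
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)

def assumed_cpi_release_times(year, months, time_str="08:30:00"):
    """Build 'YYYY-MM-DD HH:MM:SS' strings under the second-Tuesday assumption."""
    return [f"{second_tuesday(year, m).isoformat()} {time_str}" for m in months]

# Same period as the hard-coded list: January to June 2024
release_strings = assumed_cpi_release_times(2024, range(1, 7))
print(release_strings)
```

The resulting strings can be passed to `pd.to_datetime(...).tz_localize("America/New_York")` exactly as the hard-coded list is.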





The asset data for those time windows is now loaded:

In [None]:
filtered_data

| Datetime | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2024-01-09 09:30:00-05:00 | 14741.954102 | 14789.148438 | 14716.916016 | 14769.126953 | 14769.126953 | 0 |
| 2024-02-13 09:30:00-05:00 | 15604.243164 | 15725.561523 | 15591.212891 | 15725.561523 | 15725.561523 | 0 |
| 2024-03-12 09:30:00-04:00 | 16121.386719 | 16169.294922 | 15994.012695 | 16169.294922 | 16169.294922 | 0 |
| 2024-04-09 09:30:00-04:00 | 16330.896484 | 16347.339844 | 16247.916992 | 16249.74707 | 16249.74707 | 0 |
| 2024-05-14 09:30:00-04:00 | 16388.925781 | 16465.736328 | 16388.804688 | 16462.855469 | 16462.855469 | 0 |
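Only the 09:30 bar survives the filter on each date, because the hourly index has no bars before the regular session opens. As a rough sketch, the move inside that bar can be computed from the Open and Close values shown in the output above. The helper name `bar_return_pct` is hypothetical, and a single bar per release is a very small sample.

```python
# Open and Close of the 09:30 bar on each assumed CPI day, copied from the output above
bars = {
    "2024-01-09": (14741.954102, 14769.126953),
    "2024-02-13": (15604.243164, 15725.561523),
    "2024-03-12": (16121.386719, 16169.294922),
    "2024-04-09": (16330.896484, 16249.747070),
    "2024-05-14": (16388.925781, 16462.855469),
}

def bar_return_pct(open_price: float, close_price: float) -> float:
    """Percentage change from the bar's open to its close."""
    return (close_price - open_price) / open_price * 100.0

bar_moves = {day: bar_return_pct(o, c) for day, (o, c) in bars.items()}
for day, move in bar_moves.items():
    print(f"{day}: {move:+.2f}%")
```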


## 1. Data Collection:
Gather historical data for CPI, Initial Jobless Claims, Nonfarm Payrolls, and Crude Oil Inventories.
Align the data so that all series share the same frequency (e.g., daily or weekly) and the same dates.
## 2. Data Preprocessing:
Handle missing values, normalize/scale the data, and create lag features if necessary.
## 3. Feature Engineering:
Create new features based on existing data, such as moving averages, percentage changes, or other relevant indicators.
## 4. Model Training:
Train a machine learning model (e.g., Random Forest, Gradient Boosting, or a Neural Network) using the engineered features to predict future CPI movements.
## 5. Generating Buy/Sell Signals:
Use the model's predictions to generate buy or sell signals based on the predicted direction of CPI movements.
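Steps 1 to 3 above can be sketched as follows. This is only an illustration under stated assumptions: the two series are synthetic stand-ins with made-up values, the weekly series is reduced to a monthly mean (one possible choice, not necessarily the one used in this notebook), and the feature names `CPI_PctChange` and `CPI_MA3` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the real series (values are made up for illustration)
monthly_idx = pd.date_range("2023-01-01", periods=12, freq="MS")
weekly_idx = pd.date_range("2023-01-07", periods=52, freq="W-SAT")

cpi = pd.Series(np.linspace(300.0, 311.0, 12), index=monthly_idx, name="CPI")
claims = pd.Series(np.linspace(200_000, 251_000, 52), index=weekly_idx,
                   name="Initial Jobless Claims")

# Step 1: bring the weekly series to monthly frequency (monthly mean),
# labelled at month start so that it lines up with the CPI index
claims_monthly = claims.resample("MS").mean()
data = pd.concat([cpi, claims_monthly], axis=1)

# Step 2: fill gaps left by the alignment with the last known value
data = data.ffill()

# Step 3: lag, percentage change and moving-average features
data["CPI_Lag1"] = data["CPI"].shift(1)
data["CPI_PctChange"] = (data["CPI"] / data["CPI"].shift(1) - 1.0) * 100
data["CPI_MA3"] = data["CPI"].rolling(window=3).mean()
data = data.dropna()

print(data.head())
```

The first two months drop out because the 3-month moving average is undefined there.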

Run this cell to install the fredapi package:

In [None]:
!pip install fredapi



In [None]:
!pip install pandas_datareader
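The modeling cell that follows assumes that `cpi`, `initial_jobless_claims` and the other inputs already exist as DataFrames with a `date` column. A minimal sketch of how they might be built is given here. `series_to_frame` and `load_fred_frames` are hypothetical helpers, and `FakeFred` is a stand-in so that the example runs offline. With the real package, `fredapi.Fred(api_key=...).get_series(series_id)` returns a date-indexed Series that can be passed through the same helpers.

```python
import pandas as pd

def series_to_frame(series: pd.Series, column: str) -> pd.DataFrame:
    """Turn a date-indexed Series into a frame with 'date' and one value column."""
    frame = series.rename(column).rename_axis("date").reset_index()
    frame["date"] = pd.to_datetime(frame["date"])
    return frame

def load_fred_frames(fred, series_ids):
    """Fetch each id through any object that exposes get_series(series_id)."""
    return {sid: series_to_frame(fred.get_series(sid), sid) for sid in series_ids}

class FakeFred:
    """Offline stand-in returning made-up values (not real FRED data)."""
    def get_series(self, series_id):
        idx = pd.date_range("2024-01-01", periods=3, freq="MS")
        return pd.Series([1.0, 2.0, 3.0], index=idx)

# Real usage (needs network access and a personal FRED API key):
#   from fredapi import Fred
#   fred = Fred(api_key="YOUR_FRED_API_KEY")
fred = FakeFred()
frames = load_fred_frames(fred, ["CPIAUCNS", "ICSA", "PAYNSA"])
cpi = frames["CPIAUCNS"]
initial_jobless_claims = frames["ICSA"]
nonfarm_payrolls = frames["PAYNSA"]
print(cpi)
```

Each frame keeps the FRED series id as its value column, which is what the `rename(columns=...)` calls in the next cell expect.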



In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Assuming cpi, initial_jobless_claims, nonfarm_payrolls, and crude_oil_prices
# are already loaded as DataFrames from the FRED API or another source

# Rename columns to a consistent format
cpi.rename(columns={'CPIAUCNS': 'CPI'}, inplace=True)
initial_jobless_claims.rename(columns={'ICSA': 'Initial Jobless Claims'}, inplace=True)
nonfarm_payrolls.rename(columns={'PAYNSA': 'Nonfarm Payrolls'}, inplace=True)
crude_oil_prices.rename(columns={'IR14200': 'Crude Oil Prices'}, inplace=True)

# Ensure 'date' column is in datetime format and exists
cpi['date'] = pd.to_datetime(cpi['date'])
initial_jobless_claims['date'] = pd.to_datetime(initial_jobless_claims['date'])
nonfarm_payrolls['date'] = pd.to_datetime(nonfarm_payrolls['date'])
crude_oil_prices['date'] = pd.to_datetime(crude_oil_prices['date'])

# Merge datasets with CPI as the base
data = pd.merge(cpi, initial_jobless_claims, on='date', how='left')
data = pd.merge(data, nonfarm_payrolls, on='date', how='left')
data = pd.merge(data, crude_oil_prices, on='date', how='left')

# Handle missing values
data.ffill(inplace=True)  # Forward fill missing values

# Feature Engineering: create lagged features
data['CPI_Lag1'] = data['CPI'].shift(1)
data['CPI_Lag2'] = data['CPI'].shift(2)
data.dropna(inplace=True)  # Drop rows with NaN values after shifting

# Define features and target
features = ['CPI_Lag1', 'CPI_Lag2', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']
target = 'CPI'

# Split the data (train_test_split shuffles rows by default, so this is a random split, not a chronological one)
X = data[features]
y = data[target]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate the model
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

# Predict the next CPI value
latest_data = data.iloc[-1][features].to_frame().T  # Use the most recent data row with correct column names
next_cpi_prediction = model.predict(latest_data)

print(f"Predicted Next CPI Value: {next_cpi_prediction[0]}")


Mean Squared Error: 3647.9153050763052
Predicted Next CPI Value: 3.6971095208765377


# Conclusions



**Model Performance:**

- Mean Squared Error (MSE): An MSE of approximately 3647.92 is the average squared difference between the predicted and actual CPI values on the test set. It corresponds to a root mean squared error of about 60.4 in the units of the target. Without a benchmark, such as a forecast that simply repeats the previous CPI value, the number cannot be judged definitively, but an error of this size suggests that the model is not capturing the underlying patterns well.

**Predicted CPI Value:**

- Predicted Next CPI Value: The model predicts a CPI value of approximately 3.70 for the next period. This estimate is based on the most recent data row and on the fitted relationship between CPI and the other features in the dataset.

**Implications for Decision-Making:**

- Accuracy: Given the high MSE, the CPI predictions carry substantial error. The model may be overfitting or underfitting the data, or important features or interactions may be missing.
- Model Improvement: To improve accuracy, consider:
  - Feature Engineering: Adding more relevant features or exploring interactions between features.
  - Model Complexity: Trying more complex models such as polynomial regression, decision trees, or ensemble methods.
  - Data Quality: Ensuring that the data is comprehensive and accurately reflects all relevant aspects influencing CPI.

**Trading Decisions:**

- Predicted CPI: If this model is used to inform trading decisions, caution is needed because of the high MSE. Decisions based solely on this prediction are risky. Incorporate additional analyses or models to validate trading signals and mitigate potential losses.
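To put an MSE into context, it helps to convert it to the units of the target and to compare it with a trivial forecast. The sketch below shows both. The function names `rmse` and `persistence_rmse` are hypothetical, the persistence forecast (repeat the previous value) is one common baseline rather than something used in this notebook, and `toy_cpi` holds made-up index levels.

```python
import numpy as np

def rmse(y_true, y_pred) -> float:
    """Root mean squared error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def persistence_rmse(series) -> float:
    """RMSE of forecasting each value with the value before it."""
    series = np.asarray(series, dtype=float)
    return rmse(series[1:], series[:-1])

# The MSE printed by the cell above, expressed in the units of the target
reported_mse = 3647.9153050763052
implied_rmse = float(np.sqrt(reported_mse))
print(f"RMSE implied by the reported MSE: {implied_rmse:.2f}")

# Made-up CPI-like index levels, only to show how the baseline is computed
toy_cpi = [300.0, 301.0, 301.5, 303.0, 304.0]
print(f"Persistence RMSE on the toy series: {persistence_rmse(toy_cpi):.3f}")
```

A model is only worth trading on if its out-of-sample RMSE is clearly below that of the persistence baseline computed on the same data.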

# Predict the last CPI value without using the most recent CPI value.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Ensure 'date' column is in datetime format
cpi['date'] = pd.to_datetime(cpi['date'])
initial_jobless_claims['date'] = pd.to_datetime(initial_jobless_claims['date'])
nonfarm_payrolls['date'] = pd.to_datetime(nonfarm_payrolls['date'])
crude_oil_prices['date'] = pd.to_datetime(crude_oil_prices['date'])

# Merge datasets with CPI as the base
data = pd.merge(cpi, initial_jobless_claims, on='date', how='left')
data = pd.merge(data, nonfarm_payrolls, on='date', how='left')
data = pd.merge(data, crude_oil_prices, on='date', how='left')

# Handle missing values
data.ffill(inplace=True)  # Forward fill missing values

# Feature Engineering: create lagged features
data['CPI_Lag1'] = data['CPI'].shift(1)
data['CPI_Lag2'] = data['CPI'].shift(2)
data.dropna(inplace=True)  # Drop rows with NaN values after shifting

# Split the data to exclude the last CPI value for testing
train_data = data.iloc[:-1]  # All rows except the last one for training
test_data = data.iloc[-1:]   # Only the last row for testing

# Define features and target
features = ['CPI_Lag1', 'CPI_Lag2', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']
target = 'CPI'

# Prepare training and testing data
X_train = train_data[features]
y_train = train_data[target]
X_test = test_data[features]
y_test = test_data[target]

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the CPI value of the last row
next_cpi_prediction = model.predict(X_test)

# Report the in-sample (training) MSE; a single held-out row is too little for a test-set MSE
if not y_test.empty:
    y_pred = model.predict(X_train)
    mse = mean_squared_error(y_train, y_pred)
    print(f"Mean Squared Error: {mse}")

print(f"Predicted Last CPI Value: {next_cpi_prediction[0]}")


Mean Squared Error: 0.2590633799908734
Predicted Last CPI Value: 313.5880483097469
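Holding out only the last row yields a single out-of-sample error, and the MSE printed above is measured on the training rows. One way to obtain more out-of-sample errors is walk-forward evaluation: refit on all rows before time t, predict row t, and repeat. The sketch below is written under assumptions. It uses NumPy least squares with an intercept as a stand-in for `LinearRegression`, the helper names are hypothetical, and the series is synthetic.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Apply coefficients from fit_ols to new rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def walk_forward_errors(X, y, min_train=12):
    """Train on rows [0, t) and predict row t, for every t >= min_train."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    errors = []
    for t in range(min_train, len(y)):
        coef = fit_ols(X[:t], y[:t])
        pred = predict_ols(coef, X[t:t + 1])[0]
        errors.append(y[t] - pred)
    return np.array(errors)

# Synthetic, trending CPI-like series (made up for illustration)
rng = np.random.default_rng(0)
level = 300 + np.cumsum(0.3 + 0.1 * rng.standard_normal(40))
X = np.column_stack([level[1:-1], level[:-2]])  # lag 1 and lag 2
y = level[2:]

errors = walk_forward_errors(X, y, min_train=12)
print(f"Out-of-sample predictions: {len(errors)}")
print(f"Walk-forward MSE: {np.mean(errors ** 2):.4f}")
```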


# Adding parameters to take a long or short position on NASDAQ

In [None]:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np

# Example data loading, replace with actual loading mechanism
# For demonstration, creating DataFrames with sample data
# Replace these lines with actual data loading
cpi = pd.DataFrame({
    'date': pd.date_range(start='2020-01-01', periods=10, freq='M'),
    'CPIAUCNS': np.random.rand(10) * 100
})
initial_jobless_claims = pd.DataFrame({
    'date': pd.date_range(start='2020-01-01', periods=10, freq='M'),
    'ICSA': np.random.rand(10) * 1000
})
nonfarm_payrolls = pd.DataFrame({
    'date': pd.date_range(start='2020-01-01', periods=10, freq='M'),
    'PAYNSA': np.random.rand(10) * 5000
})
crude_oil_prices = pd.DataFrame({
    'date': pd.date_range(start='2020-01-01', periods=10, freq='M'),
    'IR14200': np.random.rand(10) * 80
})

# Rename columns to a common format
cpi.rename(columns={'CPIAUCNS': 'CPI'}, inplace=True)
initial_jobless_claims.rename(columns={'ICSA': 'Initial Jobless Claims'}, inplace=True)
nonfarm_payrolls.rename(columns={'PAYNSA': 'Nonfarm Payrolls'}, inplace=True)
crude_oil_prices.rename(columns={'IR14200': 'Crude Oil Prices'}, inplace=True)

# Ensure 'date' column is in datetime format
cpi['date'] = pd.to_datetime(cpi['date'])
initial_jobless_claims['date'] = pd.to_datetime(initial_jobless_claims['date'])
nonfarm_payrolls['date'] = pd.to_datetime(nonfarm_payrolls['date'])
crude_oil_prices['date'] = pd.to_datetime(crude_oil_prices['date'])

# Merge datasets with CPI as the base
data = pd.merge(cpi, initial_jobless_claims, on='date', how='left')
data = pd.merge(data, nonfarm_payrolls, on='date', how='left')
data = pd.merge(data, crude_oil_prices, on='date', how='left')

# Handle missing values
data.ffill(inplace=True)  # Forward fill missing values

# Feature Engineering: create lagged features
data['CPI_Lag1'] = data['CPI'].shift(1)
data['CPI_Lag2'] = data['CPI'].shift(2)
data.dropna(inplace=True)  # Drop rows with NaN values after shifting

# Split the data to exclude the last CPI value for testing
train_data = data.iloc[:-1]  # All rows except the last one for training
test_data = data.iloc[-1:]   # Only the last row for testing

# Define features and target
features = ['CPI_Lag1', 'CPI_Lag2', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']
target = 'CPI'

# Prepare training and testing data
X_train = train_data[features]
y_train = train_data[target]
X_test = test_data[features]
y_test = test_data[target]

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the CPI value of the last row
next_cpi_prediction = model.predict(X_test)

# Report the in-sample (training) MSE; the single test row is not used here
if len(X_train) > 0:
    y_pred = model.predict(X_train)
    mse = mean_squared_error(y_train, y_pred)
    print(f"Mean Squared Error: {mse}")

print(f"Predicted Last CPI Value: {next_cpi_prediction[0]}")

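# Trading logic based on the CPI prediction
# Note: data['CPI'].iloc[-1] is the actual value of the row being predicted, so the
# comparison below uses a number that would not be known before the release
current_cpi_value = data['CPI'].iloc[-1]  # Last observed CPI value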

# Example trading logic
if next_cpi_prediction[0] > current_cpi_value:
    position = 'Long'  # Buy
    action = 'Buy NASDAQ'
elif next_cpi_prediction[0] < current_cpi_value:
    position = 'Short'  # Sell
    action = 'Sell NASDAQ'
else:
    position = 'Hold'  # No change
    action = 'Hold NASDAQ'

print(f"Trading Position: {position}")
print(f"Action: {action}")

# For actual trading, you would integrate with a trading platform or API
# Here, we are only simulating the decision-making process


Mean Squared Error: 104.73815055081666
Predicted Last CPI Value: 13.790251019439395
Trading Position: Long
Action: Buy NASDAQ
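The decision rule in the cell above can be isolated in a small function, which makes it easier to test and to reuse in the later cells. This is a sketch under assumptions: the name `cpi_signal` and the `threshold` parameter (a dead band that returns "Hold" for very small differences) are hypothetical additions. The mapping "higher predicted CPI means long" is the notebook's own rule and is kept unchanged.

```python
def cpi_signal(predicted_cpi: float, last_known_cpi: float, threshold: float = 0.0):
    """Map a CPI forecast to (position, action) using the notebook's rule."""
    diff = predicted_cpi - last_known_cpi
    if diff > threshold:
        return "Long", "Buy NASDAQ"
    if diff < -threshold:
        return "Short", "Sell NASDAQ"
    return "Hold", "Hold NASDAQ"

# Made-up inputs to show the three outcomes
print(cpi_signal(313.59, 312.00, threshold=0.1))
print(cpi_signal(310.00, 312.00, threshold=0.1))
print(cpi_signal(312.05, 312.00, threshold=0.1))
```

Passing the last value known before the release as `last_known_cpi` (for example `CPI_Lag1` of the predicted row) keeps the comparison free of look-ahead.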


# Without using the last CPI value

## See how it would perform if the position is opened 5 minutes before the release and closed 2 hours after it

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
import datetime

# Ensure 'date' column is in datetime format
cpi['date'] = pd.to_datetime(cpi['date'])
initial_jobless_claims['date'] = pd.to_datetime(initial_jobless_claims['date'])
nonfarm_payrolls['date'] = pd.to_datetime(nonfarm_payrolls['date'])
crude_oil_prices['date'] = pd.to_datetime(crude_oil_prices['date'])

# Merge datasets with CPI as the base
data = pd.merge(cpi, initial_jobless_claims, on='date', how='left')
data = pd.merge(data, nonfarm_payrolls, on='date', how='left')
data = pd.merge(data, crude_oil_prices, on='date', how='left')

# Handle missing values
data.ffill(inplace=True)  # Forward fill missing values

# Feature Engineering: create lagged features
data['CPI_Lag1'] = data['CPI'].shift(1)
data['CPI_Lag2'] = data['CPI'].shift(2)
data.dropna(inplace=True)  # Drop rows with NaN values after shifting

# Define features and target
features = ['CPI_Lag1', 'CPI_Lag2', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']
target = 'CPI'

# Prepare training and testing data
X = data[features]
y = data[target]

# Split the data to exclude the last CPI value for testing
X_train = X.iloc[:-1]
y_train = y.iloc[:-1]
X_test = X.iloc[-1:]  # Use the last row for prediction

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the next CPI value (for the last row in the dataset)
next_cpi_prediction = model.predict(X_test)

# Trading Simulation
# Get the date of the most recent CPI row and derive a placeholder entry time
latest_date = data['date'].iloc[-1]
next_release_time = latest_date + pd.DateOffset(minutes=5)  # Placeholder: adds 5 minutes to the last CPI date; the real release time is not modeled

# Assume price data is available and merge with the CPI data
# This is a placeholder - you would use actual price data in practice
# Simulating price data
data['Price'] = np.random.randn(len(data)) * 10 + 100  # Random prices around 100

# Extract relevant data for trading simulation
latest_price = data['Price'].iloc[-1]  # Last known price
future_price = latest_price + np.random.randn() * 2  # Simulated future price for closing (placeholder)

# Trading parameters
account_balance = 1000  # $1,000 account balance
lot_size = 100000       # Lot size representing $100,000
trade_lots = 1          # Number of lots traded
spread = 0.05           # Example spread (cost to enter/exit the trade)

# Decide on trading action
current_cpi_value = data['CPI'].iloc[-1]  # Last observed CPI value
action = 'Hold'
if next_cpi_prediction[0] > current_cpi_value:
    action = 'Buy NASDAQ'  # Long position
elif next_cpi_prediction[0] < current_cpi_value:
    action = 'Sell NASDAQ'  # Short position

# Simulate closing the trade 2 hours after the release
close_time = next_release_time + pd.DateOffset(hours=2)
print(f"Trading Action: {action}")

# Simulate trading performance
if action == 'Buy NASDAQ':
    trade_return = (future_price - latest_price) * lot_size / latest_price - spread
elif action == 'Sell NASDAQ':
    trade_return = (latest_price - future_price) * lot_size / latest_price - spread
else:
    trade_return = 0  # No change if holding

# Calculate percentage return
percentage_return = (trade_return / (lot_size * trade_lots)) * 100

# Print results
print(f"Trade Duration: {close_time - next_release_time}")
print(f"Trade Lots: {trade_lots}")
print(f"Spread: {spread}")
print(f"Trade Return (in dollars): {trade_return:.2f}")
print(f"Percentage Return: {percentage_return:.2f}%")
print(f"Predicted Next CPI Value: {next_cpi_prediction[0]}")


Trading Action: Buy NASDAQ
Trade Duration: 0 days 02:00:00
Trade Lots: 1
Spread: 0.05
Trade Return (in dollars): 464.09
Percentage Return: 0.46%
Predicted Next CPI Value: 313.5880483097469


## Trading Action: Buy NASDAQ


<hr>
The results show that following the model's recommendation to buy NASDAQ produced a simulated profit of $464.09, which corresponds to a return of 0.46% on a $100,000 lot. The trade lasted 2 hours and included a spread cost of 0.05. The prediction of a higher CPI value drove the decision to go long. Because the entry and exit prices in this cell are randomly generated placeholders, the profit illustrates the calculation, not the performance of the strategy.
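The return arithmetic used in the simulation can be written as a small function. The sketch below mirrors the notebook's formula for one lot. The name `simulate_trade` is hypothetical, and the entry and exit prices are made-up values chosen so that the result matches the reported 464.09 dollars, since the random prices drawn by the notebook are not shown.

```python
def simulate_trade(entry_price, exit_price, direction, lot_size=100_000, spread=0.05):
    """Return (net dollars, percent of lot) for one lot.

    direction is +1 for a long position and -1 for a short position.
    """
    gross = direction * (exit_price - entry_price) * lot_size / entry_price
    net = gross - spread
    return net, net / lot_size * 100.0

# Hypothetical prices: a 0.46414% rise on a $100,000 lot, minus the 0.05 spread
net, pct = simulate_trade(entry_price=100.0, exit_price=100.46414, direction=+1)
print(f"Trade Return (in dollars): {net:.2f}")
print(f"Percentage Return: {pct:.2f}%")
```

The percentage is measured against the $100,000 lot, not against the $1,000 account balance defined in the same cell, so the return on account equity would be 100 times larger in either direction.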



## For the next CPI release


<hr>


In [None]:
import pandas as pd
import numpy as np
from pandas_datareader import data as pdr
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from datetime import datetime, timedelta

# Define the date range
end_date = datetime.now()
start_date = end_date - timedelta(days=365*2)  # Fetch data for the last 2 years

# Fetch data from FRED
cpi = pdr.get_data_fred('CPIAUCNS', start_date, end_date)
initial_jobless_claims = pdr.get_data_fred('ICSA', start_date, end_date)
nonfarm_payrolls = pdr.get_data_fred('PAYNSA', start_date, end_date)
crude_oil_prices = pdr.get_data_fred('IR14200', start_date, end_date)

# Merge datasets with CPI as the base
data = cpi.join(initial_jobless_claims, how='left')
data = data.join(nonfarm_payrolls, how='left')
data = data.join(crude_oil_prices, how='left')

# Rename columns for consistency
data.columns = ['CPI', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']

# Handle missing values
data.ffill(inplace=True)  # Forward fill missing values

# Feature Engineering: create lagged features
data['CPI_Lag1'] = data['CPI'].shift(1)
data['CPI_Lag2'] = data['CPI'].shift(2)
data.dropna(inplace=True)  # Drop rows with NaN values after shifting

# Define features and target
features = ['CPI_Lag1', 'CPI_Lag2', 'Initial Jobless Claims', 'Nonfarm Payrolls', 'Crude Oil Prices']
target = 'CPI'

# Prepare training and testing data
X = data[features]
y = data[target]

# Split the data
X_train = X.iloc[:-1]
y_train = y.iloc[:-1]
X_test = X.iloc[-1:]  # Use the last row for prediction

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the next CPI value
next_cpi_prediction = model.predict(X_test)

# Trading Simulation
# Get the date of the most recent CPI row and derive a placeholder entry time
latest_date = data.index[-1]
next_release_time = latest_date + pd.DateOffset(minutes=5)  # Placeholder: adds 5 minutes to the last CPI date; the real release time is not modeled

# Simulating price data
data['Price'] = np.random.randn(len(data)) * 10 + 100  # Random prices around 100

# Extract relevant data for trading simulation
latest_price = data['Price'].iloc[-1]  # Last known price
future_price = latest_price + np.random.randn() * 2  # Simulated future price for closing (placeholder)

# Trading parameters
account_balance = 1000  # $1,000 account balance
lot_size = 100000       # Lot size representing $100,000
trade_lots = 1          # Number of lots traded
spread = 0.05           # Example spread (cost to enter/exit the trade)

# Decide on trading action
current_cpi_value = data['CPI'].iloc[-1]  # Last observed CPI value
action = 'Hold'
if next_cpi_prediction[0] > current_cpi_value:
    action = 'Buy NASDAQ'  # Long position
elif next_cpi_prediction[0] < current_cpi_value:
    action = 'Sell NASDAQ'  # Short position

# Simulate closing the trade 2 hours after the release
close_time = next_release_time + pd.DateOffset(hours=2)
print(f"Trading Action: {action}")

# Simulate trading performance
if action == 'Buy NASDAQ':
    trade_return = (future_price - latest_price) * lot_size / latest_price - spread
elif action == 'Sell NASDAQ':
    trade_return = (latest_price - future_price) * lot_size / latest_price - spread
else:
    trade_return = 0  # No change if holding

# Calculate percentage return
percentage_return = (trade_return / (lot_size * trade_lots)) * 100

# Print results
print(f"Trade Duration: {close_time - next_release_time}")
print(f"Trade Lots: {trade_lots}")



Trading Action: Buy NASDAQ
Trade Duration: 0 days 02:00:00
Trade Lots: 1
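In the cell above, the entry time is the last CPI date plus five minutes, and the reported duration is always two hours. A minimal sketch of the stated plan (open 5 minutes before the release, close 2 hours after it) is shown here. The function name `trade_window` is hypothetical, and the release timestamp is the June entry from the assumed release list earlier in this notebook.

```python
import pandas as pd

def trade_window(release_time, minutes_before=5, hours_after=2):
    """Return (entry_time, close_time) around a release timestamp."""
    entry = release_time - pd.Timedelta(minutes=minutes_before)
    close = release_time + pd.Timedelta(hours=hours_after)
    return entry, close

# Assumed release time, taken from the list used in the first cell
release = pd.Timestamp("2024-06-11 08:30:00", tz="America/New_York")
entry_time, close_time = trade_window(release)

print(f"Entry: {entry_time}")
print(f"Close: {close_time}")
print(f"Trade Duration: {close_time - entry_time}")
```

With these settings the position is held for 2 hours and 5 minutes, and both timestamps fall before or shortly after the 9:30 regular-session open, so intraday or futures prices would be needed to evaluate it.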


# Most volatile assets around CPI releases

In this section, we look for the financial assets that move the most when CPI is released, in order to improve this strategy's profits.
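As a rough sketch of the idea, assets can be ranked by how much they move in a fixed window after each release. The example below makes several assumptions: it uses synthetic hourly prices, the names `event_window_moves` and `rank_assets` are hypothetical, and it ranks by the mean absolute move (signed moves can cancel out across releases), which differs from the signed average used in the cell that follows. At least two bars per window are required, so intraday data is needed; with daily bars a one-hour window contains at most one bar.

```python
import numpy as np
import pandas as pd

def event_window_moves(prices, release_times, window):
    """Fractional change from the first to the last bar inside each window."""
    moves = []
    for release in release_times:
        in_window = prices.loc[release:release + window]
        if len(in_window) >= 2:  # a single bar cannot show a move
            moves.append(in_window.iloc[-1] / in_window.iloc[0] - 1.0)
    return np.array(moves)

def rank_assets(price_dict, release_times, window):
    """Rank assets by their mean absolute move over the release windows."""
    rows = []
    for asset, prices in price_dict.items():
        moves = event_window_moves(prices, release_times, window)
        if len(moves) > 0:
            rows.append({
                "Asset": asset,
                "Mean Abs Move (%)": np.mean(np.abs(moves)) * 100,
                "Std of Moves (%)": np.std(moves) * 100,
            })
    table = pd.DataFrame(rows)
    return table.sort_values("Mean Abs Move (%)", ascending=False).reset_index(drop=True)

# Synthetic hourly bars stamped at :30 (made-up prices, two assumed release times)
idx = pd.date_range("2024-01-01 00:30", periods=24 * 60, freq=pd.Timedelta(hours=1))
releases = [pd.Timestamp("2024-01-09 08:30"), pd.Timestamp("2024-02-13 08:30")]

calm = pd.Series(100.0 + 0.001 * np.arange(len(idx)), index=idx)
steps = sum((idx > r).astype(float) for r in releases)  # +1 after each release
jumpy = pd.Series(100.0 + steps, index=idx)

ranking = rank_assets({"CALM": calm, "JUMPY": jumpy}, releases, pd.Timedelta(hours=2))
print(ranking)
```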

In [None]:
import pandas as pd
import numpy as np
import requests
from datetime import datetime, timedelta

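# Alpha Vantage API key, read from an environment variable instead of being hardcoded
# (the variable name ALPHAVANTAGE_API_KEY is an arbitrary choice)
import os
api_key = os.environ.get('ALPHAVANTAGE_API_KEY', '')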

# Define function to fetch data from Alpha Vantage
def fetch_alpha_vantage_data(symbol, interval='daily'):
    url = f'https://www.alphavantage.co/query'
    params = {
        'function': 'TIME_SERIES_DAILY' if interval == 'daily' else 'TIME_SERIES_INTRADAY',
        'symbol': symbol,
        'apikey': api_key
    }

    if interval == 'intraday':
        params['interval'] = '60min'  # Example interval, can be adjusted

    try:
        response = requests.get(url, params=params)
        response.raise_for_status()  # Check if the request was successful
        data = response.json()

        # Handle different possible keys based on the API response
        if 'Time Series (Daily)' in data:
            df = pd.DataFrame(data['Time Series (Daily)']).T
        elif 'Time Series (60min)' in data:
            df = pd.DataFrame(data['Time Series (60min)']).T
        else:
            print(f"Error fetching data for {symbol}: {data.get('Error Message', 'Unknown error')}")
            return pd.DataFrame()

        df = df.astype(float)
        df.index = pd.to_datetime(df.index)
        df.rename(columns={'4. close': 'Price'}, inplace=True)
        return df

    except requests.exceptions.RequestException as e:
        print(f"Request error for {symbol}: {e}")
        return pd.DataFrame()
    except ValueError as e:
        print(f"Data processing error for {symbol}: {e}")
        return pd.DataFrame()

# Define your asset lists
stocks_assets = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA', 'META', 'NVDA', 'BRK.B', 'JPM', 'MA']
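# Note: the '=X' symbols are Yahoo Finance-style tickers. Alpha Vantage's TIME_SERIES_DAILY
# rejects them; its FX endpoint (function FX_DAILY) takes from_symbol/to_symbol parameters instead.
forex_assets = ['EURUSD=X', 'GBPUSD=X', 'USDJPY=X', 'AUDUSD=X', 'USDCAD=X', 'USDCHF=X', 'NZDUSD=X', 'EURGBP=X', 'EURJPY=X', 'GBPJPY=X']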

# Define date range
end_date = datetime.now()
start_date = end_date - timedelta(days=365*2)  # Last 2 years

# Fetch data for stocks and forex
stocks_data = {asset: fetch_alpha_vantage_data(asset) for asset in stocks_assets}
forex_data = {asset: fetch_alpha_vantage_data(asset) for asset in forex_assets}

# Filter data within the desired date range
def filter_data_by_date(data, start_date, end_date):
    return data[(data.index >= start_date) & (data.index <= end_date)]

# Calculate price changes after CPI release
def calculate_price_changes(data, release_dates, window):
    changes = []
    for release_date in release_dates:
        window_start = release_date
        window_end = release_date + timedelta(hours=window)
        window_data = data[(data.index >= window_start) & (data.index <= window_end)]
        if len(window_data) > 0:
            price_change = (window_data['Price'].iloc[-1] - window_data['Price'].iloc[0]) / window_data['Price'].iloc[0]
            changes.append(price_change)
    return changes

# Calculate price changes
def calculate_all_changes(data_dict, release_dates, window):
    changes_dict = {}
    for asset, data in data_dict.items():
        filtered_data = filter_data_by_date(data, start_date, end_date)
        changes = calculate_price_changes(filtered_data, release_dates, window)
        changes_dict[asset] = changes
    return changes_dict

# Calculate price changes
release_dates = [datetime(2023, 8, 10), datetime(2024, 2, 13)]  # Example release dates
stocks_changes = calculate_all_changes(stocks_data, release_dates, 1)
forex_changes = calculate_all_changes(forex_data, release_dates, 1)

# Convert results to DataFrame for easy comparison
def create_results_df(changes_dict, category):
    results = {
        'Asset': [],
        'Average Change (%)': [],
        'Volatility (%)': []
    }
    for asset, changes in changes_dict.items():
        if changes:  # Avoid NaN if no changes
            results['Asset'].append(asset)
            results['Average Change (%)'].append(np.mean(changes) * 100)
            results['Volatility (%)'].append(np.std(changes) * 100)
    return pd.DataFrame(results)

# Create DataFrames for each category
stocks_results = create_results_df(stocks_changes, 'Stocks')
forex_results = create_results_df(forex_changes, 'Forex')

# Combine all results
all_results = pd.concat([stocks_results, forex_results], keys=['Stocks', 'Forex'])
print(all_results)


Error fetching data for EURUSD=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for GBPUSD=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for USDJPY=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for AUDUSD=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for USDCAD=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for USDCHF=X: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY.
Error fetching data for NZDUSD=X: Invali