Downloading Historical Financial Data

In [42]:
import yfinance as yf # for retrieving historical financial data from Yahoo Finance.
import pandas as pd # for data manipulation and analysis

assets = {
    'Equity_Index': "^GSPC",      # S&P 500
    'Stock': "AAPL",              # Apple
    'Currency_Pair': "GBPUSD=X",  # GBPUSD
    'Commodity': "ZC=F",          # Corn futures
    'Crypto': "ETH-USD",          # Ethereum
}

start_date = "2020-05-01"
end_date = "2025-05-01"

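# Download all tickers at once and keep only the closing prices (one column per ticker).
# `end` is exclusive, so the last row returned is 2025-04-30.
data = yf.download(list(assets.values()), start=start_date, end=end_date)["Close"]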


[*********************100%***********************]  5 of 5 completed


Rename the DataFrame columns from ticker symbols to readable asset names, then save the renamed DataFrame to a CSV file

In [43]:
data = data.rename(columns={'^GSPC':'S&P 500',
                     'AAPL':'Apple',
                     'GBPUSD=X':'GBPUSD',
                     'ZC=F':'Corn futures',
                     "ETH-USD": 'Ethereum'})
data.to_csv('data.csv')
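
When the saved file is loaded again later, the date column has to be parsed explicitly, otherwise the index comes back as plain strings. The sketch below illustrates this round trip under stated assumptions: it uses a small hypothetical frame (`demo`) and an in-memory buffer in place of `data.csv`, so it runs without a network connection or disk access.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded frame: five calendar days,
# with gaps in "Apple" on the weekend (values are placeholders).
index = pd.date_range("2020-05-01", periods=5, freq="D", name="Date")
demo = pd.DataFrame(
    {
        "Apple": [70.5, None, None, 71.0, 72.0],
        "Ethereum": [214.0, 215.0, 211.0, 208.0, 207.0],
    },
    index=index,
)

# Write to an in-memory buffer instead of 'data.csv'.
buffer = io.StringIO()
demo.to_csv(buffer)
buffer.seek(0)

# Read it back, using the "Date" column as a parsed datetime index.
restored = pd.read_csv(buffer, index_col="Date", parse_dates=True)
print(type(restored.index).__name__)
```

Note that the index frequency is not stored in the CSV, so the reloaded index should not be expected to carry `freq='D'`.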

In [44]:
data.head()

Date,Apple,Ethereum,GBPUSD,Corn futures,S&P 500
2020-05-01,70.059341,214.219101,1.258147,311.5,2830.709961
2020-05-02,,215.325378,,,
2020-05-03,,210.933151,,,
2020-05-04,71.050621,208.174011,1.245423,310.75,2842.73999
2020-05-05,72.116989,206.774399,1.244555,313.0,2868.439941


In [45]:
data.tail()

Date,Apple,Ethereum,GBPUSD,Corn futures,S&P 500
2025-04-26,,1821.881104,,,
2025-04-27,,1792.86499,,,
2025-04-28,209.864792,1798.851807,1.32901,475.5,5528.75
2025-04-29,210.933395,1799.175659,1.343616,460.5,5560.830078
2025-04-30,212.22171,1793.775391,1.341079,467.25,5569.060059


Verify that the DataFrame index is correctly parsed as datetime values

In [46]:
data.index

DatetimeIndex(['2020-05-01', '2020-05-02', '2020-05-03', '2020-05-04',
               '2020-05-05', '2020-05-06', '2020-05-07', '2020-05-08',
               '2020-05-09', '2020-05-10',
               ...
               '2025-04-21', '2025-04-22', '2025-04-23', '2025-04-24',
               '2025-04-25', '2025-04-26', '2025-04-27', '2025-04-28',
               '2025-04-29', '2025-04-30'],
              dtype='datetime64[ns]', name='Date', length=1826, freq='D')
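
The visual check above can also be expressed as explicit assertions. This is a minimal sketch, assuming a synthetic frame (`demo`, with placeholder values) that spans the same daily date range as the index shown above; it does not use the downloaded prices.

```python
import pandas as pd

# Synthetic stand-in for the price frame: same daily range, placeholder values.
index = pd.date_range("2020-05-01", "2025-04-30", freq="D", name="Date")
demo = pd.DataFrame({"Ethereum": range(len(index))}, index=index)

# The properties worth checking before any date-based slicing.
is_datetime = isinstance(demo.index, pd.DatetimeIndex)
is_sorted = demo.index.is_monotonic_increasing
is_unique = demo.index.is_unique
n_rows = len(demo)

print(is_datetime, is_sorted, is_unique, n_rows)
```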

The in-sample period starts on 2020-05-01 and contains every row except the final 365 calendar days

In [47]:
data.iloc[:-365]

Date,Apple,Ethereum,GBPUSD,Corn futures,S&P 500
2020-05-01,70.059341,214.219101,1.258147,311.50,2830.709961
2020-05-02,,215.325378,,,
2020-05-03,,210.933151,,,
2020-05-04,71.050621,208.174011,1.245423,310.75,2842.739990
2020-05-05,72.116989,206.774399,1.244555,313.00,2868.439941
...,...,...,...,...,...
2024-04-26,168.283676,3130.164795,1.250907,440.00,5099.959961
2024-04-27,,3252.168213,,,
2024-04-28,,3262.774658,,,
2024-04-29,172.458466,3215.428955,1.251048,439.25,5116.169922


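The out-of-sample period covers the final 365 rows, starting on 2024-05-01. Because the index has daily frequency, these are 365 calendar days (weekends included), not 365 trading days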

In [48]:
data.iloc[-365:]

Date,Apple,Ethereum,GBPUSD,Corn futures,S&P 500
2024-05-01,168.283676,2969.784668,1.249016,443.25,5018.390137
2024-05-02,171.991272,2988.168457,1.253934,452.00,5064.200195
2024-05-03,182.279160,3103.541992,1.254060,447.00,5127.790039
2024-05-04,,3117.576416,,,
2024-05-05,,3137.249023,,,
...,...,...,...,...,...
2025-04-26,,1821.881104,,,
2025-04-27,,1792.864990,,,
2025-04-28,209.864792,1798.851807,1.329010,475.50,5528.750000
2025-04-29,210.933395,1799.175659,1.343616,460.50,5560.830078
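
Positional slicing with `iloc` depends on the number of rows, so the split moves if the date range changes. As an alternative sketch, the same split can be written with explicit dates. The example assumes a synthetic frame (`demo`) with a daily index from 2020-05-01 to 2025-04-30, matching the index printed earlier; the boundary dates are derived from that assumption.

```python
import pandas as pd

# Synthetic stand-in for the price frame: same daily range, placeholder values.
index = pd.date_range("2020-05-01", "2025-04-30", freq="D", name="Date")
demo = pd.DataFrame({"Ethereum": range(len(index))}, index=index)

# Positional split, as used in the notebook.
in_sample = demo.iloc[:-365]
out_of_sample = demo.iloc[-365:]

# Date-based split; label slices on a DatetimeIndex include both endpoints.
in_sample_by_date = demo.loc[:"2024-04-30"]
out_of_sample_by_date = demo.loc["2024-05-01":]

print(in_sample.index[0].date(), in_sample.index[-1].date(), len(in_sample))
print(out_of_sample.index[0].date(), out_of_sample.index[-1].date(), len(out_of_sample))
```

With a complete daily index, both approaches select the same rows; the date-based form simply makes the boundary visible in the code.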
