## **0. Data download**

- This notebook downloads daily OHLCV data for the portfolio asset universe (equities, indices, FX, commodities, and bond ETFs) for the period 2019-2024 using Yahoo Finance (yfinance) and stores cleaned CSV files under `data/raw/prices`.
- These raw files will be the input for all subsequent steps: returns calculation, portfolio construction, risk metrics and optimization.
- Prices were downloaded with `auto_adjust=True`, so the **Close** column already holds the adjusted close (adjusted for splits and dividends) and no separate **Adj Close** column is returned.

**RESULTS**
- All 18 assets were downloaded successfully and saved as one CSV file each in `data/raw/prices`
- Special characters ('^', '=F', '=X') were removed from **tickers** when defining file names
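
The file-naming rule above can be sketched as a small helper. This is only an illustration: the name `price_filename` is hypothetical and not part of the notebook, which builds the name inline with the same three replacements.

```python
def price_filename(ticker: str) -> str:
    """Return the CSV file name for a Yahoo Finance ticker (hypothetical helper)."""
    cleaned = ticker
    # Strip the index prefix (^) and the FX / futures suffixes (=X, =F)
    for token in ("^", "=X", "=F"):
        cleaned = cleaned.replace(token, "")
    return f"{cleaned}_prices.csv"


examples = ["^GSPC", "EURUSD=X", "CL=F", "MSFT"]
names = [price_filename(t) for t in examples]
print(names)

# Guard against two tickers collapsing onto the same file name
assert len(set(names)) == len(examples)
```

The uniqueness check matters because stripping characters could, in principle, map two different tickers to the same file and silently overwrite one of them.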

#### **0.1 Importing libraries**

In [1]:
# Importing necessary libraries
import pandas as pd
import yfinance as yf
from src.helpers_io import raw_path
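
The source of `src.helpers_io.raw_path` is not shown in this notebook. Because the result is later used with `.mkdir(...)` and the `/` operator, it presumably returns a `pathlib.Path` under `data/raw`. A minimal stand-in, assuming the notebook runs from the project root (the real helper may resolve the root differently), could look like this:

```python
from pathlib import Path

# Hypothetical stand-in for src.helpers_io.raw_path; the real helper may differ.
PROJECT_ROOT = Path.cwd()  # assumption: the working directory is the project root


def raw_path(*parts: str) -> Path:
    """Build a path under <project root>/data/raw without creating it."""
    return PROJECT_ROOT.joinpath("data", "raw", *parts)


prices_dir = raw_path("prices")
print(prices_dir)
```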

#### **0.2 Asset universe**

- **AMZN** – Amazon.com, Inc.
- **BZ=F** – Brent Crude Oil Futures
- **CL=F** – WTI Crude Oil Futures
- **EURUSD=X** – Euro / US Dollar
- **GC=F** – Gold Futures
- **GBPUSD=X** – British Pound / US Dollar
- **IEF** – iShares 7–10 Year Treasury Bond ETF
- **JPM** – JPMorgan Chase & Co.
- **MSFT** – Microsoft Corporation
- **NG=F** – Natural Gas Futures
- **NVDA** – NVIDIA Corporation
- **ORCL** – Oracle Corporation
- **SI=F** – Silver Futures
- **TLT** – iShares 20+ Year Treasury Bond ETF
- **USDJPY=X** – US Dollar / Japanese Yen
- **^FTSE** – FTSE 100 Index
- **^GSPC** – S&P 500 Index
- **^IXIC** – NASDAQ Composite Index

In [2]:
# 1. Asset universe
equities = ["MSFT", "AMZN", "NVDA", "ORCL", "JPM"]
indices = ["^GSPC", "^IXIC", "^FTSE"]
fx = ["EURUSD=X", "GBPUSD=X", "USDJPY=X"]
commodities = ["CL=F", "BZ=F", "NG=F", "GC=F", "SI=F"]
bonds = ["TLT", "IEF"]

all_tickers = equities + indices + fx + commodities + bonds

# 2. Date range
# Note: yfinance treats `end` as exclusive, so 2024-12-31 itself is not downloaded
start_date = "2019-01-01"
end_date = "2024-12-31"
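
Since `yf.download` treats `end` as an exclusive bound, one way to cover the last calendar day of a period is to pass the following day. The sketch below uses only the standard library; the variable names are illustrative and do not appear in the notebook.

```python
from datetime import date, timedelta

last_day_wanted = "2024-12-31"  # last calendar day that should be included

# Shift by one day to obtain an exclusive upper bound for the download
exclusive_end = (date.fromisoformat(last_day_wanted) + timedelta(days=1)).isoformat()
print(exclusive_end)  # 2025-01-01
```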

#### **0.3 Downloading data from Yahoo Finance**

In [4]:
# 3. Download and save each asset to data/raw
prices_dir = raw_path("prices")
prices_dir.mkdir(parents=True, exist_ok=True)   # Ensures 'prices' folder exists

for ticker in all_tickers:
    # auto_adjust=True -> OHLC prices are adjusted, so Close is the adjusted close
    # progress=False -> no progress bar
    # multi_level_index=False -> flat column names instead of a MultiIndex
    data = yf.download(
        ticker,
        start=start_date,
        end=end_date,
        auto_adjust=True,
        progress=False,
        multi_level_index=False,
    )

    # yf.download already returns a DataFrame; this conversion is only defensive
    data = pd.DataFrame(data)

    # Basic sanity check
    if data.empty:
        print(f"Warning: no data returned for {ticker}")
        continue

    # Reset index to have "Date" as a column for better data manipulation
    data = data.reset_index()

    # Clean filename (removing ^, =X, =F from tickers)
    filename = f"{ticker.replace('^', '').replace('=X', '').replace('=F', '')}_prices.csv"

    # Full file path inside 'data/raw/prices'
    filepath = prices_dir / filename

    # Save as CSV in data/raw/prices
    data.to_csv(filepath, index=False)

print("All data downloaded successfully! ✅")

# 4. Checking files
downloaded = len(list(prices_dir.glob("*.csv")))
print(f"{downloaded} files saved in {prices_dir}")

All data downloaded successfully! ✅
18 files saved in C:\Users\james\Desktop\UK Life\Data Scientist Career Path\My notes (Python, SQL, etc.)\Project Portfolios\finance-project\data\raw\prices
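
Counting `*.csv` files confirms how many files are in the folder, but not that each ticker has its own file, because files left over from earlier runs are counted too. A stricter check, sketched below, compares the expected file names with what is on disk. The function name `missing_price_files` is hypothetical, and the demo uses a temporary folder rather than the real `data/raw/prices`.

```python
import tempfile
from pathlib import Path


def missing_price_files(tickers, prices_dir):
    """Return the tickers whose expected CSV is absent from prices_dir."""
    missing = []
    for ticker in tickers:
        # Same cleaning rule as the download loop
        cleaned = ticker.replace("^", "").replace("=X", "").replace("=F", "")
        if not (Path(prices_dir) / f"{cleaned}_prices.csv").is_file():
            missing.append(ticker)
    return missing


# Demo on a throwaway folder that holds only one of two expected files
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "MSFT_prices.csv").write_text("Date,Close\n")
    demo_missing = missing_price_files(["MSFT", "^GSPC"], tmp)

print(demo_missing)  # ['^GSPC']
```

In the notebook itself, the call would be `missing_price_files(all_tickers, prices_dir)`, and an empty list would mean that every ticker has a file.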
