## **0. Data Ingestion Pipeline**

**OVERVIEW**

This notebook orchestrates the automated data ingestion process for the portfolio analysis framework. Rather than hard-coding tickers in the notebook, it uses the `src.data.downloader` module to:

1.  **Parse** the user-defined asset universe from `tickers.txt`.
2.  **Audit** asset metadata (Currency, Quote Type) via Yahoo Finance.
3.  **Inject Dependencies** automatically (e.g., with GBP as the target currency, finding a USD-denominated asset adds `GBPUSD=X`).
4.  **Download** historical OHLCV data to `data/raw/prices`.
5.  **Generate** a data manifest (`data_manifest.csv`) for downstream processing.
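
Step 3 can be illustrated with a minimal sketch. The helper `required_fx_pairs` and its inputs are hypothetical and are not taken from `src.data.downloader`; the only detail grounded in this notebook is the ticker pattern `<TARGET><ASSET>=X` seen in `GBPUSD=X`.

```python
def required_fx_pairs(asset_currencies, target_currency):
    """Return the FX tickers needed to convert each asset currency into the
    target currency (hypothetical helper, for illustration only)."""
    pairs = []
    # De-duplicate and sort so the result is deterministic.
    for ccy in sorted(set(asset_currencies)):
        if ccy != target_currency:
            # Assumed naming pattern, matching GBPUSD=X in the text above.
            pairs.append(f"{target_currency}{ccy}=X")
    return pairs


print(required_fx_pairs(["USD", "GBP", "USD", "JPY"], "GBP"))
```

Assets already quoted in the target currency need no FX pair, so they add nothing to the result.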

**INPUTS**
* `tickers.txt`: A simple text file in the project root listing the desired assets.
* **Parameters**: Start Date, End Date, and Target Currency (e.g., GBP).
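
The exact format of `tickers.txt` is not specified here. As a rough sketch, assuming one ticker per line with blank lines and `#` comments ignored, parsing could look like the following. The function `read_tickers` is hypothetical, not the module's actual parser.

```python
from pathlib import Path
import tempfile


def read_tickers(path):
    """Read unique tickers from a text file, preserving first-seen order.

    Assumed format: one ticker per line; text after '#' is a comment.
    """
    tickers = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        ticker = raw.split("#", 1)[0].strip()
        if ticker and ticker not in tickers:
            tickers.append(ticker)
    return tickers


# Demonstrate on a throwaway file rather than the real tickers.txt.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "tickers.txt"
    sample.write_text("AMZN\nMSFT  # software\n\nAMZN\nISF.L\n", encoding="utf-8")
    print(read_tickers(sample))
```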

**OUTPUTS**
* Raw CSV files stored in `data/raw/prices`.
* Audit trail saved as `data/raw/data_manifest.csv`.
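
A downstream step may want to confirm that every ticker has a raw file on disk. The sketch below assumes a `<ticker>.csv` naming scheme inside the prices directory. That scheme is an assumption for illustration and is not confirmed by this notebook. The helper name is also hypothetical.

```python
from pathlib import Path
import tempfile


def missing_price_files(tickers, prices_dir):
    """Return the tickers that have no `<ticker>.csv` in prices_dir.

    The file naming scheme is an assumption, not the pipeline's documented layout.
    """
    prices_dir = Path(prices_dir)
    return [t for t in tickers if not (prices_dir / f"{t}.csv").exists()]


# Demonstrate with a temporary directory holding a single placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "AMZN.csv").write_text("Date,Close\n", encoding="utf-8")
    print(missing_price_files(["AMZN", "MSFT"], tmp))
```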

#### **0.1 Importing libraries**

In [1]:
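# Import the libraries used in this notebook
import pandas as pd
from IPython.display import display  # explicit import so the code also runs outside Jupyter

from src.data.downloader import download_data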

#### **0.2 Configuration**
Define the temporal scope and the base currency for the portfolio.

* **Asset Universe**: Defined externally in `tickers.txt` (Project Root).
* **Start/End Date**: Locked to ensure reproducibility of the backtest.
* **Target Currency**: The currency in which the portfolio will ultimately be denominated. This triggers the automated "Dependency Injection" of FX pairs (e.g., `GBPUSD=X`) if needed.
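
Because the dates and currency are fixed by hand, a small sanity check before running the pipeline can catch typos early. This is an optional sketch. `validate_config` is a hypothetical helper, and the three-letter uppercase currency rule is an assumption.

```python
from datetime import date


def validate_config(start_date, end_date, target_currency):
    """Parse ISO dates and check basic consistency of the pipeline parameters."""
    start = date.fromisoformat(start_date)  # raises ValueError on a bad format
    end = date.fromisoformat(end_date)
    if start >= end:
        raise ValueError(f"start_date {start} must be before end_date {end}")
    # Assumed convention: three-letter uppercase currency codes such as GBP.
    if not (len(target_currency) == 3 and target_currency.isalpha()
            and target_currency.isupper()):
        raise ValueError(f"unexpected currency code: {target_currency!r}")
    return start, end


start, end = validate_config("2019-01-01", "2024-12-31", "GBP")
print(f"Backtest window: {start} to {end}")
```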

In [2]:
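# Pipeline parameters
INPUT_FILENAME = "tickers.txt"  # asset universe, read from the project root
TARGET_CURRENCY = "GBP"  # base currency of the portfolio
RISK_FREE_TICKER = "^IRX"  # Yahoo Finance symbol for the 13-week US Treasury bill yield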

# Backtest Period (Fixed for Reproducibility)
START_DATE = "2019-01-01"
END_DATE = "2024-12-31"

#### **0.3 Pipeline Execution**
Run the `download_data` function, which handles API connections, error logging, and file management internally.
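
The internals of `download_data` are not shown in this notebook. As a hedged illustration of per-asset error logging with a success/failure tally, the sketch below uses a hypothetical `run_downloads` function and a fake `fetch` callable in place of any real API call.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger("ingestion_sketch")


def run_downloads(tickers, fetch):
    """Call fetch(ticker) for each ticker, logging failures instead of aborting."""
    info = {"success": 0, "failed": 0}
    for ticker in tickers:
        try:
            fetch(ticker)
            info["success"] += 1
        except Exception:
            # Log the traceback and continue with the remaining tickers.
            logger.exception("Download failed for %s", ticker)
            info["failed"] += 1
    logger.info("Pipeline finished. Info: %s", info)
    return info


def fake_fetch(ticker):
    """Stand-in for a real download; fails for one made-up symbol."""
    if ticker == "BAD":
        raise RuntimeError("simulated API error")


result = run_downloads(["AMZN", "BAD", "MSFT"], fake_fetch)
```

Passing the fetch function in as an argument keeps the loop testable without network access.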

In [3]:
# Run the ingestion pipeline
metadata = download_data(
    start_date=START_DATE, 
    end_date=END_DATE, 
    input_filename=INPUT_FILENAME, 
    target_currency=TARGET_CURRENCY,
    risk_free_ticker=RISK_FREE_TICKER
)

# Display summary of the ingestion process
if not metadata.empty:
    print(f"Total assets processed: {len(metadata)}")
    print("\nSample of Data Manifest:")
    display(metadata)
    
    # Audit: list every FX pair (user-supplied and injected) to verify the injection visually
    fx_checks = metadata[metadata['type'] == 'CURRENCY']
    if not fx_checks.empty:
        print("\nFX Dependencies:")
        display(fx_checks)
else:
    print("⚠️ Pipeline returned no data. Check logs/console for errors.")

2025-12-22 22:00:49,330 - INFO - Starting pipeline. Period: 2019-01-01 to 2024-12-31
2025-12-22 22:00:49,337 - INFO - Loaded 19 unique tickers from tickers.txt
2025-12-22 22:00:49,340 - INFO - Analyzing assets and checking for FX dependencies
2025-12-22 22:00:55,784 - INFO - FX dependency: GBPUSD=X (for USD assets)
2025-12-22 22:00:55,785 - INFO - FX dependency: GBPJPY=X (for JPY assets)
2025-12-22 22:00:55,817 - INFO - Metadata manifest saved to C:\Users\james\Desktop\UK Life\Data Scientist Career Path\My notes (Python, SQL, etc.)\Portfolio of projects\finance-project\data\raw\data_manifest.csv
2025-12-22 22:00:55,820 - INFO - Starting download for 21 assets...
2025-12-22 22:01:04,021 - INFO - Pipeline finished. Info: {'success': 21, 'failed': 0}


Total assets processed: 21

Sample of Data Manifest:


| | ticker | name | original_currency | type | source |
|---|---|---|---|---|---|
| 0 | AMZN | Amazon.com, Inc. | USD | EQUITY | user_input |
| 1 | BNO | United States Brent Oil Fund, L | USD | ETF | user_input |
| 2 | EURUSD=X | EUR/USD | USD | CURRENCY | user_input |
| 3 | GLD | SPDR Gold Shares | USD | ETF | user_input |
| 4 | IEF | iShares 7-10 Year Treasury Bond | USD | ETF | user_input |
| 5 | ISF.L | ISHARES PLC ISHARES CORE FTSE10 | GBP | ETF | user_input |
| 6 | JPM | JP Morgan Chase & Co. | USD | EQUITY | user_input |
| 7 | MSFT | Microsoft Corporation | USD | EQUITY | user_input |
| 8 | NVDA | NVIDIA Corporation | USD | EQUITY | user_input |
| 9 | ORCL | Oracle Corporation | USD | EQUITY | user_input |


FX Dependencies:


| | ticker | name | original_currency | type | source |
|---|---|---|---|---|---|
| 2 | EURUSD=X | EUR/USD | USD | CURRENCY | user_input |
| 15 | USDJPY=X | USD/JPY | JPY | CURRENCY | user_input |
| 19 | GBPUSD=X | GBP/USD Exchange Rate | USD | CURRENCY | FX_dependency |
| 20 | GBPJPY=X | GBP/JPY Exchange Rate | JPY | CURRENCY | FX_dependency |
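
The visual check above can also be done programmatically. The sketch below rebuilds a small subset of the manifest from rows shown in the outputs and confirms that each non-target currency has a matching `<TARGET><CCY>=X` pair. The helper `missing_fx_pairs` is hypothetical and the naming pattern is inferred from the tickers in the output.

```python
import pandas as pd

# Rows copied from the outputs above (a subset of the 21-row manifest).
manifest = pd.DataFrame(
    [
        ("AMZN", "USD", "EQUITY", "user_input"),
        ("ISF.L", "GBP", "ETF", "user_input"),
        ("USDJPY=X", "JPY", "CURRENCY", "user_input"),
        ("GBPUSD=X", "USD", "CURRENCY", "FX_dependency"),
        ("GBPJPY=X", "JPY", "CURRENCY", "FX_dependency"),
    ],
    columns=["ticker", "original_currency", "type", "source"],
)


def missing_fx_pairs(manifest, target_currency):
    """Return the expected FX tickers that are absent from the manifest."""
    currencies = set(manifest["original_currency"]) - {target_currency}
    expected = {f"{target_currency}{ccy}=X" for ccy in currencies}
    return sorted(expected - set(manifest["ticker"]))


# An empty list means every foreign currency has its conversion pair.
print(missing_fx_pairs(manifest, "GBP"))
```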
