# 📥 01_download.ipynb  
## S&P 500 Data Download

### 1. Objective  
We want to **automatically fetch** the daily price history of the S&P 500 index and save it as a CSV file for later analysis.

> _Like asking a librarian to print out the last 15 years of newspaper headlines so we can read them later._

---

### 2. Data Source & Why Stooq  
- We use **Stooq** because Yahoo Finance often rate-limits repeated requests.  
- Stooq provides the same daily Open/High/Low/Close/Volume fields and did not throttle our requests.

---

### 3. Steps  
1. **Import libraries** (`pandas_datareader`, `pandas`, `pathlib`).  
2. **Define paths** where we will save the data.  
3. **Download** the data from Stooq, **trim** to 2010–2025, and **sort** by date.  
4. **Save** the result as `data/raw/sp500.csv`.  
5. **Preview** the first 5 rows.
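
Step 3 can also be sketched as a small function that passes the date range to the reader itself instead of relying only on a later slice. This is a minimal sketch, not the notebook's own cell: `fetch_daily_prices` and its `reader` parameter are hypothetical names introduced here, and the sketch assumes that `pandas_datareader` falls back to a limited default window when no `start` is given.

```python
from typing import Callable, Optional

import pandas as pd


def fetch_daily_prices(
    symbol: str = "^SPX",
    start: str = "2010-01-01",
    end: str = "2025-05-20",
    reader: Optional[Callable[..., pd.DataFrame]] = None,
) -> pd.DataFrame:
    """Download daily prices for `symbol` and return them oldest-first.

    `reader` is a hypothetical hook: any callable with the same call shape
    as `pandas_datareader.data.DataReader` can be substituted for testing.
    """
    if reader is None:
        # Imported lazily so the function can be exercised offline.
        from pandas_datareader import data as pdr

        reader = pdr.DataReader

    # Ask the source for the full window up front (assumption: without
    # start/end the reader only requests a limited default window).
    df = reader(symbol, "stooq", start=start, end=end)

    # Sort chronologically, then trim defensively to the requested range.
    return df.sort_index().loc[start:end]
```

Calling `fetch_daily_prices()` with no arguments would request the same symbol and date range used in this notebook; the result can then be saved with `to_csv` as in step 4.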

In [None]:
# 01_download.ipynb
import pandas as pd
from pandas_datareader import data as pdr
from pathlib import Path
import sys

# Confirm the active environment (optional)
print("Using:", sys.executable)

# Define the output path
root = Path().resolve().parent      # project/sp500_dl
out = root / "data" / "raw" / "sp500.csv"
out.parent.mkdir(parents=True, exist_ok=True)

# Download from Stooq (no rate limit)
# NOTE: no start date is passed, so the reader uses its default window;
# the preview below begins in 2020 rather than 2010.
df = (
    pdr.DataReader("^SPX", "stooq")  # S&P 500 symbol on Stooq
       .sort_index()                 # chronological order
)

# Filter the date range
df = df.loc["2010-01-01":"2025-05-20"]

# Save
df.to_csv(out)

# Show the first 5 rows
df.head()


Using: c:\Users\Antho\.conda\envs\sp500_dl\python.exe


| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2020-05-26 | 3004.08 | 3021.72 | 2988.17 | 2991.77 | 3349414000.0 |
| 2020-05-27 | 3015.65 | 3036.25 | 2969.75 | 3036.13 | 3652902000.0 |
| 2020-05-28 | 3046.61 | 3068.67 | 3023.4 | 3029.73 | 3136898000.0 |
| 2020-05-29 | 3025.17 | 3049.17 | 2998.61 | 3044.31 | 4372836000.0 |
| 2020-06-01 | 3038.78 | 3062.18 | 3031.54 | 3055.73 | 2494242000.0 |
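
Since the CSV is meant for later analysis, it may help to confirm that it reads back cleanly. The following is a minimal sketch, assuming the file was written by `df.to_csv(out)` with the index labelled `Date` as in the preview; `load_prices` is a hypothetical helper name, not part of the notebook.

```python
from pathlib import Path

import pandas as pd


def load_prices(csv_path: Path) -> pd.DataFrame:
    """Read the saved CSV back into a date-indexed DataFrame."""
    # parse_dates=True converts the index column into a DatetimeIndex.
    df = pd.read_csv(csv_path, index_col="Date", parse_dates=True)

    # Basic sanity checks before any analysis relies on the file.
    if not df.index.is_monotonic_increasing:
        raise ValueError("rows are not in chronological order")
    if df.index.has_duplicates:
        raise ValueError("duplicate dates found")
    return df
```

In this notebook the call would be `load_prices(out)`; comparing `df.index.min()` of the result with the intended start date is a quick way to check that the file covers the expected range.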
