## Lecture 4 — Data (BCRP + Yahoo) + Plots + Stats + VaR 

Reproduce the key parts of the lecture notebook using:

- **Peru (BCRP API)**: `PD04637PD`, `PD04639PD`, `PD04704XD`, `PD04701XD`  
  *(FX + commodities exactly as in the notebook)*
- **USA (yfinance)**: `SPY`, `TLT`, `GLD`

**Deliverables**
- Multiple **plots** (including **one with annotations**)  
- A **summary statistics table**  
- **Historical 95% VaR** for a **60/40 portfolio** (SPY/TLT)

1. Build (and display) the **BCRPData API URL** that requests the 4 series used in the notebook.  


In [None]:
import pandas as pd

# 1. Define the codes of the requested series
series_codes = ["PD04637PD", "PD04639PD", "PD04704XD", "PD04701XD"]

# Join the codes with hyphens into a single path segment
series_string = "-".join(series_codes)

# Output format and date range
format_out = "json"
start_date = "2022-01-01"
end_date = "2024-12-31"

# Build the full URL
base_url = "https://estadisticas.bcrp.gob.pe/estadisticas/series/api"
api_url = f"{base_url}/{series_string}/{format_out}/{start_date}/{end_date}"

print("BCRPData API URL:")
print(api_url)

2. Download those series and build a **tidy** table: `date`, `series`, `value`.  


In [None]:
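import requests
import pandas as pd

# Request the series and fail early on network or HTTP errors
response = requests.get(api_url, timeout=30)
response.raise_for_status()
data = response.json()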

periods = data['periods']
tidy_data = []

meses_esp = {
    'Ene': 'Jan', 'Feb': 'Feb', 'Mar': 'Mar', 'Abr': 'Apr', 
    'May': 'May', 'Jun': 'Jun', 'Jul': 'Jul', 'Ago': 'Aug', 
    'Set': 'Sep', 'Oct': 'Oct', 'Nov': 'Nov', 'Dic': 'Dec'
}

for period in periods:
    date_str = period['name'] 
    
    # Translate the Spanish month abbreviation so pandas can parse the date
    for esp, eng in meses_esp.items():
        if esp in date_str:
            date_str = date_str.replace(esp, eng)
            break
    
    # Assumes the values arrive in the same order as the codes in series_codes
    values = period['values']
    for i, val in enumerate(values):
        try:
            clean_val = float(val)
        except (ValueError, TypeError):
            clean_val = None
            
        tidy_data.append({
            'date': date_str,
            'series': series_codes[i],
            'value': clean_val
        })

df_tidy = pd.DataFrame(tidy_data)

# Parse dates such as "03.Jan.22" (day.month.two-digit year)
df_tidy['date'] = pd.to_datetime(df_tidy['date'], format='%d.%b.%y')

print("Tidy table:")
print(df_tidy.head())

3. Clean to **wide format** with columns: `fx_interbank`, `fx_sbs`, `gold`, `copper` (as in the notebook).  


In [None]:
df_wide = df_tidy.pivot(index='date', columns='series', values='value')
# Rename the series codes to descriptive column names
df_wide = df_wide.rename(columns={
    "PD04637PD": "fx_interbank",
    "PD04639PD": "fx_sbs",
    "PD04704XD": "gold",
    "PD04701XD": "copper"
})

# Sort by date and keep only the days on which all four series have a value
df_wide = df_wide.sort_index().dropna()

print("Peru (wide format):")
print(df_wide.head())

4. Download `SPY`, `TLT`, `GLD` from yfinance and build: `date`, `ticker`, `close`.  


In [None]:
import yfinance as yf

tickers = ['SPY', 'TLT', 'GLD']

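# Download daily closing prices (yfinance treats `end` as exclusive)
usa_data = yf.download(tickers, start="2020-01-01", end="2024-12-31")['Close']

# Reshape from wide (one column per ticker) to tidy: date, ticker, close
df_usa_tidy = usa_data.stack().reset_index()
df_usa_tidy.columns = ['date', 'ticker', 'close']

print("\nUSA table (yfinance):")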
print(df_usa_tidy.head())

5. Compute **daily returns** by ticker (`ret`) and validate there are **no inf values**.  


In [None]:
import numpy as np

# Sort so that pct_change compares consecutive trading days within each ticker
df_usa_tidy = df_usa_tidy.sort_values(['ticker', 'date'])
df_usa_tidy['ret'] = df_usa_tidy.groupby('ticker')['close'].pct_change()

# Drop the first observation of each ticker (no previous price, so ret is NaN)
df_usa_tidy = df_usa_tidy.dropna(subset=['ret'])

# Validate that there are no infinite values
inf_count = np.isinf(df_usa_tidy['ret']).sum()
print(f"Number of infinite values: {inf_count}")

# Remove infinite values if any exist; .copy() avoids chained-assignment warnings later
df_usa_tidy = df_usa_tidy[~np.isinf(df_usa_tidy['ret'])].copy()

print("\nComputed returns:")
print(df_usa_tidy.head())

6. *(Quantities)* Compare FX levels in Peru: produce a **plot** and a short comment.  


In [None]:
import matplotlib.pyplot as plt

# Plot FX levels (Peru)
plt.figure(figsize=(10, 5))
plt.plot(df_wide.index, df_wide['fx_interbank'], label='Interbank FX', alpha=0.8)
plt.plot(df_wide.index, df_wide['fx_sbs'], label='SBS FX', linestyle='--', alpha=0.8)

plt.title('Exchange rate time series')
plt.xlabel('Date')
plt.ylabel('PEN per USD')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

7. *(Proportions)* Compute the **share of positive-return days** by ticker (USA).  


In [None]:
# Boolean column: True when the daily return is strictly positive
df_usa_tidy['pos_ret'] = df_usa_tidy['ret'] > 0

# The mean of a boolean column is the share of True values, computed per ticker
positive_shares = df_usa_tidy.groupby('ticker')['pos_ret'].mean()

print(positive_shares)

8. Plot that share as a **bar chart** and add **labels above each bar** (`annotate`).  


In [None]:
# Bar chart with labels (annotate)
plt.figure(figsize=(8, 6))
bars = plt.bar(positive_shares.index, positive_shares.values, color=['green','blue','black'])

# Add a percentage label above each bar
for bar in bars:
    yval = bar.get_height()
    plt.annotate(f'{yval:.2%}', 
                 xy=(bar.get_x() + bar.get_width() / 2, yval),
                 xytext=(0, 5),
                 textcoords="offset points",
                 ha='center', va='bottom', fontsize=11, fontweight='bold')

plt.title('Share of Positive-Return Days (USA Assets)')
plt.ylabel('Share of days')
# Leave headroom above the tallest bar so its label is not clipped
plt.ylim(0, positive_shares.max() * 1.15)
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.show()

9. *(Distributions)* Compare the distribution of **Peru Gold** vs **GLD** (histogram).  
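
One possible way to approach this comparison is sketched below. It uses synthetic stand-ins for `df_wide["gold"]` and the GLD close series (an assumption, so that the cell runs without network access), and it compares daily percentage changes rather than levels, because the two series are quoted on different scales. The names `gold_pe`, `gld`, and `rets` are hypothetical.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# ASSUMPTION: synthetic stand-ins for df_wide["gold"] (BCRP) and the GLD close
# (yfinance). Swap in the real series when running the notebook.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=250)
gold_pe = pd.Series(1800 * np.exp(np.cumsum(rng.normal(0, 0.009, 250))), index=dates)
gld = pd.Series(170 * np.exp(np.cumsum(rng.normal(0, 0.009, 250))), index=dates)

# Levels are on different scales, so compare daily percentage changes instead
rets = pd.DataFrame({"gold_pe": gold_pe.pct_change(), "gld": gld.pct_change()}).dropna()

# Shared bin edges make the two histograms directly comparable
bins = np.linspace(rets.min().min(), rets.max().max(), 31)

fig, ax = plt.subplots(figsize=(10, 5))
ax.hist(rets["gold_pe"], bins=bins, alpha=0.5, density=True, label="Peru gold (BCRP)")
ax.hist(rets["gld"], bins=bins, alpha=0.5, density=True, label="GLD (yfinance)")
ax.set_title("Daily returns: Peru gold vs GLD (synthetic illustration)")
ax.set_xlabel("Daily return")
ax.set_ylabel("Density")
ax.legend()
plt.show()
```

Using `density=True` keeps the two histograms comparable even if the series have different numbers of observations.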


10. Add an **ECDF** (if used in the notebook) and comment on what changes vs the histogram.  
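
A minimal ECDF sketch follows, assuming the inputs are daily returns with no missing values. The helper `ecdf` is hypothetical (not taken from the lecture notebook), and the data are synthetic stand-ins.

```python
import numpy as np
import matplotlib.pyplot as plt

def ecdf(values):
    """Return the sorted values and the share of observations <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

# ASSUMPTION: synthetic daily returns standing in for Peru gold and GLD
rng = np.random.default_rng(1)
series = {
    "Peru gold (BCRP)": rng.normal(0, 0.009, 250),
    "GLD (yfinance)": rng.normal(0, 0.010, 250),
}

fig, ax = plt.subplots(figsize=(10, 5))
for name, r in series.items():
    x, y = ecdf(r)
    ax.step(x, y, where="post", label=name)
ax.axhline(0.05, color="grey", linestyle=":", label="5% level")
ax.set_xlabel("Daily return")
ax.set_ylabel("Cumulative share of days")
ax.set_title("ECDF of daily returns (synthetic illustration)")
ax.legend()
plt.show()
```

Unlike the histogram, the ECDF does not depend on a choice of bins, and quantiles can be read directly off the vertical axis (for example, the return at the 5% level).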


11. *(Relationships)* Build `FX_change` and relate it to `SPY_ret` (scatter plot).  
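
The sketch below shows one way to build the two variables and align them by date. It assumes `FX_change` means the daily percentage change of the interbank rate (the notebook may define it differently) and uses synthetic stand-ins for `df_wide["fx_interbank"]` and the SPY returns.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# ASSUMPTION: synthetic stand-ins for df_wide["fx_interbank"] and SPY daily returns
rng = np.random.default_rng(2)
pe_dates = pd.bdate_range("2022-01-03", periods=250)
us_dates = pe_dates[5:]  # the two calendars rarely cover exactly the same days
fx = pd.Series(3.8 + np.cumsum(rng.normal(0, 0.01, 250)), index=pe_dates)
spy_ret = pd.Series(rng.normal(0, 0.01, len(us_dates)), index=us_dates, name="SPY_ret")

# FX_change: daily percentage change of the PEN/USD rate (assumed definition)
fx_change = fx.pct_change().rename("FX_change")

# Inner join keeps only the dates present in both series
merged = pd.concat([fx_change, spy_ret], axis=1, join="inner").dropna()

fig, ax = plt.subplots(figsize=(7, 6))
ax.scatter(merged["SPY_ret"], merged["FX_change"], alpha=0.5)
ax.axhline(0, color="grey", lw=0.8)
ax.axvline(0, color="grey", lw=0.8)
ax.set_xlabel("SPY_ret")
ax.set_ylabel("FX_change")
ax.set_title("FX_change vs SPY_ret (synthetic illustration)")
plt.show()
```

The inner join matters because Peruvian and US markets have different holidays; without it, unmatched dates would appear as missing values.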


12. Compute the **correlation** between `FX_change` and `SPY_ret` and explain the sign.  
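
A short sketch of the correlation step, using synthetic data in which a negative relationship is built in on purpose (an assumption made only so the example has a known sign; the real data may behave differently).

```python
import numpy as np
import pandas as pd

# ASSUMPTION: synthetic data with a negative link built in on purpose
rng = np.random.default_rng(3)
n = 250
spy_ret = rng.normal(0, 0.01, n)
fx_change = -0.2 * spy_ret + rng.normal(0, 0.003, n)
merged = pd.DataFrame({"FX_change": fx_change, "SPY_ret": spy_ret})

# Pearson correlation between the two daily series
corr = merged["FX_change"].corr(merged["SPY_ret"])
print(f"corr(FX_change, SPY_ret) = {corr:.3f}")
```

When reading the sign on real data: with the rate quoted as PEN per USD, a negative correlation would mean the sol tends to weaken (the rate rises) on days when SPY falls, while a positive one would mean the opposite.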


13. Estimate a simple regression `FX_change ~ SPY_ret` and interpret the coefficient.  
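
The regression can be sketched with ordinary least squares via `numpy.polyfit`. This is a stand-in chosen to keep dependencies minimal; the notebook may use a different library. The data are synthetic, with a true slope of -0.2 set by construction.

```python
import numpy as np

# ASSUMPTION: synthetic data where the true slope is -0.2 by construction
rng = np.random.default_rng(4)
n = 250
spy_ret = rng.normal(0, 0.01, n)
fx_change = -0.2 * spy_ret + rng.normal(0, 0.003, n)

# Fit FX_change = intercept + slope * SPY_ret by least squares
slope, intercept = np.polyfit(spy_ret, fx_change, deg=1)

# R-squared: share of the variance of FX_change explained by the fitted line
fitted = intercept + slope * spy_ret
ss_res = np.sum((fx_change - fitted) ** 2)
ss_tot = np.sum((fx_change - fx_change.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"intercept = {intercept:.5f}, slope = {slope:.3f}, R^2 = {r_squared:.3f}")
```

The slope reads as the average change in `FX_change` associated with a one-unit change in `SPY_ret`; a slope of -0.2 would mean that a 1 percentage point rise in SPY goes with a 0.2 percentage point fall in the exchange rate on average.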


14. *(Pandas)* Do a selection exercise: `.iloc` (position-based) vs conditional filtering.  
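
The difference between the two selection styles can be illustrated on a small synthetic table (the frame `df` and the 1% threshold are assumptions for illustration).

```python
import numpy as np
import pandas as pd

# ASSUMPTION: small synthetic returns table shaped like df_usa_tidy
rng = np.random.default_rng(5)
dates = pd.bdate_range("2024-01-01", periods=20)
df = pd.DataFrame({"date": dates, "ticker": "SPY", "ret": rng.normal(0, 0.01, 20)})

# Position-based selection: rows and columns are chosen by integer position
first_five = df.iloc[:5]       # rows 0 to 4
last_row_ret = df.iloc[-1, 2]  # last row, third column ("ret")

# Conditional filtering: rows are chosen by a boolean mask
big_moves = df[df["ret"].abs() > 0.01]

# Several conditions combine with & and |, each wrapped in parentheses
late_gains = df.loc[(df["ret"] > 0) & (df["date"] >= "2024-01-15"), ["date", "ret"]]

print(first_five)
print(big_moves)
print(late_gains)
```

`.iloc` ignores the content of the data and depends on row order, so sorting the frame changes what it returns; a boolean filter returns the same rows regardless of order.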


15. Create missing data on purpose in one series and apply imputation (as in the notebook).  
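
A sketch of this exercise on a synthetic price series is shown below. Forward fill and linear interpolation are two common imputation choices; which one the lecture notebook uses is not stated here, so treat both as assumptions.

```python
import numpy as np
import pandas as pd

# ASSUMPTION: synthetic price series standing in for one of the real series
rng = np.random.default_rng(6)
dates = pd.bdate_range("2024-01-01", periods=60)
close = pd.Series(100 + np.cumsum(rng.normal(0, 1, 60)), index=dates, name="close")

# Remove 8 interior observations on purpose (endpoints are kept)
with_gaps = close.copy()
missing_pos = rng.choice(np.arange(1, 59), size=8, replace=False)
with_gaps.iloc[missing_pos] = np.nan

# Two imputation options side by side
imputed = pd.DataFrame({
    "original": close,
    "with_gaps": with_gaps,
    "ffill": with_gaps.ffill(),                        # carry the last value forward
    "linear": with_gaps.interpolate(method="linear"),  # straight line between neighbors
})

print(imputed.isna().sum())
print(imputed.iloc[np.sort(missing_pos)])
```

Keeping the original column next to the imputed ones makes it easy to check how far each method lands from the true value on the days that were removed.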


16. Standardize a variable (z-score) and plot **before vs after**.  
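
Standardization subtracts the mean and divides by the standard deviation, $z_t = (x_t - \bar{x}) / s_x$. A minimal sketch on a synthetic series (a stand-in for any column of `df_wide`):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# ASSUMPTION: synthetic level series standing in for a real column
rng = np.random.default_rng(7)
dates = pd.bdate_range("2022-01-03", periods=250)
gold = pd.Series(1900 + np.cumsum(rng.normal(0, 10, 250)), index=dates)

# z-score: subtract the sample mean, divide by the sample standard deviation
z = (gold - gold.mean()) / gold.std()

fig, (ax_before, ax_after) = plt.subplots(1, 2, figsize=(12, 4))
ax_before.plot(gold.index, gold.values)
ax_before.set_title("Before: original units")
ax_after.plot(z.index, z.values, color="tab:orange")
ax_after.axhline(0, color="grey", lw=0.8)
ax_after.set_title("After: z-score")
fig.autofmt_xdate()
plt.show()

print(f"mean(z) = {z.mean():.2e}, sd(z) = {z.std():.3f}")
```

The shape of the line is identical in both panels; only the vertical axis changes, now measured in standard deviations from the mean.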


17. Find the day with the largest `|SPY_ret|` and **annotate it** in the returns plot (like the exercise).  
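
One way to locate and annotate the extreme day is sketched below, with synthetic returns standing in for the SPY series (an assumption). The names `extreme_date` and `extreme_val` are hypothetical.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# ASSUMPTION: synthetic daily returns standing in for SPY
rng = np.random.default_rng(8)
dates = pd.bdate_range("2022-01-03", periods=250)
spy_ret = pd.Series(rng.normal(0, 0.01, 250), index=dates, name="SPY_ret")

# Date of the largest absolute return and the signed return on that day
extreme_date = spy_ret.abs().idxmax()
extreme_val = spy_ret.loc[extreme_date]

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(spy_ret.index, spy_ret.values, lw=0.8)
ax.scatter([extreme_date], [extreme_val], color="red", zorder=3)
ax.annotate(
    f"{extreme_date:%Y-%m-%d}\n{extreme_val:.2%}",
    xy=(extreme_date, extreme_val),
    xytext=(30, 30),
    textcoords="offset points",
    arrowprops=dict(arrowstyle="->"),
)
ax.set_title("SPY daily returns with the largest move annotated (synthetic)")
ax.set_ylabel("Daily return")
plt.show()
```

Taking `.abs()` before `.idxmax()` finds the largest move in either direction, while `extreme_val` keeps the sign so the label shows whether it was a gain or a loss.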


18. Save one figure into `/figures` using `savefig` and verify the file exists.  
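
A minimal sketch of saving and verifying a figure. It assumes `/figures` refers to a `figures` folder relative to the working directory (writing to the filesystem root usually needs elevated permissions); the file name `example_plot.png` is hypothetical.

```python
from pathlib import Path
import matplotlib.pyplot as plt

# ASSUMPTION: "figures" is a folder next to the notebook, created if missing
fig_dir = Path("figures")
fig_dir.mkdir(parents=True, exist_ok=True)

# Any figure works here; this one is a placeholder
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1, 2], [0, 1, 4], marker="o")
ax.set_title("Placeholder figure")

# Save, then close the figure to free memory
out_path = fig_dir / "example_plot.png"
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)

# Verify that the file exists and is not empty
print(f"Saved: {out_path} | exists: {out_path.exists()} | bytes: {out_path.stat().st_size}")
```

Calling `savefig` on the figure object rather than on `plt` avoids saving an empty canvas when the call comes after `plt.show()`.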


19. Build a **summary stats table** for returns (mean, sd, p5, p95, etc.).  
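
One compact way to build the table is `groupby(...).describe(...)` with custom percentiles, sketched here on a synthetic tidy frame shaped like `df_usa_tidy` (the data and the column renames are assumptions).

```python
import numpy as np
import pandas as pd

# ASSUMPTION: synthetic tidy returns shaped like df_usa_tidy (date, ticker, ret)
rng = np.random.default_rng(9)
dates = pd.bdate_range("2024-01-01", periods=250)
frames = [
    pd.DataFrame({"date": dates, "ticker": t, "ret": rng.normal(0.0003, s, 250)})
    for t, s in [("SPY", 0.010), ("TLT", 0.009), ("GLD", 0.008)]
]
df = pd.concat(frames, ignore_index=True)

# describe() returns count, mean, std, min, the requested percentiles, and max
summary = (
    df.groupby("ticker")["ret"]
      .describe(percentiles=[0.05, 0.95])
      .rename(columns={"count": "n", "std": "sd", "5%": "p5",
                       "50%": "median", "95%": "p95"})
)

# Put the columns in a reading-friendly order
summary = summary[["n", "mean", "sd", "min", "p5", "median", "p95", "max"]]
print(summary.round(4))
```

The `p5` column is also a quick cross-check for a value-at-risk calculation, since a historical 95% VaR for a single asset is the negative of its 5th percentile return.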


20. Compute **historical 95% VaR** for a **60/40 portfolio (SPY/TLT)** and explain what it means.
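
Historical VaR can be sketched as the negative of the 5th percentile of the portfolio's daily returns. The example assumes a portfolio rebalanced daily to 60/40 weights and uses synthetic returns; with the real data, a wide returns table could come from `df_usa_tidy.pivot(index="date", columns="ticker", values="ret")`.

```python
import numpy as np
import pandas as pd

# ASSUMPTION: synthetic daily returns standing in for SPY and TLT
rng = np.random.default_rng(10)
dates = pd.bdate_range("2023-01-02", periods=500)
rets = pd.DataFrame(
    {"SPY": rng.normal(0.0004, 0.010, 500), "TLT": rng.normal(0.0001, 0.009, 500)},
    index=dates,
)

# ASSUMPTION: weights are reset to 60/40 every day (daily rebalancing)
weights = pd.Series({"SPY": 0.6, "TLT": 0.4})
port_ret = rets.mul(weights, axis=1).sum(axis=1)

# Historical 95% VaR: the loss exceeded on only 5% of the days in the sample
q05 = port_ret.quantile(0.05)
var_95 = -q05

breach_share = (port_ret < q05).mean()
print(f"5th percentile return: {q05:.4%}")
print(f"Historical 95% one-day VaR: {var_95:.4%} of portfolio value")
print(f"Share of days with a larger loss: {breach_share:.2%}")
```

Read as a statement about the sample: on 95% of the observed days the portfolio lost no more than `var_95`, and on the remaining 5% it lost more. The figure says nothing about how large those worse losses were.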
