# Lecture 2 — Homework (Instructions Only)

**Goal:** Practice Python containers (lists, dicts, NumPy arrays) and core pandas workflows (Series, DataFrames, filtering, missing values, duplicates, groupby, reshape, merge) using **real financial data** from Peru and the US.

**Rules**
- Submit a single `.ipynb` that runs top-to-bottom with **Run All**.
- Use **real data** via the loaders you built in the Lecture 2 practice notebook (BCRP API + Yahoo Finance).
- Do **not** include plots in this homework.

---

## Part 1 — Series (containers → Series)

### Task 1.1 — From list to Series (Peru FX)
1. Fetch the PEN/USD buy and sell exchange rates from BCRP for a recent 2–5 year window.
2. Compute the mid-rate: `mid = (buy + sell) / 2`.
3. Take the **last 20 values** of `mid` as a Python list.
4. Build a `pd.Series` from that list and name it `PENUSD_mid_last20`.
5. Report (in markdown): number of observations, mean, min, max.
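
The steps above can be sketched as follows. This is a minimal illustration, not the expected solution: the `fx` DataFrame and its numbers are hypothetical stand-ins for whatever your BCRP loader returns, and the column names `buy` and `sell` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for BCRP output (made-up numbers, NOT real quotes).
dates = pd.date_range("2024-01-01", periods=30, freq="B")
fx = pd.DataFrame(
    {
        "buy": [3.70 + 0.001 * i for i in range(30)],
        "sell": [3.71 + 0.001 * i for i in range(30)],
    },
    index=dates,
)

# Mid-rate as defined in the task.
fx["mid"] = (fx["buy"] + fx["sell"]) / 2

# Last 20 values as a plain Python list (the date index is dropped here).
last20 = fx["mid"].tail(20).tolist()

# Series built from the list: it gets a default 0..19 integer index.
PENUSD_mid_last20 = pd.Series(last20, name="PENUSD_mid_last20")

# Numbers to quote in the markdown report.
summary = {
    "n": int(PENUSD_mid_last20.count()),
    "mean": float(PENUSD_mid_last20.mean()),
    "min": float(PENUSD_mid_last20.min()),
    "max": float(PENUSD_mid_last20.max()),
}
print(summary)
```

Going through a list discards the dates, so the resulting Series can no longer be aligned with other time series by date.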

### Task 1.2 — From NumPy array to Series (US market)
1. Download daily data for `GLD` from Yahoo Finance for a recent 3–7 year window.
2. Extract the `close` column as a NumPy array.
3. Build a `pd.Series` indexed by the corresponding dates.
4. Report (in markdown): mean, min, max, and the date range.
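
A possible shape for these steps is shown below. The `gld` DataFrame is a hypothetical placeholder with made-up prices; the column names `date` and `close` are assumptions about what your Yahoo Finance loader returns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the GLD download (made-up prices).
dates = pd.date_range("2023-01-02", periods=5, freq="B")
gld = pd.DataFrame({"date": dates, "close": [170.0, 171.5, 169.8, 172.2, 173.0]})

# Step 2: the close column as a NumPy array (no dates attached).
close_arr = gld["close"].to_numpy()

# Step 3: reattach the dates as the index of the Series.
gld_close = pd.Series(close_arr, index=pd.DatetimeIndex(gld["date"]), name="GLD_close")

# Step 4: numbers for the markdown report.
date_range = (gld_close.index.min(), gld_close.index.max())
print(gld_close.mean(), gld_close.min(), gld_close.max(), date_range)
```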

### Task 1.3 — From dict to Series (last available close)
1. Choose **three** Yahoo Finance tickers (example: `SPY`, `QQQ`, `IWM`).
2. For each ticker, compute the **last available close** in your sample.
3. Store them in a dict `{ticker: last_close}`.
4. Convert to a `pd.Series`, sort descending, and report the top ticker.
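
One way to organize this, assuming your market data is in long format with columns `date`, `ticker`, and `close` (an assumption; adapt to your loader). The prices below are placeholders, not real quotes.

```python
import pandas as pd

# Hypothetical long-format sample (made-up prices).
prices = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-02", "2024-01-03"] * 3),
        "ticker": ["SPY", "SPY", "QQQ", "QQQ", "IWM", "IWM"],
        "close": [470.0, 468.0, 400.0, 396.0, 198.0, 196.0],
    }
)

# Last available close per ticker, stored in a dict {ticker: last_close}.
last_close = (
    prices.sort_values("date").groupby("ticker")["close"].last().to_dict()
)

# Dict -> Series (keys become the index), sorted from highest to lowest.
last_close_s = pd.Series(last_close, name="last_close").sort_values(ascending=False)

top_ticker = last_close_s.index[0]
print(last_close_s)
print("Top ticker:", top_ticker)
```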

### Task 1.4 — Why alignment matters (short explanation)
Write a short markdown explanation (5–8 lines) comparing:
- a pandas merge/concat that aligns on dates (safe), versus
- truncating two NumPy arrays to the same length (unsafe).

Use your own examples from Tasks 1.1 and 1.2.
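
To see the difference concretely, consider the toy sketch below. The two series are invented for illustration: `b` is missing one date that `a` has, which is the typical situation when two markets have different holidays.

```python
import pandas as pd

# Two made-up series; b has no observation on 2024-01-02.
a = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
    name="a",
)
b = pd.Series(
    [10.0, 30.0, 40.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-04"]),
    name="b",
)

# Safe: concat aligns on the date index and keeps only the common dates.
aligned = pd.concat([a, b], axis=1, join="inner")

# Unsafe: truncating the raw arrays pairs values by position, not by date.
n = min(len(a), len(b))
truncated = pd.DataFrame({"a": a.to_numpy()[:n], "b": b.to_numpy()[:n]})

print(aligned)    # a = 1, 3, 4 paired with b = 10, 30, 40 (same dates)
print(truncated)  # a = 1, 2, 3 paired with b = 10, 30, 40 (dates mismatched)
```

In the truncated version the second row pairs the 2024-01-02 value of `a` with the 2024-01-03 value of `b`, and nothing in the output warns about it.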

---

## Part 2 — DataFrames (core operations)

### Task 2.1 — Ticker metadata table
Create a DataFrame with:
- `ticker`
- `last_close`
- `market` (set to `"US"` for all rows)

Use the same tickers you picked in Task 1.3.

### Task 2.2 — Filtering (percentiles within each ticker)
Using your US market dataset with at least 2 tickers:
1. For each ticker, compute the **95th percentile** of `close`.
2. Keep only rows where `close` is strictly above the ticker’s 95th percentile.
3. Report: how many rows remain per ticker (a small table is enough).
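
A sketch of the per-ticker filter, using simulated prices in place of real data. The tickers `AAA` and `BBB` and the random-walk prices are hypothetical; `groupby(...).transform(...)` is one option among several for broadcasting the percentile back to each row.

```python
import numpy as np
import pandas as pd

# Simulated stand-in for the US market dataset (NOT real prices).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(
    {
        "ticker": ["AAA"] * n + ["BBB"] * n,
        "close": np.concatenate(
            [100 + rng.normal(size=n).cumsum(), 50 + rng.normal(size=n).cumsum()]
        ),
    }
)

# 95th percentile of close within each ticker, repeated on every row of that ticker.
p95 = df.groupby("ticker")["close"].transform(lambda s: s.quantile(0.95))

# Keep rows strictly above the ticker's own threshold.
above = df[df["close"] > p95]

# Small table: rows remaining per ticker.
counts = above.groupby("ticker").size()
print(counts)
```

Because the threshold is computed inside each ticker, a high-priced ticker cannot crowd out a low-priced one, which is what would happen with a single global percentile.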

### Task 2.3 — Missing values (controlled experiment)
1. Make a copy of your US market DataFrame.
2. With a fixed random seed, set a random **1% of the `volume` values** to missing (`NaN`).
3. Create two cleaned versions:
   - one that drops rows with missing `volume`
   - one that fills missing `volume` with the **ticker-specific median**
4. Compare shapes and report (in markdown): which method keeps more information and why.
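
The experiment could look roughly like this. The data is simulated, the seed value `42` is an arbitrary choice, and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Simulated stand-in for the US market DataFrame (NOT real volumes).
rng = np.random.default_rng(42)  # fixed seed so Run All is reproducible
n = 500
df = pd.DataFrame(
    {
        "ticker": np.repeat(["AAA", "BBB"], n),
        "volume": rng.integers(1_000, 10_000, size=2 * n).astype(float),
    }
)

# Work on a copy and blank out 1% of the volume values.
damaged = df.copy()
n_missing = int(round(0.01 * len(damaged)))
missing_idx = rng.choice(damaged.index.to_numpy(), size=n_missing, replace=False)
damaged.loc[missing_idx, "volume"] = np.nan

# Version A: drop rows with missing volume.
dropped = damaged.dropna(subset=["volume"])

# Version B: fill with the median volume of the same ticker.
ticker_median = damaged.groupby("ticker")["volume"].transform("median")
filled = damaged.copy()
filled["volume"] = filled["volume"].fillna(ticker_median)

print(df.shape, dropped.shape, filled.shape)
```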

### Task 2.4 — Duplicates
1. Create a new DataFrame by stacking the last 5 rows of your US market DataFrame twice (10 rows in total).
2. Detect duplicates with `.duplicated()`.
3. Remove duplicates with `.drop_duplicates()`.
4. Verify row counts before/after.
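
A minimal sketch of the duplicate workflow on a made-up DataFrame (the columns and values are placeholders for your own data):

```python
import pandas as pd

# Made-up sample standing in for the market DataFrame.
df = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=8, freq="D"),
        "ticker": "AAA",
        "close": [float(i) for i in range(8)],
    }
)

# Stack the last 5 rows twice; ignore_index gives a fresh 0..9 index.
last5 = df.tail(5)
stacked = pd.concat([last5, last5], ignore_index=True)

# duplicated() flags the second and later occurrences of each repeated row.
dup_mask = stacked.duplicated()

# drop_duplicates() keeps the first occurrence by default.
deduped = stacked.drop_duplicates()

print(len(stacked), int(dup_mask.sum()), len(deduped))
```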

### Task 2.5 — Groupby summary
Compute a table grouped by `ticker` with:
- `mean_close`
- `std_close`
- `max_volume`

Sort by `mean_close` descending and write 3 bullet points interpreting the table.
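
The summary table can be built with named aggregation, as in the sketch below. The tiny dataset is invented so the results are easy to check by hand; your table will use the real market data.

```python
import pandas as pd

# Invented data, small enough to verify by hand.
df = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
        "close": [10.0, 12.0, 14.0, 20.0, 22.0, 24.0],
        "volume": [100, 300, 200, 50, 70, 60],
    }
)

# Named aggregation: output_column=(input_column, function).
summary = (
    df.groupby("ticker")
    .agg(
        mean_close=("close", "mean"),
        std_close=("close", "std"),
        max_volume=("volume", "max"),
    )
    .sort_values("mean_close", ascending=False)
)
print(summary)
```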

### Task 2.6 — Reshape (long ↔ wide)
1. Create a wide pivot table of `close` with:
   - index = `date`
   - columns = `ticker`
   - values = `close`
2. Keep only the first 30 dates.
3. Melt the wide table back to long format with columns `date`, `ticker`, `close`.
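
A sketch of the round trip from long to wide and back, on generated data. It uses `DataFrame.pivot`, which assumes one row per `(date, ticker)` pair; `pivot_table` is an alternative if your data may contain repeated pairs.

```python
import pandas as pd

# Generated long-format sample (placeholder values, NOT real prices).
dates = pd.date_range("2024-01-01", periods=40, freq="D")
long_df = pd.DataFrame(
    {
        "date": list(dates) * 2,
        "ticker": ["AAA"] * 40 + ["BBB"] * 40,
        "close": [float(i) for i in range(80)],
    }
)

# Long -> wide: one row per date, one column per ticker.
wide = long_df.pivot(index="date", columns="ticker", values="close").sort_index()

# Keep only the first 30 dates.
wide30 = wide.head(30)

# Wide -> long: date goes back to a column, ticker columns are stacked into rows.
back = wide30.reset_index().melt(id_vars="date", var_name="ticker", value_name="close")

print(wide30.shape, back.shape)
```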

---

## Part 3 — Merge (Peru macro + US market)

### Task 3.1 — Monthly merge (policy rate vs commodity ETF)
1. Fetch BCRP **policy rate** for a multi-year window (daily series is fine).
2. Convert both datasets (policy rate and GLD) to **monthly frequency** using:
   - `year` and `month` extracted from dates
   - monthly mean aggregation
3. For **GLD**, the monthly value is the average `close`.
4. Merge policy rate monthly with GLD monthly on `(year, month)`.
5. Save your final table to:
   - `outputs/lecture2_hw_policy_gld_monthly.csv`

In markdown: explain (3–6 lines) what each column represents and what a student could explore next (no plots required).
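
The monthly pipeline above can be sketched as follows. Everything here is illustrative: the input frames hold made-up numbers (not real BCRP or GLD values), and the helper `to_monthly` and the output column names `policy_rate_mean` and `gld_close_mean` are hypothetical choices, not names required by the assignment.

```python
from pathlib import Path

import pandas as pd

# Made-up daily policy rate (placeholder values).
days = pd.date_range("2024-01-01", "2024-03-31", freq="D")
policy = pd.DataFrame({"date": days, "policy_rate": 6.5})
policy.loc[policy["date"] >= pd.Timestamp("2024-02-01"), "policy_rate"] = 6.25

# Made-up business-day GLD closes (placeholder values).
bdays = pd.date_range("2024-01-01", "2024-03-31", freq="B")
gld = pd.DataFrame(
    {"date": bdays, "close": [180.0 + 0.1 * i for i in range(len(bdays))]}
)


def to_monthly(df, value_col, out_col):
    """Average value_col within each (year, month) extracted from df['date']."""
    tmp = df.assign(year=df["date"].dt.year, month=df["date"].dt.month)
    monthly = tmp.groupby(["year", "month"], as_index=False)[value_col].mean()
    return monthly.rename(columns={value_col: out_col})


policy_m = to_monthly(policy, "policy_rate", "policy_rate_mean")
gld_m = to_monthly(gld, "close", "gld_close_mean")

# Merge on the (year, month) key; validate guards against duplicated keys.
merged = policy_m.merge(gld_m, on=["year", "month"], how="inner", validate="one_to_one")

# Create the folder first so Run All works on a fresh checkout.
out_path = Path("outputs") / "lecture2_hw_policy_gld_monthly.csv"
out_path.parent.mkdir(parents=True, exist_ok=True)
merged.to_csv(out_path, index=False)
print(merged)
```

Merging on `(year, month)` rather than on raw dates avoids losing rows when the two sources publish on different calendars.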

---

## Deliverables checklist
- All tasks completed with short markdown notes.
- Notebook runs with **Run All**.
- CSV saved to `outputs/lecture2_hw_policy_gld_monthly.csv`.
