# 📘 Pandas Homework (7 Days) — With Self‑Checks

**Generated:** 2025-09-21

How to use:
- Solve each exercise in the **solution cell** provided using the requested variable names.
- Then run the **Self‑check** cell below it. If it runs without errors → you passed.
- If an `AssertionError` appears, read the message and fix your work.


## Setup

In [None]:
import pandas as pd
import numpy as np
pd.__version__

## Day 1 — Series

Create a Series named **`prices_s`** of 5 daily prices with a date index (strings ok). Compute simple returns **`rets_s`** with `pct_change()`.

**Use these variable names:** see bold names in the prompt.
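
As a warm-up, here is a minimal sketch of how `pct_change()` behaves on a tiny Series. The data and the `demo_` names are made up for illustration and are not part of the exercise.

```python
import pandas as pd

# Hypothetical toy prices with string dates as the index.
demo_s = pd.Series(
    [100.0, 110.0, 99.0],
    index=["2024-01-01", "2024-01-02", "2024-01-03"],
)

# pct_change() computes (x_t - x_{t-1}) / x_{t-1} for each element.
# The first element has no predecessor, so its return is NaN,
# and the result keeps the same length and index as the input.
demo_rets = demo_s.pct_change()
print(demo_rets)
```

Because the result keeps the input's length, the self-check below expects `rets_s` to have 5 entries, including the leading NaN.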

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'prices_s' in globals(), "Define prices_s (length 5 Series)."
assert isinstance(prices_s, pd.Series) and len(prices_s)==5, "prices_s must be a Series of length 5."
assert prices_s.index.is_unique, "Use a unique index (e.g., dates)."
assert 'rets_s' in globals(), "Compute rets_s = prices_s.pct_change()."
assert isinstance(rets_s, pd.Series) and len(rets_s)==5, "rets_s must be a Series of length 5."
print("✅ Day 1 passed!")

## Day 2 — DataFrame Basics

Build a DataFrame **`df_prices`** with two tickers as columns, one of them **AAPL**, over **5 rows** (days). Add a column **`Return_AAPL`** using `pct_change()` on the **AAPL** column.

**Use these variable names:** see bold names in the prompt.
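
The pattern of building a DataFrame from a dict and deriving a new column from an existing one might look like the sketch below. The tickers `XXX` and `YYY` and every `demo_` name are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical prices for two made-up tickers over three days.
demo_df = pd.DataFrame(
    {"XXX": [10.0, 11.0, 12.1], "YYY": [50.0, 49.0, 51.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

# Assigning to a new column name appends the column to the frame.
# The first return is NaN because there is no previous price.
demo_df["Return_XXX"] = demo_df["XXX"].pct_change()
print(demo_df)
```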

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'df_prices' in globals(), "Define df_prices."
assert isinstance(df_prices, pd.DataFrame) and df_prices.shape[0]==5, "df_prices must have 5 rows."
assert 'AAPL' in df_prices.columns, "Include an AAPL price column."
assert 'Return_AAPL' in df_prices.columns, "Add Return_AAPL column."
print("✅ Day 2 passed!")

## Day 3 — Selection & Indexing

Given a DataFrame **`df`** with a `DatetimeIndex` and columns `AAPL`, `MSFT` (create one if needed), select the last 3 rows and both columns twice: with `.loc` as **`last3_loc`** and with `.iloc` as **`last3_iloc`**.

**Use these variable names:** see bold names in the prompt.
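
The difference between the two indexers can be sketched on a toy frame by selecting the *first* two rows instead. The column names `A`, `B` and the `demo_`/`first2_` names are illustrative only.

```python
import pandas as pd

# Hypothetical frame with a daily DatetimeIndex.
demo_df = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# .loc is label-based: a label slice includes BOTH endpoints.
first2_loc = demo_df.loc["2024-01-01":"2024-01-02", ["A", "B"]]

# .iloc is position-based: the stop position is EXCLUDED, as in Python slicing.
first2_iloc = demo_df.iloc[0:2, 0:2]

# Both selections contain the same rows and columns.
print(first2_loc.equals(first2_iloc))
```

Negative positions work with `.iloc` the same way they do with Python lists, which is one convenient route to the last rows of a frame.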

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'last3_loc' in globals() and isinstance(last3_loc, pd.DataFrame) and last3_loc.shape==(3,2), "last3_loc must be (3,2)."
assert 'last3_iloc' in globals() and isinstance(last3_iloc, pd.DataFrame) and last3_iloc.shape==(3,2), "last3_iloc must be (3,2)."
pd.testing.assert_frame_equal(last3_loc, last3_iloc)
print("✅ Day 3 passed!")

## Day 4 — Boolean Filtering

Create a DataFrame **`df`** with columns `Close` and `Volume` (7 rows). Filter rows where **Close** is above its own **mean** *and* **Volume** is above its **median**. Store in **`filtered_df`**.

**Use these variable names:** see bold names in the prompt.
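
A minimal sketch of combining two boolean conditions follows, using fixed thresholds rather than the mean and median the exercise asks for. The columns `Price`, `Qty` and the `demo_` names are hypothetical.

```python
import pandas as pd

# Hypothetical toy data.
demo_df = pd.DataFrame(
    {"Price": [1.0, 2.0, 3.0, 4.0], "Qty": [10, 40, 20, 30]}
)

# Each comparison yields a boolean Series aligned with the frame's index.
# Combine them with & (element-wise AND); the parentheses are required
# because & binds more tightly than > and >= in Python.
demo_mask = (demo_df["Price"] > 2) & (demo_df["Qty"] >= 20)

# Indexing with the mask keeps only rows where it is True,
# preserving the original index labels and columns.
demo_filtered = demo_df[demo_mask]
print(demo_filtered)
```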

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'filtered_df' in globals() and isinstance(filtered_df, pd.DataFrame), "Define filtered_df."
assert set(filtered_df.columns)=={'Close','Volume'}, "Keep same columns."
assert filtered_df.index.isin(df.index).all(), "Indices must come from df."
print("✅ Day 4 passed!")

## Day 5 — Handling Missing Data

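Create a DataFrame **`df`** with NaNs in `Close` and `Volume` over 7 rows; put at least one NaN after the first row in each column, because neither forward‑fill nor default interpolation can fill a leading NaN. Without modifying `df` itself, forward‑fill **Close** into **`df_ffill`**, interpolate **Volume** into **`df_interp`**, then drop any rows that still contain NaNs to produce **`df_clean`**.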

**Use these variable names:** see bold names in the prompt.
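
The three cleaning steps can be sketched on a four-row toy frame. This is one possible reading in which each step builds on the previous result; the columns `x`, `y` and the `demo_` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in both columns.
demo = pd.DataFrame(
    {"x": [np.nan, 1.0, np.nan, 3.0], "y": [10.0, np.nan, 30.0, 40.0]}
)

# Forward-fill copies the last valid value downward.
# The leading NaN in x has nothing before it, so it stays NaN.
demo_ffill = demo.copy()
demo_ffill["x"] = demo_ffill["x"].ffill()

# Linear interpolation (the default) fills the gap in y with the
# midpoint of its neighbours: (10 + 30) / 2 = 20.
demo_interp = demo_ffill.copy()
demo_interp["y"] = demo_interp["y"].interpolate()

# dropna() removes every row that still contains a NaN (row 0 here).
demo_clean = demo_interp.dropna()
print(demo_clean)
```

Working on copies leaves the original frame untouched, which matters here because the self-check compares NaN counts against the original `df`.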

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'df_ffill' in globals() and isinstance(df_ffill, pd.DataFrame), "Define df_ffill."
assert 'df_interp' in globals() and isinstance(df_interp, pd.DataFrame), "Define df_interp."
assert 'df_clean' in globals() and isinstance(df_clean, pd.DataFrame), "Define df_clean."
assert df_ffill['Close'].isna().sum() < df['Close'].isna().sum(), "Close forward-fill should reduce NaNs."
assert df_interp['Volume'].isna().sum() < df['Volume'].isna().sum(), "Volume interpolate should reduce NaNs."
assert df_clean.isna().sum().sum() == 0, "df_clean should have no NaNs."
print("✅ Day 5 passed!")

## Day 6 — GroupBy Basics

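Create a small DataFrame with columns `Ticker`, `Return`, `Volume` and at least 2 rows per ticker. Group by **Ticker** and compute mean Return, std Return, and total Volume in a single **`.agg(...)`** call. Store the result as **`gb`**. Keep `Return` and `Volume` as the column names (or as the top column level) of the result, since the self‑check looks for them.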

**Use these variable names:** see bold names in the prompt.
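
One way to apply different aggregations per column in a single call is to pass a dict to `.agg`, sketched below on unrelated toy data. The columns `Team`, `Score`, `Minutes` and the `demo_` names are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: two rows per group.
demo = pd.DataFrame(
    {
        "Team": ["a", "a", "b", "b"],
        "Score": [1.0, 3.0, 2.0, 6.0],
        "Minutes": [10, 20, 30, 40],
    }
)

# The dict maps each column to one aggregation or a list of them.
# Because a list is used, the result has two-level (MultiIndex) columns:
# (original column name, aggregation name).
demo_gb = demo.groupby("Team").agg({"Score": ["mean", "max"], "Minutes": "sum"})
print(demo_gb)

# Select a single statistic with a (column, aggregation) tuple.
print(demo_gb.loc["a", ("Score", "mean")])
```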

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'gb' in globals() and isinstance(gb, pd.DataFrame), "Define gb."
# Check for presence of Return and Volume stats (allow MultiIndex columns)
cols = [c if isinstance(c, str) else c[0] for c in gb.columns]
assert 'Return' in cols and 'Volume' in cols, "Columns should include Return and Volume stats."
print("✅ Day 6 passed!")

## Day 7 — Mini‑Project: Stock Filtering & Cleaning

Download (or simulate) 2–3 tickers for ~3 months into **`data`**. Create **`close`** and **`volume`** tables with one column per ticker. Forward‑fill **close** and interpolate **volume**. Keep the days on which at least one ticker's volume exceeds 5M → **`filtered`**. Compute daily returns **`returns`** from `filtered` with `pct_change()`, drop the resulting NaN rows, and save the results as CSVs.

**Use these variable names:** see bold names in the prompt.
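
The row-wise "any column exceeds a threshold" filter is the least familiar step, so here is a small sketch of it on hand-written data. The tickers `AAA`, `BBB`, all values, and the `demo_` names are made up, and the sketch assumes the filter is applied to the close table, which is only one reading of the task.

```python
import pandas as pd

# Hypothetical close prices and volumes for two made-up tickers.
dates = pd.bdate_range("2024-01-01", periods=5)
demo_close = pd.DataFrame(
    {"AAA": [100.0, 101.0, 102.0, 103.0, 104.0],
     "BBB": [50.0, 51.0, 50.0, 52.0, 53.0]},
    index=dates,
)
demo_volume = pd.DataFrame(
    {"AAA": [1_000_000, 6_000_000, 3_000_000, 2_000_000, 8_000_000],
     "BBB": [2_000_000, 1_000_000, 7_000_000, 2_000_000, 9_000_000]},
    index=dates,
)

# The comparison gives a boolean frame; any(axis=1) reduces it to one
# boolean per row: True if at least one ticker exceeds the threshold.
demo_busy = (demo_volume > 5_000_000).any(axis=1)

# Keep only the busy days, then compute returns between consecutive
# remaining rows and drop the first row, which has no predecessor.
demo_filtered = demo_close[demo_busy]
demo_returns = demo_filtered.pct_change().dropna()
print(demo_returns)
```

Each table can then be written out with `DataFrame.to_csv`, passing whatever file name you choose.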

In [None]:
# Your solution here


### Self‑check (run after your solution)

In [None]:
assert 'filtered' in globals(), "Define filtered DataFrame."
assert 'returns' in globals(), "Define returns DataFrame."
assert isinstance(filtered, pd.DataFrame) and isinstance(returns, pd.DataFrame), "Both must be DataFrames."
assert returns.index.isin(filtered.index).all(), "returns index should come from filtered (after dropna)."
print("✅ Day 7 passed!")