---
title: "Lab 1: FinTech Evolution & Python"
subtitle: "Fee compression and banking spreads"
format:
  html:
    toc: false
    number-sections: true
execute:
  echo: true
  warning: false
  message: false
---

::: callout-note
### Expected Time
- FIN510: Seminar hands‑on ≈ 60 min
- Directed learning extensions ≈ 90–120 min
- FIN720: Computer lab ≈ 120 min
:::

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/quinfer/financial-data-science/blob/main/labs/notebooks/lab01_fintech.ipynb)

## Setup (Colab‑only installs)

In [None]:
try:
    import matplotlib
except ImportError:
    !pip -q install matplotlib

## Objectives

- Recreate the fee‑compression and net interest margin (NIM) visuals
- Interpret persistence vs compression in finance

## Session Flow (≈ 60 minutes)

:::: callout-note
### Suggested Timing

- Warm‑up patterns and asserts (10 minutes)
- Task 1: Fund fee trends (15 minutes)
- Task 2: Banking NIM persistence (15 minutes)
- Quality gate for figures + reproducibility note (10 minutes)
- Reflection and short write‑up (10 minutes)
::::

This plan keeps you moving between coding, light quality checks, and interpretation. You can extend any part in directed learning.

## Mini Exercise — Returns vs Prices (5–10 minutes)

This quick exercise shows why we analyse returns rather than prices.

In [None]:
import pandas as pd, numpy as np

# Try real data; fall back to synthetic if unavailable
try:
    import yfinance as yf
    px = yf.download(["AAPL"], start="2022-01-01", end="2023-01-01", progress=False)["Close"]["AAPL"].dropna()
except Exception:
    px = pd.Series(dtype=float)

if px.shape[0] == 0:
    # Fallback synthetic random walk (business days)
    idx = pd.date_range("2022-01-01", "2023-01-01", freq="B")
    np.random.seed(42)
    r = np.random.normal(0.0005, 0.02, size=len(idx))
    px = pd.Series(150.0 * np.exp(np.cumsum(r)), index=idx, name="AAPL")

# Daily returns, then month-end prices and monthly returns
rx_d = px.pct_change().dropna()
px_m = px.resample('M').last()  # pandas >= 2.2 prefers the alias 'ME' for month-end
rx_m = px_m.pct_change().dropna()

px.tail(), rx_d.tail(), rx_m.tail()

Notes

- Price levels are typically non‑stationary; returns are closer to stationary and comparable across assets.
- In projects, regressions and forecasts target returns (or excess returns), not price levels.
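
To make the comparability point concrete, the sketch below applies one simulated return path to two hypothetical assets quoted at very different price levels. The names `cheap` and `dear`, the seed, and the start prices are illustrative assumptions, not lab data.

```python
import numpy as np
import pandas as pd

# One simulated daily return path (illustrative parameters, fixed seed)
rng = np.random.default_rng(0)
idx = pd.date_range("2022-01-03", periods=250, freq="B")
r = pd.Series(rng.normal(0.0005, 0.02, size=len(idx)), index=idx)

# Two hypothetical assets that differ only in their starting price
cheap = 10.0 * (1.0 + r).cumprod()
dear = 1000.0 * (1.0 + r).cumprod()

# Price levels differ by a factor of 100, so level statistics are not comparable
level_ratio = dear.mean() / cheap.mean()

# Simple returns remove the scale: both assets show the same return series
ret_cheap = cheap.pct_change().dropna()
ret_dear = dear.pct_change().dropna()
same_returns = np.allclose(ret_cheap, ret_dear)

print(f"Mean price ratio (dear/cheap): {level_ratio:.1f}")
print(f"Identical returns: {same_returns}")
```

A regression on price levels would mostly pick up the difference in scale, whereas the return series can be compared or pooled directly.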

### Quick patterns and asserts

Introduce defensive checks early so issues are visible and actionable.

In [None]:
import numpy as np

# Guard against empty downloads
assert px.shape[0] > 0, "No price data returned — check ticker or network"

# Confirm monotone date index
assert px.index.is_monotonic_increasing, "Timestamps not monotone — resample/sort before analysis"

# Sanity checks on returns
assert rx_d.abs().lt(1.0).all(), "Extreme daily return detected (>100%) — inspect corporate actions/adjustments"
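
The same guards can be collected in a small helper that reports every problem at once instead of stopping at the first failed assert. This is a sketch only: `check_prices` is a hypothetical name introduced here, and the 100% threshold simply mirrors the assert above.

```python
import pandas as pd

def check_prices(prices: pd.Series, max_abs_return: float = 1.0) -> list:
    """Return a list of readable problems; an empty list means all checks passed."""
    if prices.empty:
        return ["no price data"]
    problems = []
    if not prices.index.is_monotonic_increasing:
        problems.append("index not sorted in time order")
    if prices.isna().any():
        problems.append("missing prices")
    if (prices <= 0).any():
        problems.append("non-positive prices")
    returns = prices.pct_change().dropna()
    if returns.abs().ge(max_abs_return).any():
        problems.append("extreme daily return")
    return problems

# Hypothetical inputs: one clean series, one with a reversed index and a price jump
good = pd.Series([100.0, 101.0, 99.5],
                 index=pd.date_range("2022-01-03", periods=3, freq="B"))
bad = pd.Series([100.0, 250.0, 99.5], index=good.index[::-1])

print(check_prices(good))
print(check_prices(bad))
```

Returning a list rather than raising makes it easy to log all issues for a ticker before deciding whether to stop the analysis.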

## Task 1 — Fund Fee Trends

In [None]:
import matplotlib.pyplot as plt

years = [2000, 2005, 2010, 2015, 2020, 2024]
equity_mf = [1.00, 0.85, 0.79, 0.68, 0.50, 0.40]
bond_mf   = [0.85, 0.75, 0.70, 0.58, 0.45, 0.38]
index_etf = [0.30, 0.25, 0.20, 0.18, 0.15, 0.14]

plt.figure(figsize=(9,5))
plt.plot(years, equity_mf, marker='o', label='Equity MF')
plt.plot(years, bond_mf,   marker='o', label='Bond MF')
plt.plot(years, index_etf, marker='o', label='Index ETF')
plt.title('Decline in Fund Fees (Illustrative)')
plt.xlabel('Year'); plt.ylabel('Average Expense Ratio (%)')
plt.grid(alpha=0.3); plt.legend(); plt.tight_layout()
assert len(years) == len(equity_mf) == len(bond_mf) == len(index_etf)
assert min(equity_mf) < max(equity_mf)
print("Fee chart checks passed ✔")

Pedagogical note: label axes clearly, prefer consistent styles, and test that trends are monotone when theory implies it.
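
One way to put numbers on the chart is to compute the total and annualised change in each fee series. The sketch below reuses the illustrative values from this task; the annualised figure assumes a constant compound rate between the first and last year, which is a simplification.

```python
import pandas as pd

years = [2000, 2005, 2010, 2015, 2020, 2024]
fees = pd.DataFrame(
    {
        "equity_mf": [1.00, 0.85, 0.79, 0.68, 0.50, 0.40],
        "bond_mf": [0.85, 0.75, 0.70, 0.58, 0.45, 0.38],
        "index_etf": [0.30, 0.25, 0.20, 0.18, 0.15, 0.14],
    },
    index=years,
)

span = years[-1] - years[0]                # number of years covered
ratio = fees.iloc[-1] / fees.iloc[0]       # last fee relative to first fee
total_change = ratio - 1.0                 # cumulative change over the period
annual_change = ratio ** (1.0 / span) - 1  # constant-rate equivalent per year

summary = pd.DataFrame(
    {"total_change_%": total_change * 100, "annualised_%": annual_change * 100}
).round(2)
print(summary)
```

Reporting both figures helps when comparing series observed over periods of different length.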

## Task 2 — Banking NIM Persistence

In [None]:
years = [2000, 2005, 2010, 2015, 2020, 2024]
nim   = [3.9, 3.5, 3.6, 2.9, 2.6, 3.0]

plt.figure(figsize=(8,4.5))
plt.plot(years, nim, marker='o', color='tab:red', label='Bank NIM')
plt.title('Bank Net Interest Margin (Illustrative)')
plt.xlabel('Year'); plt.ylabel('NIM (%)')
plt.grid(alpha=0.3); plt.legend(); plt.tight_layout()
assert len(years) == len(nim)
print("NIM chart checks passed ✔")
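
To contrast persistence with compression numerically, the sketch below measures how far NIM moved over the period and flags the years in which it rose. It reuses the illustrative NIM list from this task; `rebound_years` is a name introduced here for illustration.

```python
import pandas as pd

nim = pd.Series(
    [3.9, 3.5, 3.6, 2.9, 2.6, 3.0],
    index=[2000, 2005, 2010, 2015, 2020, 2024],
    name="nim",
)

# Cumulative change between the first and last observation
total_change = nim.iloc[-1] / nim.iloc[0] - 1.0

# Years where NIM is higher than at the previous observation
rebound_years = nim[nim.diff() > 0].index.tolist()

print(f"NIM change, first to last observation: {total_change:.1%}")
print(f"Years with a rise in NIM: {rebound_years}")
```

With these illustrative values NIM ends roughly 23% below its 2000 level and rebounds twice, whereas the equity fund fee series in Task 1 falls by 60% without interruption.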

## Quality Gate for Figures (10 minutes)

Before interpretation, run a tiny “figure quality gate” on your inputs and outputs.

In [None]:
import pandas as pd

# Wrap your plotting inputs into DataFrames for simple checks
fees_df = pd.DataFrame({
    'year': years,
    'equity_mf': equity_mf,
    'bond_mf': bond_mf,
    'index_etf': index_etf
})

nim_df = pd.DataFrame({
    'year': years,
    'nim': nim
})

checks = {
    'fees_monotone_decline_equity': pd.Series(equity_mf).is_monotonic_decreasing,
    'fees_monotone_decline_bond': pd.Series(bond_mf).is_monotonic_decreasing,
    'fees_monotone_decline_etf': pd.Series(index_etf).is_monotonic_decreasing,
    'nim_reasonable_range': pd.Series(nim).between(0.0, 10.0).all(),
    'year_sorted': pd.Series(years).is_monotonic_increasing
}

checks

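Briefly explain any violations rather than forcing the data to fit an assumption. With the illustrative lists above every check should pass; real fee data may show non‑monotonic segments (e.g., a temporary rise in bond fund fees) that deserve a sentence of explanation.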
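
A gate is most useful when it names what failed and where. The following sketch uses a hypothetical helper, `failed_checks`, and a deliberately perturbed copy of the bond series (the 2010 value is made up) to show one way of surfacing a violation for discussion.

```python
import pandas as pd

def failed_checks(checks: dict) -> list:
    """Return the names of checks whose result is falsy."""
    return [name for name, passed in checks.items() if not bool(passed)]

years = [2000, 2005, 2010, 2015, 2020, 2024]
bond_mf = [0.85, 0.75, 0.70, 0.58, 0.45, 0.38]
bond_perturbed = [0.85, 0.75, 0.78, 0.58, 0.45, 0.38]  # hypothetical uptick in 2010

clean = {"bond_monotone_decline": pd.Series(bond_mf).is_monotonic_decreasing}
dirty = {"bond_monotone_decline": pd.Series(bond_perturbed).is_monotonic_decreasing}

print(failed_checks(clean))
print(failed_checks(dirty))

# Locate the offending observations so the write-up can refer to them
s = pd.Series(bond_perturbed, index=years)
upticks = s[s.diff() > 0]
print(upticks)
```

Listing the offending years keeps the discussion specific: the question becomes why fees rose at that point, not whether the data are wrong.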

## Task 3 — Adjusted vs Unadjusted Prices (10–15 minutes)

Corporate actions change the interpretation of price series. Compare adjusted and unadjusted series and discuss implications for analysis.

In [None]:
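import pandas as pd, matplotlib.pyplot as plt

try:
    import yfinance as yf  # imported here so a missing package also triggers the fallback
    # auto_adjust=False keeps both "Close" and "Adj Close"; recent yfinance releases
    # default to auto_adjust=True, which drops the "Adj Close" column
    raw = yf.download(["AAPL"], start="2022-01-01", end="2023-01-01", progress=False, auto_adjust=False)
    close = raw["Close"]["AAPL"].dropna()
    adj   = raw["Adj Close"]["AAPL"].dropna()
except Exception:
    # Fallback synthetic series with a one‑off 2‑for‑1 split at the midpoint
    idx = pd.date_range('2022-01-01','2023-01-01', freq='B')
    mid = len(idx) // 2
    adj = (100.0 + pd.Series(range(len(idx)), index=idx, dtype=float)) * 0.5  # continuous, split‑adjusted
    close = adj.copy()
    close.iloc[:mid] = close.iloc[:mid] * 2.0  # pre‑split quotes sit at twice the adjusted level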

ret_close = close.pct_change().dropna()
ret_adj   = adj.pct_change().dropna()

plt.figure(figsize=(9,5))
plt.plot(close.index, close, label='Close (unadjusted)')
plt.plot(adj.index, adj, label='Adj Close')
plt.title('Adjusted vs Unadjusted Prices')
plt.legend(); plt.grid(alpha=0.3); plt.tight_layout(); plt.show()

print(f"Mean daily return (Close): {ret_close.mean()*100:.3f}%")
print(f"Mean daily return (Adj):   {ret_adj.mean()*100:.3f}%")

Notes

- The adjusted series restates historical prices for splits and dividends, so returns computed from it are consistent across those events.
- For cross‑sectional comparisons and long horizons, prefer adjusted prices; document the choice.
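
A small worked example shows why the choice matters. The prices and the 2-for-1 split below are made up for illustration; the adjustment divides pre-split prices by the split ratio, and dividend adjustments are left out to keep the sketch short.

```python
import pandas as pd

idx = pd.date_range("2022-06-01", periods=6, freq="B")
close = pd.Series([200.0, 202.0, 204.0, 103.0, 104.0, 105.0], index=idx)

split_date = idx[3]   # first trading day at the post-split price (assumed)
split_ratio = 2.0     # 2-for-1 split (assumed)

# Back-adjust: scale every pre-split price down by the split ratio
adj = close.copy()
pre_split = adj.index < split_date
adj[pre_split] = adj[pre_split] / split_ratio

ret_close = close.pct_change().dropna()
ret_adj = adj.pct_change().dropna()

print(f"Unadjusted return on split date: {ret_close.loc[split_date]:.2%}")
print(f"Adjusted return on split date:   {ret_adj.loc[split_date]:.2%}")
```

The unadjusted series records a loss of almost half the share price on the split date even though holders lost nothing, which would distort means, volatilities, and any outlier filter applied to the returns.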

## Reproducibility Snapshot (2 minutes)

Record minimal environment and metadata with your submission.

In [None]:
import sys, platform, json
meta = {
    'python': sys.version.split()[0],
    'platform': platform.platform(),
    'data_sources': ['Illustrative lists', 'yfinance (optional)'],
}
print(json.dumps(meta, indent=2))
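
Package versions are often the first thing needed to reproduce a result. As an optional extension, the sketch below adds them using the standard-library `importlib.metadata`; the helper name `package_versions` and the package list are assumptions chosen to match the libraries used in this lab.

```python
import json
import platform
import sys
from importlib import metadata

def package_versions(names):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions

snapshot = {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
    "packages": package_versions(["pandas", "numpy", "matplotlib", "yfinance"]),
}
print(json.dumps(snapshot, indent=2))
```

Recording "not installed" instead of raising keeps the snapshot usable when an optional package such as yfinance is absent.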

## Reflection Prompts (10 minutes)

Write 150–250 words addressing:

- Why did fees compress while NIMs persisted? Tie to functions of finance and market structure.
- What risks arise from using unadjusted prices in analysis? Give one concrete example.
- One specific check you would add to a production data pipeline and why.

:::: callout-note
### Brief Rubric (FIN510)

- Evidence use: references figures and economic mechanisms
- Clarity: concise, logically structured, avoids jargon where unnecessary
- Method awareness: acknowledges data/measurement limitations
::::

## Stretch (FIN720 or extensions)

Optional deeper tasks:

- Replicate the NIM series from a public source (e.g., FRED) and compare levels/trends.
- Extend fee analysis to include advisory fees vs fund fees; discuss scope differences.
- Draft a short memo (200–300 words) critiquing a fintech claim using your figures.

## Submission Checklist

- Two figures (fees, NIM) rendered and saved
- Quality gate output shown and discussed
- Reflection (150–250 words) attached
- Reproducibility snapshot included

::: callout-tip
### Troubleshooting
- If lines don’t show, ensure values are numeric floats (not strings).
- If the legend overlaps the lines, increase the figure size or move the legend (e.g., `plt.legend(loc='upper right')`), then call `plt.tight_layout()`.
:::

## Save Outputs (optional)

In [None]:
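import matplotlib.pyplot as plt
# Run this in the same cell as the plotting code: the inline backend closes each
# figure when its cell finishes, so a separate cell would save an empty canvas.
plt.savefig('lab01_last_figure.png', dpi=150)
"Saved: lab01_last_figure.png"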
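
An alternative that does not depend on the "current figure" is to keep an explicit figure handle and save through it. The helper `save_line_chart` and the file name `lab01_nim.png` are hypothetical; the data are the illustrative NIM values from Task 2.

```python
import os
import matplotlib.pyplot as plt

def save_line_chart(x, series, title, ylabel, path):
    """Draw one line per entry in `series` and write the figure to `path`."""
    fig, ax = plt.subplots(figsize=(8, 4.5))
    for label, y in series.items():
        ax.plot(x, y, marker="o", label=label)
    ax.set_title(title)
    ax.set_xlabel("Year")
    ax.set_ylabel(ylabel)
    ax.grid(alpha=0.3)
    ax.legend()
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)  # frees memory; remove this line to also display the figure inline
    return path

years = [2000, 2005, 2010, 2015, 2020, 2024]
saved = save_line_chart(
    years,
    {"Bank NIM": [3.9, 3.5, 3.6, 2.9, 2.6, 3.0]},
    "Bank Net Interest Margin (Illustrative)",
    "NIM (%)",
    "lab01_nim.png",
)
print(f"Saved: {saved} ({os.path.getsize(saved)} bytes)")
```

Calling the helper once per chart produces the two files the submission checklist asks for, independent of cell execution order.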

::: callout-note
### Further Reading (Hilpisch 2019)
- See: [Hilpisch Code Resources](../resources/hilpisch-code.qmd) — Week 1
- Chapter 03 (data structures & plotting) provides additional plotting patterns you can adapt for fee/NIM visuals.
:::