# The Choice of Frequency and Annualization of Returns
## 🎯 Learning Objectives

By the end of this notebook, you will be able to:

1. **Understand frequency choices**: Why we use daily or monthly data in practice
2. **Apply standard annualization**: Convert monthly/daily statistics to annual terms
3. **Aggregate returns with groupby**: Compute exact multi-period returns without approximation
4. **Compare methods**: Know when approximation is acceptable vs. exact aggregation

## 📋 Table of Contents

1. [Setup](#setup)
2. [Why Frequency Matters](#why-frequency-matters)
3. [Standard Annualization](#standard-annualization)
4. [Exact Aggregation with Groupby](#exact-aggregation-with-groupby)
5. [Exercises](#exercises)
6. [Key Takeaways](#key-takeaways)

---

## 🛠️ Setup <a id="setup"></a>

In [None]:
#@title 🛠️ Setup: Run this cell first (click to expand)

# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# Set consistent plot style
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['font.size'] = 12

# Suppress warnings for cleaner output
import warnings
warnings.filterwarnings('ignore')

print("✅ Libraries loaded successfully!")

---

## Why Frequency Matters <a id="why-frequency-matters"></a>

### Data Comes at a Specific Frequency

Financial data is always structured at a particular frequency:

| Frequency | What it measures |
|-----------|------------------|
| Daily | Return from one closing price to the next |
| Monthly | Return from the last trading day of one month to the last trading day of the next |
| Annual | Return over a calendar or fiscal year |

This choice is **arbitrary**: trades happen continuously, down to the millisecond, so any sampling interval is a convention.

### Why Monthly or Daily?

In this course (and in practice), we work with **monthly** or **daily** data:

- **Manageable size**: Higher frequencies create massive datasets
- **Industry standard**: Most practitioners use these frequencies
- **Sufficient data**: Lower frequencies (annual) give too few observations

> **💡 Key Insight:**
>
> We analyze at monthly frequency, then **annualize** results.
> Annual numbers are easier to interpret and compare.

### Load Monthly Data

Let's load a dataset of monthly global financial returns.

In [None]:
# Load monthly global financial data
url = "https://raw.githubusercontent.com/amoreira2/UG54/main/assets/data/GlobalFinMonthly.csv"
Data = pd.read_csv(url, na_values=-99)
Data['Date'] = pd.to_datetime(Data['Date'])
Data = Data.set_index(['Date'])

print(f"Data range: {Data.index.min().date()} to {Data.index.max().date()}")
print(f"Columns: {list(Data.columns)}")
Data.head()

---

## Standard Annualization <a id="standard-annualization"></a>

### The Quick and Dirty Method

**Standard annualization formulas** (from monthly data):

| Statistic | Formula |
|-----------|--------|
| Mean | $\hat{\mu}_A = 12 \times \hat{\mu}_M$ |
| Variance | $\hat{\sigma}^2_A = 12 \times \hat{\sigma}^2_M$ |
| Std Dev | $\hat{\sigma}_A = \sqrt{12} \times \hat{\sigma}_M$ |

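These formulas assume returns are **i.i.d.** (independent and identically distributed) and treat the annual return as the sum of the twelve monthly returns, so they ignore compounding.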

In [None]:
# Monthly statistics for market returns
mean_monthly = Data['MKT'].mean()
std_monthly = Data['MKT'].std()
var_monthly = Data['MKT'].var()

# Annualize using standard formulas
mean_annual = mean_monthly * 12
var_annual = var_monthly * 12
std_annual = std_monthly * np.sqrt(12)

print("━" * 50)
print("Market Return Statistics")
print("━" * 50)
print(f"Monthly mean:      {mean_monthly:>10.4%}")
print(f"Annualized mean:   {mean_annual:>10.2%}")
print("━" * 50)
print(f"Monthly std:       {std_monthly:>10.4%}")
print(f"Annualized std:    {std_annual:>10.2%}")
print("━" * 50)

### Why Is This an Approximation?

Annual returns **compound**; they don't simply add:

$$R_A = (1+R_1)(1+R_2)\cdots(1+R_{12}) - 1$$

If returns were truly i.i.d., the exact formulas would be:

$$\mu_A = (1+\mu_M)^{12} - 1$$

$$\sigma_A^2 = [\sigma^2_M + (1+\mu_M)^2]^{12} - (1+\mu_M)^{24}$$
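
To get a feel for the size of the gap, the two sets of formulas can be evaluated side by side. The sketch below uses hypothetical monthly moments (`mu_m = 1%`, `sigma_m = 5%`) chosen only for illustration; they are not estimates from the course dataset.

```python
import numpy as np

# Hypothetical monthly moments, chosen only for illustration
mu_m = 0.01      # 1% mean monthly return
sigma_m = 0.05   # 5% monthly standard deviation

# Standard (approximate) annualization
mu_a_approx = 12 * mu_m
sigma_a_approx = np.sqrt(12) * sigma_m

# Exact moments of the compounded annual return under i.i.d. monthly returns
mu_a_exact = (1 + mu_m) ** 12 - 1
var_a_exact = (sigma_m**2 + (1 + mu_m) ** 2) ** 12 - (1 + mu_m) ** 24
sigma_a_exact = np.sqrt(var_a_exact)

print(f"Mean: approx {mu_a_approx:.2%}, exact {mu_a_exact:.2%}")
print(f"Std:  approx {sigma_a_approx:.2%}, exact {sigma_a_exact:.2%}")
```

With a positive mean, the exact figures come out somewhat larger than the approximate ones because compounding adds the cross-product terms that the simple sum leaves out.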

> **📌 Remember:**
>
> The standard annualization is an **approximation** that works well when:
> - Monthly returns are small (so $(1+r) \approx 1$)
> - You're comparing assets at the same frequency
>
> **Always use standard annualization unless told otherwise.**

### Why Use the Approximation?

Despite being technically "wrong," we use it because:

1. **Industry standard**: Everyone uses it, so results are comparable
2. **Good intuition**: Gives correct order of magnitude
3. **Easy t-statistics**: Works well with monthly data for inference
4. **Consistent comparisons**: Fine if you don't mix frequencies
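
One way to see the t-statistic point: under standard annualization, the t-statistic of the mean is the same whether it is computed from monthly or annualized numbers, because the scaling factors cancel. The sketch below checks this on simulated returns; the distribution parameters and sample length are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly returns (hypothetical: 1% mean, 5% std, 40 years of data)
r = rng.normal(0.01, 0.05, size=480)
T = r.size

# t-statistic of the mean using monthly statistics
mean_m, std_m = r.mean(), r.std(ddof=1)
t_monthly = mean_m / (std_m / np.sqrt(T))

# Same t-statistic using annualized statistics and the sample length in years
mean_a, std_a = 12 * mean_m, np.sqrt(12) * std_m
t_annualized = mean_a / (std_a / np.sqrt(T / 12))

print(f"t-stat from monthly numbers:    {t_monthly:.4f}")
print(f"t-stat from annualized numbers: {t_annualized:.4f}")
```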

---

## Exact Aggregation with Groupby <a id="exact-aggregation-with-groupby"></a>

### When You Need Exact Results

Sometimes you want **actual annual returns**, not approximations.

To get exact annual returns, we must **compound** monthly returns:

$$R_{year} = \prod_{t \in year}(1 + R_t) - 1$$

This requires grouping data by year and multiplying gross returns.
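
Before applying this to the full dataset, here is a minimal sketch of the formula for a single year. The twelve monthly returns are made-up numbers used only to show the mechanics.

```python
import numpy as np

# Twelve hypothetical monthly net returns for one year
monthly = np.array([0.02, -0.01, 0.03, 0.01, -0.04, 0.02,
                    0.05, -0.02, 0.01, 0.03, -0.01, 0.02])

# Exact annual return: multiply gross returns, then subtract 1
annual_exact = np.prod(1 + monthly) - 1

# Simple sum of monthly returns, which ignores compounding
annual_sum = monthly.sum()

print(f"Compounded annual return: {annual_exact:.4%}")
print(f"Sum of monthly returns:   {annual_sum:.4%}")
```

The two numbers are close but not equal; the difference is the compounding effect that exact aggregation captures.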

> **🐍 Python Insight: `groupby()`**
>
> The pandas `groupby()` method is one of the most powerful tools for data analysis. It follows the **Split → Apply → Combine** pattern:
>
> ```python
> df.groupby(grouping_key).aggregate_function()
> ```
>
> | Step | Action | Example |
> |------|--------|---------|
> | **Split** | Divide data into groups | `df.groupby(df.index.year)` |
> | **Apply** | Apply function to each group | `.mean()`, `.sum()`, `.prod()` |
> | **Combine** | Merge results back together | Returns one row per group |
>
> We'll use this extensively throughout the course!
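
The pattern is easiest to see on a tiny table. The sketch below builds a hypothetical four-month DataFrame (the name `toy` and its values are invented for illustration) and aggregates it by calendar year in three different ways.

```python
import pandas as pd

# Hypothetical monthly returns covering two calendar years
idx = pd.to_datetime(["2020-11-30", "2020-12-31", "2021-01-31", "2021-02-28"])
toy = pd.DataFrame({"ret": [0.10, -0.05, 0.02, 0.03]}, index=idx)

# Split: one group per calendar year
by_year = toy.groupby(toy.index.year)

# Apply + Combine: each aggregation returns one value per year
summary = pd.DataFrame({
    "n_months": by_year["ret"].count(),
    "mean": by_year["ret"].mean(),
    "compounded": (1 + toy["ret"]).groupby(toy.index.year).prod() - 1,
})
print(summary)
```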

### The Groupby Method

Pandas `groupby` lets us aggregate data by groups. Here's the logic:

| Step | Code | What it does |
|------|------|-------------|
| 1 | `(Data + 1)` | Convert net returns to gross returns |
| 2 | `.groupby(Data.index.year)` | Group by calendar year |
| 3 | `.prod()` | Multiply all values within each group |
| 4 | `- 1` | Convert back to net returns |

In [None]:
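# Aggregate monthly returns to annual returns (exact method).
# min_count=1 keeps a year as NaN when every month is missing;
# the default would report 0% because the product of no values is 1.
DataYear = (Data + 1).groupby(Data.index.year).prod(min_count=1) - 1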

print("Annual returns (first 5 years):")
DataYear.head()

### Comparing Approximation vs. Exact

In [None]:
# Compare the two methods
approx_mean = Data['MKT'].mean() * 12
exact_mean = DataYear['MKT'].mean()

approx_std = Data['MKT'].std() * np.sqrt(12)
exact_std = DataYear['MKT'].std()

print("━" * 50)
print("Comparison: Approximation vs. Exact Aggregation")
print("━" * 50)
print(f"Mean (approx):     {approx_mean:>10.2%}")
print(f"Mean (exact):      {exact_mean:>10.2%}")
print(f"Difference:        {abs(approx_mean - exact_mean):>10.2%}")
print("━" * 50)
print(f"Std (approx):      {approx_std:>10.2%}")
print(f"Std (exact):       {exact_std:>10.2%}")
print(f"Difference:        {abs(approx_std - exact_std):>10.2%}")
print("━" * 50)

> **💡 Key Insight:**
>
> The approximation and exact methods give **similar results** for typical returns.
> Use the approximation for quick analysis; use exact aggregation for final reports.
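
One caveat when computing exact annual statistics: if a sample starts or ends mid-year, the first or last "annual" return compounds fewer than 12 months. A possible safeguard, sketched here on a simulated series rather than the course data, is to count observations per year and keep only complete years. The series name `MKT` and all parameters below are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series running from July 2000 to December 2002 (30 months)
idx = pd.date_range("2000-07-01", periods=30, freq="MS")
rng = np.random.default_rng(1)
ret = pd.Series(rng.normal(0.01, 0.04, size=len(idx)), index=idx, name="MKT")

years = ret.index.year
n_obs = ret.groupby(years).count()             # months available in each year
annual = (1 + ret).groupby(years).prod() - 1   # compounded return per year

# Keep only years with all 12 monthly observations
annual_full = annual[n_obs == 12]

print(n_obs)
print(annual_full)
```

Here 2000 has only six months, so it is dropped, while 2001 and 2002 are kept.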

### Visualizing Annual Returns

In [None]:
# Plot annual market returns
fig, ax = plt.subplots(figsize=(12, 5))

colors = ['green' if x >= 0 else 'red' for x in DataYear['MKT']]
ax.bar(DataYear.index, DataYear['MKT'], color=colors, alpha=0.7)

ax.axhline(0, color='black', linewidth=0.5)
ax.axhline(DataYear['MKT'].mean(), color='blue', linestyle='--', 
           label=f"Mean = {DataYear['MKT'].mean():.1%}")

ax.set_xlabel('Year')
ax.set_ylabel('Annual Return')
ax.set_title('Market Annual Returns', fontsize=14, fontweight='bold')
ax.legend()

plt.tight_layout()
plt.show()

---

## 📝 Exercises <a id="exercises"></a>

### Exercise 1 (Warm-up): Annualization Practice

> **🔧 Exercise:**
>
> A stock has the following **daily** statistics:
> - Mean daily return: 0.04%
> - Daily standard deviation: 1.8%
>
> Using standard annualization (252 trading days):
> 1. Compute the annualized mean return
> 2. Compute the annualized volatility
> 3. Compute the annualized Sharpe Ratio (assume rf = 0)

In [None]:
# Your code here
mean_daily = 0.0004  # 0.04%
std_daily = 0.018    # 1.8%

# Annualize

<details>
<summary>💡 Click to see solution</summary>

```python
mean_daily = 0.0004  # 0.04%
std_daily = 0.018    # 1.8%

# Annualize
mean_annual = mean_daily * 252
std_annual = std_daily * np.sqrt(252)
sharpe_annual = mean_annual / std_annual

print(f"Annualized mean: {mean_annual:.2%}")
print(f"Annualized std: {std_annual:.2%}")
print(f"Annualized Sharpe: {sharpe_annual:.2f}")
```
</details>

### Exercise 2 (Extension): Aggregate to Quarterly

> **🤔 Think and Code:**
>
> Instead of annual returns, compute **quarterly** returns:
>
> 1. Use `Data.index.to_period('Q')` to group by quarter
> 2. Compute exact quarterly returns using the compounding formula
> 3. What is the mean and std of quarterly market returns?
> 4. How do these compare to monthly mean × 3 and monthly std × √3?

In [None]:
# Your code here

<details>
<summary>💡 Click to see solution</summary>

```python
# Aggregate to quarterly
DataQuarter = (Data + 1).groupby(Data.index.to_period('Q')).prod() - 1

# Exact quarterly statistics
exact_q_mean = DataQuarter['MKT'].mean()
exact_q_std = DataQuarter['MKT'].std()

# Approximation from monthly
approx_q_mean = Data['MKT'].mean() * 3
approx_q_std = Data['MKT'].std() * np.sqrt(3)

print(f"Quarterly mean (exact): {exact_q_mean:.2%}")
print(f"Quarterly mean (approx): {approx_q_mean:.2%}")
print(f"Quarterly std (exact): {exact_q_std:.2%}")
print(f"Quarterly std (approx): {approx_q_std:.2%}")
```
</details>

### Exercise 3 (Open-ended): Best and Worst Years

> **🤔 Think and Code:**
>
> Using the annual returns data (`DataYear`):
>
> 1. Find the 5 best and 5 worst years for market returns
> 2. Create a bar chart showing only these 10 extreme years
> 3. Research: What major events caused the worst years?
> 4. Calculate: What fraction of years had negative returns?

In [None]:
# Your code here

<details>
<summary>💡 Click to see solution</summary>

```python
# Best and worst years
best_5 = DataYear['MKT'].nlargest(5)
worst_5 = DataYear['MKT'].nsmallest(5)

print("Best 5 years:")
print(best_5)
print("\nWorst 5 years:")
print(worst_5)

# Combine for plotting
extreme_years = pd.concat([worst_5, best_5]).sort_index()

fig, ax = plt.subplots(figsize=(10, 5))
colors = ['green' if x >= 0 else 'red' for x in extreme_years]
ax.bar(extreme_years.index.astype(str), extreme_years, color=colors)
ax.set_title('Most Extreme Market Years')
ax.set_ylabel('Annual Return')
plt.xticks(rotation=45)
plt.show()

# Fraction negative
pct_negative = (DataYear['MKT'] < 0).mean()
print(f"\nFraction of negative years: {pct_negative:.1%}")
```
</details>

---

## 🧠 Key Takeaways <a id="key-takeaways"></a>

1. **Frequency is arbitrary**: We use monthly/daily for practical reasons (data size, industry standard)

2. **Standard annualization**: Mean × 12, Std × √12 (from monthly). An approximation, but the industry standard

3. **Exact aggregation** uses `groupby` and compounding: $(1+R_1)(1+R_2)\cdots - 1$

4. **Use approximation** for quick analysis; **use exact** when precision matters

5. **Never mix frequencies**: Don't compare statistics measured at one frequency with statistics approximated from another, e.g. actual annual real estate returns against monthly stock returns annualized with the standard formulas

---

**Next Notebook:** We'll explore how to access financial data through APIs, including FRED, Ken French, and more.