# Week 1 - Tasks
- Use data from `AAPL, MSFT, GOOG, AMZN, TSLA`
- For long term data - `start="2015-01-01", end="2024-01-01"`
- For medium term data - `start="2021-01-01", end="2024-01-01"`

## Task 1 - Setup
- Fetch long term historical data.
- Extract the adjusted close values and handle missing values and missing rows (for example, by dropping or forward-filling them)
- Use pandas describe to extract key stats about the data
- Extract the medium term data as well
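
The cleaning steps above could look roughly like the sketch below. It does not download anything: `prices` is a synthetic stand-in (an assumption) for an adjusted-close DataFrame indexed by date with one column per ticker, and the forward-fill policy is just one reasonable choice, not the required one.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for downloaded adjusted-close prices (assumption: the real
# data is a DataFrame indexed by date with one column per ticker).
tickers = ["AAPL", "MSFT", "GOOG", "AMZN", "TSLA"]
dates = pd.bdate_range("2015-01-01", "2023-12-29")
rng = np.random.default_rng(0)
steps = rng.normal(0.0005, 0.02, size=(len(dates), len(tickers)))
prices = pd.DataFrame(100 * np.exp(steps.cumsum(axis=0)), index=dates, columns=tickers)

# Knock out two values and one full row to imitate gaps in real data.
prices.iloc[10, 0] = np.nan
prices.iloc[500, 2] = np.nan
prices.iloc[1000, :] = np.nan

# Drop rows where every ticker is missing, forward-fill isolated gaps,
# then drop anything still missing (e.g. leading NaNs).
clean = prices.dropna(how="all").ffill().dropna()

# Key statistics per ticker (count, mean, std, min, quartiles, max).
stats = clean.describe()
print(stats)

# Medium-term slice; label slicing on a DatetimeIndex includes both endpoints.
medium = clean.loc["2021-01-01":"2023-12-31"]
```

Whether to drop or fill a gap depends on the data; forward-filling long gaps creates artificial zero returns, so it is worth checking how many values were affected.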

## Task 2 - Basic Trends
- Take any one asset and compute (on medium term):
  - 1-day, 5-day, 20-day simple returns
  - 1-day, 5-day, 20-day log returns
  - 5-day, 20-day, 60-day volatility (based on log returns)
  
- **Bonus:** Plot (long term):
  - Scatter plot of |return| vs Volume
  - Returns and volatility data grouped by month

- Plot each of the items above on a separate graph
- Analyze the trends you observe
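
As a rough sketch of the computations in this task, the block below builds multi-horizon simple returns, log returns, and rolling volatility for one price series. The series `price` is synthetic (an assumption standing in for one asset's adjusted close), and the column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic single-asset price series (assumption: stands in for real data).
dates = pd.bdate_range("2021-01-01", "2023-12-29")
rng = np.random.default_rng(1)
price = pd.Series(
    100 * np.exp(rng.normal(0.0005, 0.02, len(dates)).cumsum()),
    index=dates,
    name="price",
)

horizons = [1, 5, 20]

# h-day simple return: P_t / P_{t-h} - 1
simple = pd.DataFrame({f"simple_{h}d": price / price.shift(h) - 1 for h in horizons})

# h-day log return: ln(P_t / P_{t-h})
log = pd.DataFrame({f"log_{h}d": np.log(price / price.shift(h)) for h in horizons})

# Rolling volatility: standard deviation of 1-day log returns over each window.
log_1d = log["log_1d"]
vol = pd.DataFrame({f"vol_{w}d": log_1d.rolling(w).std() for w in [5, 20, 60]})

# Mean and standard deviation of daily log returns grouped by calendar month.
monthly = log_1d.groupby(log_1d.index.month).agg(["mean", "std"])
```

The two return definitions are linked by `log = ln(1 + simple)`, which is a handy sanity check on real data.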

## Task 3 - Stationarity
- Take any one asset and compute (on long term):
  - Rolling mean of log returns with window sizes 20, 60, 120
- Plot them together and visually inspect - can you verify "stationarity"?
- Run an ADF test for stationarity on the log return data series
  - Use `from statsmodels.tsa.stattools import adfuller`
  - The null hypothesis here is that the series is non-stationary.

You may refer to: https://www.statsmodels.org/dev/examples/notebooks/generated/stationarity_detrending_adf_kpss.html.

Basic knowledge of p-values in hypothesis testing may be required.

## Task 4 - Volatility Regimes
Here we study the behaviour of volatility more closely. From the ACF/PACF plot we know that volatility tends to cluster: the market is either in a high-volatility state (larger daily fluctuations) or a low-volatility state (stable prices, quiet markets).

Formally, we can treat volatility as a *conditional standard deviation of returns*, conditioned on the history observed so far. We *cannot* recover the *true* volatility for a given day from a single return value, so we estimate it from past information. For this we look at two indicators of volatility:

- Rolling Window Volatility (*Simple Moving Average (SMA)*)
- EWMA Volatility (*Exponentially Weighted Moving Average (EWMA)*)

Steps:
- Pick an asset and a medium- or short-term window, preferably one surrounding the Feb-Mar 2020 COVID crash
- Compute two volatility estimates
  - Rolling 20-day volatility
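  - 20-day EWMA volatility (*RiskMetrics*) with $\lambda=0.94$.
    It is described by the formula: $\sigma_t^2 = (1-\lambda)r_t^2 + \lambda \sigma_{t-1}^2$.

(Look up the standard pandas methods for these. Note that the pandas `ewm` smoothing factor `alpha` corresponds to $1-\lambda$, not $\lambda$.)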
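
One way to compute the two estimates is sketched below. The returns are synthetic (an assumption), with a calm regime and a turbulent block placed around the 2020 crash dates purely for illustration. The EWMA recursion is mapped onto pandas `ewm` with `alpha = 1 - lam` and `adjust=False`, which seeds the recursion with the first squared return.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns with a high-volatility block (assumption: the true
# volatility is 1% per day, jumping to 5% between 2020-02-20 and 2020-04-30).
dates = pd.bdate_range("2019-06-03", "2020-12-31")
rng = np.random.default_rng(3)
true_sigma = pd.Series(0.01, index=dates)
true_sigma.loc["2020-02-20":"2020-04-30"] = 0.05
r = pd.Series(rng.normal(0.0, 1.0, len(dates)) * true_sigma.to_numpy(), index=dates)

# Rolling 20-day volatility: equal weights, sample standard deviation.
rolling_vol = r.rolling(20).std()

# EWMA variance: sigma_t^2 = (1 - lam) * r_t^2 + lam * sigma_{t-1}^2.
# With adjust=False, pandas computes y_t = (1 - alpha) * y_{t-1} + alpha * x_t,
# so alpha = 1 - lam reproduces the recursion on squared returns.
lam = 0.94
ewma_var = r.pow(2).ewm(alpha=1 - lam, adjust=False).mean()
ewma_vol = np.sqrt(ewma_var)

# Side-by-side frame, ready for plotting on one chart.
estimates = pd.DataFrame({"rolling_20d": rolling_vol, "ewma_0.94": ewma_vol})
```

Note that the rolling estimate subtracts the window mean while the EWMA recursion uses raw squared returns, which implicitly assumes a zero mean return. Also, `ewm(span=20)` would imply `alpha = 2/21`, which is a different decay from $\lambda = 0.94$.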

- Plot both on the same chart
- Analyze their effectiveness:
  - Which is smoother?
  - Which estimator reacts faster to sudden changes in volatility?
  - Which one would you choose for estimating risk in fast changing markets?

- Compute the 60th percentile of EWMA volatility
- Shade periods where EWMA volatility exceeds this level
- **Bonus:** A good metric of a volatility estimator is whether it co-moves with the returns. That is, $Var(z_t)\approx 1$ where $z_t = r_t/\hat\sigma_t$. Test this out.
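
The last three steps could be sketched as follows, again on synthetic returns (an assumption). Two choices here are assumptions rather than part of the task statement: the standardized return uses the previous day's estimate $\hat\sigma_{t-1}$ so that $r_t$ is not divided by a quantity that already contains it, and the first 60 observations are dropped as a burn-in for the recursion.

```python
import numpy as np
import pandas as pd

# Synthetic constant-volatility returns (assumption: stands in for real data).
dates = pd.bdate_range("2015-01-01", "2023-12-29")
rng = np.random.default_rng(4)
r = pd.Series(rng.normal(0.0, 0.015, len(dates)), index=dates)

lam = 0.94
ewma_vol = np.sqrt(r.pow(2).ewm(alpha=1 - lam, adjust=False).mean())

# 60th percentile of EWMA volatility and a mask of the days above it.
threshold = ewma_vol.quantile(0.60)
high_vol = ewma_vol > threshold

# Turn the mask into contiguous (start, end) date spans, which can be passed
# one by one to a plotting call such as matplotlib's Axes.axvspan.
run_id = high_vol.ne(high_vol.shift()).cumsum()
spans = [
    (g.index[0], g.index[-1])
    for _, g in high_vol[high_vol].groupby(run_id[high_vol])
]

# Standardized returns using the lagged estimate, after a 60-day burn-in.
z = (r / ewma_vol.shift(1)).iloc[60:]
z_var = z.var()
print(f"threshold={threshold:.4f}, spans={len(spans)}, Var(z)={z_var:.3f}")
```

A value of `z_var` well above 1 suggests the estimator understates risk, and a value well below 1 suggests it overstates it.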

## Task 5 - Smart investing
- How many RTX 4090s could you afford *today* if you invested $\$1000$ in NVIDIA stock when you were born?