# Exercises for Lecture 11 - Time Series Analysis

## 1. Exploring Data Import and Summary Statistics
Using a dataset of your choice, import it and display its summary statistics, focusing on the range, mean, and standard deviation of each variable. Then, use `.describe()` to validate your calculations.

## 2. Plotting High-Frequency Data
Download a dataset with high-frequency financial data (e.g., minute-by-minute or tick data for a stock). Create a time series plot of the data, focusing on a specific trading day. Discuss any patterns or anomalies that you observe.

## 3. Calculating Daily Returns
For each stock in a dataset of end-of-day prices, calculate the daily absolute and percentage changes. Identify which stock shows the highest daily return and which shows the lowest, then plot these daily returns to analyze volatility over time.

## 4. Exploring Logarithmic Returns and Cumulative Returns
Using a single stock from your dataset, calculate the daily logarithmic returns. Then, compute cumulative returns by summing these log returns and apply the exponential function to the cumulative sum. Plot both the daily and cumulative returns.

## 5. Applying Rolling Statistics
Select a financial time series (e.g., stock prices or exchange rates). Apply a 30-day rolling window to calculate the minimum, maximum, mean, and standard deviation. Plot these rolling statistics along with the original time series to observe how they capture short-term trends.

## 6. Exponentially Weighted Moving Average (EWMA) Analysis
Calculate the 20-day exponentially weighted moving average (EWMA) of a financial time series (for example with `span=20`). Then experiment with different `halflife` values (e.g., 0.1, 0.5, and 1.0, measured in periods) and compare the results. Explain how changes in `halflife` affect the responsiveness of the EWMA to recent data.

## 7. Correlation Between Two Stocks
Choose two stocks that are known to have a strong correlation (positive or negative). Calculate the Pearson correlation coefficient and create a scatter plot of their daily returns. Overlay the best-fit line using linear regression. Comment on the relationship you observe.

## 8. Rolling Correlation
For the same two stocks from the previous exercise, calculate their rolling 50-day correlation. Plot the rolling correlation and interpret the results, especially in terms of how the relationship between the two stocks changes over time.

## 9. Resampling to Different Time Intervals
Using high-frequency data, resample it to 10-minute, hourly, and daily intervals. For each resampled dataset, plot the "close" prices and compare the visual differences. Discuss how resampling affects the amount of detail in the data.

## 10. Analyzing SMA Crossovers for Trading Strategy
Using a financial time series, calculate two simple moving averages (SMAs), one with a 30-day window and the other with a 100-day window. Plot both SMAs along with the original time series, and identify the points where the short-term SMA crosses the long-term SMA. Develop a basic trading strategy based on these crossover points and calculate the theoretical returns from the strategy.

---

# Solutions

## Solution 1
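Call the `.describe()` method on the imported dataset to display summary statistics. Then calculate the mean, range (maximum minus minimum), and standard deviation manually for comparison. Note that `.describe()` reports `min` and `max` rather than the range itself, and that both `.describe()` and `.std()` use the sample standard deviation (`ddof=1`).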
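The comparison can be sketched as follows. This is a minimal illustration on synthetic data; the DataFrame `data` and the column names `AAA` and `BBB` are hypothetical stand-ins for whatever dataset you import.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for an imported dataset (column names are hypothetical).
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.normal(loc=[100.0, 50.0], scale=[5.0, 2.0], size=(200, 2)),
    columns=["AAA", "BBB"],
)

summary = data.describe()

# Manual versions of the same statistics; .std() defaults to ddof=1,
# which matches what .describe() reports.
manual = pd.DataFrame({
    "mean": data.mean(),
    "std": data.std(),
    "range": data.max() - data.min(),
})
print(manual)
```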

## Solution 2
Select a single trading day from the high-frequency data and plot it using `.plot()`. Expect rapid fluctuations within the day, and look for anomalies such as sudden jumps or drops.

## Solution 3
Using `.diff()` and `.pct_change()` on each stock's prices, compute absolute and percentage daily changes. Plot these changes to visualize volatility, identifying the highest and lowest returns.
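A minimal sketch of these calculations, assuming a DataFrame of end-of-day prices with one column per stock. The prices are synthetic and the tickers `AAA` and `BBB` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic end-of-day prices for two hypothetical tickers.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=250)
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(250, 2)), axis=0)),
    index=dates,
    columns=["AAA", "BBB"],
)

abs_change = prices.diff()        # price_t - price_{t-1}
pct_change = prices.pct_change()  # price_t / price_{t-1} - 1

# Largest and smallest single-day percentage return per stock.
best_day = pct_change.max()
worst_day = pct_change.min()

# Stock with the highest and the lowest single-day return overall.
top_stock = best_day.idxmax()
bottom_stock = worst_day.idxmin()
print(top_stock, bottom_stock)
```

Calling `pct_change.plot(subplots=True)` on the result is one way to inspect how volatility varies over time.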

## Solution 4
Calculate log returns with `np.log(data / data.shift(1))`. Because log returns are additive over time, cumulative returns can be calculated with `.cumsum()` followed by `np.exp()`, which gives the growth of one unit invested at the start. Plot both the daily and cumulative returns to see long-term growth trends.
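As a rough sketch on a synthetic price series (the variable names are illustrative assumptions), the cumulative return recovered this way should equal the ratio of the last price to the first:

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices for a single hypothetical stock.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2023-01-02", periods=250)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))), index=dates)

log_ret = np.log(price / price.shift(1))  # first value is NaN
cum_ret = np.exp(log_ret.cumsum())        # growth of 1 unit invested at the first close

# Sanity check: the final cumulative return equals last price / first price.
print(cum_ret.iloc[-1], price.iloc[-1] / price.iloc[0])
```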

## Solution 5
Apply `.rolling(window=30)` with aggregation functions (`.min()`, `.max()`, `.mean()`, `.std()`). Plot the rolling statistics and original series to observe trend lines.
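One possible way to collect the rolling statistics next to the original series, shown here on synthetic prices (all names are illustrative):

```python
import numpy as np
import pandas as pd

# Synthetic daily prices standing in for a real financial series.
rng = np.random.default_rng(2)
dates = pd.bdate_range("2023-01-02", periods=250)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))), index=dates)

roll = price.rolling(window=30)
stats = pd.DataFrame({
    "price": price,
    "min": roll.min(),
    "max": roll.max(),
    "mean": roll.mean(),
    "std": roll.std(),
})

# The first 29 rows of the rolling columns are NaN because the window is not yet full.
print(stats.dropna().head())
```

`stats[["price", "min", "max", "mean"]].plot()` overlays the level-based statistics on the prices. The rolling standard deviation lives on a different scale and is easier to read on its own axis.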

## Solution 6
Using `.ewm(halflife=x).mean()`, test different `halflife` values to see their impact on the smoothness of the EWMA line in relation to the time series data.
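The sketch below compares the three suggested `halflife` values on synthetic prices. pandas converts a half-life to a smoothing factor via `alpha = 1 - exp(-ln(2) / halflife)`, so a shorter half-life puts more weight on the latest observation. The average gap between the EWMA and the price is used here only as a rough, illustrative measure of how closely the smoothed line tracks the data.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices.
rng = np.random.default_rng(3)
dates = pd.bdate_range("2023-01-02", periods=250)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))), index=dates)

halflives = (0.1, 0.5, 1.0)
ewma = pd.DataFrame({f"halflife={h}": price.ewm(halflife=h).mean() for h in halflives})

# Smoothing factor implied by each half-life.
alphas = {h: 1 - np.exp(-np.log(2) / h) for h in halflives}

# Mean absolute distance between each EWMA and the price:
# a larger half-life reacts more slowly, so the gap grows.
gap = ewma.sub(price, axis=0).abs().mean()
print(alphas)
print(gap)
```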

## Solution 7
Use `.corr()` on the two daily return series to find the Pearson correlation, and `np.polyfit()` with degree 1 to fit a line through the scatter plot. The sign of the correlation matches the slope of the best-fit line, while its magnitude shows in how tightly the points cluster around that line.
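A small sketch of the calculation, using two synthetic return series that share a common factor (the tickers and the factor construction are assumptions made for illustration):

```python
import numpy as np
import pandas as pd

# Two synthetic daily return series driven by a shared factor.
rng = np.random.default_rng(4)
n = 250
common = rng.normal(0, 0.01, n)
ret = pd.DataFrame({
    "AAA": common + rng.normal(0, 0.005, n),
    "BBB": 0.8 * common + rng.normal(0, 0.005, n),
})

rho = ret["AAA"].corr(ret["BBB"])  # Pearson correlation by default

# Degree-1 least-squares fit: BBB ~ slope * AAA + intercept.
slope, intercept = np.polyfit(ret["AAA"], ret["BBB"], deg=1)
fitted = slope * ret["AAA"] + intercept
print(rho, slope, intercept)
```

For a simple linear regression the slope equals the correlation scaled by the ratio of standard deviations, `rho * std(BBB) / std(AAA)`, which is why the two always share a sign.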

## Solution 8
Calculate the rolling correlation by calling `.rolling(window=50).corr()` on one stock's returns and passing the other stock's returns as the argument, then plot it. Look for periods of higher and lower correlation, especially during market events.
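A minimal sketch with two synthetic return series (names are hypothetical):

```python
import numpy as np
import pandas as pd

# Two synthetic daily return series sharing a common factor.
rng = np.random.default_rng(5)
n = 500
dates = pd.bdate_range("2022-01-03", periods=n)
common = rng.normal(0, 0.01, n)
a = pd.Series(common + rng.normal(0, 0.005, n), index=dates)
b = pd.Series(0.8 * common + rng.normal(0, 0.005, n), index=dates)

# 50-day rolling Pearson correlation; the first 49 values are NaN.
roll_corr = a.rolling(window=50).corr(b)
print(roll_corr.dropna().describe())
```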

## Solution 9
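Apply `.resample()` with different rules (`'10min'`, `'h'`, `'D'`), take the last "close" in each interval, and plot. Recent pandas releases deprecate the uppercase `'H'` alias in favor of `'h'`. Notice how shorter intervals retain more detail, while longer intervals smooth the data and show overall trends.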
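The following sketch resamples one synthetic trading session of minute-level closes. The session times and the series name are assumptions, and `'60min'` is used for the hourly rule only because that spelling is accepted across pandas versions.

```python
import numpy as np
import pandas as pd

# One synthetic session of minute-level closes: 09:30 to 15:59.
rng = np.random.default_rng(6)
idx = pd.date_range("2024-01-02 09:30", periods=390, freq="min")
close = pd.Series(100 + np.cumsum(rng.normal(0, 0.05, 390)), index=idx, name="close")

# Keep the last close observed inside each interval.
resampled = {rule: close.resample(rule).last() for rule in ("10min", "60min", "1D")}

for rule, series in resampled.items():
    print(rule, len(series))
```

The number of points drops from 390 to 39, then 7, then 1, which makes the loss of intraday detail concrete.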

## Solution 10
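Calculate the SMAs and use `np.where()` to set the trading position based on SMA crossovers (for example, long when the 30-day SMA is above the 100-day SMA and short otherwise). Backtest the strategy by multiplying the previous day's position, obtained with `.shift(1)`, by the daily log returns, then sum the results and apply `np.exp()` to obtain the strategy's cumulative return.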
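One common way to backtest such a rule is sketched below on synthetic prices. The long/short encoding (`1` / `-1`) and the absence of transaction costs are simplifying assumptions, not part of the exercise statement.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices.
rng = np.random.default_rng(7)
dates = pd.bdate_range("2020-01-01", periods=750)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 750))), index=dates)

df = pd.DataFrame({"price": price})
df["sma_short"] = df["price"].rolling(30).mean()
df["sma_long"] = df["price"].rolling(100).mean()
df = df.dropna().copy()  # keep only rows where both SMAs exist

# Long (1) while the short SMA is above the long SMA, otherwise short (-1).
df["position"] = np.where(df["sma_short"] > df["sma_long"], 1, -1)

# Dates where the position flips, i.e. the crossover points.
crossovers = df.index[df["position"].diff().fillna(0) != 0]

# Yesterday's position earns today's log return (avoids look-ahead bias).
df["log_ret"] = np.log(df["price"] / df["price"].shift(1))
df["strategy"] = df["position"].shift(1) * df["log_ret"]

# Gross cumulative return of buy-and-hold vs. the strategy.
total = np.exp(df[["log_ret", "strategy"]].sum())
print(len(crossovers), total)
```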

---

# Advanced Exercises for Lecture 11 - Time Series Analysis

## 1. Advanced Data Import and Analysis
Using a dataset with financial time series data, write a function to dynamically load, clean, and standardize the data (i.e., scale each series to a mean of 0 and a standard deviation of 1). Additionally, identify any outliers using a robust method like the Interquartile Range (IQR) method, and visualize the standardized series with outliers marked. Discuss how outliers affect the analysis.

## 2. Multi-Frequency Aggregation Analysis
Choose a high-frequency dataset (e.g., tick or minute-level data) and resample it into multiple frequencies: 5-minute, 30-minute, and daily intervals. For each frequency, apply aggregation functions (mean, median, min, max, and standard deviation). Compare the results, discussing how different time intervals affect the summary statistics and what insights can be gained from this multi-frequency analysis.

## 3. Multi-Period Returns and Autocorrelation
For a single stock, calculate returns over multiple periods, including daily, weekly, and monthly. For each frequency, compute the lag-1 autocorrelation coefficient and interpret its implications for potential autocorrelation in the returns. Then, perform a Ljung-Box test to determine the presence of autocorrelation over a range of lags (e.g., 1 to 10). Explain the implications of your findings for short-term vs. long-term trading.

## 4. Rolling Window with Advanced Metrics
Using a time series dataset of daily returns, calculate and plot a 60-day rolling Sharpe ratio. Additionally, plot a rolling beta against a relevant benchmark (e.g., S&P 500 for stocks). Compare the two rolling statistics, analyzing periods when the Sharpe ratio and beta exhibit significant changes, and discuss the implications for investment risk and performance.

## 5. Exponentially Weighted Moving Variance
For a given time series, calculate the exponentially weighted moving variance using an exponentially weighted moving function. Adjust the halflife parameter to 0.2, 0.5, and 1.0 and compare how each value impacts the responsiveness of the variance. Plot the results and discuss how the changing halflife affects your view of risk in the time series.

## 6. Dynamic Correlation Analysis
Choose two correlated financial time series (e.g., a stock index and its volatility index) and calculate the rolling 252-day correlation. Then, segment the data into two periods (e.g., a period of low volatility and a period of high volatility) and calculate the average correlation during each period. Use these findings to hypothesize about how correlations between assets might change during different market conditions.

## 7. High-Frequency Data Resampling and Market Impact Analysis
Using high-frequency tick data, resample it to both minute-level and 10-minute-level intervals. Calculate the mid-price at each interval, then compute the percentage change in mid-price across intervals. Identify intervals with unusually high percentage changes and investigate whether these changes coincide with higher trading volume. Discuss potential explanations for these spikes in volatility and their implications for market microstructure.

## 8. Multiple Exponential Moving Averages and Trend Analysis
For a chosen stock or index, compute multiple exponential moving averages (EMAs) with different spans (e.g., 10-day, 30-day, 90-day) and overlay them on the original time series. Identify crossover points between shorter and longer EMAs, then analyze the time between crossovers and their connection to major trends. Create a function to dynamically flag crossovers and discuss how these signals could be used in a trading strategy.

## 9. Rolling Window PCA for Dimensionality Reduction
Use a dataset with multiple time series (e.g., prices of different stocks in a sector). Implement a rolling 60-day Principal Component Analysis (PCA) to reduce the data to its first two principal components. Visualize the time series data projected onto these components and explain any observable patterns or clusters. Discuss how dimensionality reduction could assist in understanding sector-wide trends.

## 10. Volatility Estimation and Forecasting
For a single time series (e.g., daily returns), estimate its historical volatility using a 252-day rolling standard deviation. Then, use a GARCH(1,1) model to forecast future volatility based on recent data. Compare the GARCH model’s forecast with the rolling standard deviation and explain situations where one might be more effective than the other for volatility estimation.

---

# Solutions

## Solution 1
Write a function that loads the data, cleans it, standardizes each series, and flags outliers using the IQR method (conventionally, values below `Q1 - 1.5 * IQR` or above `Q3 + 1.5 * IQR`). Visualize the outliers in a time series plot by marking the points that fall outside these thresholds.
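A possible shape for such a function is sketched here. It takes an already-loaded pandas Series rather than a file path, and the function name `standardize_and_flag`, the multiplier `k=1.5`, and the injected spike are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

def standardize_and_flag(series, k=1.5):
    """Clean a series, scale it to mean 0 / std 1, and flag IQR outliers."""
    # Cleaning: drop duplicate timestamps, sort by time, drop missing values.
    clean = series[~series.index.duplicated()].sort_index().dropna()
    z = (clean - clean.mean()) / clean.std()
    q1, q3 = z.quantile(0.25), z.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (z < q1 - k * iqr) | (z > q3 + k * iqr)
    return z, is_outlier

# Synthetic daily returns with one artificially injected spike.
rng = np.random.default_rng(8)
dates = pd.bdate_range("2023-01-02", periods=250)
returns = pd.Series(rng.normal(0, 0.01, 250), index=dates)
returns.iloc[100] = 0.20

z, is_outlier = standardize_and_flag(returns)
print(z[is_outlier])
```

Because the mean and standard deviation are themselves sensitive to extreme values, a single spike like this one visibly shrinks the standardized values of every other observation, which is one way outliers distort the analysis.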

## Solution 2
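Use `.resample()` with different rules (`'5min'`, `'30min'`, `'D'`) and apply aggregations (mean, median, min, max, std). Recent pandas releases deprecate the older `'T'` minute alias (as in `'5T'`) in favor of `'min'`. Observe how the within-interval variability changes with the frequency, and consider what that implies for high-frequency vs. low-frequency trading strategies.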
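A short sketch of the aggregation step on one synthetic session of minute data (the data and names are assumptions):

```python
import numpy as np
import pandas as pd

# One synthetic session of minute-level prices: 09:30 to 15:59.
rng = np.random.default_rng(9)
idx = pd.date_range("2024-01-02 09:30", periods=390, freq="min")
price = pd.Series(100 + np.cumsum(rng.normal(0, 0.05, 390)), index=idx)

aggs = ["mean", "median", "min", "max", "std"]
summary = {rule: price.resample(rule).agg(aggs) for rule in ("5min", "30min", "1D")}

for rule, table in summary.items():
    # Average within-interval standard deviation at each frequency.
    print(rule, table.shape, table["std"].mean())
```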

## Solution 3
Calculate daily, weekly, and monthly returns, then compute the lag-1 autocorrelation at each frequency, either with `.shift()` and `.corr()` or with `.autocorr(lag=1)`. Conduct Ljung-Box tests using `statsmodels.stats.diagnostic.acorr_ljungbox` to test whether autocorrelation is significant over a range of lags.
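The sketch below computes the lag-1 autocorrelation at three frequencies on synthetic prices. To stay dependency-free, it also includes a hand-rolled Ljung-Box Q statistic (the helper `ljung_box_q` is hypothetical and stands in for the `statsmodels` call; it returns only the statistic, not a p-value). Under the null of no autocorrelation, Q is approximately chi-square distributed with `max_lag` degrees of freedom, so for 10 lags the 5% critical value is about 18.31.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices covering roughly three years.
rng = np.random.default_rng(10)
dates = pd.bdate_range("2020-01-01", periods=756)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 756))), index=dates)

# Last price per calendar week / month, via period grouping.
closes = {
    "daily": price,
    "weekly": price.groupby(price.index.to_period("W")).last(),
    "monthly": price.groupby(price.index.to_period("M")).last(),
}
returns = {name: s.pct_change().dropna() for name, s in closes.items()}
lag1 = {name: r.autocorr(lag=1) for name, r in returns.items()}

def ljung_box_q(x, max_lag=10):
    """Ljung-Box Q = n(n+2) * sum_k rho_k^2 / (n - k), for k = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.sum(d ** 2)
    lags = np.arange(1, max_lag + 1)
    rho = np.array([np.sum(d[k:] * d[:-k]) / denom for k in lags])
    return n * (n + 2) * np.sum(rho ** 2 / (n - lags))

q_daily = ljung_box_q(returns["daily"], max_lag=10)
print(lag1, q_daily)
```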

## Solution 4
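Calculate the 60-day rolling Sharpe ratio as the rolling mean of (excess) returns divided by their rolling standard deviation, optionally annualized, and compute the rolling beta as the rolling covariance of the asset with the benchmark divided by the rolling variance of the benchmark (`.cov() / .var()`). Interpret periods with high Sharpe ratios and analyze beta for insight into systematic risk vs. reward.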
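A minimal sketch under stated assumptions: synthetic daily returns, a zero risk-free rate, and annualization by the square root of 252 trading days. The asset is constructed with a true beta of 1.2 so the rolling estimate has something to recover.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: a benchmark and an asset with true beta 1.2.
rng = np.random.default_rng(11)
dates = pd.bdate_range("2021-01-04", periods=500)
bench = pd.Series(rng.normal(0.0004, 0.01, 500), index=dates)
asset = 1.2 * bench + pd.Series(rng.normal(0, 0.005, 500), index=dates)

window = 60
roll = asset.rolling(window)

# Rolling Sharpe ratio (risk-free rate assumed to be zero), annualized.
sharpe = np.sqrt(252) * roll.mean() / roll.std()

# Rolling beta: cov(asset, benchmark) / var(benchmark) over the same window.
beta = roll.cov(bench) / bench.rolling(window).var()

print(sharpe.dropna().describe())
print(beta.dropna().describe())
```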

## Solution 5
Apply `.ewm(halflife=h).var()` with `h` set to 0.2, 0.5, and 1.0. Lower halflife values increase responsiveness to new data points, so the risk estimate spikes more often for fast-moving time series.
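The sketch below applies the three half-lives to a synthetic return series that switches from a calm regime to a stressed one halfway through. The regime construction is an assumption made so the change in estimated variance is easy to see.

```python
import numpy as np
import pandas as pd

# Synthetic returns: 250 calm days followed by 250 stressed days.
rng = np.random.default_rng(12)
calm = rng.normal(0, 0.005, 250)
stressed = rng.normal(0, 0.02, 250)
ret = pd.Series(
    np.concatenate([calm, stressed]),
    index=pd.bdate_range("2022-01-03", periods=500),
)

ew_var = pd.DataFrame(
    {f"halflife={h}": ret.ewm(halflife=h).var() for h in (0.2, 0.5, 1.0)}
)

# Average estimated variance in each regime, per half-life.
regime_means = pd.DataFrame({
    "calm": ew_var.iloc[:250].mean(),
    "stressed": ew_var.iloc[250:].mean(),
})
print(regime_means)
```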

## Solution 6
Calculate 252-day rolling correlations and analyze average correlation during pre-defined volatility periods. Hypothesize about potential breakdowns in correlation under stress and periods of high vs. low volatility.

## Solution 7
Calculate mid-price and percentage change for minute and 10-minute intervals using `.resample()` and `.pct_change()`. Identify high-change intervals and check against volume data to assess possible causes like order imbalance or news events.
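A rough sketch of this workflow on synthetic tick data. The column names `bid`, `ask`, and `volume`, the helper `make_bars`, and the 3-standard-deviation threshold for "unusual" moves are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic ticks at irregular times within one 6.5-hour session.
rng = np.random.default_rng(13)
n = 5000
seconds = np.sort(rng.uniform(0, 6.5 * 3600, n))
idx = pd.Timestamp("2024-01-02 09:30") + pd.to_timedelta(seconds, unit="s")
true_mid = 100 + np.cumsum(rng.normal(0, 0.01, n))
ticks = pd.DataFrame(
    {"bid": true_mid - 0.01, "ask": true_mid + 0.01, "volume": rng.integers(1, 500, n)},
    index=idx,
)
ticks["mid"] = (ticks["bid"] + ticks["ask"]) / 2

def make_bars(rule):
    """Resample ticks to bars with last mid-price, total volume, and a spike flag."""
    bars = pd.DataFrame({
        "mid": ticks["mid"].resample(rule).last().ffill(),
        "volume": ticks["volume"].resample(rule).sum(),
    })
    bars["pct_change"] = bars["mid"].pct_change()
    z = (bars["pct_change"] - bars["pct_change"].mean()) / bars["pct_change"].std()
    bars["unusual"] = z.abs() > 3  # assumed threshold for an unusually large move
    return bars

one_min = make_bars("1min")
ten_min = make_bars("10min")

# Compare average volume in flagged vs. unflagged intervals.
print(one_min.groupby("unusual")["volume"].mean())
```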

## Solution 8
Compute multiple EMAs with `.ewm(span=n)` and create a function to flag crossover points. Observe time between crossovers and their relation to longer-term trends. Discuss how crossovers may indicate momentum shifts or trend changes.
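One way such a flagging function might look is sketched below. The name `flag_crossovers`, the `+1` / `-1` encoding, and the use of `adjust=False` (the recursive EMA form) are assumptions, and the V-shaped price path is synthetic, chosen so that exactly one upward crossover occurs.

```python
import numpy as np
import pandas as pd

def flag_crossovers(price, short_span, long_span):
    """Return both EMAs plus a flag: +1 when the short EMA crosses above, -1 when below."""
    ema_short = price.ewm(span=short_span, adjust=False).mean()
    ema_long = price.ewm(span=long_span, adjust=False).mean()
    above = (ema_short > ema_long).astype(int)
    cross = above.diff().fillna(0).astype(int)
    return pd.DataFrame({"ema_short": ema_short, "ema_long": ema_long, "cross": cross})

# Synthetic V-shaped price path: a steady decline followed by a steady rise.
dates = pd.bdate_range("2023-01-02", periods=200)
price = pd.Series(
    np.r_[np.linspace(100, 50, 100), np.linspace(50, 100, 100)], index=dates
)

out = flag_crossovers(price, short_span=10, long_span=30)
events = out.index[out["cross"] != 0]

# Time elapsed between consecutive crossovers (empty when there is only one).
gaps = events.to_series().diff().dropna()
print(events, gaps)
```

Running the same function over several span pairs, such as (10, 30) and (30, 90), gives the crossover dates needed to study the time between signals.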

## Solution 9
Implement PCA with `sklearn.decomposition.PCA` on a rolling basis for a set of time series. Project data onto the first two components, analyzing sector trends or anomalies over time.

## Solution 10
Calculate a 252-day rolling standard deviation as a volatility estimate, then fit a GARCH(1,1) model using the `arch` library to forecast volatility. Discuss GARCH’s strength in reacting to recent volatility patterns vs. the rolling standard deviation’s equal weighting of every observation in its window, which makes it slower to respond.

---