# Python Assignment: Time Series Trends and Seasonal Decomposition

In this assignment you will visualize underlying trends, identify seasonal patterns, and formally decompose a time series into its core components. You will work with synthetic data, so each component is known in advance and easy to observe. Separating these components is a fundamental step before more advanced forecasting.

## Part 1: Data Generation and Initial Visualization (30 points)

We'll start by creating a synthetic time series dataset that exhibits clear trend and seasonal patterns, along with some noise.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
import warnings

warnings.filterwarnings("ignore") # Suppress warnings for cleaner output
np.random.seed(42) # for reproducibility

# 1.1 Generate Synthetic Time Series Data
#    Create a Pandas Series `ts_data` with a DatetimeIndex.
#    The series should represent at least 3 years of daily data (the template below uses about 4 years, starting from '2020-01-01').
#    The data should incorporate:
#    - **A clear increasing linear trend.**
#    - **Annual seasonality:** A sinusoidal pattern repeating yearly (e.g., peaks in summer, troughs in winter).
#    - **Weekly seasonality (optional, for extra challenge):** A smaller sinusoidal pattern repeating weekly.
#    - **Random noise.**
#    Make sure the seasonal amplitudes are significant enough to be visible.

n_days = 365 * 4 # about 4 years of daily data (2020 is a leap year, so the range ends on 2023-12-30)
dates = pd.date_range(start='2020-01-01', periods=n_days, freq='D')

# TODO: Implement the trend component (e.g., a linear increase over time)
trend = None # Replace None with your trend component

# TODO: Implement annual seasonality (sinusoidal, e.g., using np.sin or np.cos based on day of year)
# Hint: day_of_year = dates.dayofyear
annual_seasonality = None # Replace None with your annual seasonality component

# TODO: Implement weekly seasonality (optional, using np.sin or np.cos based on day of week)
# Hint: day_of_week = dates.dayofweek
weekly_seasonality = 0 # Replace 0 with your weekly seasonality component (leave as 0 to skip this part)

# Random noise
noise = np.random.normal(loc=0, scale=5, size=n_days)

# Combine components
value = trend + annual_seasonality + weekly_seasonality + noise

ts_data = pd.Series(value, index=dates)

print("Time Series Data Head:\n", ts_data.head())
print("Time Series Data Info:")
ts_data.info()

# 1.2 Plot the Raw Time Series
#    Create a line plot of your `ts_data`.
#    Ensure the plot has a clear title, x-axis (Date), and y-axis (Value) labels.

plt.figure(figsize=(15, 6))
# TODO: Plot the time series
# plt.plot(ts_data)
# plt.title("Synthetic Time Series Data")
# plt.xlabel("Date")
# plt.ylabel("Value")
# plt.grid(True)
# plt.show()
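
As a rough illustration of how the components described in 1.1 can be combined, here is a minimal sketch. The slope, amplitudes, noise scale, and names such as `example_series` are assumptions chosen for illustration, not required values for your submission.

```python
import numpy as np
import pandas as pd

# Illustrative values only: slope, amplitudes, and noise scale are assumptions.
rng = np.random.default_rng(0)
n_days = 365 * 4
dates = pd.date_range(start="2020-01-01", periods=n_days, freq="D")
t = np.arange(n_days)

# Linear trend: starts at 50 and rises by 0.05 per day.
trend = 50 + 0.05 * t

# Annual cycle: -cos puts the trough near 1 January and the peak near 1 July.
annual = -20 * np.cos(2 * np.pi * dates.dayofyear.to_numpy() / 365.25)

# Weekly cycle with a smaller amplitude (dayofweek: Monday=0 ... Sunday=6).
weekly = 3 * np.sin(2 * np.pi * dates.dayofweek.to_numpy() / 7)

# Gaussian noise on top of the deterministic components.
noise = rng.normal(loc=0, scale=5, size=n_days)

example_series = pd.Series(trend + annual + weekly + noise, index=dates)
```

Keeping the annual amplitude (20) well above the noise scale (5) is what makes the seasonal pattern visible in a raw line plot. A smaller ratio would hide it.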


## Part 2: Trend and Seasonality Visualization (25 points)

This section focuses on more direct visualization methods to reveal trend and seasonality before formal decomposition.
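
To make the idea of smoothing with a moving average concrete, here is a small sketch on a noise-free toy series. The series `toy` and the window of 31 days are assumptions for illustration. It contrasts the default trailing window of `Series.rolling` with a centered one.

```python
import numpy as np
import pandas as pd

# Toy series: a pure straight line, so the effect of the window is easy to see.
dates = pd.date_range("2020-01-01", periods=120, freq="D")
toy = pd.Series(np.arange(120, dtype=float), index=dates)

# Trailing window: each value averages the current day and the 30 days before it.
trailing = toy.rolling(window=31).mean()

# Centered window: each value averages 15 days on either side of the current day.
centered = toy.rolling(window=31, center=True).mean()

# Compare both against the original line at the first fully covered positions.
print(toy.iloc[30], trailing.iloc[30])
print(toy.iloc[15], centered.iloc[15])
```

On a straight line the trailing mean lags the data by half the window, while the centered mean coincides with it. The price of centering is missing values at both ends of the series instead of only at the start.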

In [None]:
# 2.1 Visualize Trend using Moving Averages
#    Calculate and plot a rolling mean (moving average) for your `ts_data`.
#    Experiment with at least two different window sizes (e.g., 30 days for short-term, 90 or 180 days for longer-term).
#    Plot the original series and the moving averages on the same graph.

window_size_short = 30 # days
window_size_long = 90 # days

# TODO: Calculate rolling means
rolling_mean_short = None # TODO: replace None with the short-window rolling mean
rolling_mean_long = None # TODO: replace None with the long-window rolling mean

plt.figure(figsize=(15, 7))
# TODO: Plot original series and rolling means
# plt.plot(ts_data, label='Original Data', alpha=0.7)
# plt.plot(rolling_mean_short, label=f'Rolling Mean ({window_size_short} days)', color='orange')
# plt.plot(rolling_mean_long, label=f'Rolling Mean ({window_size_long} days)', color='red')
# plt.title("Time Series with Moving Averages")
# plt.xlabel("Date")
# plt.ylabel("Value")
# plt.legend()
# plt.grid(True)
# plt.show()

# 2.2 Visualize Seasonality using Box Plots (Monthly Seasonality)
#    Extract the month from the DatetimeIndex of your `ts_data`.
#    Create a box plot showing the distribution of values for each month across all years.
#    This helps visualize the annual seasonal pattern.

data_monthly = ts_data.copy()
data_monthly = data_monthly.to_frame(name='Value')
data_monthly['Month'] = data_monthly.index.month

plt.figure(figsize=(12, 7))
# TODO: Create box plot by month
# data_monthly.boxplot(column='Value', by='Month', ax=plt.gca())
# plt.title("Monthly Seasonality (Box Plot)")
# plt.suptitle('') # Suppress the default suptitle from boxplot
# plt.xlabel("Month")
# plt.ylabel("Value")
# plt.grid(axis='y')
# plt.show()

# 2.3 (Optional) Visualize Weekly Seasonality using Box Plots
#    If you included weekly seasonality, create a box plot by day of week, analogous to the monthly plot in 2.2.
#    If not, you can skip this or create another relevant seasonal plot if applicable.

if np.any(np.asarray(weekly_seasonality) != 0): # Only run if weekly seasonality was added (also works when it is the scalar 0)
    data_weekly = ts_data.copy()
    data_weekly = data_weekly.to_frame(name='Value')
    data_weekly['DayOfWeek'] = data_weekly.index.dayofweek # Monday=0, Sunday=6

    plt.figure(figsize=(10, 6))
    # TODO: Create box plot by day of week
    # data_weekly.boxplot(column='Value', by='DayOfWeek', ax=plt.gca())
    # plt.title("Weekly Seasonality (Box Plot)")
    # plt.suptitle('')
    # plt.xlabel("Day of Week (0=Monday, 6=Sunday)")
    # plt.ylabel("Value")
    # plt.grid(axis='y')
    # plt.show()


## Part 3: Time Series Decomposition (30 points)

Now, we will formally decompose the time series into its trend, seasonal, and residual components using `statsmodels`.

In [None]:
# 3.1 Perform Seasonal Decomposition
#    Use `statsmodels.tsa.seasonal.seasonal_decompose`.
#    Consider the `model` argument: 'additive' vs 'multiplicative'.
#    - **Explain the difference between additive and multiplicative models and when each is appropriate.**
#    - **Choose the appropriate model for your synthetic data and justify your choice.**
#    - Set the `period` argument appropriately for your annual seasonality (e.g., 365 for daily data with annual seasonality).

### Explanation: Additive vs. Multiplicative Decomposition
*(Write your explanation here: when to use additive, when to use multiplicative. Justify your choice for this dataset.)*
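
For reference while writing your explanation, the two model forms are usually written as follows. The symbols $y_t$ (observed value), $T_t$ (trend), $S_t$ (seasonal), and $R_t$ (residual) are notation introduced here, not part of the assignment code.

```latex
% Additive model: components are summed
y_t = T_t + S_t + R_t

% Multiplicative model: components are multiplied
y_t = T_t \times S_t \times R_t

% For a strictly positive series, taking logarithms turns the
% multiplicative model into an additive one
\log y_t = \log T_t + \log S_t + \log R_t
```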

decomposition_model = 'additive' # or 'multiplicative', choose based on your explanation
seasonal_period = 365 # use 7 instead to isolate the weekly pattern; seasonal_decompose handles only one period per call

print(f"Performing {decomposition_model} decomposition with period={seasonal_period}...")
# TODO: Perform decomposition
# result = seasonal_decompose(ts_data, model=decomposition_model, period=seasonal_period)


# 3.2 Plot the Decomposed Components
#    Use `result.plot()` to visualize the original, trend, seasonal, and residual components.
#    Ensure the plot is clearly displayed.

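# TODO: Plot the decomposition results
# Note: result.plot() creates and returns its own Figure, so resize that figure
# rather than calling plt.figure() first (which would leave an empty extra figure).
# fig = result.plot()
# fig.set_size_inches(12, 10)
# fig.suptitle(f"Time Series Decomposition ({decomposition_model} model)", y=1.02)
# fig.tight_layout(rect=[0, 0, 1, 0.98]) # Adjust layout to prevent title overlap
# plt.show()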


## Part 4: Interpretation and Reflection (15 points)

Analyze the results of your decomposition and reflect on its implications.

### Your Answers to Reflection Questions:

1.  **Based on the decomposed plots, describe the identified trend, seasonal, and residual components of your synthetic time series. Do they match your expectations from how you generated the data?**

    * **Trend:** _(Your description here)_
    * **Seasonality:** _(Your description here)_
    * **Residuals:** _(Your description here)_

2.  **How would you identify potential outliers or unusual events in a real-world time series using the residual component?**

    _(Your answer here)_

3.  **What are some limitations of this `seasonal_decompose` method (e.g., regarding non-linear trends, multiple seasonalities, or end effects)?**

    _(Your answer here)_

4.  **Suggest two more advanced time series analysis or forecasting techniques that could address some of the limitations mentioned above.**

    * **Technique 1:** _(Name and brief explanation)_
    * **Technique 2:** _(Name and brief explanation)_
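
As a starting point for question 2, the sketch below shows one common heuristic for flagging unusual residuals: a robust z-score based on the median and the median absolute deviation (MAD). It runs on simulated residuals, not on your decomposition. The series `resid`, the injected spike, and the 3.5 cutoff are assumptions for illustration. Other rules may suit your data better.

```python
import numpy as np
import pandas as pd

# Simulated residuals (assumption): unit-variance noise with one injected spike.
rng = np.random.default_rng(1)
dates = pd.date_range("2022-01-01", periods=200, freq="D")
resid = pd.Series(rng.normal(0, 1, size=200), index=dates)
resid.iloc[120] = 12.0  # injected anomaly

# Robust z-score: the median and MAD are less distorted by the outliers
# themselves than the mean and standard deviation would be.
median = resid.median()
mad = (resid - median).abs().median()
robust_z = 0.6745 * (resid - median) / mad

# Flag points whose robust z-score exceeds the chosen cutoff.
flagged = resid[robust_z.abs() > 3.5]
print(flagged)
```

With real decomposition output, drop the missing values at the ends of `result.resid` first (for example with `dropna()`), since the centered trend filter leaves them undefined.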


## Deliverables:

1.  This completed Jupyter Notebook (`time_series_decomposition_assignment.ipynb`) with all code cells executed and reflection questions answered.
2.  Ensure all plots are clearly visible and well-labeled within the notebook.