# Goals
Given time series data from a physical process, the student should be able to analyze and forecast the data. The student should also be able to evaluate the error of the forecasting method.

## Problem
Suppose we have a damped harmonic oscillator whose position is measured at regular time intervals $dt$.

The time points \( t \) are generated from 0 to \( T \) with a step size \( dt \):
$t = 0, \, dt, \, 2dt, \, \ldots, \, (n-1)dt$
where $n = \frac{T}{dt}$ is the number of time points.

The position \( x(t) \) of the damped harmonic oscillator is given by:
$x(t) = A e^{-bt} \cos(\omega t)$

where:
\begin{align*}
A &: \text{ the amplitude}, \\
b &: \text{ the damping coefficient}, \\
\omega &: \text{ the angular frequency}.
\end{align*}

The position data come from measurements, which by definition have limited precision. 
The measurement errors are modeled as Gaussian noise on top of the true positions. 
The noisy position data $x_{\text{noisy}}(t)$ is therefore obtained by adding Gaussian noise to the true position data:

$x_{\text{noisy}}(t) = x(t) + \epsilon(t)$

where $\epsilon(t) \sim \mathcal{N}(0, \sigma^2)$ is normally distributed noise with mean 0 and standard deviation $\sigma$.

The code in the cell below stores the noisy, time-dependent position of the oscillator and the corresponding time values in an Excel file.

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd

# Set the random seed for reproducibility
np.random.seed(42)

# Define parameters for the damped harmonic oscillator
A = 1.0    # Amplitude
b = 0.05   # Damping coefficient
omega = 2 * np.pi / 5  # Angular frequency
T = 100    # Total time
dt = 0.1   # Time step

# Generate time points
t = np.arange(0, T, dt)

# Generate the damped oscillation data
x = A * np.exp(-b * t) * np.cos(omega * t)

# Add some noise to the data
noise = np.random.normal(scale=0.1, size=len(t))
x_noisy = x + noise

# Create a DataFrame
df = pd.DataFrame({'Time': t, 'Position': x_noisy})

# Save the DataFrame to an Excel file, in a sheet named 'series'
df.to_excel('damped_oscillator.xlsx', sheet_name='series')

### Data importation and visualization
Import the Excel file and visualize the data with different techniques.
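To make clear what an autocorrelation plot shows, here is a minimal NumPy sketch of the standard sample autocorrelation estimator. It is not the `statsmodels` implementation. The helper name `sample_acf` is hypothetical, and the noise-free signal reuses the parameter values from the data-generation cell (assumed here: $A = 1$, $b = 0.05$, $\omega = 2\pi/5$, $dt = 0.1$).

```python
import numpy as np

def sample_acf(x, nlags=50):
    """Sample autocorrelation of x for lags 0..nlags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()              # remove the mean before correlating
    denom = np.dot(centered, centered)   # lag-0 sum of squares, used for normalization
    return np.array([
        np.dot(centered[: len(x) - k], centered[k:]) / denom
        for k in range(nlags + 1)
    ])

# Noise-free damped oscillator with the same parameters as the data-generation cell
t = np.arange(0, 100, 0.1)
x_clean = np.exp(-0.05 * t) * np.cos(2 * np.pi / 5 * t)

acf = sample_acf(x_clean, nlags=50)
print("lag 0:", acf[0])
print("lag 25 (half a period):", acf[25])
print("lag 50 (one full period):", acf[50])
```

With $dt = 0.1$ and an oscillation period of 5 time units, one period spans 50 samples, so the autocorrelation should be negative near lag 25 and positive again near lag 50. The same pattern is what to look for in the plot produced by `plot_acf` on the noisy data.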

In [None]:
# Exercise 1
# Load the Excel file damped_oscillator.xlsx into a pandas DataFrame


In [None]:
# Exercise 2 
# Plot time (x-axis) versus position (y-axis) using matplotlib pyplot. Add appropriate labels


In [None]:
# Exercise 3 
# Use the statsmodels.graphics.tsaplots module to:
# Plot the autocorrelation function (ACF) of the position data (for example, 50 lags)

# Plot the partial autocorrelation function (PACF) of the position data (for example, 50 lags)


### Modeling and forecasting
After basic plotting and visualization of the data, analyze the position data and forecast future values using the SARIMAX model.
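The parameter grid used for model selection can be sketched with `itertools.product`. The ranges follow the exercise text ($p, q \in \{1, 2\}$ and $d \in \{0, 1\}$). The seasonal period `s = 50` is an assumption, not something the notebook fixes: it is the number of samples in one oscillation period (period 5 divided by $dt = 0.1$).

```python
import itertools

# Non-seasonal orders: p and q take the values 1 or 2, d takes the values 0 or 1
p = q = range(1, 3)
d = range(0, 2)

# All (p, d, q) triplets
pdq = list(itertools.product(p, d, q))

# Assumed seasonal period in samples: oscillation period 5 / time step 0.1 = 50
s = 50

# All seasonal (P, D, Q, s) tuples, reusing the same ranges
seasonal_pdq = [(P, D, Q, s) for P, D, Q in itertools.product(p, d, q)]

print(len(pdq), "non-seasonal combinations, first:", pdq[0])
print(len(seasonal_pdq), "seasonal combinations, first:", seasonal_pdq[0])
```

Each pair of tuples would then be passed to `SARIMAX` as `order=` and `seasonal_order=`, and the fitted models compared through their `aic` attribute. With a seasonal period this long, each fit can be slow, so a reduced grid may be a sensible starting point.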

In [None]:
# Exercise 4 
# import itertools, and use it to generate different combinations of parameters for the SARIMAX model


# Define the p and q parameters to take the values 1 or 2, and the d parameter to take the values 0 or 1


# Generate all different combinations of p, d and q triplets


# Generate all different combinations of seasonal p, d and q triplets (i.e., P, D, Q)


In [None]:
# Exercise 5 
# Define the SARIMAX model and identify the optimal parameters according to the AIC criterion


import warnings
warnings.filterwarnings("ignore") # specify to ignore warning messages

# Identify the best model among the different combinations of pdq and seasonal_pdq



In [None]:
# Exercise 6 
# Print the best set of parameters according to the AIC


In [None]:
# Exercise 7 
# Fit the model once again using the optimal parameters, and print a summary of the results


In [None]:
# Exercise 8 
# Forecast future values for 50 steps


In [None]:
# Exercise 9 
# Plot the results, the predicted trend according to the model, and the forecast

# Plot the original data i.e. time (x-axis) vs position

# Plot the time (x-axis) vs predictions 

# Plot the forecast index (x-axis) vs forecast values




### Error analysis
Analyze the errors of the predictions.
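The quantities used in this section can be sketched on a small toy example. The helper name `error_summary` and the toy arrays are hypothetical. The degrees of freedom follow one common regression convention, assumed here: $n - 1$ for the total, $p$ for the regression, and $n - p - 1$ for the error. Other conventions exist.

```python
import numpy as np

def error_summary(observed, predicted, p):
    """Sums of squares, R^2, mean squares and F-score (hypothetical helper).

    p is the number of model parameters; the degrees of freedom used
    below are an assumed convention, not the only possible one.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = len(observed)
    mean_obs = observed.mean()

    sst = np.sum((observed - mean_obs) ** 2)   # total variation of the data
    sse = np.sum((observed - predicted) ** 2)  # variation left unexplained
    ssr = np.sum((predicted - mean_obs) ** 2)  # variation captured by the model
    r2 = 1.0 - sse / sst                       # coefficient of determination

    mst = sst / (n - 1)
    mse = sse / (n - p - 1)
    msr = ssr / p
    return {"SST": sst, "SSE": sse, "SSR": ssr, "R2": r2,
            "MST": mst, "MSE": mse, "MSR": msr, "F": msr / mse}

# Toy data, for illustration only
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.1, 1.9, 3.2, 3.8]
summary = error_summary(toy_observed, toy_predicted, p=1)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Note that $SST = SSR + SSE$ holds exactly only for least-squares fits with an intercept. For the toy data above, and in general for SARIMAX predictions, the identity is only approximate, which is why $R^2$ is computed here as $1 - SSE/SST$.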

In [None]:
# Exercise 10 
# Compute the:
#   Sum of Squares Total (SST),
#   Sum of Squares Error (SSE),
#   Sum of Squares Regression (SSR),
# and then the coefficient of determination (R²) 


# Print the results


In [None]:
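# Exercise 11 
# The degrees of freedom are determined by
# (observed and results are expected to come from the previous exercises):
n = len(observed)  # Number of observations
p = len(results.params)  # Number of estimated parameters in the SARIMAX model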

# Calculate the MST, MSE, MSR, and F-score


# Print the results


### Holt's linear model

While SARIMAX provides a good forecast of future points, other methods, such as Holt's linear method, do not work as well on this data. Let's verify that.
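To see why a linear-trend method struggles here, the recursion behind Holt's linear method can be written out by hand. This is a minimal sketch, not the `statsmodels` implementation: the helper name `holt_linear`, the fixed smoothing parameters `alpha` and `beta`, and the initialization are all assumptions (the `Holt` class in `statsmodels.tsa.holtwinters` estimates the smoothing parameters when fitting by default).

```python
import numpy as np

def holt_linear(y, alpha=0.8, beta=0.2, steps=50):
    """Holt's linear trend method with fixed smoothing parameters (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    level = y[0]            # assumed initial level
    trend = y[1] - y[0]     # assumed initial trend
    fitted = np.empty_like(y)
    fitted[0] = y[0]        # no earlier data to predict the first point from
    for i in range(1, len(y)):
        fitted[i] = level + trend                      # one-step-ahead prediction
        new_level = alpha * y[i] + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    # The h-step forecast is a straight line: last level plus h times last trend
    forecast = level + trend * np.arange(1, steps + 1)
    return fitted, forecast

# Sanity check on a purely linear series y = 2*i + 1, which the method tracks exactly
line = 2.0 * np.arange(10) + 1.0
fitted_line, forecast_line = holt_linear(line, steps=3)
print(forecast_line)
```

Because the forecast is always a straight line continuing from the last level and trend, it cannot reproduce the oscillation of the damped harmonic oscillator, whatever values the smoothing parameters take.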

In [None]:
# Exercise 12 
# Fit Holt's linear trend model to the same data


In [None]:
# Exercise 13
# Make in-sample predictions and forecasts for future points (50 steps)


In [None]:
# Exercise 14 
# Plot the original data, the predicted trend according to Holt's linear model, and the forecast for future points
