# Introduction to Time Series Analysis

Time series analysis is a family of techniques for analyzing data collected over time.
- The observations are usually recorded at equally spaced intervals, such as hourly, daily, weekly, or monthly.
- The goal is to identify patterns or trends in the data and use them to predict future values.
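
Whether an index really is equally spaced can be checked before any modeling. The snippet below is a minimal sketch on a small synthetic monthly series (the values and variable names are illustrative assumptions, not part of the milk production data): `pd.infer_freq` reports a frequency for a regular index and `None` once a month is missing, and `asfreq` makes the gap visible as a `NaN` row.

```python
import pandas as pd

# Synthetic, regularly spaced monthly index (illustrative data only)
idx = pd.date_range("2020-01-01", periods=6, freq="MS")
regular = pd.Series(range(6), index=idx, dtype="float64")

# Drop one month to simulate a gap in the record
irregular = regular.drop(idx[2])

regular_freq = pd.infer_freq(regular.index)      # frequency string for a regular index
irregular_freq = pd.infer_freq(irregular.index)  # None when the spacing is not regular

# Reindex onto the full monthly grid; the missing month shows up as NaN
filled = irregular.asfreq("MS")
print(regular_freq, irregular_freq)
print(filled)
```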

In [None]:
import pandas as pd
df = pd.read_csv('data/monthly_milk_production.csv',index_col='Date', parse_dates=True)
df.index.freq = "MS"
df.head(5)

In [None]:
import matplotlib.pyplot as plt
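# Let pandas create the figure: DataFrame.plot opens its own figure,
# so a separate plt.figure() call would leave an empty extra figure behind.
df.plot(marker='o', figsize=(10, 6))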
plt.title('Time Series Example')
plt.xlabel('Date')
plt.ylabel('Value')
plt.grid(True)
plt.show()

Time series analysis is used in many fields, including finance, economics, engineering, environmental science, and marketing.

- Finance: Predict stock prices or other financial indicators.
- Economics: Track trends in economic indicators such as inflation, gross domestic product, or unemployment rates.
- Engineering: Monitor the performance of machines or systems over time, for example by identifying patterns in the vibrations of an engine that could indicate a potential failure.
- Environmental science: Study trends in factors such as temperature, rainfall, or air pollution levels, for example by decomposing a global temperature series into its components to identify long-term trends and cyclical patterns.
- Marketing: Forecast sales or customer behavior over time, for example by predicting future sales of a product from its sales history so that enough items are kept in stock.

## Time series decomposition

Time series decomposition is a statistical technique that breaks a time series down into its constituent components. The primary goal is to understand and analyze the underlying patterns in the data. The three main components are usually:
- Trend: The long-term movement or general direction in the data. 
- Seasonality: The repetitive, periodic fluctuations in the data that occur at regular intervals. 
- Residual (or Error): The random noise or irregular component that remains after removing the trend and seasonality. 

In [None]:
df.plot(figsize=(10,6))

In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose

In [None]:
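# Additive decomposition (the default model); the seasonal period is
# inferred from the monthly frequency set on the index.
results = seasonal_decompose(df['Production'])
results.plot();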

In [None]:
trend = results.trend
seasonal = results.seasonal
residual = results.resid

In [None]:
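# Overlay the trend and the trend plus the seasonal component
trend.plot(label='trend', legend=True)
(seasonal + trend).plot(label='trend + seasonal', legend=True)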

## Time series operations in pandas

In [None]:
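# Resample: aggregate the monthly values into one mean per year.
# 'Y' labels each bin by its year end; pandas 2.2+ deprecates it in favor of 'YE'.
df_yearly = df['Production'].resample('Y').mean()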
df_yearly.head(5)
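
To make the effect of resampling easy to verify by hand, here is a small sketch on synthetic data (the values 1 to 24 and the variable names are assumptions chosen for illustration). It uses the `'YS'` alias, which labels each bin by the start of the year; the aggregated values are the same as with a year-end alias, only the index labels differ.

```python
import pandas as pd

# Two years of synthetic monthly values: 1, 2, ..., 24 (illustrative data only)
idx = pd.date_range("2021-01-01", periods=24, freq="MS")
monthly = pd.Series(range(1, 25), index=idx, dtype="float64")

# Downsample to one value per year, labeled by the first day of the year
yearly_mean = monthly.resample("YS").mean()
yearly_sum = monthly.resample("YS").sum()

print(yearly_mean)  # mean of 1..12 and mean of 13..24
print(yearly_sum)
```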

In [None]:
df_yearly.plot()

In [None]:
# Rolling average over a 3-month window (the first 2 values are NaN)
df_rolling = df['Production'].rolling(window=3).mean()
df_rolling.plot()

In [None]:
# A 12-month window spans a full year, so the seasonal cycle averages out
df_rolling = df['Production'].rolling(window=12).mean()
df_rolling.plot()
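
The rolling mean leaves the first `window - 1` positions empty because a full window is not yet available there. A minimal sketch on a five-value series (illustrative data and names) shows this, along with the `min_periods` parameter, which allows partial windows at the start:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])  # illustrative data only

# Default: a result is produced only once 3 observations are available
full_window = s.rolling(window=3).mean()

# min_periods=1 averages over however many observations exist so far
partial_window = s.rolling(window=3, min_periods=1).mean()

print(full_window.tolist())
print(partial_window.tolist())
```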

In [None]:
# Shifting: shift(1) holds the previous month's value (lag),
# shift(-1) holds the next month's value (lead)
df['shifted_1'] = df['Production'].shift(1)
df['shifted_m1'] = df['Production'].shift(-1)
df.head(10)

In [None]:
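# First difference: change from the previous month,
# equivalent to Production - Production.shift(1)
df['diff1'] = df['Production'].diff(1)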
df.head(10)
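
The relationship between shifting and differencing can be checked directly. The sketch below, on a four-value series with illustrative values and names, confirms that `diff(1)` equals the series minus its lagged copy, and that a cumulative sum seeded with the first observation undoes the differencing:

```python
import pandas as pd

s = pd.Series([10.0, 12.0, 15.0, 11.0])  # illustrative data only

lagged = s.shift(1)        # previous value at each position, NaN at the start
manual_diff = s - lagged   # difference computed by hand
builtin_diff = s.diff(1)   # same result from pandas

# Invert the differencing: put the first observation back, then accumulate
restored = builtin_diff.fillna(s.iloc[0]).cumsum()

print(builtin_diff.tolist())
print(restored.tolist())
```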

In [None]:
# Compare the original series with its first difference
df['Production'].plot(legend=True)
df['diff1'].plot(legend=True)