## Why do organizations use time series data analysis?

Time series analysis helps organizations understand the underlying causes of trends or systemic patterns over time. Using data visualizations, business users can see seasonal trends and dig deeper into why these trends occur. With modern analytics platforms, these visualizations can go far beyond line graphs. 

When organizations analyze data over consistent intervals, they can also use time series forecasting to predict the likelihood of future events. Time series forecasting is part of predictive analytics. It can reveal likely changes in the data, such as seasonality or cyclic behavior, which gives a clearer picture of the variables involved and improves forecast quality.

For example, Des Moines Public Schools analyzed five years of student achievement data to identify at-risk students and track progress over time. Today’s technology allows us to collect massive amounts of data every day, so it is easier than ever to gather enough consistent data for comprehensive analysis.

## Data Classification for Time Series
Time series data can be classified into two main categories:
- **Stock time series data** measures attributes at a single point in time, like a static snapshot of the information as it was (for example, an account balance at the end of a day).
- **Flow time series data** measures the activity of the attributes over a period (for example, total sales during a month). Each value covers an interval, so values from consecutive periods can be added together to give a total.
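
The distinction matters when changing the frequency of a series. The following is a minimal sketch, assuming `pandas` is available and using made-up daily sales and inventory figures: a flow series is aggregated by summing, while a stock series is aggregated by taking the last snapshot in each period.

```python
import pandas as pd

# Hypothetical data: 14 days starting on Monday 2024-01-01.
days = pd.date_range("2024-01-01", periods=14, freq="D")

# Flow: units sold during each day.
daily_sales = pd.Series([5, 3, 4, 6, 2, 8, 7, 4, 5, 6, 3, 2, 9, 1], index=days)

# Stock: units on hand at the end of each day (starting from 100 units).
inventory = 100 - daily_sales.cumsum()

# Flow values add up across the period; stock values do not.
weekly_sales = daily_sales.resample("W").sum()       # 35, 30
weekly_inventory = inventory.resample("W").last()    # 65, 35

print(weekly_sales)
print(weekly_inventory)
```

Summing the inventory series over a week would produce a number with no meaning, which is why the aggregation rule has to follow the type of series.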

## Important Considerations for Time Series Analysis
Although time series data is collected over time, it is one of several data types distinguished by how and when observations are recorded. For example:
- **Time series data** is recorded over consistent intervals of time.
- **Cross-sectional data** consists of several variables recorded at the same time.
- **Pooled data** is a combination of both time series data and cross-sectional data.
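
As an illustration, the three data types can be viewed as slices of one table. This is a small sketch, assuming `pandas` and a hypothetical table of daily sales for two stores (the column names and values are invented for the example):

```python
import pandas as pd

# Pooled data: several units (stores) observed on several dates.
pooled = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-01", "2024-01-01",
        "2024-01-02", "2024-01-02",
        "2024-01-03", "2024-01-03",
    ]),
    "store": ["A", "B", "A", "B", "A", "B"],
    "sales": [10, 20, 12, 18, 15, 21],
})

# Time series data: one store followed across dates.
time_series = pooled[pooled["store"] == "A"].set_index("date")["sales"]

# Cross-sectional data: all stores observed on one date.
one_day = pd.Timestamp("2024-01-02")
cross_section = pooled[pooled["date"] == one_day].set_index("store")["sales"]

print(time_series)
print(cross_section)
```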

## Types of Time Series Data (Extended)

Time series data can be categorized into several types based on the characteristics and behavior of the data. Understanding these types is crucial for selecting appropriate modeling and forecasting techniques. Here are the primary types of time series data:

### 1. Univariate Time Series
This type consists of observations of a single variable recorded sequentially over equal time intervals.
- **Example**: The daily closing prices of a stock.

### 2. Multivariate Time Series
This type involves multiple variables observed simultaneously at each time point.
- **Example**: The daily closing prices of multiple stocks.

### 3. Stationary Time Series
A time series is stationary if its statistical properties such as mean, variance, and autocorrelation are constant over time. Stationarity is crucial for many time series forecasting methods.
- **Example**: Daily temperature readings over a short period in a controlled environment.

### 4. Non-Stationary Time Series
This type of time series has properties that change over time. Non-stationarity can manifest as trends, seasonal patterns, or other structural changes in the data. Methods such as differencing or transformation can be used to make a series stationary.
- **Example**: Monthly sales data that shows an increasing trend over several years.

### 5. Seasonal Time Series
This type exhibits regular and predictable patterns that repeat over a specific period, such as daily, monthly, or yearly.
- **Example**: Retail sales often increase during the holiday season every year.

### 6. Non-Seasonal Time Series
A time series without regular seasonal patterns. Any periodicity or trend in the data is not driven by a fixed period.
- **Example**: Random stock market fluctuations.

### 7. Cyclical Time Series
This type shows long-term fluctuations not of a fixed period, typically influenced by economic or business cycles. Unlike seasonal variations, cyclical patterns have durations that are not fixed and can span multiple years.
- **Example**: Economic growth and recession cycles.

### 8. Irregular Time Series
Data points are not collected at regular intervals. This can happen due to various reasons, such as missing data points or varying frequency of observation.
- **Example**: Recorded times of customer arrivals at a service center.
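
Many models expect evenly spaced observations, so an irregular series is often converted to a regular one first. A minimal sketch, assuming `pandas` and a handful of invented arrival timestamps, counts arrivals in fixed 15-minute bins:

```python
import pandas as pd

# Hypothetical customer arrival times (irregularly spaced).
arrivals = pd.to_datetime([
    "2024-01-01 09:02",
    "2024-01-01 09:05",
    "2024-01-01 09:17",
    "2024-01-01 09:48",
    "2024-01-01 09:59",
])

# One event per timestamp, then count events per 15-minute bin.
events = pd.Series(1, index=arrivals)
arrivals_per_bin = events.resample("15min").sum()

print(arrivals_per_bin)  # counts per bin: 2, 1, 0, 2
```

The bin width is a modeling choice: wider bins give smoother counts but hide short bursts of activity.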

The type of series determines which techniques apply. For example, ARIMA fits its autoregressive and moving-average terms to a stationary series and relies on differencing (the "I" step) to get there, while Exponential Smoothing variants can model trend and seasonality directly.


### More on Stationary and Non-Stationary Data
Of the types above, the distinction between stationary and non-stationary data has the most influence on model choice.
- **Stationary:** A stationary dataset has no trend or seasonality components and follows these rules of thumb:
    - The **mean** is constant over the period being analyzed.
    - The **variance** is constant over time.
    - The **covariance** between values at two time points depends only on the lag between them, not on when it is measured.

- **Non-Stationary:** If the mean, variance, or covariance changes over time, the dataset is called non-stationary.
![image.png](attachment:48286b7b-00ec-486c-b575-54ee34e8e99a.png)
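
A quick informal check is to split a series into consecutive chunks and compare their means and variances. The sketch below assumes `numpy` and uses two synthetic series; the helper `split_stats` is a name introduced here for illustration. It is a visual aid, not a substitute for a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
stationary = rng.normal(0.0, 1.0, n)          # constant mean and variance
trending = stationary + 0.05 * np.arange(n)   # mean drifts upward over time

def split_stats(x, parts=4):
    """Return (mean, variance) for each of `parts` consecutive chunks."""
    return [(chunk.mean(), chunk.var()) for chunk in np.array_split(x, parts)]

for name, series in [("stationary", stationary), ("trending", trending)]:
    print(name)
    for i, (m, v) in enumerate(split_stats(series), start=1):
        print(f"  chunk {i}: mean={m:.2f}, var={v:.2f}")
```

For the stationary series the chunk means stay close to each other, while for the trending series each chunk mean is clearly higher than the previous one.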

## Methods to Check Stationarity

During the Time Series Analysis (TSA) model preparation workflow, it is essential to assess whether the dataset is stationary. This is done using statistical tests. Two tests are commonly used:

1. **Augmented Dickey-Fuller (ADF) Test**
2. **Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test**

### Augmented Dickey-Fuller (ADF) Test or Unit Root Test
The ADF test is one of the most widely used statistical tests for stationarity. It tests for a unit root under the following hypotheses:

- **Null Hypothesis (H0)**: The series has a unit root (it is non-stationary)
- **Alternate Hypothesis (HA)**: The series is stationary

Interpretation:
- **p-value > 0.05**: Fail to reject H0; the series is treated as non-stationary
- **p-value ≤ 0.05**: Reject H0 in favor of HA; the series is treated as stationary

### Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test
The KPSS test takes the opposite starting point from the ADF test: its null hypothesis (H0) is that the time series is stationary around a level or a deterministic trend, and the alternative is that it has a unit root. A p-value ≤ 0.05 therefore rejects stationarity. Since many TSA models require stationary data, it is worth confirming stationarity before further analysis.

### Converting Non-Stationary Into Stationary
To perform effective time series modeling, it may be necessary to convert non-stationary data into stationary data. Three common methods are:

#### 1. Detrending
- Detrending removes the trend from the dataset and keeps only the deviations of the values from that trend, which makes cyclical patterns easier to identify. It can be done using regression analysis and other statistical techniques.
- There are typically two classes of trends: deterministic and stochastic. A deterministic trend is a fixed function of time, showing consistent and sustained increases or decreases, while a stochastic trend drifts up and down unpredictably, as in a random walk.
- Before detrending, the type of trend needs to be identified, because a deterministic trend is usually removed by fitting and subtracting it, while a stochastic trend is usually removed by differencing.
- Traders commonly use the detrended price oscillator to detrend price action.

![image.png](attachment:03af99a6-5901-4c81-83a1-5a8273fd91d0.png)
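
Regression-based detrending can be sketched as follows, assuming `numpy` and a synthetic series with a linear deterministic trend (the slope, intercept, and noise level are invented for the example). A straight line is fitted by least squares and subtracted from the data.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200)
series = 2.0 + 0.3 * t + rng.normal(0.0, 1.0, t.size)   # linear trend plus noise

# Fit a degree-1 polynomial (a straight line); polyfit returns the
# highest-degree coefficient first, so the order is slope, intercept.
slope, intercept = np.polyfit(t, series, deg=1)
fitted_trend = intercept + slope * t

# The detrended series is what remains after subtracting the fitted line.
detrended = series - fitted_trend

print(f"estimated slope: {slope:.3f}")
print(f"mean of detrended series: {detrended.mean():.2e}")
```

This approach suits a deterministic trend. If the trend is not linear, a higher-degree polynomial or another regression model would be needed, and for a stochastic trend differencing is usually the better tool.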


#### 2. Differencing
Differencing transforms the series into a new time series made up of the changes between observations. This removes the series' dependence on time and stabilizes the mean, which reduces trend and, when a seasonal lag is used, seasonality.

The differenced value $Y'_t$ is calculated as:

$Y'_t = Y_t - Y_{t-1}$

Where $Y_t$ is the value at time $t$ and $Y_{t-1}$ is the value one step earlier.

![image.png](attachment:075f2748-7d5d-4700-b3b6-0beb55eb100c.png)
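
A short sketch of differencing with `numpy`, using small hand-made arrays so the results can be checked by eye. The helper `seasonal_diff` is a name introduced here; it subtracts the value one full season earlier instead of the previous value.

```python
import numpy as np

y = np.array([10.0, 12.0, 15.0, 19.0, 24.0])

# First difference: each value minus the previous one.
first_diff = np.diff(y)          # [2. 3. 4. 5.]

# Second difference: difference of the first differences.
second_diff = np.diff(y, n=2)    # [1. 1. 1.]

def seasonal_diff(x, period):
    """Subtract the value observed `period` steps earlier."""
    x = np.asarray(x, dtype=float)
    return x[period:] - x[:-period]

# Hypothetical series with a repeating pattern of length 4 and a slow rise.
seasonal = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0])
print(first_diff, second_diff, seasonal_diff(seasonal, period=4))  # last: [1. 1. 1. 1.]
```

Each round of differencing shortens the series by the lag used, and differencing more often than needed adds noise, so the lowest order that yields a stationary series is normally preferred.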

#### 3. Transformation
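This method includes three techniques: Power Transform, Square Root, and Log Transform. The most commonly used is the Log Transform.

- **Power Transform**: Raises the values to a chosen power to stabilize variance; the Box-Cox family is a common choice.
- **Square Root**: A power transform with exponent 1/2; a moderate correction for variance that grows with the level.
- **Log Transform**: Compresses large values, turning exponential growth into a linear trend and stabilizing variance that grows with the level. It requires strictly positive values.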

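Each of these transformations can help achieve stationarity in time series data. They stabilize variance but do not remove a trend on their own, so they are often combined with differencing.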
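
The effect of these transformations can be sketched with `numpy` on a synthetic exponential-growth series (the growth rate is invented for the example). The helper `box_cox` below is an illustrative implementation of the standard Box-Cox formula, which reduces to the log transform when the exponent is 0.

```python
import numpy as np

t = np.arange(1, 101)
series = np.exp(0.05 * t)     # exponential growth: changes get larger over time

log_series = np.log(series)   # log transform (requires strictly positive values)
sqrt_series = np.sqrt(series) # square root transform

def box_cox(x, lam):
    """Box-Cox power transform: log(x) if lam == 0, else (x**lam - 1) / lam."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

# On the raw scale the step-to-step changes keep growing;
# on the log scale every step is the same size (0.05).
raw_steps = np.diff(series)
log_steps = np.diff(log_series)
print(f"raw steps: first={raw_steps[0]:.3f}, last={raw_steps[-1]:.3f}")
print(f"log steps: first={log_steps[0]:.3f}, last={log_steps[-1]:.3f}")
```

After the log transform the series is a straight line, so a single round of differencing leaves a constant. On the raw scale, differencing alone would still leave growing values.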

By using these methods and tests, we can check whether the dataset is stationary and, if it is not, make it suitable for time series analysis and modeling.
