# Time Series Anomaly Detection

Time series data appears in many domains, such as user-behavior logs and stock prices. This notebook covers the main types of anomalies, several detection methods, and practical techniques for handling anomalies once they are found.

## Objectives
1. Understand different types of anomalies in time series data.
2. Explore methods for detecting anomalies, including:
    - STL Decomposition
    - Isolation Forest
    - Forecasting with Prophet
    - Clustering-based methods
    - Autoencoders
3. Learn how to handle anomalies after detection.

## 1.0 Setup and Installation
Ensure the necessary libraries are installed.

In [None]:
!pip install numpy pandas matplotlib scikit-learn pyod statsmodels prophet

## 2.0 Types of Anomalies in Time Series
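- **Point Outlier**: A single observation that deviates sharply from the expected value at that time.
- **Subsequence Outlier**: A run of consecutive points whose joint behavior is unusual, even if each value on its own falls within the normal range.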

### Visualization Example
We will use synthetic data for demonstration.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generating synthetic time series data
np.random.seed(42)
dates = pd.date_range(start='2023-01-01', periods=100)
values = np.sin(np.linspace(0, 20, 100)) + np.random.normal(scale=0.2, size=100)
values[20] = 3  # Point anomaly
values[50:55] = values[50:55] + 2  # Subsequence anomaly

time_series = pd.DataFrame({'Date': dates, 'Value': values})

# Plot
plt.figure(figsize=(10, 5))
plt.plot(time_series['Date'], time_series['Value'], label='Time Series')
plt.axvline(time_series['Date'][20], color='red', linestyle='--', label='Point Anomaly')
plt.axvspan(time_series['Date'][50], time_series['Date'][54], color='yellow', alpha=0.3, label='Subsequence Anomaly')
plt.legend()
plt.show()

## 3.0 STL Decomposition
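STL (Seasonal-Trend decomposition using LOESS) splits a time series into seasonal, trend, and residual components. Anomalies show up as unusually large residuals, because they are not explained by either the trend or the seasonal pattern.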

In [None]:
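from statsmodels.tsa.seasonal import STL

# STL decomposition.
# The sine wave completes one cycle every 2*pi / (20/99) ≈ 31 samples, so period=31.
# robust=True reduces the influence of outliers on the trend and seasonal fits.
result = STL(time_series['Value'], period=31, robust=True).fit()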
fig = result.plot()
plt.show()

# Residual visualization
residuals = result.resid
plt.figure(figsize=(10, 5))
plt.plot(time_series['Date'], residuals, label='Residuals')
plt.axhline(0, color='black', linestyle='--')
plt.legend()
plt.show()
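
The residual plot above still leaves the flagging step to the eye. A minimal sketch of one way to turn residuals into anomaly flags, assuming a robust z-score built from the median and the median absolute deviation (MAD). The helper name `flag_residual_outliers` and the cutoff of 3.5 are illustrative choices, and the small `residuals` array is made-up data standing in for `result.resid`.

```python
import numpy as np

def flag_residual_outliers(residuals, threshold=3.5):
    """Return a boolean mask marking residuals with a large robust z-score."""
    r = np.asarray(residuals, dtype=float)
    median = np.nanmedian(r)
    mad = np.nanmedian(np.abs(r - median))
    if mad == 0:
        # No spread to measure against, so flag nothing
        return np.zeros(r.shape, dtype=bool)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    robust_z = 0.6745 * (r - median) / mad
    # Comparisons with NaN evaluate to False, so missing residuals are never flagged
    return np.abs(robust_z) > threshold

# Toy residuals (hypothetical values): one obvious spike and one missing entry
residuals = np.array([0.1, -0.2, 0.05, 3.8, -0.1, 0.15, np.nan, -0.05])
mask = flag_residual_outliers(residuals)
print(np.flatnonzero(mask))
```

The median and MAD are used instead of the mean and standard deviation because the anomalies themselves would inflate the latter and hide smaller outliers.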

## 4.0 Isolation Forest
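Isolation Forest is an unsupervised method that isolates observations with random splits. Anomalies tend to need fewer splits to isolate, so they receive lower scores. The `contamination` parameter sets the fraction of points expected to be anomalous.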

In [None]:
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Preparing data
scaler = StandardScaler()
values_scaled = scaler.fit_transform(time_series['Value'].values.reshape(-1, 1))

# Isolation Forest
model = IsolationForest(contamination=0.05, random_state=42)
time_series['Anomaly'] = model.fit_predict(values_scaled)

# Visualization
anomalies = time_series[time_series['Anomaly'] == -1]
plt.figure(figsize=(10, 5))
plt.plot(time_series['Date'], time_series['Value'], label='Time Series')
plt.scatter(anomalies['Date'], anomalies['Value'], color='red', label='Anomaly')
plt.legend()
plt.show()
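
The `contamination` setting labels a fixed share of points as anomalous regardless of how unusual they are. A sketch of an alternative view, assuming you would rather rank points by how easily they are isolated: `score_samples` returns one score per observation, with lower values meaning more anomalous. The names `forest`, `scores`, and `ranking` are illustrative, and the synthetic series is regenerated so the cell runs on its own.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Recreate the notebook's synthetic series
np.random.seed(42)
values = np.sin(np.linspace(0, 20, 100)) + np.random.normal(scale=0.2, size=100)
values[20] = 3                     # point anomaly
values[50:55] = values[50:55] + 2  # subsequence anomaly

X = values.reshape(-1, 1)
forest = IsolationForest(random_state=42).fit(X)

# Lower score = isolated with fewer splits = more anomalous
scores = forest.score_samples(X)
ranking = np.argsort(scores)  # indices ordered from most to least anomalous
print(ranking[:5])
```

Because the model only sees the raw value, it ranks extreme values highly but has no notion of timing; a shifted subsequence whose values stay inside the normal range can go unnoticed.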

## 5.0 Forecasting with Prophet
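This section uses Prophet (the `prophet` package, named `fbprophet` before version 1.0) for anomaly detection: observations that fall outside the forecast's uncertainty interval are flagged as anomalies.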

In [None]:
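# The package was renamed from `fbprophet` to `prophet` in version 1.0
try:
    from prophet import Prophet
except ImportError:
    from fbprophet import Prophet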

# Prepare data for Prophet
prophet_data = time_series.rename(columns={'Date': 'ds', 'Value': 'y'})

# Fit Prophet model
model = Prophet()
model.fit(prophet_data)

# Forecast
future = model.make_future_dataframe(periods=0)
forecast = model.predict(future)

# Anomaly Detection
prophet_data['yhat'] = forecast['yhat']
prophet_data['lower'] = forecast['yhat_lower']
prophet_data['upper'] = forecast['yhat_upper']
prophet_data['Anomaly'] = 0
prophet_data.loc[prophet_data['y'] > prophet_data['upper'], 'Anomaly'] = 1
prophet_data.loc[prophet_data['y'] < prophet_data['lower'], 'Anomaly'] = -1

# Visualization
anomalies = prophet_data[prophet_data['Anomaly'] != 0]
plt.figure(figsize=(10, 5))
plt.plot(prophet_data['ds'], prophet_data['y'], label='Time Series')
plt.fill_between(prophet_data['ds'], prophet_data['lower'], prophet_data['upper'], color='gray', alpha=0.2, label='Uncertainty Interval')
plt.scatter(anomalies['ds'], anomalies['y'], color='red', label='Anomaly')
plt.legend()
plt.show()

## 6.0 Clustering-Based Anomaly Detection
K-Means clustering can identify anomalies as points (or windows of points) that lie far from their nearest cluster centroid.
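
A minimal sketch of this idea, assuming the series is cut into overlapping windows and each window is scored by its distance to the nearest centroid. The window length of 5, the 4 clusters, and the 95% cutoff are illustrative choices rather than tuned values, and the synthetic series is regenerated so the cell runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

# Recreate the notebook's synthetic series
np.random.seed(42)
values = np.sin(np.linspace(0, 20, 100)) + np.random.normal(scale=0.2, size=100)
values[20] = 3                     # point anomaly
values[50:55] = values[50:55] + 2  # subsequence anomaly

# Overlapping windows: row i holds values[i : i + window]
window = 5
windows = np.lib.stride_tricks.sliding_window_view(values, window).copy()

kmeans = KMeans(n_clusters=4, n_init=10, random_state=42).fit(windows)

# transform() returns the distance to every centroid; keep the nearest one
distances = kmeans.transform(windows).min(axis=1)

# Flag the windows whose distance exceeds the 95th percentile
threshold = np.quantile(distances, 0.95)
anomalous_starts = np.flatnonzero(distances > threshold)
print(anomalous_starts)  # start index of each flagged window
```

Each flagged start index `i` refers to the window covering positions `i` through `i + window - 1`, so a single spike usually flags several neighboring windows.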

## 7.0 Autoencoders
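An autoencoder is a neural network trained to reconstruct its input through a low-dimensional bottleneck. Inputs that it reconstructs poorly are flagged as anomalies. The approach is well suited to high-dimensional datasets.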
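
A minimal sketch of the reconstruction-error idea. To stay within the libraries installed in this notebook, it assumes scikit-learn's `MLPRegressor`, trained to reproduce its own input through a 3-unit hidden layer, as a small stand-in for an autoencoder built in a deep learning framework. The window length, layer size, and 95% cutoff are illustrative, and the synthetic series is regenerated so the cell runs on its own.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Recreate the notebook's synthetic series
np.random.seed(42)
values = np.sin(np.linspace(0, 20, 100)) + np.random.normal(scale=0.2, size=100)
values[20] = 3                     # point anomaly
values[50:55] = values[50:55] + 2  # subsequence anomaly

# Overlapping windows: row i holds values[i : i + window]
window = 10
windows = np.lib.stride_tricks.sliding_window_view(values, window).copy()

# Input and target are the same; the 3-unit hidden layer is the bottleneck.
# Note: the model is trained on data that still contains the anomalies.
autoencoder = MLPRegressor(hidden_layer_sizes=(3,), activation='tanh',
                           solver='lbfgs', max_iter=2000, random_state=42)
autoencoder.fit(windows, windows)

# Mean squared reconstruction error per window
reconstruction = autoencoder.predict(windows)
errors = np.mean((windows - reconstruction) ** 2, axis=1)

# Flag the windows whose error exceeds the 95th percentile
threshold = np.quantile(errors, 0.95)
anomalous_starts = np.flatnonzero(errors > threshold)
print(anomalous_starts)  # start index of each flagged window
```

In practice the model would ideally be trained on data believed to be clean, so that the anomalies are not partly learned as normal behavior.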

## 8.0 Handling Anomalies
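- **Understanding the business case**: Investigate the root cause before changing the data; an anomaly may be a genuine event rather than a data error.
- **Adjusting with statistical methods**: Replace anomalous values with a local mean or an interpolated value.
- **Removing anomalies**: Drop outliers when they are irrelevant to the analysis.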
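
The last two options can be sketched with pandas. This is a minimal example on made-up data: the `series` values and the `is_anomaly` mask are hypothetical and stand in for the output of any detector above, and the rolling window of 5 is an illustrative choice.

```python
import numpy as np
import pandas as pd

# Toy series with one known anomaly at position 2 (hypothetical values)
index = pd.date_range('2023-01-01', periods=6)
series = pd.Series([1.0, 1.2, 9.0, 1.1, 0.9, 1.0], index=index)
is_anomaly = pd.Series([False, False, True, False, False, False], index=index)

# Blank out the anomalous values so they cannot influence the replacements
masked = series.mask(is_anomaly)

# Option 1: linear interpolation between the neighboring valid points
interpolated = masked.interpolate(method='linear')

# Option 2: replace with a centered rolling mean of the surrounding valid points
rolling_mean = masked.rolling(window=5, center=True, min_periods=1).mean()
mean_filled = series.where(~is_anomaly, rolling_mean)

# Option 3: drop the anomalous rows entirely
dropped = series[~is_anomaly]

print(interpolated.iloc[2], mean_filled.iloc[2], len(dropped))
```

Interpolation and mean replacement keep the series regularly spaced, which matters for methods that assume a fixed sampling interval; dropping rows leaves gaps in the index.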