# Machine Learning – Anomaly Detection & Time Series


## Q1: What is Anomaly Detection?

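Anomaly detection identifies data points that deviate significantly from the normal pattern of the data. There are three common types: point anomalies are individual values that are unusual on their own, contextual anomalies are values that are unusual only in a given context (e.g., a particular time of day or season), and collective anomalies occur when a group of points is abnormal together even if each point looks normal in isolation.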
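The difference between point and contextual anomalies can be sketched on synthetic data. This is a minimal illustration, not a production detector: the sine-wave signal, the injected values, and the helper `zscore_flags` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 24
t = np.arange(10 * period)  # 10 full cycles of a hypothetical seasonal signal
x = 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.5, t.size)

x[50] = 30.0   # point anomaly: far outside the global range of the series
x[138] = 8.0   # contextual anomaly: an ordinary value, but placed at a seasonal trough

def zscore_flags(values, threshold=3.0):
    """Return the set of indices whose absolute z-score exceeds the threshold."""
    z = (values - values.mean()) / values.std()
    return set(np.flatnonzero(np.abs(z) > threshold).tolist())

# Global check: every value is compared against the whole series.
global_flags = zscore_flags(x)

# Contextual check: every value is compared against the typical value at its phase.
profile = np.median(x.reshape(-1, period), axis=0)  # per-phase median over cycles
residual = x - np.tile(profile, t.size // period)
contextual_flags = zscore_flags(residual)

print("global:", sorted(global_flags))
print("contextual:", sorted(contextual_flags))
```

The global check only sees the extreme value, while the phase-aware check also catches the value that is normal overall but wrong for its position in the cycle.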

## Q2: Isolation Forest vs DBSCAN vs LOF

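Isolation Forest isolates points with random axis-aligned splits; anomalies tend to be separated in fewer splits, so a short average path length indicates an anomaly. It scales well to large, high-dimensional data. DBSCAN groups points in dense regions into clusters and labels points that belong to no cluster as noise, which can be treated as anomalies. LOF (Local Outlier Factor) compares each point's local density with that of its nearest neighbors and flags points that lie in much sparser regions than their neighbors.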
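Isolation Forest and LOF have their own cells further down, so here is a small sketch of the DBSCAN approach. The data, `eps=0.5`, and `min_samples=5` are illustrative assumptions; in practice both parameters need tuning to the scale of the features.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.3, size=(200, 2))  # one dense blob
outliers = np.array([[4.0, 4.0], [-4.0, 3.5], [5.0, -4.0], [-5.0, -5.0], [0.0, 6.0]])
X = np.vstack([cluster, outliers])  # the last 5 rows are the isolated points

db = DBSCAN(eps=0.5, min_samples=5)
labels = db.fit_predict(X)  # cluster ids; -1 marks noise points

noise_idx = np.flatnonzero(labels == -1)
print("indices labelled as noise:", noise_idx)
```

Unlike Isolation Forest and LOF, DBSCAN has no `contamination` parameter: the number of flagged points follows from the density settings rather than from a target fraction.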

## Q3: Components of a Time Series

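Trend is the long-term movement of the series, seasonality is a pattern that repeats at a fixed period, and the residual is what remains after removing both, ideally random noise. In an additive model the components sum (y = trend + seasonal + residual); in a multiplicative model they multiply (y = trend × seasonal × residual).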
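To make the three components concrete, the following sketch performs a classical additive decomposition by hand with a centered moving average. It uses a synthetic series with a known trend and seasonal pattern (both assumptions for the example) and is meant to show the idea, not to replace a library routine.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 12
t = np.arange(120)
true_season = 10 * np.sin(2 * np.pi * t / period)
y = 0.5 * t + true_season + rng.normal(0, 1, t.size)  # trend + seasonality + noise

# Trend: centered 2x12 moving average (half weight on the two end points),
# which averages out exactly one full seasonal cycle.
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = np.convolve(y, weights, mode="valid")  # defined for t = 6 .. 113
half = period // 2
detrended = y[half:-half] - trend

# Seasonality: average the detrended values that share the same phase.
phase = t[half:-half] % period
seasonal_profile = np.array([detrended[phase == k].mean() for k in range(period)])
seasonal_profile -= seasonal_profile.mean()  # center the profile around zero

# Residual: what is left after removing trend and seasonality.
residual = detrended - seasonal_profile[phase]

print("estimated slope:", np.polyfit(t[half:-half], trend, 1)[0])
print("residual std:", residual.std())
```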

## Q4: Stationarity in Time Series

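A (weakly) stationary series has a constant mean and variance over time, and its autocovariance depends only on the lag between observations, not on when they occur. Non-stationary data can often be made stationary by transformation: differencing removes a trend, and a log transformation stabilizes variance that grows with the level of the series.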
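A quick way to see the effect of these transformations is to compare summary statistics of the first and second half of a series before and after transforming it. The sketch below uses a synthetic, noise-free series with exponential growth (an assumption chosen to keep the result easy to read); `half_stats` is a hypothetical helper.

```python
import numpy as np

t = np.arange(120)
# Exponential growth with a multiplicative seasonal swing:
# both the mean and the spread rise over time.
y = 100 * np.exp(0.02 * t) * (1 + 0.1 * np.sin(2 * np.pi * t / 12))

def half_stats(values):
    """Return (mean, std) for the first and for the second half of a series."""
    first, second = np.array_split(values, 2)
    return (first.mean(), first.std()), (second.mean(), second.std())

raw_first, raw_second = half_stats(y)

# The log turns multiplicative growth into a linear trend and evens out the spread;
# first-order differencing then removes the linear trend.
stationary = np.diff(np.log(y))
st_first, st_second = half_stats(stationary)

print("raw halves (mean, std):        ", raw_first, raw_second)
print("log-diff halves (mean, std):   ", st_first, st_second)
```

Comparing halves is only an informal check. A formal unit-root test such as the augmented Dickey-Fuller test (`adfuller` in `statsmodels.tsa.stattools`) is the usual next step.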

## Q5: AR, MA, ARIMA, SARIMA, SARIMAX

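AR models the current value as a linear combination of past values, and MA models it as a linear combination of past forecast errors. ARIMA(p, d, q) combines both after differencing the series d times. SARIMA adds seasonal AR, MA, and differencing terms with period s, while SARIMAX additionally includes exogenous (external) variables as regressors.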
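The AR and MA building blocks can be illustrated by simulating each process directly. This is a sketch with assumed coefficients (`phi = 0.7`, `theta = 0.6`) and simple moment-based estimates, not the maximum-likelihood fitting that a library such as statsmodels performs.

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi, theta = 2000, 0.7, 0.6
eps = rng.normal(0, 1, n)  # random shocks (the "errors")

# AR(1): each value is phi times the previous value plus a new shock.
ar = np.zeros(n)
for i in range(1, n):
    ar[i] = phi * ar[i - 1] + eps[i]

# Least-squares estimate of phi from regressing ar[t] on ar[t-1].
phi_hat = float(np.dot(ar[:-1], ar[1:]) / np.dot(ar[:-1], ar[:-1]))

# MA(1): each value is the current shock plus theta times the previous shock.
ma = eps.copy()
ma[1:] += theta * eps[:-1]

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rho1, rho2 = autocorr(ma, 1), autocorr(ma, 2)

print("estimated phi:", phi_hat)
print("MA(1) autocorrelation at lag 1 and 2:", rho1, rho2)
```

For an MA(1) process the theoretical lag-1 autocorrelation is theta / (1 + theta²) and all higher lags are zero, which is why the lag-2 value comes out close to zero while an AR process decays gradually.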

In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt

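# `air` is assumed to be a monthly pandas Series with a DatetimeIndex, loaded in an earlier cell.
# A multiplicative model fits when the seasonal swing grows with the level of the series.
decomp = seasonal_decompose(air, model="multiplicative", period=12)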
decomp.plot()
plt.show()

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt

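# `X_blob` is assumed to be a 2-D feature array of shape (n_samples, 2) from an earlier cell.
# contamination=0.05 asks the model to flag roughly 5% of the points.
iso = IsolationForest(contamination=0.05, random_state=42)
labels = iso.fit_predict(X_blob)  # 1 = inlier, -1 = anomaly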

plt.scatter(X_blob[:,0], X_blob[:,1], c=labels)
plt.show()

In [None]:
from statsmodels.tsa.statespace.sarimax import SARIMAX
import matplotlib.pyplot as plt

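# `air` is assumed to be a monthly pandas Series with a DatetimeIndex.
# order=(p, d, q); seasonal_order=(P, D, Q, s), with s=12 for a yearly cycle in monthly data.
model = SARIMAX(air, order=(1,1,1), seasonal_order=(1,1,1,12))
fit = model.fit(disp=False)
forecast = fit.forecast(12)  # forecast the next 12 periods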

plt.plot(air, label="Actual")
plt.plot(forecast, label="Forecast")
plt.legend()
plt.show()

In [None]:
from sklearn.neighbors import LocalOutlierFactor
import matplotlib.pyplot as plt

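# n_neighbors sets the neighborhood size used to estimate local density.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X_blob)  # 1 = inlier, -1 = anomaly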

plt.scatter(X_blob[:,0], X_blob[:,1], c=labels)
plt.show()

## Q10: Real-time anomaly detection & forecasting workflow

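In a streaming setting, incoming points can be scored with Isolation Forest or LOF to flag anomalies as they arrive. SARIMA or SARIMAX is suitable for short-term forecasting. Forecast quality is monitored with rolling validation, in which the model is re-evaluated on successive windows of recent data, and alerts raised on anomalies or degraded accuracy help operations react quickly.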
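The monitoring step can be sketched as a rolling-origin evaluation that raises an alert when the forecast error of a window exceeds a threshold. To keep the example self-contained, a seasonal-naive forecaster stands in for SARIMA/SARIMAX; the function names, the synthetic series, the injected level shift, and `ALERT_THRESHOLD` are all assumptions for illustration.

```python
import numpy as np

def seasonal_naive(history, horizon, period=12):
    """Stand-in forecaster: repeat the last observed seasonal cycle."""
    last_cycle = history[-period:]
    reps = int(np.ceil(horizon / period))
    return np.tile(last_cycle, reps)[:horizon]

def rolling_origin_mae(series, initial, horizon, step, forecaster):
    """Forecast from successive origins and return the MAE of each window."""
    errors = []
    for origin in range(initial, len(series) - horizon + 1, step):
        train = series[:origin]
        actual = series[origin:origin + horizon]
        pred = forecaster(train, horizon)
        errors.append(float(np.mean(np.abs(actual - pred))))
    return errors

rng = np.random.default_rng(0)
t = np.arange(96)
series = 50 + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, t.size)
series[84:] += 15  # hypothetical level shift in the most recent year

maes = rolling_origin_mae(series, initial=48, horizon=12, step=12,
                          forecaster=seasonal_naive)

ALERT_THRESHOLD = 5.0  # assumed tolerance; set from business requirements in practice
alerts = [i for i, mae in enumerate(maes) if mae > ALERT_THRESHOLD]

print("MAE per window:", [round(m, 2) for m in maes])
print("windows that trigger an alert:", alerts)
```

Swapping in a real model only requires a `forecaster` callable with the same `(history, horizon)` signature, for example one that fits `SARIMAX` on `history` and returns its forecast.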