# Anomaly detection via Quantile Regression

Anton Antonov    
[PythonForPrediction at WordPress](https://pythonforprediction.wordpress.com)   
August 2024

## Introduction

------

## Setup

Load the "Regressionizer" and other "standard" packages:

In [None]:
from Regressionizer import *
from OutlierIdentifiers import *

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

In [None]:
# Plot styling shared by the figures below
template = 'plotly_dark'
data_color = 'darkgray'

-----

## Get data

In [None]:
url = "https://raw.githubusercontent.com/antononcube/SimplifiedMachineLearningWorkflows-book/master/R/ChampaignUrbanaDataScienceUserGroup-Meetup-February-2021/data/dfAppleMobilityLongForm.csv"
dfMobilityData = pd.read_csv(url)
dfMobilityData['DateObject'] = pd.to_datetime(dfMobilityData['Date'], format='%Y-%m-%d')
dfMobilityData = dfMobilityData.sort_values(by="Date")
dfMobilityData

Convert to a NumPy array, with the dates expressed as seconds since 1900-01-01:

In [None]:
usage_data = dfMobilityData[['Date', 'Value']].to_numpy()
usage_data[:,0] = dates_to_seconds(usage_data[:,0], epoch_start="1900-01-01")
#usage_data = usage_data[usage_data[:, 0].argsort()]
usage_data.shape
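
The conversion above turns date strings into a numeric time axis that the regression can use. As a rough illustration only, here is a standard-library sketch of such a conversion. It assumes `dates_to_seconds` returns the number of seconds elapsed since `epoch_start`; the helper name `dates_to_seconds_sketch` is hypothetical and is not part of "Regressionizer".

```python
from datetime import datetime

def dates_to_seconds_sketch(dates, epoch_start="1900-01-01"):
    """Hypothetical stand-in: seconds elapsed from epoch_start to each ISO date."""
    epoch = datetime.strptime(epoch_start, "%Y-%m-%d")
    return [
        (datetime.strptime(d, "%Y-%m-%d") - epoch).total_seconds()
        for d in dates
    ]

# One day after the epoch is 24 * 60 * 60 seconds
example_seconds = dates_to_seconds_sketch(["1900-01-01", "1900-01-02"])
print(example_seconds)
```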

Here is a pipeline that computes the Quantile Regression fit and makes the corresponding plot:

In [None]:
obj = (
    Regressionizer(usage_data)
    .echo_data_summary()
    .quantile_regression(knots=50, probs=[0.2])
    .date_list_plot(title="Apple mobility data", template=template, data_color=data_color, width = 1200)
)
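
To give some intuition for `probs=[0.2]`: a regression quantile for probability 0.2 is a curve with roughly 20% of the data points below it. Quantile regression is commonly formulated as minimizing the "pinball" (check) loss. The sketch below is a toy illustration of that loss with a constant fit, in plain Python; it is not how "Regressionizer" computes its fit, and the names `pinball_loss` and `best_constant` are hypothetical.

```python
def pinball_loss(values, fitted, prob):
    """Average pinball (check) loss of the fitted values for probability prob."""
    total = 0.0
    for y, f in zip(values, fitted):
        d = y - f
        # Points above the fit are weighted by prob, points below by (1 - prob)
        total += prob * d if d >= 0 else (prob - 1) * d
    return total / len(values)

data = list(range(1, 12))  # 1, 2, ..., 11

# Among constant fits taken from the data, pick the one with the smallest loss
best_constant = min(
    data, key=lambda c: pinball_loss(data, [c] * len(data), prob=0.2)
)
print(best_constant)
```

The selected constant sits low in the data range, as expected for a 0.2 quantile; with `prob=0.5` the same search picks the median.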

Show the obtained plot:

In [None]:
fig = obj.take_value()
#fig.add_trace(go.Scatter(x=to_datetime_index(usage_data[:,0]), y=usage_data[:,1], mode='lines', name='Data time series'))
fig.show()


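Find the anomalies from the residuals of the fitted regression quantile and add them to the plot:

In [None]:
outliers = (
    obj
    .find_anomalies_by_residuals(
        relative_errors=True,
        threshold=None,
        outlier_identifier=quartile_identifier_parameters,
    )
    .take_value()
)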

fig.add_trace(go.Scatter(x=to_datetime_index(outliers[:,0]), y=outliers[:,1], mode='markers', name='Outliers', marker_color = "orange"))
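
For readers who want to see the idea behind residual-based anomaly finding, here is a minimal plain-Python sketch. It makes several assumptions: relative errors are taken as `(y - fitted) / |fitted|`, fitted values are nonzero, and outliers are flagged with Tukey-style quartile fences (`Q1 - 1.5*IQR`, `Q3 + 1.5*IQR`). The rule implemented by `quartile_identifier_parameters` may differ, and the function names below are hypothetical.

```python
from statistics import quantiles

def quartile_fences(values, factor=1.5):
    """Tukey-style fences: (Q1 - factor*IQR, Q3 + factor*IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr

def anomalies_by_residuals_sketch(points, fitted, relative_errors=True):
    """Return the (x, y) points whose residuals fall outside the fences."""
    residuals = []
    for (_, y), f in zip(points, fitted):
        r = y - f
        if relative_errors:
            r = r / abs(f)  # assumes fitted values are nonzero
        residuals.append(r)
    lower, upper = quartile_fences(residuals)
    return [p for p, r in zip(points, residuals) if r < lower or r > upper]

# Toy data: a flat fit at 10 with one obvious spike at x = 3
points = [(0, 10), (1, 11), (2, 10), (3, 30), (4, 9), (5, 10), (6, 11), (7, 10)]
fitted = [10] * len(points)
found = anomalies_by_residuals_sketch(points, fitted)
print(found)
```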