Amodely is an anomaly detection dashboard that I built during my time as a Pricing Intern at Auto & General. It is used to identify anomalies in time-series data using (primarily) a Seasonal-Trend decomposition with LOESS (STL) algorithm.
The tech stack consists wholly of Python and various Python frameworks.
The required dependencies can be found under /requirements.txt.
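To make the seasonal-trend idea mentioned above concrete, here is a minimal, standard-library-only sketch. It is not the dashboard's implementation and it is not true STL: the trend is a plain centred moving average and the seasonal component is a per-position mean, whereas STL fits both with LOESS smoothing. The function name `decompose` and the sample data are made up for illustration.

```python
from statistics import mean


def decompose(series, period):
    """Split a series into trend, seasonal and residual parts.

    Simplified stand-in for STL (assumption, not the project's code):
    the trend is a centred moving average and the seasonal part is the
    mean detrended value at each position in the cycle.
    """
    n = len(series)
    half = period // 2
    # Centred moving average; the window is truncated at both edges.
    trend = [
        mean(series[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]
    detrended = [y - t for y, t in zip(series, trend)]
    # Average the detrended values that share a position in the cycle.
    seasonal_means = [mean(detrended[k::period]) for k in range(period)]
    seasonal = [seasonal_means[i % period] for i in range(n)]
    # Whatever trend and seasonality do not explain is the residual.
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


# Eight weeks of made-up daily values with one injected spike.
weekly_pattern = [10, 12, 14, 13, 11, 6, 5]
series = weekly_pattern * 8
series[30] += 20

trend, seasonal, residual = decompose(series, period=7)
spike_index = max(range(len(series)), key=lambda i: abs(residual[i]))
print(spike_index)
```

The residual is the part an anomaly detector inspects: a point that breaks the usual weekly shape leaves a large residual, while ordinary seasonal swings are absorbed by the seasonal component.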
To install and set up the dashboard, open up Windows PowerShell or Git Bash and follow the steps below:
- Clone the repo and enter the directory
  git clone https://github.com/LimaoC/amodely.git
  cd amodely
- Create a virtual environment
  python -m virtualenv venv
- Enter the virtual environment
  Windows PowerShell:
  venv/Scripts/activate
  Git Bash:
  source venv/Scripts/activate
- Install the required dependencies
  pip install -r requirements.txt
- Create a .env file in the root directory with the following variable pointing to the path of the dataset:
  DATASET_PATH="C:/Path/To/Dataset/"
- Change the DATASET_NAME variable in /src/lib/lib.py (the default is dataset.xlsx):
  DATASET_NAME = "dataset.xlsx"
- Run the dashboard on localhost
  Dashboard:
  python -m src.dash-app.app
  Anomaly detection model (for debugging purposes):
  python -i -m src.amodely
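As a rough illustration of how the two settings from the steps above fit together, the sketch below joins the DATASET_PATH value from .env with DATASET_NAME to get the full dataset location. This is an assumption about how the values combine, not the project's actual loading code (which may use a library such as python-dotenv); `parse_env_text` and `dataset_file` are hypothetical helper names.

```python
from pathlib import Path

DATASET_NAME = "dataset.xlsx"  # mirrors the default named in the steps above


def parse_env_text(text):
    """Parse simple KEY="value" lines (the .env format) into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def dataset_file(env_values):
    """Join DATASET_PATH with DATASET_NAME (hypothetical helper)."""
    return Path(env_values["DATASET_PATH"]) / DATASET_NAME


# Example with the placeholder path used in the setup steps.
env = parse_env_text('DATASET_PATH="C:/Path/To/Dataset/"')
print(dataset_file(env))
```

If the dashboard fails to find the data, checking that this joined path points at an existing file is a reasonable first step.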
Examples of how the dashboard can be used (note that the data below was randomly generated):
General Plot.ly features
- Hover over data points to see info (date, category, conversion rate)
- Adjust graph axes dynamically
- Zoom in on a particular region
- Download plot as a PNG
Master dashboard
- Configurations:
  - Graphing Conversion Rate vs. Quote Date
  - Categorising data by Dimension 1
  - All categories displayed (CATEGORY_1A, CATEGORY_1B, ...)
  - Filtering for Dimension 2 data that are in either category CATEGORY_2A or CATEGORY_2B
  - Removing categories that have fewer than 100 entries
  - Filtering for 2020 data
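The filters in the configuration above can be sketched in plain Python as follows. This is an illustrative sketch only: the row layout, the key names (`quote_date`, `dimension_1`) and the function `filter_rows` are assumptions, as is the choice to count category sizes after the year and value filters have been applied.

```python
from collections import Counter
from datetime import date, timedelta


def filter_rows(rows, dimension, year=None, keep=None, min_entries=100):
    """Apply dashboard-style filters to a list of row dicts.

    rows: dicts with a "quote_date" (datetime.date) and one key per dimension.
    dimension: key to categorise by; categories with fewer than
        min_entries rows (counted after the other filters) are dropped.
    year: keep only rows from this calendar year, if given.
    keep: optional {dimension_key: allowed_values} filter.
    """
    keep = keep or {}
    selected = [
        r for r in rows
        if (year is None or r["quote_date"].year == year)
        and all(r[k] in allowed for k, allowed in keep.items())
    ]
    counts = Counter(r[dimension] for r in selected)
    return [r for r in selected if counts[r[dimension]] >= min_entries]


# Made-up rows: 120 and 50 entries in 2020, plus 120 entries in 2021.
start_2020, start_2021 = date(2020, 1, 1), date(2021, 1, 1)
rows = (
    [{"quote_date": start_2020 + timedelta(days=i), "dimension_1": "CATEGORY_1A"}
     for i in range(120)]
    + [{"quote_date": start_2020 + timedelta(days=i), "dimension_1": "CATEGORY_1B"}
       for i in range(50)]
    + [{"quote_date": start_2021 + timedelta(days=i), "dimension_1": "CATEGORY_1A"}
       for i in range(120)]
)

# 2020 only; CATEGORY_1B has just 50 entries, so it is removed.
filtered = filter_rows(rows, "dimension_1", year=2020)
print(len(filtered))
```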
Anomaly detection dashboard
- Image 1 configurations:
  - Graphing Quote Volume vs. Quote Date
  - Categorising data by All (combining all dimensions)
  - No filter applied
  - Categories with fewer than 100 entries removed automatically to avoid interfering with the anomaly detection algorithm
  - Anomaly detection algorithm running at a confidence interval of 95% (default)
  - Filtering for all data (2020 - 2021)
  - Hovering over the week of Sep 06, 2021 to inspect daily data points from that week
- Image 2 configurations:
  - Graphing Conversion Rate vs. Quote Date
  - Categorising data by Dimension 1
  - Isolating the second and third categories in Dimension 1
  - No filter applied
  - Categories with fewer than 100 entries removed automatically to avoid interfering with the anomaly detection algorithm
  - Anomaly detection algorithm running at a confidence interval of 80% (a lower threshold in standard deviations, so more points are flagged as outliers)
  - Filtering for all data (2020 - 2021)
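The link between the confidence setting and the number of flagged points can be sketched as below. It assumes the residuals are roughly normally distributed and that a point counts as an outlier when it lies more than z standard deviations from the mean; the dashboard's exact rule may differ. `z_threshold` and `flag_anomalies` are hypothetical names, and the residual values are made up.

```python
from statistics import NormalDist, mean, pstdev


def z_threshold(confidence):
    """Two-sided z-score cut-off for a confidence level such as 0.95."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def flag_anomalies(residuals, confidence=0.95):
    """Return indices whose residual lies outside mean +/- z * std."""
    mu, sigma = mean(residuals), pstdev(residuals)
    if sigma == 0:
        return []  # A constant series has no outliers.
    z = z_threshold(confidence)
    return [i for i, r in enumerate(residuals) if abs(r - mu) > z * sigma]


# Made-up residuals with one large and one moderate deviation.
residuals = [0.2, -0.1, 0.1, -0.3, 0.0, 1.5, -0.2, 0.3, -2.0, 5.0]

flagged_95 = flag_anomalies(residuals, confidence=0.95)
flagged_80 = flag_anomalies(residuals, confidence=0.80)
print(flagged_95, flagged_80)
```

Lowering the confidence from 95% to 80% shrinks the cut-off from about 1.96 to about 1.28 standard deviations, so the moderate deviation is flagged as well as the large one.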
Anomaly detection output table
- Table of anomalies based on the current configurations of the anomaly detection dashboard
- Updates dynamically when settings/configurations are changed
- Export as CSV
The documentation can be viewed here.