Merge pull request #104 from Thomas-George-T/feature_machine_learning
Thomas-George-T committed Dec 13, 2023
2 parents f8e59ee + 10338bc commit 8edb111
Showing 4 changed files with 30 additions and 3 deletions.
33 changes: 30 additions & 3 deletions README.md
@@ -154,7 +154,12 @@ Pictured: Our data files tracked by DVC in GCP

<hr>

# Data Pipeline
# Overall ML Project Pipeline

![ML Project Pipeline](assets/Ecommerce-Overall-Pipeline.jpeg)


## Data Pipeline

Our data pipeline is modularized from data ingestion through preprocessing to make our data ready for modeling. We follow Test Driven Development (TDD) and enforce tests for every module to ensure that each one functions as expected.

@@ -231,13 +236,16 @@ It is to serve the K_Means_Clustering on Vertex AI after training.

For tracking our experimental machine learning pipeline, we use MLflow, Docker, and Python.

We chose the three metrics Davies-Bouldin Inedx(lower the better), Calinski-Harabasz Index(higher the better) and primarily Silhouette score(higher the better) to choose our final model parameters from the plot below.
We chose three metrics, the Davies-Bouldin Index (lower is better), the Calinski-Harabasz Index (higher is better), and primarily the Silhouette score (higher is better), to select our final model parameters from the plot below.

![MLFlow Parallel Plot Image](assets/KMeans_Parallelplot.png)
Pictured: Parallel Plot for visualizing the parameter-metrics combinations for our model
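
As a rough illustration of the selection rule described above, the sketch below ranks candidate runs primarily by Silhouette score, then breaks ties with the Davies-Bouldin Index (lower is better) and the Calinski-Harabasz Index (higher is better). The `Run` type, the `n_clusters` parameter name, and all metric values are hypothetical and made up for the example; they are not results from this project, and the tie-breaking order is an assumption.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    """One experimental run; names and fields are hypothetical."""
    n_clusters: int            # assumed model parameter
    silhouette: float          # higher is better (primary metric)
    davies_bouldin: float      # lower is better
    calinski_harabasz: float   # higher is better


def pick_best(runs: List[Run]) -> Run:
    """Rank by Silhouette first, then lower Davies-Bouldin, then higher Calinski-Harabasz."""
    return max(
        runs,
        key=lambda r: (r.silhouette, -r.davies_bouldin, r.calinski_harabasz),
    )


# Made-up metric values, for illustration only.
runs = [
    Run(n_clusters=3, silhouette=0.41, davies_bouldin=0.95, calinski_harabasz=1200.0),
    Run(n_clusters=4, silhouette=0.47, davies_bouldin=0.80, calinski_harabasz=1500.0),
    Run(n_clusters=5, silhouette=0.47, davies_bouldin=0.85, calinski_harabasz=1450.0),
]

best = pick_best(runs)
print(best.n_clusters)
```

In practice the same comparison is done visually on the parallel plot; encoding it as a sort key only makes the priority among the three metrics explicit.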

## Staging, Production and Archived models (MLFLOW)
In managing models for Staging, Production, and Archiving, we rely on MLflow.
We rely on MLflow to manage models across the Archived, Staging, and Production stages, as it allows us to reuse models from the artifact registry and serve them on a predefined port on the go. Our

![MLFlow Dashboard](assets/MLFlow_dashboard.png)
Pictured: Existing Logs on MLFlow for all the Experimental Models

## Model Pipeline
#### Train the model
@@ -265,6 +273,16 @@ In managing models for Staging, Production, and Archiving, we rely on MLflow.

<p align="center">The plot above visualises the distribution of customers into clusters.</p>

<hr>

# Deployment Pipeline

We have deployed the K-Means model on a Vertex AI endpoint, which uses a Flask API to receive requests. We have implemented model and traffic monitoring using BigQuery and integrated it with a Looker dashboard that helps evaluate latency under server load. We also use BigQuery to check the features' min-max values to detect any data drift.

![Deployment Pipeline](assets/Deployment-Pipeline.jpeg)
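
The min-max drift check mentioned above can be sketched in plain Python. This is a minimal stand-in, not the project's BigQuery implementation: it assumes the training-time minimum and maximum of each feature are already known, and the feature names and values (`total_spend`, `days_since_last_purchase`) are hypothetical.

```python
from typing import Dict, Tuple


def find_drift(
    training_ranges: Dict[str, Tuple[float, float]],
    row: Dict[str, float],
) -> Dict[str, float]:
    """Return the features in `row` whose value lies outside the training [min, max]."""
    drifted = {}
    for name, value in row.items():
        low, high = training_ranges[name]
        if not low <= value <= high:
            drifted[name] = value
    return drifted


# Hypothetical ranges observed at training time.
training_ranges = {
    "total_spend": (0.0, 5000.0),
    "days_since_last_purchase": (0.0, 365.0),
}

# Hypothetical incoming request.
request_row = {"total_spend": 7200.0, "days_since_last_purchase": 12.0}

drifted = find_drift(training_ranges, request_row)
print(drifted)
```

A value outside the training range does not prove drift on its own, but a growing share of such requests is a cheap signal that the input distribution is moving.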

<hr>

# Model Insights

## Segmentation Clusters
@@ -309,6 +327,15 @@ Profile: Sporadic Shoppers with a Proclivity for Weekend Shopping

![Customer Trends Histogram](data/plots/histogram_analysis.png)

<hr>

# Monitoring

![Monitoring Dashboard](assets/Model_Monitoring_Graph.png)

We create a monitoring dashboard to track the extent of data or concept drift (if any). We use BigQuery to capture the input feature values, the predicted cluster, and the timestamp. We also calculate and store important metrics such as the latency between predictions.

View the multipage dashboard on [Looker](https://lookerstudio.google.com/s/tsXALSpVJ3w)
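
To make the monitored fields concrete, here is a minimal sketch of the kind of record described above: input features, predicted cluster, timestamp, and a latency value. It assumes latency means the time taken to serve a single prediction, which may differ from how the project measures it. The field names, `timed_prediction`, and `dummy_predict` are hypothetical, and writing the record to BigQuery is left out.

```python
import time
from datetime import datetime, timezone
from typing import Callable, Dict


def timed_prediction(
    predict: Callable[[Dict[str, float]], int],
    features: Dict[str, float],
) -> dict:
    """Call `predict` and return a monitoring record with its latency in milliseconds."""
    start = time.perf_counter()
    cluster = predict(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {
        "features": dict(features),
        "predicted_cluster": cluster,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "latency_ms": latency_ms,
    }


def dummy_predict(features: Dict[str, float]) -> int:
    """Hypothetical stand-in for the deployed K-Means model."""
    return 0 if features["total_spend"] < 1000 else 1


record = timed_prediction(dummy_predict, {"total_spend": 250.0})
print(record["predicted_cluster"], record["timestamp"])
```

Each such record would become one row in the monitoring table that the dashboard reads from.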

<hr>

Binary file added assets/Deployment-Pipeline.jpeg
Binary file added assets/Ecommerce-Overall-Pipeline.jpeg
Binary file added assets/Model_Monitoring_Graph.png