bassimeledath/bizkit

bizkit

bizkit is a Python package that helps streamline business analytics data mining tasks. It provides models for market basket analysis, anomaly detection, time-to-event modelling, customer segmentation, and uplift modelling (in progress). The underlying algorithms include apriori (mlxtend), IsolationForest (scikit-learn), KaplanMeierFitter (lifelines), k-means, and catboost. bizkit focuses on ease of use by providing a well-documented and consistent interface. Results are presented as interactive visualizations built with bokeh, d3fgraph, and plotly.

Overview

Market Basket Analysis:

  • Market basket analysis allows retailers to identify relationships between the items that customers purchase. This form of association analysis is much simpler to implement than many other types of ML (clustering, regression, neural networks, etc.), and the results are relatively easy to interpret. bizkit identifies association rules using the MLxtend library.
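As a minimal sketch of what an association rule measures, the snippet below computes support, confidence, and lift for a single rule in plain Python. It does not use bizkit's or MLxtend's API; the toy transactions and the helper names `support` and `rule_metrics` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), baskets)
    confidence = joint / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return {"support": joint, "confidence": confidence, "lift": lift}


# Metrics for the rule {bread} -> {milk}.
metrics = rule_metrics({"bread"}, {"milk"}, transactions)
```

A lift below 1 suggests the two items appear together less often than they would if purchased independently, while a lift above 1 suggests a positive association.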

Anomaly Detection:

  • Anomaly detection (unsupervised) helps e-commerce businesses identify unusual periods, such as the busiest times, of customer purchasing and web browsing. scikit-learn's Isolation Forest, a tree-based ensemble method, is used as the anomaly detector. The results are plotted with bokeh as an interactive time series, with red dots marking anomalies.
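The idea can be sketched with scikit-learn's `IsolationForest` directly. This is an illustration under stated assumptions rather than bizkit's own interface: the hourly counts are synthetic, the spike positions are hypothetical, and the `contamination` value is an assumed share of anomalous points.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical hourly purchase counts: steady traffic plus three injected spikes.
counts = rng.normal(loc=100, scale=5, size=200)
spike_positions = [20, 90, 150]
counts[spike_positions] = [400, 420, 390]

# contamination is an assumed share of anomalous points, not a bizkit default.
model = IsolationForest(contamination=0.02, random_state=0)
labels = model.fit_predict(counts.reshape(-1, 1))  # -1 = anomaly, 1 = normal

# Indices of the observations flagged as anomalies.
anomaly_positions = np.flatnonzero(labels == -1)
```

The flagged positions are the points that would be drawn as red dots on a time series plot.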

Time-to-Event Modelling:

  • Time-to-event modelling supports predictions such as customer lifetime value: the net profit attributed to the entire future relationship with a customer. Because customer lifetimes typically follow a time-to-event data structure, we can fit a survival model (specifically, the inverse of the model) to make inferences and predictions.
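To illustrate what a survival model estimates, here is a plain-Python sketch of the Kaplan-Meier (product-limit) estimator. It is not the lifelines or bizkit API; the function name `kaplan_meier` and the customer tenures are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with an observed event.

    durations: time until churn, or until the customer was last seen.
    observed:  True if churn was observed, False if the record is censored.
    """
    event_times = sorted({t for t, seen in zip(durations, observed) if seen})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still under observation just before time t.
        at_risk = sum(d >= t for d in durations)
        # Churn events observed exactly at time t.
        events = sum(d == t and seen for d, seen in zip(durations, observed))
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical customer tenures in months; False marks customers still active.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
```

Each entry in `curve` gives the estimated probability that a customer is still active beyond that time. Censored customers count toward the at-risk group until they drop out of observation, without being treated as churned.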

Customer Segmentation:

  • Customer segmentation groups data points, namely business customers, into clusters using unsupervised learning techniques. Analyzing the features of each cluster helps businesses better understand their customers and drive growth. We implement a model that uses the KMeans algorithm to cluster customers into groups and produces visualizations such as radar plots, 3D scatter plots, feature distribution plots across clusters, and rug plots. To ease tuning of the hyperparameter k, we provide helper functions that use the elbow method.
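The elbow method can be sketched with scikit-learn's `KMeans` as follows. This is not bizkit's helper function: the customer features are synthetic, and the second-difference rule used to pick the elbow is one simple heuristic assumed here for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical customer features (e.g. spend and visit frequency), 3 true groups.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Inertia (within-cluster sum of squares) for each candidate k.
ks = list(range(1, 7))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks
]

# Simple elbow heuristic (an assumption, not bizkit's helper): pick the k where
# the drop in inertia slows down the most, i.e. the largest second difference.
second_diff = np.diff(inertias, n=2)
elbow_k = ks[int(np.argmax(second_diff)) + 1]
```

In practice the inertia curve is usually also plotted against k, since a purely numeric rule can be misleading when the clusters are not well separated.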

Authors

Bassim Eledath, Lynn He, Christine Zhu, Amanda Ma
