- What is the likelihood of a customer reordering a specific product?
- How should the company recommend relevant products to individual users?
- Collected and compiled 3 million daily transaction-level records from a popular online grocery delivery company in 2017
- Used the data to guide decision-making, improve performance, and sustain competitive advantage
- Developed a deeper understanding of the customer base and its purchasing behaviors
- Python
- Jupyter
- PostgreSQL
- Amazon S3
- Amazon SageMaker
- The Random Forest classifier yielded the best accuracy, 82.48%, compared with Logistic Regression and Naive Bayes
- Product order share was the most important feature, ahead of User Total Order, User Total Product Order, User Product Total Reordered, and User Median Days Since Prior Order
- Recommendations for users are unaffected by the K-Nearest Neighbors (KNN) model and rely strictly on the dataset
- The recommendation system occasionally outputs irrelevant products. Example: recommending toiletries when a user orders produce
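
The feature names in the results above are not defined in this README, so the following is only a minimal sketch of how such per-user, per-product features could be derived from an order log. The tuple layout, the sample data, the function name `build_features`, and every feature definition (in particular, product order share as the fraction of a user's orders that contain the product) are assumptions made for illustration, not the project's actual pipeline.

```python
from collections import defaultdict
from statistics import median

# Toy order log: (user_id, order_id, days_since_prior_order, products).
# Layout and values are hypothetical, not taken from the project's dataset.
ORDERS = [
    (1, 101, None, ["banana", "milk"]),
    (1, 102, 7, ["banana", "bread"]),
    (1, 103, 5, ["banana", "milk", "eggs"]),
    (2, 201, None, ["banana", "soap"]),
    (2, 202, 14, ["soap"]),
]


def build_features(orders):
    """Return {(user_id, product): feature dict} from a list of order tuples."""
    user_orders = defaultdict(int)   # user -> number of orders placed
    user_gaps = defaultdict(list)    # user -> days between consecutive orders
    pair_orders = defaultdict(int)   # (user, product) -> orders containing it

    for user, _order_id, gap, products in orders:
        user_orders[user] += 1
        if gap is not None:          # a user's first order has no prior order
            user_gaps[user].append(gap)
        for product in set(products):
            pair_orders[(user, product)] += 1

    features = {}
    for (user, product), count in pair_orders.items():
        gaps = user_gaps[user]
        features[(user, product)] = {
            "user_total_orders": user_orders[user],
            "user_product_total_orders": count,
            # Assumed definition: every purchase after the first is a reorder.
            "user_product_total_reordered": count - 1,
            # Assumed definition: share of the user's orders with this product.
            "product_order_share": count / user_orders[user],
            "user_median_days_since_prior": median(gaps) if gaps else None,
        }
    return features


features = build_features(ORDERS)
print(features[(1, "banana")])
```

A table of this shape, with one row per user and product and a reordered or not-reordered label, is the kind of input a classifier such as a random forest would be trained on.
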
- The results allow us to categorize users into target groups based on their purchasing behaviors
- Promote products that are at the tipping point between being reordered and not being reordered for each target group
- The model looks at every user who purchased the same product, so the accuracy of the recommendations varies with the product's popularity. Example: almost everybody purchases bananas alongside a wide variety of other products, which makes the recommendations for bananas closer to random
- K-Nearest Neighbors (KNN) does not seem to be the best model for the recommendation system due to its high dependency on the dataset
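
The popularity effect described above can be reproduced on a toy example. The sketch below is a simplified stand-in for the project's KNN model, not its actual implementation: it represents each product by the set of users who bought it and ranks the other products by cosine similarity. The baskets and the helper names `buyers`, `cosine`, and `nearest` are hypothetical.

```python
from math import sqrt

# Hypothetical data: user -> set of products that user has purchased.
BASKETS = {
    "u1": {"banana", "spinach", "kale"},
    "u2": {"banana", "soap", "shampoo"},
    "u3": {"banana", "pasta", "tomato sauce"},
    "u4": {"spinach", "kale"},
    "u5": {"pasta", "tomato sauce"},
}


def buyers(product, baskets):
    """Set of users who purchased the given product."""
    return {user for user, items in baskets.items() if product in items}


def cosine(a, b):
    """Cosine similarity of two binary vectors, each given as a set of buyers."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def nearest(product, baskets, k=3):
    """Return the k products most similar to `product` as (score, name) pairs."""
    target = buyers(product, baskets)
    catalog = set().union(*baskets.values()) - {product}
    scored = [(cosine(target, buyers(p, baskets)), p) for p in catalog]
    # Highest score first; break ties alphabetically so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


print(nearest("spinach", BASKETS, k=1))
print(nearest("banana", BASKETS, k=2))
```

In this toy data, spinach is bought by a small, consistent group, so its nearest neighbour is kale. Banana is bought by users with otherwise unrelated baskets, so every candidate overlaps with it only weakly, and the top two neighbours turn out to be shampoo and soap: the toiletries-for-produce failure mentioned earlier.
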
- Further improve the collaborative filtering algorithm by looking into Matrix Factorization, Deep Learning, and Neural Networks
- Create business strategies based on the results of research question 1
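
As a starting point for the matrix factorization idea listed above, here is a minimal sketch of one common variant: latent factors fitted by stochastic gradient descent on observed entries. It assumes a small user-by-product matrix where 1 means the user reordered the product and 0 means they did not. The data, the hyperparameters, and the function names `factorize` and `predict` are illustrative assumptions, not results or code from this project.

```python
import random

# Hypothetical observations: (user_index, product_index, reordered flag).
# Users 0-1 favour products 0-1; users 2-3 favour products 2-3.
# Entries (0, 1) and (0, 3) are deliberately left out as unseen pairs.
OBSERVED = [
    (0, 0, 1.0), (0, 2, 0.0),
    (1, 0, 1.0), (1, 1, 1.0), (1, 2, 0.0), (1, 3, 0.0),
    (2, 0, 0.0), (2, 1, 0.0), (2, 2, 1.0), (2, 3, 1.0),
    (3, 0, 0.0), (3, 1, 0.0), (3, 2, 1.0), (3, 3, 1.0),
]


def factorize(observed, n_users, n_items, k=2, epochs=2000,
              lr=0.05, reg=0.02, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] ~ value."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, value in observed:
            err = value - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted affinity of user u for product i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


P, Q = factorize(OBSERVED, n_users=4, n_items=4)
print(round(predict(P, Q, 0, 1), 2), round(predict(P, Q, 0, 3), 2))
```

Unlike the neighbour lookup, the learned factors can score user and product pairs that never appear in the data, such as user 0 with products 1 and 3 here. Whether this actually improves recommendations on the real dataset would need to be evaluated.
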