The purpose of this analysis is to extract insights with business value for a webshop that sells all-occasion gifts. The analysis is carried out with R, tidyverse, sparklyr, and Spark.
A large part of the analysis consists of data cleaning and basic exploratory analysis, as is usually the case with data science projects. After those basic steps, I employ machine learning algorithms on Spark to uncover more complex customer behavior patterns, such as which products are frequently purchased together. I also train a recommender engine.
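As a rough illustration of the "frequently purchased together" idea, here is a minimal sketch in plain dplyr on a made-up table. It is not the Spark-based method used in the project: it only counts how many invoices contain each pair of products. The table `order_lines` and its columns `invoice` and `product` are hypothetical names chosen for this example.

```r
library(dplyr)

# Hypothetical toy data: one row per (invoice, product) order line.
order_lines <- tibble(
  invoice = c(1, 1, 1, 2, 2, 3, 3, 4),
  product = c("mug", "card", "ribbon", "mug", "card", "card", "ribbon", "mug")
)

# Keep each product at most once per invoice.
baskets <- distinct(order_lines, invoice, product)

# Self-join on invoice to pair up products bought together,
# keep each unordered pair once, then count invoices per pair.
# (Recent dplyr versions may warn about the many-to-many join,
# which is intended here.)
pair_counts <- baskets %>%
  inner_join(baskets, by = "invoice", suffix = c("_a", "_b")) %>%
  filter(product_a < product_b) %>%
  count(product_a, product_b, name = "n_invoices", sort = TRUE)

print(pair_counts)
```

Pairs with a high `n_invoices` are candidates for "bought together" suggestions. A plain count like this ignores how popular each product is on its own, which is one reason to move on to dedicated machine learning algorithms for the actual analysis.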
The project deliverables are a publicly available data science notebook and an interactive web application, aimed at bringing the project results to business users quickly.
Because of the Spark dependency, this project can be set up on Linux machines only.