This project includes two parts: user churn rate prediction and song recommendation
- Performed ETL to clean unstructured app user-behavior event data and engineer features
- Built ensemble models (e.g., random forest)
- Conducted cost-benefit analysis to identify the best retention strategy
- Cleaned 14 GB of unstructured smartphone log files (1.6 billion rows) using text mining
- Constructed implicit rating scores with an ad hoc user quantile algorithm and addressed cold-start issues with content-based and popularity-based methods
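
The implicit-rating and cold-start steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual algorithm: the quantile scheme here (within-user percentile rank of play counts, scaled to a 1..n_bins score) is an assumed stand-in, and the column names `user_id`, `song_id`, and `play_count` as well as the function names are hypothetical. Only the popularity-based fallback is shown; the content-based method is not covered.

```python
import numpy as np
import pandas as pd


def implicit_ratings(plays: pd.DataFrame, n_bins: int = 5) -> pd.DataFrame:
    """Turn raw play counts into 1..n_bins scores, ranked within each user.

    Assumed scheme (not taken from the project): a song's score is the
    percentile rank of its play count among that user's songs, scaled
    to n_bins and rounded up.
    """
    out = plays.copy()  # leave the caller's frame untouched
    # Percentile rank in (0, 1] of each play count within the same user.
    pct = out.groupby("user_id")["play_count"].rank(pct=True, method="average")
    # Scale to 1..n_bins; the most-played song of each user gets n_bins.
    out["rating"] = np.ceil(pct * n_bins).clip(1, n_bins).astype(int)
    return out


def popular_songs(plays: pd.DataFrame, k: int = 10) -> list:
    """Popularity fallback for cold-start users: top-k songs by distinct listeners."""
    listeners = plays.groupby("song_id")["user_id"].nunique()
    return listeners.sort_values(ascending=False, kind="stable").head(k).index.tolist()


# Hypothetical toy data to show the call pattern.
example = pd.DataFrame({
    "user_id": ["a", "a", "a", "a", "b"],
    "song_id": ["s1", "s2", "s3", "s4", "s1"],
    "play_count": [1, 2, 10, 50, 3],
})
rated_example = implicit_ratings(example, n_bins=4)
fallback_example = popular_songs(example, k=1)
```

Ranking within each user keeps heavy listeners from dominating the scale, at the cost that a user with a single song always receives the top score; a real pipeline would likely need a minimum-activity threshold or a different scheme for such users.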