Exploratory Data Analysis and Machine Learning to Predict Click-Through
This repository contains the files submitted for the Avazu CTR prediction competition.
================================================================================
The classifiers and models developed have been saved as pickles so that they do not have to be retrained on every run.
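As a minimal sketch of that caching pattern (the helper name `load_or_train` and its arguments are hypothetical, not taken from this repository), a model can be loaded from its pickle when the file exists and trained and saved otherwise:

```python
import pickle
from pathlib import Path


def load_or_train(path, train_fn):
    """Return the pickled model at `path`, training and saving it first if absent.

    `train_fn` is a hypothetical zero-argument callable that returns a fitted model.
    """
    path = Path(path)
    if path.exists():
        # Reuse the previously trained model instead of retraining.
        with path.open("rb") as f:
            return pickle.load(f)
    model = train_fn()
    # Persist the freshly trained model for later runs.
    with path.open("wb") as f:
        pickle.dump(model, f)
    return model
```

A call might look like `load_or_train("random_forest.pkl", train_random_forest)`, where both the file name and the training function are placeholders. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.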
Models I have taken into consideration and implemented:
- Random Forest
- XGBoost
- Logistic Regression (for feature selection)
The stages of the predictive modeling assignment are summarized below, together with the insights recorded before the modeling phase and a closing study of the metrics used to assess model performance:
- Exploratory Data Analysis: sampling the data, optimizing memory usage, and analysing the click-through distribution with data visualizations and other Key Performance Indicators (KPIs).
- Model development: using Logistic Regression, choosing between L1 and L2 regularization to reduce the dimensionality of the feature space and to keep overfitting in check.
- Evaluating the results: using classification reports, accuracy scores, confusion matrices, and ROC/AUC scores to evaluate the performance of the built model.
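The modeling and evaluation stages can be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the code in this repository: the data is synthetic (generated with `make_classification`, not the Avazu set), and the class balance, `C=0.1`, and the split sizes are arbitrary choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import train_test_split

# Assumption: synthetic, imbalanced stand-in for a sample of the click data.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=5,
                           weights=[0.8], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# An L1 penalty pushes the coefficients of uninformative features to zero,
# so the surviving non-zero coefficients act as a feature selector.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
selector = SelectFromModel(l1_model).fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)
n_selected = X_train_sel.shape[1]

# Refit on the reduced feature set; the default penalty is L2,
# which shrinks coefficients without removing features.
clf = LogisticRegression(max_iter=1000).fit(X_train_sel, y_train)
pred = clf.predict(X_test_sel)
proba = clf.predict_proba(X_test_sel)[:, 1]

# The metrics listed in the evaluation stage.
accuracy = accuracy_score(y_test, pred)
auc = roc_auc_score(y_test, proba)
cm = confusion_matrix(y_test, pred)

print(f"features kept: {n_selected} of {X.shape[1]}")
print(f"accuracy: {accuracy:.3f}  ROC AUC: {auc:.3f}")
print(cm)
print(classification_report(y_test, pred))
```

On imbalanced click data, accuracy alone can look good for a model that rarely predicts a click, which is one reason to read it alongside the confusion matrix and ROC/AUC score.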
I have documented the whole project as a blog entry at https://medium.com/@rachit.mishra94/predicting-click-probabilities-on-a-leading-advertising-platform-7582633e6e78
================================================================================