epic stuff
Evaluation metric: Normalized Discounted Cumulative Gain (NDCG)
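For reference, a minimal sketch of how NDCG is computed for one ranked list. It assumes linear gain (relevance divided by a log2 position discount); the relevance grades and the cutoff `k` are made up for illustration and are not taken from the competition.

```python
import math

def dcg_at_k(relevances, k):
    """DCG over the first k items, in the order given (linear gain)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades, listed in the order the model ranked the items.
ranked_relevances = [0, 2, 1, 0]
score = ndcg_at_k(ranked_relevances, k=4)
print(round(score, 3))
```

A perfect ordering scores 1.0; pushing relevant items down the list lowers the score.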
model: Gradient Boosting Machines (GBM)
Two types of models:
- without EXP features (A): 5000 elementary trees, 30 hours to train
- with EXP features (B): 2500 elementary trees, 20 hours to train
Most important features:
- Position
- Price
- Location desirability (ver. 2)
Downsampling negative instances reduces training time and improves predictive performance
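A rough sketch of what downsampling negatives could look like. The notes do not say which sampling scheme was used, so this is just one simple option: keep every positive row and a random fraction of the negatives. The `booked` key and `keep_fraction` value are hypothetical.

```python
import random

def downsample_negatives(rows, label_key="booked", keep_fraction=0.1, seed=0):
    """Keep every positive row and a random fraction of the negative rows."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key]]
    negatives = [r for r in rows if not r[label_key]]
    n_keep = round(len(negatives) * keep_fraction)
    result = positives + rng.sample(negatives, n_keep)
    rng.shuffle(result)  # avoid leaving all positives at the front
    return result

# Toy data: 5 positives and 95 negatives.
rows = [{"id": i, "booked": i < 5} for i in range(100)]
balanced = downsample_negatives(rows, keep_fraction=0.2)
print(len(balanced), sum(r["booked"] for r in balanced))
```

For ranking data it may make more sense to sample negatives within each search rather than globally, so that no query loses all of its candidates.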
model: LambdaMART, a learning-to-rank algorithm based on Multiple Additive Regression Trees (MART).
1.) Scan for columns whose NaN count is greater than 60% of the total number of rows https://www.kaggle.com/code/vishalkasa/feature-engineering-k-means
2.) Remove the users who did not book a hotel https://www.kaggle.com/code/jiaofenx/expedia-hotel-recommendations
3.) Look at when the bookings were made, i.e. weekdays vs. Saturday
4.) Example using K-means and various plots for data understanding https://www.kaggle.com/code/putdejudomthai/expedia-exploratory-data-destination-search
5.) Use chi-squared feature analysis as well as PCA https://medium.com/@zander.b.tedjo/expedia-hotel-recommendations-using-machine-learning-9a8eccd4ecba
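Steps 1 to 3 could be sketched in pandas roughly as follows. The toy frame and its column names (`user_id`, `date_time`, `is_booking`, `mostly_missing`, `price`) are assumptions for illustration and not code from the linked notebooks. Step 1 only says to scan for sparse columns; dropping them here is an extra assumption.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the training data; all column names are assumed.
df = pd.DataFrame({
    "user_id": [1, 2, 1, 3, 2],
    "date_time": pd.to_datetime([
        "2014-08-09 10:00", "2014-08-11 12:30", "2014-08-12 09:15",
        "2014-08-16 18:45", "2014-08-13 07:00",
    ]),
    "is_booking": [1, 0, 1, 1, 0],
    "mostly_missing": [np.nan, np.nan, np.nan, np.nan, 1.0],
    "price": [120.0, 95.0, np.nan, 210.0, 80.0],
})

# Step 1: find columns whose NaN share exceeds 60% of the rows, then drop them.
nan_share = df.isna().mean()
sparse_cols = nan_share[nan_share > 0.60].index.tolist()
df = df.drop(columns=sparse_cols)

# Step 2: keep only users who made at least one booking.
booked_users = df.loc[df["is_booking"] == 1, "user_id"].unique()
df = df[df["user_id"].isin(booked_users)]

# Step 3: compare bookings made on Saturday (dayofweek == 5) with other days.
bookings = df[df["is_booking"] == 1].copy()
bookings["is_saturday"] = bookings["date_time"].dt.dayofweek == 5
n_saturday = int(bookings["is_saturday"].sum())
n_other = len(bookings) - n_saturday
print(sparse_cols, n_saturday, n_other)
```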
how to evaluate scores?
- from sklearn.metrics import ndcg_score
- booking
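A small usage sketch for `ndcg_score`. It expects one row per query (search) and one column per candidate, with true relevance grades in `y_true` and model scores in `y_score`. The grades and scores below are invented for illustration; the actual grading scheme for bookings is not spelled out in these notes.

```python
import numpy as np
from sklearn.metrics import ndcg_score

# One row per search, one column per candidate hotel (hypothetical values).
y_true = np.array([[0, 2, 1, 0],
                   [1, 0, 0, 2]])
y_score = np.array([[0.9, 0.8, 0.3, 0.1],
                    [0.2, 0.1, 0.4, 0.7]])

# Mean NDCG over both searches, using the full ranking.
mean_ndcg = ndcg_score(y_true, y_score)

# Same metric, but only the top 2 positions of each ranking count.
mean_ndcg_at_2 = ndcg_score(y_true, y_score, k=2)
print(mean_ndcg, mean_ndcg_at_2)
```

Searches with different numbers of candidates do not fit into one rectangular array, so in practice the score would likely be computed per search and averaged.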