Stmsmj/hotel-booking-predictor

hotel booking predictor

In this project I tried to find the best model for an imbalanced dataset of roughly 35 million rows (34,742,975 in the frame shown below). The task on this dataset is classification. Unfortunately, the link to the dataset is not accessible at the time of writing. The main challenges were:

  • the data was imbalanced
  • the data wasn't representative of what we were asked for
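To make the first challenge concrete, here is a minimal sketch of how the imbalance could be measured and turned into class weights. It assumes a boolean label column like is_booking; the function names and the toy labels are made up for illustration and are not taken from the original notebook. The weight formula n / (2 * count) is the usual "balanced" heuristic for two classes.

```python
import pandas as pd


def positive_rate(labels: pd.Series) -> float:
    """Share of rows whose label is True (for example, searches that became bookings)."""
    return float(labels.astype(bool).mean())


def balanced_class_weights(labels: pd.Series) -> dict:
    """Weights inversely proportional to class frequency: n / (2 * count)."""
    y = labels.astype(bool)
    n = len(y)
    n_pos = int(y.sum())
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present to compute weights")
    return {False: n / (2 * n_neg), True: n / (2 * n_pos)}


# Toy labels (made up): 1 booking out of 10 searches.
toy_labels = pd.Series([False] * 9 + [True], name="is_booking")
rate = positive_rate(toy_labels)
weights = balanced_class_weights(toy_labels)
print(f"positive rate: {rate:.2f}")
print(f"class weights: {weights}")
```

Weights like these can be passed to most classifiers that accept per-class weights, as an alternative to resampling the data.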

For preprocessing I made a lot of visualizations and, based on some of them, removed what looked like outliers according to their z-score or IQR. After removing outliers I tackled the problems caused by the class imbalance: I tried several remedies and searched for a good decision threshold for the predictions. With the final dataset, I compared different models to see which performs best.
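The two outlier rules mentioned above can be sketched as follows. This is an illustration under assumptions, not the exact code used here: the cut-offs (1.5 × IQR and |z| ≤ 3) are the conventional defaults, and stay_nights is a hypothetical numeric column invented for the example.

```python
import pandas as pd


def iqr_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return values.between(q1 - k * iqr, q3 + k * iqr)


def zscore_mask(values: pd.Series, limit: float = 3.0) -> pd.Series:
    """True for values whose absolute z-score is at most `limit`."""
    std = values.std(ddof=0)
    if std == 0:
        # A constant column has no outliers by this rule.
        return pd.Series(True, index=values.index)
    z = (values - values.mean()) / std
    return z.abs() <= limit


# Toy data (made up): 27 ordinary stays plus one 60-night stay.
stay_nights = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 5] * 3 + [60], name="stay_nights")

keep_iqr = iqr_mask(stay_nights)
keep_z = zscore_mask(stay_nights)
print("kept by IQR rule:", int(keep_iqr.sum()), "of", len(stay_nights))
print("kept by z-score rule:", int(keep_z.sum()), "of", len(stay_nights))
```

Note that the z-score rule needs a reasonably large sample to flag anything: a single extreme value inflates the standard deviation, so in very small samples its z-score can stay under the limit. The IQR rule is less sensitive to that.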

Let's look at the dataset:

| | user | search_date | channel | is_mobile | is_package | destination | checkIn_date | checkOut_date | n_adults | n_children | n_rooms | hotel_category | is_booking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | u461899 | 2019-01-07 00:00:02 | c9 | False | False | d669 | 2019-03-14 | 2019-03-15 | 2 | 1 | 1 | g41 | False |
| 1 | u13796 | 2019-01-07 00:00:06 | c9 | False | False | d8821 | 2019-01-19 | 2019-01-26 | 1 | 0 | 1 | g58 | False |
| 2 | u1128575 | 2019-01-07 00:00:06 | c9 | False | False | d25064 | 2019-01-19 | 2019-01-22 | 1 | 0 | 1 | g91 | False |
| 3 | u1080476 | 2019-01-07 00:00:09 | c9 | False | True | d7635 | 2019-05-29 | 2019-06-05 | 2 | 0 | 1 | g10 | False |
| 4 | u1080476 | 2019-01-07 00:00:17 | c9 | False | True | d7635 | 2019-05-29 | 2019-06-05 | 2 | 0 | 1 | g10 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 34742970 | u553256 | 2020-11-30 23:59:48 | c2 | True | True | d45532 | 2020-12-07 | 2020-12-08 | 2 | 0 | 1 | g48 | False |
| 34742971 | u529472 | 2020-11-30 23:59:49 | c9 | False | False | d8279 | 2020-12-27 | 2021-01-02 | 2 | 2 | 1 | g18 | False |
| 34742972 | u18236 | 2020-11-30 23:59:53 | c4 | False | False | d20275 | 2021-04-22 | 2021-04-25 | 1 | 0 | 1 | g5 | False |
| 34742973 | u10888 | 2020-11-30 23:59:54 | c9 | False | False | d19371 | 2020-12-29 | 2020-12-30 | 2 | 0 | 1 | g17 | False |
| 34742974 | u233344 | 2020-11-30 23:59:55 | c9 | False | False | d22862 | 2021-08-16 | 2021-08-18 | 2 | 0 | 1 | g44 | False |

34742975 rows × 13 columns
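With is_booking as the label, the search for a good prediction threshold mentioned earlier might look like the sketch below. It is only an illustration: the choice of F1 as the metric, the candidate grid, the function names, and the toy scores are all assumptions, not details from the original notebook.

```python
import numpy as np


def f1_at_threshold(y_true: np.ndarray, scores: np.ndarray, threshold: float) -> float:
    """F1 score when every score >= threshold is predicted as a booking."""
    pred = scores >= threshold
    tp = int(np.sum(pred & y_true))
    fp = int(np.sum(pred & ~y_true))
    fn = int(np.sum(~pred & y_true))
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def best_threshold(y_true, scores, candidates=None):
    """Return (threshold, f1) for the candidate with the highest F1."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)  # 0.05, 0.10, ..., 0.95
    f1_scores = [f1_at_threshold(y_true, scores, t) for t in candidates]
    best = int(np.argmax(f1_scores))  # first maximum wins on ties
    return float(candidates[best]), float(f1_scores[best])


# Toy data (made up): 2 bookings among 10 searches, with model scores.
y_toy = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1], dtype=bool)
s_toy = np.array([0.02, 0.08, 0.12, 0.17, 0.22, 0.23, 0.27, 0.42, 0.37, 0.62])

threshold, f1 = best_threshold(y_toy, s_toy)
f1_default = f1_at_threshold(y_toy, s_toy, 0.5)
print(f"best threshold: {threshold:.2f}, F1: {f1:.3f}")
print(f"F1 at the default 0.5 threshold: {f1_default:.3f}")
```

On imbalanced data the best threshold is often below 0.5, because the model rarely assigns high scores to the rare class. The threshold should be chosen on a validation split, not on the test set.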
