
Data Analytics Models run on Expedia dataset using Python. Includes Regression, Clustering, and Classification Algorithms.


BethHilbert/Python-Expedia


Expedia Dataset (using Python)

Data modeling using Python Jupyter Notebook. Created July 2017.

Goal

Use the Kaggle Expedia dataset to demonstrate Regression, Clustering, and Classification algorithms.

Data

Expedia has provided a dataset of customer searches. The fields include what customers searched for (number of rooms, number of people, dates, location), where they searched from (site, channel, and the country where they initiated the search), and the result (clicks and whether the search resulted in a booking). Expedia grouped hotels into 100 clusters based on hotel popularity, rating, user review rating, price, distance from city center, and amenities. The goal of the Kaggle competition is to predict which hotel cluster an Expedia user will book, based on their search attributes and hotel information. I used the dataset to demonstrate three types of algorithms.

Dataset is available from Kaggle: https://www.kaggle.com/c/expedia-hotel-recommendations.

Process

Part 1 involved exploring the dataset for meaningful patterns using statistical analysis and graphical and numeric summaries.

Link to Part 1 Notebook: https://github.com/BethHilbert/Python-Expedia/blob/master/Expedia%20(part%201)%20Data%20Exploration.ipynb
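As a rough idea of what a numeric summary in Part 1 can look like, here is a minimal sketch that computes a booking rate per group and an average party size. It is not taken from the notebook: the rows are toy values made up for illustration, the column names (`channel`, `srch_adults_cnt`, `is_booking`) are assumed to match the Kaggle files, and `booking_rate_by` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Toy rows for illustration only; they are NOT taken from the Kaggle data.
# Column names are assumed to match the competition files.
rows = [
    {"channel": 9, "srch_adults_cnt": 2, "is_booking": 0},
    {"channel": 9, "srch_adults_cnt": 2, "is_booking": 1},
    {"channel": 1, "srch_adults_cnt": 1, "is_booking": 0},
    {"channel": 1, "srch_adults_cnt": 4, "is_booking": 0},
]


def booking_rate_by(rows, key):
    """Return {group value: share of searches that ended in a booking}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["is_booking"])
    return {value: mean(flags) for value, flags in groups.items()}


# Booking rate per channel and the average number of adults per search.
rates = booking_rate_by(rows, "channel")
avg_adults = mean(row["srch_adults_cnt"] for row in rows)
print(rates, avg_adults)
```

The same grouping can be pointed at any other categorical column by changing the `key` argument.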

Part 2 involved models for Regression, Clustering, and Classification algorithms. The code is designed so that variables can be changed easily in order to experiment with different inputs (such as assigning a different target or number of clusters).

Link to Part 2 Notebook: https://github.com/BethHilbert/Python-Expedia/blob/master/Expedia%20(part2)%20Data%20Analytics%20Models.ipynb
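The idea of keeping inputs such as the number of clusters easy to change can be sketched as follows. This is an illustrative, standard-library-only k-means, not the notebook's code. The names `FEATURES`, `N_CLUSTERS`, and `kmeans` are hypothetical, the column names are assumed, and the rows are toy values rather than real data.

```python
import random

# Settings kept in one place so they are easy to change between runs.
FEATURES = ["srch_adults_cnt", "srch_rm_cnt"]  # assumed column names
N_CLUSTERS = 2                                 # try other values here

# Toy rows for illustration only; they are NOT taken from the Kaggle data.
rows = [
    {"srch_adults_cnt": 1, "srch_rm_cnt": 1},
    {"srch_adults_cnt": 2, "srch_rm_cnt": 1},
    {"srch_adults_cnt": 1, "srch_rm_cnt": 2},
    {"srch_adults_cnt": 8, "srch_rm_cnt": 4},
    {"srch_adults_cnt": 9, "srch_rm_cnt": 4},
    {"srch_adults_cnt": 8, "srch_rm_cnt": 5},
]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: return (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Build the feature matrix from whichever columns FEATURES names.
points = [tuple(row[name] for name in FEATURES) for row in rows]
centroids, labels = kmeans(points, N_CLUSTERS)
print(labels)
```

Changing `FEATURES` or `N_CLUSTERS` at the top is enough to rerun the experiment with different inputs; a library implementation would normally replace the hand-written loop.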
