# Hotel Booking Example

### Objective

#### Background

This hotel booking dataset, sourced from Datacamp, is dummy data with booking details such as hotel type (city hotel or resort hotel), room specifications, duration of stay, the time between the booking and the stay, whether the booking was canceled, and how the booking was made. The data covers the period from July 2015 to August 2017.

#### Business Understanding

#### Data Understanding

Data Dictionary

| Column                                    | Explanation                                                                                             |
|-------------------------------------------|---------------------------------------------------------------------------------------------------------|
| `is_canceled`                             | Binary variable indicating whether a booking was canceled                                               |
| `lead_time`                               | Number of days between booking date and arrival date                                                    |
| `arrival_date_week_number`, `arrival_date_day_of_month`, `arrival_date_month` | Week number, day of the month, and month number of the arrival date |
| `stays_in_weekend_nights`, `stays_in_week_nights`         | Number of weekend nights (Saturday and Sunday) and weeknights (Monday to Friday) the customer booked   |
| `adults`, `children`, `babies`            | Number of adults, children, babies booked for the stay                                                  |
| `is_repeated_guest`                       | Binary variable indicating whether the customer was a repeat guest                                      |
| `previous_cancellations`                  | Number of prior bookings that were canceled by the customer                                             |
| `previous_bookings_not_canceled`          | Number of prior bookings that were not canceled by the customer                                         |
| `required_car_parking_spaces`             | Number of parking spaces requested by the customer                                                      |
| `total_of_special_requests`               | Number of special requests made by the customer                                                         |
| `avg_daily_rate`                          | Average daily rate, as defined by dividing the sum of all lodging transactions by the total number of staying nights |
| `booked_by_company`                       | Binary variable indicating whether the booking was made by a company                                    |
| `booked_by_agent`                         | Binary variable indicating whether the booking was made by an agent                                     |
| `hotel_City`                              | Binary variable indicating whether the booked hotel is a "City Hotel"                                   |
| `hotel_Resort`                            | Binary variable indicating whether the booked hotel is a "Resort Hotel"                                 |
| `meal_BB`                                 | Binary variable indicating whether a bed & breakfast meal was booked                                    |
| `meal_HB`                                 | Binary variable indicating whether a half board meal was booked                                         |
| `meal_FB`                                 | Binary variable indicating whether a full board meal was booked                                         |
| `meal_No_meal`                            | Binary variable indicating whether there was no meal package booked                                     |
| `market_segment_...`                      | Indicates market segment designation with a value of 1. "TA"= travel agent, "TO"= tour operators        |
| `distribution_channel_...`                | Indicates booking distribution channel with a value of 1. "TA"= travel agent, "TO"= tour operators, "GDS" = Global Distribution System |
| `reserved_room_type_...`                  | Indicates code of room type reserved with a value of 1. Code is presented instead of designation for anonymity reasons |
| `deposit_type_No_Deposit`                 | Binary variable indicating whether no deposit was made                                                  |
| `deposit_type_Non_Refund`                 | Binary variable indicating whether a deposit was made in the value of the total stay cost               |
| `deposit_type_Refundable`                 | Binary variable indicating whether a deposit was made with a value under the total stay cost            |
| `customer_type_...`                       | Binary variables indicating the type of booking based on the customer's relationship with the booking   |

Data Sources: Datacamp
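
Several rows of the data dictionary describe groups of one-hot (dummy) columns, such as `hotel_City`/`hotel_Resort` and `meal_*`. As a minimal sketch, assuming each group holds exactly one 1 per row, a group can be collapsed back into a single label column, which is often easier to group by and plot. The helper name `collapse_dummies` and the toy rows below are hypothetical and are not part of the dataset.

```python
import pandas as pd

def collapse_dummies(frame: pd.DataFrame, prefix: str) -> pd.Series:
    """Collapse one-hot columns sharing `prefix` into one label column."""
    cols = [c for c in frame.columns if c.startswith(prefix)]
    # Assumption: exactly one column in the group is 1 for every row.
    if not (frame[cols].sum(axis=1) == 1).all():
        raise ValueError(f"columns starting with {prefix!r} are not one-hot")
    # idxmax returns the name of the column holding the 1; strip the prefix.
    return frame[cols].idxmax(axis=1).str[len(prefix):]

# Toy rows for illustration only (not real bookings).
toy = pd.DataFrame({
    "hotel_City":   [1, 0, 1],
    "hotel_Resort": [0, 1, 0],
    "meal_BB":      [1, 0, 0],
    "meal_HB":      [0, 1, 0],
    "meal_FB":      [0, 0, 0],
    "meal_No_meal": [0, 0, 1],
})

toy["hotel"] = collapse_dummies(toy, "hotel_")
toy["meal"] = collapse_dummies(toy, "meal_")
print(toy[["hotel", "meal"]])
```

The explicit row-sum check is a design choice: if a group turns out not to be strictly one-hot in the real file, the helper fails loudly instead of silently picking the first column.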

#### Data Preparation
#### Data Preparation/System Design

Exploratory Data Analysis (EDA)

In [16]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from datetime import datetime, timedelta
from dateutil import rrule
import time

In [17]:
df = pd.read_csv('./Datasets/datacamp_hotel_booking_demand_data.csv')
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 53 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   is_canceled                     100 non-null    int64  
 1   lead_time                       100 non-null    int64  
 2   arrival_date_week_number        100 non-null    int64  
 3   arrival_date_day_of_month       100 non-null    int64  
 4   arrival_date_month              100 non-null    int64  
 5   stays_in_weekend_nights         100 non-null    int64  
 6   stays_in_week_nights            100 non-null    int64  
 7   adults                          100 non-null    int64  
 8   children                        100 non-null    int64  
 9   babies                          100 non-null    int64  
 10  is_repeated_guest               100 non-null    int64  
 11  previous_cancellations          100 non-null    int64  
 12  previous_bookings_not_canceled  100 n

In [18]:
df.head(5)

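| | is_canceled | lead_time | arrival_date_week_number | arrival_date_day_of_month | arrival_date_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | ... | reserved_room_type_G | reserved_room_type_H | reserved_room_type_L | deposit_type_No_Deposit | deposit_type_Non_Refund | deposit_type_Refundable | customer_type_Contract | customer_type_Group | customer_type_Transient | customer_type_Transient-Party |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 342 | 27 | 1 | 7 | 0 | 0 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 737 | 27 | 1 | 7 | 0 | 0 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | 0 | 7 | 27 | 1 | 7 | 0 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 3 | 0 | 13 | 27 | 1 | 7 | 0 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 | 0 | 14 | 27 | 1 | 7 | 0 | 2 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |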
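
The preview shows stay length and party size split across several columns. As a hedged sketch of a possible next EDA step, the block below derives total nights, total guests, and an estimated lodging revenue (following the data dictionary's definition of `avg_daily_rate` as revenue divided by staying nights), then compares cancellation rates by hotel type. The derived column names `total_nights`, `total_guests`, and `est_revenue` are assumptions introduced here, and the toy values are made up for illustration.

```python
import pandas as pd

# Toy rows for illustration only (not taken from the dataset).
toy = pd.DataFrame({
    "is_canceled":             [0, 1, 0, 0],
    "stays_in_weekend_nights": [0, 2, 1, 0],
    "stays_in_week_nights":    [1, 3, 2, 2],
    "adults":                  [2, 2, 1, 2],
    "children":                [0, 1, 0, 0],
    "babies":                  [0, 0, 0, 1],
    "avg_daily_rate":          [75.0, 100.0, 80.0, 90.0],
    "hotel_City":              [1, 1, 0, 0],
})

# Total stay length is the sum of weekend nights and weeknights.
toy["total_nights"] = toy["stays_in_weekend_nights"] + toy["stays_in_week_nights"]

# Party size across adults, children, and babies.
toy["total_guests"] = toy[["adults", "children", "babies"]].sum(axis=1)

# Inverting the avg_daily_rate definition gives an estimated lodging revenue.
toy["est_revenue"] = toy["avg_daily_rate"] * toy["total_nights"]

# Mean of a 0/1 flag is the share of canceled bookings per hotel type.
cancel_rate = toy.groupby("hotel_City")["is_canceled"].mean()

print(toy[["total_nights", "total_guests", "est_revenue"]])
print(cancel_rate)
```

Rows with zero weekend nights and zero weeknights, like the first two rows of the preview, would get an estimated revenue of zero under this formula, so they may be worth inspecting separately.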
