In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Recap of Data Science Problem:
For this capstone, we aim to build a predictive model to determine whether a hotel booking will be cancelled. Cancellations matter to hotels because they reduce revenue and disrupt operational planning and flow.

# Overview of Data Set:
The dataset contains many booking-related features, including lead time, deposit type, and special requests. The main challenge lies in preprocessing: feature selection, feature engineering, handling missing values, and reducing noise in the data. We will then train several models, evaluate them with appropriate metrics, and interpret them by examining the features that matter most for hotel booking cancellations.

# Variable Descriptions

**Variables and Descriptions**

    hotel:	Type of hotel (Resort Hotel, City Hotel)
    is_canceled: Reservation cancellation status (0 = not canceled, 1 = canceled)
    lead_time: Number of days between booking and arrival
    arrival_date_year: Year of arrival
    arrival_date_month: Month of arrival
    arrival_date_week_number: Week number of the year for arrival
    arrival_date_day_of_month: Day of the month of arrival
    stays_in_weekend_nights: Number of weekend nights (Saturday and Sunday) the guest stayed or booked
    stays_in_week_nights: Number of week nights the guest stayed or booked
    adults: Number of adults
    children: Number of children
    babies: Number of babies
    meal: Type of meal booked (BB, FB, HB, SC, Undefined)
    country: Country of origin of the guest
    market_segment: Market segment designation
    distribution_channel: Booking distribution channel
    is_repeated_guest: If the guest is a repeat customer (0 = not repeated, 1 = repeated)
    previous_cancellations: Number of previous bookings that were canceled by the customer
    previous_bookings_not_canceled: Number of previous bookings that were not canceled by the customer
    reserved_room_type: Type of reserved room
    assigned_room_type: Type of assigned room
    booking_changes: Number of changes made to the booking
    deposit_type: Type of deposit made (No Deposit, Refundable, Non Refund)
    agent: ID of the travel agent responsible for the booking
    company: ID of the company responsible for the booking
    days_in_waiting_list: Number of days the booking was in the waiting list
    customer_type: Type of customer (Transient, Contract, Transient-Party, Group)
    adr: Average Daily Rate
    required_car_parking_spaces: Number of car parking spaces required
    total_of_special_requests: Number of special requests made
    reservation_status: Last reservation status (Check-Out, Canceled, No-Show)
    reservation_status_date: Date of the last reservation status

# Loading the Dataset

In [6]:
hotel_demand = pd.read_csv('hotel_bookings2.csv')
hotel_demand.head()

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,deposit_type,agent,company,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
0,Resort Hotel,0,342,2015,July,27,1,0,0,2,...,No Deposit,,,0,Transient,0.0,0,0,Check-Out,2015-07-01
1,Resort Hotel,0,737,2015,July,27,1,0,0,2,...,No Deposit,,,0,Transient,0.0,0,0,Check-Out,2015-07-01
2,Resort Hotel,0,7,2015,July,27,1,0,1,1,...,No Deposit,,,0,Transient,75.0,0,0,Check-Out,2015-07-02
3,Resort Hotel,0,13,2015,July,27,1,0,1,1,...,No Deposit,304.0,,0,Transient,75.0,0,0,Check-Out,2015-07-02
4,Resort Hotel,0,14,2015,July,27,1,0,2,2,...,No Deposit,240.0,,0,Transient,98.0,0,1,Check-Out,2015-07-03


Let's take a look at the variables we have in the data set.

In [34]:
hotel_demand.columns

Index(['hotel', 'is_canceled', 'lead_time', 'arrival_date_year',
       'arrival_date_month', 'arrival_date_week_number',
       'arrival_date_day_of_month', 'stays_in_weekend_nights',
       'stays_in_week_nights', 'adults', 'children', 'babies', 'meal',
       'country', 'market_segment', 'distribution_channel',
       'is_repeated_guest', 'previous_cancellations',
       'previous_bookings_not_canceled', 'reserved_room_type',
       'assigned_room_type', 'booking_changes', 'deposit_type', 'agent',
       'company', 'days_in_waiting_list', 'customer_type', 'adr',
       'required_car_parking_spaces', 'total_of_special_requests',
       'reservation_status', 'reservation_status_date'],
      dtype='object')

In [35]:
hotel_demand.shape

(119390, 32)

In [36]:
hotel_demand.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 119390 entries, 0 to 119389
Data columns (total 32 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   hotel                           119390 non-null  object 
 1   is_canceled                     119390 non-null  int64  
 2   lead_time                       119390 non-null  int64  
 3   arrival_date_year               119390 non-null  int64  
 4   arrival_date_month              119390 non-null  object 
 5   arrival_date_week_number        119390 non-null  int64  
 6   arrival_date_day_of_month       119390 non-null  int64  
 7   stays_in_weekend_nights         119390 non-null  int64  
 8   stays_in_week_nights            119390 non-null  int64  
 9   adults                          119390 non-null  int64  
 10  children                        119386 non-null  float64
 11  babies                          119390 non-null  int64  
 12  meal            

We can see that there are 12 object columns, 4 float columns, and 16 integer columns.
Let's select the object columns from the data set:

In [37]:
hotel_demand.select_dtypes(object)

Unnamed: 0,hotel,arrival_date_month,meal,country,market_segment,distribution_channel,reserved_room_type,assigned_room_type,deposit_type,customer_type,reservation_status,reservation_status_date
0,Resort Hotel,July,BB,PRT,Direct,Direct,C,C,No Deposit,Transient,Check-Out,2015-07-01
1,Resort Hotel,July,BB,PRT,Direct,Direct,C,C,No Deposit,Transient,Check-Out,2015-07-01
2,Resort Hotel,July,BB,GBR,Direct,Direct,A,C,No Deposit,Transient,Check-Out,2015-07-02
3,Resort Hotel,July,BB,GBR,Corporate,Corporate,A,A,No Deposit,Transient,Check-Out,2015-07-02
4,Resort Hotel,July,BB,GBR,Online TA,TA/TO,A,A,No Deposit,Transient,Check-Out,2015-07-03
...,...,...,...,...,...,...,...,...,...,...,...,...
119385,City Hotel,August,BB,BEL,Offline TA/TO,TA/TO,A,A,No Deposit,Transient,Check-Out,2017-09-06
119386,City Hotel,August,BB,FRA,Online TA,TA/TO,E,E,No Deposit,Transient,Check-Out,2017-09-07
119387,City Hotel,August,BB,DEU,Online TA,TA/TO,D,D,No Deposit,Transient,Check-Out,2017-09-07
119388,City Hotel,August,BB,GBR,Online TA,TA/TO,A,A,No Deposit,Transient,Check-Out,2017-09-07
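
The dtype tally quoted above can be double-checked in code rather than by reading the `info()` output. The following is a minimal sketch on a small stand-in frame; `toy` and its values are hypothetical, and on the real data the same calls would be made on `hotel_demand`.

```python
import pandas as pd

# Toy frame standing in for hotel_demand (hypothetical values).
# The string column is given an explicit object dtype to mirror the notebook.
toy = pd.DataFrame({
    "hotel": pd.Series(["Resort Hotel", "City Hotel"], dtype=object),
    "lead_time": [342, 7],
    "adr": [0.0, 75.0],
})

# Number of columns per dtype.
dtype_counts = toy.dtypes.value_counts()
print(dtype_counts)

# The same tally, one dtype family at a time.
n_object = toy.select_dtypes(include="object").shape[1]
n_numeric = toy.select_dtypes(include="number").shape[1]
print(n_object, n_numeric)
```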


Let's count the rows for each hotel type, 'City Hotel' and 'Resort Hotel':

In [21]:
hotel_counts = hotel_demand['hotel'].value_counts()
hotel_counts.head()

City Hotel      79330
Resort Hotel    40060
Name: hotel, dtype: int64

There are currently 79330 rows for 'City Hotel' and 40060 rows for 'Resort Hotel'.
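
Raw counts are easier to compare as shares of the total. Here is a small sketch that rebuilds the two counts reported above as a Series and converts them to percentages; on the actual frame, `hotel_demand['hotel'].value_counts(normalize=True)` should give the same proportions directly.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
hotel_counts = pd.Series({"City Hotel": 79330, "Resort Hotel": 40060}, name="hotel")

# Share of each hotel type, in percent, rounded to one decimal place.
hotel_share = (100 * hotel_counts / hotel_counts.sum()).round(1)
print(hotel_share)
```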

Next, let's look at summary statistics for the numeric columns using the describe method:

In [58]:
hotel_demand.describe()

Unnamed: 0,is_canceled,lead_time,arrival_date_year,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,booking_changes,days_in_waiting_list,adr,required_car_parking_spaces,total_of_special_requests
count,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119386.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0,119390.0
mean,0.370416,104.011416,2016.156554,27.165173,15.798241,0.927599,2.500302,1.856403,0.10389,0.007949,0.031912,0.087118,0.137097,0.221124,2.321149,101.831122,0.062518,0.571363
std,0.482918,106.863097,0.707476,13.605138,8.780829,0.998613,1.908286,0.579261,0.398561,0.097436,0.175767,0.844336,1.497437,0.652306,17.594721,50.53579,0.245291,0.792798
min,0.0,0.0,2015.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-6.38,0.0,0.0
25%,0.0,18.0,2016.0,16.0,8.0,0.0,1.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,69.29,0.0,0.0
50%,0.0,69.0,2016.0,28.0,16.0,1.0,2.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,94.575,0.0,0.0
75%,1.0,160.0,2017.0,38.0,23.0,2.0,3.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,126.0,0.0,1.0
max,1.0,737.0,2017.0,53.0,31.0,19.0,50.0,55.0,10.0,10.0,1.0,26.0,72.0,21.0,391.0,5400.0,8.0,5.0
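
The summary above contains a few values that may deserve a closer look, such as a negative minimum `adr` (-6.38), a maximum `adr` of 5400, and a maximum of 55 adults. One way to count such rows is sketched below. The bounds are hypothetical and chosen only for illustration, and `toy` is a stand-in for `hotel_demand`.

```python
import pandas as pd

# Toy frame with a few extreme values like those in the describe() output.
toy = pd.DataFrame({
    "adr": [-6.38, 75.0, 98.0, 5400.0],
    "adults": [2, 1, 55, 2],
})

# Hypothetical plausibility bounds (inclusive), for illustration only.
bounds = {"adr": (0, 1000), "adults": (0, 10)}

# Count the rows falling outside the bounds for each column.
out_of_range = {
    col: int((~toy[col].between(low, high)).sum())
    for col, (low, high) in bounds.items()
}
print(out_of_range)
```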


# Checking for Null Values

In [31]:
hotel_demand.isnull().any()

hotel                             False
is_canceled                       False
lead_time                         False
arrival_date_year                 False
arrival_date_month                False
arrival_date_week_number          False
arrival_date_day_of_month         False
stays_in_weekend_nights           False
stays_in_week_nights              False
adults                            False
children                           True
babies                            False
meal                              False
country                            True
market_segment                    False
distribution_channel              False
is_repeated_guest                 False
previous_cancellations            False
previous_bookings_not_canceled    False
reserved_room_type                False
assigned_room_type                False
booking_changes                   False
deposit_type                      False
agent                              True
company                            True


The variables 'children', 'country', 'agent' and 'company' contain null values.
We need to count them before deciding whether to drop them.

In [32]:
hotel_demand.isnull().sum()

hotel                                  0
is_canceled                            0
lead_time                              0
arrival_date_year                      0
arrival_date_month                     0
arrival_date_week_number               0
arrival_date_day_of_month              0
stays_in_weekend_nights                0
stays_in_week_nights                   0
adults                                 0
children                               4
babies                                 0
meal                                   0
country                              488
market_segment                         0
distribution_channel                   0
is_repeated_guest                      0
previous_cancellations                 0
previous_bookings_not_canceled         0
reserved_room_type                     0
assigned_room_type                     0
booking_changes                        0
deposit_type                           0
agent                              16340
company         

Null Value Count

children: 4
country: 488
agent: 16340
company: 112593

In [33]:
missing = pd.concat([hotel_demand.isnull().sum(), 100 * hotel_demand.isnull().mean()], axis=1)
missing.columns=['count', '%']
missing.sort_values(by='%', ascending=False)

Unnamed: 0,count,%
company,112593,94.306893
agent,16340,13.686238
country,488,0.408744
children,4,0.00335
reserved_room_type,0,0.0
assigned_room_type,0,0.0
booking_changes,0,0.0
deposit_type,0,0.0
hotel,0,0.0
previous_cancellations,0,0.0
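
The missing-value percentages above can be turned into a list of drop candidates with a simple cutoff. This is a minimal sketch, assuming a hypothetical threshold of 10%; the notebook itself picks the columns by inspection, and `toy` is a stand-in for `hotel_demand`.

```python
import numpy as np
import pandas as pd

# Toy frame with different amounts of missing data per column.
toy = pd.DataFrame({
    "company": [np.nan, np.nan, np.nan, 110.0],
    "agent": [304.0, np.nan, 240.0, 9.0],
    "lead_time": [342, 737, 7, 13],
})

# Percentage of missing values per column.
missing_pct = 100 * toy.isnull().mean()

# Hypothetical cutoff: flag columns with more than 10% missing values.
threshold = 10.0
cols_to_drop = missing_pct[missing_pct > threshold].index.tolist()
print(cols_to_drop)
```

A fixed cutoff is only a starting point. A column with many nulls can still be informative if the missingness itself carries meaning.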


In [47]:
hotel_demand.columns

Index(['hotel', 'is_canceled', 'lead_time', 'arrival_date_year',
       'arrival_date_month', 'arrival_date_week_number',
       'arrival_date_day_of_month', 'stays_in_weekend_nights',
       'stays_in_week_nights', 'adults', 'children', 'babies', 'meal',
       'country', 'market_segment', 'distribution_channel',
       'is_repeated_guest', 'previous_cancellations',
       'previous_bookings_not_canceled', 'reserved_room_type',
       'assigned_room_type', 'booking_changes', 'deposit_type', 'agent',
       'days_in_waiting_list', 'customer_type', 'adr',
       'required_car_parking_spaces', 'total_of_special_requests',
       'reservation_status', 'reservation_status_date'],
      dtype='object')

# Dropping Duplicates and Null Values

The company column has a significant number of null values: roughly 94% of its data is missing, so we will drop this column.
We will also drop the agent column, which is missing about 14% of its values.

In [None]:
hotel_demand[hotel_demand.duplicated(keep=False)]
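
The cell above only displays duplicated rows. A minimal sketch of counting and removing them is shown here on a hypothetical stand-in frame. Whether to drop them at all is a judgment call: the dataset has no booking ID, so identical rows may be distinct bookings.

```python
import pandas as pd

# Toy frame in which the first two rows are exact duplicates.
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "City Hotel"],
    "lead_time": [71, 71, 14, 3],
    "adr": [51.96, 51.96, 98.0, 75.0],
})

# keep=False marks every row that has at least one identical twin.
n_dup_rows = int(toy.duplicated(keep=False).sum())

# The default keep='first' marks only the extra copies.
n_extra = int(toy.duplicated().sum())

# drop_duplicates removes the extra copies and keeps the first occurrence.
deduped = toy.drop_duplicates().reset_index(drop=True)
print(n_dup_rows, n_extra, len(deduped))
```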

In [None]:
hotel_demand.drop(columns=['company'], inplace=True)

In [48]:
hotel_demand.drop(columns=['agent'], inplace=True)

In [49]:
hotel_demand.columns

Index(['hotel', 'is_canceled', 'lead_time', 'arrival_date_year',
       'arrival_date_month', 'arrival_date_week_number',
       'arrival_date_day_of_month', 'stays_in_weekend_nights',
       'stays_in_week_nights', 'adults', 'children', 'babies', 'meal',
       'country', 'market_segment', 'distribution_channel',
       'is_repeated_guest', 'previous_cancellations',
       'previous_bookings_not_canceled', 'reserved_room_type',
       'assigned_room_type', 'booking_changes', 'deposit_type',
       'days_in_waiting_list', 'customer_type', 'adr',
       'required_car_parking_spaces', 'total_of_special_requests',
       'reservation_status', 'reservation_status_date'],
      dtype='object')

In [50]:
hotel_demand.shape

(119390, 30)

We confirmed that the two variables have been dropped ahead of the upcoming EDA and modeling.
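
Re-running a `drop(columns=...)` cell after the column is already gone raises a `KeyError`, which is easy to hit when notebook cells are executed out of order. A small sketch of a re-runnable variant follows, using `errors="ignore"` on a hypothetical stand-in frame.

```python
import pandas as pd

# Toy frame standing in for hotel_demand.
toy = pd.DataFrame({
    "lead_time": [7, 13],
    "agent": [304.0, 240.0],
    "company": [None, None],
})

cols = ["company", "agent"]

# errors="ignore" skips labels that are not present, so the cell can be re-run.
toy = toy.drop(columns=cols, errors="ignore")
toy = toy.drop(columns=cols, errors="ignore")  # second run is a no-op

# Confirm that none of the dropped columns remain.
remaining = set(cols) & set(toy.columns)
print(remaining, list(toy.columns))
```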

# Filling the null values in 'country' with the most frequent value

In [52]:
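# Assign the result back instead of using inplace=True on a selected column,
# which newer pandas versions warn about.
hotel_demand['country'] = hotel_demand['country'].fillna(hotel_demand['country'].mode()[0])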

In [59]:
hotel_demand.isna().sum()

hotel                             0
is_canceled                       0
lead_time                         0
arrival_date_year                 0
arrival_date_month                0
arrival_date_week_number          0
arrival_date_day_of_month         0
stays_in_weekend_nights           0
stays_in_week_nights              0
adults                            0
children                          4
babies                            0
meal                              0
country                           0
market_segment                    0
distribution_channel              0
is_repeated_guest                 0
previous_cancellations            0
previous_bookings_not_canceled    0
reserved_room_type                0
assigned_room_type                0
booking_changes                   0
deposit_type                      0
days_in_waiting_list              0
customer_type                     0
adr                               0
required_car_parking_spaces       0
total_of_special_requests   
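
The output above still lists 4 missing values in `children`. One possible treatment, which is an assumption and not a step taken in this notebook, is to read a missing count as zero children. The sketch below uses a hypothetical stand-in frame and also shows why it matters: a comparison such as `children == 0` evaluates to False for NaN.

```python
import numpy as np
import pandas as pd

# Toy frame with missing values in 'children'.
toy = pd.DataFrame({
    "adults": [2, 2, 0],
    "children": [0.0, np.nan, np.nan],
    "babies": [0, 0, 0],
})

# NaN never compares equal to 0, so equality filters silently skip these rows.
nan_matches_zero = bool((toy["children"] == 0).iloc[1])

# Assumption: a missing value means no children on the booking.
toy["children"] = toy["children"].fillna(0).astype(int)
print(nan_matches_zero, toy["children"].tolist())
```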

# Checking for other inconsistencies 

In [54]:
hotel_demand['hotel'].value_counts()

City Hotel      79330
Resort Hotel    40060
Name: hotel, dtype: int64

In [57]:
hotel_demand['meal'].value_counts()

BB           92310
HB           14463
SC           10650
Undefined     1169
FB             798
Name: meal, dtype: int64

In [64]:
hotel_demand['meal'].unique()

array(['BB', 'FB', 'HB', 'SC', 'Undefined'], dtype=object)

We can see that there is an undefined value, but we will keep it as is and not change this part of the data.

In [65]:
hotel_demand['children'].value_counts()

0.0     110796
1.0       4861
2.0       3652
3.0         76
10.0         1
Name: children, dtype: int64

In [68]:
hotel_demand['babies'].value_counts()

0     118473
1        900
2         15
10         1
9          1
Name: babies, dtype: int64

In [72]:
hotel_demand.sort_values('babies', ascending=False)

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,assigned_room_type,booking_changes,deposit_type,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
46619,City Hotel,0,37,2016,January,3,12,0,2,2,...,D,1,No Deposit,0,Transient,84.45,0,1,Check-Out,2016-01-14
78656,City Hotel,0,11,2015,October,42,11,2,1,1,...,B,1,No Deposit,0,Transient-Party,95.00,0,0,Check-Out,2015-10-14
94063,City Hotel,0,4,2016,July,31,27,0,4,2,...,D,2,No Deposit,0,Transient,176.00,0,2,Check-Out,2016-07-31
104351,City Hotel,0,19,2017,January,2,8,2,2,2,...,E,1,No Deposit,0,Transient,119.50,0,0,Check-Out,2017-01-12
33332,Resort Hotel,0,31,2017,February,8,19,1,0,2,...,C,2,No Deposit,0,Transient-Party,50.00,1,2,Check-Out,2017-02-20
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
40138,City Hotel,0,3,2015,July,28,11,0,1,3,...,A,0,No Deposit,0,Transient-Party,104.00,0,0,Check-Out,2015-07-12
40137,City Hotel,0,71,2015,July,28,11,0,1,1,...,A,1,No Deposit,0,Transient,51.96,0,0,Check-Out,2015-07-12
40136,City Hotel,0,3,2015,July,28,11,0,1,2,...,A,0,No Deposit,0,Transient-Party,75.00,0,0,Check-Out,2015-07-12
40135,City Hotel,0,71,2015,July,28,11,0,1,1,...,A,1,No Deposit,0,Transient,51.96,0,0,Check-Out,2015-07-12


In [73]:
hotel_demand.sort_values('children', ascending=False)

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,assigned_room_type,booking_changes,deposit_type,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
328,Resort Hotel,1,55,2015,July,29,12,4,10,2,...,D,2,No Deposit,0,Contract,133.16,0,1,No-Show,2015-07-12
116832,City Hotel,0,14,2017,July,30,24,1,1,0,...,A,1,No Deposit,0,Transient-Party,0.00,0,3,Check-Out,2017-07-26
40984,City Hotel,0,1,2015,August,33,10,1,1,0,...,B,1,No Deposit,0,Transient-Party,9.00,0,0,Check-Out,2015-08-12
119070,City Hotel,0,0,2017,August,35,29,0,1,2,...,G,3,No Deposit,0,Group,270.00,0,1,Check-Out,2017-08-30
106387,City Hotel,0,3,2017,February,8,22,0,3,2,...,F,1,No Deposit,0,Transient,275.00,0,2,Check-Out,2017-02-25
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
119389,City Hotel,0,205,2017,August,35,29,2,7,2,...,A,0,No Deposit,0,Transient,151.20,0,2,Check-Out,2017-09-07
40600,City Hotel,1,2,2015,August,32,3,1,0,2,...,B,0,No Deposit,0,Transient-Party,12.00,0,1,Canceled,2015-08-01
40667,City Hotel,1,1,2015,August,32,5,0,2,2,...,B,0,No Deposit,0,Transient-Party,12.00,0,1,Canceled,2015-08-04
40679,City Hotel,1,1,2015,August,32,5,0,2,3,...,B,0,No Deposit,0,Transient-Party,18.00,0,2,Canceled,2015-08-04


In [75]:
filter = (hotel_demand['adults'] == 0) & (hotel_demand['children'] == 0) & (hotel_demand['babies'] == 0)
hotel_demand[filter]

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,assigned_room_type,booking_changes,deposit_type,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
2224,Resort Hotel,0,1,2015,October,41,6,0,3,0,...,I,1,No Deposit,0,Transient-Party,0.00,0,0,Check-Out,2015-10-06
2409,Resort Hotel,0,0,2015,October,42,12,0,0,0,...,I,0,No Deposit,0,Transient,0.00,0,0,Check-Out,2015-10-12
3181,Resort Hotel,0,36,2015,November,47,20,1,2,0,...,C,0,No Deposit,0,Transient-Party,0.00,0,0,Check-Out,2015-11-23
3684,Resort Hotel,0,165,2015,December,53,30,1,4,0,...,A,1,No Deposit,122,Transient-Party,0.00,0,0,Check-Out,2016-01-04
3708,Resort Hotel,0,165,2015,December,53,30,2,4,0,...,C,1,No Deposit,122,Transient-Party,0.00,0,0,Check-Out,2016-01-05
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
115029,City Hotel,0,107,2017,June,26,27,0,3,0,...,A,1,No Deposit,0,Transient,100.80,0,0,Check-Out,2017-06-30
115091,City Hotel,0,1,2017,June,26,30,0,1,0,...,K,0,No Deposit,0,Transient,0.00,1,1,Check-Out,2017-07-01
116251,City Hotel,0,44,2017,July,28,15,1,1,0,...,K,2,No Deposit,0,Transient,73.80,0,0,Check-Out,2017-07-17
116534,City Hotel,0,2,2017,July,28,15,2,5,0,...,K,1,No Deposit,0,Transient-Party,22.86,0,1,Check-Out,2017-07-22


The output above shows 180 rows where adults, children and babies are all 0. A booking with no guests does not make sense, so we will drop these rows.
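
The same condition can be written more compactly, and without reusing the name of Python's built-in `filter`. This is a sketch on a hypothetical stand-in frame; `no_guests` and `guest_cols` are illustrative names that do not appear in the notebook.

```python
import pandas as pd

# Toy frame: only the second row has no guests at all.
toy = pd.DataFrame({
    "adults": [2, 0, 0, 1],
    "children": [0.0, 0.0, 2.0, 0.0],
    "babies": [0, 0, 0, 1],
})

guest_cols = ["adults", "children", "babies"]

# True where every guest column equals 0 (NaN does not count as 0).
no_guests = toy[guest_cols].eq(0).all(axis=1)
n_no_guests = int(no_guests.sum())

# Keep the remaining rows as an independent copy.
toy_cleaned = toy.loc[~no_guests].copy()
print(n_no_guests, toy_cleaned.index.tolist())
```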

In [70]:
# .copy() makes the result independent of hotel_demand for later modifications
hotel_demand_cleaned = hotel_demand[~filter].copy()
hotel_demand_cleaned

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,assigned_room_type,booking_changes,deposit_type,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
0,Resort Hotel,0,342,2015,July,27,1,0,0,2,...,C,3,No Deposit,0,Transient,0.00,0,0,Check-Out,2015-07-01
1,Resort Hotel,0,737,2015,July,27,1,0,0,2,...,C,4,No Deposit,0,Transient,0.00,0,0,Check-Out,2015-07-01
2,Resort Hotel,0,7,2015,July,27,1,0,1,1,...,C,0,No Deposit,0,Transient,75.00,0,0,Check-Out,2015-07-02
3,Resort Hotel,0,13,2015,July,27,1,0,1,1,...,A,0,No Deposit,0,Transient,75.00,0,0,Check-Out,2015-07-02
4,Resort Hotel,0,14,2015,July,27,1,0,2,2,...,A,0,No Deposit,0,Transient,98.00,0,1,Check-Out,2015-07-03
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
119385,City Hotel,0,23,2017,August,35,30,2,5,2,...,A,0,No Deposit,0,Transient,96.14,0,0,Check-Out,2017-09-06
119386,City Hotel,0,102,2017,August,35,31,2,5,3,...,E,0,No Deposit,0,Transient,225.43,0,2,Check-Out,2017-09-07
119387,City Hotel,0,34,2017,August,35,31,2,5,2,...,D,0,No Deposit,0,Transient,157.71,0,4,Check-Out,2017-09-07
119388,City Hotel,0,109,2017,August,35,31,2,5,2,...,A,0,No Deposit,0,Transient,104.40,0,0,Check-Out,2017-09-07


Let's save it as df.

In [76]:
df = hotel_demand_cleaned

In [77]:
df.head()

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,...,assigned_room_type,booking_changes,deposit_type,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
0,Resort Hotel,0,342,2015,July,27,1,0,0,2,...,C,3,No Deposit,0,Transient,0.0,0,0,Check-Out,2015-07-01
1,Resort Hotel,0,737,2015,July,27,1,0,0,2,...,C,4,No Deposit,0,Transient,0.0,0,0,Check-Out,2015-07-01
2,Resort Hotel,0,7,2015,July,27,1,0,1,1,...,C,0,No Deposit,0,Transient,75.0,0,0,Check-Out,2015-07-02
3,Resort Hotel,0,13,2015,July,27,1,0,1,1,...,A,0,No Deposit,0,Transient,75.0,0,0,Check-Out,2015-07-02
4,Resort Hotel,0,14,2015,July,27,1,0,2,2,...,A,0,No Deposit,0,Transient,98.0,0,1,Check-Out,2015-07-03


# Saving file as csv

In [79]:
df.to_csv('df', index=False)
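
A quick round trip can confirm that the saved file loads back with the same shape and columns. This is a minimal sketch using a temporary directory and a hypothetical file name, `df.csv`, with a stand-in frame in place of `df`.

```python
import os
import tempfile

import pandas as pd

# Toy frame standing in for the cleaned df.
toy = pd.DataFrame({
    "hotel": ["Resort Hotel", "City Hotel"],
    "is_canceled": [0, 1],
    "adr": [75.0, 98.0],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df.csv")  # hypothetical file name
    toy.to_csv(path, index=False)       # index=False avoids an extra index column
    reloaded = pd.read_csv(path)

same_shape = reloaded.shape == toy.shape
same_columns = list(reloaded.columns) == list(toy.columns)
print(same_shape, same_columns)
```

Note that CSV does not store dtypes, so date columns such as `reservation_status_date` come back as strings unless they are parsed on load.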