# Studi Kasus: Pembersihan dan Pengolahan Data Pemesanan Hotel

## Latar Belakang
Anda adalah seorang analis data di sebuah perusahaan manajemen hotel yang ingin menganalisis data pemesanan untuk memahami pola pembatalan dan preferensi pelanggan. Dataset yang digunakan adalah `hotel_bookings.csv`, yang berisi informasi tentang pemesanan hotel, seperti nama pelanggan, tanggal pemesanan, tipe kamar, status pembatalan, dan lainnya. Namun, dataset ini memiliki masalah seperti nilai kosong (missing values) dan format data yang tidak konsisten, terutama pada kolom `customer_name` dan `reservation_status`. Anda diminta untuk melakukan pembersihan data, penggabungan informasi, dan ekstraksi pola menggunakan regular expression untuk mempersiapkan data untuk analisis lebih lanjut.

Dataset Sumber
- **`hotel_bookings.csv`**: Berisi kolom seperti `hotel`, `is_canceled`, `lead_time`, `arrival_date_year`, `arrival_date_month`, `arrival_date_week_number`, `arrival_date_day_of_month`, `stays_in_weekend_nights`, `stays_in_week_nights`, `adults`, `children`, `babies`, `meal`, `country`, `market_segment`, `distribution_channel`, `is_repeated_guest`, `previous_cancellations`, `previous_bookings_not_canceled`, `reserved_room_type`, `assigned_room_type`, `booking_changes`, `deposit_type`, `agent`, `company`, `days_in_waiting_list`, `customer_type`, `adr`, `required_car_parking_spaces`, `total_of_special_requests`, `reservation_status`, `reservation_status_date`.
- **`hotel_type_info.csv`**: Dataset tambahan yang berisi informasi tentang tipe hotel (`hotel`) dengan kolom `hotel` dan `hotel_description`.

Instruksi Latihan
Latihan ini dibagi menjadi dua bagian utama: **Data Join dan Validasi** serta **Pembersihan Data dan Regular Expression**. Ikuti langkah-langkah berikut dan lengkapi kode yang diberikan.

## Bagian 1: Data Join dan Validasi


In [35]:
import warnings
warnings.filterwarnings("ignore")
pd.set_option('display.max_columns', None)

In [45]:
import pandas as pd

# Load dataset
hotel_df = pd.read_csv(r'D:\JCDS\Dataset\hotel_bookings.csv')
hotel_type_info = pd.read_csv(r'D:\JCDS\Dataset\hotel_type_info.csv')
display(hotel_df)
display(hotel_type_info)

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,market_segment,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,assigned_room_type,booking_changes,deposit_type,agent,company,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
0,Resort Hotel,0,342,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,3,No Deposit,,,0,Transient,0.00,0,0,Check-Out,01-07-15
1,Resort Hotel,0,737,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,4,No Deposit,,,0,Transient,0.00,0,0,Check-Out,01-07-15
2,Resort Hotel,0,7,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Direct,Direct,0,0,0,A,C,0,No Deposit,,,0,Transient,75.00,0,0,Check-Out,02-07-15
3,Resort Hotel,0,13,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Corporate,Corporate,0,0,0,A,A,0,No Deposit,304.0,,0,Transient,75.00,0,0,Check-Out,02-07-15
4,Resort Hotel,0,14,2015,July,27,1,0,2,2,0.0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,240.0,,0,Transient,98.00,0,1,Check-Out,03-07-15
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
119385,City Hotel,0,23,2017,August,35,30,2,5,2,0.0,0,BB,BEL,Offline TA/TO,TA/TO,0,0,0,A,A,0,No Deposit,394.0,,0,Transient,96.14,0,0,Check-Out,06-09-17
119386,City Hotel,0,102,2017,August,35,31,2,5,3,0.0,0,BB,FRA,Online TA,TA/TO,0,0,0,E,E,0,No Deposit,9.0,,0,Transient,225.43,0,2,Check-Out,07-09-17
119387,City Hotel,0,34,2017,August,35,31,2,5,2,0.0,0,BB,DEU,Online TA,TA/TO,0,0,0,D,D,0,No Deposit,9.0,,0,Transient,157.71,0,4,Check-Out,07-09-17
119388,City Hotel,0,109,2017,August,35,31,2,5,2,0.0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,89.0,,0,Transient,104.40,0,0,Check-Out,07-09-17


Unnamed: 0,hotel,hotel_description
0,Resort Hotel,Beachfront Resort
1,City Hotel,Urban Business Hotel


In [46]:
# Langkah 1: Gabungkan dataset
# Gunakan pd.merge() untuk melakukan inner join
# Tulis kode Anda di sini

df = pd.merge(right=hotel_df, left=hotel_type_info, on="hotel")
df.head()

Unnamed: 0,hotel,hotel_description,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,market_segment,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,assigned_room_type,booking_changes,deposit_type,agent,company,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date
0,Resort Hotel,Beachfront Resort,0,342,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,3,No Deposit,,,0,Transient,0.0,0,0,Check-Out,01-07-15
1,Resort Hotel,Beachfront Resort,0,737,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,4,No Deposit,,,0,Transient,0.0,0,0,Check-Out,01-07-15
2,Resort Hotel,Beachfront Resort,0,7,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Direct,Direct,0,0,0,A,C,0,No Deposit,,,0,Transient,75.0,0,0,Check-Out,02-07-15
3,Resort Hotel,Beachfront Resort,0,13,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Corporate,Corporate,0,0,0,A,A,0,No Deposit,304.0,,0,Transient,75.0,0,0,Check-Out,02-07-15
4,Resort Hotel,Beachfront Resort,0,14,2015,July,27,1,0,2,2,0.0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,240.0,,0,Transient,98.0,0,1,Check-Out,03-07-15


In [47]:
# Langkah 2: Validasi hasil gabungan
# Periksa nilai kosong pada kolom hotel_description
# Tulis kode Anda di sini
df["hotel_description"].isna().sum()

np.int64(0)

In [48]:
# Langkah 3: Buat laporan jumlah pemesanan per tipe hotel
# Tulis kode Anda di sini
df.groupby(["hotel", "hotel_description"])["hotel"].count().reset_index(name="jumlah pemesanan")

Unnamed: 0,hotel,hotel_description,jumlah pemesanan
0,City Hotel,Urban Business Hotel,79330
1,Resort Hotel,Beachfront Resort,40060


## Bagian 2: Pembersihan Data dan Regular Expression

In [49]:
import re

In [50]:
# Langkah 1: Penanganan missing values
# Identifikasi nilai kosong dan lakukan imputasi
# Tulis kode Anda di sini
df.isna().sum() / len(df) * 100

hotel                              0.000000
hotel_description                  0.000000
is_canceled                        0.000000
lead_time                          0.000000
arrival_date_year                  0.000000
arrival_date_month                 0.000000
arrival_date_week_number           0.000000
arrival_date_day_of_month          0.000000
stays_in_weekend_nights            0.000000
stays_in_week_nights               0.000000
adults                             0.000000
children                           0.003350
babies                             0.000000
meal                               0.000000
country                            0.408744
market_segment                     0.000000
distribution_channel               0.000000
is_repeated_guest                  0.000000
previous_cancellations             0.000000
previous_bookings_not_canceled     0.000000
reserved_room_type                 0.000000
assigned_room_type                 0.000000
booking_changes                 

In [51]:
df.drop(columns=["company"], inplace=True)

In [52]:
df.isna().sum() / len(df) * 100

hotel                              0.000000
hotel_description                  0.000000
is_canceled                        0.000000
lead_time                          0.000000
arrival_date_year                  0.000000
arrival_date_month                 0.000000
arrival_date_week_number           0.000000
arrival_date_day_of_month          0.000000
stays_in_weekend_nights            0.000000
stays_in_week_nights               0.000000
adults                             0.000000
children                           0.003350
babies                             0.000000
meal                               0.000000
country                            0.408744
market_segment                     0.000000
distribution_channel               0.000000
is_repeated_guest                  0.000000
previous_cancellations             0.000000
previous_bookings_not_canceled     0.000000
reserved_room_type                 0.000000
assigned_room_type                 0.000000
booking_changes                 

In [53]:
# kalau tidak ada outliner, pake rata rata
# kalau ada, pakai median

df["agent"].describe()

count    103050.000000
mean         86.693382
std         110.774548
min           1.000000
25%           9.000000
50%          14.000000
75%         229.000000
max         535.000000
Name: agent, dtype: float64

In [54]:
df["agent"].fillna(df["agent"].median(), inplace=True)
df.isna().sum()

hotel                               0
hotel_description                   0
is_canceled                         0
lead_time                           0
arrival_date_year                   0
arrival_date_month                  0
arrival_date_week_number            0
arrival_date_day_of_month           0
stays_in_weekend_nights             0
stays_in_week_nights                0
adults                              0
children                            4
babies                              0
meal                                0
country                           488
market_segment                      0
distribution_channel                0
is_repeated_guest                   0
previous_cancellations              0
previous_bookings_not_canceled      0
reserved_room_type                  0
assigned_room_type                  0
booking_changes                     0
deposit_type                        0
agent                               0
days_in_waiting_list                0
customer_typ

In [56]:
df.dropna(subset=["country", "children"], axis = 0, inplace=True)

In [58]:
df.isna().sum() / len(df) * 100

hotel                             0.0
hotel_description                 0.0
is_canceled                       0.0
lead_time                         0.0
arrival_date_year                 0.0
arrival_date_month                0.0
arrival_date_week_number          0.0
arrival_date_day_of_month         0.0
stays_in_weekend_nights           0.0
stays_in_week_nights              0.0
adults                            0.0
children                          0.0
babies                            0.0
meal                              0.0
country                           0.0
market_segment                    0.0
distribution_channel              0.0
is_repeated_guest                 0.0
previous_cancellations            0.0
previous_bookings_not_canceled    0.0
reserved_room_type                0.0
assigned_room_type                0.0
booking_changes                   0.0
deposit_type                      0.0
agent                             0.0
days_in_waiting_list              0.0
customer_typ

In [60]:
df.country.unique()

array(['PRT', 'GBR', 'USA', 'ESP', 'IRL', 'FRA', 'ROU', 'NOR', 'OMN',
       'ARG', 'POL', 'DEU', 'BEL', 'CHE', 'CN', 'GRC', 'ITA', 'NLD',
       'DNK', 'RUS', 'SWE', 'AUS', 'EST', 'CZE', 'BRA', 'FIN', 'MOZ',
       'BWA', 'LUX', 'SVN', 'ALB', 'IND', 'CHN', 'MEX', 'MAR', 'UKR',
       'SMR', 'LVA', 'PRI', 'SRB', 'CHL', 'AUT', 'BLR', 'LTU', 'TUR',
       'ZAF', 'AGO', 'ISR', 'CYM', 'ZMB', 'CPV', 'ZWE', 'DZA', 'KOR',
       'CRI', 'HUN', 'ARE', 'TUN', 'JAM', 'HRV', 'HKG', 'IRN', 'GEO',
       'AND', 'GIB', 'URY', 'JEY', 'CAF', 'CYP', 'COL', 'GGY', 'KWT',
       'NGA', 'MDV', 'VEN', 'SVK', 'FJI', 'KAZ', 'PAK', 'IDN', 'LBN',
       'PHL', 'SEN', 'SYC', 'AZE', 'BHR', 'NZL', 'THA', 'DOM', 'MKD',
       'MYS', 'ARM', 'JPN', 'LKA', 'CUB', 'CMR', 'BIH', 'MUS', 'COM',
       'SUR', 'UGA', 'BGR', 'CIV', 'JOR', 'SYR', 'SGP', 'BDI', 'SAU',
       'VNM', 'PLW', 'QAT', 'EGY', 'PER', 'MLT', 'MWI', 'ECU', 'MDG',
       'ISL', 'UZB', 'NPL', 'BHS', 'MAC', 'TGO', 'TWN', 'DJI', 'STP',
       'KNA', 'ETH', 

In [70]:
# Langkah 2: Ekstraksi informasi dengan regex
# Ekstrak region dari country

def country_to_region(country):
    if country in [
    'ARM', 'AZE', 'BDI', 'BGD', 'BHR', 'BRN', 'BTN', 'CHN', 'CYP', 'GEO', 'IDN', 'IND', 'IRN', 'IRQ',
    'ISR', 'JPN', 'JOR', 'KAZ', 'KHM', 'KOR', 'KWT', 'LAO', 'LBN', 'LKA', 'MAC', 'MDV', 'MMR', 'MNG',
    'MYS', 'NPL', 'OMN', 'PAK', 'PHL', 'QAT', 'SAU', 'SGP', 'SYR', 'THA', 'TJK', 'TKM', 'TLS', 'TUR',
    'TWN', 'UZB', 'VNM', 'YEM', 'PSE'
    ]:
        return "asia"

    elif country == [
    'AGO', 'BDI', 'BEN', 'BFA', 'BWA', 'CAF', 'CIV', 'CMR', 'COD', 'COG', 'COM', 'CPV', 'DJI', 'DZA',
    'EGY', 'ERI', 'ETH', 'GAB', 'GHA', 'GIN', 'GMB', 'GNB', 'GNQ', 'KEN', 'LBR', 'LBY', 'LSO', 'MAR',
    'MDG', 'MLI', 'MOZ', 'MRT', 'MWI', 'NAM', 'NER', 'NGA', 'RWA', 'SDN', 'SEN', 'SLE', 'SOM', 'STP',
    'SWZ', 'SYC', 'TCD', 'TGO', 'TUN', 'TZA', 'UGA', 'ZAF', 'ZMB', 'ZWE'
    ]:
        return "afrika"
    
    elif country in ['ALB', 'AND', 'AUT', 'BEL', 'BGR', 'BIH', 'CHE', 'CYP', 'CZE', 'DEU', 'DNK', 'ESP', 'EST', 'FIN',
    'FRA', 'FRO', 'GBR', 'GGY', 'GIB', 'GRC', 'HRV', 'HUN', 'IMN', 'IRL', 'ISL', 'ITA', 'JEY', 'KOS',
    'LIE', 'LTU', 'LUX', 'LVA', 'MCO', 'MKD', 'MLT', 'MNE', 'NLD', 'NOR', 'POL', 'PRT', 'ROU', 'RUS',
    'SMR', 'SRB', 'SVK', 'SVN', 'SWE', 'UKR', 'VAT'
    ]:
        return "eropa"
    
    elif country in ['AIA', 'ANT', 'ATG', 'BHS', 'BLM', 'BLZ', 'BMU', 'CAN', 'CRI', 'CUB', 'CYM', 'DMA', 'DOM', 'GLP',
    'GRD', 'GTM', 'HND', 'HTI', 'JAM', 'KNA', 'LCA', 'MAF', 'MEX', 'MTQ', 'NIC', 'PAN', 'PRI', 'SLV',
    'SPM', 'TTO', 'USA', 'VCT', 'VGB', 'VIR'
    ]:
        return "amerika utara"
    
    elif country in [
    'ARG', 'BOL', 'BRA', 'CHL', 'COL', 'ECU', 'FLK', 'GUY', 'PAR', 'PER', 'PRY', 'SUR', 'URY', 'VEN'
    ]:
        return "amerika selatan"
    
    elif country in [
    'ASM', 'AUS', 'COK', 'FJI', 'FSM', 'GUM', 'KIR', 'MHL', 'MYT', 'NCL', 'NFK', 'NRU', 'NZL', 'PLW',
    'PNG', 'PYF', 'SLB', 'TKL', 'TON', 'TUV', 'UMI', 'VUT', 'WLF', 'WSM'
    ]:
        return "oceania"
    
    else:
        return "antartica"

In [71]:
df["region"] = df["country"].apply(country_to_region)
df["region"].unique()

array(['eropa', 'amerika utara', 'asia', 'amerika selatan', 'antartica',
       'oceania'], dtype=object)

In [73]:
# Identifikasi pola reservation_status
# Tulis kode Anda di sini
df.head()

Unnamed: 0,hotel,hotel_description,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,market_segment,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,assigned_room_type,booking_changes,deposit_type,agent,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date,region
0,Resort Hotel,Beachfront Resort,0,342,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,3,No Deposit,14.0,0,Transient,0.0,0,0,Check-Out,01-07-15,eropa
1,Resort Hotel,Beachfront Resort,0,737,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,4,No Deposit,14.0,0,Transient,0.0,0,0,Check-Out,01-07-15,eropa
2,Resort Hotel,Beachfront Resort,0,7,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Direct,Direct,0,0,0,A,C,0,No Deposit,14.0,0,Transient,75.0,0,0,Check-Out,02-07-15,eropa
3,Resort Hotel,Beachfront Resort,0,13,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Corporate,Corporate,0,0,0,A,A,0,No Deposit,304.0,0,Transient,75.0,0,0,Check-Out,02-07-15,eropa
4,Resort Hotel,Beachfront Resort,0,14,2015,July,27,1,0,2,2,0.0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,240.0,0,Transient,98.0,0,1,Check-Out,03-07-15,eropa


In [99]:
# Filter pemesanan berdasarkan country
# Kode negara dimulai dengan P
# Tulis kode Anda di sini

country_start_p = df[df["country"].str.match(r"^[Pp].+$")]
country_start_p["country"].unique()

array(['PRT', 'POL', 'PRI', 'PAK', 'PHL', 'PLW', 'PER', 'PAN', 'PRY',
       'PYF'], dtype=object)

In [94]:
# Kode negara tanpa R atau T
# Tulis kode Anda di sini

no_rt_countries = df[df["country"].str.match(r"^[^RrTt]+$")]
no_rt_countries["country"].unique()


array(['USA', 'ESP', 'OMN', 'POL', 'DEU', 'BEL', 'CHE', 'CN', 'NLD',
       'DNK', 'SWE', 'AUS', 'CZE', 'FIN', 'MOZ', 'BWA', 'LUX', 'SVN',
       'ALB', 'IND', 'CHN', 'MEX', 'LVA', 'CHL', 'ZAF', 'AGO', 'CYM',
       'ZMB', 'CPV', 'ZWE', 'DZA', 'HUN', 'JAM', 'HKG', 'GEO', 'AND',
       'GIB', 'JEY', 'CAF', 'CYP', 'COL', 'GGY', 'NGA', 'MDV', 'VEN',
       'SVK', 'FJI', 'KAZ', 'PAK', 'IDN', 'LBN', 'PHL', 'SEN', 'SYC',
       'AZE', 'NZL', 'DOM', 'MKD', 'MYS', 'JPN', 'LKA', 'CUB', 'BIH',
       'MUS', 'COM', 'UGA', 'CIV', 'SGP', 'BDI', 'SAU', 'VNM', 'PLW',
       'EGY', 'MWI', 'ECU', 'MDG', 'ISL', 'UZB', 'NPL', 'BHS', 'MAC',
       'DJI', 'KNA', 'HND', 'KHM', 'MCO', 'BGD', 'IMN', 'NIC', 'BEN',
       'VGB', 'GAB', 'GHA', 'GLP', 'KEN', 'LIE', 'GNB', 'MNE', 'UMI',
       'PAN', 'BFA', 'LBY', 'MLI', 'NAM', 'BOL', 'ABW', 'AIA', 'SLV',
       'DMA', 'PYF', 'GUY', 'LCA', 'ASM', 'NCL', 'SDN', 'SLE', 'LAO'],
      dtype=object)

In [101]:
# Langkah 3: Validasi akhir dan simpan dataset
# Tulis kode Anda di sini
# hotel_df.to_csv('hotel_bookings_cleaned.csv', index=False)

# df.to_csv("hotel_bookings_cleaned.csv", index=False)

display(df.head(2))
display(df.info())
display(df.isna().sum() / len(df) * 100)

Unnamed: 0,hotel,hotel_description,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,market_segment,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,assigned_room_type,booking_changes,deposit_type,agent,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date,region
0,Resort Hotel,Beachfront Resort,0,342,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,3,No Deposit,14.0,0,Transient,0.0,0,0,Check-Out,01-07-15,eropa
1,Resort Hotel,Beachfront Resort,0,737,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,4,No Deposit,14.0,0,Transient,0.0,0,0,Check-Out,01-07-15,eropa


<class 'pandas.core.frame.DataFrame'>
Index: 118898 entries, 0 to 119389
Data columns (total 33 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   hotel                           118898 non-null  object 
 1   hotel_description               118898 non-null  object 
 2   is_canceled                     118898 non-null  int64  
 3   lead_time                       118898 non-null  int64  
 4   arrival_date_year               118898 non-null  int64  
 5   arrival_date_month              118898 non-null  object 
 6   arrival_date_week_number        118898 non-null  int64  
 7   arrival_date_day_of_month       118898 non-null  int64  
 8   stays_in_weekend_nights         118898 non-null  int64  
 9   stays_in_week_nights            118898 non-null  int64  
 10  adults                          118898 non-null  int64  
 11  children                        118898 non-null  float64
 12  babies               

None

hotel                             0.0
hotel_description                 0.0
is_canceled                       0.0
lead_time                         0.0
arrival_date_year                 0.0
arrival_date_month                0.0
arrival_date_week_number          0.0
arrival_date_day_of_month         0.0
stays_in_weekend_nights           0.0
stays_in_week_nights              0.0
adults                            0.0
children                          0.0
babies                            0.0
meal                              0.0
country                           0.0
market_segment                    0.0
distribution_channel              0.0
is_repeated_guest                 0.0
previous_cancellations            0.0
previous_bookings_not_canceled    0.0
reserved_room_type                0.0
assigned_room_type                0.0
booking_changes                   0.0
deposit_type                      0.0
agent                             0.0
days_in_waiting_list              0.0
customer_typ