# Reservation Cancellation Project

The problem concerns last-minute cancellations in hotels. These
cancellations appear to be harmful because canceling is free or
inexpensive. This in turn affects prices.

The goal is to find a machine learning model that predicts which
reservations have a high probability of being canceled.

In [5]:
# Import packages.
import pandas as pd
import numpy as np
import skimpy

In [None]:
# Load the data.
df = pd.read_csv('INNHotelsGroup.csv')



## Variable Descriptions
- **Booking_ID** : Unique identifier of each booking
- **no_of_adults** : Number of adults
- **no_of_children** : Number of children
- **no_of_weekend_nights** : Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
- **no_of_week_nights** : Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
- **type_of_meal_plan** : Type of meal plan booked by the customer:
    - Not Selected – No meal plan selected
    - Meal Plan 1 – Breakfast
    - Meal Plan 2 – Half board (breakfast and one other meal)
    - Meal Plan 3 – Full board (breakfast, lunch, and dinner)
- **required_car_parking_space** : Does the customer require a car parking space? (0 - No, 1 - Yes)
- **room_type_reserved** : Type of room reserved by the customer. The values are ciphered (encoded) by INN Hotels.
- **lead_time** : Number of days between the date of booking and the arrival date
- **arrival_year** : Year of arrival date
- **arrival_month** : Month of arrival date
- **arrival_date** : Day of the month of the arrival date
- **market_segment_type** : Market segment designation.
- **repeated_guest** : Is the customer a repeated guest? (0 - No, 1 - Yes)
- **no_of_previous_cancellations** : Number of previous bookings that were canceled by the customer prior to the current booking
- **no_of_previous_bookings_not_canceled** : Number of previous bookings not canceled by the customer prior to the current booking
- **avg_price_per_room** : Average price per day of the reservation, in euros; room prices are dynamic.
- **no_of_special_requests** : Total number of special requests made by the customer (e.g. high floor, view from the room, etc.)
- **booking_status** : Flag indicating if the booking was canceled or not.
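
Since `booking_status` is the variable the model is meant to predict, it can help to see how it might be turned into a numeric target. The sketch below is only an illustration on the five sample rows of the dataset: the `sample` DataFrame and the derived columns `is_canceled` and `total_nights` are hypothetical names, not part of the original data.

```python
import pandas as pd

# Five sample rows copied from the dataset preview (subset of columns).
sample = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004", "INN00005"],
    "no_of_weekend_nights": [1, 2, 2, 0, 1],
    "no_of_week_nights": [2, 3, 1, 2, 1],
    "booking_status": ["Not_Canceled", "Not_Canceled", "Canceled", "Canceled", "Canceled"],
})

# Hypothetical target column: 1 if the booking was canceled, 0 otherwise.
sample["is_canceled"] = (sample["booking_status"] == "Canceled").astype(int)

# Hypothetical derived feature: total length of stay in nights.
sample["total_nights"] = sample["no_of_weekend_nights"] + sample["no_of_week_nights"]

print(sample[["Booking_ID", "total_nights", "is_canceled"]])
```

The same two lines could be applied to the full `df`, assuming `booking_status` only takes the values `Canceled` and `Not_Canceled`.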

## Basic Information

In [8]:
# Display the first rows of the dataset
df.head()

Unnamed: 0,Booking_ID,no_of_adults,no_of_children,no_of_weekend_nights,no_of_week_nights,type_of_meal_plan,required_car_parking_space,room_type_reserved,lead_time,arrival_year,arrival_month,arrival_date,market_segment_type,repeated_guest,no_of_previous_cancellations,no_of_previous_bookings_not_canceled,avg_price_per_room,no_of_special_requests,booking_status
0,INN00001,2,0,1,2,Meal Plan 1,0,Room_Type 1,224,2017,10,2,Offline,0,0,0,65.0,0,Not_Canceled
1,INN00002,2,0,2,3,Not Selected,0,Room_Type 1,5,2018,11,6,Online,0,0,0,106.68,1,Not_Canceled
2,INN00003,1,0,2,1,Meal Plan 1,0,Room_Type 1,1,2018,2,28,Online,0,0,0,60.0,0,Canceled
3,INN00004,2,0,0,2,Meal Plan 1,0,Room_Type 1,211,2018,5,20,Online,0,0,0,100.0,0,Canceled
4,INN00005,2,0,1,1,Not Selected,0,Room_Type 1,48,2018,4,11,Online,0,0,0,94.5,0,Canceled


In [10]:
# Dataset dimensions
df.shape

(36275, 19)
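
Alongside the shape, two quick sanity checks are often useful at this stage: whether `Booking_ID` is really unique, and whether any column has missing values. The sketch below runs them on a small stand-in built from the five sample rows; `sample` is a hypothetical name, and the results say nothing about the full dataset.

```python
import pandas as pd

# Stand-in for df: five sample rows, subset of columns.
sample = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004", "INN00005"],
    "lead_time": [224, 5, 1, 211, 48],
    "avg_price_per_room": [65.0, 106.68, 60.0, 100.0, 94.5],
})

# shape is a (rows, columns) tuple.
n_rows, n_cols = sample.shape

# True when no Booking_ID value is repeated.
has_unique_ids = sample["Booking_ID"].is_unique

# Number of missing values per column.
missing_per_column = sample.isna().sum()

print(f"{n_rows} rows, {n_cols} columns")
print("Booking_ID unique:", has_unique_ids)
print(missing_per_column)
```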

In [14]:
# Variable types
df.dtypes

Booking_ID                               object
no_of_adults                              int64
no_of_children                            int64
no_of_weekend_nights                      int64
no_of_week_nights                         int64
type_of_meal_plan                        object
required_car_parking_space                int64
room_type_reserved                       object
lead_time                                 int64
arrival_year                              int64
arrival_month                             int64
arrival_date                              int64
market_segment_type                      object
repeated_guest                            int64
no_of_previous_cancellations              int64
no_of_previous_bookings_not_canceled      int64
avg_price_per_room                      float64
no_of_special_requests                    int64
booking_status                           object
dtype: object
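
The dtypes above mix text columns (`object`) with numeric ones, and the two groups usually need different treatment later on. One possible way to separate them is `select_dtypes`, sketched here on a hypothetical `sample` frame built from the five preview rows. Using `exclude="number"` for the text columns is a choice made for this sketch so that it does not depend on how pandas stores strings.

```python
import pandas as pd

# Stand-in for df: five sample rows, two text and two numeric columns.
sample = pd.DataFrame({
    "type_of_meal_plan": ["Meal Plan 1", "Not Selected", "Meal Plan 1", "Meal Plan 1", "Not Selected"],
    "market_segment_type": ["Offline", "Online", "Online", "Online", "Online"],
    "lead_time": [224, 5, 1, 211, 48],
    "avg_price_per_room": [65.0, 106.68, 60.0, 100.0, 94.5],
})

# Integer and float columns.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()

# Everything that is not numeric (here, the text columns).
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)

# Frequency of each category in one text column.
print(sample["market_segment_type"].value_counts())
```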

In [7]:
# Descriptive statistics
skimpy.skim(df)
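
If `skimpy` is not available, a rough pandas-only summary can be obtained with `describe` and `value_counts`. This is a swapped-in alternative, not what the cell above runs, and it is sketched on a hypothetical `sample` frame built from the five preview rows, so the numbers are not representative of the full dataset.

```python
import pandas as pd

# Stand-in for df: five sample rows, subset of columns.
sample = pd.DataFrame({
    "lead_time": [224, 5, 1, 211, 48],
    "avg_price_per_room": [65.0, 106.68, 60.0, 100.0, 94.5],
    "booking_status": ["Not_Canceled", "Not_Canceled", "Canceled", "Canceled", "Canceled"],
})

# count, mean, std, min, quartiles and max for the numeric columns.
numeric_summary = sample.describe()

# Share of each booking status (fractions summing to 1).
status_share = sample["booking_status"].value_counts(normalize=True)

print(numeric_summary)
print(status_share)
```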