# We are going to predict:
> `if the customer is going to honor the reservation or cancel it`

We are going to use the dataset `Hotel Reservations.csv`.

The file contains the attributes of customers' reservation details. The detailed data dictionary is given below.

## Data Dictionary
* Booking_ID: Unique identifier of each booking
* no_of_adults: Number of adults
* no_of_children: Number of children
* no_of_weekend_nights: Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
* no_of_week_nights: Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
* type_of_meal_plan: Type of meal plan booked by the customer
* required_car_parking_space: Does the customer require a car parking space? (0 - No, 1 - Yes)
* room_type_reserved: Type of room reserved by the customer. The values are ciphered (encoded) by INN Hotels.
* lead_time: Number of days between the date of booking and the arrival date
* arrival_year: Year of arrival date
* arrival_month: Month of arrival date
* arrival_date: Date of the month
* market_segment_type: Market segment designation
* repeated_guest: Is the customer a repeated guest? (0 - No, 1 - Yes)
* no_of_previous_cancellations: Number of previous bookings that were canceled by the customer prior to the current booking
* no_of_previous_bookings_not_canceled: Number of previous bookings not canceled by the customer prior to the current booking
* avg_price_per_room: Average price per day of the reservation, in euros; room prices are dynamic
* no_of_special_requests: Total number of special requests made by the customer (e.g. high floor, view from the room, etc.)
* booking_status: Flag indicating if the booking was canceled or not

# Import all Necessary Libraries

In [35]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

## Import the dataset
  * Load the dataset
  * Drop `Booking_ID`: it is a unique identifier and carries no predictive information

In [36]:

data = pd.read_csv('Hotel Reservations.csv')

data = data.drop('Booking_ID', axis=1)

## Separate features and target variable

In [37]:
X = data.drop('booking_status', axis=1)
y = data['booking_status']

## Encode booking_status as integer labels using label encoding

In [38]:
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(y)
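
To see which status each integer stands for, the fitted encoder's `classes_` attribute can be inspected. The following is a minimal sketch on toy data; the label strings `Canceled` and `Not_Canceled` are assumed for illustration and may differ from the values in the CSV.

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels; the strings are assumptions, not read from the dataset.
toy_status = ['Not_Canceled', 'Canceled', 'Not_Canceled', 'Canceled']

toy_encoder = LabelEncoder()
toy_encoded = toy_encoder.fit_transform(toy_status)

# classes_ is sorted, and each class's position is its integer code.
mapping = {label: int(code) for code, label in enumerate(toy_encoder.classes_)}
print(mapping)
print(toy_encoded.tolist())
```

In the notebook itself, `label_encoder.classes_` would give the same kind of mapping for the rows labelled 0 and 1 in the classification report.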

## Split the data into train and test sets

In [39]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
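
The split above is purely random. An optional variation, not used in this notebook, is to pass `stratify=y` so that the train and test sets keep the same class proportions. A sketch on synthetic data with made-up proportions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows, 30 of class 0 and 70 of class 1 (made-up proportions).
X_toy = np.arange(100).reshape(-1, 1)
y_toy = np.array([0] * 30 + [1] * 70)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# With stratify, the 20-row test set keeps the 30/70 class ratio.
test_counts = np.bincount(y_te)
print(test_counts)
```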

## Define the columns to be one-hot encoded

In [40]:
categorical_cols = ['type_of_meal_plan', 'room_type_reserved', 'market_segment_type']

## Create a transformer for one-hot encoding

In [41]:
preprocessor = ColumnTransformer(
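    # handle_unknown='ignore' avoids an error if a category appears only at prediction time
    transformers=[('cat', OneHotEncoder(handle_unknown='ignore'), categorical_cols)],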
    remainder='passthrough'
)

## Define the model pipeline

In [42]:
model = Pipeline([
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(random_state=42))
])

## Fit the model on the training data

In [43]:
model.fit(X_train, y_train)

## Make predictions on the test data

In [44]:
y_pred = model.predict(X_test)

## Evaluate the model

In [45]:
accuracy = accuracy_score(y_test, y_pred)
# Use a separate name so the imported classification_report function is not overwritten
report = classification_report(y_test, y_pred)

## Print the evaluation metrics

In [46]:
print(f"Accuracy: {accuracy}")
print(f"Classification Report:\n{report}")

Accuracy: 0.9044796691936595
Classification Report:
              precision    recall  f1-score   support

           0       0.89      0.82      0.85      2416
           1       0.91      0.95      0.93      4839

    accuracy                           0.90      7255
   macro avg       0.90      0.88      0.89      7255
weighted avg       0.90      0.90      0.90      7255
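
The per-class rows of the report are derived from counts of correct and incorrect predictions. As a sketch on made-up labels (not the notebook's predictions), `confusion_matrix` exposes those counts, and the recall for class 0 follows directly from them:

```python
from sklearn.metrics import confusion_matrix, recall_score

# Made-up labels for illustration; not the notebook's predictions.
y_true_toy = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred_toy = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_toy, y_pred_toy)
print(cm)

# Recall for class 0 = correctly predicted 0s / all true 0s.
recall_0 = cm[0, 0] / cm[0].sum()
print(recall_0)

# The same value computed by scikit-learn directly.
print(recall_score(y_true_toy, y_pred_toy, pos_label=0))
```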



## Import pickle to save the model

In [47]:
import pickle

## Save the model (in pickle terms, `dump` it)

In [50]:
with open('geekschallange_Chandrashekhar_Robbi.pkl', 'wb') as f:
  pickle.dump(model, f)
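
Loading the model back is the mirror image of dumping it. The sketch below uses a tiny stand-in model and a hypothetical file name (`toy_model.pkl`) in a temporary directory so that it runs on its own; with the notebook's file, the same `pickle.load` call would apply.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestClassifier

# Tiny stand-in model trained on made-up data, so the sketch is self-contained.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]] * 5
y_toy = [0, 0, 1, 1] * 5
toy_model = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_toy, y_toy)

# Hypothetical file name inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'toy_model.pkl')
    with open(path, 'wb') as f:
        pickle.dump(toy_model, f)
    # 'rb' mirrors the 'wb' mode used when dumping.
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions.
same = bool((restored.predict(X_toy) == toy_model.predict(X_toy)).all())
print(same)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that was used to save them.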