# <p style = "font-size : 42px; color : #000000 ; font-family : 'Oregon'; text-align : center; background-color : #dba514; border-radius: 5px 5px;"><strong>Hotel Booking Cancellation Prediction</strong></p>

<p align="center">
  <img style = "border:5px solid #ffb037;" src="https://5.imimg.com/data5/EF/GO/MY-17287433/hotel-bookings-500x500.jpg" alt="Project Banner" width="1000"/>
</p>

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Problem Statement</strong></p> 

<p style = "font-size : 16px; color : #ff9900; font-family : 'Comic Sans MS';">
    <strong>Problem:</strong> The client wants to reduce hotel booking cancellations.<br>
    <strong>Solution:</strong> Use EDA techniques to find relationships between features, hidden patterns, and actionable insights in the data.<br>
    Then use machine learning algorithms to predict cancellations and support targeted marketing strategies.
</p> 

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>About Data</strong></p> 

<p style = "font-size : 16px; color : #ff9900; font-family : 'Comic Sans MS';">
    The data includes detailed information on hotel bookings, covering customer demographics, booking patterns, and reservation specifics.<br> 
    Key attributes include booking status, stay duration, guest count, booking channel, room assignment, and any special requests.<br> 
    It is suitable for analyzing booking trends, customer behaviors, and factors influencing cancellations and modifications.
</p> 

## <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Data Collection</strong></p> 

<p style = "font-size : 16px; color : #ff9900; font-family : 'Comic Sans MS';">
    The hotel booking data is stored in a MySQL database; we fetch it from there with SQLAlchemy and pandas.
</p> 
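
One way to keep database credentials out of the notebook is to assemble the connection URL from environment variables. The sketch below is illustrative only: the helper `build_mysql_url` and the variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, and `DB_NAME` are assumptions, not part of the original project. The resulting string could then be passed to `create_engine`.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(env=os.environ):
    """Assemble a SQLAlchemy-style MySQL URL from environment variables.

    The variable names are hypothetical; adjust them to your own setup.
    """
    user = env.get("DB_USER", "projects")
    password = env.get("DB_PASSWORD", "")
    host = env.get("DB_HOST", "127.0.0.1")
    port = env.get("DB_PORT", "3306")
    name = env.get("DB_NAME", "projects_db")
    # quote_plus escapes characters such as '@' or '/' that would break the URL
    return (
        f"mysql+mysqlconnector://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )


# Example with an in-memory mapping instead of the real environment
url = build_mysql_url({"DB_USER": "projects", "DB_PASSWORD": "p@ss/word"})
print(url)
```

Passing a plain dictionary, as in the example call, makes the helper easy to check without touching real secrets.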

### <a id = '0.1'></a>
<p style = "font-size : 40px; color : #f9858b ; font-family : 'Calibri'; text-align : center; background-color : #bdfff6; border-radius: 5px 5px;"><strong>Importing Libraries</strong></p> 

In [2]:
import warnings
warnings.filterwarnings("ignore")

# database connection
from sqlalchemy import create_engine

# data manipulation
import pandas as pd
import numpy as np

# visualization
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from folium.plugins import HeatMap

# style
plt.style.use('fivethirtyeight')
%matplotlib inline
pd.set_option('display.max_columns', 40)

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Load Dataset</strong></p> 

In [3]:
engine = create_engine("mysql+mysqlconnector://projects:AIMLprojects1@127.0.0.1:3306/projects_db")
conn = engine.connect()

In [4]:
query = "SELECT * FROM hotel_booking"

In [5]:
df = pd.read_sql(query, engine)
df.head()

Unnamed: 0,hotel,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,market_segment,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,assigned_room_type,booking_changes,deposit_type,agent,company,days_in_waiting_list,customer_type,adr,required_car_parking_spaces,total_of_special_requests,reservation_status,reservation_status_date,name,email,phone-number,credit_card
0,Resort Hotel,0,342,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,3,No Deposit,,,0,Transient,0.0,0,0,Check-Out,2015-07-01,Ernest Barnes,Ernest.Barnes31@outlook.com,669-792-1661,************4322
1,Resort Hotel,0,737,2015,July,27,1,0,0,2,0.0,0,BB,PRT,Direct,Direct,0,0,0,C,C,4,No Deposit,,,0,Transient,0.0,0,0,Check-Out,2015-07-01,Andrea Baker,Andrea_Baker94@aol.com,858-637-6955,************9157
2,Resort Hotel,0,7,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Direct,Direct,0,0,0,A,C,0,No Deposit,,,0,Transient,75.0,0,0,Check-Out,2015-07-02,Rebecca Parker,Rebecca_Parker@comcast.net,652-885-2745,************3734
3,Resort Hotel,0,13,2015,July,27,1,0,1,1,0.0,0,BB,GBR,Corporate,Corporate,0,0,0,A,A,0,No Deposit,304.0,,0,Transient,75.0,0,0,Check-Out,2015-07-02,Laura Murray,Laura_M@gmail.com,364-656-8427,************5677
4,Resort Hotel,0,14,2015,July,27,1,0,2,2,0.0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,240.0,,0,Transient,98.0,0,1,Check-Out,2015-07-03,Linda Hines,LHines@verizon.com,713-226-5883,************5498


### <div style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Dataset Description</strong></div>

<div align='center' style="border-radius:10px; padding: 15px; font-size:15px;">

| __Index__ | __Variable__ | __Description__ |
|   :---    |     :---     |       :---      |
| 1 | __hotel__ | Type of hotel (Resort Hotel, City Hotel) |
| 2 | __is_canceled__ | Reservation cancellation status (0 = not canceled, 1 = canceled) |
| 3 | __lead_time__ | Number of days between booking and arrival |
| 4 | __arrival_date_year__ | Year of arrival |
| 5 | __arrival_date_month__ | Month of arrival |
| 6 | __arrival_date_week_number__ | Week number of the year for arrival |
| 7 | __arrival_date_day_of_month__ | Day of the month of arrival |
| 8 | __stays_in_weekend_nights__ | Number of weekend nights (Saturday and Sunday) the guest stayed or booked |
| 9 | __stays_in_week_nights__ | Number of week nights the guest stayed or booked |
| 10 | __adults__ | Number of adults |
| 11 | __children__ | Number of children |
| 12 | __babies__ | Number of babies |
| 13 | __meal__ | Type of meal booked (BB, FB, HB, SC, Undefined) |
| 14 | __country__ | Country of origin of the guest |
| 15 | __market_segment__ | Market segment designation |
| 16 | __distribution_channel__ | Booking distribution channel |
| 17 | __is_repeated_guest__ | If the guest is a repeat customer (0 = not repeated, 1 = repeated) |
| 18 | __previous_cancellations__ | Number of previous bookings that were canceled by the customer |
| 19 | __previous_bookings_not_canceled__ | Number of previous bookings that were not canceled by the customer |
| 20 | __reserved_room_type__ | Type of reserved room |
| 21 | __assigned_room_type__ | Type of assigned room |
| 22 | __booking_changes__ | Number of changes made to the booking |
| 23 | __deposit_type__ | Type of deposit made (No Deposit, Refundable, Non Refund) |
| 24 | __agent__ | ID of the travel agent responsible for the booking |
| 25 | __company__ | ID of the company responsible for the booking |
| 26 | __days_in_waiting_list__ | Number of days the booking was in the waiting list |
| 27 | __customer_type__ | Type of customer (Transient, Contract, Transient-Party, Group) |
| 28 | __adr__ | Average Daily Rate |
| 29 | __required_car_parking_spaces__ | Number of car parking spaces required |
| 30 | __total_of_special_requests__ | Number of special requests made |
| 31 | __reservation_status__ | Last reservation status (Check-Out, Canceled, No-Show) |
| 32 | __reservation_status_date__ | Date of the last reservation status |
| 33 | __name__ | Guest's name |
| 34 | __email__ | Guest's email address |
| 35 | __phone-number__ | Guest's phone number |
| 36 | __credit_card__ | Last four digits of the guest's credit card |
</div>

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Look into Data</strong></p> 

In [6]:
df.shape

(119386, 36)

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 119386 entries, 0 to 119385
Data columns (total 36 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   hotel                           119386 non-null  object 
 1   is_canceled                     119386 non-null  int64  
 2   lead_time                       119386 non-null  int64  
 3   arrival_date_year               119386 non-null  int64  
 4   arrival_date_month              119386 non-null  object 
 5   arrival_date_week_number        119386 non-null  int64  
 6   arrival_date_day_of_month       119386 non-null  int64  
 7   stays_in_weekend_nights         119386 non-null  int64  
 8   stays_in_week_nights            119386 non-null  int64  
 9   adults                          119386 non-null  int64  
 10  children                        119386 non-null  float64
 11  babies                          119386 non-null  int64  
 12  meal            

<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">

<h4 align="left"><font color=bdfff6>Insights:</font></h4>

* __Number of Entries__: The dataset consists of __119,386 entries__.

* __Columns__: The dataset contains __36 columns__, which represent various attributes related to hotel bookings.

* __Data Types__:
    - A majority of the columns, 16 to be precise, are of the __object__ data type (often representing strings or categorical data).
    - 16 columns are of the __int64__ data type, representing integer values.
    - 4 columns are of the __float64__ data type, which typically denotes decimal values.

* __Missing Values__: 
    - The column `children` has no missing values in this extract (`df.info()` reports 119,386 non-null entries).
    - The column `country` has 488 missing values.
    - The column `agent` has 16,340 missing values.
    - The column `company` has a significant number of missing values, totaling 112,593.

These missing values might need addressing, depending on the context of the analysis or the model we intend to build.
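
Per-column missing counts can be obtained with `isna()`. The following is a minimal sketch on a made-up toy frame (the values are invented for illustration); with the real data, `toy` would be replaced by `df`, and the check should run before any conversion to strings.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the booking data; values are made up
toy = pd.DataFrame({
    "children": [0.0, 1.0, np.nan, 0.0],
    "country": ["PRT", None, "GBR", "PRT"],
    "agent": [9.0, np.nan, np.nan, 240.0],
    "adults": [2, 2, 1, 2],
})

# Count and percentage of missing values per column
missing = pd.DataFrame({
    "n_missing": toy.isna().sum(),
    "pct_missing": (toy.isna().mean() * 100).round(2),
})

# Keep only columns with at least one gap, worst first
missing = missing[missing["n_missing"] > 0].sort_values("n_missing", ascending=False)
print(missing)
```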

<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">

<h4 align="left"><font color=bdfff6>Note:</font></h4>
    
Based on the data types and the feature explanations provided earlier, we identified that __20 columns__ (`hotel`, `is_canceled`, `arrival_date_year`, `arrival_date_month`, `meal`, `country`, `market_segment`, `distribution_channel`, `is_repeated_guest`, `reserved_room_type`, `assigned_room_type`, `deposit_type`, `agent`, `company`, `customer_type`, `reservation_status`, `name`, `email`, `phone-number` and `credit_card`) are __categorical__ in terms of their semantics. These features must have string (__object__) data type to ensure proper analysis and interpretation in subsequent steps:

In [16]:
# Based on the given dataset description, list the columns that are categorical in terms of their semantics
categorical_columns_semantics = [
    'hotel',
    'is_canceled',
    'arrival_date_year',
    'arrival_date_month',
    'meal',
    'country',
    'market_segment',
    'distribution_channel',
    'is_repeated_guest',
    'reserved_room_type',
    'assigned_room_type',
    'deposit_type',
    'agent',
    'company',
    'customer_type',
    'reservation_status',
    'name',
    'email',
    'phone-number',
    'credit_card'
]

# Convert these columns to string (object) data type
for column in categorical_columns_semantics:
    if column in df.columns:
        df[column] = df[column].astype(str)

# Verify the changes in data types
df.dtypes

hotel                              object
is_canceled                        object
lead_time                           int64
arrival_date_year                  object
arrival_date_month                 object
arrival_date_week_number            int64
arrival_date_day_of_month           int64
stays_in_weekend_nights             int64
stays_in_week_nights                int64
adults                              int64
children                          float64
babies                              int64
meal                               object
country                            object
market_segment                     object
distribution_channel               object
is_repeated_guest                  object
previous_cancellations              int64
previous_bookings_not_canceled      int64
reserved_room_type                 object
assigned_room_type                 object
booking_changes                     int64
deposit_type                       object
agent                             
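
One caveat with `astype(str)`: in the pandas behaviour seen in this notebook, missing values become the literal text `'nan'` and are then no longer reported by `isna()` (newer pandas versions may behave differently). A possible alternative, sketched here on made-up values and not the notebook's actual approach, is the `category` dtype, which keeps missing entries as missing.

```python
import numpy as np
import pandas as pd

# Made-up values standing in for the `company` column
company = pd.Series([np.nan, 110.0, np.nan, 223.0], name="company")

# Conversion used in the notebook: every value is turned into text
as_str = company.astype(str)

# Alternative: categorical dtype, where NaN is not a category
as_cat = company.astype("category")

print(as_str.tolist())
print(as_cat.isna().sum())        # missing entries are still counted
print(as_cat.cat.categories)      # only the observed company IDs
```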

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Summary of Numerical Features</strong></p> 

In [18]:
df.describe().T

| __Feature__ | __count__ | __mean__ | __std__ | __min__ | __25%__ | __50%__ | __75%__ | __max__ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| lead_time | 119386.0 | 104.014801 | 106.863286 | 0.0 | 18.0 | 69.0 | 160.0 | 737.0 |
| arrival_date_week_number | 119386.0 | 27.165003 | 13.605334 | 1.0 | 16.0 | 28.0 | 38.0 | 53.0 |
| arrival_date_day_of_month | 119386.0 | 15.798553 | 8.780783 | 1.0 | 8.0 | 16.0 | 23.0 | 31.0 |
| stays_in_weekend_nights | 119386.0 | 0.927605 | 0.998618 | 0.0 | 0.0 | 1.0 | 2.0 | 19.0 |
| stays_in_week_nights | 119386.0 | 2.50031 | 1.908289 | 0.0 | 1.0 | 2.0 | 3.0 | 50.0 |
| adults | 119386.0 | 1.85639 | 0.579261 | 0.0 | 2.0 | 2.0 | 2.0 | 55.0 |
| children | 119386.0 | 0.10389 | 0.398561 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
| babies | 119386.0 | 0.007949 | 0.097438 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
| previous_cancellations | 119386.0 | 0.087121 | 0.84435 | 0.0 | 0.0 | 0.0 | 0.0 | 26.0 |
| previous_bookings_not_canceled | 119386.0 | 0.137102 | 1.497462 | 0.0 | 0.0 | 0.0 | 0.0 | 72.0 |


<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">

<h4 align="left"><font color=bdfff6>Numerical Features:</font></h4>

* __`lead_time`__: The average time between booking and arrival is approximately 104 days, ranging from 0 to 737 days.
    

* __`arrival_date_week_number`__: The average week number of the year for arrival is approximately 27.17, ranging from 1 to 53.

    
* __`arrival_date_day_of_month`__: The average day of the month of arrival is approximately 15.80, ranging from 1 to 31.
    
    
* __`stays_in_weekend_nights`__: On average, guests stay for around 0.93 weekend nights, with some guests staying for as long as 19 weekend nights.
    
    
* __`stays_in_week_nights`__: On average, guests stay for about 2.5 week nights, with stays ranging from 0 to 50 week nights.
    
    
* __`adults`__: The average number of adults per booking is approximately 1.86, with a range from 0 to 55 adults.
    
    
* __`children`__: On average, there are about 0.1 children per booking, with a maximum of 10 children in some bookings.
    
    
* __`babies`__: The average number of babies per booking is very low, approximately 0.008, with a maximum of 10 babies in some bookings.
    
    
* __`previous_cancellations`__: On average, guests have canceled about 0.09 times in the past, with some guests having as many as 26 previous cancellations.
    
    
* __`previous_bookings_not_canceled`__: On average, guests have about 0.14 previous bookings that were not canceled, with some guests having as many as 72.
    
    
* __`booking_changes`__: Bookings are changed on average about 0.22 times, with some bookings being changed up to 21 times.
    
    
* __`days_in_waiting_list`__: The average time a booking spends on the waiting list is approximately 2.32 days, with some bookings waiting for as long as 391 days.
    
    
* __`adr`__: The Average Daily Rate is about 101.83, with rates ranging from -6.38 (which might indicate errors or special cases) to 5400.
    
    
* __`required_car_parking_spaces`__: On average, guests require about 0.06 parking spaces, with some bookings requiring up to 8 spaces.
    
    
* __`total_of_special_requests`__: On average, guests make about 0.57 special requests, with some making up to 5 requests.

<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">

<h4 align="left"><font color=bdfff6>Noisy Data:</font></h4>

Based on the descriptions and the inferences provided earlier, the following features seem to contain noisy data:

1. __`adr`__: The Average Daily Rate (ADR) represents the average amount a guest pays for a room per day. In this dataset, ADR has a minimum value of -6.38. A negative rate doesn't make sense, indicating potential errors or special cases.

    
2. __`adults`__: The minimum value is 0, which implies bookings without any adults. This could be a data entry error unless there are legitimate cases where only children or babies book a room.

    
3. __`children`__ and __`babies`__: Both have a maximum value of 10, which seems unusually high for a single booking. This could be a potential outlier or data entry error, but it would require more context or domain knowledge to say definitively.

Further investigation, possibly involving domain expertise or additional context, would be necessary to determine the true nature of these data points.
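
The three checks above can be expressed as boolean masks. This is a sketch on made-up rows; the rule names and the threshold of 10 are assumptions for illustration, to be reviewed with domain experts before anything is dropped or corrected.

```python
import pandas as pd

# Made-up rows; column names follow the dataset description
toy = pd.DataFrame({
    "adr": [75.0, -6.38, 98.0, 120.0],
    "adults": [2, 2, 0, 1],
    "children": [0.0, 0.0, 2.0, 10.0],
    "babies": [0, 0, 0, 0],
})

# Each rule is a boolean mask over the rows (hypothetical rule names)
rules = {
    "negative_adr": toy["adr"] < 0,
    "no_adults": toy["adults"] == 0,
    "many_children_or_babies": (toy["children"] >= 10) | (toy["babies"] >= 10),
}

flags = pd.DataFrame(rules)
print(flags.sum())  # number of rows hit by each rule

# Rows hit by at least one rule, kept for manual inspection
suspicious = toy[flags.any(axis=1)]
print(suspicious)
```

Flagging rather than deleting keeps the decision about each case separate from its detection.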

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Summary of categorical features</strong></p> 

In [None]:
# Summary of categorical features
df.describe(include='object').T

| __Feature__ | __count__ | __unique__ | __top__ | __freq__ |
| :--- | :--- | :--- | :--- | :--- |
| hotel | 119386 | 2 | City Hotel | 79326 |
| arrival_date_month | 119386 | 12 | August | 13873 |
| meal | 119386 | 5 | BB | 92306 |
| country | 119386 | 178 | PRT | 48586 |
| market_segment | 119386 | 7 | Online TA | 56476 |
| distribution_channel | 119386 | 5 | TA/TO | 97870 |
| reserved_room_type | 119386 | 10 | A | 85994 |
| assigned_room_type | 119386 | 12 | A | 74053 |
| deposit_type | 119386 | 3 | No Deposit | 104637 |
| agent | 119386 | 334 | 9.0 | 31960 |


<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">
    
    
<h4 align="left"><font color=bdfff6>Categorical Features:</font></h4>

* __`hotel`__: There are two types of hotels, with "City Hotel" being the most frequent category, occurring 79,326 times out of 119,386 entries.

    
* __`is_canceled`__: Two unique values (0 for not canceled and 1 for canceled) are present. The most common value is "0" (not canceled), occurring 75,166 times.

    
* __`arrival_date_year`__: There are three unique years present, with 2016 being the most frequent year of arrival, observed 56,707 times.

    
* __`arrival_date_month`__: There are 12 unique months. August is the most frequent month of arrival, observed 13,873 times.

    
* __`meal`__: Five unique types of meals are booked. The most common type is "BB", observed 92,306 times.

    
* __`country`__: 178 unique countries of origin are present. The most frequent country is "PRT" (Portugal), observed 48,586 times.

    
* __`market_segment`__: Seven unique market segments are present. The most common segment is "Online TA", observed 56,476 times.

    
* __`distribution_channel`__: Five unique booking distribution channels are present, with "TA/TO" being the most frequent, observed 97,870 times.

    
* __`is_repeated_guest`__: Two unique values (0 for not repeated and 1 for repeated) are present. The most frequent value is "0" (not repeated), observed in the majority of entries.

    
* __`reserved_room_type`__ and __`assigned_room_type`__: There are 10 reserved and 12 assigned room types. Type "A" is the most frequent in both, reserved 85,994 times and assigned 74,053 times.

    
* __`deposit_type`__: Three unique types of deposits are made. The most common type is "No Deposit", observed 104,637 times.


* __`agent`__: There are 334 unique agents, with '9.0' being the most frequent, occurring 31,960 times.
    
    
* __`company`__: There are 353 unique companies, but '__nan__' (missing values) is the most frequent, occurring 112,593 times. This indicates that the 'company' variable has a high percentage of missing values.
    
    
* __`customer_type`__: Four unique customer types are present. The most frequent type is "Transient", observed 89,613 times.

    
* __`reservation_status`__: Three unique reservation statuses are present. The most common status is "Check-Out", observed 75,166 times.


* __`reservation_status_date`__: There are 926 unique values, with '2015-10-21' being the most frequent, occurring 1,461 times. 


* __`name`__: There are 81,503 unique names, with 'Michael Johnson' being the most frequent, occurring 48 times.
    

* __`email`__: There are 115,889 unique email addresses, with 'Michael.C@gmail.com' being the most frequent, occurring 6 times.
    
    
* __`phone-number`__: Phone numbers are almost entirely unique across the 119,386 bookings, indicating that almost every guest has a unique phone number.
    
    
* __`credit_card`__: There are 9,000 unique credit card numbers, with '************4923' being the most frequent, occurring 28 times.

### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Explore Data</strong></p> 

In [None]:
# define numerical & categorical columns
numeric_features = [col for col in df.columns if df[col].dtype != 'O']
categorical_features = [col for col in df.columns if df[col].dtype == 'O']

# print columns
print(f'We have {len(numeric_features)} numerical features : {numeric_features}')
print(f'\nWe have {len(categorical_features)} categorical features : {categorical_features}')

We have 18 numerical features : ['is_canceled', 'lead_time', 'arrival_date_year', 'arrival_date_week_number', 'arrival_date_day_of_month', 'stays_in_weekend_nights', 'stays_in_week_nights', 'adults', 'children', 'babies', 'is_repeated_guest', 'previous_cancellations', 'previous_bookings_not_canceled', 'booking_changes', 'days_in_waiting_list', 'adr', 'required_car_parking_spaces', 'total_of_special_requests']

We have 18 categorical features : ['hotel', 'arrival_date_month', 'meal', 'country', 'market_segment', 'distribution_channel', 'reserved_room_type', 'assigned_room_type', 'deposit_type', 'agent', 'company', 'customer_type', 'reservation_status', 'reservation_status_date', 'name', 'email', 'phone-number', 'credit_card']


In [11]:
# proportion of count data on categorical columns
for col in categorical_features:
    print(df[col].value_counts(normalize=True) * 100)
    print('- ' * 50)

hotel
City Hotel      66.444977
Resort Hotel    33.555023
Name: proportion, dtype: float64
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
arrival_date_month
August       11.620290
July         10.605096
May           9.876367
October       9.347830
April         9.288359
June          9.162716
September     8.801702
March         8.203642
February      6.757911
November      5.690785
December      5.679058
January       4.966244
Name: proportion, dtype: float64
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
meal
BB           77.317273
HB           12.114486
SC            8.920644
Undefined     0.979177
FB            0.668420
Name: proportion, dtype: float64
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
country
PRT    40.696564
GBR    10.159483
FRA     8.723803
ESP     7.176721
DEU     6.103731
         ...    
DJI     0.00083

<div style="font-size: 16px; color: #ff9900; font-family: 'Comic Sans MS';">
<h4 align="left"><font color="#bdfff6">Insights:</font></h4>

* __Hotel Types__: 
    - __`City Hotel`__ bookings dominate at about 66%, while __`Resort Hotel`__ bookings make up the remaining 34%.
    - This may reflect a preference for city accommodations, for example due to accessibility or travel purposes, although differences in capacity cannot be ruled out from booking shares alone.

* __Arrival Months__: 
    - The busiest months are __`August`__ (11.6%) and __`July`__ (10.6%), followed closely by __`May`__ and __`October`__.
    - This indicates a peak booking season in summer and a secondary rise in October, possibly due to tourism trends.

* __Meal Preferences__: 
    - __Breakfast-only (`BB`)__ is the most popular meal plan at 77%, followed by __Half-Board (`HB`)__ at 12%.
    - This indicates a preference for light, breakfast-focused dining over more inclusive plans.

* __Country of Origin__: 
    - The largest share of guests comes from __`Portugal` (40.7%)__, followed by __`the UK` (10.2%)__, __`France` (8.7%)__, __`Spain` (7.2%)__, and __`Germany` (6.1%)__.
    - This suggests a strong local (Portuguese) clientele, with a significant proportion from nearby European countries.

* __Market Segment__: 
    - The largest segment is __`Online Travel Agencies` (47%)__, followed by __`Offline TA/TO` (20%)__ and __`Groups` (16.6%)__.
    - This trend reflects the importance of online booking channels.

* __Distribution Channels__: 
    - The __Travel Agent/Operator (`TA/TO`)__ channel accounts for 82% of bookings, followed by __`Direct`__ bookings (12%) and __`Corporate`__ channels (5.6%).
    - This suggests reliance on travel agencies and operators.

* __Room Types__: 
    - Reserved room type __`A`__ is the most popular at 72%, but assigned room type __`A`__ only makes up 62%, indicating some reassignment.
    - Room types __`D`__ and __`E`__ are also common.

* __Deposit Type__: 
    - __`No Deposit`__ is chosen by 87.6% of guests, while __`Non Refund`__ bookings account for 12%.
    - This shows a strong preference for flexibility.

* __Customer Type__: 
    - __`Transient`__ customers (75%) are the most common, likely reflecting short-term travelers.
    - __`Transient-Party`__ (21%) and __`Contract`__ (3.4%) types follow, with very few __`Group`__ bookings (0.5%).

* __Reservation Status__: 
    - __`Check-Out`__ accounts for 63% of reservations, __`Canceled`__ reservations make up 36%, and __`No-Show`__ is around 1%.
    - This indicates a significant number of cancellations.

* __Reservation Dates__: 
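    - Some specific dates account for a larger share of records, such as __`2015-10-21`__ and __`2015-07-06`__.
    - Because `reservation_status_date` records when the reservation status was last updated, these spikes may reflect many check-outs or cancellations registered on the same day rather than high-demand arrival dates.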

* __Unique Identifiers__: 
    - __`name`__, __`email`__, and __`phone-number`__ values are highly unique with few repeats, as expected for identifiers; __`credit_card`__ holds only the last four digits, so its values repeat far more often (about 9,000 distinct values).

These insights can inform strategies around pricing, marketing (especially in peak months), channel prioritization, and customer engagement. 
For instance, focusing on online channels and understanding preferences for city hotels, breakfast-only options, and flexible booking policies could help optimize service offerings.

</div>
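
Identifier-like columns can be spotted by the share of distinct values per column. The sketch below uses a made-up sample, and the 0.9 cut-off is an arbitrary assumption; columns that repeat more often, such as masked card numbers, would fall below it and need a separate decision.

```python
import pandas as pd

# Made-up sample; with the real data, `toy` would be replaced by `df`
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "City Hotel"],
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "phone-number": ["111", "222", "333", "444"],
    "meal": ["BB", "BB", "HB", "BB"],
})

# Share of distinct values per column: close to 1.0 means identifier-like
uniqueness = toy.nunique() / len(toy)
print(uniqueness)

# 0.9 is an arbitrary cut-off chosen for this sketch
id_like = uniqueness[uniqueness > 0.9].index.tolist()
print(id_like)
```

Such columns carry personal data and little predictive signal, so they are usual candidates for removal before modeling.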


### <p style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Univariate Analysis</strong></p> 

<p style = "font-size : 16px; color : #ff9900; font-family : 'Comic Sans MS';">
    Univariate analysis is the analysis of a single variable; the prefix “uni” means “one”. <br>
    The purpose of univariate analysis is to understand the distribution of values for a single variable.
</p> 
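
As a minimal sketch of what a univariate summary might look like, the example below describes a single numeric variable through its summary statistics, skewness, and a binned frequency table. The lead times and bin edges are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Made-up lead times in days, only to illustrate the idea
lead_time = pd.Series([0, 3, 7, 14, 18, 30, 69, 160, 342, 737], name="lead_time")

# Central tendency and spread of the single variable
print(lead_time.describe())

# Positive skewness points to a long right tail
skewness = lead_time.skew()
print(skewness)

# Frequency table over hand-picked bins, a text-mode stand-in for a histogram
binned = pd.cut(lead_time, bins=[0, 30, 90, 365, 800], include_lowest=True)
bin_counts = binned.value_counts().sort_index()
print(bin_counts)
```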

### <div style = "font-size : 25px; color : #ff0099; font-family : 'Comic Sans MS'; "><strong>Numerical Features</strong></div> 