<a href="https://colab.research.google.com/github/Codewith-naved/Capstone-Project-Module-2/blob/main/Hotel_Booking_Analysis_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - **Hotel Booking Analysis**




##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Submitted By**    - Naaved Shaikh

# **Project Summary -**

In the dynamic landscape of the hospitality industry, data-driven decision-making is essential for optimizing operations and enhancing customer experiences. This project focuses on the exploratory data analysis (EDA) of a hotel booking dataset from Booking.com, capturing a wide range of information about bookings, customer profiles, and various booking channels. The dataset includes details such as booking lead times, arrival dates, meal plans, room types, and customer preferences.

The primary goal is to identify patterns and trends in booking behaviors, cancellations, and customer preferences to help Booking.com refine its operations. Given the competition in the hospitality sector, companies like Booking.com must continuously improve their booking processes and maximize revenue. By leveraging insights gained from this analysis, the company can streamline operations and tailor marketing efforts.

The project begins with a thorough examination of the dataset to clean and preprocess the data, addressing missing values and ensuring data consistency. This step is critical for ensuring the accuracy of subsequent analyses. Descriptive statistics are employed to explore customer profiles and booking patterns, revealing essential information about factors such as group composition, booking lead time, and hotel types preferred by different segments.

Key visualizations illustrate important findings, including booking percentages by hotel type, cancellation analysis, and insights into repeating guests. These visualizations help identify how various factors impact booking behavior, enabling targeted interventions to reduce cancellations. The project also investigates market segments that contribute most to bookings, highlighting trends that inform marketing strategies. Monthly booking distribution analysis identifies peak booking periods, assisting Booking.com in optimizing pricing and promotions.

Insights into the top countries by bookings and average lead time can help tailor offerings to meet diverse customer needs. Another critical aspect involves examining the average daily rate (ADR) across different hotel types and meal preferences. Understanding relationships between room types, meal plans, and pricing strategies can guide operational decisions, such as inventory management and dynamic pricing adjustments. The correlation heatmap helps identify relationships between various factors, offering insights into how different variables interact and influence operations.

The final deliverables include a comprehensive report detailing findings, supported by clear visualizations that illustrate key insights. Recommendations based on the analysis will focus on strategies for reducing cancellations, enhancing customer loyalty, and improving service delivery. The project aims to equip Booking.com with the necessary tools and insights to adapt to changing market conditions and anticipate guest needs.

In conclusion, this project contributes to understanding customer booking behaviors in the hospitality industry and provides Booking.com with a roadmap for making informed, data-driven decisions. By leveraging the insights gained from this analysis, the company can enhance its competitive advantage, drive profitability, and continue delivering exceptional service to its guests.

# **Problem Statement**


In the highly competitive hospitality industry, optimizing hotel operations is essential for enhancing customer satisfaction and maximizing revenue. Booking.com, a leading player in this market, faces challenges related to booking efficiency, cancellation rates, and understanding customer preferences. The availability of a rich dataset, encompassing details such as booking lead times, guest demographics, reservation types, and special requests, presents an opportunity to analyze booking behaviors and identify key trends. By leveraging this data, the project aims to uncover actionable insights that can inform strategic decision-making, allowing Booking.com to improve its marketing efforts, tailor its offerings, and streamline operational processes. This analysis seeks to provide a deeper understanding of factors influencing customer decisions, thereby enabling the company to better meet market demands and enhance overall guest experiences.

#### **Define Your Business Objective?**

The primary objective of this project is to analyze the hotel booking dataset from Booking.com to derive insights that enhance hotel operations and customer experiences. By examining key metrics such as booking patterns, cancellation rates, and guest preferences, the analysis aims to identify strategies for reducing cancellations, optimizing marketing efforts, and improving operational efficiency. Ultimately, this project seeks to provide Booking.com with actionable recommendations to increase profitability and foster customer loyalty in the competitive hospitality market.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that each chart answers the questions in the following format.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





## ***1. Know Your Data***

# ***Let's Begin !***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

### Dataset Loading

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load Dataset
path = '/content/drive/MyDrive/Capstone Project - Naved/Hotel Bookings.csv'
hotel_df = pd.read_csv(path)

In [None]:
# Work on a copy so the raw data in hotel_df stays untouched
df = hotel_df.copy()

### Dataset First View

In [None]:
# Dataset First Look
df.head()

In [None]:
df.tail()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
df.info()

In [None]:
# Summary statistics for the numeric columns
df.describe()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
df.duplicated().sum()

In [None]:
df.drop_duplicates(inplace = True)

In [None]:
df.duplicated().sum()

In [None]:
df.shape

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.isnull().sum().sort_values(ascending = False)

In [None]:
# Visualizing the missing values
# Checking Null Value by plotting Heatmap
sns.heatmap(df.isnull(), cbar=False)

### What did you know about your dataset?

The dataset has 119390 rows and 32 columns, of which 31994 rows are duplicates. Some columns contain missing (null) values: children, agent, country, and company. The company column is almost 94% missing.
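The share of missing values per column can be computed directly rather than read off the heatmap. The following is a minimal sketch on a made-up toy frame (the `toy` data and its values are assumptions for illustration, not rows from the real dataset):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the hotel bookings data (values are made up)
toy = pd.DataFrame({
    "children": [0.0, 1.0, np.nan, 2.0],
    "country": ["PRT", None, "GBR", "ESP"],
    "company": [np.nan, np.nan, np.nan, 40.0],
    "adults": [2, 2, 1, 3],
})

# isnull().mean() gives the fraction of missing entries in each column
missing_pct = (toy.isnull().mean() * 100).sort_values(ascending=False)
print(missing_pct.round(1))
```

Applied to `df`, the same expression would show which columns are candidates for filling and which, like `company`, are sparse enough to drop.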

## ***2. Understanding Your Variables***

* hotel - type of hotel (City Hotel or Resort Hotel)
* is_canceled - whether the booking was canceled (1) or not (0)
* lead_time - number of days between the booking date and the arrival date
* arrival_date_year - year of arrival date
* arrival_date_month - month of arrival date
* arrival_date_week_number - week number of arrival date
* arrival_date_day_of_month - day of the month of arrival date
* stays_in_weekend_nights - number of weekend nights
* stays_in_week_nights - number of week nights
* adults - number of adults
* children - number of children
* babies - number of babies
* meal - kind of meal opted for
* country - country code
* market_segment - which segment the customer belongs to
* distribution_channel - method of booking
* is_repeated_guest - whether the guest has stayed before (1) or not (0)
* previous_cancellations - number of previous bookings canceled by the customer
* previous_bookings_not_canceled - number of previous bookings not canceled by the customer
* reserved_room_type - type of room reserved
* assigned_room_type - type of room assigned
* booking_changes - count of changes made to the booking
* deposit_type - deposit type
* agent - ID of the agent the booking was made through
* company - ID of the company the booking was made through
* days_in_waiting_list - number of days on the waiting list
* customer_type - type of customer
* adr - average daily rate
* required_car_parking_spaces - number of parking spaces required
* total_of_special_requests - number of special requests
* reservation_status - latest status of the reservation
* reservation_status_date - date on which the latest status was set


In [None]:
# Dataset Columns
df.columns

In [None]:
# Dataset Describe
df.describe()

### Variables Description

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for i in df.columns.tolist():
  print("No. of unique values in ",i,"is",df[i].nunique(),".")

## 3. ***Data Wrangling***

### Data Cleaning


In [None]:
## Treating the null values
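# Assign the result back instead of calling fillna(inplace=True) on a column,
# which newer pandas versions warn about and may not apply to df
df['children'] = df['children'].fillna(0)  # Filling NaN in 'children' with 0
df['country'] = df['country'].fillna('others')  # Filling NaN in 'country' with 'others'
df['agent'] = df['agent'].fillna(0)  # Filling NaN in 'agent' with 0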


In [None]:
## Dropping the entire column as approx 94% of its values are missing
df.drop('company', axis=1, inplace=True)

In [None]:
# checking non null values
df.info()

In [None]:
# checking non null values
df.isnull().sum()

In [None]:
# Write your code to make your dataset analysis ready.

df.info()

Changing the data type of the children and agent columns from float to int64

In [None]:
df[['children', 'agent']] = df[['children', 'agent']].astype('int64')


In [None]:
df.info()

### Adding Columns

In [None]:
## Total night a customer stayed
df['total_stay_nights'] = df['stays_in_week_nights'] + df['stays_in_weekend_nights']

In [None]:
## total no of members including children and babies
df['total_members'] = df['adults'] + df['children'] + df['babies']

In [None]:
## Total revenue
df['total_revenue'] = df['adr'] * df['total_stay_nights']

### Modifying columns

In [None]:
#modifying is_canceled column
df['is_canceled'] = df['is_canceled'].replace({1: 'Canceled', 0: 'Not Canceled'})
df['is_canceled']

In [None]:
#modifying is_repeated_guest column
df['is_repeated_guest'] = df['is_repeated_guest'].replace({1:'repeated', 0:'Not repeated'})
df['is_repeated_guest']

In [None]:
hotel_wise_total_revenue = df.groupby('hotel')['total_revenue'].sum().reset_index()
hotel_wise_total_revenue

### What all manipulations have you done and insights you found?



**Data Cleaning**
* In the **children** column, NaN values were filled with 0
* In the **country** column, NaN values were filled with 'others'
* In the **agent** column, NaN values were filled with 0
* The entire **company** column was dropped, as approx 94% of its values are missing

**Adding Columns**
* total_stay_nights = Total nights the customer stayed.
* total_members = Total number of members including children and babies.
* total_revenue = Total revenue generated.

**Modifying Columns**
* In is_canceled, values were replaced with [ 1: 'Canceled', 0: 'Not Canceled' ]
* In is_repeated_guest, values were replaced with [ 1: 'repeated', 0: 'Not repeated' ]
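The cleaning and column-adding steps above can be gathered into one function so they run in a fixed order on any copy of the data. This is only a sketch: `clean_bookings` is a hypothetical helper name, and the toy frame holds made-up values with just the columns the steps need.

```python
import numpy as np
import pandas as pd

def clean_bookings(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning and column-adding steps to a copy of `raw`."""
    out = raw.drop_duplicates().copy()
    # Fill missing values column by column, then drop the sparse column
    out = out.fillna({"children": 0, "country": "others", "agent": 0})
    out = out.drop(columns="company")
    out = out.astype({"children": "int64", "agent": "int64"})
    # Derived columns
    out["total_stay_nights"] = out["stays_in_week_nights"] + out["stays_in_weekend_nights"]
    out["total_members"] = out["adults"] + out["children"] + out["babies"]
    out["total_revenue"] = out["adr"] * out["total_stay_nights"]
    return out

# Toy input (made up); the last two rows are exact duplicates
toy = pd.DataFrame({
    "children": [np.nan, 1.0, 1.0],
    "country": [None, "PRT", "PRT"],
    "agent": [np.nan, 9.0, 9.0],
    "company": [np.nan, np.nan, np.nan],
    "adults": [2, 2, 2],
    "babies": [0, 1, 1],
    "stays_in_week_nights": [2, 3, 3],
    "stays_in_weekend_nights": [1, 2, 2],
    "adr": [100.0, 80.0, 80.0],
})
cleaned = clean_bookings(toy)
print(cleaned)
```

Keeping the steps in one function makes it easier to rerun the notebook from top to bottom without relying on cell execution order.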

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### ***Booking Percentage by hotel type***

In [None]:
# Chart - 1 visualization
x = sum(df['hotel'] == 'Resort Hotel')
y = sum(df['hotel'] == 'City Hotel')
labels = ['Resort Hotel', 'City Hotel']
sizes = [x,y]
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title("Booking Percentage by Hotel type")
plt.show()

##### 1. Why did you pick the specific chart?

Pie charts are easy to understand and analyse when comparing a small number of categories. Here the chart clearly shows the share of bookings for the Resort Hotel and the City Hotel.

##### 2. What is/are the insight(s) found from the chart?

The City Hotel accounts for more bookings (61.12%) than the Resort Hotel (38.87%), so the City Hotel sees higher demand.
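The percentages shown on the pie chart can also be computed as numbers. A minimal sketch, using a made-up `hotels` Series in place of `df['hotel']`:

```python
import pandas as pd

# Toy stand-in for df['hotel'] (values are made up)
hotels = pd.Series(["City Hotel"] * 3 + ["Resort Hotel"] * 2, name="hotel")

# normalize=True returns fractions instead of raw counts
share = hotels.value_counts(normalize=True).mul(100).round(1)
print(share)
```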

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Resort Hotel** - Bookings are approx 38%, which is comparatively low, so extra services or marketing strategies should be considered to attract more bookings.

**City Hotel** - Bookings are approx 61%, which is a strong majority; the aim should be to maintain this lead and keep growing the number of bookings.

#### ***Cancellation Analysis***

In [None]:
# Chart - 2 visualization code
x = sum(df['is_canceled'] == 'Canceled')
y = sum(df['is_canceled'] == 'Not Canceled')
labels = ['Canceled', 'Not Canceled']
sizes = [x,y]
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title("Cancelation volume")
plt.show()

##### 1. Why did you pick the specific chart?

Since there are only two categories, a pie chart is a good fit for comparing the percentages of canceled and not canceled bookings.

##### 2. What is/are the insight(s) found from the chart?

Not canceled bookings make up the larger share at approx 73%, while canceled bookings account for about 27%.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Since about 27% of bookings are canceled, the root causes of customer cancellations should be investigated and addressed, with the goal of bringing the cancellation percentage down.
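One possible starting point for such a root-cause investigation is to compare cancellation rates across a categorical column. The sketch below uses `market_segment` on a made-up toy frame; the choice of column and the data are assumptions for illustration only.

```python
import pandas as pd

# Toy data (made up) using the relabelled is_canceled values
toy = pd.DataFrame({
    "market_segment": ["Online TA", "Online TA", "Direct", "Direct", "Groups", "Groups"],
    "is_canceled": ["Canceled", "Not Canceled", "Not Canceled",
                    "Not Canceled", "Canceled", "Canceled"],
})

# Mean of a boolean flag per group = share of canceled bookings in that group
cancel_rate = (
    toy["is_canceled"].eq("Canceled")
    .groupby(toy["market_segment"])
    .mean()
    .mul(100)
    .sort_values(ascending=False)
)
print(cancel_rate)
```

The same pattern could be tried with other columns, such as `deposit_type` or `customer_type`, to see where cancellations concentrate.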

#### ***Guest Repeating***

In [None]:
# Chart - 4 visualization code
x = sum(df['is_repeated_guest'] == 'repeated')
y = sum(df['is_repeated_guest'] == 'Not repeated')
labels = ['repeated', 'Not repeated']
sizes = [x,y]
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title("Guest Repeating")
plt.show()

##### 1. Why did you pick the specific chart?

A pie chart makes it easy to interpret the distribution of repeated and not repeated customers.

##### 2. What is/are the insight(s) found from the chart?

Repeated customers account for only about 3.9% (approx 4%) of bookings,
while not repeated customers make up over 96%.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

We need to understand why so few customers return, and offer coupon codes, extra benefits, or other offers to not repeated customers while making their stay comfortable.

#### **Most Reserved Rooms**

In [None]:
# Chart - 5 visualization code
room_counts = df['reserved_room_type'].value_counts().reset_index()
room_counts.columns = ['reserved_room_type', 'count']  # Rename columns

# Create bar plot
sns.barplot(x='reserved_room_type', y='count', data=room_counts, palette='viridis')
plt.title("Counts of Reserved Room Types")
plt.xlabel("Reserved Room Type")
plt.ylabel("Count")
plt.show()

##### 1. Why did you pick the specific chart?

Bar charts make it easy to compare a larger number of categories and see the size differences between them.

##### 2. What is/are the insight(s) found from the chart?

Room type A has the highest number of reservations compared to the other room types.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

To increase reservations of the other room types, their conditions and facilities could be brought closer to those of room type A, giving them a more equal chance of being reserved.

#### ***Market segment booking distribution***

In [None]:
# Chart - 6 visualization code
segment_counts = df['market_segment'].value_counts().reset_index()
segment_counts.columns = ['market_segment', 'count']  # Rename columns

# Create bar plot
sns.barplot(x='market_segment', y='count', data= segment_counts, palette='viridis')
plt.title("Market Segment Distribution")
plt.xlabel("Market Segment")
plt.ylabel("Count")
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart helps us see the differences between the market segments clearly.

##### 2. What is/are the insight(s) found from the chart?

The Online TA segment clearly has more bookings than all of the other segments.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

This is a positive insight: the business can focus on Online TA and grow it further, while also working to increase bookings from the Offline TA/TO and Direct segments.

#### ***Month wise booking***

In [None]:
bookings = df[df['is_canceled'] == 'Not Canceled']

In [None]:
# Chart - 7 visualization code
x = bookings.groupby('arrival_date_month')['hotel'].count()
x = x.sort_values(ascending = False)
x.plot(kind = 'bar')
plt.show()

##### 1. Why did you pick the specific chart?

We used a bar plot sorted in descending order to visualize the number of bookings in each month and compare the months with one another.

##### 2. What is/are the insight(s) found from the chart?

The above chart shows that **August, July & May** have the highest number of bookings in the year.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. With this information we can plan strategies such as pricing and promotions around the high-booking months and increase revenue while demand is at its peak.
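Sorting by count highlights the peak months, but a calendar-ordered view can make seasonal patterns easier to read. A minimal sketch, assuming month names are stored as full English names as in `arrival_date_month`; the toy Series is made up:

```python
import pandas as pd

# Calendar order for month names (written out to avoid locale dependence)
month_order = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Toy stand-in for bookings['arrival_date_month'] (values are made up)
toy = pd.Series(
    ["August", "July", "August", "May", "January", "July", "August"],
    name="arrival_date_month",
)

# Count bookings per month, then reorder by calendar; absent months get 0
monthly = toy.value_counts().reindex(month_order, fill_value=0)
print(monthly)
```

Calling `monthly.plot(kind='bar')` on the result would then draw the months from January to December.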

####  ***Year Distribution***

In [None]:
df.columns

In [None]:
plt.figure(figsize=(10, 6))
sns.countplot(x='arrival_date_year', hue='hotel', data=df)

plt.xlabel('Year',fontsize = 15)
plt.ylabel('Count',fontsize = 12)
plt.title('Distribution of bookings in term of Year')
plt.legend(title='Hotel', loc='upper right')  # Add legend with 'hotel' as title

plt.show()

##### 1. Why did you pick the specific chart?

A grouped count plot shows how bookings are distributed across the years for each hotel type.

##### 2. What is/are the insight(s) found from the chart?

We can clearly see that in 2016 we have the highest number of bookings in city hotel as well as resort hotels.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

We can examine the key strategies behind the high number of bookings in 2016 and where they were not applied in 2017, then bring those strategies back to raise bookings in 2018.

####  ***Agent Distribution***

In [None]:
df.columns

In [None]:
top_5_agents = df['agent'].value_counts().head(5)


In [None]:
sns.barplot(x =top_5_agents.index,y = top_5_agents.values,hue=top_5_agents.values,palette ='coolwarm')
#Labeling and title the chart and legend
plt.xlabel('Agent No.')
plt.ylabel('Bookings made by them')
plt.legend(title ='Bookings Count')
plt.title('Top 5 agents in the term of bookings')
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart shows how the number of bookings is distributed among the top 5 agents.

##### 2. What is/are the insight(s) found from the chart?

Agent no. 9 has the highest number of bookings.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

We can reward agents according to the number of bookings they bring in, which may motivate them to generate more bookings.

#### ***Average Lead Time in countries***

In [None]:
# Chart - 9 visualization code
lead_time = df.groupby('country')['lead_time'].mean()

In [None]:
lead_time = lead_time.sort_values( ascending = False)[:10]

In [None]:
lead_time

In [None]:
x = lead_time.index
y = lead_time.values
sns.barplot(x=x, y=y, palette="viridis" )
plt.title('Average Lead time across countries', fontsize=16)
plt.xlabel('Country', fontsize=12)
plt.ylabel('Lead_Time', fontsize=12)
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

Here we can clearly visualize the trend of average lead time taken by different countries.

##### 2. What is/are the insight(s) found from the chart?

Customers from countries such as FJI, BEN, and LCA have a high average lead time, meaning they plan their vacation and stay well in advance.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

These insights can help from a marketing perspective: stays can be promoted to early planners to secure confirmed bookings in advance.
Customers from countries with longer lead times may also be less price sensitive, which could allow better margins on those bookings.
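An average can be dominated by countries with only a handful of bookings, so it may help to check the booking count next to the mean before ranking. The sketch below uses a made-up toy frame, and `MIN_BOOKINGS` is a hypothetical threshold chosen only for illustration:

```python
import pandas as pd

# Toy data (made up): FJI has a single booking with a very long lead time
toy = pd.DataFrame({
    "country": ["PRT", "PRT", "PRT", "GBR", "GBR", "FJI"],
    "lead_time": [10, 20, 30, 100, 120, 300],
})

MIN_BOOKINGS = 2  # hypothetical cut-off, not taken from the analysis

# Mean lead time and number of bookings per country
stats = toy.groupby("country")["lead_time"].agg(["mean", "count"])

# Keep only countries with enough bookings, then rank by mean lead time
reliable = stats[stats["count"] >= MIN_BOOKINGS].sort_values("mean", ascending=False)
print(reliable)
```

On the real data, a suitable threshold would depend on how many bookings each country has.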

#### ***Total Revenue distribution***

In [None]:
# Chart - 10 visualization code
# Note: sns.barplot plots the mean of x for each category by default,
# so the bars show average revenue per booking, not the summed total
x = df['total_revenue']
y = df['hotel']
sns.barplot(x=x, y=y)
plt.title('Average Revenue per Booking by Hotel Type')
plt.xlabel('Average Revenue per Booking')
plt.ylabel('Hotel Type')
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart makes it easy to compare revenue across the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

The Resort Hotel shows higher revenue per booking than the City Hotel. Because `sns.barplot` aggregates with the mean by default, this is a per-booking average; the summed totals for each hotel are available in `hotel_wise_total_revenue`.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Since a Resort Hotel booking brings in more revenue on average, advertising resort hotels more, for example based on customers' search history, can help gain more of these higher-value bookings.
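When comparing revenue between hotel types, it matters whether the figure is a sum or an average, and seaborn's `barplot` draws the mean of the numeric variable for each category unless told otherwise. The sketch below, on made-up toy values, shows how the two aggregations can rank the hotels differently:

```python
import pandas as pd

# Toy data (made up): many small City bookings, one large Resort booking
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "City Hotel", "Resort Hotel"],
    "total_revenue": [100.0, 200.0, 300.0, 500.0],
})

# Compute both the summed and the average revenue per hotel type
revenue = toy.groupby("hotel")["total_revenue"].agg(["sum", "mean"])
print(revenue)
```

In this toy example the City Hotel has the larger total while the Resort Hotel has the larger average per booking, so both figures are worth checking before drawing conclusions.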

#### ***ADR distribution***

In [None]:
adr_hotels = df[df['adr'] > 0]

In [None]:
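# Chart - 11 visualization code
# Mean ADR per hotel type, sorted from highest to lowest
mean_adr = adr_hotels.groupby('hotel')['adr'].mean().sort_values(ascending = False)
# Use the same sorted series for both axes so labels and values stay aligned
sns.barplot(x=mean_adr.index, y=mean_adr.values, palette="coolwarm")
plt.title('Average Daily Rate by Hotel Type')
plt.xlabel('Hotel Type')
plt.ylabel('Average Daily Rate')
plt.show()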

##### 1. Why did you pick the specific chart?

A bar chart lets us compare the average ADR of the two hotel types side by side.

##### 2. What is/are the insight(s) found from the chart?

The average ADR of the City Hotel is higher than that of the Resort Hotel, so the City Hotel earns more revenue per night booked.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

To raise revenue per night, the business can focus on City Hotel bookings, which command a higher average daily rate than Resort Hotel bookings.

#### ***Meal Preferences***

In [None]:
# Chart - 12 visualization code
hotel_wise_meal = df.groupby(['hotel', 'meal'])['meal'].count().unstack()
hotel_wise_meal.plot(kind ='bar', figsize = (12,8))
hotel_wise_meal

##### 1. Why did you pick the specific chart?

Bar plot to check the specific meal preference of the customers.

##### 2. What is/are the insight(s) found from the chart?

**BB (Bed & Breakfast)** is the meal most preferred by guests in both hotels, followed by **SC (Self Catering)** in the City Hotel.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Hotels can offer more delicious dishes in these meal plans to encourage repeat customers and attract new customers with their cuisine and taste.

#### ***Cancellation Analysis by hotel***

In [None]:
# Filter for 'Resort Hotel' first, then group and count
resort_counts = df[df['hotel'] == 'Resort Hotel'].groupby('is_canceled').size().reset_index(name='count')

print(resort_counts)


In [None]:
# Filter for 'City Hotel' first, then group and count
City_counts = df[df['hotel'] == 'City Hotel'].groupby('is_canceled').size().reset_index(name='count')

print(City_counts)


In [None]:
# Chart - 3 visualization code
# Grouped bars keep both hotels visible; overlaying two barplots
# on the same axes would hide one hotel's bars behind the other's
sns.countplot(x='is_canceled', hue='hotel', data=df)
plt.title("Cancellation Analysis by hotel")
plt.xlabel("Booking status")
plt.ylabel("Number of bookings")
plt.legend(title='Hotel')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

####  ***Correlation Heatmap***

In [None]:
df.columns

In [None]:
# Correlation Heatmap visualization code
corr_df = df[['lead_time','previous_cancellations', 'previous_bookings_not_canceled', 'total_members',
                    'booking_changes', 'days_in_waiting_list', 'adr', 'required_car_parking_spaces', 'total_of_special_requests']].corr()
f, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(corr_df, annot = True, fmt='.2f', annot_kws={'size': 10},  vmax=1, square=True, cmap="YlGnBu")

##### 1. Why did you pick the specific chart?

A heatmap makes it easy to see the correlation between each pair of numeric columns.

##### 2. What is/are the insight(s) found from the chart?

The strongest positive correlation between any two variables is 0.39, and the strongest negative correlation is -0.09.
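The strongest and weakest pairs can be read from the correlation matrix programmatically instead of by eye. This is a minimal sketch on a made-up toy frame; the column names mirror some of those used in the heatmap, but the values are invented:

```python
import numpy as np
import pandas as pd

# Toy numeric data (made up)
toy = pd.DataFrame({
    "lead_time": [1, 2, 3, 4, 5],
    "adr": [2, 4, 6, 8, 10],
    "booking_changes": [5, 4, 3, 2, 1],
    "total_members": [1, 3, 2, 5, 4],
})
corr = toy.corr()

# Keep only the upper triangle so each pair appears once, without the diagonal
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values()

print(pairs.head(1))  # most negative pair
print(pairs.tail(1))  # most positive pair
```

Running the same steps on `corr_df` would list the variable pairs behind the extreme values seen in the heatmap.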

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

**Booking Percentage by Hotel Type:** Identifies popular hotel types, guiding targeted marketing strategies and inventory management.

**Cancellation Analysis:** Uncovers trends in cancellations to mitigate losses through targeted communication and flexible policies.

**Repeating Guests:** Highlights loyal customer behavior to enhance retention through personalized experiences and loyalty programs.

**Most Reserved Rooms:** Informs inventory management by revealing high-demand room types for better pricing and allocation decisions.

**Market Segment Booking Distribution:** Optimizes marketing strategies by analyzing the customer segments driving bookings.

**Month-wise Booking Trends:** Identifies seasonal booking patterns to prepare staffing and promotions for fluctuating demand.

**Top 10 Countries by Bookings:** Pinpoints key markets for targeted marketing strategies to increase market share and revenue.

**Average Lead Time in Countries:** Helps optimize pricing and promotions by understanding customer booking behaviors across regions.

**Total Revenue Distribution:** Analyzes financial performance across hotel types to guide profitability-enhancing strategies.

**ADR Distribution:** Informs dynamic pricing models to maximize revenue based on demand fluctuations and customer profiles.

**Meal Preferences:** Provides insights for tailoring dining offerings to enhance guest satisfaction and boost food revenue.

**Correlation Heatmap:** Reveals relationships between key variables to implement targeted strategies for improved performance.

# **Conclusion**

* This project conducted a comprehensive exploratory data analysis of the hotel booking dataset from Booking.com, uncovering valuable insights for strategic decision-making.

* Key trends and patterns were identified, highlighting implications for optimizing hotel operations and enhancing customer satisfaction.

* Certain hotel types attracted a higher percentage of bookings, suggesting targeted marketing efforts to maximize occupancy rates.

* Cancellation trends revealed critical periods that require focused strategies to minimize losses, emphasizing improved customer engagement and flexible booking policies.

* The analysis of repeating guests underscored the importance of fostering customer loyalty, which can be enhanced through personalized services and tailored loyalty programs.

* Insights into market segments and top countries by bookings enabled targeted marketing strategies to increase revenue from high-potential markets.

* Understanding average lead times by country allows for the optimization of pricing and promotional strategies, catering to varying customer preferences across regions.

* Findings on room preferences and total revenue distribution guide operational decisions, such as inventory management and dynamic pricing strategies, maximizing profitability.

* Insights into meal preferences and special requests help hotels tailor their offerings, enhancing guest satisfaction and driving additional revenue from food and beverage services.

* Continuous monitoring of booking trends and customer feedback will be essential for adapting to changing market conditions.

* Implementing machine learning algorithms could refine predictive analytics, enhancing the ability to forecast demand and optimize pricing strategies.

* Overall, the insights generated from this analysis support Booking.com in making data-driven decisions, positioning the company to remain competitive in the dynamic hospitality landscape.

* By leveraging these findings, Booking.com can enhance operational efficiency, improve customer experiences, and ultimately drive profitability.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***