# **Project Name**    - EDA Hotel Bookings Capstone Project   -



##### **Project Type**    - EDA
##### **Contribution**    - Team
##### **Team Member 1 -** - Humayel
##### **Team Member 2 -** - Divyanshu Chauhan
##### **Team Member 3 -** - Shubham Singh

# **Project Summary -**

* The Hotel Booking Analysis project is an in-depth exploratory data analysis (EDA) project focused on analyzing hotel booking data to extract valuable insights and make data-driven observations. The project addresses various aspects of hotel bookings, including cancellations, booking ratios between resort and city hotels, percentage of bookings per year, monthly customer counts, guest origins, average monthly rates per person, weekday versus weekend bookings, hotel availability, total bookings and revenue across years, preferred hotels and stay durations, and length of stay.

* The project begins by cleaning and preprocessing the dataset to ensure data quality. Through data exploration and visualization, we determine the number of bookings that were cancelled and examine the factors contributing to cancellations. Next, we calculate the booking ratio between resort hotels and city hotels to understand customer preferences and identify any significant differences in booking patterns. We also analyze the percentage of bookings for each year to identify booking trends and seasonality.

* Furthermore, we investigate the monthly customer counts received by each hotel to identify peak periods and seasonal variations in customer traffic. By determining the country from which most guests come, we gain insights into customer demographics and tailor marketing strategies accordingly. The average monthly rate per person is calculated to understand pricing trends and assess the competitiveness of hotels in the market.

* Additionally, we analyze the number of bookings made on weekdays versus weekends to identify any significant differences in customer behavior and booking patterns. We also assess hotel availability for booking and provide insights to hotel managers to optimize inventory management and improve booking efficiency.

* Moreover, the project explores the total number of bookings across different years, analyzing booking growth and trends. We split the total revenue across years by hotel type to evaluate revenue trends and compare the performance of different hotel types.

* Lastly, we investigate the preferred hotels where guests like to stay and spend more time. By differentiating between weeknight and weekend stays, we gain insights into customer preferences and behavior. Additionally, we analyze the length of stay in hotels to understand typical stay durations and any correlations with factors such as hotel type or booking month.

Through comprehensive data analysis, visualization, and interpretation, this project provides valuable insights into hotel booking patterns, customer behavior, revenue trends, and guest preferences. The findings can guide strategic decision-making, marketing efforts, and overall business optimization for hotel establishments.

# **GitHub Link -**

https://github.com/dcatc?tab=repositories

# **Problem Statement**


The hotel industry is volatile, and bookings depend on a variety of factors such as hotel type, seasonality, day of the week, and many more. This makes analyzing the patterns in past data all the more important for helping hotels plan better. Using historical data, hotels can run campaigns to boost business. The patterns can also be used to predict future bookings with time series models or decision trees.

We will use the available data to analyze the factors affecting hotel bookings. These factors can be used to report trends and predict future bookings.

We will address this problem statement through the following questions:

1. How many bookings were cancelled?
2. What is the booking ratio between the Resort Hotel and the City Hotel?
3. What is the percentage of bookings for each year?
4. How many customers does each hotel receive per month?
5. Which country do most guests come from?
6. What is the average monthly rate per person?
7. How many bookings are made on weekdays versus weekends?
8. Which hotels are available for booking?
9. What are the total bookings across years?
10. What is the total revenue across years, split by hotel type?
11. In which hotel do people prefer to stay and spend more time?
    (a) We will check week-night stays.
    (b) We will check weekend stays.
12. How long do people stay in the hotel?

#### **Define Your Business Objective?**

*Have you ever wondered when the best time of year to book a hotel room is? Or the optimal length of stay in order to get the best daily rate? What if you wanted to predict whether or not a hotel was likely to receive a disproportionately high number of special requests? This hotel booking dataset can help you explore those questions! This data set contains booking information for a city hotel and a resort hotel, and includes information such as when the booking was made, length of stay, the number of adults, children, and/or babies, and the number of available parking spaces, among other things. All personally identifying information has been removed from the data. Explore and analyse the data to discover important factors that govern the bookings.*

# **General Guidelines** : -

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.

     The additional credits will have advantages over other students during Star Student selection.

             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many number of charts you want. Make Sure for each and every chart the following format should be answered.


```
# Chart visualization code
```


*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the visualization in a structured way while following the "UBM" rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
pip install missingno

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import missingno as msno
import matplotlib.ticker as mtick

### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')
filepath = "/content/drive/MyDrive/Hotel Bookings.csv"
hotel_data = pd.read_csv(filepath)
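
Since the guidelines ask for exception handling, the loading step could be wrapped in a small helper that fails with a clear message. The sketch below is illustrative only: `load_bookings` is a hypothetical name, and the demonstration reads a tiny temporary CSV with made-up values rather than the real dataset on Drive.

```python
import os
import tempfile

import pandas as pd


def load_bookings(path):
    """Hypothetical helper: read the bookings CSV and fail with a clear message."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"Dataset not found at {path!r}; check the Drive mount and the file name."
        )
    data = pd.read_csv(path)
    if data.empty:
        raise ValueError(f"{path!r} was read but contains no rows.")
    return data


# Demonstration on a small temporary file (toy values, not the real dataset)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_bookings.csv")
    pd.DataFrame(
        {"hotel": ["City Hotel", "Resort Hotel"], "is_canceled": [0, 1]}
    ).to_csv(demo_path, index=False)
    demo_data = load_bookings(demo_path)

print(demo_data.shape)
```

In the notebook itself, the same helper would be called with the `filepath` defined above.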

### Dataset First View

In [None]:
# Dataset First Look
hotel_data.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count

row_count = len(hotel_data)
column_count = len(hotel_data.columns)
print("Number of rows:", row_count)
print("Number of columns:", column_count)

### Dataset Information

In [None]:
# Dataset Info
hotel_data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_count = hotel_data.duplicated().sum()

print("Number of duplicate values:", duplicate_count)

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
missing_values_count = hotel_data.isnull().sum().sort_values(ascending=False)

print("Missing values count:")
print(missing_values_count)

In [None]:
# Visualizing the missing values
msno.matrix(hotel_data)
plt.show()

### What did you know about your dataset?

* is_canceled: Whether the booking was canceled (1) or not (0)
* hotel: H1 = Resort Hotel, H2 = City Hotel
* lead_time: Number of days that elapsed between the entering date of the booking into the PMS and the arrival date
* arrival_date_year: Year of arrival date
* arrival_date_month: Month of arrival date
* arrival_date_week_number: Week number for arrival date
* arrival_date_day_of_month: Day of arrival date
* stays_in_weekend_nights: Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
* stays_in_week_nights: Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
* adults: Number of adults
* children: Number of children
* babies: Number of babies
* meal: Kind of meal opted for
* country: Country code
* market_segment: Which segment the customer belongs to
* distribution_channel: How the customer accessed the stay (Corporate booking / Direct / TA/TO)
* is_repeated_guest: Whether the guest is coming for the first time or not
* previous_cancellations: Whether there was a cancellation before
* previous_bookings_not_canceled: Count of previous bookings
* reserved_room_type: Type of room reserved
* assigned_room_type: Type of room assigned
* booking_changes: Count of changes made to the booking
* deposit_type: Deposit type
* agent: Booked through agent
* days_in_waiting_list: Number of days in the waiting list
* customer_type: Type of customer
* required_car_parking_spaces: If car parking is required
* total_of_special_requests: Number of additional special requirements
* reservation_status: Status of the reservation
* reservation_status_date: Date of the specific status
*  The dataset has 119390 rows and 32 columns.
*  Based on the column definitions in the data dictionary, a few columns might have a datatype mismatch: is_canceled and is_repeated_guest.
*  The unique values in the is_canceled column are 0 and 1.
*  The unique values in the is_repeated_guest column are 0 and 1.
*  Missing values occur in the children, country, agent, and company columns. We impute agent and company with 0, country with its mode, and children with its rounded mean.
*  After imputation, the dataset has no missing values.
*  There are 2 types of hotels. City Hotel has 79330 records and Resort Hotel has 40060 records.
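
The missing-value observations above can be checked with a per-column summary that reports both the count and the percentage of missing entries. This is a minimal sketch on a toy frame with made-up values; `toy` and `missing_summary` are illustrative names, and in the notebook the same calls would be applied to `hotel_data`.

```python
import numpy as np
import pandas as pd

# Toy data with made-up values; the real notebook would use hotel_data
toy = pd.DataFrame({
    "children": [0.0, np.nan, 2.0, 1.0],
    "country": ["PRT", "GBR", None, "PRT"],
    "agent": [9.0, np.nan, np.nan, 240.0],
    "adults": [2, 2, 1, 3],
})

# Count of missing entries per column
missing_count = toy.isnull().sum()
# Share of missing entries per column, in percent
missing_pct = (toy.isnull().mean() * 100).round(1)

# Keep only columns that actually have missing values, largest first
missing_summary = pd.DataFrame({"missing": missing_count, "percent": missing_pct})
missing_summary = missing_summary[missing_summary["missing"] > 0]
missing_summary = missing_summary.sort_values("missing", ascending=False)
print(missing_summary)
```

Reporting the percentage alongside the count makes it easier to judge whether a column should be imputed or dropped.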

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
hotel_data.columns

In [None]:
# Dataset Describe
hotel_data.describe()

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
hotel_data.nunique()

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
# copying the data set
df=hotel_data.copy()

In [None]:
# The agent and company columns hold an ID number for each agent or company,
# so all missing values in them are replaced with 0.
df[["agent", "company"]] = df[["agent", "company"]].fillna(0.0)

In [None]:
# The children column holds a count of children, so missing values are replaced with the rounded mean.
# The country column holds categorical country codes, so missing values are replaced with the mode,
# i.e. the country that appears most often. mode() returns a Series, so we take its first entry.
df['country'] = df['country'].fillna(hotel_data['country'].mode()[0])
df['children'] = df['children'].fillna(round(hotel_data['children'].mean()))
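
The imputation idea described in the comment (mode for a categorical column, rounded mean for a count column) can be sketched on a toy frame. The values below are made up for illustration. Note that `Series.mode()` returns a Series rather than a single value, so the sketch takes its first element before filling.

```python
import numpy as np
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({
    "country": ["PRT", "GBR", None, "PRT", "ESP"],
    "children": [0.0, 2.0, np.nan, 1.0, 0.0],
})

# mode() returns a Series (ties would produce several rows), so take the first entry
most_common_country = toy["country"].mode()[0]
# Round the mean so the filled value is a whole number of children
typical_children = round(toy["children"].mean())

# Assign the result back rather than using inplace=True on a column selection
toy["country"] = toy["country"].fillna(most_common_country)
toy["children"] = toy["children"].fillna(typical_children)
print(toy)
```

Assigning the filled column back keeps the update explicit and avoids chained-assignment warnings in recent pandas versions.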

In [None]:
# Some rows have zero guests in total (adults, children, and babies). Such rows do not make sense.
df[(df.adults+df.babies+df.children)==0].shape

In [None]:
# We have 180 such rows, so we will simply remove them.
df = df.drop(df[(df.adults+df.babies+df.children)==0].index)
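
An equivalent way to express this filter is a boolean mask on the total party size. The sketch below uses a toy frame with made-up values; `total_guests` and `toy_clean` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({
    "adults":   [2, 0, 1, 0],
    "children": [0, 0, 2, 0],
    "babies":   [0, 0, 0, 1],
})

# Total party size per booking
total_guests = toy[["adults", "children", "babies"]].sum(axis=1)

# Keep only bookings with at least one guest
toy_clean = toy[total_guests > 0].reset_index(drop=True)
removed_rows = len(toy) - len(toy_clean)
print(f"Removed {removed_rows} row(s) with zero guests")
```

Unlike adding the columns with `+`, `sum(axis=1)` skips missing values by default, so the mask behaves the same whether or not imputation has already run.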

In [None]:
# checking the data type
df.dtypes

In [None]:
# Some columns, such as children, company, and agent, are float type but hold only integer values, so we convert them to the integer type.
df[['children', 'company', 'agent']] = df[['children', 'company', 'agent']].astype('int64')

In [None]:
# Let us check the unique values for these columns:

is_canceled_unique = hotel_data["is_canceled"].unique()
print(f"the unique values in is_canceled are:{is_canceled_unique}")
is_repeated_guest_unique = hotel_data["is_repeated_guest"].unique()
print(f"the unique values in is_repeated_guest:{is_repeated_guest_unique}")

In [None]:
# We are selecting only those bookings which did not get canceled
df_not_canceled = df[df['is_canceled'] == 0]
df_not_canceled

In [None]:
# Let's look at the number of countries from which the total bookings are made.
country=df_not_canceled.country.nunique()
country

In [None]:
# The count above is the number of distinct countries from which bookings were made at these 2 hotels. Next, we identify the countries with the highest number of bookings.
top_countries_with_codes = df_not_canceled.country.value_counts().head(10)
top_countries_with_codes

### What all manipulations have you done and insights you found?

Several manipulations have been performed on the dataset to make it analysis-ready. Here is a summary of the manipulations and insights found:

* **Data Cleaning**: The dataset was checked for missing values, duplicates, and inconsistent data. Missing values were handled through imputation or removal, duplicates were dropped, and data inconsistencies were resolved.

* **Feature Engineering**: New features were created to derive meaningful insights from the existing data. For example, the 'revenue' feature was calculated by multiplying the average daily rate (ADR) with the number of nights stayed.

* **Data Aggregation**: The data was aggregated using various grouping operations to summarize information at different levels. For example, the total revenue was calculated by grouping the data by year and hotel type.

* **Visualization**: Various visualizations were created to understand the patterns and trends in the data. Insights were derived from these visualizations, such as the booking cancellation rate, the booking ratio between resort and city hotels, the distribution of bookings on weekdays vs. weekends, the revenue trends across years, the country with the highest number of guests, and the average length of stay in hotels.

*  **Insights**: The insights gained from the data wrangling and visualization process can be summarized as follows:

    * **Booking Cancellation**: The dataset revealed the number of bookings that were canceled, providing insights into the cancellation rate and potential areas for improvement in the booking process.

    * **Hotel Type Comparison**: The booking ratio between resort and city hotels provided an understanding of the preference of guests and the popularity of each hotel type.

    * **Weekday vs. Weekend Bookings**: Analyzing the distribution of bookings on weekdays and weekends helped identify booking patterns and optimize hotel operations accordingly.

    * **Revenue Trends**: The visualization of total revenue across years highlighted revenue growth or decline over time, allowing for strategic decision-making and resource allocation.

    * **Country of Origin**: Identifying the country from which most guests come provided insights into the hotel's customer base and the potential for targeted marketing efforts.

    * **Length of Stay**: Understanding how long people stay in hotels helped optimize staff scheduling, room availability, and revenue forecasting.

These manipulations and insights have enhanced the understanding of the dataset and provided valuable information for our project.
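
The feature-engineering and aggregation steps described above (revenue as ADR multiplied by nights stayed, then totals by year and hotel type) could look roughly like the following. This is a sketch on made-up values: `total_nights` is a hypothetical helper column, and it assumes the dataset's `adr`, `stays_in_weekend_nights`, and `stays_in_week_nights` columns.

```python
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "arrival_date_year": [2015, 2016, 2015, 2016],
    "adr": [100.0, 120.0, 80.0, 90.0],
    "stays_in_weekend_nights": [1, 2, 0, 2],
    "stays_in_week_nights": [2, 3, 4, 5],
})

# Nights stayed = weekend nights + week nights (hypothetical helper column)
toy["total_nights"] = toy["stays_in_weekend_nights"] + toy["stays_in_week_nights"]
# Revenue per booking = average daily rate * nights stayed
toy["revenue"] = toy["adr"] * toy["total_nights"]

# Total revenue per year, one column per hotel type
revenue_by_year_hotel = (
    toy.groupby(["arrival_date_year", "hotel"])["revenue"]
       .sum()
       .unstack("hotel")
)
print(revenue_by_year_hotel)
```

With years as rows and hotel types as columns, the resulting table can be passed directly to a grouped bar or line plot.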


## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
sns.countplot(x='is_canceled', data=hotel_data)
plt.xlabel('Booking Status')
plt.ylabel('Count')
plt.title('Number of Cancelled Bookings')
plt.xticks([0, 1], ['Not Cancelled', 'Cancelled'])
plt.show()

##### 1. Why did you pick the specific chart?

The bar plot is suitable for visualizing the number of cancelled bookings in the hotel dataset for several reasons:

* Categorical Data: The booking status, whether it is cancelled or not, is a categorical variable with only two possible values. A bar plot is a common choice for visualizing the distribution of categorical variables.

* Count Comparison: The plot allows for an easy visual comparison between the count of cancelled and non-cancelled bookings. The two bars represent the two distinct categories, making it clear how many bookings were cancelled and how many were not.

* Clear Representation: The plot's simplicity and clarity make it easy to interpret the data. The x-axis represents the booking status categories, and the y-axis represents the count of bookings. This clear representation helps viewers understand the cancellation pattern in the dataset.

Overall, the chosen bar plot effectively visualizes the number of cancelled bookings, allowing for a quick comparison and clear understanding of the cancellation pattern in the hotel dataset.

##### 2. What is/are the insight(s) found from the chart?

From the chart visualizing the number of cancelled bookings, the following insights can be inferred:

* Booking Cancellation Rate: The chart provides an overview of the number of bookings that were cancelled. By comparing the cancelled bookings to the total bookings, you can calculate the cancellation rate or percentage. This insight helps in understanding the proportion of bookings that were cancelled, which can be useful for managing capacity, revenue forecasting, and identifying potential issues.

* Booking Cancellation Patterns: This chart aggregates all bookings, so it shows the overall split between cancelled and non-cancelled bookings rather than trends over time. Breaking the counts down by arrival month or year would show whether cancellations are more prevalent in specific periods or seasons. That view can be valuable for optimizing staffing, inventory management, and pricing strategies during periods of higher cancellation rates.

* Impact on Revenue and Occupancy: High cancellation rates can impact revenue and occupancy levels. By analyzing the chart, you can evaluate the potential revenue loss due to cancellations. This information can help in implementing strategies to minimize cancellations, such as offering flexible booking options, improving communication with customers, or implementing cancellation policies.

* Evaluation of Booking Policies: The chart can provide insights into the effectiveness of booking policies and their impact on cancellation rates. For example, if there is a significant increase in cancellations shortly before the check-in date, it may indicate the need to revisit or tighten cancellation policies to minimize last-minute cancellations.

* Customer Behavior and Preferences: Analyzing the chart can offer insights into customer behavior and preferences regarding cancellations. It may uncover patterns related to specific booking channels, customer segments, or reservation types that are more prone to cancellations. Understanding these patterns can help tailor marketing strategies, customer service initiatives, and operational decisions to better meet customer needs and expectations.

Overall, the visualization of the number of cancelled bookings provides insights into the cancellation rate, patterns, and potential impacts on revenue and operations. These insights can inform decision-making processes to optimize resources, enhance customer experience, and minimize revenue loss due to cancellations.
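
The cancellation rate mentioned above can be computed directly from the `is_canceled` flag: because the column holds only 0 and 1, its mean is the share of cancelled bookings. A minimal sketch on made-up values:

```python
import pandas as pd

# Toy data with made-up values: 1 = cancelled, 0 = not cancelled
toy = pd.DataFrame({"is_canceled": [0, 1, 0, 0, 1, 0, 0, 1]})

# Absolute counts per status (what the count plot displays)
cancel_counts = toy["is_canceled"].value_counts()
# The mean of a 0/1 flag equals the fraction of rows that are 1
cancel_rate_pct = toy["is_canceled"].mean() * 100

print(cancel_counts)
print(f"Cancellation rate: {cancel_rate_pct:.1f}%")
```

Stating the rate as a percentage next to the chart makes the comparison between the two bars easier to read.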

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights from visualizing the number of cancelled bookings can indeed help create a positive business impact. Understanding the cancellation patterns and factors influencing cancellations can provide valuable information for improving operations, enhancing customer satisfaction, and optimizing revenue management. Here's a breakdown:

* Operational Efficiency: By analyzing the reasons behind cancellations, businesses can identify areas for improvement in their operations. For example, if there is a high number of cancellations due to customer complaints about cleanliness or service quality, the business can take measures to address these issues and improve customer satisfaction.

* Revenue Optimization: Insights from the visualization can help businesses identify revenue optimization strategies. For instance, if there is a significant number of last-minute cancellations, the business can consider implementing stricter cancellation policies or offering non-refundable bookings to reduce revenue loss from last-minute cancellations.

* Customer Experience: Understanding the factors leading to cancellations can help businesses improve the overall customer experience. By addressing common pain points or concerns, such as inflexible booking policies or unclear communication, businesses can enhance customer satisfaction and loyalty.

* Pricing Strategies: The visualization can uncover patterns in cancellations related to pricing. For example, if there is a high number of cancellations after a price increase, it may indicate that the pricing strategy needs to be revisited. Adjusting pricing strategies based on cancellation insights can help maximize revenue and minimize cancellations.

However, it's important to note that while insights from analyzing cancelled bookings can generally have a positive impact, there can be instances where negative growth is observed. For example:

* Unfavorable Market Conditions: The visualization may reveal a significant increase in cancellations due to external factors beyond the control of the business, such as economic downturns, natural disasters, or travel advisories. These circumstances can lead to negative growth despite efforts to minimize cancellations.

* Ineffective Marketing or Sales Strategies: If the visualization shows a consistently high cancellation rate across different marketing or sales channels, it may indicate that the business's marketing or sales strategies are not effectively targeting the right customer segments or setting accurate expectations. This can result in negative growth as the business fails to convert bookings into actual stays.

In summary, while the insights gained from visualizing cancelled bookings generally have the potential to create a positive business impact, there can be external factors or internal shortcomings that may lead to negative growth. It is essential for businesses to carefully analyze the insights and take appropriate actions to address issues, enhance operations, and adapt strategies to minimize cancellations and foster positive growth.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
# Calculate the booking count for each hotel type
booking_counts = df_not_canceled['hotel'].value_counts()

# Calculate the booking ratio in percentage
booking_ratio = (booking_counts / booking_counts.sum()) * 100

# Create a bar plot to visualize the booking ratio
plt.bar(booking_ratio.index, booking_ratio.values)
plt.xlabel('Hotel')
plt.ylabel('Booking Ratio in %')
plt.title('Booking Ratio: Resort Hotel vs. City Hotel')
plt.show()


##### 1. Why did you pick the specific chart?

A bar chart is used to visualize the booking ratio between the Resort Hotel and the City Hotel. Here are the reasons for selecting this chart:

* Comparison of Categories: A bar plot is ideal for comparing categories, such as the hotel types in this case. The distinct bars represent the different categories, making it easy to visually compare the booking ratios between the Resort Hotel and the City Hotel.

* Quantitative Representation: The length or height of each bar in the plot corresponds to the magnitude of the booking ratio. This quantitative representation allows for a clear understanding of the relative proportions between the two hotel types.

* Clear Labels: The x-axis represents the hotel types, and the y-axis represents the booking ratio in percentage. The labels provide clear information about the categories being compared and the measurement scale.

* Focus on Ratios: The bar plot emphasizes the booking ratio between the two hotel types by representing it as a proportion or percentage. This allows viewers to easily comprehend and compare the relative proportions between the Resort Hotel and the City Hotel.

Overall, the bar plot is an effective choice for visually representing the booking ratio between the Resort Hotel and the City Hotel. It enables a quick and intuitive comparison of the relative proportions and highlights the ratio aspect by presenting the data in a clear and concise manner.

##### 2. What is/are the insight(s) found from the chart?

The main insight from the chart illustrating the booking ratio between the Resort Hotel and the City Hotel is that more than 60% of the bookings were made at the City Hotel. Other insights are as follows:

* Booking Preference: The chart provides insights into the relative popularity or preference of the two hotel types. By comparing the heights of the bars, you can determine which hotel type has a higher booking ratio. This information indicates the booking preference of customers between the Resort Hotel and the City Hotel.

* Market Share: The chart allows you to assess the market share of each hotel type. The larger the booking ratio of a particular hotel type, the larger its market share in terms of bookings. This insight helps understand the competitive landscape between the Resort Hotel and the City Hotel.

* Potential Business Opportunities: Depending on the booking ratio, the chart can reveal potential business opportunities. For example, if the booking ratio of the Resort Hotel is significantly higher, it might indicate a demand for resort-style accommodations, leading to potential expansion or investment opportunities in that segment.

* Targeted Marketing and Resource Allocation: The booking ratio can inform marketing and resource allocation strategies. Understanding which hotel type has a higher booking ratio can help tailor marketing campaigns and allocate resources effectively to attract and accommodate the target audience.

* Competitive Analysis: The chart facilitates a comparison between the Resort Hotel and the City Hotel in terms of their booking ratios. This analysis can help identify strengths and weaknesses, areas for improvement, and potential strategies to gain a competitive advantage.

It's important to note that the insights obtained from the chart may vary depending on the actual booking ratio values and the context of the specific business or industry.
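
For reference, the percentage share per hotel type and the ratio itself can both be derived from `value_counts`. This is a minimal sketch on made-up values; `booking_share_pct` and `city_to_resort_ratio` are illustrative names.

```python
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({"hotel": ["City Hotel"] * 3 + ["Resort Hotel"] * 2})

# Absolute number of bookings per hotel type
hotel_counts = toy["hotel"].value_counts()
# normalize=True returns fractions that sum to 1; scale to percent
booking_share_pct = (toy["hotel"].value_counts(normalize=True) * 100).round(1)
# Ratio of City Hotel bookings to Resort Hotel bookings
city_to_resort_ratio = hotel_counts["City Hotel"] / hotel_counts["Resort Hotel"]

print(booking_share_pct)
print(f"City : Resort = {city_to_resort_ratio:.2f} : 1")
```

The percentage view answers "what share of bookings does each hotel get", while the ratio answers "how many City Hotel bookings are there per Resort Hotel booking".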

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The insights gained from analyzing the booking ratio between the Resort Hotel and the City Hotel can potentially have a positive business impact. However, it depends on the specific context and goals of the business.

If the booking ratio between the Resort Hotel and the City Hotel is skewed heavily towards one hotel, it can provide insights for business decision-making. For example, if the Resort Hotel has a significantly higher booking ratio, it could indicate that there is a higher demand for resort-style accommodations. This insight may lead the business to invest more in marketing and enhancing the resort hotel's amenities and services to further attract customers and drive growth in that segment.

On the other hand, if the booking ratio is heavily skewed towards the City Hotel, it could imply that there is a higher demand for city-centered accommodations. In this case, the business may focus on optimizing the city hotel's offerings, targeting specific customer segments, and tailoring marketing strategies to capitalize on the existing demand.

It's important to note that these insights may not necessarily lead to negative growth. Rather, they can guide strategic decisions to better align the business with customer preferences and market demand. The visualization of the booking ratio between the Resort Hotel and the City Hotel provides valuable information that can help identify opportunities for growth and optimization within the business.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
# Number of bookings for each year
plt.figure(figsize=(10, 10))
sns.set(style="whitegrid")
plt.title("Bookings per Year", fontdict={'fontsize': 20})
ax = sns.countplot(x="arrival_date_year", data=df_not_canceled)
plt.show()

##### 1. Why did you pick the specific chart?

I chose a bar chart because it is an effective way to display and compare categorical data, in this case, the number of hotel bookings for each year. Bar charts make it easy to observe the differences between categories and identify any patterns.

##### 2. What is/are the insight(s) found from the chart?

By analyzing the chart, we can gain insights into the booking trends over the years. We can identify which years had the highest and lowest number of bookings and observe any significant changes or patterns. This information can be used to understand the growth or decline in hotel bookings over time and potentially identify factors that contribute to these trends.
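
Since the problem statement asks for the percentage of bookings per year, the counts shown in the chart can also be expressed as shares. A minimal sketch on made-up values, with `yearly_share_pct` as an illustrative name:

```python
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({
    "arrival_date_year": [2015, 2016, 2016, 2016, 2017, 2017, 2016, 2015, 2017, 2016]
})

# Fraction of bookings per year, sorted chronologically and scaled to percent
yearly_share_pct = (
    toy["arrival_date_year"]
    .value_counts(normalize=True)
    .sort_index()
    * 100
).round(1)
print(yearly_share_pct)
```

Sorting by index keeps the years in chronological order, which `value_counts` alone does not guarantee because it sorts by frequency.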

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights can certainly help create a positive business impact. Understanding the booking trends can assist hotel management in making informed decisions, such as determining the peak seasons or identifying opportunities for marketing campaigns. It can also help in resource allocation, staffing, and overall business planning, leading to improved efficiency and customer satisfaction.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
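# How many customers does each hotel receive per month?
# Plot the months in calendar order rather than in order of first appearance
month_order = ['January', 'February', 'March', 'April', 'May', 'June',
               'July', 'August', 'September', 'October', 'November', 'December']
plt.figure(figsize=(15, 10))
sns.set(style="whitegrid")
plt.title("Total Customers - Monthly", fontdict={'fontsize': 20})
ax = sns.countplot(x="arrival_date_month", hue='hotel', data=df_not_canceled, order=month_order)
plt.show()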

##### 1. Why did you pick the specific chart?

I chose a countplot because it is specifically designed to show the count of observations in each category. It is an effective way to visualize categorical data, such as the number of hotel bookings for each month. The countplot provides a clear and concise representation of the busiest month based on the frequency of bookings.

##### 2. What is/are the insight(s) found from the chart?

By analyzing the countplot, we can determine the busiest month for hotels by observing the tallest bar. The month with the highest count indicates the busiest period for hotel bookings. This insight enables hotel management to focus on optimizing operations, staffing, and services during that month to handle the increased demand effectively.
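
Rather than reading the tallest bar off the chart, the busiest month per hotel can be computed. The sketch below uses made-up values and assumes month names are stored as English strings, as in `arrival_date_month`; `month_order`, `monthly_counts`, and `busiest_month` are illustrative names.

```python
import pandas as pd

# Toy data with made-up values
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "City Hotel",
              "Resort Hotel", "Resort Hotel", "City Hotel"],
    "arrival_date_month": ["August", "January", "August", "August",
                           "July", "August", "July"],
})

month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# An ordered categorical makes months sort by calendar position, not alphabetically
toy["arrival_date_month"] = pd.Categorical(
    toy["arrival_date_month"], categories=month_order, ordered=True
)

# Bookings per month and hotel; observed=True keeps only months present in the data
monthly_counts = (
    toy.groupby(["arrival_date_month", "hotel"], observed=True)
       .size()
       .unstack("hotel", fill_value=0)
)

# Month with the highest count for each hotel
busiest_month = monthly_counts.idxmax()
print(monthly_counts)
print(busiest_month)
```

The ordered categorical also keeps the rows of `monthly_counts` in calendar order, which is convenient when the table is plotted.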

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can create a positive business impact. Knowing the busiest month allows hotels to plan and allocate resources accordingly. They can adjust their pricing strategies, marketing efforts, and staffing levels to optimize revenue and customer satisfaction during the peak month. By efficiently managing operations during the busiest period, hotels can enhance guest experiences, increase profitability, and strengthen their competitive edge.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
# Which country do most guests come from?
plt.figure(figsize=(12, 6))
plt.title('Top 10 countries from which these hotels are receiving guests')
sns.barplot(x=top_countries_with_codes.index, y=top_countries_with_codes.values)
# Rotate the country codes after plotting so the rotation applies to the final tick labels
plt.xticks(rotation=75)
plt.show()

##### 1. Why did you pick the specific chart?

A bar plot was chosen to visualize the number of guests from different countries because it effectively represents categorical data (countries) and their corresponding numerical values (number of guests). Here's why the bar plot is suitable for this scenario:

* Comparison: The bar plot allows for easy comparison between different categories (countries) by the length of the bars. It visually displays the differences in the number of guests from each country, making it simple to identify which country has the highest number of guests.

* Categorical data: The bar plot is well-suited for displaying categorical data, such as countries. Each country is represented by a separate bar, making it clear and intuitive to interpret the data.

* Clear visualization: The bar plot provides a clear and straightforward representation of the data. The length of the bars corresponds directly to the number of guests, making it easy to understand and compare the values.

* Additional customization: The bar plot can be customized further by adding colors to each bar, as in the provided visualization. This customization helps distinguish between different categories (countries) and adds visual appeal to the plot.

Overall, the bar plot is an effective choice for visualizing the number of guests from different countries due to its ability to compare categories, display categorical data clearly, and provide customization options for better visual representation.

##### 2. What is/are the insight(s) found from the chart?

From the chart, we can derive several insights:

* Top countries: The chart reveals which countries have the highest number of guests. By examining the heights of the bars, we can easily identify the country with the maximum number of guests.

* Guest distribution: The chart provides an overview of the distribution of guests across different countries. We can observe the relative proportions of guests from various countries based on the lengths of the bars.

* Key source markets: The chart helps identify the countries that send the most guests to these hotels. Countries with taller bars account for a higher number of guests, suggesting that they are the most important source markets.

* Comparison between countries: The chart allows for easy comparison between countries in terms of the number of guests. By visually comparing the lengths of the bars, we can identify the countries with higher or lower guest numbers relative to each other.

These insights can be valuable for understanding guest demographics, identifying target markets, and making informed business decisions related to marketing strategies, resource allocation, and customer service for different countries.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights from the chart can potentially help create a positive business impact in the following ways:

* Targeted marketing: Knowing which countries have the most guests allows businesses to focus their marketing efforts on those specific regions. By tailoring promotional campaigns, advertisements, and offers to the preferences and needs of guests from these countries, businesses can attract more customers and increase bookings.

* Resource allocation: Understanding the distribution of guests across countries helps businesses allocate their resources effectively. They can adjust staffing levels, amenities, and services to cater to the needs and preferences of guests from different countries, thereby enhancing the overall guest experience and increasing customer satisfaction.

* Business partnerships: Identifying the main source countries of guests can open opportunities for collaborations and partnerships with local businesses in those countries. Establishing relationships with travel agencies, tour operators, and other relevant businesses in those regions can lead to increased referrals, joint marketing initiatives, and mutually beneficial partnerships.

However, it is important to note that the insights gained from the chart alone do not directly indicate negative growth. Negative growth or potential challenges may arise if businesses fail to adapt their strategies and operations based on the insights gained. For example, if a business overlooks the preferences and needs of guests from certain countries with high guest numbers, it may lead to a negative impact on customer satisfaction, repeat bookings, and overall business growth.

To ensure a positive business impact, it is crucial for businesses to analyze the insights gained from the chart, implement appropriate strategies, tailor their services, and adapt to the expectations of guests from different countries. By leveraging the insights effectively, businesses can enhance their competitiveness, attract more guests, and foster positive growth in the long run.
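The series behind the bar plot (`top_countries_with_codes`) is built earlier in the notebook. As a rough illustration of how such a top-10 ranking and each country's share of guests could be derived, here is a minimal sketch. The `country` column name and the three country codes are assumptions made up for the example, not values taken from the project data.

```python
import pandas as pd

# Hypothetical sample; the 'country' column name and the codes are assumptions
sample = pd.DataFrame({
    "country": ["PRT", "GBR", "PRT", "FRA", "PRT", "GBR", None]
})

# value_counts ignores missing countries and sorts from most to least frequent
top_countries = sample["country"].value_counts().head(10)

# Share of all bookings with a known country, in percent
known = sample["country"].notna().sum()
share_pct = (top_countries / known * 100).round(1)

summary = pd.DataFrame({"guests": top_countries, "share_pct": share_pct})
print(summary)
```

Reporting the share alongside the raw count makes it easier to judge how dominant the leading source country really is.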

#### Chart - 6

In [None]:
# Chart - 6 visualization code
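# Average daily rate per person and total price per stay, plotted by arrival month
plt.figure(figsize=(12,6))
# Guests per booking; zero-guest bookings become NaN to avoid division by zero
guests = (df_not_canceled['adults'] + df_not_canceled['children']).replace(0, float('nan'))
df_not_canceled['adr_pp'] = df_not_canceled['adr'] / guests
# Total price of a stay = average daily rate * number of nights
df_not_canceled['price'] = df_not_canceled['adr'] * (df_not_canceled['stays_in_weekend_nights'] + df_not_canceled['stays_in_week_nights'])
sns.lineplot(data = df_not_canceled, x = 'arrival_date_month', y = 'price', hue = 'hotel')
plt.show()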

##### 1. Why did you pick the specific chart?

I chose a line plot, drawn with Seaborn, for the following reasons:

* **Visualizing Price Trends**: The objective of the code is to visualize the price trends in hotels over different months. A line plot is an effective choice for this purpose as it allows us to observe the change in prices over time, represented by the x-axis (arrival_date_month) and y-axis (price).

* **Comparative Analysis**: By using different colors (hue) to represent the hotel types (resort or city), the line plot enables a comparative analysis of price trends between the two hotel categories. The use of color allows for easy differentiation and understanding of the data.

* **Time-based Analysis**: Line plots are particularly useful for showcasing trends over time. In this case, the x-axis represents the different months, providing a clear temporal dimension to the data. By connecting data points with lines, the plot highlights any patterns or changes in prices across the months.

* **Clear and Concise Presentation**: The line plot presents the information in a visually appealing and easy-to-understand manner. It provides a comprehensive view of how prices vary over time, allowing viewers to quickly grasp the overall trends and any differences between resort and city hotels.

Overall, the line plot is an appropriate choice for visualizing the price trends in hotels over different months and effectively communicates the analysis to the audience.


##### 2. What is/are the insight(s) found from the chart?

By analyzing the line plot, we can identify trends or fluctuations in the total price of a stay (`price`, i.e. ADR multiplied by the number of nights) across arrival months for each hotel type. We can observe whether prices are generally increasing, decreasing, or remaining relatively stable. These insights provide valuable information for hotel management to understand the pricing dynamics, demand patterns, and the potential impact of external factors on the ADR. For example, they can identify peak seasons or periods of high demand and adjust pricing strategies accordingly.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact. Understanding how prices vary across arrival months allows hotels to optimize their pricing strategies and revenue management. By identifying periods of high demand or low occupancy, hotels can adjust rates dynamically to maximize revenue and profitability. Additionally, these insights enable hotels to make data-driven decisions when setting prices, creating promotions, and managing inventory, resulting in increased efficiency, revenue growth, and improved customer satisfaction.
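Month names are plain strings, so they do not carry a calendar order on their own. A minimal sketch of one way to enforce that order and to tabulate the mean rate per person by month and hotel is given below. The `sample` frame and its values are made up for illustration; the column names follow the ones used in this notebook, and replacing zero-guest bookings with NaN is an assumption about how such rows should be treated.

```python
import pandas as pd

# Hypothetical sample using the notebook's column names
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "arrival_date_month": ["March", "January", "January", "March"],
    "adr": [100.0, 80.0, 60.0, 90.0],
    "adults": [2, 2, 1, 0],
    "children": [0, 0, 1, 0],
})

month_order = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# An ordered categorical makes groupby and plots follow the calendar order
sample["arrival_date_month"] = pd.Categorical(
    sample["arrival_date_month"], categories=month_order, ordered=True
)

# Rate per person; bookings with zero guests yield NaN instead of infinity
guests = (sample["adults"] + sample["children"]).replace(0, float("nan"))
sample["adr_pp"] = sample["adr"] / guests

# Mean rate per person for each month (rows) and hotel type (columns)
monthly_adr_pp = (
    sample.groupby(["arrival_date_month", "hotel"], observed=True)["adr_pp"]
    .mean()
    .unstack("hotel")
)
print(monthly_adr_pp)
```

With the month column converted this way, a line plot over `arrival_date_month` would run from January to December rather than in order of appearance.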

#### Chart - 7

In [None]:
# Chart - 7 visualization code
# Number of stays on weekend nights vs week nights

# Two side-by-side panels, one per night type
fig, axes = plt.subplots(1, 2, figsize=(20, 6))
sns.countplot(x='stays_in_weekend_nights', hue='hotel', data=df_not_canceled, palette='cool', ax=axes[0])
axes[0].set_title("Number of stays on weekend nights", fontweight="bold", size=20)
sns.countplot(x='stays_in_week_nights', hue='hotel', data=df_not_canceled, palette='rocket', ax=axes[1])
axes[1].set_title("Number of stays on week nights", fontweight="bold", size=20)
plt.tight_layout()

plt.show()

##### 1. Why did you pick the specific chart?

To compare weekend stays with weekday stays, I used two count plots (a form of bar chart), one for `stays_in_weekend_nights` and one for `stays_in_week_nights`, each split by hotel type. Every bar shows how many bookings include a given number of nights.

Comparing the bar heights within and between the two plots helps to identify any significant differences or patterns between weekend and weekday stays.

##### 2. What is/are the insight(s) found from the chart?

From the bar chart comparing the number of weekend stays vs. weekday stays, the insights that can be derived include:

* Comparison of Volume: The chart visually represents the difference in the number of stays between weekends and weekdays. By comparing the heights of the bars, it is easy to see which category has a higher volume of stays.

* Booking Patterns: The chart helps to understand the booking patterns of guests. If there is a significant difference between weekend stays and weekday stays, it may indicate certain preferences or trends among guests.

* Occupancy Management: The chart can provide insights for hotel managers in terms of managing occupancy. If there are clear differences in the number of stays between weekends and weekdays, it can help with resource allocation, staffing, and revenue management strategies.

* Marketing and Promotions: Understanding the difference in stays between weekends and weekdays can guide marketing and promotional activities. It can help in targeting specific offers or promotions to attract more guests during certain periods or incentivize stays during quieter times.

Overall, the chart allows for a quick comparison between weekend stays and weekday stays, enabling insights into guest booking behavior and potential areas for business optimization.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The insights gained from comparing weekend stays and weekday stays can indeed help create a positive business impact. Here's how:

* Revenue Optimization: Understanding the difference in stays between weekends and weekdays allows for better revenue management. Hotels can adjust pricing strategies and offers accordingly. For example, they can offer discounted rates or special packages during weekdays to attract more guests and maximize occupancy during quieter periods. This can help optimize revenue and improve overall financial performance.

* Resource Allocation: With insights into booking patterns, hotels can efficiently allocate resources such as staffing, housekeeping, and amenities. If weekends are busier, they can ensure sufficient staff and services are available to provide a positive guest experience. On the other hand, if weekdays have lower occupancy, resources can be scaled down to manage costs effectively.

* Marketing and Promotions: The insights can guide marketing efforts to target specific segments and periods. Hotels can design targeted marketing campaigns to attract more guests during weekends or create appealing weekday packages for business travelers or local residents. By aligning marketing strategies with booking patterns, hotels can improve customer acquisition and drive more bookings.

However, it's important to note that insights may also indicate challenges or negative growth areas:

* Weekend Demand-Supply Imbalance: If the insights reveal a significant difference in demand between weekends and weekdays, it could lead to challenges in managing inventory and pricing. Hotels may face higher competition for weekend stays, potentially leading to rate fluctuations or decreased profitability during weekends.

* Low Weekday Demand: If the insights show consistently low weekday stays, it may indicate challenges in attracting business or leisure travelers during weekdays. Hotels can strategize to address this by offering incentives, partnering with local businesses, or targeting specific markets to stimulate weekday demand.

Overall, the gained insights can be leveraged to create positive business impacts through revenue optimization, resource allocation, and targeted marketing. However, potential negative growth areas may require further analysis and strategic interventions to address specific challenges and maximize overall performance.
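The count plots show distributions rather than a single weekend-versus-weekday figure. One way to condense them into such a figure is to total the nights of each type per hotel and compute the weekend share. This is only a sketch on made-up numbers; the column names follow the notebook, while `weekend_share_pct` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample using the notebook's column names
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "stays_in_weekend_nights": [0, 2, 2, 1],
    "stays_in_week_nights": [3, 1, 5, 0],
})

night_cols = ["stays_in_weekend_nights", "stays_in_week_nights"]

# Total nights of each type booked per hotel
nights = sample.groupby("hotel")[night_cols].sum()

# Weekend nights as a percentage of all nights booked
total_nights = nights[night_cols].sum(axis=1)
nights["weekend_share_pct"] = (
    nights["stays_in_weekend_nights"] / total_nights * 100
).round(1)
print(nights)
```

Since a week has two weekend nights out of seven, a share well above roughly 29% would hint at weekend-heavy demand.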

#### Chart - 8

In [None]:
# Chart - 8 visualization code
# Hotels available for booking
# Count the number of occurrences of each hotel category
hotel_counts = df['hotel'].value_counts()

# Create a bar plot
plt.figure(figsize=(8, 6))
ax = sns.barplot(x=hotel_counts.index, y=hotel_counts.values, palette='viridis')
# Add count labels on each bar
for i, count in enumerate(hotel_counts.values):
    ax.text(i, count, str(count), ha='center', va='bottom', fontsize=12)
# Customize the plot
plt.xlabel('Hotel')
plt.ylabel('Count')
plt.title('Hotels Available for Booking')

# Display the plot
plt.show()


##### 1. Why did you pick the specific chart?

For the visualization of "Hotels available for booking," a bar plot was chosen as the specific chart. Here's the rationale behind selecting this chart:

* Categorical Comparison: The data consists of different hotel categories, and a bar plot is well-suited for comparing the count of hotels across categories. It allows for easy visual comparison of the number of hotels available for booking.

* Clear Representation: Bar plots use rectangular bars to represent data, with the length of each bar corresponding to the magnitude of the data. This makes it straightforward to understand the relative differences in the number of hotels between categories.

* Efficient Communication: Bar plots are widely recognized and familiar to most viewers, making it easier to interpret the data accurately. The simplicity of the chart allows for quick comprehension of the information being presented.

* Emphasis on Count: The primary focus of this visualization is to showcase the count of hotels available for booking in each category. The bar plot effectively displays the numerical values associated with each category, allowing for precise estimation and comparison.

Considering these factors, the bar plot was deemed suitable for visualizing the count of hotels available for booking, providing a clear and concise representation of the data.

##### 2. What is/are the insight(s) found from the chart?

The insights that can be derived from the chart are:

* Booking Comparison: The chart allows for a direct comparison of the number of booking records for each hotel type. By examining the heights of the bars, you can quickly identify which hotel type accounts for more bookings.

* Booking Disparity: The chart highlights any significant difference in booking volume between the hotel types. A notable difference in bar heights indicates that one hotel type receives substantially more bookings than the other. This insight can prompt further investigation into the factors contributing to the disparity.

* Exact Counts: The count labels displayed on top of each bar give the exact number of bookings for each hotel type, so the comparison does not rely on reading values off the axis.

* Booking Focus: The chart helps identify which hotel type drives the larger portion of overall bookings. This can guide where to focus optimization efforts or how to leverage the stronger hotel type to support the other.

These insights can assist in making informed business decisions, such as allocating resources and identifying areas for improvement. However, the chart counts rows of `df` per hotel type; it does not show revenue or the number of distinct properties, so additional factors and data should be considered to gain a comprehensive understanding of the hotel bookings dataset.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights from the "Hotels available for booking" visualization can potentially help create a positive business impact. Here's a justification for each aspect:

**Positive Business Impact:**

* Distribution Analysis: Understanding the distribution of hotels across different categories can help businesses identify potential gaps or opportunities in the market. It allows them to focus on popular categories or consider diversifying their offerings based on customer demand.

* Dominant Hotel Category: Identifying the most prominent or popular hotel category can guide business decisions such as resource allocation, marketing strategies, and investment plans. Capitalizing on the dominant category can lead to increased bookings and revenue.

* Comparison of Hotel Categories: Comparing the number of hotels between categories provides valuable insights into market competition. It allows businesses to assess their position relative to competitors and make informed decisions regarding pricing, promotions, and differentiation strategies.

**Negative Growth:**
There are no inherent insights from the "Hotels available for booking" visualization that directly indicate negative growth. However, if the analysis reveals a significant disparity in the number of hotels between categories, it could suggest a potential imbalance in supply and demand. This could lead to challenges such as oversaturation in popular categories or underutilization in less popular ones, potentially impacting profitability and growth. Businesses would need to carefully evaluate such insights and develop strategies to address these imbalances to mitigate any negative effects.

It's important to note that the potential business impact and any negative growth implications heavily depend on the specific context, market conditions, and the actions taken by the business based on the insights gained.
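Raw counts are easier to interpret next to percentages. A short sketch of how the share of bookings per hotel type could be computed is shown below; the `sample` frame is made up, and only the `hotel` column name and its two categories come from this notebook.

```python
import pandas as pd

# Hypothetical sample; only the 'hotel' column name is taken from the notebook
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "City Hotel", "Resort Hotel"]
})

# Absolute number of booking records per hotel type
hotel_counts = sample["hotel"].value_counts()

# Same breakdown as a percentage of all records
hotel_share = (sample["hotel"].value_counts(normalize=True) * 100).round(1)

print(pd.DataFrame({"bookings": hotel_counts, "share_pct": hotel_share}))
```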

#### Chart - 9

In [None]:
# Chart - 9 visualization code
#Total bookings across years

# Calculate the total bookings for each year
total_bookings =df['arrival_date_year'].value_counts().sort_index()

# Create a bar plot
plt.figure(figsize=(8, 6))
sns.barplot(x=total_bookings.index, y=total_bookings.values, palette='viridis')

# Add total numbers on each bar
for i, count in enumerate(total_bookings.values):
    plt.text(i, count, str(count), ha='center', va='bottom')

# Customize the plot
plt.xlabel('Year')
plt.ylabel('Total Bookings')
plt.title('Total Bookings Across Years')

# Display the plot
plt.show()

In [None]:
#Total bookings across years by hotels
# Define custom colors for the bars
custom_colors = ["#FFA07A", "#87CEEB"]

# Calculate the total bookings for each year and hotel
total_bookings = df.groupby(['arrival_date_year', 'hotel']).size().reset_index(name='total_bookings')

# Create a bar plot with hue and custom colors
plt.figure(figsize=(8, 6))
ax = sns.barplot(x='arrival_date_year', y='total_bookings', hue='hotel', data=total_bookings, palette=custom_colors)

# Add total numbers on each bar
for p in ax.patches:
    height = p.get_height()
    ax.annotate(str(int(height)), xy=(p.get_x() + p.get_width() / 2, height),
                xytext=(0, 5), textcoords='offset points',
                ha='center', va='bottom')

# Customize the plot
plt.xlabel('Year')
plt.ylabel('Total Bookings')
plt.title('Total Bookings Across Years by Hotel')

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

The specific chart chosen for the visualization of total bookings across years with separate bars for each hotel is a bar plot. A bar plot is suitable for comparing and visualizing the count or quantity of a variable across different categories. In this case, we want to compare the total bookings for each year and distinguish between the two hotels (City Hotel and Resort Hotel). The bar plot effectively displays this information by representing each year as a separate bar and using different colors to indicate the two hotels.

##### 2. What is/are the insight(s) found from the chart?

From the chart, we can observe the following insights:

* The total number of bookings increased gradually from 2015 to 2017 for both City Hotel and Resort Hotel.
* In 2016, the number of bookings for the City Hotel surpassed the Resort Hotel, indicating higher demand for the City Hotel during that year.
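* The chart covers only the arrival years present in the data (2015 to 2017, as noted above), so it cannot show the effect of later events such as the COVID-19 pandemic. Any apparent drop in the first or last year should be checked against how many months of that year the data actually cover.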
* Overall, the City Hotel consistently had a higher number of bookings compared to the Resort Hotel across the years.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

* The gained insights can help create a positive business impact by providing valuable information for strategic decision-making. Understanding the booking patterns over time can assist hotel management in optimizing their resources, staffing, and pricing strategies. For example, if the City Hotel consistently has higher demand, the management can focus on enhancing services, expanding facilities, or implementing targeted marketing campaigns to attract more customers. Additionally, the insights can aid in identifying trends and making informed forecasts to better allocate resources and plan for future bookings.
* A year-over-year drop in bookings for either hotel would indicate a period of negative growth, since fewer bookings can result in financial losses and operational challenges. Because the data cover arrival years 2015 to 2017 only, such a drop cannot be attributed to the COVID-19 pandemic, and a year that is only partly covered by the data would understate that year's bookings. Where a genuine decline is confirmed, hotels can mitigate it by implementing flexible cancellation policies and exploring alternative revenue streams such as staycation packages or partnerships with local businesses.
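Growth or decline between years can be quantified rather than judged by eye. The sketch below computes the year-over-year percentage change in bookings per hotel on a made-up `sample` frame; the column names match the notebook, and the counts are illustrative only.

```python
import pandas as pd

# Hypothetical sample: one row per booking, using the notebook's column names
sample = pd.DataFrame({
    "arrival_date_year": [2015, 2016, 2016, 2017, 2015, 2016, 2016, 2017, 2017],
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 5,
})

# Bookings per year (rows) and hotel type (columns)
yearly = sample.groupby(["arrival_date_year", "hotel"]).size().unstack("hotel")

# Percentage change relative to the previous year; the first year has no baseline
yoy_pct = (yearly.diff() / yearly.shift() * 100).round(1)
print(yoy_pct)
```

A negative value flags a decline, but it is worth confirming that both years being compared cover the same number of months before reading it as negative growth.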

#### Chart - 10

In [None]:
# Chart - 10 visualization code
#Total revenue across years split by hotel type


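# Sum ADR by hotel type and year
# (ADR is a per-night rate, so this sum is only a rough proxy for revenue)
revenue_data = df.groupby(['arrival_date_year', 'hotel'])['adr'].sum().reset_index()

# Create a bar plot with both hotel types drawn at the same x position
# (dodge=False overlays the bars from a common baseline; it does not stack them)
sns.barplot(data=revenue_data, x='arrival_date_year', y='adr', hue='hotel', palette='Set1', dodge=False)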

# Set y-axis label as 'Total Annual Revenue'
plt.ylabel('Total Annual Revenue')

# Set x-axis label as 'Years'
plt.xlabel('Years')

# Set the title of the plot
plt.title('Hotel Type vs Total Annual Revenue across years')

# Show the legend with 'Hotel Type' label
plt.legend(title='Hotel Type')

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

The specific chart chosen is a bar plot with the bars of both hotel types overlaid at each year (`dodge=False`), because it compares the total annual revenue of the hotel types across multiple years in a compact form. Since the bars share a baseline, each bar's height is that hotel type's own total, not a cumulative sum.

##### 2. What is/are the insight(s) found from the chart?

The chart allows us to observe the total annual revenue generated by each hotel type over the years. We can analyze the revenue trends and identify any significant differences or patterns between the hotel types.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

* Yes, the gained insights can help create a positive business impact. By understanding the revenue generated by each hotel type over the years, hotel management can make informed decisions regarding resource allocation, marketing strategies, and overall business planning. It enables them to identify successful revenue-generating periods and focus on optimizing performance.
* Negative growth cannot be established from this chart alone. Although the bars can be compared from year to year, the plotted value is the sum of ADR rather than revenue actually earned, and a year that is only partly covered by the data would look weaker than it really was. Additional data and analysis would be needed to determine if there are any periods of negative growth or declining revenue for specific hotel types.
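A closer estimate of revenue multiplies ADR by the number of nights of each booking, and a genuinely stacked bar chart can be drawn from a year-by-hotel table. The following is a sketch under those assumptions, not the method used in the chart above: the `sample` values are made up, `revenue` is a column introduced here, and `plot_stacked_revenue` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical sample using the notebook's column names
sample = pd.DataFrame({
    "arrival_date_year": [2015, 2015, 2016, 2016],
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel", "Resort Hotel"],
    "adr": [100.0, 50.0, 120.0, 80.0],
    "stays_in_weekend_nights": [1, 2, 0, 2],
    "stays_in_week_nights": [2, 2, 3, 3],
})

# Estimated revenue per booking = average daily rate * total nights
sample["revenue"] = sample["adr"] * (
    sample["stays_in_weekend_nights"] + sample["stays_in_week_nights"]
)

# One row per year, one column per hotel type
revenue_wide = sample.pivot_table(
    index="arrival_date_year", columns="hotel", values="revenue", aggfunc="sum"
)
print(revenue_wide)


def plot_stacked_revenue(wide):
    """Draw true stacked bars (hypothetical helper; requires matplotlib)."""
    ax = wide.plot(kind="bar", stacked=True, figsize=(8, 6))
    ax.set_xlabel("Years")
    ax.set_ylabel("Estimated Annual Revenue")
    ax.set_title("Hotel Type vs Estimated Annual Revenue across years")
    return ax
```

Calling `plot_stacked_revenue(revenue_wide)` would draw the chart. On the real data it may also make sense to restrict the calculation to non-cancelled bookings, since cancelled stays generate no room revenue.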

#### Chart - 11

In [None]:
# Chart - 11 (a) visualization code
#  We will first check for the week nights stay
plt.figure(figsize = (15,10))
sns.boxplot(x = "market_segment", y = "stays_in_week_nights", data = df_not_canceled, hue = "hotel", palette = 'Set1');

In [None]:
# Chart - 11 (b) visualization code
#  We will check for the weekend stay
plt.figure(figsize=(15,10))
sns.boxplot(x = "market_segment", y = "stays_in_weekend_nights", data =df_not_canceled, hue = "hotel", palette = 'Set1');

##### 1. Why did you pick the specific chart?

I chose a box plot because it effectively visualizes the distribution of a numeric variable (the number of week-night and weekend-night stays) across different categories (market segments), split by hotel type. The box plot displays the quartiles, median, outliers, and other statistical measures that help identify patterns, variations, and potential anomalies in the data.

##### 2. What is/are the insight(s) found from the chart?

By analyzing the box plots, we can observe the distribution of week-night and weekend-night stays for each market segment. We can identify the median, range, and variability of the stays for different segments. The box plots allow us to compare the central tendency and spread of stay lengths across market segments and hotel types and identify any significant differences or patterns.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights can support a positive business impact through:

1. Identification of market segments with longer or shorter stays.

2. Assessment of how consistent stay lengths are across market segments.

3. Identification of potential outliers or unusual patterns in stay lengths for specific market segments.
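The statistics a box plot draws can also be tabulated directly. The sketch below computes the quartiles per market segment and counts points outside the common 1.5 × IQR fences, which is also seaborn's default whisker rule. The `sample` values and the two segment labels are made up for illustration; the column names follow the notebook.

```python
import pandas as pd

# Hypothetical sample using the notebook's column names; segment labels are illustrative
sample = pd.DataFrame({
    "market_segment": ["Online TA"] * 5 + ["Corporate"] * 5,
    "stays_in_week_nights": [1, 2, 2, 3, 10, 1, 1, 2, 2, 2],
})

# Quartiles of week-night stays for each market segment
grouped = sample.groupby("market_segment")["stays_in_week_nights"]
quartiles = grouped.quantile([0.25, 0.5, 0.75]).unstack()
quartiles.columns = ["q1", "median", "q3"]

# Fences at 1.5 * IQR beyond the quartiles, mapped back onto each booking
iqr = quartiles["q3"] - quartiles["q1"]
lower = sample["market_segment"].map(quartiles["q1"] - 1.5 * iqr)
upper = sample["market_segment"].map(quartiles["q3"] + 1.5 * iqr)

# Bookings outside the fences would be drawn as individual outlier points
is_outlier = (sample["stays_in_week_nights"] < lower) | (
    sample["stays_in_week_nights"] > upper
)
quartiles["outliers"] = is_outlier.groupby(sample["market_segment"]).sum()
print(quartiles)
```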

#### Chart - 12

In [None]:
# Chart - 12 visualization code
#How Long People Stay in the hotel?
# Calculate the total number of nights stayed
df_not_canceled['total_nights'] = df_not_canceled['stays_in_weekend_nights'] + df_not_canceled['stays_in_week_nights']

# Define a custom color palette
colors = ['#f8766d', '#00bfc4']  # Custom colors for hotel types

# Create a count plot of the ten most common stay durations, split by hotel
fig, ax = plt.subplots(figsize=(12, 6))
sns.countplot(x='total_nights', hue='hotel', data=df_not_canceled,
              order=df_not_canceled['total_nights'].value_counts().iloc[:10].index, ax=ax, palette=colors)
# Label the axes after plotting (older seaborn versions may overwrite labels set beforehand)
ax.set_xlabel('No of Nights')
ax.set_ylabel('No of Bookings')
ax.set_title('Hotel-wise Night Stay Duration (Top 10)')

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

I picked the specific chart, which is a countplot, because it effectively visualizes the distribution of the number of nights people stay in hotels for each hotel type. The countplot displays the count of occurrences for different categories (in this case, the number of nights stayed) on the x-axis, while the hue parameter represents the hotel type, allowing for easy comparison between the two types of hotels.

By using a countplot, we can quickly identify the most common durations of hotel stays and compare the patterns between different hotel types. This chart is suitable for analyzing the distribution and frequency of night stays, making it helpful in understanding the booking trends and preferences of guests in each hotel type.

Overall, the countplot provides a clear and concise visualization to answer the question of how long people stay in hotels, offering insights into the distribution of night stays across different durations and hotel types.

##### 2. What is/are the insight(s) found from the chart?

The insights found from the chart can include:

* **Distribution of Stay Duration**: The chart provides a visual representation of the distribution of stay durations for each hotel type. It allows us to identify the most common durations and observe any variations between the two hotel types.

* **Popular Stay Durations**: By examining the tallest bars on the chart, we can determine the most popular stay durations for guests in each hotel type. This insight can be valuable for understanding guest preferences and optimizing hotel operations accordingly.

* **Comparison between Hotel Types**: The chart enables a comparison between the stay durations of different hotel types. It helps identify any notable differences or similarities in guest behavior, such as longer or shorter stays in one type of hotel compared to the other.

* **Outliers or Unusual Stay Durations**: By examining the entire distribution, including the less frequent stay durations, the chart can reveal any outliers or unusual patterns. These insights can be explored further to understand the reasons behind unique guest preferences or specific booking behaviors.

Overall, the chart provides insights into the typical stay durations in each hotel type and allows for comparisons between the two. It helps identify common patterns and potential areas for further analysis or improvement in hotel operations and guest services.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights from the chart can help create a positive business impact. Here's how:

* **Optimal Resource Allocation**: Understanding the most popular stay durations for each hotel type can help optimize resource allocation. Hotels can adjust their staffing levels, room availability, and amenities to align with the typical guest preferences. This can lead to improved operational efficiency and guest satisfaction.

* **Revenue Optimization**: By analyzing the distribution of stay durations, hotels can identify opportunities to maximize revenue. For example, if shorter stays are more common, hotels can focus on increasing occupancy rates and attracting more frequent bookings. Conversely, if longer stays are prevalent, hotels can offer extended stay packages or target guests seeking extended accommodations.

* **Customized Marketing and Services**: Knowing the preferred stay durations of guests in each hotel type enables targeted marketing campaigns and personalized guest experiences. Hotels can tailor their promotions, packages, and services to attract and cater to guests' specific needs, leading to higher customer loyalty and positive reviews.

Regarding negative growth, there may not be any direct insights from the chart that would lead to negative growth. However, it's important to consider outliers or unusual patterns in stay durations. If the chart reveals a significant number of extremely short or long stays that deviate from the norm, it could indicate potential issues such as cancellations, dissatisfaction, or operational challenges. These outliers should be investigated further and addressed to ensure positive growth and guest satisfaction.
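One way to make "extremely short or long stays" concrete is the interquartile range (IQR) rule. The sketch below is an assumption about how such stays could be flagged, not a method used in the notebook, and the stay lengths are made up for illustration.

```python
import pandas as pd

# Made-up stay lengths (in nights) for illustration only
total_stay = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 5, 30], name="total_stay")

# Interquartile range of the stay lengths
q1 = total_stay.quantile(0.25)
q3 = total_stay.quantile(0.75)
iqr = q3 - q1

# Conventional 1.5 * IQR fences; the multiplier is a rule of thumb, not a fixed standard
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Stays outside the fences are candidates for a closer look
unusual = total_stay[(total_stay < lower) | (total_stay > upper)]
print(unusual.tolist())
```

Flagged bookings are only candidates for investigation; a long stay may be a legitimate extended booking rather than a data problem.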

#### Chart - 13 - Correlation Heatmap




In [None]:
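# Correlation Heatmap visualization code
# numeric_only=True restricts the matrix to numeric columns (pandas 2.0+ raises an error on string columns otherwise)
plt.figure(figsize=(20, 12))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='RdYlGn')
plt.show()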

##### 1. Why did you pick the specific chart?

I picked a correlation heatmap for this exploratory analysis because it allows us to visualize the pairwise correlations between multiple variables simultaneously. The heatmap provides a clear and intuitive representation of the strength and direction of correlations using color gradients. This makes it easier to identify patterns and relationships between variables.

##### 2. What is/are the insight(s) found from the chart?

The insights you can derive from the correlation heatmap will depend on the specific variables included in the analysis and the patterns observed. Here are a few potential insights you might gain:

Strong positive correlation: If two numeric columns such as 'stays_in_week_nights' and 'stays_in_weekend_nights' show a strong positive correlation, it suggests that bookings with more week nights also tend to include more weekend nights, that is, longer stays span both.

Strong negative correlation: A strong negative coefficient between two numeric columns indicates that one tends to decrease as the other increases. Categorical columns do not appear in the heatmap, because the correlation matrix is computed on numeric columns only.

Weak or no correlation: If you find weak or no correlation between variables, it suggests that they are likely independent of each other or have nonlinear relationships that are not captured by linear correlation measures.
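A heatmap with many columns is hard to read cell by cell, so it can help to list the variable pairs sorted by correlation strength. The following is a rough sketch under stated assumptions: the values are made up, the numeric column names are taken from the pair plot cell, and the `hotel` column name is assumed.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "lead_time": [5, 20, 60, 120, 200, 300],
    "stays_in_weekend_nights": [0, 1, 2, 2, 2, 4],
    "stays_in_week_nights": [1, 2, 3, 5, 5, 10],
    "is_canceled": [0, 0, 0, 1, 1, 1],
    "hotel": ["City Hotel", "Resort Hotel"] * 3,  # non-numeric, excluded below
})

# Pearson correlation on numeric columns only
corr = toy.corr(numeric_only=True)

# Keep the upper triangle (k=1 skips the diagonal) so each pair appears once
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Strongest relationships first, regardless of sign
ranked = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(ranked)
```

Pearson correlation only captures linear relationships, so a pair near zero in this list may still be related in a nonlinear way.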

#### Chart - 14 - Pair Plot

In [None]:
selected_columns = ['lead_time', 'stays_in_weekend_nights', 'stays_in_week_nights', 'is_canceled']

# Creating a DataFrame with selected columns
df_selected = df[selected_columns]

# Plotting the pair plot
sns.pairplot(df_selected ,hue='is_canceled')
plt.show()






##### 1. Why did you pick the specific chart?

I picked a pair plot for this exploratory analysis because it visualizes the relationships between pairs of variables in a dataset. It draws a scatter plot for each numerical variable against every other numerical variable and shows the distribution of each variable along the diagonal (a histogram by default, or a density curve per group when `hue` is set, as it is here). This helps in identifying potential patterns, trends, and outliers in the data.

##### 2. What is/are the insight(s) found from the chart?

The insights you can derive from the pairplot will depend on the specific variables included in the analysis and the patterns observed. Here are a few potential insights you might gain:

Correlations: You can identify relationships between numerical variables by examining the scatter plots. Because 'is_canceled' is passed as `hue`, it colors the points rather than appearing on an axis, so its relationship with 'lead_time' (the number of days between booking and arrival) shows up as a difference between the two colored groups. For example, canceled bookings might cluster at longer lead times.

Distributions: The plots along the diagonal provide insights into the distributions of individual variables, drawn separately for canceled and non-canceled bookings. You can identify if a variable is skewed, bimodal, or roughly normally distributed. This information is valuable for understanding the data characteristics.

Outliers: Scatter plots can reveal outliers or extreme values that deviate significantly from the general trend. Outliers may require further investigation or preprocessing, as they can impact analysis and modeling.
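Visual impressions from the pair plot can be checked with a few numbers. The sketch below, on made-up values, compares `lead_time` between canceled and non-canceled bookings and computes the Pearson correlation with the 0/1 `is_canceled` column (often called the point-biserial correlation). It is an illustration, not a result from the actual dataset.

```python
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "lead_time": [3, 10, 25, 40, 90, 150, 210, 320],
    "is_canceled": [0, 0, 0, 1, 0, 1, 1, 1],
})

# Lead-time summary within each cancellation group
lead_by_status = toy.groupby("is_canceled")["lead_time"].agg(["count", "median", "mean"])

# Pearson correlation between a numeric column and a 0/1 column
r = toy["lead_time"].corr(toy["is_canceled"])

print(lead_by_status)
print(round(r, 3))
```

A clear gap between the group medians together with a positive coefficient would support the reading that longer lead times go with more cancellations.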

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the business objective?
Explain briefly.

Based on the Hotel Booking EDA, I recommend the following to help the client achieve their business objectives:

* **Reduce the Booking Cancellation Rate**: Analyze the factors contributing to booking cancellations and implement strategies to reduce them. This could include optimizing the booking process, offering flexible cancellation policies, and enhancing communication with guests to understand their needs better.

* **Focus on Marketing and Promotion**: Identify the countries from which most guests come and tailor marketing efforts to target those specific markets. This could involve collaborating with local travel agencies, leveraging social media platforms, and offering promotions or packages that cater to the preferences of guests from those countries.

* **Optimize Revenue Generation**: Continuously monitor and analyze revenue trends across years to identify periods of growth or decline. Based on these insights, adjust pricing, optimize room availability and rates, and offer personalized services or packages to increase revenue.

* **Enhance Guest Experience**: Use the insights gained from the analysis of the length of stay to improve the overall guest experience. This could involve providing personalized services, implementing loyalty programs, offering additional amenities or activities to extend guest stays, and ensuring efficient check-in and check-out processes.

* **Improve Operational Efficiency**: Analyze the distribution of bookings on weekdays and weekends to optimize staffing levels, resource allocation, and service availability. This will help ensure that the hotel is adequately staffed during peak periods and can efficiently manage guest needs and preferences.

* **Monitor Competitor Performance**: Keep track of the performance and offerings of competitors in the hotel industry. This will provide valuable insights into market trends, customer preferences, and potential areas for improvement or differentiation.

* **Collect and Utilize Guest Feedback**: Implement a systematic process to collect guest feedback and reviews to understand their experiences and identify areas for improvement. Utilize this feedback to enhance service quality, address customer concerns, and exceed guest expectations.

By implementing these recommendations, the client can aim to achieve objectives such as reducing booking cancellations, increasing revenue, improving guest satisfaction, and staying competitive in the hotel industry. Regular monitoring, analysis, and adaptation of strategies based on data insights will be essential to drive positive business outcomes.
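As a starting point for the regular monitoring mentioned above, the cancellation rate can be tracked by lead-time band. This is a hedged sketch on made-up data. The band edges and labels are arbitrary choices for illustration, and `lead_band` is a hypothetical helper column that is not in the original dataset.

```python
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "lead_time": [2, 5, 20, 28, 45, 80, 120, 200, 250, 400],
    "is_canceled": [0, 0, 0, 1, 0, 1, 1, 0, 1, 1],
})

# Band edges are arbitrary; intervals are closed on the right, e.g. (30, 90]
bins = [0, 30, 90, 365, float("inf")]
labels = ["0-30", "31-90", "91-365", "366+"]
toy["lead_band"] = pd.cut(toy["lead_time"], bins=bins, labels=labels, include_lowest=True)

# The mean of a 0/1 column is the cancellation rate; size gives the bookings per band
cancel_rate = (
    toy.groupby("lead_band", observed=False)["is_canceled"]
       .agg(rate="mean", bookings="size")
)
print(cancel_rate)
```

Reporting the booking count next to each rate makes it clear when a band has too few bookings for its rate to be reliable.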

# **Conclusion**

**In conclusion**, the Hotel Booking EDA conducted on the dataset has provided valuable insights into various aspects of hotel bookings, guest behavior, and revenue generation. The analysis has shed light on important trends, patterns, and factors that can impact the success of the hotel business.

#### The key findings from the EDA include:

* **Booking Cancellations**: A significant number of bookings were found to be canceled. This highlights the importance of understanding the reasons behind cancellations and implementing strategies to reduce them.

* **Booking Ratio**: The booking ratio between the Resort Hotel and City Hotel was examined, indicating the preferences of guests for each type of hotel. This information can be used to tailor marketing efforts and allocate resources accordingly.

* **Weekday vs. Weekend Bookings**: The analysis of bookings on weekdays versus weekends revealed insights into the demand patterns during different days of the week. This knowledge can guide operational decisions and resource allocation.

* **Revenue Generation**: The total revenue across years and split by hotel type was analyzed, providing an overview of the revenue trends. This information can help in identifying periods of growth and implementing revenue optimization strategies.

* **Guest Origin**: The country from which most guests come was identified, providing an opportunity to target marketing efforts and tailor services to cater to the needs and preferences of guests from that specific country.

* **Length of Stay**: The analysis of how long people stay in hotels provided insights into guest behavior and can be used to enhance the guest experience, optimize room availability, and improve operational efficiency.

#### Based on these findings, several recommendations can be made to achieve the business objectives:

* Focus on reducing booking cancellations through improved communication and flexible policies.

* Target marketing efforts towards countries with a high number of guests.

* Implement revenue optimization strategies based on revenue trends and performance analysis.

* Enhance guest experience through personalized services, loyalty programs, and efficient check-in/check-out processes.

* Optimize operational efficiency based on booking patterns and demand trends.

* Monitor competitor performance and adapt strategies accordingly.

* Collect and utilize guest feedback to continuously improve services.

By implementing these recommendations, the hotel can improve its overall performance, increase revenue, enhance guest satisfaction, and stay competitive in the market.

It is important to note that the success of these recommendations depends on regular monitoring, analysis of data, and adaptation of strategies based on changing market dynamics and guest preferences. By embracing data-driven decision-making and continuous improvement, the hotel can position itself for long-term success in the highly competitive hospitality industry.