<a href="https://colab.research.google.com/github/rutujagabhane/Hotel-Analysis-Project/blob/main/Hotel_Analysis_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - Hotel Booking Analysis




##### **Project Type**    - EDA
##### **Contribution**    - Individual



# **Project Summary -**


The primary objective of this project is to analyze hotel booking data to identify insights such as seasonality and weekend visitation trends. By understanding these patterns, strategies can be devised to enhance weekday customer engagement and maximize profits from existing weekend bookings. Additionally, the analysis aims to uncover actionable recommendations for improving overall business performance and sustainability. Through data-driven insights, targeted interventions can be implemented to drive growth and optimize operational efficiency.

# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


The dataset comprises booking information for a city hotel and a resort hotel, providing insights into various factors influencing hotel bookings. The objective is to analyze the data and derive actionable insights to address the following key questions:

*   When is the optimal time to book a hotel room for both city and resort hotels?
*   How does the length of stay correlate with the daily rate, and what is the optimal duration for cost-effectiveness?
*   Can we predict whether a hotel is likely to receive an unusually high number of special requests based on booking details?
*   What factors significantly impact booking decisions, including lead time, meal plans, market segments, and customer types?
*   How do demographic factors like the number of adults, children, and babies affect booking patterns?
*   What is the influence of available parking spaces on booking behavior and guest satisfaction?
*   What trends in cancellation rates can be identified, and how can we mitigate cancellations effectively?
*   What insights can we glean from the distribution of booking channels and market segments?
*   How does past booking history, including previous cancellations, influence future booking decisions?
*   How can the findings be leveraged to optimize revenue, enhance guest experience, and improve operational efficiency?

#### **Define Your Business Objective?**


The objective of the project is to analyze business data to understand seasonal trends and weekend visitor patterns, ultimately devising strategies to enhance weekend customer engagement and maximize profitability. Through rigorous analysis and strategic implementation, the project aims to uncover actionable insights that drive business growth and optimize performance in a dynamic market environment.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many number of charts you want. Make Sure for each and every chart the following format should be answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

### Dataset Loading

In [None]:
from google.colab import drive
drive.mount('/content/drive')


In [None]:
df = pd.read_csv('/content/drive/MyDrive/Almabetter/Hotel Bookings.csv')

In [None]:
# Alternative location: use this path when the CSV was uploaded directly to the Colab session
df = pd.read_csv('/content/Hotel Bookings.csv')
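
The two cells above read the same file from two different locations, so whichever location is missing raises an error when the notebook is run in one go. A minimal sketch of one way around this, assuming a hypothetical helper named `load_first_available` (not part of pandas or Colab), is to try each candidate path in turn:

```python
from pathlib import Path

import pandas as pd


def load_first_available(candidate_paths):
    """Return a DataFrame read from the first candidate path that exists."""
    for path in candidate_paths:
        if Path(path).is_file():
            return pd.read_csv(path)
    raise FileNotFoundError(f"None of these files exist: {list(candidate_paths)}")


# The two locations used in the cells above
CANDIDATE_PATHS = [
    '/content/drive/MyDrive/Almabetter/Hotel Bookings.csv',
    '/content/Hotel Bookings.csv',
]

# Usage in the notebook (left commented out so the sketch runs anywhere):
# df = load_first_available(CANDIDATE_PATHS)
```

The helper raises a single, descriptive `FileNotFoundError` only when none of the locations is available.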

### Dataset First View

In [None]:
# Dataset First Look
df

In [None]:
df.head()

In [None]:
df.tail()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
print("Duplicate rows:", df.duplicated().sum())
# Drop the duplicate rows in place
df.drop_duplicates(inplace=True)
df

In [None]:
df.shape

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.isnull().sum()

In [None]:
# Visualizing the missing values
# Heatmap of missing values: one row per booking, one column per variable
plt.figure(figsize=(50, 30))
sns.heatmap(df.isnull(), cmap='viridis', cbar=False, yticklabels=False)
plt.title('Missing Values Heatmap')
plt.show()


In [None]:
# Calculate the number of missing values in each column
missing_values_count = df.isnull().sum()

# Plot the number of missing values
plt.figure(figsize=(50, 30))
missing_values_count.plot(kind='bar', color='skyblue')
plt.title('Missing Values by Column')
plt.xlabel('Columns')
plt.ylabel('Number of Missing Values')
plt.xticks(rotation=45, ha='right')  # Rotate x-axis labels for better visibility
plt.show()
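
Alongside the two plots, a small table of counts and percentages can make the gaps easier to read. The sketch below assumes a hypothetical helper named `missing_summary` and uses a made-up frame purely for illustration; in the notebook it would be called on `df`.

```python
import numpy as np
import pandas as pd


def missing_summary(frame):
    """Return missing counts and percentages for columns that have gaps."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(2),
    })
    # Keep only columns with at least one missing value, largest first
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Tiny made-up frame for illustration only (not the hotel data)
sample = pd.DataFrame({
    "country": ["PRT", None, "GBR", None],
    "agent": [9.0, np.nan, np.nan, np.nan],
    "adr": [75.0, 80.5, 60.0, 99.0],
})
print(missing_summary(sample))
```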


### What did you know about your dataset?

In [None]:
df.columns

*   `hotel`: City Hotel, Resort Hotel
*   `is_canceled`: 0 (False), 1 (True)
*   `lead_time`: Number of days that elapsed between the entering date of the booking into the PMS and the arrival date
*   `arrival_date_year`: Year of arrival date (2015, 2016, 2017)
*   `arrival_date_month`: Month of arrival date (January to December)
*   `arrival_date_week_number`: Week number of the arrival date
*   `arrival_date_day_of_month`: 1-31
*   `stays_in_weekend_nights`: 0-19
*   `stays_in_week_nights`: 0-50
*   `adults`: 0-55
*   `children`: 0, 1, 2, 3, 10
*   `babies`: 0, 1, 2, 9, 10
*   `meal`: BB, FB, HB, SC, UNDEFINED
*   `country`: Guest's native country
*   `market_segment`: Aviation, Offline TA/TO, Complementary, Corporate, Direct, UNDEFINED, Groups, Online TA
*   `distribution_channel`: Corporate, Direct, TA/TO, UNDEFINED, GDS
*   `is_repeated_guest`: 0 (False), 1 (True)
*   `previous_cancellations`: 0-26, cancellations made before the current booking
*   `previous_bookings_not_canceled`: 0-7, earlier bookings that were not canceled
*   `reserved_room_type`: A-P
*   `assigned_room_type`: A-P
*   `booking_changes`: 0-21, count of changes made to the booking
*   `deposit_type`: No Deposit, Non Refund, Refundable
*   `agent`: Values vary by booking
*   `company`: 40, 185, 225, 405
*   `days_in_waiting_list`: 0, 53, 150; number of days in the waiting list
*   `customer_type`: Type of customer
*   `adr`: Values vary by booking (average daily rate)
*   `required_car_parking_spaces`: 0, 1, 2
*   `total_of_special_requests`: 0, 1, 2, 3
*   `reservation_status`: Check-Out, Canceled, NO_SHOW
*   `reservation_status_date`: Date of the specific status

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
df.columns

### Variables Description

In [None]:
# Dataset Describe
df.describe()





### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
print(df.apply(lambda col: col.unique()))

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

# Fill missing countries with the placeholder 'others'
df["country"] = df["country"].fillna('others')
# Fill missing agent and company IDs with 0
df["agent"] = df["agent"].fillna(0)
df["company"] = df["company"].fillna(0)

# Confirm the remaining missing values per column
df.isnull().sum()

### What all manipulations have you done and insights you found?

Answer Here.

---



In [None]:
# Recode the binary flags as readable labels
df["is_canceled"] = df["is_canceled"].replace({1: 'is_canceled', 0: 'not_canceled'})
df["is_repeated_guest"] = df["is_repeated_guest"].replace({0: 'No', 1: 'Yes'})

In [None]:
filtered_df = df[df["is_repeated_guest"] == 'Yes']
print(filtered_df["is_repeated_guest"].head(50))

In [None]:
df["Total_guest"] = df["adults"]+df["children"]+df["babies"]
df["Total_guest"]
df

In [None]:
df.info()

In [None]:
# Assuming df is your DataFrame

month_dict = {'January': '01', 'February': '02', 'March': '03', 'April': '04', 'May': '05', 'June': '06',
              'July': '07', 'August': '08', 'September': '09', 'October': '10', 'November': '11', 'December': '12'}


df["Arriving_date"] = pd.to_datetime(df["arrival_date_year"].astype(str) + '-' + df["arrival_date_month"].apply(lambda x: month_dict[x]) + '-' + df["arrival_date_day_of_month"].astype(str), errors='coerce')
df
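
As an alternative to the month lookup dictionary, `pd.to_datetime` can parse full month names directly through an explicit format string. This is only a sketch on made-up rows, and it assumes the month names are spelled in English as in the dataset:

```python
import pandas as pd

# Made-up rows that mimic the three arrival-date columns
sample = pd.DataFrame({
    "arrival_date_year": [2015, 2016, 2017],
    "arrival_date_month": ["July", "February", "December"],
    "arrival_date_day_of_month": [1, 30, 25],
})

# Join the parts into strings such as "2015-July-1"
date_text = (
    sample["arrival_date_year"].astype(str) + "-"
    + sample["arrival_date_month"].astype(str) + "-"
    + sample["arrival_date_day_of_month"].astype(str)
)

# %B matches the full month name; impossible dates (30 February) become NaT
sample["Arriving_date"] = pd.to_datetime(date_text, format="%Y-%B-%d", errors="coerce")
print(sample["Arriving_date"])
```

Because of `errors="coerce"`, rows that fail to parse end up as `NaT`, so it is worth checking `isnull().sum()` on the new column afterwards.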


In [None]:
import pandas as pd

df["reservation_status_date"] = pd.to_datetime(df["reservation_status_date"])
df["date_difference"] = (df["reservation_status_date"] - df["Arriving_date"])
df


In [None]:
df["total_stay"] = df["stays_in_weekend_nights"]+df["stays_in_week_nights"]
df["total_stay"]

In [None]:
df["revenue"]=df["total_stay"]*df["adr"]
df["revenue"]

In [None]:
hotel_type_wise_revenue = df.groupby("hotel")["revenue"].sum()
hotel_type_wise_revenue
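
The revenue total above includes every booking, canceled or not. One possible refinement, under the assumption that only bookings that were not canceled count as realized revenue, is sketched below on made-up rows (the label values mirror the recoding of `is_canceled` done earlier):

```python
import pandas as pd

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "is_canceled": ["not_canceled", "is_canceled", "not_canceled", "not_canceled"],
    "total_stay": [2, 3, 4, 1],
    "adr": [100.0, 90.0, 50.0, 80.0],
})
sample["revenue"] = sample["total_stay"] * sample["adr"]

# Assumption: canceled bookings contribute no realized revenue
realized = sample[sample["is_canceled"] == "not_canceled"]
realized_revenue = realized.groupby("hotel")["revenue"].sum()
print(realized_revenue)
```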

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Count bookings per hotel type
filtered_df1 = (df["hotel"] == 'Resort Hotel').sum()
filtered_df2 = (df["hotel"] == 'City Hotel').sum()

labels = ['Resort Hotel', 'City Hotel']
values = [filtered_df1, filtered_df2]

# Create a pie chart with percentage labels
plt.figure(figsize=(8, 8))
plt.pie(values, labels=labels, autopct='%1.1f%%')

# Set aspect ratio to be equal, ensuring the pie is drawn as a circle.
plt.axis('equal')

# Show the plot
plt.show()



##### 1. Why did you pick the specific chart?

1. This is categorical data.

2. Categorical data is easy to visualise with a pie chart when there are only a few categories.

##### 2. What is/are the insight(s) found from the chart?

1. City hotels receive more bookings than resort hotels.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The chart shows that guests book city hotels more often, so expanding the number of hotels and services in city areas could increase profit.

#### Chart - 2

In [None]:
# Count bookings from repeated and non-repeated guests
# (is_repeated_guest was recoded to 'Yes'/'No' during data wrangling)
df1 = (df["is_repeated_guest"] == 'Yes').sum()
df2 = (df["is_repeated_guest"] == 'No').sum()



In [None]:
labels = ['repeated_guest', 'nonrepeated_guest']
values = [df1, df2]

plt.figure(figsize=(8, 8))
# Create a pie chart with percentage labels
plt.pie(values, labels=labels, autopct='%1.1f%%')

# Set aspect ratio to be equal, ensuring the pie is drawn as a circle.
plt.axis('equal')

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

1. This is categorical data.

2. Categorical data is easy to visualise with a pie chart when there are only a few categories.

##### 2. What is/are the insight(s) found from the chart?

The pie chart shows that 96.1% of guests are non-repeat guests and only 3.9% are repeat guests. This suggests a low customer retention rate, which can be a concern for a business.




##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.






We cannot say outright that this leads to negative growth; more information is needed to identify which types of customers mainly visit. If most guests are one-time visitors exploring the destination, a low repeat ratio is to be expected; otherwise it can lead to negative growth. The business might need to consider strategies to improve customer satisfaction and encourage repeat visits. Analyzing what differentiates the 3.9% who return could provide valuable insights into attracting and retaining future customers. It is also important to consider whether external factors are affecting the customer base, such as new competitors or changes in the industry. This is a single data point, and further investigation would be needed to understand the full picture of the business's customer base and identify potential solutions.

#### Chart - 3

In [None]:
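# Chart - 3 visualization code
# Note: sns.barplot aggregates with the mean by default, so each bar is the average Total_guest per booking
sns.barplot(x='distribution_channel', y='Total_guest', hue='distribution_channel', data=df, palette='Set2')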

##### 1. Why did you pick the specific chart?



I chose this chart because a bar chart is designed to compare categories. Here it compares the number of guests across the different distribution channels. A pie chart could also represent this data, but pie charts work best with two or three categories; because this data has more than three, a bar chart is the better choice.

##### 2. What is/are the insight(s) found from the chart?

The main traffic is direct.
Some possible reasons for the strength of the Direct channel:

1. Increased popularity of online travel agencies (OTAs) that allow hotels to sell directly to consumers

2. Hotels are investing more in their own websites and marketing channels to drive direct bookings

3. Consumers are increasingly comfortable booking directly with hotels, knowing they can get the best rates and deals


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the bar graph can help create a positive business impact. By visualizing the number of guests acquired through different distribution channels, the company can identify which channels are the most successful at bringing in guests. This information can be used to focus marketing efforts on the most successful channels, improve the performance of less successful channels, and develop new distribution channels.
Overall, the insights gained from the bar graph can help the company improve its distribution strategy and increase its overall sales.
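
Because `sns.barplot` aggregates with the mean by default, a channel's bar height reflects guests per booking, not its total guest volume. A minimal sketch on made-up rows shows how the two views can rank channels differently; in the notebook the same `groupby` would run on `df`:

```python
import pandas as pd

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "distribution_channel": ["Direct", "Direct", "TA/TO", "TA/TO", "TA/TO", "TA/TO", "Corporate"],
    "Total_guest": [3, 3, 2, 2, 2, 2, 1],
})

# Bookings, total guests, and mean guests per booking for each channel
per_channel = sample.groupby("distribution_channel")["Total_guest"].agg(
    bookings="count", total_guests="sum", mean_guests="mean"
)
print(per_channel.sort_values("total_guests", ascending=False))
```

In this toy data "Direct" has the highest mean per booking while "TA/TO" brings the most guests in total, which is why it helps to state which aggregate a chart shows.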

#### Chart - 4

---




In [None]:
df.columns

In [None]:
# Count bookings with exactly one previous cancellation
df_3 = (df["previous_cancellations"] == 1).sum()
# Count bookings with exactly one previous booking that was not canceled
df_4 = (df["previous_bookings_not_canceled"] == 1).sum()

plt.figure(figsize=(8, 8))
values = [df_3, df_4]
labels = ["is_canceled", "not_canceled"]
plt.pie(values, labels=labels, autopct='%1.1f%%')  # Add percentage labels

plt.show()


##### 1. Why did you pick the specific chart?





In this case, the categories are the bookings that were previously canceled and the bookings that were not previously canceled. A pie chart is a good choice for this type of data because it is easy to read and understand. The chart shows more bookings in the "not_canceled" category than in the "is_canceled" category, meaning a higher percentage of bookings were not previously canceled. Even so, we should focus on increasing the number of customers in the not_canceled section.

##### 2. What is/are the insight(s) found from the chart?



The chart shows that slightly more than half of the bookings were not canceled (51.3%), while the remaining bookings were canceled (48.7%). In other words, there is a nearly even split between canceled and not canceled bookings.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The pie chart helps. A larger "not canceled" share suggests lower cancellation rates, leading to positive impacts like more revenue and happier guests.
However, the split here is close to even, and a large or growing cancellation slice could signal problems like unhappy guests or booking issues, potentially hurting growth.
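
The pie above is built from the booking-history columns. For the cancellation trends raised in the problem statement, the share of canceled bookings can also be read straight from `is_canceled`. A minimal sketch on made-up rows, assuming the column carries the labels assigned during data wrangling:

```python
import pandas as pd

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 4,
    "is_canceled": ["is_canceled", "is_canceled", "not_canceled", "not_canceled",
                    "is_canceled", "not_canceled", "not_canceled", "not_canceled"],
})

# Boolean flag per booking; its mean is the cancellation rate
canceled = sample["is_canceled"] == "is_canceled"
overall_rate = canceled.mean()
rate_by_hotel = canceled.groupby(sample["hotel"]).mean()

print(f"Overall cancellation rate: {overall_rate:.1%}")
print(rate_by_hotel)
```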

#### Chart - 5

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Assuming df is your DataFrame
contry_wise_guest1 = pd.DataFrame({
    "country": df["country"],
    "Total_guest": df["Total_guest"]
})

# Calculating the sum of "Total_guest" for each country
contry_wise_guest = contry_wise_guest1.groupby("country")["Total_guest"].sum()

# Sorting the countries based on the total number of guests
sorted_countries = contry_wise_guest.sort_values(ascending=False).index

# Selecting the top 10 countries
top_10_countries = sorted_countries[:10]

# Filtering the DataFrame for the top 10 countries
df_sorted = contry_wise_guest1[contry_wise_guest1["country"].isin(top_10_countries)].copy()

# Convert the "country" column to categorical
df_sorted["country"] = pd.Categorical(df_sorted["country"])

# Creating a bar plot
# Note: sns.barplot draws the mean of Total_guest per country by default
plt.figure(figsize=(12, 6))
sns.barplot(x="country", y="Total_guest", hue="country", data=df_sorted, palette="Set2")

# Set labels and title
plt.xlabel("Country")
plt.ylabel("Total Guests")
plt.title("Top 10 Countries by Total Number of Guests")

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

This is categorical data. It captures the demand for the different room types, so we can easily see which room type is in greatest demand.

##### 2. What is/are the insight(s) found from the chart?

Demand for room type A is the highest, so we should focus on the availability of this room type. We should also consider which features could be added to the other room types, so that they can serve as substitutes when more type A rooms are needed. Offering guests the room of their choice is a priority for good reviews.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The relative popularity of each room type can help businesses identify which types of rooms to invest in. The growth or decline in the popularity of a particular room type over time can help businesses identify trends and adjust their room inventory. The insights gained from a bar graph of reserved rooms by type can be a valuable tool for improving room offerings and increasing overall profitability.
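
The answers for this chart discuss demand by reserved room type, whereas the code cell for this chart plots countries. A minimal sketch of the counts that such a room-type chart would rest on, using made-up rows and the dataset's `reserved_room_type` column name:

```python
import pandas as pd

# Made-up reservations for illustration only
sample = pd.DataFrame({"reserved_room_type": ["A", "A", "D", "A", "E", "D"]})

# Number of bookings per reserved room type, most requested first
room_demand = sample["reserved_room_type"].value_counts()
print(room_demand)

# In the notebook, a matching chart could be drawn with (assumes seaborn as sns):
# sns.countplot(x="reserved_room_type", data=df, order=room_demand.index)
```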

#### Chart - 6

In [None]:
# Chart - 6 visualization code
# Creating a bar plot
plt.figure(figsize=(12, 6))
sns.barplot(x="country", y="Total_guest", hue="country", data=df_sorted, palette="Set2", legend=False)

# Set labels and title
plt.xlabel("Country")
plt.ylabel("Total Guests")
plt.title("Top 10 Countries by Total Number of Guests")

# Show the plot
plt.show()




##### 1. Why did you pick the specific chart?

---



The chart compares the total number of guests from different countries.
It compares a single variable (total number of guests) across multiple categories (countries). The data is categorical (countries) rather than continuous, and the main focus is on comparing guest numbers between countries.

##### 2. What is/are the insight(s) found from the chart?

From the chart above, most visitors are from Brazil. We should offer more food from this country and learn more about its culture.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

No. The chart only shows which countries' visitors are most interested.

#### Chart - 7

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Define the order of months
month_order = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]

# Convert the "arrival_date_month" column to a categorical data type with the specified order
df["arrival_date_month"] = pd.Categorical(df["arrival_date_month"], categories=month_order, ordered=True)

plt.figure(figsize=(12, 6))
sns.barplot(x="arrival_date_month", y="Total_guest", hue= "arrival_date_month" ,data=df, palette="Set3")
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.show()


##### 1. Why did you pick the specific chart?

I chose this chart because it effectively shows the number of guests per month. Bar charts are ideal for displaying data that is categorized along one axis, with the heights of the bars representing the value for each category. It is easy to compare the number of guests across months because the bars have the same width. Other chart types, such as a line chart, might not have been as suitable; line charts are typically used to show trends over time.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that the number of guests rises through the first part of the year. January through July show a steady increase, with the highest number of guests in August. The number of guests then decreases from September to November.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


Yes, the gained insights from the graph  could help create a positive business impact. The graph shows an overall increasing trend in the number of total guests over the past year, which suggests increasing demand. This could be helpful for businesses in planning for the future, such as staffing or inventory levels.

However, there are also some insights that could lead to negative growth. For instance, there is a significant decrease in the number of guests from September to November. This could be a cause for concern, as it suggests a seasonal dip in business. Businesses would need to take steps to mitigate this, such as running promotions or discounts during this period.
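
To put numbers behind the seasonal pattern, guest totals per month can be computed in calendar order. The sketch below uses made-up rows; `month_order` repeats the list from the chart code, and `observed=False` is used so that months without bookings still appear with a total of zero:

```python
import pandas as pd

month_order = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "arrival_date_month": ["August", "January", "August", "July", "January"],
    "Total_guest": [2, 1, 3, 2, 2],
})
sample["arrival_date_month"] = pd.Categorical(
    sample["arrival_date_month"], categories=month_order, ordered=True
)

# Sum guests per month, keeping the calendar order of the categories
monthly_guests = sample.groupby("arrival_date_month", observed=False)["Total_guest"].sum()
print(monthly_guests)
```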



#### Chart - 8

In [None]:
# Chart - 8 visualization code
import seaborn as sns
import matplotlib.pyplot as plt

def bar_chart(dataframe, x_column, hue_column):
    sns.countplot(x=x_column, hue=hue_column, data=dataframe)
    plt.show()

# Example usage
bar_chart(df, 'assigned_room_type', 'assigned_room_type')

##### 1. Why did you pick the specific chart?

Room type is categorical data, which suits a bar plot. From the chart of reserved room types we know that demand for room type A is the greatest, and room type A is also the type assigned most often.

##### 2. What is/are the insight(s) found from the chart?

Comparing the two graphs, we conclude that we are good at meeting customers' demand.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The insights point to a positive business impact because we have sufficient rooms and demand is meeting customers' expectations. Since room type A is assigned the most, we can say that we almost meet customers' needs, but we should look into why the other room types are not in demand.
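
Comparing two charts by eye can be backed up with a direct check of how often the assigned room type equals the reserved one. A minimal sketch on made-up rows, using the dataset's two room-type column names:

```python
import pandas as pd

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "reserved_room_type": ["A", "A", "A", "D", "D", "E"],
    "assigned_room_type": ["A", "A", "D", "D", "D", "F"],
})

# True where the guest received the room type that was reserved
matched = sample["reserved_room_type"] == sample["assigned_room_type"]
match_rate = matched.mean()
match_rate_by_type = matched.groupby(sample["reserved_room_type"]).mean()

print(f"Share of bookings assigned the reserved room type: {match_rate:.1%}")
print(match_rate_by_type)
```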

#### Chart - 9

In [None]:
# Chart - 9 visualization code
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
sns.barplot(x="hotel", y="adr", hue="hotel", data=df, palette="Set2", dodge=False)
plt.xlabel("Hotel")
plt.ylabel("Average Daily Rate (ADR)")
plt.title("Average Daily Rate (ADR) by Hotel")
##plt.legend([], frameon=False)  # To remove the legend
plt.show()


##### 1. Why did you pick the specific chart?

The chart is a bar chart that compares the average daily rate (ADR) of two different hotel types: Resort Hotel and City Hotel.  A bar chart is a good choice for this data because it effectively compares different categories.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that resort hotels have a higher ADR than city hotels. This means that people typically pay more per night to stay in a resort hotel than a city hotel. There are a few reasons why this might be the case. Resort hotels are often located in desirable destinations, such as beaches or mountains. People may be willing to pay more to stay in a location that offers them a chance to relax and enjoy the scenery. Resort hotels typically offer more amenities than city hotels, such as swimming pools, restaurants, and spas. People may be willing to pay more for a hotel that offers them these amenities.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The graph shows that resort hotels have a higher ADR than city hotels. This suggests that people are willing to pay more for rooms at resort hotels. By understanding the ADR of different hotel types, businesses can make more informed decisions about pricing, marketing, and investment.

#### Chart - 10

In [None]:
# Chart - 10 visualization code
df.columns

In [None]:
# Count bookings for each meal type within each hotel type
hotel_wise_meal = df.groupby(["hotel", "meal"])["meal"].count().unstack()

# Grouped bar chart: one group per hotel, one bar per meal type
hotel_wise_meal.plot(kind='bar', figsize=(10, 6))
hotel_wise_meal

##### 1. Why did you pick the specific chart?

I chose this chart because it compares the two hotel types and drills down into the different meal types within each hotel type.

##### 2. What is/are the insight(s) found from the chart?

In both the city hotel and the resort hotel, the BB meal type is preferred most, so we should increase the variety of similar meal options.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

No, these insights do not lead to negative growth. This information can be valuable for the hotels in planning menus, staffing during peak meal times, and potentially influencing marketing strategies.
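
Because the two hotel types have different booking volumes, raw counts can hide differences in preference. A minimal sketch on made-up rows that converts the counts into shares within each hotel type with `pd.crosstab`:

```python
import pandas as pd

# Made-up bookings for illustration only
sample = pd.DataFrame({
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 4,
    "meal": ["BB", "BB", "BB", "SC", "BB", "BB", "HB", "HB"],
})

# normalize="index" makes every hotel row sum to 1
meal_share = pd.crosstab(sample["hotel"], sample["meal"], normalize="index")
print((meal_share * 100).round(1))
```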

#### Chart - 11

In [None]:
# Chart - 11 visualization code
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 6))
sns.barplot(x="market_segment", y="Total_guest", hue="market_segment", data=df, palette="Set3", dodge=False)
plt.xlabel("Market Segment")
plt.ylabel("Total Guests")
plt.title("Total Guests by Market Segment")
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.legend([], frameon=False)  # To remove the legend
plt.show()


##### 1. Why did you pick the specific chart?

A bar chart is a good way to visualize the number of guests by market segment because it allows for easy comparison between the different segments. Each market segment has its own bar on the chart, and the height of the bar corresponds to the number of guests in that segment. This makes it easy to see which segment has the most guests, and how the other segments compare.

##### 2. What is/are the insight(s) found from the chart?

---



We can see that the hotel has a variety of market segments, including Direct, Corporate, Online TA, Offline TA/TO, Complementary, Groups, and Undefined. Online TA is among the largest segments, which suggests that the hotel may get a significant number of bookings through online travel agencies.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Understanding guest segments from a bar graph sparks better marketing. Bigger bars mean bigger opportunities. Businesses can target high-potential segments with focused campaigns. Lower numbers might indicate areas for improvement, like tracking undefined sources. Analyzing trends over time allows for strategic adjustments. Overall, this data is key to attracting more guests and boosting business.

#### Chart - 12

In [None]:
# Correlation heatmap visualization code
corr_df = df[['lead_time', 'previous_cancellations', 'previous_bookings_not_canceled', 'Total_guest',
              'booking_changes', 'days_in_waiting_list', 'adr', 'required_car_parking_spaces',
              'total_of_special_requests']].corr()
f, ax = plt.subplots(figsize=(12, 12))
sns.heatmap(corr_df, annot=True, fmt='.2f', annot_kws={'size': 10}, vmax=1, square=True, cmap="YlGnBu")

##### 1. Why did you pick the specific chart?

To understand the relationships among the numerical variables.

##### 2. What is/are the insight(s) found from the chart?

The highest correlation between two different variables is 0.39 (positive), and the lowest is -0.09 (negative).
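
Reading the extremes off a heatmap by eye is error-prone, so the strongest and weakest pairs can also be extracted from the correlation matrix itself. The sketch below assumes a hypothetical helper named `extreme_correlations` and runs on made-up numbers; in the notebook it would be called on `corr_df`:

```python
import numpy as np
import pandas as pd


def extreme_correlations(corr):
    """Return the most positive and most negative off-diagonal pairs."""
    # Keep only the upper triangle (k=1 skips the diagonal of 1.0 values)
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    return pairs.idxmax(), pairs.max(), pairs.idxmin(), pairs.min()


# Made-up numeric columns for illustration only
sample = pd.DataFrame({
    "lead_time": [10, 20, 30, 40, 50],
    "adr": [50, 60, 65, 80, 90],
    "booking_changes": [4, 3, 3, 1, 0],
})

top_pair, top_value, low_pair, low_value = extreme_correlations(sample.corr())
print("Most positive:", top_pair, round(top_value, 2))
print("Most negative:", low_pair, round(low_value, 2))
```

Masking the diagonal matters here: without it, every variable's correlation of 1.0 with itself would be reported as the maximum.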

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

The following business aims were met:
1. For the hotel industry to prosper, a few factors must be taken into account, including high revenue generation, customer happiness, and employee retention.
2. Using a pie chart distribution, we can show the client which months generate the most revenue.
3. Revenue can be increased by using bar charts to show which room types are booked most frequently and when visitors are most likely to travel.
4. As a result, the client can prepare properly in advance, minimising long-term complaints and contributing to further improvement of their hospitality.
5. Outliers, such as visitor numbers larger than average, were scattered across the plots; encouraging clients to contact offices for bulk reservations during the off-season can help generate more revenue.
6. We can display the trend in visitor arrivals at client venues, allowing clients to plan in advance for visitors' entertainment and leisure activities.
7. We were also able to correlate the numerical variables and identify the maximum and minimum correlation values, so that the weaker relationships can be improved through a variety of means.

# **Conclusion**

When is the optimal time to book a hotel room for both city and resort hotels?
How does the length of stay correlate with the daily rate, and what is the optimal duration for cost-effectiveness?

1. City Hotel generates greater income and profit and appears to be more popular among travellers.

2. Compared to the other months, the majority of reservations are made in July and August. Booking details and monthly demand can help predict whether a hotel is likely to receive an unusually high number of special requests.

3. Travellers favour room type A over all other room types. Demographic factors like the number of adults, children, and babies directly affect booking patterns for room type.

4. As a result, the client can prepare properly in advance, minimising long-term complaints and contributing to further improvement of their hospitality.

5. Most traffic comes from online sources, as booking online involves lower brokerage costs for the customer.

6. Outliers, such as visitor numbers larger than average, were scattered across the plots; encouraging clients to contact offices for bulk reservations during the off-season can help generate more revenue.







### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***