# **Project Name**    - Hotel Booking Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual


# **Project Summary -**

The Hotel Booking Analysis dataset includes information about the booking frequency and length of stay, as well as customer booking preferences and behaviors. The dataset also includes information about the characteristics of the hotel such as the type of room booked, the price, and the cancellation policy. The data can be used to analyze the customer booking patterns and predict future booking trends. Additionally, the data can be used to identify customer preferences and tailor marketing and pricing strategies accordingly.

# **GitHub Link -**

https://github.com/Dipak9699-ds/Internship/tree/main/Almabetter%20Capstone%20Projects

# **Problem Statement**


**For this project we will be analysing hotel booking data. This dataset contains booking information for a city hotel and a resort hotel, and includes information such as when the booking was made, length of stay, the number of adults, children, or babies, and the number of available parking spaces.**

**The hotel industry is very volatile, and bookings depend on the above factors and many more. Our main objective is to perform EDA on the given dataset and draw useful conclusions about general trends in hotel bookings and how the factors governing hotel bookings interact with each other.**

**The main objective behind this project is to explore and analyse the data to discover the important factors that govern bookings and to give insights to hotel management, who can run various campaigns to boost business and performance.**

#### **Define Your Business Objective?**

The business objective of the hotel booking dataset is to identify patterns in customer behavior in order to optimize hotel booking decisions and maximize revenue. This includes analyzing customer demographics, booking preferences, and other factors that influence bookings. Additionally, the data can be used to identify potential areas of improvement, such as offering targeted promotions or discounts to certain customer segments or providing better customer service.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required. 
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits. 
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that for each chart the following format is answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the visualization in a structured way while following the "UBM" rule. 

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

### Dataset Loading

In [None]:
# Mounting Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load Dataset
hotel_df = pd.read_csv(r'/content/drive/MyDrive/AlmaBetter Capstone Projects/Datasets/Hotel Bookings.csv')
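
Since the guidelines ask for exception handling, the load step could be guarded. A minimal sketch, assuming a hypothetical helper `load_bookings` that is not part of the original notebook:

```python
from pathlib import Path

import pandas as pd


def load_bookings(csv_path):
    """Hypothetical helper: read the bookings CSV or fail with a clear message."""
    path = Path(csv_path)
    # Fail early with a readable error instead of a long pandas traceback
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at: {path}")
    df = pd.read_csv(path)
    # A header-only file parses fine but is useless for analysis
    if df.empty:
        raise ValueError(f"Dataset at {path} contains no rows")
    return df
```

In the notebook, this would be called with the same Drive path used in the cell above.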

### Dataset First View

In [None]:
# Dataset First Look
hotel_df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
hotel_df.shape

### Dataset Information

In [None]:
# Dataset Info
hotel_df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate or Non Duplicate Value Count
hotel_df.duplicated().value_counts()

In [None]:
# Dataset Duplicate Value Count
len(hotel_df[hotel_df.duplicated()])

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
print(hotel_df.isna().sum())

In [None]:
# Visualizing the missing values
sns.heatmap(hotel_df.isnull(), cbar=False)

### What did you know about your dataset?

The dataset comes from the hotel industry, and we have to analyse the hotel booking data.

This dataset contains booking information such as when the booking was made, length of stay, the number of adults and children, and the number of available parking spaces.

The hotel industry is very volatile, and bookings depend on the above factors and many more.

The dataset has 119390 rows and 32 columns. There are missing values in the country, agent, company and children columns, and the dataset also contains duplicate rows.
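
Raw null counts are easier to judge as percentages of the row count. A minimal sketch, assuming a hypothetical helper `missing_summary` and a toy frame standing in for `hotel_df` (the values are illustrative only):

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Return count and percentage of missing values per column, worst first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have missing values
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy frame standing in for hotel_df
toy = pd.DataFrame({
    "country": ["PRT", None, "GBR", "FRA"],
    "agent": [9.0, np.nan, np.nan, 240.0],
    "adults": [2, 2, 1, 2],
})
print(missing_summary(toy))
```

On the real data the same call would be `missing_summary(hotel_df)`.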

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
hotel_df.columns

In [None]:
# Dataset Describe
hotel_df.describe(include='all').style.background_gradient()

###### We see that there are 32 columns in the dataframe, and columns such as 'children', 'company', 'country' and 'agent' have null values.

### Variables Description 

- **hotel**: Name of hotel (City or Resort)
- **is_canceled**: Whether the booking was canceled or not (0 for not canceled and 1 for canceled)
- **lead_time**: time (in days) between booking transaction and actual arrival.
- **arrival_date_year**: Year of arrival date.
- **arrival_date_month**: Month of arrival date.
- **arrival_date_week_number**: Week number of arrival date.
- **arrival_date_day_of_month**: Day of month of arrival date.
- **stays_in_weekend_nights**: No. of weekend nights spent in a hotel
- **stays_in_week_nights**: No. of weeknights spent in a hotel
- **adults**: No. of adults in single booking record.
- **children**: No. of children in single booking record.
- **babies**: No. of babies in single booking record. 
- **meal**: Type of meal chosen.
- **country**: Country of origin of customers (as mentioned by them)
- **market_segment**: Market segment through which the booking was made and for what purpose.
- **distribution_channel**: Medium through which the booking was made.
- **is_repeated_guest**: Whether the customer has made any booking before (0 for No and 1 for Yes)
- **previous_cancellations**: No. of previous bookings cancelled by the customer prior to the current booking.
- **previous_bookings_not_canceled**: No. of previous bookings not canceled by the customer prior to the current booking.
- **reserved_room_type**: Room type reserved by a customer.
- **assigned_room_type**: Room type assigned to the customer.
- **booking_changes**: No. of booking changes done by customers.
- **deposit_type**: Type of deposit at the time of making a booking (No Deposit / Refundable / Non Refund)
- **agent**: Id of the travel agent that made booking.
- **company**: Id of the company that made a booking.
- **days_in_waiting_list**: No. of days the booking was in the waiting list before it was confirmed to the customer.
- **customer_type**: Type of customer (Transient, Group, etc.)
- **adr**: Average daily rate, defined as the sum of all lodging transactions divided by the total number of staying nights.
- **required_car_parking_spaces**: No. of car parking spaces required by the customer.
- **total_of_special_requests**: Total no. of special requests made by the customer.
- **reservation_status**: Whether the customer has checked out, canceled, or not shown up.
- **reservation_status_date**: Date on which the last reservation status was set.

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in hotel_df.columns:
  print(str(column) + ' : ' + str(hotel_df[column].unique()))
  print('____________________________________________')

In [None]:
# Check Unique Values for each variable.
for i in hotel_df.columns.tolist():
  print("No. of unique values in ",i,"is",hotel_df[i].nunique(),".")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
# Create a copy of the current dataset and assign it to hotel_df1
hotel_df1 = hotel_df.copy()

## Cleaning data
Cleaning data is a crucial step before EDA, as it removes ambiguous data that can affect the outcome of the EDA.

While cleaning the data we will perform the following steps:

*   Remove duplicate rows.
*   Handle missing values.
*   Convert columns to appropriate datatypes.
*   Add important columns.




## Step 1: Removing duplicate rows, if any

In [None]:
# Show the number of duplicate rows
hotel_df1[hotel_df1.duplicated()].shape

In [None]:
# Dropping duplicate values
hotel_df1.drop_duplicates(inplace = True)

In [None]:
# Show the number of rows after removing duplicate rows
hotel_df1.shape

## Step 2: Handling missing values.

In [None]:
# Check columns having missing values.
hotel_df1.isnull().sum().sort_values(ascending = False)

In [None]:
# Show only missing value columns.
hotel_df1.isnull().sum().sort_values(ascending = False)[:4]

The company and agent columns hold company and agent ID numbers. In some cases the customer did not book through any agent or company, so the values in these columns can legitimately be null.
We will replace the null values in these columns with 0.

In [None]:
hotel_df1[['company','agent']] = hotel_df1[['company','agent']].fillna(0)

In [None]:
hotel_df1['children'].unique()

The 'children' column already contains 0 as a value, which means no children were present in the group that made the booking.
So the 'nan' values are missing values caused by data recording errors.

We will replace the null values in this column with the mean value of children.

In [None]:
# Assign the result back instead of using inplace on a column selection
hotel_df1['children'] = hotel_df1['children'].fillna(hotel_df1['children'].mean())

The next column with missing values is 'country'. This column represents the customer's country of origin.
Since this column has a string datatype and the true country is unknown, we will replace the missing values with the label 'others'.

In [None]:
hotel_df1['country'] = hotel_df1['country'].fillna('others')

In [None]:
# Checking if all null values are removed.
hotel_df1.isnull().sum().sort_values(ascending = False)[:4]

There are some rows where the total number of adults, children and babies is zero. We will remove such rows.

In [None]:
hotel_df1[hotel_df1['adults'] + hotel_df1['babies'] + hotel_df1['children'] == 0].shape

In [None]:
hotel_df1.drop(hotel_df1[hotel_df1['adults'] + hotel_df1['babies'] + hotel_df1['children'] == 0].index, inplace = True)

## Step 3: Converting columns to appropriate datatypes.

In [None]:
# Converting datatype of columns 'children', 'company' and 'agent' from float to int.
hotel_df1[['children', 'company', 'agent']] = hotel_df1[['children', 'company', 'agent']].astype('int64')
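
One side effect of this cast is worth checking: `astype('int64')` truncates, so any value filled with a fractional mean becomes a whole number. A minimal sketch on a toy Series standing in for `hotel_df1['children']` (illustrative values), with a median fill shown as one possible alternative:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the 'children' column, with one missing value
children = pd.Series([0.0, 0.0, 1.0, 2.0, np.nan, 0.0])

# Mean of the known values is 3 / 5 = 0.6
mean_filled = children.fillna(children.mean())

# Casting to int64 truncates the filled 0.6 down to 0
as_int = mean_filled.astype("int64")

# Alternative: the median is already a whole number here, so nothing is lost
median_filled = children.fillna(children.median()).astype("int64")

print(mean_filled.iloc[4], as_int.iloc[4], median_filled.iloc[4])
```

In this toy case both routes end at 0 for the missing row, but the median fill makes that outcome explicit rather than a by-product of the cast.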

In [None]:
# Changing datatype of column 'reservation_status_date' to datetime.
hotel_df1['reservation_status_date'] = pd.to_datetime(hotel_df1['reservation_status_date'], format = '%Y-%m-%d')

## Step 4: Adding important columns.

In [None]:
# Adding total staying days in hotels
hotel_df1['total_stay'] = hotel_df1['stays_in_weekend_nights'] + hotel_df1['stays_in_week_nights']

# Adding total people num as column, i.e. total people num = num of adults + children + babies
hotel_df1['total_people'] = hotel_df1['adults'] + hotel_df1['children'] + hotel_df1['babies']

We are adding these columns so that we can analyse the length of stay and the size of each party.

In [None]:
hotel_df1.shape

### What all manipulations have you done and insights you found?

Cleaning data is a crucial step before EDA, as it removes ambiguous data that can affect the outcome of the EDA. So first I checked for duplicate rows and removed them from the dataset.

After that I checked for null values and replaced them with 0, the column mean, or the label 'others', depending on the column.

Next I converted the datatypes of some columns: children, company and agent from float to int, and reservation_status_date from object to datetime.

Finally I added two extra columns, total_stay and total_people, to make the visualizations easier to understand. total_stay is the sum of stays_in_weekend_nights and stays_in_week_nights, and total_people is the sum of adults, children and babies.
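
The cleaning steps summarized above can be collected into one function so the whole pipeline runs in a single call. A minimal sketch, assuming a hypothetical helper `clean_bookings` and a toy frame with only the columns the steps touch (the values are illustrative):

```python
import numpy as np
import pandas as pd


def clean_bookings(raw):
    """Hypothetical helper mirroring the wrangling steps described above."""
    df = raw.drop_duplicates().copy()

    # Missing agent/company ids mean "no agent/company"; encode them as 0
    df[["company", "agent"]] = df[["company", "agent"]].fillna(0).astype("int64")
    # Fill missing child counts with the column mean, then cast to int
    df["children"] = df["children"].fillna(df["children"].mean()).astype("int64")
    df["country"] = df["country"].fillna("others")
    df["reservation_status_date"] = pd.to_datetime(
        df["reservation_status_date"], format="%Y-%m-%d"
    )

    # Derived columns
    df["total_stay"] = df["stays_in_weekend_nights"] + df["stays_in_week_nights"]
    df["total_people"] = df["adults"] + df["children"] + df["babies"]

    # Drop bookings with no guests; the index reset is an addition of this sketch
    return df[df["total_people"] > 0].reset_index(drop=True)


raw = pd.DataFrame({
    "adults": [2, 2, 0, 1],
    "children": [0.0, 0.0, 0.0, np.nan],
    "babies": [0, 0, 0, 0],
    "company": [np.nan, np.nan, np.nan, 40.0],
    "agent": [9.0, 9.0, np.nan, np.nan],
    "country": ["PRT", "PRT", None, "GBR"],
    "stays_in_weekend_nights": [1, 1, 0, 2],
    "stays_in_week_nights": [2, 2, 1, 3],
    "reservation_status_date": ["2015-07-01", "2015-07-01", "2015-07-02", "2015-07-03"],
})
clean = clean_bookings(raw)
print(clean)
```

Here the duplicate second row and the zero-guest third row are removed, leaving two cleaned bookings.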

## ***4. Data Vizualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

# EDA (Exploratory Data Analysis)
Univariate Analysis:

#### Chart - 1

Q1) Which hotel is most preferred?

In [None]:
# Chart - 1 Visualization code
# Visualizing by pie chart.
hotel_df1['hotel'].value_counts().plot.pie(explode = [0.05, 0.05], autopct = '%1.1f%%', shadow=True, figsize = (10,8), fontsize=20)   
plt.title('Pie Chart for Most Preferred Hotel')

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature variable.

##### 2. What is/are the insight(s) found from the chart?

City Hotel is the hotel type most preferred by guests, so we can say that City Hotel is the busiest.
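
The shares shown in the pie chart can also be computed directly, which is handy when quoting exact figures. A minimal sketch on a toy Series standing in for `hotel_df1['hotel']` (the values below are illustrative, not the real counts):

```python
import pandas as pd

# Toy stand-in for hotel_df1['hotel']
hotel = pd.Series(["City Hotel"] * 3 + ["Resort Hotel"] * 2, name="hotel")

# normalize=True returns fractions instead of raw counts
share = hotel.value_counts(normalize=True).mul(100).round(1)
print(share)
```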

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on city or resort hotel bookings, as they can be used to identify customer preferences and trends in the hotel industry. For example, these insights could help hotels identify peak booking times and optimize their marketing efforts, or they could be used to offer personalized discounts and packages to target customers.

No, there are no insights that lead to negative growth. The insights gained from data analysis can help businesses make informed decisions and provide better customer experiences. As long as the data is used responsibly, the insights should only have a positive impact on the business.

#### Chart - 2

Q2) Which Agent made the most bookings?

In [None]:
# Count bookings per agent, highest first
highest_bookings = hotel_df1.groupby('agent').size().reset_index(name = 'Most_Bookings').sort_values(by = 'Most_Bookings', ascending = False)

# Agent 0 is the placeholder we used for missing values (bookings made without an agent), so drop it.
highest_bookings = highest_bookings[highest_bookings['agent'] != 0]

# Taking the top 10 agents by number of bookings
top_ten_highest_bookings = highest_bookings[:10]

top_ten_highest_bookings

In [None]:
# Chart - 2 Visualization code
# Visualizing by bar chart.
plt.figure(figsize = (18,8))
sns.barplot(x = top_ten_highest_bookings['agent'], y = top_ten_highest_bookings['Most_Bookings'], order = top_ten_highest_bookings['agent'], palette = 'rocket')
plt.xlabel('Agent No')
plt.ylabel('Number of Bookings')
plt.title("Most Bookings Made by the agent")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to show the number of bookings made by each of the top agents.

##### 2. What is/are the insight(s) found from the chart?

Agent ID 9 made the most hotel bookings.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on most hotel bookings made by the agent. For instance, the insights might reveal certain booking patterns or preferences of customers that the agent can capitalize on, such as offering discounts or special packages to certain customers. Additionally, the insights could also provide the agent with valuable information about the most popular hotels in a given area, allowing them to make better-informed decisions when they are booking hotels.

However, there could also be insights that lead to negative growth. For example, if the insights reveal that customers are consistently booking the same hotels or not taking advantage of discounts or packages being offered, this could lead to a decrease in revenue for the agent. Additionally, if the insights reveal only a limited number of customers that are booking hotels, this could also lead to a decrease in revenue.

#### Chart - 3

Q3) What is the percentage of repeated guests?

In [None]:
# Chart - 3 Visualization code
# Visualizing by pie chart.
hotel_df1['is_repeated_guest'].value_counts().plot.pie(explode = [0.05, 0.05], autopct = '%1.1f%%', shadow = True, figsize = (10,8), fontsize=20)
plt.title("Percentage (%) of repeated guests")

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature or target variable.

##### 2. What is/are the insight(s) found from the chart?

Only 3.9% of guests revisited the hotels; the remaining 96.1% were new guests. The retention rate is therefore low. To retain guests, management should collect feedback from them and try to improve its services.
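
Because `is_repeated_guest` is a 0/1 flag, the retention figure is simply its mean. A minimal sketch on made-up values (not the real column):

```python
import pandas as pd

# Made-up 0/1 flags standing in for hotel_df1['is_repeated_guest']
is_repeated_guest = pd.Series([0, 0, 0, 1, 0, 0, 0, 0, 1, 0])

# The mean of a binary flag is the fraction of rows equal to 1
repeat_rate = is_repeated_guest.mean() * 100
print(f"Repeat guests: {repeat_rate:.1f}%")
```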

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on repeated guests for hotel bookings. For example, the insights can be used to identify patterns in booking behavior, customer preferences, and loyalty. This can then be used to tailor promotions, discounts, and offers to entice returning customers and improve customer loyalty. Additionally, the insights can be used to develop customer service strategies to ensure a positive customer experience and help retain customers.

No, there are no insights that lead to negative growth. The insights gained from analyzing customer data can only be used to create a positive business impact on hotel bookings. The data can help to identify customer preferences and develop strategies to better serve them, leading to improved customer loyalty and increased revenue.

#### Chart - 4

Q4) What is the percentage of cancellations?

In [None]:
# Chart - 4 Visualization code
# Visualizing by pie chart.
hotel_df1['is_canceled'].value_counts().plot.pie(explode = [0.05, 0.05], autopct = '%1.1f%%', shadow = True, figsize = (10,8), fontsize = 20)
plt.title("Cancellation and non Cancellation")

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature or target variable.

##### 2. What is/are the insight(s) found from the chart?

0 = not canceled\
1 = canceled\
27.5% of all bookings were canceled.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights can definitely help create a positive business impact on the cancellation of hotel bookings. For example, the insights might reveal that certain customer segments are more likely to cancel their bookings and that certain booking policies are more likely to result in cancellations. This information can then be used to develop more effective booking policies and to target marketing campaigns to the customer segments who are more likely to cancel.

However, it is also possible that the insights could lead to negative growth. For example, if the insights suggest that customers are cancelling bookings because of higher prices, then the hotel might have to lower its prices to reduce cancellations, which could lead to lower profits.
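
Whether certain segments cancel more often can be checked with a grouped mean of the 0/1 flag. A minimal sketch on toy data; `market_segment` is used as the grouping column purely as an example, and the values are illustrative:

```python
import pandas as pd

# Toy stand-in for two columns of hotel_df1
toy = pd.DataFrame({
    "market_segment": ["Online TA", "Online TA", "Online TA", "Direct", "Direct", "Groups"],
    "is_canceled": [1, 0, 1, 0, 0, 1],
})

# Bookings and cancellation rate per segment, highest rate first
by_segment = (
    toy.groupby("market_segment")["is_canceled"]
    .agg(bookings="count", cancel_rate="mean")
    .sort_values("cancel_rate", ascending=False)
)
print(by_segment)
```

Reporting the booking count next to the rate helps avoid over-reading segments with very few rows.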

#### Chart - 5

Q5) What is the percentage distribution of required_car_parking_spaces?

In [None]:
# Chart - 5 Visualization code
# Visualizing by pie chart.
parking_counts = hotel_df1['required_car_parking_spaces'].value_counts()
# Size the explode list from the data instead of hard-coding the number of slices
parking_counts.plot.pie(explode = [0.05] * len(parking_counts), autopct = '%1.1f%%', shadow = False, figsize = (12,8), fontsize = 15, labels = None)

plt.title('% Distribution of required car parking spaces')
plt.legend(bbox_to_anchor=(0.85, 1), loc='upper left', labels=parking_counts.index)

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature or target variable.

##### 2. What is/are the insight(s) found from the chart?

91.6% of guests did not require a parking space. Only 8.3% of guests required one parking space.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on required car parking spaces for hotel bookings. Insights such as the number of cars associated with each booking, the type of cars associated with each booking, and the time length of car parking can help the hotel to better plan for the car parking spaces needed for each booking. This can help the hotel to better project their car parking needs and make sure they always have enough spaces to accommodate the number of bookings they receive.

No, there are no insights that lead to negative growth. All of the insights taken from the data can help the hotel to better plan for their car parking needs and make sure they always have enough spaces to accommodate their bookings.

#### Chart - 6

Q6) What is the percentage distribution of "Customer Type"?

In [None]:
# Chart - 6 Visualization code
# Visualizing by pie chart.
customer_type_counts = hotel_df1['customer_type'].value_counts()
# Size the explode list from the data instead of hard-coding the number of slices
customer_type_counts.plot.pie(explode = [0.05] * len(customer_type_counts), shadow = True, autopct = '%1.1f%%', figsize = (12,8), fontsize = 15, labels = None)

plt.title('% Distribution of Customer Type')
plt.legend(bbox_to_anchor = (0.85, 1), loc = 'upper left', labels = customer_type_counts.index.tolist())

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature or target variable.

##### 2. What is/are the insight(s) found from the chart?

Most of the guests were of the Transient type (82.4%); Transient-Party accounted for 13.4%, Group for 0.6%, and the remaining guests were of the Contract type.

1. Contract

  when the booking has an allotment or other type of contract associated to it

2. Group

  when the booking is associated to a group

3. Transient

  when the booking is not part of a group or contract, and is not associated with any other transient booking

4. Transient-party

  when the booking is transient, but is associated with at least one other transient booking

Transient is the most common customer type at 82.4%. The percentage of bookings associated with a group is very low.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on customer type for hotel bookings. The insights can help identify popular customer types, determine which customer types are generating the most revenue, and identify which customer types are the most loyal. This information can be used to target the right types of customers and tailor the hotel’s services and marketing campaigns to best meet their needs. Additionally, the insights can help identify areas of improvement, such as areas where certain customer types are not being targeted or where customer service could be improved.

However, there are also potential negative impacts from the insights. For example, if a hotel targets a certain customer type too heavily, it could lead to an oversaturation of that type of customer and lead to decreased profits. Additionally, focusing too heavily on one type of customer could lead to neglecting other customer types and could lead to a decrease in overall customer loyalty.

#### Chart - 7

Q7) What is the percentage of booking changes made by the customer?

In [None]:
# Chart - 7 Visualization code
# Visualizing by bar chart.
# rename_axis + reset_index(name=...) gives the same column names across pandas versions
booking_changes_df = hotel_df1['booking_changes'].value_counts().rename_axis('number_booking_changes').reset_index(name = 'Counts')

plt.figure(figsize = (12,8))
sns.barplot(x = booking_changes_df['number_booking_changes'], y = booking_changes_df['Counts'] * 100 / hotel_df1.shape[0], palette = 'Set2')
plt.title("% of Booking change")
plt.xlabel('Number of booking changes')
plt.ylabel('Percentage(%)')

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to show the percentage of bookings for each number of booking changes.

##### 2. What is/are the insight(s) found from the chart?

Almost 82% of the bookings were not changed by guests.
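
The unchanged share can be read off a normalized value count rather than estimated from the bar heights. A minimal sketch on made-up values standing in for `hotel_df1['booking_changes']`:

```python
import pandas as pd

# Made-up change counts per booking
booking_changes = pd.Series([0, 0, 0, 0, 1, 0, 2, 0, 1, 0], name="booking_changes")

# Percentage of bookings for each number of changes, ordered by number of changes
pct = booking_changes.value_counts(normalize=True).sort_index().mul(100)

unchanged_pct = pct.loc[0]          # bookings never modified
changed_pct = 100 - unchanged_pct   # bookings modified at least once
print(pct)
```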

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on booking changes made by the customer for hotel bookings. For example, the insights can be used to better understand customer preferences and make more accurate predictions about customer behaviour, which can lead to more efficient marketing strategies and improved customer satisfaction. Additionally, the insights can be used to identify customer needs and make recommendations that will encourage customers to make more changes to their existing bookings.

However, there is a possibility that the insights may lead to negative growth, particularly if the insights are misinterpreted or used in an inappropriate manner. For example, if the insights are used to create overly aggressive marketing strategies or to target customers with irrelevant or intrusive offers, this could lead to customer dissatisfaction and cause customers to avoid booking changes altogether.

#### Chart - 8

Q8) Which meal type is most preferred by the guests?

In [None]:
# Chart - 8 Visualization code
# Visualizing by count plot.
plt.figure(figsize = (18,8))
sns.countplot(x = hotel_df1['meal'], palette = 'viridis')
plt.xlabel('Meal Type')
plt.ylabel('Count')
plt.title("Preferred Meal Type")

##### 1. Why did you pick the specific chart?

The countplot is used to represent the occurrence(counts) of the observation present in the categorical variable and it is used to show the counts of observations in each categorical bin using bars.

##### 2. What is/are the insight(s) found from the chart?

Types of meal in hotels:

1.   BB - (Bed and Breakfast)
2.   HB- (Half Board)
3.   FB- (Full Board)
4.   SC- (Self Catering)

So the meal type most preferred by guests is BB (Bed and Breakfast).
HB (Half Board) and SC (Self Catering) are equally preferred.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights may help create a positive business impact on which type of food is mostly preferred by the guests for hotel bookings. With the insights gained, hotels could use the data to identify which types of food are most popular among their guests so they can adjust their menu offerings or pricing to better meet their customer needs.

However, there are also potential negative impacts of the insights. For example, if the data shows that certain types of food are unpopular, the hotel may choose to discontinue those items from their menu, which could lead to a decrease in customer satisfaction. Additionally, if the data shows that certain types of food are more expensive than others, the hotel may choose to raise prices on those items, which could lead to a decrease in customer spending.

#### Chart - 9

Q9) What is the percentage distribution of deposit type?

In [None]:
# Chart - 9 Visualization code
# Visualizing by pie chart.
hotel_df1['deposit_type'].value_counts().plot.pie(explode = (0.5,0.5,0.05), autopct = '%1.1f%%', shadow=False, figsize = (14,8), fontsize = 20, labels=None)
plt.title("% Distribution of deposit type")
labels = hotel_df1['deposit_type'].value_counts().index.tolist()
plt.legend(bbox_to_anchor = (0.85, 1), loc = 'upper left', labels = labels)

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to read through the area each category covers in the circle, and it is frequently used when a few categories need to be compared by share. So I used a pie chart to get the percentage comparison of the feature or target variable.

##### 2. What is/are the insight(s) found from the chart?

98.7% of the guests chose the "No Deposit" deposit type.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on deposit type for hotel bookings. For example, insights about customer preferences for a particular type of deposit may lead to an increase in revenue as customers may be more likely to book a hotel if it offers a deposit type that they prefer. Additionally, insights about customer preferences for longer deposit periods may also lead to an increase in revenue as customers may be more likely to book a hotel for longer periods of time if it offers them a longer deposit period.

However, there may also be insights that lead to negative growth. For example, insights about customer preferences for shorter deposit periods may lead to decreased revenue as customers may be more likely to book a hotel for shorter periods of time if it offers them a shorter deposit period. Additionally, insights about customer preferences for a particular type of deposit may lead to decreased revenue as customers may be less likely to book a hotel if it does not offer the type of deposit they prefer.
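
Before acting on deposit preferences, it may help to check how deposit type relates to cancellations. A minimal sketch using a row-normalized cross-tabulation on toy data (the values are illustrative, not results from the dataset):

```python
import pandas as pd

# Toy stand-in for two columns of hotel_df1
toy = pd.DataFrame({
    "deposit_type": ["No Deposit", "No Deposit", "No Deposit", "No Deposit",
                     "Non Refund", "Non Refund"],
    "is_canceled": [0, 0, 0, 1, 1, 1],
})

# normalize="index" makes each row sum to 1, i.e. shares within a deposit type
rates = pd.crosstab(toy["deposit_type"], toy["is_canceled"], normalize="index").mul(100)
print(rates)
```

The column labelled 1 then holds the cancellation percentage for each deposit type.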

#### Chart - 10

Q10) Which is the most preferred room type by the customers?

In [None]:
# Chart - 10 Visualization code

# Set plotsize
plt.figure(figsize = (18,8))

# Plotting 
sns.countplot(x = hotel_df1['assigned_room_type'], order = hotel_df1['assigned_room_type'].value_counts().index, palette = 'magma')

# Set xlabel for the plot
plt.xlabel('Room Type')

# Set y label for the plot
plt.ylabel('Count of Room Type')

# Set title for the plot
plt.title("Most preferred Room type")

##### 1. Why did you pick the specific chart?

The countplot is used to represent the occurrence(counts) of the observation present in the categorical variable and it is used to show the counts of observations in each categorical bin using bars.

##### 2. What is/are the insight(s) found from the chart?

Room type 'A' is the most preferred by the guests, and the second most preferred is 'D'.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights gained can help create a positive business impact on which is the most preferred room type by customers for hotel bookings. Knowing the most preferred room type by customers can help the hotel better allocate their resources and focus on areas that will likely bring in more business. It can also help the hotel create tailored offers and promotions that are more likely to be accepted by the customer. 

No, there are no insights that lead to negative growth. The insights gained are all focused on understanding customer preferences and helping the hotel make more informed decisions about their operations. This would only result in more positive growth for the hotel.

#### Chart - 11

11) Which country do most of the guests come from?

In [None]:
# Chart - 11 Visualization code
# Counting the guests from various countries.
# rename_axis + reset_index(name = ...) gives the same column names on old and new pandas versions.
country_df = hotel_df1['country'].value_counts().rename_axis('country').reset_index(name = 'count of guests')[:10]

# Visualizing by bar chart.
plt.figure(figsize = (20,8))
sns.barplot(x = country_df['country'], y = country_df['count of guests'], palette = 'rocket_r')
plt.xlabel('Country')
plt.ylabel('Number of guests',fontsize = 12)
plt.title("Number of guests from different Countries")

In [None]:
# Importing the folium and plotly express libraries
import folium                          
import plotly.express as px

In [None]:
basemap = folium.Map()
guests_map = px.choropleth(country_df, locations = country_df['country'], color = country_df['count of guests'], hover_name = country_df['country'])
guests_map.show()

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the number of guests coming from the top 10 countries.

##### 2. What is/are the insight(s) found from the chart?

Abbreviations for countries:

1.   PRT - Portugal
2.   GBR - United Kingdom
3.   FRA - France
4.   ESP - Spain
5.   DEU - Germany
6.   ITA - Italy
7.   IRL - Ireland
8.   BEL - Belgium
9.   BRA - Brazil
10.   NLD - Netherlands

Most of the guests come from Portugal, i.e. more than 25000 guests. After Portugal, GBR (United Kingdom), France and Spain are the countries most guests came from.


##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights will help create a positive business impact on the country from which the most guests book hotel rooms. By understanding which country the most guests are coming from, a hotel can tailor its services to the specific needs of the guests from that country. This can include providing services and amenities that the guests from that country prefer, as well as increasing marketing efforts in that country to attract even more guests.

There are no insights that lead to negative growth. The insights simply provide data on which country the most guests are coming from, and this data can be used to make positive business decisions.

#### Chart - 12

12) In which month did most of the bookings happen?

In [None]:
# Groupby arrival_date_month and taking the hotel count
bookings_by_months_df = hotel_df1.groupby(['arrival_date_month'])['hotel'].count().reset_index().rename(columns={'hotel':"Counts"})

# Create list of months in order
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

# Creating df which will map the order of above months list without changing its values.
bookings_by_months_df['arrival_date_month'] = pd.Categorical(bookings_by_months_df['arrival_date_month'], categories = months, ordered=True)

# Sorting by arrival_date_month
bookings_by_months_df = bookings_by_months_df.sort_values('arrival_date_month')

bookings_by_months_df

In [None]:
# Chart - 12 Visualization code
# Visualizing by line plot.

# Set plot size
plt.figure(figsize = (20,8))

# Plotting lineplot on x- months & y- booking counts
# (no palette here: seaborn ignores palette when no hue variable is given)
sns.lineplot(x = bookings_by_months_df['arrival_date_month'], y = bookings_by_months_df['Counts'])

# Set title for the plot
plt.title('Number of bookings across each month')

# Set x label
plt.xlabel('Month')

# Set y label
plt.ylabel('Number of bookings')

##### 1. Why did you pick the specific chart?

Seaborn Line Plots depict the relationship between continuous as well as categorical values in a continuous data point format. Line Plots display numerical values on one axis, and categorical values on the other.

##### 2. What is/are the insight(s) found from the chart?

As the line chart shows, most of the bookings happened from June to September. The summer season may be the reason for the high number of bookings. After September, bookings start declining.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on which month most of the bookings happened for hotel bookings. For example, if the data reveals that the most bookings take place during the summer months, the hotel can create special discounts and promotions to draw more visitors during this time. Additionally, the insights can be used to better target potential customers and adjust marketing strategies to increase overall bookings.

No, there are no insights that lead to negative growth. The insights from the data are only meant to provide information on which months are the most popular for hotel bookings. With this information, the hotel can adjust their strategy to maximize the number of bookings.

#### Chart - 13

13) Which year had the highest bookings?

In [None]:
# Chart - 13 Visualization code
# Visualizing by the count plot.

# Set plot size
plt.figure(figsize = (12,8))

# Plot with countplot
sns.countplot(x = hotel_df1['arrival_date_year'], hue = hotel_df1['hotel'], palette = 'husl')
plt.title("Year Wise bookings")

##### 1. Why did you pick the specific chart?

The countplot is used to represent the occurrence(counts) of the observation present in the categorical variable and it is used to show the counts of observations in each categorical bin using bars.

##### 2. What is/are the insight(s) found from the chart?

Most of the bookings for both City and Resort hotels happened in 2016. As the chart shows, most of the bookings were for City hotels.

2016 had the highest bookings.\
2015 had fewer than 7000 bookings.\
Overall, City hotels had most of the bookings.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on which year had the highest hotel bookings. For instance, the insights may reveal which seasons were the busiest for bookings, which factors drove the highest bookings, and what strategies were successful in driving more bookings. By analyzing this data, a hotel can tailor their strategies to maximize bookings in the future. 

No, there are no insights that lead to negative growth. The gained insights are all focused on understanding which strategies and factors led to the highest bookings, which is information that can be used to increase bookings in the future.

#### Chart - 14

14) Which Distribution channel is mostly used for hotel bookings?

In [None]:
# Chart - 14 Visualization code
# Visualizing by the pie chart.

# Creating labels
labels = hotel_df1['distribution_channel'].value_counts().index.tolist()

# Creating new df of distribution channel
# rename_axis + reset_index(name = ...) gives the same column names on old and new pandas versions.
distribution_channel_df = hotel_df1['distribution_channel'].value_counts().rename_axis('distribution_channel').reset_index(name = 'count')

# Adding percentage columns to the distribution_channel_df
distribution_channel_df['percentage'] = round(distribution_channel_df['count'] * 100 / hotel_df1.shape[0], 1)

# Creating list of percentage
sizes = distribution_channel_df['percentage'].values.tolist()

# Plotting the pie chart (one explode value per channel, so the list always matches the number of slices)
hotel_df1['distribution_channel'].value_counts().plot.pie(explode = [0.05] * len(labels), shadow = False, figsize = (15,8), fontsize = 10, labels = None)

# Setting legends with the percentage values
labels = [f'{l}, {s}%' for l, s in zip(labels, sizes)]
plt.legend(bbox_to_anchor = (0.85, 1), loc = 'upper left', labels = labels)
plt.title(' Mostly Used Distribution Channel for Hotel Bookings ')

##### 1. Why did you pick the specific chart?

A pie chart expresses a part-to-whole relationship in the data. It makes percentage comparisons easy to explain through the area each category covers in the circle, shown in different colors. Pie charts are commonly used when such percentage comparisons matter, so I used one here to compare the percentage share of each distribution channel.

##### 2. What is/are the insight(s) found from the chart?

79.1% of bookings were made through TA/TO (travel agents/tour operators). The second most used distribution channel is Direct.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The insights gained from analyzing which distribution channel is mostly used for hotel bookings can help create a positive business impact. For example, if the analysis reveals that customers prefer to book hotels through a particular distribution channel, the business can focus their marketing efforts on that channel in order to drive more bookings. Additionally, the business can focus their efforts on providing customers with a better booking experience on the preferred channel in order to increase customer satisfaction and loyalty.

There are not necessarily any insights that lead to negative growth. It is possible that the analysis might reveal that customers are not using certain distribution channels, in which case the business may need to adjust their strategy in order to drive more bookings. However, the insights would still be positive in that it would help the business identify opportunities for growth.

Bivariate and Multivariate Analysis:-

In [None]:
# Group by hotel
grup_by_hotel = hotel_df1.groupby('hotel')

#### Chart - 15

1) Which hotel has the most repeated guests?

In [None]:
# Chart - 15 Visualization code
# Repeated guest = 1
# Not repeated guest = 0

# Groupby hotel
repeated_guests_df = hotel_df1[hotel_df1['is_repeated_guest'] == 1].groupby('hotel').size().reset_index().rename(columns = {0:'number_of_repated_guests'})

# Set plot size and plot barplot
plt.figure(figsize = (10,8))
sns.barplot(x = repeated_guests_df['hotel'], y = repeated_guests_df['number_of_repated_guests'], palette = 'coolwarm')

# Set labels
plt.xlabel('Hotel type')
plt.ylabel('count of repeated guests')
plt.title("Most repeated guests for each hotel")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the number of repeated guests across the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

Resort hotels have slightly more repeated guests than City hotels. To increase the number of repeated guests, hotel management should collect feedback from guests and work on providing good service.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The insights gained from studying the number of repeated guests at city hotels and resort hotels can help create a positive business impact by informing business decisions. For example, the insights can help identify which type of hotel is more popular among repeat guests and can help inform marketing and promotional strategies to attract more repeat customers.

However, there could also be potential negative growth implications if the insights are used incorrectly. For example, if the insights are used to focus marketing efforts solely on the type of hotel that has more repeat guests, there could be a risk of alienating potential customers who may prefer to stay at the other type of hotel. Additionally, if the insights are used to reduce prices for repeat customers at the hotel with more repeat guests, there could be a risk of reducing revenue if the hotel has higher prices than the other type of hotel.

#### Chart - 16

2) Which Hotel type has the highest ADR?

In [None]:
# Chart - 16 Visualization code
# Visualizing by the bar chart.

# Grouping by hotel adr
highest_adr = grup_by_hotel['adr'].mean().reset_index()

# Set plot size
plt.figure(figsize = (10,8))

# Plot the graph
sns.barplot(x = highest_adr['hotel'], y=highest_adr['adr'], palette = 'Paired')

# Set labels after plotting, so that seaborn's default axis labels cannot overwrite them
plt.xlabel('Hotel type')
plt.ylabel('ADR')
plt.title("Avg ADR of each Hotel type")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the average ADR of the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

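City hotels have the higher average ADR, which means they earn more per booked room night than Resort hotels. A higher ADR raises revenue only if the number of booked nights holds up, so ADR alone does not show which hotel type earns more in total.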
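
The link between ADR and revenue can be checked with a quick calculation. The sketch below is only an illustration on made-up numbers: `toy_df`, the `total_stay` column (nights per booking) and the derived `revenue` column are assumptions, and revenue is approximated as the nightly rate times the nights stayed.

```python
import pandas as pd

# Toy bookings (illustrative numbers only, not taken from the dataset).
toy_df = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "adr": [120.0, 100.0, 90.0, 80.0],
    "total_stay": [1, 2, 5, 7],  # nights per booking (assumed column)
})

# Approximate revenue per booking as nightly rate times nights stayed.
toy_df["revenue"] = toy_df["adr"] * toy_df["total_stay"]

# Compare the average rate with the total revenue for each hotel type.
adr_vs_revenue = toy_df.groupby("hotel").agg(
    mean_adr=("adr", "mean"),
    total_revenue=("revenue", "sum"),
)
print(adr_vs_revenue)
```

In this toy case the hotel type with the higher average ADR has the lower total revenue, because its stays are shorter. Whether that happens in the real data depends on the actual length of stay and number of bookings.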

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help creating a positive business impact on which hotel type has the highest average daily rate city hotels or resort hotels. For example, the insights can reveal which type of hotel has higher occupancy rate, higher customer satisfaction scores, or higher revenue per available room. This information can help the hotel business to identify which type of hotel to focus more on and optimize their operations and marketing efforts accordingly.

On the other hand, the insights can also lead to negative growth if there is not enough demand for certain types of hotels or if the hotel is not able to optimize their operations or marketing efforts to attract more customers. This could result in a decrease in revenue or occupancy rate, which could lead to negative growth.

#### Chart - 17

3) Which hotel type has the longer lead time?

In [None]:
# Chart - 17 Visualization code
# Visualizing by the bar chart.

# Group by hotel and taking mean of lead time
avg_lead_time = grup_by_hotel['lead_time'].mean().reset_index()

# Set plot size
plt.figure(figsize = (10,8))

# Plot the bar plot
sns.barplot(x = avg_lead_time['hotel'], y = avg_lead_time['lead_time'], palette = 'cubehelix')

# Set labels
plt.xlabel('Hotel type')
plt.ylabel('Average Lead time')
plt.title("Average Lead Time for each Hotel type")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the average lead time of the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

The average lead time is higher for Resort hotels. This means guests plan their resort trips well in advance. People usually prefer Resort hotels for longer stays, which may be why they plan early.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights may help create a positive business impact on which hotel type has the more lead time city hotels or resort hotels, depending on the findings. For example, if the insights show that city hotels have a higher lead time compared to resort hotels, then the hotel can focus their marketing efforts more on city hotels to increase their overall occupancy and profits. On the other hand, if the insights show that resort hotels have a higher lead time compared to city hotels, then the hotel can focus their marketing efforts more on resort hotels to increase their overall occupancy and profits.

There are no insights that lead to negative growth as the insights are focused on understanding the lead time of city and resort hotels in order to determine which hotel type should be the focus of marketing efforts. The goal is to increase the overall occupancy and profits. Therefore, the insights should not lead to any negative growth.

#### Chart - 18

4) Which hotel has the highest percentage of booking cancellations?

In [None]:
# Chart - 18 Visualization code
# Visualizing by the bar chart.

# booking canceled = 1 
# booking not canceled = 0

# Creating new DataFrame where bookings are cancelled.
canceled_df = hotel_df1[hotel_df1['is_canceled'] == 1]

# Grouping by hotel
canceled_df = canceled_df.groupby('hotel').size().reset_index().rename(columns = {0: "no_of_cancelled_bookings"})

# Adding a total bookings column for calculating the percentage.
# Both groupby results are sorted by hotel, so the values line up row by row.
canceled_df['total_booikngs'] = hotel_df1.groupby('hotel').size().values
canceled_df

# Plotting the bar chart
plt.figure(figsize = (10,8))
sns.barplot(x = canceled_df['hotel'], y = canceled_df['no_of_cancelled_bookings'] * 100 / canceled_df['total_booikngs'], palette = 'icefire')

# Set labels
plt.xlabel('Hotel type')
plt.ylabel('Percentage(%)')
plt.title("Percentage of booking cancellation")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the percentage of cancelled bookings for the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

The booking cancellation rate is higher for City hotels, at almost 30%.
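
Since `is_canceled` is coded as 1 for a cancelled booking and 0 otherwise, the cancellation percentage per hotel is simply the group mean times 100. The sketch below shows this shortcut on a small made-up DataFrame (`toy_df` and its values are assumptions, not the real data).

```python
import pandas as pd

# Toy bookings (illustrative values only): 1 = cancelled, 0 = not cancelled.
toy_df = pd.DataFrame({
    "hotel": ["City Hotel"] * 10 + ["Resort Hotel"] * 10,
    "is_canceled": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
})

# The mean of a 0/1 column is the fraction of ones, i.e. the cancellation rate.
cancel_rate = toy_df.groupby("hotel")["is_canceled"].mean() * 100
print(cancel_rate)
```

This avoids building the cancelled and total counts separately, so there is no risk of the two tables being misaligned.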

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights can help create a positive business impact by showing which hotel type, City or Resort, has the higher percentage of booking cancellations. From there, hotel managers can work to identify the root causes of the higher cancellation rate and take steps to reduce it.

However, the insights gained may also lead to negative growth if the data reveals that customers are not satisfied with the services or offerings of city hotels or resort hotels. In such a case, hotel managers would need to take steps to improve the customer experience in order to encourage more bookings and reduce cancellations.

#### Chart - 19

5) Which hotel has the longer waiting time?

In [None]:
# Chart - 19 Visualization code
# Visualizing by the bar chart.

# Grouping by hotel and taking the mean of days in waiting list
waiting_time_df = hotel_df1.groupby('hotel')['days_in_waiting_list'].mean().reset_index()
# waiting_time_df

# Set the plot size
plt.figure(figsize = (10,8))

# Plot barchart
sns.barplot(x = waiting_time_df['hotel'],y = waiting_time_df['days_in_waiting_list'], palette = 'Spectral')

# Set labels
plt.xlabel('Hotel type')
plt.ylabel('waiting time')
plt.title("Waiting time for each hotel type")

##### 1. Why did you pick the specific chart?

Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages.

I used a bar chart to compare the average waiting time of the two hotel types.

##### 2. What is/are the insight(s) found from the chart?

The waiting time for City hotels is higher than for Resort hotels. This suggests that City hotels are busier than Resort hotels.
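
An average waiting time can be driven by a few long waits, so it may help to look at the median and at the share of bookings that were waitlisted at all. The sketch below is an optional extra check on made-up values (`toy_df`, `waiting_summary` and the aggregate names are assumptions, not part of the original analysis).

```python
import pandas as pd

# Toy bookings (illustrative values only): most bookings never enter the waiting list.
toy_df = pd.DataFrame({
    "hotel": ["City Hotel"] * 5 + ["Resort Hotel"] * 5,
    "days_in_waiting_list": [0, 0, 0, 0, 40, 0, 0, 0, 0, 5],
})

# Mean, median and share of waitlisted bookings for each hotel type.
waiting_summary = toy_df.groupby("hotel")["days_in_waiting_list"].agg(
    mean_days="mean",
    median_days="median",
    share_waitlisted=lambda s: (s > 0).mean(),
)
print(waiting_summary)
```

In this toy case both hotel types waitlist the same share of bookings, and only the length of one wait differs, which is something the mean alone would not reveal.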

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on which hotel has longer waiting time city hotels or resort hotels. By understanding the reasons why customers wait longer in either city hotels or resort hotels, businesses can make adjustments to their operations to reduce wait times and improve customer satisfaction. For example, they could implement more efficient booking systems, prioritize customer service, and make sure staff are properly trained and equipped to handle customer inquiries.

On the other hand, the gained insights could lead to negative growth if the business does not make the necessary adjustments to improve their operations. For example, if the business continues to ignore customer wait times, customers may become increasingly frustrated and choose not to return. This could result in a decrease in revenue for the business, since customers are not satisfied with their experience.

#### Chart - 20

In [None]:
# ADR across the different month

# Groupby
bookings_by_months_df = hotel_df1.groupby(['arrival_date_month','hotel'])['adr'].mean().reset_index()

# Create month list
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

# It will take the order of the month list in the df along with values
bookings_by_months_df['arrival_date_month'] = pd.Categorical(bookings_by_months_df['arrival_date_month'], categories = months, ordered = True)

# Sorting
bookings_by_months_df = bookings_by_months_df.sort_values('arrival_date_month')
bookings_by_months_df

In [None]:
# Chart - 20 Visualization code

# Set plot size and plot the line
plt.figure(figsize = (20,8))

# Visualizing by line plot.
sns.lineplot(x = bookings_by_months_df['arrival_date_month'], y = bookings_by_months_df['adr'], hue = bookings_by_months_df['hotel'], palette = 'Set2')

# Set labels
plt.title('ADR across each month')
plt.xlabel('Month')
plt.ylabel('ADR')

##### 1. Why did you pick the specific chart?

Seaborn Line Plots depict the relationship between continuous as well as categorical values in a continuous data point format. Line Plots display numerical values on one axis, and categorical values on the other.

##### 2. What is/are the insight(s) found from the chart?

For Resort hotels, ADR is higher in June, July and August than for City hotels. Customers may want to spend their summer vacation in Resort hotels.

The best time for guests to visit Resort or City hotels is January, February, March, April, October, November and December, as the average daily rate in these months is very low.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the gained insights can help create a positive business impact on average daily rate across the different months for hotel bookings. For example, if the insights reveal that there are certain months that have higher booking rates due to an increase in demand for travel during those months, the hotel can increase their prices for those months to take advantage of the higher demand and increase their average daily rate.
 
The insights can also help identify any potential negative growth. For example, if the insights show that there are certain months that have lower booking rates due to a decrease in demand for travel during those months, the hotel can decrease their prices for those months to make them more attractive to potential guests and reduce the negative growth.

#### Chart - 21 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
plt.figure(figsize = (18,10))
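# numeric_only = True restricts the correlation to numeric columns (text columns raise an error in pandas 2.x)
sns.heatmap(hotel_df1.corr(numeric_only = True), annot = True, cmap = 'coolwarm', vmin = -1)
plt.title('Correlation of the columns')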

##### 1. Why did you pick the specific chart?

The correlation heatmap chart is a great way to visualize correlations between multiple variables. It provides a clear and concise view of the relationships between the variables, which allows for easy and quick analysis. Additionally, the color coding used in the heatmap helps to quickly and easily identify correlations that may otherwise not be as apparent.

##### 2. What is/are the insight(s) found from the chart?

*   is_canceled and same_room_alloted_or_not are negatively correlated. That means a customer is unlikely to cancel a booking when not given the same room as reserved. We have visualized this above.

*   lead_time and total_stay are positively correlated. That means the longer the customer's stay, the longer the lead time.

*   adults, children and babies are correlated with each other. That means the more people in a booking, the higher the ADR.

*   is_repeated_guest and previous bookings not canceled have a strong correlation. Repeated guests may be less likely to cancel their bookings.
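
Pairs like the ones listed above can also be pulled out of the correlation matrix programmatically instead of being read off the heatmap. The sketch below is a minimal illustration on a made-up DataFrame (`toy_df` and its values are assumptions); it keeps only numeric columns with `select_dtypes` and ranks the other columns by the absolute value of their correlation with `is_canceled`.

```python
import pandas as pd

# Toy bookings (illustrative values only); 'hotel' is text and must be excluded.
toy_df = pd.DataFrame({
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel", "City Hotel", "Resort Hotel", "City Hotel"],
    "is_canceled": [0, 0, 1, 1, 0, 1],
    "lead_time": [5, 10, 200, 150, 20, 300],
    "total_stay": [1, 2, 7, 5, 2, 9],
})

# Keep numeric columns only, then compute the Pearson correlation matrix.
numeric_df = toy_df.select_dtypes(include="number")
corr = numeric_df.corr()

# Rank the remaining columns by the strength of their correlation with is_canceled.
ranked = corr["is_canceled"].drop("is_canceled").abs().sort_values(ascending=False)
print(ranked)
```

A correlation coefficient only measures linear association, so a ranking like this is a starting point for the charts above rather than evidence of cause and effect.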





#### Chart - 22 - Pair Plot 

In [None]:
# Pair plot visualization code
sns.pairplot(hotel_df1, hue = 'is_canceled')

##### 1. Why did you pick the specific chart?

Pair plot is used to understand the best set of features to explain a relationship between two variables or to form the most separated clusters. It also helps to form some simple classification models by drawing some simple lines or make linear separation in our data-set.

Thus, I used a pair plot to analyse patterns in the data and relationships between the features. It conveys similar information to the correlation heatmap, but as a grid of scatter plots and distributions rather than as coefficients.

##### 2. What is/are the insight(s) found from the chart?

From the above chart I learned that there is little linear relationship between the variables and that the data points are not linearly separable.

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the Business Objective?

*   Increase the visibility of the hotel by optimizing for local SEO and using digital marketing channels such as Google Ads, Facebook Ads, and other relevant social media platforms.

*   Utilize email marketing campaigns to reach potential customers and promote targeted offers.

*   Set up targeted campaigns with influencers and social media personalities to promote the hotel’s offerings.

*   Implement a loyalty program to reward repeat customers and encourage them to book with the hotel again.

*   Build relationships with travel agents, tour operators, and other hotel booking websites to increase the hotel’s reach.

*   Utilize data and insights to identify customer preferences and tailor offers to meet their needs.

*   Create a user-friendly website and mobile app to increase user engagement and make booking easier.


*   Utilize online travel agents and hotel booking websites to increase visibility and bookings.

*   Leverage technology to provide a seamless customer experience.

*   Use booking engine software to streamline the booking process.



# **Conclusion**

*   City hotels are the hotel type most preferred by guests, which makes the City hotel the busiest hotel.

*   27.5 % of all bookings were cancelled.

*   Only 3.9 % of guests revisited the hotels. The remaining 96.1 % were new guests, so the retention rate is low.

*   More than 82 % of bookings had no changes made, and about 10 % had a single change.

*   Most of the customers (91.6%) do not require car parking spaces.

*   79.1 % of bookings were made through TA/TO (travel agents/tour operators).

*   BB (Bed & Breakfast) is the meal type most preferred by guests.

*   Maximum number of guests were from Portugal, i.e. more than 25000 guests.

*   The average ADR is higher for City hotels than for Resort hotels, so City hotels earn more revenue per booked room night.

*   The booking cancellation rate is higher for City hotels, at almost 30 %.

*   The average lead time is higher for Resort hotels.

*   The waiting time for City hotels is higher than for Resort hotels, which suggests that City hotels are busier than Resort hotels.

*   Resort hotels have the most repeated guests.

*   The optimal stay in both hotel types is less than 7 days. Guests usually stay for up to a week.

*   Almost 19 % of guests did not cancel their bookings even though they did not get the room they had reserved. Only 2.5 % cancelled the booking.
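
Room-allocation percentages like those in the last point can be computed with a cross-tabulation. The sketch below is an illustration on made-up values: `toy_df`, the `reserved_room_type` column and the derived `room_changed` flag are assumptions, and the shares are taken over all bookings (`normalize = "all"`), which is one possible reading of the figures above.

```python
import pandas as pd

# Toy bookings (illustrative values only, not the real dataset).
toy_df = pd.DataFrame({
    "reserved_room_type": ["A", "A", "A", "D", "D", "A", "A", "D", "A", "A"],
    "assigned_room_type": ["A", "A", "D", "D", "D", "A", "C", "D", "A", "A"],
    "is_canceled": [0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
})

# Flag bookings where the assigned room differs from the reserved one.
toy_df["room_changed"] = toy_df["reserved_room_type"] != toy_df["assigned_room_type"]

# Percentage of all bookings in each (room_changed, is_canceled) combination.
room_shares = pd.crosstab(toy_df["room_changed"], toy_df["is_canceled"], normalize="all") * 100
print(room_shares)
```

Passing `normalize = "index"` instead would give the cancellation rate within each group, which answers a slightly different question.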







### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***