# **Project Name**    - **Hotel Booking analysis**
---





##### **Project Type**    - EDA
##### **Contribution**    - Individual
Member - Ajay popat jagdale

# **Project Summary -**

The aim of this project is to perform exploratory data analysis on a hotel booking dataset to gain insights into booking patterns, cancellation rates, length of stay and other trends. The dataset used in this project contains information about bookings made at a city hotel and a resort hotel, including details such as booking dates, lead times, room types, and more.

The first step in this project is to load the dataset and perform some basic data cleaning, such as dropping unnecessary columns and filling in missing values. After cleaning the data, the next step is to perform some initial exploratory data analysis using various visualization techniques such as heatmaps, histograms, bar charts, and scatter plots.

The analysis starts by looking at the distribution of hotel types and bookings per month, which helps to understand the booking patterns. Next, the cancellation rate is examined to understand the reasons behind cancellations and the factors that influence them, such as hotel type, booking month, and lead time. Additionally, the distribution of lead times is explored to understand the time between booking and arrival.

Overall, the EDA project provides a comprehensive overview of the hotel booking dataset and helps to identify important trends and patterns. This analysis can be useful for hotel managers and marketers who are interested in understanding the booking patterns and customer behavior to optimize their business strategies.

# **GitHub Link -**


https://github.com/ajdev007/EDA



# **Problem Statement**


**Have you ever wondered when the best time of year to book a hotel room is? Or the optimal length of stay in order to get the best daily rate? What if you wanted to predict whether or not a hotel was likely to receive a disproportionately high number of special requests? This hotel booking dataset can help you explore those questions! This data set contains booking information for a city hotel and a resort hotel, and includes information such as when the booking was made, length of stay, the number of adults, children, and/or babies, and the number of available parking spaces, among other things. All personally identifying information has been removed from the data. Explore and analyse the data to discover important factors that govern the bookings.**

#### **Define Your Business Objective?**

To optimize the hotel's operations, improve customer satisfaction, and increase revenue.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required. 
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits. 
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many number of charts you want. Make Sure for each and every chart the following format should be answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule. 

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from datetime import datetime
import seaborn as sns
import ast 


In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')

In [None]:
path = '/content/drive/MyDrive/Hotel booking.csv'

### Dataset Loading

In [None]:
import pandas as pd

Hotel_dataset = pd.read_csv(path)

### Dataset First View

In [None]:
# Dataset First Look
Hotel_dataset.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
print(Hotel_dataset.shape)

print('\n')
print(Hotel_dataset.index)


### Dataset Information

In [None]:
# Dataset Info
Hotel_dataset.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_count = Hotel_dataset.duplicated().sum()
print(duplicate_count)

# Remove the duplicate rows in place with .drop_duplicates()
Hotel_dataset.drop_duplicates(inplace = True)

# Count the remaining unique rows
uniq_num_of_rows = Hotel_dataset.shape[0]
uniq_num_of_rows

In [None]:
Hotel_dataset.reset_index(drop=True)  # view the de-duplicated rows with a fresh index (displayed only, not stored)

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
Null_Values = Hotel_dataset.isnull()  # boolean mask: True where a value is missing
print(Null_Values.sum().sum())  # total number of missing cells

Hotel_dataset  # missing entries are already stored as NaN, so no replacement is needed

In [None]:
# Count the missing values per column, largest first
missing_values = Hotel_dataset.isnull().sum().sort_values(ascending=False)
missing_values

### What did you know about your dataset?

This dataset is a single file of booking records for two hotels: a city hotel and a resort hotel. It includes information such as when the booking was made, the length of stay, the number of adults, children and/or babies, and the number of available parking spaces, among other things. The dataset contains a total of 119390 rows and 32 columns. It contained 31944 duplicate rows, which were removed. I checked the data type of every column (int, float, string) and observed that some columns have an unsuitable data type, which is corrected later. I also listed the unique values of every column to see which values each column actually holds.
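The checks described above (shape, duplicate rows, missing values, data types) can be collected in one small helper. This is only a minimal sketch on a made-up frame: `summarize_frame` and the `toy` data are hypothetical names introduced for illustration and are not part of the notebook.

```python
import numpy as np
import pandas as pd

def summarize_frame(df: pd.DataFrame) -> dict:
    """Return basic facts about a DataFrame (hypothetical helper)."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_per_column": df.isnull().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }

# Toy data standing in for the booking file (an assumption, not real bookings)
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "City Hotel"],
    "children": [0.0, 0.0, np.nan, 1.0],
})

summary = summarize_frame(toy)
print(summary)
```

Calling `summarize_frame(Hotel_dataset)` before and after the cleaning steps would give a quick before/after comparison.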

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
Hotel_dataset_columns = Hotel_dataset.columns
Hotel_dataset_columns

**Describing  the dataset**

In [None]:
# Dataset Describe
Hotel_dataset.describe()


### Variables Description 

The columns and the data it represents are listed below:

hotel : Name of the hotel (Resort Hotel or City Hotel)

is_canceled : If the booking was canceled (1) or not (0)

lead_time : Number of days between the booking date and the arrival date

arrival_date_year : Year of arrival date

arrival_date_month : Month of arrival date

arrival_date_week_number : Week number of year for arrival date

arrival_date_day_of_month : Day of arrival date

stays_in_weekend_nights : Number of weekend nights (Saturday or Sunday) spent at the hotel by the guests.

stays_in_week_nights : Number of weeknights (Monday to Friday) spent at the hotel by the guests.

adults : Number of adults among guests

children : Number of children among guests

babies : Number of babies among guests

meal : Type of meal booked

country : Country of guests

market_segment : Designation of market segment

distribution_channel : Name of booking distribution channel

is_repeated_guest : If the booking was from a repeated guest (1) or not (0)

previous_cancellations : Number of previous bookings that were cancelled by the customer prior to the current booking

previous_bookings_not_canceled : Number of previous bookings not cancelled by the customer prior to the current booking

reserved_room_type : Code of room type reserved

assigned_room_type : Code of room type assigned

booking_changes : Number of changes/amendments made to the booking

deposit_type : Type of the deposit made by the guest

agent : ID of travel agent who made the booking

company : ID of the company that made the booking

days_in_waiting_list : Number of days the booking was in the waiting list

customer_type : Type of customer, assuming one of four categories

adr : Average Daily Rate, as defined by dividing the sum of all lodging transactions by the total number of staying nights

required_car_parking_spaces : Number of car parking spaces required by the customer

total_of_special_requests : Number of special requests made by the customer

reservation_status : Reservation status (Canceled, Check-Out or No-Show)

reservation_status_date : Date at which the last reservation status was updated



### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
print(Hotel_dataset.apply(lambda col: col.unique()))  # one array of distinct values per column

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
missing_values[:4]

In [None]:
# Check the percentage of null values in company (label lookup instead of position)
percentage_company_null = missing_values['company'] / uniq_num_of_rows * 100
percentage_company_null

In [None]:
#drop company column due to high null value
Hotel_dataset.drop(['company'], axis=1, inplace=True)

In [None]:
# Now let's check agent
percentage_agent_null = missing_values['agent'] / uniq_num_of_rows * 100
percentage_agent_null

In [None]:
# agent has far fewer null values than company, so keep the column and fill the nulls with 0
Hotel_dataset['agent'] = Hotel_dataset['agent'].fillna(0)
Hotel_dataset['agent'].isnull().sum()  # the column now has no null values

In [None]:
# Now check country
percentage_country_null = missing_values['country'] / uniq_num_of_rows * 100
percentage_country_null

In [None]:
# country has few null values, so replace them with 'others' as the country name
Hotel_dataset['country'] = Hotel_dataset['country'].fillna('others')
Hotel_dataset['country'].isnull().sum()  # re-check that the column has no null values

In [None]:
# Finally, check the percentage of null values in the children column
percentage_children_null = missing_values['children'] / uniq_num_of_rows * 100
percentage_children_null

In [None]:
# children also has very few null values, so replace them with 0
Hotel_dataset['children'] = Hotel_dataset['children'].fillna(0)
Hotel_dataset['children'].isnull().sum()  # re-check that the column has no null values

In [None]:
# Check for null values in all remaining columns
Hotel_dataset.isnull().sum() 

**Let's check the dataset information**


In [None]:
Hotel_dataset.info()

In [None]:
# The children & agent columns have dtype float although they contain only whole numbers, so change the dtype to 'int64'
Hotel_dataset[['children', 'agent']] = Hotel_dataset[['children', 'agent']].astype('int64')

**Adding new columns as required**

In [None]:
# Add a new column 'total_stay_in_nights' as the sum of week nights and weekend nights
Hotel_dataset['total_stay_in_nights'] = Hotel_dataset['stays_in_week_nights'] + Hotel_dataset['stays_in_weekend_nights']
Hotel_dataset['total_stay_in_nights']

**Creating another column for revenue**

In [None]:
# Create a revenue column as total stay (in nights) * adr
Hotel_dataset['revenue'] = Hotel_dataset['total_stay_in_nights'] * Hotel_dataset['adr']
Hotel_dataset['revenue']

In [None]:
# Also, for information, we will add a column with total guest coming for each booking
Hotel_dataset['total_guest'] = Hotel_dataset['adults'] + Hotel_dataset['children'] + Hotel_dataset['babies']
Hotel_dataset['total_guest'].sum()

In [None]:
# For readability, replace the values (0, 1) in 'is_canceled' with 'not canceled' and 'is canceled'.

Hotel_dataset['is_canceled'] = Hotel_dataset['is_canceled'].replace([0,1], ['not canceled', 'is canceled'])
Hotel_dataset['is_canceled']

In [None]:
#Same for 'is_repeated_guest' col
Hotel_dataset['is_repeated_guest'] = Hotel_dataset['is_repeated_guest'].replace([0,1], ['not repeated', 'repeated'])
Hotel_dataset['is_repeated_guest']

In [None]:
# Now, we will check the overall revenue hotel-wise
Hotel_dataset.groupby('hotel')['revenue'].sum()

### What all manipulations have you done and insights you found?

**I made a few manipulations to the data.**

**----Addition of columns----**

Some columns needed for the analysis are not in the data but can be derived from the given columns.

a) **Total Guests:** This column helps evaluate the total volume of guests. I get this value by adding the number of adults, children and babies.

b) **Total Stay in Nights:** The sum of the week nights and weekend nights of each booking.

c) **Revenue:** I compute revenue by multiplying adr by the total stay in nights. This column is used to analyse the revenue and growth of each hotel.



**----Deletion of columns----**

a) **company:** This column is almost entirely null, so I deleted it because it would not make any impact on the analysis.



**----Replacement of values in columns----**

a) **is_canceled & is_repeated_guest:** These columns contain only 0 and 1 as values. In 'is_canceled', which represents the cancellation status of a booking, I replaced 0 and 1 with 'not canceled' and 'is canceled'. In the same way, in 'is_repeated_guest' I replaced 0 and 1 with 'not repeated' and 'repeated'. These labels make the visualizations easier to understand.


**----Changes in data type of values in columns----**

a) **agent & children:** These columns contained float values, which makes no sense because the values represent an agent ID and a count of guests. So I changed the data type of these columns from float to integer.


**----Removed null values & duplicate entries----**

a) Before visualizing any data from the dataset we have to do data wrangling.
For that, I checked the null values in all the columns. Where a column had a large number of null values, I dropped it with the 'drop' method; this is how the 'company' column was dropped. Where a column had only a few null values, I filled them with suitable values using .fillna().

b) In the same way, I checked for duplicates and found that some rows were duplicated, so I removed those rows from the dataset with the .drop_duplicates() method.






**In this way, I removed unnecessary data and made the data clean and ready to analyse.**
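The wrangling steps summarised above can also be expressed as one function that works on a copy of the data. The following is a hedged sketch, not the notebook's own code: `clean_bookings` and the `toy` frame are hypothetical, and only the column names are taken from the dataset.

```python
import numpy as np
import pandas as pd

def clean_bookings(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the wrangling steps to a copy of df (hypothetical helper)."""
    out = df.drop_duplicates().copy()
    out = out.drop(columns=["company"])  # mostly null, so dropped
    out["agent"] = out["agent"].fillna(0).astype("int64")
    out["children"] = out["children"].fillna(0).astype("int64")
    out["country"] = out["country"].fillna("others")
    # Derived columns used later in the charts
    out["total_stay_in_nights"] = out["stays_in_week_nights"] + out["stays_in_weekend_nights"]
    out["revenue"] = out["total_stay_in_nights"] * out["adr"]
    out["total_guest"] = out["adults"] + out["children"] + out["babies"]
    return out.reset_index(drop=True)

# Made-up rows for illustration; the first two are exact duplicates
toy = pd.DataFrame({
    "company": [np.nan, np.nan, 40.0],
    "agent": [9.0, 9.0, np.nan],
    "children": [0.0, 0.0, np.nan],
    "country": ["PRT", "PRT", None],
    "stays_in_week_nights": [2, 2, 3],
    "stays_in_weekend_nights": [1, 1, 0],
    "adr": [100.0, 100.0, 80.0],
    "adults": [2, 2, 1],
    "babies": [0, 0, 0],
})

cleaned = clean_bookings(toy)
print(cleaned)
```

Returning a new frame instead of modifying the input in place makes it possible to re-run the cell without side effects.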



## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:

# Helper that returns the value counts of a column as a two-column DataFrame.
def get_count_from_column_bar(df, column_label):
  df_grpd = df[column_label].value_counts()
  df_grpd = pd.DataFrame({'index':df_grpd.index, 'count':df_grpd.values})
  return df_grpd

# plot a bar chart from grouped data
def plot_bar_chart_from_column(df, column_label, t1):
  df_grpd = get_count_from_column_bar(df, column_label)  # use the helper defined in this cell
  fig, ax = plt.subplots(figsize=(14, 6))
  c= ['g','r','b','c','y']
  ax.bar(df_grpd['index'], df_grpd['count'], width = 0.4, align = 'edge', edgecolor = 'black', linewidth = 4, color = c, linestyle = ':', alpha = 0.5)
  plt.title(t1, bbox={'facecolor':'0.8', 'pad':3})
  plt.ylabel('Count')
  plt.xticks(rotation = 15) # rotate the x-axis labels
  plt.xlabel(column_label)
  plt.show()

In [None]:
# Chart - 1 visualization code

def get_count_from_column(df, column_label):
  df_grpd = df[column_label].value_counts()
  df_grpd = pd.DataFrame({'index':df_grpd.index, 'count':df_grpd.values})
  return df_grpd

# plot a pie chart from grouped data
def plot_pie_chart_from_column(df, column_label, t1, exp):
  df_grpd = get_count_from_column(df, column_label)
  fig, ax = plt.subplots(figsize=(14,9))
  ax.pie(df_grpd.loc[:, 'count'], labels=df_grpd.loc[:, 'index'], autopct='%1.2f%%',startangle=90,shadow=True, labeldistance = 1, explode = exp)
  plt.title(t1, bbox={'facecolor':'0.8', 'pad':3})
  ax.axis('equal')
  plt.legend()
  plt.show()  

In [None]:
exp1 = [0.05,0.05]
plot_pie_chart_from_column(Hotel_dataset, 'hotel', 'Booking percentage of Hotel by Name', exp1)

##### 1. Why did you pick the specific chart?

I picked a pie chart to show which type of hotel has the larger share of bookings.

##### 2. What is/are the insight(s) found from the chart?

The share of bookings is higher for City Hotel (61.12%) than for Resort Hotel (38.87%), so City Hotel is in greater demand.
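The percentages printed on the pie chart can be reproduced numerically with `value_counts(normalize=True)`. This is a minimal sketch on a made-up series; with the real data the series would be `Hotel_dataset['hotel']`.

```python
import pandas as pd

# Toy stand-in for the 'hotel' column (assumption: 3 city and 2 resort bookings)
hotels = pd.Series(["City Hotel"] * 3 + ["Resort Hotel"] * 2, name="hotel")

# Share of bookings per hotel, in percent
share = hotels.value_counts(normalize=True).mul(100).round(2)
print(share)
```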

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, this insight can have a positive business impact for both hotels:

City Hotel: provide more services to attract more guests and increase revenue.

Resort Hotel: find ways to attract guests, and study what City Hotel does to attract them.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
exp4 = [0,0.2]
plot_pie_chart_from_column(Hotel_dataset, 'is_canceled', 'Cancellation volume of Hotel', exp4)

##### 1. Why did you pick the specific chart?

**I picked a pie chart to present the percentage of hotel bookings that were cancelled.**

##### 2. What is/are the insight(s) found from the chart?

**The pie chart above shows that, overall, more than 25% of bookings were cancelled.**

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

**The chart above shows that more than 27% of bookings are cancelled.**

**Solution: We can investigate the reasons for cancellations and address them at the hotel level.**
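To look at cancellations at the hotel level, the cancellation rate can be computed per hotel. The sketch below uses made-up rows and assumes `is_canceled` already holds the string labels 'is canceled' and 'not canceled' created in the wrangling step.

```python
import pandas as pd

# Toy bookings (assumption, not real data)
toy = pd.DataFrame({
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 2,
    "is_canceled": ["is canceled", "not canceled", "is canceled",
                    "not canceled", "not canceled", "not canceled"],
})

# Mean of a boolean mask per group = fraction of cancelled bookings
cancel_rate = (
    toy["is_canceled"].eq("is canceled")
    .groupby(toy["hotel"])
    .mean()
    .mul(100)
)
print(cancel_rate)
```

The same pattern works for any other grouping column, for example `market_segment` or `deposit_type`.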

#### Chart - 3

In [None]:
# Chart - 3 visualization code
exp2 = [0.2, 0,0,0,0,0,0,0,0,0,0,0.1]
plot_pie_chart_from_column(Hotel_dataset, 'arrival_date_month', 'Month-wise booking', exp2)

##### 1. Why did you pick the specific chart?

To show each month's percentage share of bookings at the overall level.

##### 2. What is/are the insight(s) found from the chart?

The percentages show that May, July and August are the busiest booking months, due to the holiday season. I recommend aggressive advertising to attract more customers.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. An increased volume of visitors will help the hotel sustain revenue during down time, and will also help employee satisfaction and retention.
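The pie chart orders months by booking count, which hides the seasonal pattern. A possible complement is to put the counts in calendar order; the sketch below does this on a made-up series (the real input would be `Hotel_dataset['arrival_date_month']`).

```python
import pandas as pd

# Calendar order of the month names used in 'arrival_date_month'
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Toy arrivals (assumption, not real bookings)
months = pd.Series(["August", "July", "August", "May", "January"],
                   name="arrival_date_month")

# Count bookings per month and reorder them chronologically;
# months without bookings get a count of 0
monthly_counts = months.value_counts().reindex(month_order, fill_value=0)
print(monthly_counts)
```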

#### Chart - 4

In [None]:
# Chart - 4 visualization code
plot_bar_chart_from_column(Hotel_dataset, 'distribution_channel', 'Distribution Channel Volume')

##### 1. Why did you pick the specific chart?

The chart shows which distribution channel accounts for the largest volume of bookings. I chose a bar graph to present the counts in descending order.

##### 2. What is/are the insight(s) found from the chart?

TA/TO (Travel Agents / Tour Operators) is clearly the highest, so I recommend continuing to take bookings through TA/TO.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, this shows a positive business impact.

A higher number of TA/TO bookings will help to increase the hotel's revenue.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
exp3 = [0,0.3]
plot_pie_chart_from_column(Hotel_dataset, 'is_repeated_guest', 'Guest repeating status', exp3)



##### 1. Why did you pick the specific chart?

To show the percentage share of repeated & non-repeated guests.

##### 2. What is/are the insight(s) found from the chart?

The number of repeated guests is very small compared to the overall number of guests, which means that most people do not return to the same hotel.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

We can give attractive offers to non-repeating customers during off seasons to enhance revenue, and also improve service quality.

#### Chart - 6

In [None]:
# Chart - 6 visualization code
plot_bar_chart_from_column(Hotel_dataset, 'assigned_room_type', 'Assignment of room by type')

##### 1. Why did you pick the specific chart?

To show the distribution by volume, i.e. which room type is allotted most often.

##### 2. What is/are the insight(s) found from the chart?

This chart shows that room type 'A' is the one most preferred by guests.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:

# Chart - 7 visualization code
# Total guests per country, largest first
guest_country_wise_df = Hotel_dataset.groupby('country')['total_guest'].sum().sort_values(ascending = False)
top_10_country_by_guest = guest_country_wise_df.head(10)

plt.figure(figsize=(12,6))
sns.barplot(x = top_10_country_by_guest.index, y = top_10_country_by_guest.values).set(title='Top 10 Countries by Guest')
plt.show()
print("\n\nPRT = Portugal\nGBR = Great Britain & Northern Ireland\nFRA = France\nESP = Spain\nDEU = Germany\nITA = Italy\nIRL = Ireland\nBRA = Brazil\nBEL = Belgium\nNLD = Netherlands")



The chart shows the top 10 countries by total number of guests.

##### 1. Why did you pick the specific chart?

A bar chart makes it easy to compare countries; it shows that the largest number of guests comes from Portugal.

#### Chart - 8

In [None]:
# Chart - 8 visualization code

plt.figure(figsize = (12,6))
sns.scatterplot(y = 'total_stay_in_nights', x = 'adr', data = Hotel_dataset[Hotel_dataset['adr'] < 1000])
plt.show()

##### 1. Why did you pick the specific chart?

To show the relationship between the total length of stay and ADR.

##### 2. What is/are the insight(s) found from the chart?

The length of stay tends to decrease as ADR gets higher.
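The visual impression from the scatter plot can be backed by a number, for example the Pearson correlation between the two columns. This is a sketch on made-up values that applies the same `adr < 1000` filter as the chart; the sign and size of the coefficient on the real data are not claimed here.

```python
import pandas as pd

# Toy values (assumption); the last row mimics an extreme ADR outlier
toy = pd.DataFrame({
    "adr": [50.0, 80.0, 120.0, 200.0, 5400.0],
    "total_stay_in_nights": [7, 5, 3, 1, 1],
})

# Same outlier filter as in the scatter plot
filtered = toy[toy["adr"] < 1000]

# Pearson correlation: a negative value means stays get shorter as ADR rises
pearson_r = filtered["adr"].corr(filtered["total_stay_in_nights"])
print(round(pearson_r, 3))
```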

#### Chart - 9

In [None]:
# Chart - 9 visualization code
# Count bookings per hotel and meal type, one column per meal
hotel_wise_meal = Hotel_dataset.groupby(['hotel', 'meal'])['meal'].count().unstack()
hotel_wise_meal.plot(kind ='bar', figsize = (12,8))  # DataFrame.plot creates its own figure
plt.show()
hotel_wise_meal

##### 1. Why did you pick the specific chart?

To show the meal preference of the guests, hotel-wise.

##### 2. What is/are the insight(s) found from the chart?

BB (Bed & Breakfast) is the meal type most preferred by guests in both hotels. So the hotels can offer more delicious dishes in this meal to bring customers back and attract new customers.
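Because the two hotels have different booking volumes, raw counts can be hard to compare. One option is a row-normalised cross-tabulation, which gives each meal type's share within a hotel. The sketch below uses made-up rows; only the column names come from the dataset.

```python
import pandas as pd

# Toy bookings (assumption, not real data)
toy = pd.DataFrame({
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 2,
    "meal": ["BB", "BB", "BB", "HB", "BB", "HB"],
})

# normalize="index" makes every row (hotel) sum to 1; scale to percent
meal_share = pd.crosstab(toy["hotel"], toy["meal"], normalize="index").mul(100)
print(meal_share)
```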

#### Chart - 10

In [None]:
# Chart - 10 visualization code
# Number of bookings per market segment, largest first
market_segment_df_data = Hotel_dataset.groupby('market_segment')['market_segment'].count().sort_values(ascending = False)
plt.figure(figsize=(15,6))
market_segment_df_data.plot(kind = 'bar', color=['g', 'r', 'c', 'b', 'y', 'black', 'brown'], fontsize = 15, legend=True)
plt.show()

##### 1. Why did you pick the specific chart?

This chart shows which market segment accounts for the most bookings.

##### 2. What is/are the insight(s) found from the chart?

Online TA is the segment that guests use most frequently to book a hotel.


#### **Chart - 11 Correlation Heatmap**

In [None]:
# Chart - 11 visualization code
corr_ds = Hotel_dataset[['lead_time','previous_cancellations', 'previous_bookings_not_canceled', 'total_guest',
                    'booking_changes', 'days_in_waiting_list', 'adr', 'required_car_parking_spaces', 'total_of_special_requests']].corr()
f, ax = plt.subplots(figsize=(12, 12))
sns.heatmap(corr_ds, annot = True, fmt='.2f', annot_kws={'size': 10},  vmax=1, square=True, cmap="YlGnBu")

##### 1. Why did you pick the specific chart?

A correlation heatmap shows the pairwise correlation between the numerical variables in a single view.

##### 2. What is/are the insight(s) found from the chart?

The highest correlation coefficient between two different variables is 0.39 (positive) and the lowest is -0.09 (negative).
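Instead of reading the extremes off the heatmap, the strongest positive and negative pairs can be extracted from the correlation matrix directly. This is a minimal sketch on made-up numbers; the column names are borrowed from the dataset, the values are not.

```python
import numpy as np
import pandas as pd

# Toy numeric data (assumption, not real bookings)
toy = pd.DataFrame({
    "lead_time": [10, 50, 100, 200, 300],
    "adr": [120.0, 110.0, 90.0, 95.0, 70.0],
    "total_of_special_requests": [0, 1, 0, 2, 1],
})

corr = toy.corr()

# Keep only the upper triangle (k=1 excludes the diagonal) so each pair appears once
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

strongest_positive = pairs.idxmax()  # pair with the highest coefficient
strongest_negative = pairs.idxmin()  # pair with the lowest coefficient
print(strongest_positive, round(pairs.max(), 2))
print(strongest_negative, round(pairs.min(), 2))
```

Excluding the diagonal matters because every variable correlates perfectly (1.0) with itself, which would otherwise always be reported as the maximum.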

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ? 
Explain Briefly.

1. For a hotel business to flourish, it needs high revenue generation, customer satisfaction and employee retention.

2. The pie chart of month-wise bookings shows the client which months have the highest booking volume.

3. The bar chart of room types shows which rooms are assigned most often; together with the busiest months, this helps the client plan how to increase revenue.

4. With this information the client can prepare well in advance, so that guests face as few grievances as possible in the long run and the hotel's hospitality keeps improving.

5. The scatter plot shows that ADR drops sharply for longer stays, so in the off season the client can approach offices for bulk bookings, which will also help to generate extra revenue.

6. We are able to show the trend of visitor arrivals at the client's locations, so the client can arrange entertainment and leisure activities for visitors well in advance.

7. We were also able to correlate the numerical variables and show the highest and lowest correlation values between them, so that the client can see which relationships could be strengthened through various channels.

8. Hotels can improve occupancy forecasting and set competitive room prices to attract customers.

# **Conclusion**

1. City Hotel seems to be preferred among travellers, and it also generates more revenue.

2. The largest number of bookings is made in July and August, compared to the rest of the months.

3. Room type A is the most preferred room type among travellers.

4. Portugal, the UK, France, Spain and Germany are the countries that most guests come from; more than 80% of guests come from these 5 countries.

5. Most guests stay for 1-4 days in the hotels.

6. City Hotel retains a larger number of guests.

7. Around one-fourth of all bookings get cancelled. More cancellations are from City Hotel.

8. New guests tend to cancel bookings more often than repeated customers.

9. Lead time, the number of days on the waiting list and whether the reserved room type was assigned to the customer do not affect the cancellation of bookings.

10. Corporate has the highest percentage of repeated guests while TA/TO has the lowest, whereas for cancelled bookings TA/TO has the highest percentage while Corporate has the lowest.

11. The length of stay decreases as ADR increases, probably to reduce the cost.

12. Couples (or 2 adults) are the most common guest group, so hotels can plan their arrangements accordingly.
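A statement such as the share of guests coming from the top 5 countries can be checked with a few lines of pandas. The sketch below is an illustration only: `top_n_share` is a hypothetical helper and the guest totals are made up; on the real data the input would be the per-country sum of `total_guest`.

```python
import pandas as pd

def top_n_share(guest_totals: pd.Series, n: int) -> float:
    """Percentage of all guests that come from the n largest countries (hypothetical helper)."""
    return guest_totals.nlargest(n).sum() / guest_totals.sum() * 100

# Toy guest totals per country (assumption, not real figures)
guests = pd.Series({"PRT": 50, "GBR": 20, "FRA": 15, "ESP": 10, "DEU": 5})

share_top_2 = top_n_share(guests, 2)
print(share_top_2)
```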

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***