# **Project Name**    - Hotel Booking Analysis


##### **Project Type**    - EDA
##### **Contribution**    - Individual


# **Project Summary -**
Have you ever wondered when the best time of year to book a hotel room is? Or the optimal length of stay in order to get the best daily rate? What if you wanted to predict whether or not a hotel was likely to receive a disproportionately high number of special requests? This hotel booking dataset can help you explore those questions!


This data set contains booking information for a city hotel and a resort hotel, and includes information such as when the booking was made, length of stay, the number of adults, children, and/or babies, and the number of available parking spaces, among other things. All personally identifying information has been removed from the data.
Explore and analyze the data to discover important factors that govern the bookings.

# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


**Write Problem Statement Here.**

#### **Define Your Business Objective?**

Answer Here.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import math  # standard-library math module
import numpy as np
import pandas as pd
from numpy import loadtxt
import seaborn as sns
import matplotlib.pyplot as plt



### Dataset Loading

In [None]:
# Load Dataset
Hotel_booking = pd.read_csv('/content/Hotel_Bookings.csv')
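
The guidelines above ask for exception handling, so the loading step could be wrapped as in the sketch below. The helper name `load_bookings` and the path `no_such_folder/Hotel_Bookings.csv` are hypothetical and only illustrate the idea; they are not part of the original notebook.

```python
from pathlib import Path
from typing import Optional

import pandas as pd


def load_bookings(csv_path: str) -> Optional[pd.DataFrame]:
    """Return the bookings DataFrame, or None if the file cannot be read."""
    path = Path(csv_path)
    if not path.is_file():
        # Report the problem instead of letting the notebook stop with a traceback
        print(f"File not found: {path}")
        return None
    try:
        return pd.read_csv(path)
    except (pd.errors.EmptyDataError, pd.errors.ParserError) as exc:
        print(f"Could not parse {path}: {exc}")
        return None


# Hypothetical path, used only to show the failure branch
missing = load_bookings("no_such_folder/Hotel_Bookings.csv")
print(missing)
```

Returning `None` keeps the notebook executable in one go; the cells that follow would then need to check for `None` before using the result.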


### Dataset First View

In [None]:
# Dataset First Look
Hotel_booking.head()

In [None]:
Hotel_booking.tail()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
print(Hotel_booking.shape)
print(list(Hotel_booking.columns))

### Dataset Information

In [None]:
# Dataset Info
Hotel_booking.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
Hotel_booking.duplicated().sum()

In [None]:
Hotel_booking.duplicated().value_counts()
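
To make the duplicate check concrete, here is a minimal sketch on a tiny made-up frame (the values are invented, not taken from the dataset). It assumes the default behaviour of `duplicated()`, which marks every repeat of an earlier row and leaves the first occurrence unmarked.

```python
import pandas as pd

# Toy data: the second row repeats the first one exactly
toy = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "City Hotel"],
    "lead_time": [10, 10, 45, 7],
})

duplicate_mask = toy.duplicated()          # True only for the repeated row
n_duplicates = int(duplicate_mask.sum())   # number of rows that would be dropped
deduplicated = toy.drop_duplicates()       # keeps the first occurrence

print(n_duplicates, len(toy), len(deduplicated))
```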

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count (number of nulls in each column)
print(Hotel_booking.isnull().sum())
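
A per-column summary of missing values can also include the share of rows affected. The sketch below uses a small made-up frame; the column names mirror the dataset, but the values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy data with a few gaps
toy = pd.DataFrame({
    "children": [0.0, np.nan, 2.0, 0.0],
    "country": ["PRT", None, "GBR", None],
    "adults": [2, 2, 1, 3],
})

missing_count = toy.isnull().sum()            # nulls per column
missing_percent = toy.isnull().mean() * 100   # share of rows that are null
summary = pd.DataFrame({"missing": missing_count, "percent": missing_percent})
print(summary)
```

`isnull().sum()` counts the nulls, whereas `isnull().count()` would only count rows, which is why the sum is the one to use here.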

In [None]:
# Visualizing the missing values (number of nulls per column)
plt.figure(figsize=(10, 8))
Hotel_booking.isnull().sum().plot(kind='bar')
plt.ylabel('number of missing values')

### What did you know about your dataset?

The dataset has 119390 rows and 32 columns. It contains booking information for a city hotel and a resort hotel, and includes information such as when the booking was made, length of stay, the number of adults, children, and/or babies, and the number of available parking spaces, among other things. All personally identifying information has been removed from the data.
The goal is to explore and analyze the data to discover the important factors that govern the bookings.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
Hotel_booking.columns

In [None]:
# Dataset Describe
Hotel_booking.describe()

### Variables Description


| # | Column | Description |
|---|--------|-------------|
| 0 | `hotel` | Type of hotel (Resort Hotel or City Hotel) |
| 1 | `is_canceled` | Value indicating if the booking was canceled (1) or not (0) |
| 2 | `lead_time` | Number of days that elapsed between the entering date of the booking into the PMS (property management system) and the arrival date |
| 3 | `arrival_date_year` | Year of arrival date |
| 4 | `arrival_date_month` | Month of arrival date |
| 5 | `arrival_date_week_number` | Week number of arrival date |
| 6 | `arrival_date_day_of_month` | Day of month of arrival date |
| 7 | `stays_in_weekend_nights` | Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel |
| 8 | `stays_in_week_nights` | Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel |
| 9 | `adults` | Number of adults |
| 10 | `children` | Number of children |
| 11 | `babies` | Number of babies |
| 12 | `meal` | Type of meal |
| 13 | `country` | Country of origin |
| 14 | `market_segment` | Market segment designation. In categories, the term “TA” means “Travel Agents” and “TO” means “Tour Operators” |
| 15 | `distribution_channel` | Booking distribution channel. The term “TA” means “Travel Agents” and “TO” means “Tour Operators” |
| 16 | `is_repeated_guest` | Value indicating if the booking name was from a repeated guest (1) or not (0) |
| 17 | `previous_cancellations` | Number of previous bookings that were cancelled by the customer prior to the current booking |
| 18 | `previous_bookings_not_canceled` | Number of previous bookings not cancelled by the customer prior to the current booking |
| 19 | `reserved_room_type` | Code of room type reserved. Code is presented instead of designation for anonymity reasons |
| 20 | `assigned_room_type` | Code for the type of room assigned to the booking |
| 21 | `booking_changes` | Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation |
| 22 | `deposit_type` | Indication of whether the customer made a deposit to guarantee the booking |
| 23 | `agent` | ID of the travel agency that made the booking |
| 24 | `company` | ID of the company/entity that made the booking or is responsible for paying the booking |
| 25 | `days_in_waiting_list` | Number of days the booking was in the waiting list before it was confirmed to the customer |
| 26 | `customer_type` | Type of booking, assuming one of four categories |
| 27 | `adr` | Average Daily Rate, defined by dividing the sum of all lodging transactions by the total number of staying nights |
| 28 | `required_car_parking_spaces` | Number of car parking spaces required by the customer |
| 29 | `total_of_special_requests` | Number of special requests made by the customer (e.g. twin bed or high floor) |
| 30 | `reservation_status` | Last reservation status, assuming one of three categories |
| 31 | `reservation_status_date` | Date of the last reservation status |

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for col in Hotel_booking.columns.tolist():
    print("Number of unique values in", col, "is", Hotel_booking[col].nunique())

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
# Work on a copy named df so the raw data stays unchanged
df = Hotel_booking.copy()
# Drop duplicate rows so the data is ready for analysis
df = df.drop_duplicates()
df.info()

### What all manipulations have you done and insights you found?

Here we copied the dataset into `df` and dropped the duplicate rows to clean the data; the duplicate counts were checked earlier in the notebook.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

In [None]:
# First, count the null values in each column
df.isna().sum()

Null values are present in the children, country, agent, and company columns.

#### Chart - 1

In [None]:
# Chart - 1 visualization code
plt.figure(figsize=(20,6))
sns.heatmap(df.isnull(),cmap='viridis')

In [None]:
# Fill the null values: mode for children, placeholder values for the other columns
children = df['children'].mode()[0]
df['children'] = df['children'].fillna(children)
df['country'] = df['country'].fillna('others')
df['agent'] = df['agent'].fillna(0)
df['company'] = df['company'].fillna('unknown')
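
The same filling strategy can be expressed with a single `fillna` call that takes a dictionary of per-column values. The sketch below runs on a small made-up frame; only the column names and the fill values come from the cell above.

```python
import numpy as np
import pandas as pd

# Toy data with gaps in three columns
toy = pd.DataFrame({
    "children": [0.0, np.nan, 2.0, 0.0],
    "country": ["PRT", None, "GBR", "PRT"],
    "agent": [9.0, np.nan, 240.0, np.nan],
})

# One fill value per column: the mode for children, constants for the rest
fill_values = {
    "children": toy["children"].mode()[0],
    "country": "others",
    "agent": 0,
}
filled = toy.fillna(value=fill_values)
print(filled)
```

Assigning the result to a new name, rather than using `inplace=True` on a single column, avoids the chained-assignment warnings that recent pandas versions emit.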

##### 1. Why did you pick the specific chart?

I picked this chart to analyze the missing values; a heatmap of `df.isnull()` shows where they occur.

##### 2. What is/are the insight(s) found from the chart?

I found the missing values and handled all of them.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. Handling the missing values gives a cleaner basis for understanding the business.

#### Chart - 2

In [None]:

# Hotel types and preferred hotel type
df['hotel'].unique()


In [None]:
# number of bookings
hotelbooking = df['hotel'].value_counts().reset_index()
hotelbooking.columns=['hotel type','number of bookings']
hotelbooking

In [None]:
#graphical representation
# Chart - 2 visualization code
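
The answers below refer to a bar graph of bookings per hotel type. A minimal sketch of such a chart follows, assuming a two-column frame shaped like `hotelbooking` above. The labels in `hotels` are made up for illustration; in the notebook the counts would come from `df['hotel'].value_counts()`.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up hotel labels standing in for df['hotel']
hotels = pd.Series(
    ["City Hotel", "Resort Hotel", "City Hotel", "City Hotel", "Resort Hotel"],
    name="hotel",
)
counts = hotels.value_counts()
hotelbooking = pd.DataFrame({
    "hotel type": counts.index,
    "number of bookings": counts.values,
})

# One bar per hotel type, height = number of bookings
fig, ax = plt.subplots(figsize=(8, 6))
ax.bar(hotelbooking["hotel type"], hotelbooking["number of bookings"],
       color=["tab:blue", "tab:orange"])
ax.set_title("Number of bookings by hotel type")
ax.set_xlabel("hotel type")
ax.set_ylabel("number of bookings")
```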




##### 1. Why did you pick the specific chart?

I picked this chart because a bar graph makes comparison easy.

##### 2. What is/are the insight(s) found from the chart?

I found that the city hotel is used more than the resort hotel.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The positive finding is that the city hotel does more business than the resort hotel.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
# Booking hotels year wise
plt.figure(figsize=(10,6))
sns.countplot(x=df['arrival_date_year'],hue=df['hotel'])
plt.title("Year wise booking")

##### 1. Why did you pick the specific chart?

I picked this chart to compare the values.

##### 2. What is/are the insight(s) found from the chart?

I found that, except in 2015, the city hotel had more bookings than the resort hotel in both of the other years.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

With this chart we can analyze the business year by year.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
# Pick the business season: number of bookings per arrival month
data = df["arrival_date_month"].value_counts()
plt.figure(figsize=(10,6))
data.plot(kind='bar',color='r',grid=True)
plt.title("Bookings per arrival month")
plt.xlabel("arrival month")
plt.ylabel("number of bookings")
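
`value_counts()` sorts the months by frequency, so the bars do not follow the calendar. If a seasonal reading is wanted, the counts can be reindexed into calendar order, as in this sketch. It assumes the month column holds full English month names, and the sample values are made up.

```python
import pandas as pd

# Calendar order used to arrange the bars
month_order = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Made-up arrivals standing in for df["arrival_date_month"]
arrivals = pd.Series(
    ["August", "January", "August", "July", "January", "August"],
    name="arrival_date_month",
)

# Count per month, then reorder; months without bookings get 0
bookings_per_month = arrivals.value_counts().reindex(month_order, fill_value=0)
print(bookings_per_month)
```

Calling `bookings_per_month.plot(kind='bar')` on the result then draws the months from January to December.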

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 5

In [None]:
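# Chart - 5 visualization code
# Cancellation counts: 0 = booking not canceled, 1 = booking canceled
cancellationRate = df['is_canceled'].value_counts().reset_index()
cancellationRate.columns = ["cancellation_show", "user"]
# Replace the 0/1 codes with readable labels for the x-axis
cancellationRate["cancellation_show"] = cancellationRate["cancellation_show"].map({0: "not canceled", 1: "canceled"})
sns.barplot(x=cancellationRate["cancellation_show"], y=cancellationRate["user"], palette='tab10')
plt.title("Canceled vs. non-canceled bookings")
plt.xlabel("booking status")
plt.ylabel("number of bookings")
plt.show()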
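
The chart above shows absolute counts. To report an actual cancellation rate, the counts can be normalized to percentages, as in this sketch on made-up 0/1 flags (the values are invented, not taken from the dataset).

```python
import pandas as pd

# Made-up flags standing in for df['is_canceled']: 1 = canceled, 0 = not canceled
is_canceled = pd.Series([0, 1, 0, 0, 1, 0, 0, 0], name="is_canceled")

# Share of each outcome in percent
cancellation_share = is_canceled.value_counts(normalize=True).mul(100).round(1)
cancellation_share.index = cancellation_share.index.map(
    {0: "not canceled", 1: "canceled"}
)
print(cancellation_share)
```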

##### 1. Why did you pick the specific chart?

I want to show the guest cancellations as a comparison.

##### 2. What is/are the insight(s) found from the chart?

It helps to check how many guests canceled.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

It makes the guest cancellations easy to measure, so we can act on them.

#### Chart - 6

In [None]:
# Chart - 6 visualization code
# Room types offered by the hotels and how often each one is reserved
room_by_hotel = df["reserved_room_type"].unique()
prefer_room_type=df["reserved_room_type"].value_counts().reset_index()
prefer_room_type.columns=["room_type","Number of preference"]
sns.barplot(x=prefer_room_type["room_type"],y=prefer_room_type["Number of preference"],palette="Set2")
plt.title("room preference")
plt.ylabel("Number of preference")
plt.show()

##### 1. Why did you pick the specific chart?

It shows the values clearly.

##### 2. What is/are the insight(s) found from the chart?

It shows that room type A is the most preferred, which will help in maintaining that preference.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

It helps in studying the business and in maintaining rooms and facilities according to their type.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
# Types of meal offered by the hotels and the most preferred meal
meal_preference = df["meal"].value_counts().reset_index()
meal_preference.columns = ["meal_type", "number of preference"]
sns.barplot(x=meal_preference["meal_type"], y=meal_preference["number of preference"], palette="tab10")
plt.title("Meal preference")
plt.ylabel("number of preference")
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
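
One possible way to fill this cell is sketched below on synthetic data. The random values are made up, and the column choice is only an example; in the notebook the correlation would be computed on the numeric columns of `df`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for df: three numeric columns and one text column
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "lead_time": rng.integers(0, 300, size=50),
    "adr": rng.normal(100, 30, size=50),
    "total_of_special_requests": rng.integers(0, 4, size=50),
    "hotel": ["City Hotel", "Resort Hotel"] * 25,
})

# Correlation is only defined for numeric columns, so drop the text column first
corr = toy.select_dtypes(include="number").corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation heatmap (synthetic data)")
```

Fixing `vmin` and `vmax` at -1 and 1 keeps the colour scale symmetric around zero, so positive and negative correlations are easy to tell apart.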

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

Answer Here.

# **Conclusion**

Write the conclusion here.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***