<a href="https://colab.research.google.com/github/rashmi0852/AirBnb-Booking-Analysis-EDA--/blob/main/Indivisual_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - AirBnb Booking Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Name**            - Rashmiranjan Nayak


# **Project Summary -**

Airbnb, Inc. is an American company that operates an online marketplace for lodging, primarily homestays for vacation rentals, and tourism activities. Based in San Francisco, California, the platform is accessible via website and mobile app.

Since 2008, guests and hosts have used Airbnb to expand travel possibilities and to offer a more unique, personalized way of experiencing the world. Today, Airbnb has become a one-of-a-kind service that is used and recognized around the world.

The Exploratory Data Analysis (EDA) project conducted on Airbnb data aimed to gain insights and derive meaningful conclusions from a comprehensive dataset related to Airbnb listings and bookings. EDA is a crucial phase in data analysis that involves investigating and visualizing the data to uncover patterns, trends, outliers, and relationships that can guide further analysis and decision-making. In this project, we delved into the Airbnb dataset to extract valuable information and provide actionable insights for both hosts and potential guests.

The dataset used in this analysis is the Airbnb NYC 2019 listings file (`Airbnb NYC 2019.csv`). It covers listing details, host information, geographical location, pricing, reviews, and availability, with attributes such as room type, neighbourhood group, and neighbourhood. The primary objectives of this EDA project were to understand the key factors influencing listing prices, identify popular neighbourhoods and room types, analyse booking trends, and explore the relationships between different variables.

The project began with data preprocessing, where the dataset was cleaned and transformed to ensure consistency and accuracy.

Missing values were handled appropriately, and outliers were identified and addressed to prevent skewed analysis.

Once the data was prepared, various visualization techniques were employed to uncover insights.



# **GitHub Link -**

https://github.com/rashmi0852/AirBnb-Booking-Analysis-EDA--.git


# **Problem Statement**


Since 2008, guests and hosts have used Airbnb primarily for homestays, vacation rentals, and tourism activities. The challenge is to extract meaningful insights from Airbnb booking data quickly and efficiently. This project performs Exploratory Data Analysis (EDA) on Airbnb booking records to uncover trends, patterns, and correlations. With the rapid growth of Airbnb's user base, understanding booking behavior, peak seasons, and preferred accommodation types is crucial. This EDA project addresses the demand for actionable insights that hosts and travelers can use to optimize their experiences. By dissecting booking data, we aim to support data-driven decisions, helping hosts set better pricing strategies and helping travelers find ideal stays efficiently.

#### **Define Your Business Objective?**

The business objective is to use comprehensive Airbnb booking analysis to inform strategic decisions: optimizing host pricing models, enhancing traveler experiences, and empowering Airbnb's ecosystem through data-driven insights.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that each chart follows the format below.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import warnings
warnings.filterwarnings("ignore")

### Dataset Loading

In [None]:
# Load Dataset
airbnb=pd.read_csv("Airbnb NYC 2019.csv")
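
The guidelines above ask for exception handling and a notebook that runs in one go. As a minimal sketch of that idea (the helper name `load_listings` is hypothetical and not part of the original notebook), the load step could fail early with a readable message when the CSV is missing or has no rows:

```python
from pathlib import Path

import pandas as pd


def load_listings(csv_path: str) -> pd.DataFrame:
    """Read a listings CSV, failing early with a readable message."""
    path = Path(csv_path)
    if not path.is_file():
        # Report the absolute path so the problem is easy to locate
        raise FileNotFoundError(f"Dataset not found at: {path.resolve()}")
    frame = pd.read_csv(path)
    if frame.empty:
        raise ValueError(f"Dataset at {path} contains no rows")
    return frame


# Hypothetical usage, assuming the CSV sits next to the notebook:
# airbnb = load_listings("Airbnb NYC 2019.csv")
```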

### Dataset First View

In [None]:
airbnb.head().T

In [None]:
airbnb.tail().T

### Dataset Rows & Columns count

In [None]:
#Columns count

print("columns of dataset---",airbnb.columns)
print("columns count=",len(airbnb.columns))

In [None]:
#Rows count

print("Rows count=",len(airbnb.axes[0]))

In [None]:
#Shape of Dataset
airbnb.shape

### Dataset Information

In [None]:
# Dataset Info
airbnb.info()

The dataset has 16 columns in total: 10 numeric and 6 non-numeric (object) columns. One of the object columns, last_review, holds dates.
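
The numeric versus non-numeric split can also be computed instead of read off `info()` by eye. The sketch below uses a small made-up frame (its values are illustrative only, not taken from the dataset) and `select_dtypes`; it also shows that a date column read from CSV stays text until it is converted explicitly.

```python
import pandas as pd

# Toy frame standing in for the real data (values are made up for illustration)
sample = pd.DataFrame({
    "id": [1, 2],
    "name": ["Cozy loft", "Sunny room"],
    "price": [120, 80],
    "reviews_per_month": [0.5, None],
    "last_review": ["2019-05-01", None],
})

# Split the columns by dtype
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
other_cols = sample.select_dtypes(exclude="number").columns.tolist()
print("Numeric:", numeric_cols)
print("Non-numeric:", other_cols)

# last_review is text at this point; convert it explicitly to a date type
sample["last_review"] = pd.to_datetime(sample["last_review"])
```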


#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count

duplicate_values=airbnb.duplicated().sum()
print("Duplicate rows in AirBnb dataset:",duplicate_values)

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
print (f"Missing Values in each column"+"\n" + "--"*15)
print(airbnb.isnull().sum())

In [None]:

# Percentage of missing values in each column of the Airbnb dataset
print("Missing Values % in each column" + "\n" + "--" * 15)
(airbnb.isnull().mean())*100





*   **name** has **16** null values in total, which is **0.03**%
*   **host_name** has **21** null values in total, which is **0.04**%
*   **last_review** has **10,052** null values in total, which is **20.5**%
*   **reviews_per_month** has **10,052** null values in total, which is **20.5**%









In [None]:
# Visualizing the missing values of Airbnb Dataset


# Calculating the percentage of missing values in each column
missing_percent = (airbnb.isnull().sum() / len(airbnb)) * 100

# Creating a DataFrame to store column names and corresponding missing percentages
missing_df = pd.DataFrame({'Column': airbnb.columns, 'MissingPercent': missing_percent})

# Sorting the DataFrame by missing percentages in descending order
missing_df = missing_df.sort_values(by='MissingPercent', ascending=False)

# Creating a simple vertical bar plot with missing percentages displayed at the top
plt.figure(figsize=(10, 6))
bars = plt.bar(missing_df['Column'], missing_df['MissingPercent'], color='purple')

# Adding missing percentages as text labels on top of each bar
for bar in bars:
    plt.text(bar.get_x() + bar.get_width() / 2, bar.get_height(), f'{bar.get_height():.2f}%',
             ha='center', va='bottom', color='black', fontsize=10)

# Customizing the plot
plt.xlabel('Columns')
plt.ylabel('Percentage of Missing Values')
plt.title('Missing Values Percentage in Airbnb Dataset')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()

# Showing the plot
plt.show()




1. Why did you pick the specific chart?

 A bar plot is a suitable choice for visualizing missing values in a dataset due to its capacity to present clear comparisons and relative proportions of missing data across columns. With bars directly corresponding to missing value percentages, it provides an intuitive and easily interpretable representation, making it accessible to a wide audience. Bar plots are effective in highlighting columns with high or low missing value percentages, aiding in prioritizing data cleaning efforts.

2. What are the insights found from the chart?



*   The columns ***last_review*** & ***reviews_per_month*** contain the largest proportion of NaN values, around **20.5%**. Dropping these rows would lose a significant share of the data and could reduce the accuracy of the analysis. A better approach is to impute the missing ***reviews_per_month*** values, either with an aggregate metric such as the mean, median, or mode of the remaining values, or with 0.
*   The other columns ***name*** & ***host_name*** have a much lower percentage of null values, ***0.03%*** & ***0.04%*** respectively. Because the missing percentage is so low, we can delete those records with almost no effect on the dataset.
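
The trade-off between dropping and imputing can be checked on a small example. This is a sketch on made-up values (not the real dataset): it compares how many rows each option keeps and tests the assumption that a missing review rate only occurs on listings with zero reviews.

```python
import pandas as pd

# Toy data, made up for illustration
sample = pd.DataFrame({
    "name": ["A", None, "C", "D", "E"],
    "number_of_reviews": [3, 0, 0, 7, 1],
    "reviews_per_month": [0.4, None, None, 1.2, 0.1],
})

# Option 1: drop every row with a missing reviews_per_month
dropped = sample.dropna(subset=["reviews_per_month"])

# Option 2: keep the rows and treat "never reviewed" as a rate of 0
filled = sample.assign(reviews_per_month=sample["reviews_per_month"].fillna(0))

# Sanity check for the assumption behind option 2
missing_only_when_unreviewed = bool(
    sample.loc[sample["reviews_per_month"].isna(), "number_of_reviews"].eq(0).all()
)

print("Rows kept when dropping:", len(dropped))
print("Rows kept when filling:", len(filled))
print("Missing only on unreviewed listings:", missing_only_when_unreviewed)
```

If the sanity check holds on the real data, filling with 0 describes those listings accurately rather than guessing a value.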







### What did you know about your dataset?

Exploring the ***AirBnb*** dataset gave the following insights.

The dataset has ***48895*** records & **16** features.
The features are described below:


* ***id*** : A unique ID identifying an Airbnb listing
* ***name*** : Name of the accommodation/property
* ***host_id*** : A unique ID identifying an Airbnb host
* ***host_name*** : Name under which the host is registered
* ***neighbourhood_group*** : Group of areas (location) the listing belongs to
* ***neighbourhood*** : Area within the neighbourhood_group
* ***latitude*** : Latitude coordinate of the property
* ***longitude*** : Longitude coordinate of the property
* ***room_type*** : Category of the listed room
* ***price*** : Price of the listing
* ***minimum_nights*** : Minimum number of nights required per stay
* ***number_of_reviews*** : Total number of reviews given by visitors
* ***last_review*** : Date of the most recent review
* ***reviews_per_month*** : Number of reviews received per month
* ***calculated_host_listings_count*** : Total number of listings registered under the host
* ***availability_365*** : Number of days in a year the property is available





We also found that, among all 16 features:


*   ***name***, ***host_name***, ***neighbourhood_group***, ***neighbourhood***, and ***room_type*** are categorical features
*   only one column, ***last_review***, holds dates
*   ***latitude*** and ***longitude*** represent coordinates, and the rest are numeric.



Four columns contain null values: ***name***, ***host_name***, ***last_review***, and ***reviews_per_month***. For ***name*** and ***host_name*** we have no sensible substitute, so we drop the corresponding records. A missing ***reviews_per_month*** is valid (the listing has simply never received a review), so we fill those nulls with 0 using fillna(0). ***last_review*** is not needed for this analysis, so its missing values need no handling; we drop the column along with ***longitude*** & ***latitude***.

Handling missing values

In [None]:
#Removing null records based on Name,Host_Name column
airbnb = airbnb.dropna(subset=['name', 'host_name'])


In [None]:
#filling 0 to null values in reviews_per_month column
airbnb['reviews_per_month'] = airbnb['reviews_per_month'].fillna(0)

Removing unnecessary columns

In [None]:
#Removing longitude,latitude,last_review column

airbnb.drop(["longitude","latitude","last_review"],axis=1,inplace=True)

In [None]:
airbnb.shape

In [None]:
airbnb.isnull().sum()

After handling null values & removing unnecessary features, the records are reduced to **48858** & the features to **13**. The **AirBnb** dataset is now largely clean & we can proceed to the next step.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
airbnb.columns

In [None]:
# Dataset Describe
airbnb.describe()

### Variables Description

Features like **name**, **host_name**, **neighbourhood_group**,
 **neighbourhood**, **room_type** are categorical columns, which the describe method skips by default, so only the remaining features are described:
 * The count for every numeric column, including **id** and **host_id**, is **48858**, the same as the number of records, so none of them has missing values
 * The minimum, maximum & count of the remaining columns look normal, except for the **price** column, whose minimum is **0**; this is most likely a data-entry error. We handle it by dropping those misleading records. In addition, hosts could be asked to update their prices on the Airbnb website.

In [None]:
# Dropping records that have 0 as price
airbnb = airbnb[airbnb['price'] != 0]

In [None]:
airbnb.describe()

In [None]:
airbnb.shape

The AirBnb dataset is now cleaned & the description looks normal; the records are reduced to **48847** with **13** features.
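
The project summary mentions identifying outliers, while the cells above only remove zero-price records. As a minimal sketch, assuming the common 1.5 × IQR rule (an assumption for illustration, not a step this notebook performs), unusually high or low prices could be flagged like this. The price values are made up.

```python
import pandas as pd

# Toy prices, made up for illustration; the last value is an obvious extreme
prices = pd.Series([40, 60, 75, 90, 100, 120, 150, 10000], name="price")

# Interquartile range and the usual 1.5 * IQR fences
q1, q3 = prices.quantile([0.25, 0.75])
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside the fences are flagged, not deleted
outliers = prices[(prices < lower) | (prices > upper)]
print(outliers)
```

Whether flagged listings should be removed is a separate decision: a very expensive listing may be genuine, so flagging first keeps that choice explicit.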

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for name column
print("Name of properties are=",airbnb["name"].unique())
#number of unique values in name column
print("Unique count=",airbnb["name"].nunique())

In [None]:
# Check Unique Values for host_name column
print("Host names are=",airbnb["host_name"].unique())
#number of unique values in host_name column
print("Unique count=",airbnb["host_name"].nunique())

In [None]:
# Check Unique Values for neighbourhood_group column
print(" Name of neighbourhood_groups are=",airbnb["neighbourhood_group"].unique())
#number of unique values in neighbourhood_group column
print("Unique count=",airbnb["neighbourhood_group"].nunique())

In [None]:
# Check Unique Values for neighbourhood column
print(" Name of neighbourhoods are=",airbnb["neighbourhood"].unique())
#number of unique values in neighbourhood column
print("Unique count=",airbnb["neighbourhood"].nunique())

In [None]:
# Check Unique Values for room_type  column
print(" Types of rooms available are=",airbnb["room_type"].unique())
#number of unique values in room_type  column
print("Unique count=",airbnb["room_type"].nunique())

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
#problem statement-1
#What is the distribution of listings across different neighborhood_groups? & what is the average price of property in each neighbourhood_groups?


# Group data by neighborhood_group and calculate the count and average price
grouped_data = airbnb.groupby('neighbourhood_group').agg({'id': 'count', 'price': 'mean'}).reset_index()
grouped_data.rename(columns={'id': 'property_count', 'price': 'avg_price'}, inplace=True)

# Sort by property_count in descending order
grouped_data = grouped_data.sort_values(by='property_count', ascending=False)




In [None]:
#problem statement-2
#What are the different types of room listings available? & How does the price vary across different room types?

# Get the counts of each room type
room_type_counts = airbnb['room_type'].value_counts()

# Calculate the average cost of each room type
avg_price_by_room_type = airbnb.groupby('room_type')['price'].mean()






In [None]:
#problem statement-3: Who are the top hosts (based on the number of listings)?


# Get top hosts based on number of listings
top_hosts = airbnb['host_name'].value_counts().head(10)

# Filter data to include only properties of top hosts
top_hosts_data = airbnb[airbnb['host_name'].isin(top_hosts.index)]

# Calculate average price of properties for top hosts
avg_price_top_hosts = top_hosts_data.groupby('host_name')['price'].mean().sort_values(ascending=False)
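
The cell above counts listings by host_name, which merges different hosts who share the same name. A possible alternative, sketched here on made-up data, is to group by host_id and keep host_name only as a label. The column names `listing_count` and `avg_price` are illustrative choices.

```python
import pandas as pd

# Toy data, made up for illustration: hosts 1 and 2 are both called "Alex"
sample = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3, 4],
    "host_name": ["Alex", "Alex", "Alex", "Sam", "Sam", "Sam", "Ana"],
    "price": [100, 120, 80, 200, 220, 240, 60],
})

# Counting by name adds the two different "Alex" hosts together
by_name = sample["host_name"].value_counts()

# Counting by host_id keeps each host account separate
by_host = (
    sample.groupby(["host_id", "host_name"])
    .agg(listing_count=("price", "size"), avg_price=("price", "mean"))
    .sort_values("listing_count", ascending=False)
    .reset_index()
)

print(by_name)
print(by_host)
```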






In [None]:
#problem statement-4
# How does the count of different room types vary across various neighborhood groups in the  listings?

# Group data by neighborhood_group and room_type, then calculate counts
roomtype_counts_by_neighbourhood = airbnb.groupby(['neighbourhood_group', 'room_type']).size().reset_index(name='count')




In [None]:
#problem statement-5
#which are top 5 & bottom 5 neighbourhoods in terms of listings?


# Group data by neighborhood and calculate room count
room_count_by_neighbourhood = airbnb['neighbourhood'].value_counts()

# Find the top 5 and bottom 5 neighbourhoods (the variable names keep their original top_10/bottom_10 prefix)
top_10_neighbourhoods = room_count_by_neighbourhood.head(5)
bottom_10_neighbourhoods = room_count_by_neighbourhood.tail(5)



In [None]:
#problem statement-6
#How does the room count vary among  neighbourhoods in each neighbourhood groups?(Top 3)

# Group data by neighbourhood_group and neighbourhood, then calculate room count
room_count_by_neighbourhood = airbnb.groupby(['neighbourhood_group', 'neighbourhood'])['id'].count()

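# Select the top 3 neighbourhoods within each neighbourhood_group
# group_keys=False avoids repeating neighbourhood_group as an extra index level in the result
top_3_neighbourhoods = room_count_by_neighbourhood.groupby('neighbourhood_group', group_keys=False).nlargest(3)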




In [None]:
#problem statement-7
# How does the price of different room types vary across various neighborhood groups?

# Group data by neighbourhood_group, room_type, and calculate average price
group_data = airbnb.groupby(['neighbourhood_group', 'room_type']).agg({'price': 'mean'}).reset_index()

# Create a pivot table to reshape the data for plotting
pivot_table = group_data.pivot_table(index='neighbourhood_group', columns='room_type', values='price')
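
The groupby followed by a pivot can also be written as a single `pivot_table` call on the raw data. This is a sketch on made-up rows to show the resulting shape: one row per neighbourhood group, one column per room type, and the mean price in each cell.

```python
import pandas as pd

# Toy data, made up for illustration
sample = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Brooklyn", "Brooklyn"],
    "room_type": ["Private room", "Entire home/apt", "Private room", "Private room", "Entire home/apt"],
    "price": [100, 200, 60, 80, 150],
})

# Rows: neighbourhood groups, columns: room types, cells: mean price
avg_price = sample.pivot_table(
    index="neighbourhood_group", columns="room_type", values="price", aggfunc="mean"
)
print(avg_price)
```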



### What all manipulations have you done and insights you found?

Answer Here.

## ***4. Data Vizualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
#What is the distribution of listings across different neighborhood_groups ? & what is the average price of property in each neighbourhood_groups?


# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: Number of Properties Listed by Neighborhood Group
sns.barplot(data=grouped_data, x='neighbourhood_group', y='property_count', palette='viridis', ax=axes[0])
axes[0].set_title('Number of Properties Listed by Neighborhood Group')
axes[0].set_xlabel('Neighborhood Group')
axes[0].set_ylabel('Number of Properties')
axes[0].set_xticklabels(axes[0].get_xticklabels(), rotation=45)

# Add data labels to the bars
for p in axes[0].patches:
    axes[0].annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Plot 2: Average Price per Property by Neighborhood Group
sns.barplot(data=grouped_data, x='neighbourhood_group', y='avg_price', palette='viridis', ax=axes[1])
axes[1].set_title('Average Price per Property by Neighborhood Group')
axes[1].set_xlabel('Neighborhood Group')
axes[1].set_ylabel('Average Price')
axes[1].set_xticklabels(axes[1].get_xticklabels(), rotation=45)

# Add data labels to the bars
for p in axes[1].patches:
    axes[1].annotate(f'${p.get_height():.2f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Adjust layout and display
plt.tight_layout()
plt.show()



##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 2

In [None]:
# Chart - 2 visualization code
#What are the different types of room listings available? &How does the price vary across different room types?

# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: Count of Different Room Types
room_type_counts.plot(kind='bar', ax=axes[0], color='skyblue')
axes[0].set_title('Count of Different Room Types')
axes[0].set_xlabel('Room Type')
axes[0].set_ylabel('Count')
axes[0].set_xticklabels(room_type_counts.index, rotation=45)

# Add data labels to the bars for the first plot
for p in axes[0].patches:
    axes[0].annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Plot 2: Average Cost of Each Room Type
avg_price_by_room_type.plot(kind='bar', ax=axes[1], color='yellow')
axes[1].set_title('Average Cost of Each Room Type')
axes[1].set_xlabel('Room Type')
axes[1].set_ylabel('Average Price')
axes[1].set_xticklabels(avg_price_by_room_type.index, rotation=45)

# Add data labels to the bars for the second plot
for p in axes[1].patches:
    axes[1].annotate(f'${p.get_height():.2f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Adjust layout and display
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 3

In [None]:
# Chart - 3 visualization code

# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: Top Hosts by Number of Listings
top_hosts.plot(kind='bar', ax=axes[0], color='blue')
axes[0].set_title('Top Hosts by Number of Listings')
axes[0].set_xlabel('Host Name')
axes[0].set_ylabel('Number of Listings')
axes[0].set_xticklabels(top_hosts.index, rotation=45)

# Add data labels to the bars for the first plot
for p in axes[0].patches:
    axes[0].annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Plot 2: Average Price of Properties for Top Hosts
avg_price_top_hosts.plot(kind='bar', ax=axes[1], color='green')
axes[1].set_title('Average Price of Properties for Top Hosts')
axes[1].set_xlabel('Host Name')
axes[1].set_ylabel('Average Price')
axes[1].set_xticklabels(avg_price_top_hosts.index, rotation=45)

# Add data labels to the bars for the second plot
for p in axes[1].patches:
    axes[1].annotate(f'${p.get_height():.2f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Adjust layout and display
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 4

In [None]:
# Chart - 4 visualization code

# How does the count of different room types vary across various neighborhood groups in the  listings?

# Create a bar plot
plt.figure(figsize=(10, 6))
sns.barplot(data=roomtype_counts_by_neighbourhood, x='neighbourhood_group', y='count', hue='room_type')
plt.title('Room Type Count in Each Neighborhood Group')
plt.xlabel('Neighborhood Group')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.legend(title='Room Type')

# Annotate bars with values
for p in plt.gca().patches:
    plt.gca().annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                       ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Adjust layout and display
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 5

In [None]:
# Chart - 5 visualization code
#which are top 5 & bottom 5 neighbourhoods in terms of listings?


# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: Top 5 Neighborhoods by Room Count
top_10_neighbourhoods.plot(kind='bar', ax=axes[0], color='green')
axes[0].set_title('Top 5 Neighborhoods by Room Count')
axes[0].set_xlabel('Neighbourhood')
axes[0].set_ylabel('Room Count')
axes[0].set_xticklabels(top_10_neighbourhoods.index, rotation=45)

# Add data labels to the bars for the first plot
for p in axes[0].patches:
    axes[0].annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Plot 2: Bottom 5 Neighborhoods by Room Count
bottom_10_neighbourhoods.plot(kind='bar', ax=axes[1], color='blue')
axes[1].set_title('Bottom 5 Neighborhoods by Room Count')
axes[1].set_xlabel('Neighbourhood')
axes[1].set_ylabel('Room Count')
axes[1].set_xticklabels(bottom_10_neighbourhoods.index, rotation=45)

# Add data labels to the bars for the second plot
for p in axes[1].patches:
    axes[1].annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()),
                      ha='center', va='center', xytext=(0, 10), textcoords='offset points')

# Adjust layout and display
plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 6

In [None]:
# Chart - 6 visualization code
#How does the room count vary among  neighbourhoods in each neighbourhood groups?(Top 3)

# Create a grouped bar plot
plt.figure(figsize=(12, 8))
ax = top_3_neighbourhoods.plot(kind='bar', color=['skyblue', 'green', 'orange'])

# Add value labels to each bar
for container in ax.containers:
    for bar in container:
        yval = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2, yval + 10, int(yval), ha='center', va='bottom', fontsize=10)

plt.title('Top 3 Neighborhoods by Room Count in Each Neighbourhood Group')
plt.xlabel('Neighbourhood Group - Neighborhood')
plt.ylabel('Room Count')
plt.xticks(rotation=45, ha='right')

# Show legend for neighbourhood groups
plt.legend(title='Neighbourhood Group', bbox_to_anchor=(1, 1))

plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code
# How does the price of different room types vary across various neighborhood groups?

# Create a grouped bar plot to show the average price of each room type in each neighbourhood group
# figsize is passed to DataFrame.plot directly; a separate plt.figure() call would be left behind as an empty figure
ax = pivot_table.plot(kind='bar', figsize=(12, 8), color=['skyblue', 'green', 'orange', 'purple', 'brown'])

plt.title('Average Price of Room Types in Each Neighbourhood Group')
plt.xlabel('Neighbourhood Group')
plt.ylabel('Average Price ($)')
plt.xticks(rotation=0, ha='center')

# Show legend for room types
plt.legend(title='Room Type')

# Add value labels to each bar
for container in ax.containers:
    for bar in container:
        yval = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2, yval + 5, f'${yval:.2f}', ha='center', va='bottom', fontsize=9)

plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

Answer Here.

# **Conclusion**

Write the conclusion here.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***