# **Project Name**    -



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Team Member 1**    - Nishith Sharma


# **Project Summary -**

Since its inception in 2008, Airbnb has revolutionized the travel industry by providing a unique and personalized way for guests to experience the world. Today, Airbnb is a globally recognized service, catering to millions of users. The platform's vast database, consisting of millions of listings, is a goldmine for data analysis. Analyzing this data is crucial for enhancing security, making informed business decisions, understanding customer and host behavior, improving performance, guiding marketing initiatives, and implementing innovative services.

The dataset in question comprises approximately 49,000 observations with 16 columns, containing a mix of categorical and numeric values. This dataset offers a comprehensive view of Airbnb listings, which can be leveraged to gain valuable insights into various aspects of the platform's operations and user interactions.


# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


Understanding Listing Characteristics: Identify patterns and trends in the types of properties listed on Airbnb, including location, property type, price range, and availability.

Customer and Host Behavior: Analyze customer booking patterns and host activity to understand behavior and performance. This includes examining booking frequency, duration of stays, and customer reviews.

Market Segmentation: Segment the market based on listing attributes and user behavior to identify distinct groups and tailor marketing strategies accordingly.

Performance Metrics: Evaluate key performance metrics such as occupancy rates, revenue generation, and customer satisfaction to assess the overall health of the platform.

Security and Trust: Identify potential risks and anomalies in the data that could impact the security and trustworthiness of the platform. This includes detecting fraudulent listings or suspicious user activity.

Enhancing User Experience: Use the insights gained to suggest improvements to the platform that enhance the user experience for both guests and hosts. This could involve recommendations for new features, optimizing the booking process, or improving customer support.

By addressing these goals, the analysis will provide a comprehensive understanding of Airbnb's operations and user interactions, enabling the company to make data-driven decisions that foster growth and innovation.

#### **Define Your Business Objective?**

The business objective of this analysis is to leverage data-driven insights from the Airbnb dataset to enhance the platform's operational efficiency, customer satisfaction, and overall market competitiveness. By thoroughly examining the dataset, the aim is to achieve the following:

Optimize Listing Management:

Identify the most popular types of properties and locations to help hosts optimize their listings and attract more bookings. Provide data-driven recommendations to hosts on pricing strategies and property improvements to maximize occupancy and revenue.

Enhance User Experience:

Improve the booking process by understanding guest preferences and behavior patterns, leading to a more personalized and seamless user experience. Identify and mitigate potential risks to enhance the security and trustworthiness of the platform for both guests and hosts.

Drive Marketing and Growth Strategies:

Segment the market effectively to tailor marketing campaigns and promotions that resonate with different user groups, driving increased engagement and bookings. Analyze customer feedback and reviews to understand areas of improvement and implement changes that boost customer satisfaction and loyalty.

Improve Operational Efficiency:

Monitor and analyze key performance metrics to identify areas where operational efficiencies can be improved, reducing costs and enhancing service delivery. Use predictive analytics to forecast demand trends and optimize resource allocation, ensuring the platform is well-prepared to meet future needs.

Innovate and Expand Services:

Identify opportunities for introducing new services or features that cater to emerging customer needs and preferences, staying ahead of competitors in the market. Utilize insights from data analysis to support the development and implementation of innovative solutions that enhance the overall value proposition of Airbnb.

By achieving these objectives, Airbnb aims to strengthen its market position, drive sustained growth, and provide exceptional value to its users, ultimately contributing to the long-term success and sustainability of the platform.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that the following questions are answered for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries

from bs4 import BeautifulSoup

import pandas as pd
import numpy as np
import math  # standard-library math; numpy.math was removed in NumPy 2.0
import time
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import seaborn as sns


### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')

In [None]:
path = '/content/drive/MyDrive/Airbnb NYC 2019.csv'
# load csv file
data = pd.read_csv(path)
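The guidelines above ask for exception handling, so here is a minimal sketch of a loader that fails early with a clear message when the file or an expected column is missing. The names `load_listings` and `REQUIRED_COLUMNS` are hypothetical, and the column set is only an assumption based on the columns used later in this notebook.

```python
from pathlib import Path

import pandas as pd

# Hypothetical list of columns that the later cells rely on.
REQUIRED_COLUMNS = {
    "name", "host_name", "neighbourhood_group", "neighbourhood",
    "room_type", "price", "minimum_nights", "last_review",
}


def load_listings(path, required=REQUIRED_COLUMNS):
    """Read the listings CSV, raising a clear error if it is missing or incomplete."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found at {csv_path}")
    df = pd.read_csv(csv_path)
    # Compare the expected column names with what the file actually contains.
    missing = sorted(set(required) - set(df.columns))
    if missing:
        raise ValueError(f"Dataset is missing expected columns: {missing}")
    return df
```

With this helper the loading cell could read `data = load_listings(path)`, which behaves like `pd.read_csv(path)` when the file is as expected.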

### Dataset First View

In [None]:
# Dataset First Look
data

In [None]:
data.head()

In [None]:
data.tail()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
data.shape

### Dataset Information

In [None]:
# Dataset Info
data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
data.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
data.isnull().sum()

In [None]:
# Missing Values/Null Values Count
data.isna().sum().sort_values(ascending=False)[:4]

In [None]:
# Visualizing the missing values
missing_values_count = data.isna().sum()
# Create a bar plot to visualize missing values
plt.figure(figsize=(10, 6))
sns.barplot(x=missing_values_count.index, y=missing_values_count.values, palette='rocket')
plt.xlabel('Columns')
plt.ylabel('Count of Missing Values')
plt.title('Missing Values in Dataset')
plt.xticks(rotation=90)  # Rotate x-axis labels for better visibility
plt.show()

### What did you know about your dataset?

The dataset contains 48895 rows and 16 columns.

The name column has 16 missing values.

The host_name column has 21 missing values.

The last_review column has 10,052 missing values.

The reviews_per_month column has 10,052 missing values.
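The counts above can also be expressed as a share of all rows, which makes it easier to judge how serious each gap is. The sketch below uses a hypothetical helper, `missing_summary`, and a tiny made-up frame purely to show the shape of the output; on the real data it would be called as `missing_summary(data)`.

```python
import pandas as pd


def missing_summary(df):
    """Return count and percentage of missing values per column, largest first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only the columns that actually have gaps.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Made-up data, not taken from the Airbnb dataset.
toy = pd.DataFrame({
    "name": ["a", None, "c", "d"],
    "price": [100, 80, 120, 60],
    "last_review": [None, None, "2019-05-01", None],
})
print(missing_summary(toy))
```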

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
data.columns

In [None]:
# Dataset Describe
data.describe()

### Variables Description

Most numerical variables in the dataset have a reasonable range of values. The exceptions are minimum_nights and calculated_host_listings_count, whose maximum values are very high and look incorrect, so we will handle them during the EDA.

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in ['neighbourhood_group','neighbourhood','room_type']:
    print(f'{column} : {data[column].unique()}')

In [None]:
data['neighbourhood'].value_counts()

In [None]:
data['neighbourhood_group'].value_counts()

In [None]:
data['room_type'].value_counts()

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
data.info()

In [None]:
data['last_review'].head()

In [None]:
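# Removing null values from the last_review column and creating a new data frame.
# .copy() makes data_review independent of data, so the column assignments
# below do not raise SettingWithCopyWarning.
data_review = data[~data["last_review"].isna()].copy()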

In [None]:
data_review.info()

In [None]:
data_review.head()

In [None]:
# Converting to datetime format
data_review['last_review']=pd.to_datetime(data_review['last_review'])

In [None]:
data_review['last_review'].max()

In [None]:
data_review['last_review'].min()

In [None]:
# Creating a new column days_since_last_review
data_review['days_since_last_review']=(data_review['last_review'].max()-data_review['last_review']).dt.days

In [None]:
data_review.head()

In [None]:
# Dropping last_review and reviews_per_month
data.drop(['last_review','reviews_per_month'],axis=1,inplace=True)

In [None]:
data.info()

In [None]:
# Fill missing values (assign back instead of using inplace on a column selection)
data['name'] = data['name'].fillna('unknown')
data['host_name'] = data['host_name'].fillna('unknown')

In [None]:
# Checking total null values
data.isnull().sum()

### What all manipulations have you done and insights you found?


1. Replaced null values in the name and host_name columns with "unknown".
2. Dropped the last_review and reviews_per_month columns from data because each has around 10,000 null values.
3. Created a subset of data, data_review, to analyse review-related features separately.
4. Created a new feature, days_since_last_review, in data_review.
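As a compact restatement of the four steps listed above, the sketch below wraps them in one function. The name `split_review_data` is hypothetical and the toy frame is made up; the function takes an explicit copy of the review subset so that adding columns to it does not affect the main frame.

```python
import pandas as pd


def split_review_data(df):
    """Return (listings, reviews) following the wrangling steps described above."""
    # Rows that have a review date go into a separate, independent copy.
    reviews = df[df["last_review"].notna()].copy()
    reviews["last_review"] = pd.to_datetime(reviews["last_review"])
    latest = reviews["last_review"].max()
    reviews["days_since_last_review"] = (latest - reviews["last_review"]).dt.days

    # The main frame drops the sparse review columns and fills text gaps.
    listings = df.drop(columns=["last_review", "reviews_per_month"])
    listings = listings.fillna({"name": "unknown", "host_name": "unknown"})
    return listings, reviews


# Made-up rows, not taken from the Airbnb dataset.
toy = pd.DataFrame({
    "name": ["Loft", None, "Studio"],
    "host_name": ["Ana", "Ben", None],
    "last_review": ["2019-07-01", None, "2019-06-21"],
    "reviews_per_month": [1.5, None, 0.3],
})
listings, reviews = split_review_data(toy)
print(reviews[["name", "days_since_last_review"]])
```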



## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

# **Univariate analysis**

In [None]:
# Checking dataset information
data.info()

#### Chart - 1

**Box plot for numerical variables**

In [None]:
# Chart - 1 visualization code

In [None]:
columns=list(data.columns[data.dtypes!='object'])
print(columns)

In [None]:
# Identifier columns are not meaningful to plot, so drop them from the list
columns.remove('id')
columns.remove('host_id')
print(columns)

In [None]:
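fig, ax = plt.subplots(3, 3, figsize=(14, 10))
ax = ax.flatten()  # Flatten the array to use a single index

# One box plot per numerical column
for i, x in enumerate(columns):
    sns.boxplot(x=data[x], color='pink', ax=ax[i])

# Hide any axes left over when there are fewer columns than grid cells
for unused in ax[len(columns):]:
    unused.set_visible(False)
plt.tight_layout()
plt.show()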

In [None]:
# Checking the max value of minimum nights
data['minimum_nights'].sort_values(ascending=False).reset_index().head(20)

In [None]:
#Checking counts of minimum_nights is equal to 365 days
data[data['minimum_nights']==365].shape

In [None]:
# Removing outliers: keep only listings whose minimum_nights is at most 365
data=data[data['minimum_nights']<=365]

In [None]:
#Checking the max values of  calculated_host_listings_count.
data['calculated_host_listings_count'].sort_values(ascending=False).reset_index().head(500)

In [None]:
# Removing outliers: keep only listings whose calculated_host_listings_count is at most 150
data=data[data['calculated_host_listings_count']<=150]

##### 1. Why did you pick the specific chart?

To display the distribution of data based on a five-number summary.

##### 2. What is/are the insight(s) found from the chart?

The box plots provide a visual summary of the data, highlighting the median, quartiles, and potential outliers.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. By removing outliers we restricted the data to a plausible range of values, so that summary statistics such as averages are not biased towards the outliers.
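The two filters applied in this chart (minimum_nights at most 365 and calculated_host_listings_count at most 150) can be wrapped in a small helper that also reports how many rows each cap removes. This is only a sketch: `apply_caps` is a hypothetical name and the toy values are made up.

```python
import pandas as pd


def apply_caps(df, caps):
    """Keep rows whose value is <= the cap for every column in `caps`.

    Returns the filtered frame and a dict with the rows removed per column.
    Caps are applied in order, so a row is counted for the first cap it fails.
    """
    removed = {}
    kept = df
    for column, cap in caps.items():
        mask = kept[column] <= cap
        removed[column] = int((~mask).sum())
        kept = kept[mask]
    return kept.copy(), removed


# Made-up rows, not taken from the Airbnb dataset.
toy = pd.DataFrame({
    "minimum_nights": [1, 30, 1000, 2],
    "calculated_host_listings_count": [1, 2, 1, 327],
})
clean, removed = apply_caps(
    toy, {"minimum_nights": 365, "calculated_host_listings_count": 150}
)
print(removed)
```

Reporting the removed counts makes it easy to confirm that the caps discard only a small number of rows.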

#### Chart - 2

 **Histogram plot**

In [None]:
# Chart - 2 visualization code
fig,ax = plt.subplots(3,3 , figsize=(12,8))
ax = ax.flatten()  # Flatten the array to use single index

for i,x in enumerate(columns):
   sns.histplot(data[x],color = 'purple',bins=20,ax=ax[i])
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

To visualise the distribution of each numerical variable in the dataset.

##### 2. What is/are the insight(s) found from the chart?



1. Most of the listings come from a particular region; they are not distributed evenly across all regions in the dataset.
2. Most prices fall between 0 and 500, but there are some exceptional cases.
3. Although many listings are available throughout the year, some are available for only around 20 days, which may reflect seasonal bookings such as summer vacations.




##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, it will help to understand the data for different segments and analyse them accordingly.

**Checking the top 10 listing names and host names on Airbnb**

In [None]:
data['name'].value_counts().head(10)

In [None]:
data['host_name'].value_counts().head(10)

#### Chart - 3

**Bar Plot**

In [None]:
data.info()

In [None]:
data['neighbourhood_group'].value_counts()

In [None]:
data['neighbourhood'].value_counts()

In [None]:
data['room_type'].value_counts()

In [None]:
# Chart - 3 visualization code
columns_to_plot =['neighbourhood_group','room_type']

# Setup the figure and axes for the subplots
fig, ax = plt.subplots(1, 2, figsize=(10, 5))  # 1 row x 2 columns of subplots
ax = ax.flatten()  # Flatten the array to use single index

# Loop over the columns and their corresponding axes
for i, col in enumerate(columns_to_plot):
    # Countplot for each categorical column
    sns.countplot(x=data[col], ax=ax[i], color='red')
    plt.setp(ax[i].get_xticklabels(), rotation=45)  # Rotate x labels for better readability

plt.tight_layout()  # Adjust layout
plt.show()

##### 1. Why did you pick the specific chart?

To represent the frequency of categorical variables.

##### 2. What is/are the insight(s) found from the chart?

1. Among neighbourhood groups, Manhattan has the highest number of listings and Staten Island has the lowest.
2. Entire home/apt is the most common room type, whereas shared room is the least common. Note that the chart counts listings, not bookings.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. Entire home/apt is the most common room type, which suggests that people prefer it, so we should encourage entire home/apt listings on Airbnb as much as possible.

Shared rooms appear to be the least preferred by customers, so we should avoid listing them.

**Hotel Price analysis**

In [None]:
#Checking lowest end price of hotel
data['price'].sort_values().head(20)

In [None]:
#Checking highest end price of hotel
data['price'].sort_values(ascending=False).head(20)

In [None]:
#Removing hotels with zero prices
non_zero_price=data[data['price']!=0]

In [None]:
#Checking five numbers only
non_zero_price['price'].describe()

1. The average per-night price is 151.35.
2. The minimum per-night price is 10.
3. The maximum per-night price is 10,000.
4. The 75th percentile is 175 while the maximum is 10,000, which means a few listings have exceptional (outlier) prices.
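To make the gap between typical and extreme prices explicit, the mean can be compared with a few quantiles. The sketch below uses a hypothetical helper, `price_profile`, on made-up prices; on the real data it would be called as `price_profile(data['price'])`.

```python
import pandas as pd


def price_profile(prices, quantiles=(0.25, 0.5, 0.75, 0.99)):
    """Summarise a price column with selected quantiles, the mean and the maximum."""
    prices = prices[prices > 0]  # ignore zero-priced rows, as in the cell above
    profile = prices.quantile(list(quantiles))
    profile.index = [f"q{round(q * 100)}" for q in quantiles]
    profile["mean"] = prices.mean()
    profile["max"] = prices.max()
    return profile


# Made-up prices with one extreme value, to show how it pulls the mean upward.
toy_prices = pd.Series([0, 50, 100, 150, 200, 10000])
print(price_profile(toy_prices))
```

When the mean sits far above the median (`q50`), as in this toy example, the average is being driven by a handful of very expensive listings.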

 **Top 10 neighbourhood**

#### Chart - 4

In [None]:
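# Creating a new dataframe with the top 10 neighbourhoods.
# Naming the columns explicitly keeps 'neighbourhood' and 'count' stable
# across pandas versions.
neighbourhood_df = (
    data['neighbourhood'].value_counts().head(10)
    .rename_axis('neighbourhood').reset_index(name='count')
)
neighbourhood_df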

In [None]:
# Chart - 4 visualization code
plt.figure(figsize=(6, 4))
sns.barplot(x='count', y='neighbourhood', data=neighbourhood_df, palette='viridis')
plt.title('Number of Listings per Neighbourhood')
plt.xlabel('Count')
plt.ylabel('Neighbourhood')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 5

**Geographical distribution**

In [None]:
# Chart - 5 visualization code
plt.scatter(data['longitude'], data['latitude'], c=data['price'], cmap='viridis')
plt.colorbar(label='Price')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Geographical Distribution of Airbnb Listings')
plt.show()

##### 1. Why did you pick the specific chart?

To see where the listings are located geographically.

##### 2. What is/are the insight(s) found from the chart?

Most of the listings are concentrated in a particular region.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

We can look for more locations where demand may be high and add listings there.

#### Chart - 6

In [None]:
# Creating a pie plot to see the distribution of neighbourhood_group
neighbourhood_group = data['neighbourhood_group']

neighbourhood_value_count = neighbourhood_group.value_counts().plot.pie(autopct='%1.0f%%', shadow=True)

##### 1. Why did you pick the specific chart?

To see the distribution of neighbourhood_group.

##### 2. What is/are the insight(s) found from the chart?

1. Manhattan has the highest share of listings among neighbourhood groups (44%).
2. Staten Island has the lowest share of listings (1%).
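The percentages shown on the pie chart can also be computed directly, which is handy when exact figures are needed for a report. A minimal sketch, using a hypothetical helper `category_share` on a made-up series; on the real data it would be `category_share(data['neighbourhood_group'])`.

```python
import pandas as pd


def category_share(series):
    """Percentage of rows in each category, largest first."""
    return (series.value_counts(normalize=True) * 100).round(1)


# Made-up labels, not the real distribution.
toy_groups = pd.Series(["Manhattan"] * 5 + ["Brooklyn"] * 4 + ["Staten Island"])
print(category_share(toy_groups))
```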

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

This shows that most listings are in the Manhattan and Brooklyn neighbourhood groups. If this reflects customer preference, it suggests listing more hotels in these neighbourhood groups or raising hotel prices there.

#### Chart - 7

**Checking top 10 hotels based on number of reviews**

In [None]:
# Sort the dataframe by number of reviews in descending order and get the top 10
top_10_hotels = data.sort_values(by='number_of_reviews', ascending=False).head(10)

In [None]:
top_10_hotels

In [None]:
# Chart - 7 visualization code
# Create a bar plot
plt.figure(figsize=(6, 4))
sns.barplot(x='number_of_reviews', y='name', data=top_10_hotels, palette='viridis')
plt.xlabel('Number of Reviews')
plt.ylabel('Hotel Name')
plt.title('Top 10 Most Reviewed Hotels')
plt.show()

##### 1. Why did you pick the specific chart?

To check the top 10 most reviewed hotels.

##### 2. What is/are the insight(s) found from the chart?

"Room near JFK Queen Bed" is the most reviewed hotel.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Because these are the most reviewed hotels, they are likely in demand, so their prices could be raised slightly for more profit.

**Airbnb review analysis**

#### Chart - 8

In [None]:
data_review.head()

In [None]:
data_review['days_since_last_review'].sort_values(ascending=False).head(20)

In [None]:
# Sort the dataframe by days since last review in descending order and get the top 10
top_10_hotels = data_review.sort_values(by='days_since_last_review', ascending=False).head(10)
top_10_hotels


In [None]:
# Chart - 8 visualization code
# Create a bar plot
plt.figure(figsize=(6, 4))
sns.barplot(x='days_since_last_review', y='name', data=top_10_hotels, palette='viridis')
plt.xlabel('days since last review')
plt.ylabel('Hotel Name')
plt.title('Top 10 Hotels by Days Since Last Review')
plt.show()

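Note that this chart ranks hotels by days since their last review, so it highlights the least recently reviewed hotels. The hotels with the most reviews_per_month (Chart 9) are the ones that should be promoted and considered for a price increase.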

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Sort the dataframe by reviews per month in descending order and get the top 10
top_10_hotels = data_review.sort_values(by='reviews_per_month', ascending=False).head(10)

# Create a bar plot
plt.figure(figsize=(6, 4))
sns.barplot(x='reviews_per_month', y='name', data=top_10_hotels, palette='viridis')
plt.xlabel('reviews per month')
plt.ylabel('Hotel Name')
plt.title('Top 10 Reviewed per month Hotels')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

# **Bivariate analysis**

#### Chart - 10

In [None]:
columns_to_plot=['neighbourhood_group','room_type']

In [None]:
# Chart - 10 visualization code
# Setup the figure and axes for the subplots
fig, ax = plt.subplots(1, 2, figsize=(8, 5))  # 1 row x 2 columns of subplots
ax = ax.flatten()  # Flatten the array to use single index

# Loop over the columns and their corresponding axes
for i, col in enumerate(columns_to_plot):
    # Bar plot of average price for each categorical column
    sns.barplot(x=data[col],y=data['price'], ax=ax[i],palette='rainbow')
    plt.setp(ax[i].get_xticklabels(), rotation=90)  # Rotate x labels for better readability

plt.tight_layout()  # Adjust layout
plt.show()

##### 1. Why did you pick the specific chart?

To compare the average price of hotels across neighbourhood groups and room types.

##### 2. What is/are the insight(s) found from the chart?

1. The average price is highest for the entire home/apt room type and lowest for shared rooms.
2. The average price is highest in the Manhattan neighbourhood group and lowest in the Bronx.


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

This gives an idea of how price varies across room types and neighbourhood groups, which can inform decisions about the price range of hotels.
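The averages in the bar plots can be backed by a small table that also shows the median and the number of listings behind each bar. The sketch below uses a hypothetical helper, `price_by`, on made-up rows; on the real data it would be called as `price_by(data, 'room_type')` or `price_by(data, 'neighbourhood_group')`.

```python
import pandas as pd


def price_by(df, column):
    """Mean, median and listing count of `price` for each value of `column`."""
    return (
        df.groupby(column)["price"]
        .agg(mean="mean", median="median", listings="count")
        .sort_values("mean", ascending=False)
    )


# Made-up rows, not taken from the Airbnb dataset.
toy = pd.DataFrame({
    "room_type": ["Entire home/apt", "Entire home/apt", "Private room", "Shared room"],
    "price": [200, 300, 90, 40],
})
print(price_by(toy, "room_type"))
```

Showing the median next to the mean helps to check whether a high average comes from a few very expensive listings.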

#### Chart - 11

In [None]:
# Chart - 11 visualization code
# Creating a bar plot of the average price for each neighbourhood group and room type
viz_3 = sns.catplot(x='neighbourhood_group', y='price', col='room_type', data=data, kind='bar', ci=None)
viz_3.set_xticklabels(rotation=90)

##### 1. Why did you pick the specific chart?

To see the average price of each room type within each neighbourhood group.

##### 2. What is/are the insight(s) found from the chart?

The price pattern across neighbourhood groups is similar for each room type, except for entire home/apt, where the price in Staten Island is higher than in Queens and the Bronx.
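The same comparison can be read off a pivot table, with neighbourhood groups as rows and room types as columns. This is a sketch only: `price_pivot` is a hypothetical name and the toy rows are made up; on the real data it would be `price_pivot(data)`.

```python
import pandas as pd


def price_pivot(df):
    """Mean price with neighbourhood groups as rows and room types as columns."""
    return df.pivot_table(
        index="neighbourhood_group",
        columns="room_type",
        values="price",
        aggfunc="mean",
    ).round(2)


# Made-up rows, not taken from the Airbnb dataset.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Bronx", "Bronx"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt", "Private room"],
    "price": [250, 120, 130, 60],
})
print(price_pivot(toy))
```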

#### Chart - 12

In [None]:
# Chart - 12 visualization code
# Correlation Heatmap visualization code
# Calculate the correlation matrix
df=data.drop(['id','host_id','name','host_name','neighbourhood_group','neighbourhood','room_type'],axis=1)
corr_matrix =df.corr()

# Create a heatmap
plt.figure(figsize=(6, 4))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()

##### 1. Why did you pick the specific chart?

To see the correlation between each pair of numerical variables.

##### 2. What is/are the insight(s) found from the chart?

1. Latitude and Longitude:

* Latitude and longitude have a small positive correlation (0.084). This suggests that there is a slight relationship between these two geographical coordinates, but it is not strong.

2. Price Correlations:

* Price has a negative correlation with longitude (-0.15), indicating that as longitude increases, the price tends to decrease slightly.
* Price has a very weak positive correlation with minimum nights (0.049), availability 365 (0.076), calculated host listings count (0.025), and latitude (0.034). These correlations are very weak, indicating that price is not strongly influenced by these variables.

3. Minimum Nights:

* Minimum nights has a moderate positive correlation with calculated host listings count (0.23) and availability 365 (0.16). This suggests that listings that require a higher minimum number of nights are more likely to have more listings by the same host and be available throughout the year.

4. Number of Reviews:

* Number of reviews has weak correlations with other variables. It has a weak positive correlation with longitude (0.054) and availability 365 (0.18), and a weak negative correlation with minimum nights (-0.091).

5. Calculated Host Listings Count:

* This variable has a weak negative correlation with longitude (-0.048) and a weak positive correlation with availability 365 (0.26). This suggests that hosts with more listings tend to have properties that are more available throughout the year.

6. Availability 365:

* Availability 365 shows the highest positive correlation with calculated host listings count (0.26), indicating that properties that are available more days in a year are likely to belong to hosts with multiple listings.
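Rather than reading the strongest relationships off the heatmap by eye, the pairs can be ranked by absolute correlation. The sketch below uses a hypothetical helper, `top_correlations`, on a made-up frame; on the real data it would be called on the numeric frame used for the heatmap, for example `top_correlations(df)`.

```python
import numpy as np
import pandas as pd


def top_correlations(df, n=5):
    """Return the n variable pairs with the largest absolute Pearson correlation."""
    corr = df.select_dtypes("number").corr()
    # Keep only the upper triangle so each pair appears once and the
    # diagonal (self-correlation) is dropped.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    pairs = upper.stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Made-up columns: a and b move together, c moves against them, d is noisier.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [4, 3, 2, 1],
    "d": [1, 3, 2, 4],
})
print(top_correlations(toy, n=3))
```

Sorting by absolute value keeps strong negative correlations near the top instead of burying them at the bottom of the list.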

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Pair Plot visualization code
# Create a pair plot
sns.pairplot(data)
plt.show()

##### 1. Why did you pick the specific chart?

To see the pairwise relationships between variables.

##### 2. What is/are the insight(s) found from the chart?

There is no strong correlation between any pair of features.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

1. Price Optimization:

* Analyze and adjust pricing strategies to optimize revenue. Focus on the most frequent price range (0-500) while considering exceptional cases for premium pricing.
* Implement dynamic pricing models for hotels available only seasonally (e.g., summer vacation bookings).

2. Market Expansion and Distribution:

* Address the uneven distribution of hotels across different regions. Strategically increase the presence in regions with lower counts, like Staten Island, to balance market coverage.
* Increase marketing efforts in regions with lower hotel counts to attract more bookings.

3. Room Type Strategy:

* Leverage the high demand for entire home/apartment bookings by expanding the inventory in this category.
* Develop targeted marketing campaigns to increase the bookings for shared rooms, which currently have the least number of bookings.

4. Location-Based Pricing Strategy:

* Capitalize on the high average prices in Manhattan by offering premium services and amenities to justify higher rates.
* Explore competitive pricing strategies in regions like Bronx where the average price is lower.

5. Enhancing Reviews and Ratings:

* Encourage guest reviews, especially for high-demand hotels like the one near JFK with the queen bed, to maintain and enhance reputation and visibility.

6. Correlation Insights:

* Utilize the correlation data to understand the factors influencing pricing and availability. For example, use the positive correlation between minimum nights and calculated host listings count to encourage longer stays for multi-listing hosts.

# **Conclusion**

From the data analysis, several key insights have emerged:

1. Price and Outliers:

* By removing outliers, we achieved a more accurate and unbiased average price, enhancing the reliability of our insights.

2. Regional Distribution:

* The majority of hotel data comes from a specific region, highlighting an opportunity to expand and diversify our geographical presence.

3. Seasonal Availability:

* Some hotels are only available seasonally, suggesting a potential for optimizing revenue during peak seasons.

4. Booking Preferences:

* Entire home/apartment bookings dominate, indicating a preference that can be leveraged for business growth.

5. Regional Price Variations:

* Manhattan has the highest percentage of listings and the highest average prices, while Staten Island has the lowest. This points to potential market expansion and pricing strategy adjustments.

6. Review Trends:

* The most reviewed hotel is a key asset, and encouraging more reviews can enhance overall business visibility and credibility.

7. Correlation Analysis:

* The correlations between various factors such as price, longitude, and availability provide insights for refining pricing and availability strategies. For instance, the slight negative correlation between price and longitude suggests potential for pricing adjustments based on location.

By aligning our business strategies with these insights, we can optimize revenue, improve market coverage, and enhance customer satisfaction.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***