# **Project Name**    -
### Airbnb Booking Analysis


##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Name**              - Karan Makwana

# **Project Summary -**

This project involves cleaning, exploring, and analyzing the Airbnb NYC 2019 dataset, which contains 48,895 listings with details about hosts, neighborhoods, room types, prices, reviews, and availability. The primary goal is to identify and resolve issues in the data while gaining insights into Airbnb trends in NYC.

Key Steps:

1. Data Cleaning:
   * Handle missing values in name, host_name, last_review, and reviews_per_month.
   * Correct data types for columns like last_review.

2. Data Exploration:
   * Examine relationships between room types, pricing, and neighborhoods.
   * Identify and address outliers in numerical columns such as price and minimum_nights.

3. Feature Engineering:
   * Derive insights by grouping listings by neighborhood or host activity.
   * Compute metrics like average price per neighborhood group or popular room types.

4. Insights:
   * Highlight trends in pricing, availability, and reviews.
   * Provide actionable recommendations for hosts or potential guests based on the data.










# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


The Airbnb NYC 2019 dataset contains information on nearly 49,000 listings, but the data is messy, with missing values, outliers, and inconsistencies. The challenge is to clean and transform the dataset to ensure accuracy and usability for analysis. By resolving these issues, we aim to uncover insights about pricing, availability, and popular locations, enabling better decision-making for hosts, guests, and stakeholders in the Airbnb ecosystem.

#### **Define Your Business Objective?**

To optimize the value of Airbnb listings in New York City by identifying trends, patterns, and anomalies in the data. Specifically, the project aims to:

1. Assist Hosts: Provide insights on competitive pricing, optimal availability, and strategies to attract more guests.

2. Support Guests: Identify affordable and highly rated accommodations based on location and preferences.

3. Improve Platform Performance: Help Airbnb better understand listing dynamics, host behaviors, and guest demands in NYC to enhance the user experience and maximize revenue opportunities.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following questions are answered for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings


### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')

In [None]:
data=pd.read_csv('/content/drive/MyDrive/datasets/Airbnb NYC 2019.csv')

### Dataset First View

In [None]:
# Dataset First Look
data

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
data.shape

### Dataset Information

In [None]:
# Dataset Info
data.head()

In [None]:
data.tail()

In [None]:
data.info()

In [None]:
cat_data=data.select_dtypes(include=['object'])
cat_data


#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
data.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
data.isnull().sum()

In [None]:
# Percentage of missing values per column
(data.isnull().sum()/len(data)*100).reset_index()

In [None]:
data

In [None]:
plt.figure(figsize=(5,3))
sns.histplot(x=data['reviews_per_month'],kde=True)
plt.show()


In [None]:
data['reviews_per_month'].skew()

In [None]:
data['reviews_per_month'] = data['reviews_per_month'].fillna(value=data['reviews_per_month'].median())


In [None]:
missing_values_rows = data[data.isnull().any(axis=1)]
print(missing_values_rows)

In [None]:
data['name'] = data['name'].fillna("Unknown")
data['host_name'] = data['host_name'].fillna("Unknown")
data.isnull().sum()
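
The median fill above is one option. As a hedged alternative, a missing reviews_per_month may simply mean that the listing has never been reviewed, in which case 0 would be a more natural fill value. The sketch below checks that assumption before choosing the fill value. The `toy` frame, its values and the `reviews_per_month_filled` column are made up for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values; only the column names follow the dataset.
toy = pd.DataFrame({
    "number_of_reviews": [0, 12, 0, 3],
    "reviews_per_month": [np.nan, 1.5, np.nan, 0.2],
})

# Does every missing reviews_per_month belong to a listing with zero reviews?
missing = toy["reviews_per_month"].isna()
all_missing_are_unreviewed = bool((toy.loc[missing, "number_of_reviews"] == 0).all())

# If so, fill with 0; otherwise fall back to the median, as the notebook does.
fill_value = 0 if all_missing_are_unreviewed else toy["reviews_per_month"].median()
toy["reviews_per_month_filled"] = toy["reviews_per_month"].fillna(fill_value)
print(toy)
```

Whether the assumption holds for the real `data` frame would need to be checked with the same comparison before replacing the median fill.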

### What did you know about your dataset?


Key Insights About the Dataset:

1. General Overview: Contains 48,895 Airbnb listings in NYC with details about hosts, neighborhoods, room types, pricing, reviews, and availability.

2. Structure: 16 columns, including both numerical (e.g., price, minimum_nights) and categorical (e.g., room_type, neighbourhood) data.

3. Missing Data:
   * last_review and reviews_per_month have ~20.56% missing values.
   * Minimal missing values in name and host_name.

4. Numerical Observations:
   * Price: Highly variable, ranging from $0 to $10,000.
   * Minimum Nights: Some extreme values (up to 1,250 nights).
   * Reviews Per Month: Average of 1.37, but outliers go up to 58.5.

5. Categorical Highlights:
   * Five neighbourhood groups: Manhattan, Brooklyn, Queens, Bronx, and Staten Island.
   * Room types: Entire home/apt, Private room, Shared room.

6. Potential Challenges:
   * Outliers in price and minimum_nights.
   * Missing values in last_review and reviews_per_month.
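
The outliers in price and minimum_nights can be flagged in several ways. A minimal sketch of one common approach, the 1.5 x IQR rule, is shown below. The helper `iqr_bounds`, the `toy` frame and the `price_outlier` column are hypothetical names introduced only for illustration, and the prices are made up.

```python
import pandas as pd


def iqr_bounds(series: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return the lower and upper fences of the k * IQR rule."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Toy prices (made up); the last value plays the role of an extreme listing.
toy = pd.DataFrame({"price": [40, 60, 80, 100, 120, 10000]})

low, high = iqr_bounds(toy["price"])

# Flag rows outside the fences instead of dropping them straight away.
toy["price_outlier"] = ~toy["price"].between(low, high)
print(low, high)
print(toy)
```

Flagging rather than deleting keeps the decision reversible: charts can be drawn with and without the flagged rows to see how much they change the picture.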

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
data.columns

In [None]:
# Dataset Describe
data.describe()

In [None]:
data.describe(include='object')

### Variables Description


1. id: Unique identifier for each listing.
2. name: Name of the listing.
3. host_id: Unique identifier for each host.
4. host_name: Name of the host.
5. neighbourhood_group: Broad region in NYC (e.g., Manhattan, Brooklyn).
6. neighbourhood: Specific area within the neighbourhood group.
7. latitude: Latitude coordinate of the listing.
8. longitude: Longitude coordinate of the listing.
9. room_type: Type of accommodation (e.g., Entire home/apt, Private room).
10. price: Price per night in USD.
11. minimum_nights: Minimum number of nights required for booking.
12. number_of_reviews: Total number of reviews for the listing.
13. last_review: Date of the most recent review.
14. reviews_per_month: Average number of reviews per month.
15. calculated_host_listings_count: Number of active listings by the host.
16. availability_365: Number of days the listing is available in a year.
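
Since last_review holds dates, it could be converted to a datetime type if it were kept for analysis. The sketch below shows one way to do that and to derive a recency feature. The `toy` frame, its dates and the `days_since_last_review` column are assumptions made for illustration only.

```python
import pandas as pd

# Toy dates (made up); None stands in for a listing without any review.
toy = pd.DataFrame({"last_review": ["2019-05-21", None, "2018-10-19"]})

# errors="coerce" turns unparseable or missing entries into NaT instead of raising.
toy["last_review"] = pd.to_datetime(toy["last_review"], errors="coerce")

# Recency relative to the most recent review in the frame.
reference_date = toy["last_review"].max()
toy["days_since_last_review"] = (reference_date - toy["last_review"]).dt.days
print(toy)
```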











### Check Unique Values for each variable.

In [None]:
data

In [None]:
# Drop 'last_review' if it exists
data = data.drop(columns=['last_review'], errors='ignore')


In [None]:
data.columns

In [None]:
# Check Unique Values for each variable.
for x in cat_data.columns:
    if x == 'last_review':  # Skip 'last_review'
        continue
    print(x)
    print(data[x].unique())
    print("-------------------------------")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
def categorize_price(price):
    if price < 50:
        return "Low"
    elif 50 <= price <= 150:
        return "Medium"
    else:
        return "High"

data['price_range'] = data['price'].apply(categorize_price)

In [None]:
def assign_popularity_tier(reviews):
    if reviews > 50:
        return "Popular"
    elif 10 <= reviews <= 50:
        return "Average"
    else:
        return "Low"

data['popularity_tier'] = data['number_of_reviews'].apply(assign_popularity_tier)

In [None]:
def is_high_demand(row):
    return row['availability_365'] > 300 and row['minimum_nights'] <= 3

data['high_demand_flag'] = data.apply(is_high_demand, axis=1)

### What all manipulations have you done and insights you found?

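1. Handled Missing Values:

* Filled name and host_name with "Unknown".

* Filled missing reviews_per_month with the column median, after checking the skew of its distribution.

* Dropped the last_review column.

2. Added New Columns:

* price_range: Categorizes listings as Low (price below 50), Medium (50 to 150), or High (above 150).

* popularity_tier: Based on number_of_reviews: Low (fewer than 10), Average (10 to 50), or Popular (more than 50).

* high_demand_flag: True when availability_365 is above 300 and minimum_nights is at most 3.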
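
The row-wise `apply` calls used for these columns work, but the same thresholds can also be written in vectorized form. The sketch below reproduces the price thresholds (below 50, 50 to 150, above 150) with `numpy.select` on a made-up `toy` frame; it is an optional alternative, not the code the notebook runs.

```python
import numpy as np
import pandas as pd

# Toy prices (made up) chosen to sit on both sides of each threshold.
toy = pd.DataFrame({"price": [0, 49, 50, 150, 151]})

# Conditions are checked in order; the first one that matches wins.
conditions = [toy["price"] < 50, toy["price"] <= 150]
labels = ["Low", "Medium"]

toy["price_range"] = np.select(conditions, labels, default="High")
print(toy)
```

Because the conditions are evaluated in order, the second condition only needs the upper bound of the Medium band.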

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
plt.figure(figsize=(5,3))
sns.countplot(data=data, x='neighbourhood_group', palette='viridis')
plt.title('Distribution of Listings by Neighbourhood Group')
plt.show()


##### 1. Why did you pick the specific chart?

1. Understand Regional Concentration:

The neighbourhood group is a key categorical variable. Chart 1 helps us visualize how listings are distributed across the five groups (Bronx, Brooklyn, Manhattan, Queens, and Staten Island).
This sets the stage for understanding regional trends and highlights areas with higher or lower Airbnb activity.

2. Business Relevance:

Identifying areas with the most listings helps businesses and stakeholders focus on regions with significant market activity.

3. Storytelling Potential:

The chart visually communicates regional dominance or scarcity in listings, making it easier to convey this aspect in a presentation or report.

4. Exploration Basis:

This foundational insight guides subsequent analyses, such as comparing pricing, availability, or reviews within and across these neighbourhood groups

##### 2. What is/are the insight(s) found from the chart?

 Understand which neighbourhood group has the highest concentration of listings.
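
To put numbers on the bars, the share of listings per group can be computed directly. This is a small sketch on a made-up `toy` frame; the counts are illustrative and say nothing about the real distribution.

```python
import pandas as pd

# Toy frame (made up): 4 Manhattan, 3 Brooklyn, 2 Queens, 1 Bronx listing.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 4 + ["Brooklyn"] * 3 + ["Queens"] * 2 + ["Bronx"],
})

# normalize=True returns fractions; value_counts sorts from largest to smallest.
share_pct = toy["neighbourhood_group"].value_counts(normalize=True).mul(100).round(1)
print(share_pct)
```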

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights gained from  distribution of listings by neighbourhood group can directly contribute to a positive business impact, especially in terms of market strategy, resource allocation, and regional planning.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
plt.figure(figsize=(5,3))  # create the figure before plotting so the size applies
data.groupby('neighbourhood_group')['price'].mean().plot(kind='bar', color='skyblue')
plt.title('Average Price by Neighbourhood Group')
plt.show()

##### 1. Why did you pick the specific chart?

Helps understand the pricing landscape across different regions.

##### 2. What is/are the insight(s) found from the chart?

Compare average pricing across neighbourhood groups.
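
Because a few very expensive listings can pull the mean upwards, it may help to look at the median next to the mean. The sketch below does this on a made-up `toy` frame; the prices are illustrative only.

```python
import pandas as pd

# Toy prices (made up); one extreme Manhattan listing inflates the mean.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 3 + ["Brooklyn"] * 3,
    "price": [100, 200, 3000, 80, 100, 120],
})

# Mean, median and listing count per group in a single table.
price_summary = toy.groupby("neighbourhood_group")["price"].agg(["mean", "median", "count"])
print(price_summary)
```

In the toy data the Manhattan mean is 1100 while the median is 200, which shows how much a single extreme listing can distort an average.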

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Negative Insight:

* Extremely high prices in premium areas may deter budget-conscious travelers, pushing them to competitors like hotels or other platforms.

* Consistently low prices in certain areas might indicate a lack of perceived value or poor market positioning.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
plt.figure(figsize=(3,3))  # create the figure before plotting so the size applies
data['room_type'].value_counts().plot(kind='pie', autopct='%1.1f%%', colors=['lightblue', 'orange', 'lightgreen'])
plt.title('Room Type Distribution')
plt.show()


##### 1. Why did you pick the specific chart?

Shows the proportion of room types available (Entire home, Private room, Shared room).
Helps assess host preferences and customer demands.

##### 2. What is/are the insight(s) found from the chart?

 Show the proportion of different room types.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* If "Entire home/apartment" dominates, businesses can develop services tailored for vacation rentals.
* A higher proportion of "Private rooms" might indicate demand for short stays or budget-friendly options.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
plt.figure(figsize=(5,3))
sns.scatterplot(data=data, x='price', y='number_of_reviews', alpha=0.5)
plt.title('Price vs. Number of Reviews')
plt.show()


##### 1. Why did you pick the specific chart?

To explore the relationship between pricing and customer engagement (reviews).

##### 2. What is/are the insight(s) found from the chart?

Explore the relationship between price and popularity.
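
A scatter plot with many overlapping points is hard to read by eye, so a rank correlation can complement it. The sketch below computes Spearman's rho as the Pearson correlation of the ranks, which needs only pandas. The `toy` values are made up and deliberately perfectly monotone.

```python
import pandas as pd

# Toy values (made up): reviews fall steadily as price rises.
toy = pd.DataFrame({
    "price": [50, 80, 120, 300, 1000],
    "number_of_reviews": [90, 60, 30, 10, 1],
})

# Spearman's rho equals the Pearson correlation of the ranked values.
rho = toy["price"].rank().corr(toy["number_of_reviews"].rank())
print(round(rho, 3))
```

A rank correlation is less sensitive to extreme prices than the default Pearson coefficient, which suits a skewed column like price.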

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


Positive impact:

* Detect whether lower prices correlate with more reviews, suggesting pricing as a key driver of demand.
* Identify high-priced listings with many reviews, indicating premium experiences in demand.



#### Chart - 5

In [None]:
# Chart - 5 visualization code
plt.figure(figsize=(5,3))
sns.countplot(data=data, x='neighbourhood_group', hue='room_type', palette='coolwarm')
plt.title('Room Type Popularity by Neighbourhood Group')
plt.show()


##### 1. Why did you pick the specific chart?

To compare room type preferences across different regions.

##### 2. What is/are the insight(s) found from the chart?

Compare room type preferences across neighbourhood groups.
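
Raw counts are dominated by the groups with the most listings, so row-wise proportions can make the comparison fairer. A minimal sketch with `pd.crosstab` on a made-up `toy` frame:

```python
import pandas as pd

# Toy frame (made up): two groups with opposite room-type mixes.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 4 + ["Brooklyn"] * 4,
    "room_type": ["Entire home/apt"] * 3 + ["Private room"]
                 + ["Entire home/apt"] + ["Private room"] * 3,
})

# normalize="index" makes every row sum to 1, i.e. the mix within each group.
mix = pd.crosstab(toy["neighbourhood_group"], toy["room_type"], normalize="index")
print(mix)
```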

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* Align room-type-specific marketing campaigns with neighborhood trends (e.g., promote private rooms in Brooklyn).
* Identify mismatches where demand doesn’t align with room types offered.

Negative impact:


* Imbalanced room-type distribution (e.g., many private rooms in tourist-heavy neighborhoods) may not align with guest preferences, leading to missed bookings.
* Low demand for certain room types might discourage hosts from listing those, shrinking diversity in offerings.


#### Chart - 6

In [None]:
# Chart - 6 visualization code
plt.figure(figsize=(5,3))
sns.boxplot(data=data, x='neighbourhood_group', y='availability_365', palette='Set3')
plt.title('Availability of Listings by Neighbourhood Group')
plt.show()


##### 1. Why did you pick the specific chart?

To understand how often listings are available across different regions.

##### 2. What is/are the insight(s) found from the chart?

 Understand which areas have more available listings.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* High availability in a region indicates strong host engagement.
* Low availability suggests seasonal demand or limited host participation, signaling where to focus efforts.

Negative impact:

* Low availability in high-demand areas might suggest missed revenue opportunities due to inactive hosts or limited supply.
* Extremely high availability for some listings may indicate low occupancy rates, reducing profitability for those hosts.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
plt.figure(figsize=(5,3))
sns.violinplot(data=data, x='neighbourhood_group', y='minimum_nights', scale='width', palette='muted')
plt.title('Distribution of Minimum Nights by Neighbourhood Group')
plt.xlabel('Neighbourhood Group')
plt.ylabel('Minimum Nights')
plt.yscale('log')
plt.show()


##### 1. Why did you pick the specific chart?

To analyze flexibility in stay requirements across neighborhoods.

##### 2. What is/are the insight(s) found from the chart?

 Analyze the variation in minimum stay requirements.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

*  Regions with shorter minimum nights may attract tourists looking for weekend trips.
* Areas with high minimum nights can target long-term stays or digital nomads.

Negative impact:

* High minimum night requirements in certain areas might alienate short-term travelers, causing them to switch to competitors.
* Inflexible policies could reduce booking frequency and impact overall revenue.

#### Chart - 8

In [None]:
# Chart - 8 visualization code
plt.figure(figsize=(5,3))
sns.countplot(data=data, x='price_range', palette='Set2')
plt.title('Price Range Distribution')
plt.xlabel('Price Range')
plt.ylabel('Count')
plt.show()


##### 1. Why did you pick the specific chart?

Segments listings into price categories for easier analysis.

##### 2. What is/are the insight(s) found from the chart?

Show the distribution of listings across different price categories.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* Helps businesses understand what price points dominate the market and align services accordingly.
* Identify opportunities for filling gaps in specific price categories (e.g., luxury or budget-friendly).

#### Chart - 9

In [None]:
# Chart - 9 visualization code
plt.figure(figsize=(8, 8))
sns.scatterplot(data=data, x='longitude', y='latitude', hue='neighbourhood_group', alpha=0.6)
plt.title('Geographical Distribution of Listings')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend(title='Neighbourhood Group')
plt.show()

##### 1. Why did you pick the specific chart?

Provides a spatial overview of listing density across the city.

##### 2. What is/are the insight(s) found from the chart?

 Identify clusters of listings geographically.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* Identifies areas of high demand (clusters) for prioritizing resources like cleaning services or customer support.
* Highlights underserved areas with potential for growth

#### Chart - 10

In [None]:
# Chart - 10 visualization code
plt.figure(figsize=(5,3))
sns.countplot(data=data, x='neighbourhood_group', hue='high_demand_flag', palette='cool')
plt.title('High Demand Listings by Neighbourhood Group')
plt.xlabel('Neighbourhood Group')
plt.ylabel('Count')
plt.legend(title='High Demand')
plt.show()


##### 1. Why did you pick the specific chart?

To compare the distribution of high-demand listings across regions.

##### 2. What is/are the insight(s) found from the chart?

Visualize the proportion of high-demand listings.
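
The countplot shows counts, so the proportion itself can be computed separately. The sketch below rebuilds the flag with the same rule used in the wrangling step (availability_365 above 300 and minimum_nights at most 3) in vectorized form and takes the mean per group. The `toy` frame and its values are made up.

```python
import pandas as pd

# Toy frame (made up); only the column names follow the dataset.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 4 + ["Brooklyn"] * 4,
    "availability_365": [365, 100, 320, 310, 0, 10, 350, 200],
    "minimum_nights": [1, 2, 30, 3, 1, 2, 2, 1],
})

# Same rule as is_high_demand, written without a row-wise apply.
toy["high_demand_flag"] = (toy["availability_365"] > 300) & (toy["minimum_nights"] <= 3)

# The mean of a boolean column is the share of True values.
flag_share = toy.groupby("neighbourhood_group")["high_demand_flag"].mean()
print(flag_share)
```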

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* Helps identify regions where listings consistently meet customer expectations, guiding service enhancements.
* Regions with fewer high-demand listings need better marketing or service improvements.

#### Chart - 11

In [None]:
# Chart - 11 visualization code
plt.figure(figsize=(5,3))
sns.scatterplot(data=data, x='availability_365', y='price', alpha=0.5)
plt.title('Availability vs. Price')
plt.xlabel('Availability (Days)')
plt.ylabel('Price')
plt.show()


##### 1. Why did you pick the specific chart?

Explores the relationship between listing price and its availability throughout the year.

##### 2. What is/are the insight(s) found from the chart?

Determine how availability impacts pricing.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* High availability for low-priced listings may indicate over-saturation, calling for pricing strategies.
* Limited availability for high-priced listings suggests premium offerings that can be expanded.

Negative impact:

* High availability at low prices might indicate that hosts are undervaluing their properties or that demand is too low, leading to reduced profitability.
* Low availability at high prices may suggest a perception of overpricing, deterring potential bookings.

#### Chart - 12

In [None]:
# Chart - 12 visualization code
plt.figure(figsize=(5,3))
data['popularity_tier'].value_counts().plot(kind='bar', color='green', edgecolor='black')
plt.title('Popularity Tier Distribution')
plt.xlabel('Tier')
plt.ylabel('Count')
plt.show()


##### 1. Why did you pick the specific chart?

Highlights the popularity tiers of listings, segmented by their reviews

##### 2. What is/are the insight(s) found from the chart?

Display how listings are distributed across popularity tiers.


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* High popularity tiers suggest areas or hosts excelling in customer satisfaction, worth replicating in other areas.
* Lower tiers indicate opportunities for improving service quality or guest experiences.

#### Chart - 13

In [None]:
# Chart - 13 visualization code
plt.figure(figsize=(5,3))
data['revenue_potential'] = data['price'] * data['availability_365']
data.groupby('neighbourhood_group')['revenue_potential'].sum().plot(kind='bar', color='coral')
plt.title('Revenue Potential by Neighbourhood Group')
plt.xlabel('Neighbourhood Group')
plt.ylabel('Total Revenue Potential')
plt.show()


##### 1. Why did you pick the specific chart?

Calculates potential revenue based on price and availability.

##### 2. What is/are the insight(s) found from the chart?

Estimate revenue potential for different neighbourhood groups
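
A summed total depends heavily on how many listings each group has, so a per-listing average may be a useful companion figure. The sketch below compares both on a made-up `toy` frame. Note that availability_365 counts days a listing is available, not nights actually booked, so the product is at best an upper bound.

```python
import pandas as pd

# Toy frame (made up): three modest Manhattan listings, one Staten Island listing.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 3 + ["Staten Island"],
    "price": [200, 200, 200, 100],
    "availability_365": [100, 100, 100, 365],
})

# Same definition as in the chart code: price times available days.
toy["revenue_potential"] = toy["price"] * toy["availability_365"]

# Total, per-listing mean and listing count side by side.
summary = toy.groupby("neighbourhood_group")["revenue_potential"].agg(["sum", "mean", "count"])
print(summary)
```

In the toy data Manhattan leads on the total only because it has more listings, while the single Staten Island listing has the higher per-listing value.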

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive impact:

* High-revenue neighborhoods may warrant additional investments or host incentives.
* Low-revenue areas highlight where to focus on increasing occupancy or pricing optimization.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
# Select numeric columns for correlation analysis
numeric_cols = data.select_dtypes(include=['float64', 'int64']).columns

correlation_matrix = data[numeric_cols].corr()

plt.figure(figsize=(6, 5))
sns.heatmap(correlation_matrix, annot=True, fmt=".2f", cmap="viridis", cbar=True)
plt.title("Correlation Heatmap", fontsize=16)
plt.show()


##### 1. Why did you pick the specific chart?

* Identify relationship between variables
* Feature engineering insights
* Business strategy optimisation
* Data driven storytelling



##### 2. What is/are the insight(s) found from the chart?

* number_of_reviews and reviews_per_month are positively correlated. This is expected, since both measure review activity: listings that are reviewed frequently also accumulate more reviews overall.
* Price-related correlations are generally weak, suggesting that pricing is not the sole driver of demand, and other factors (e.g., location, quality, amenities) play a role.
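
The heatmap code above selects every numeric column, which includes identifiers such as id and host_id and the coordinates. Correlations with identifiers carry no meaning, so one option is to exclude them first. A minimal sketch on a made-up `toy` frame; the `exclude` list is an assumption about which columns to leave out.

```python
import pandas as pd

# Toy frame (made up); only the column names follow the dataset.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "host_id": [10, 11, 12, 13, 14],
    "price": [50, 80, 120, 300, 90],
    "number_of_reviews": [90, 60, 30, 10, 45],
    "reviews_per_month": [4.0, 2.5, 1.0, 0.3, 2.0],
})

# Identifier and coordinate columns to leave out; errors="ignore" skips absent ones.
exclude = ["id", "host_id", "latitude", "longitude"]
numeric = toy.select_dtypes(include="number").drop(columns=exclude, errors="ignore")

correlation_matrix = numeric.corr()
print(correlation_matrix.round(2))
```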

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
subset_cols = ['price', 'minimum_nights', 'number_of_reviews', 'reviews_per_month', 'availability_365']

sns.pairplot(data[subset_cols], diag_kind="kde", plot_kws={'alpha': 0.6})
plt.suptitle("Pair Plot of Selected Variables", y=1.02, fontsize=16)
plt.show()


##### 1. Why did you pick the specific chart?

* The pair plot allows us to visually examine how numerical variables (e.g., price, availability_365, reviews_per_month) interact with each other.
* Outliers, such as listings with extremely high prices or unusually low availability, can be easily identified for further investigation.
* It provides a compelling narrative about the behavior of listings, guest preferences, and host strategies.

##### 2. What is/are the insight(s) found from the chart?

* Listings with higher prices tend to have fewer reviews.
* High availability is often observed for low-to-mid-priced listings.
* Listings with higher availability tend to receive more consistent monthly reviews.
* Budget listings receive more reviews per month.
* Listings with high minimum night requirements tend to have lower availability.
* number_of_reviews and reviews_per_month are positively related: listings with a high total number of reviews also tend to have more monthly reviews.
* Outliers, such as listings with extremely high prices and low reviews, are also visible.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

1. Dynamic Pricing Strategies

Insights Used:

Listings with lower prices tend to have more reviews and higher availability. However, premium-priced listings have fewer reviews but cater to niche markets.

Recommendation:

Implement dynamic pricing models that adjust prices based on seasonality, location, and demand.
Offer competitive pricing for budget-conscious travelers while maintaining higher pricing for premium listings in high-demand areas like Manhattan.

2. Optimize Minimum Night Policies

Insights Used:
Listings with higher minimum nights often have lower availability and fewer reviews.

Recommendation:

Encourage hosts to lower minimum night requirements to attract short-term travelers, who are a significant segment of the market.
Highlight the potential revenue increase from higher occupancy due to shorter stays.

3. Enhance Location-Based Strategies

Insights Used:
Manhattan listings have the highest prices, while Staten Island listings are budget-friendly. Demand varies by borough.

Recommendation:

Focus marketing efforts on budget-conscious travelers for areas like Staten Island and Bronx.
Promote high-end experiences for premium locations such as Manhattan.
Use location-specific amenities or events to attract guests.

4. Improve Host Engagement

Insights Used:
Listings with higher reviews per month and overall reviews indicate active, well-maintained properties.

Recommendation:

Educate hosts about the importance of prompt responses, frequent availability, and maintaining their properties.
Provide incentives for hosts with excellent ratings, such as priority visibility on the platform.

5. Leverage Popular Listings

Insights Used:

High-demand listings with consistent reviews can serve as benchmarks for success.

Recommendation:

Identify popular listings and analyze what makes them successful (e.g., pricing, amenities, location).
Share these best practices with other hosts to improve their performance.

6. Encourage Reviews and Ratings

Insights Used:

Listings with higher reviews tend to attract more bookings and maintain consistent demand.

Recommendation:

Encourage guests to leave reviews through automated follow-ups and small incentives (e.g., discounts on future stays).
Highlight guest reviews prominently on the platform to build trust with potential customers.

7. Address Outliers and Underperforming Listings

Insights Used:

Some listings with extremely high prices or restrictive policies perform poorly in terms of availability and reviews.

Recommendation:

Work with hosts of underperforming listings to review their pricing and policies.
Provide personalized recommendations to improve listing performance.

8. Promote Year-Round Availability

Insights Used:

Listings with higher availability receive more reviews and are more active overall.

Recommendation:

Encourage hosts to maintain year-round availability.
Offer seasonal promotions to attract off-season travelers.

9. Market Differentiation for Unique Listings

Insights Used:

Unique or high-priced listings cater to niche markets but require more visibility to attract bookings.

Recommendation:

Promote luxury or unique experiences (e.g., themed stays, premium amenities) to differentiate listings.
Target these to travelers looking for curated experiences.

10. Feature and Trend Analysis

Insights Used:

Correlation between features like reviews, price, and availability drives booking behavior.

Recommendation:

Continuously monitor trends in guest preferences and update platform recommendations to reflect popular features.

Expected Outcomes

Increased Occupancy: Optimizing pricing and availability will attract more bookings.

Enhanced Revenue: Dynamic pricing and improved policies will drive higher revenue for hosts and the platform.

Improved Customer Experience: Encouraging reviews and active host engagement ensures guest satisfaction.

Strategic Growth: Leveraging location-based strategies and popular listings will drive demand across all boroughs.

# **Conclusion**

Key Takeaways

1. Pricing and Demand

Budget listings are more frequently booked and reviewed.
Premium listings cater to niche markets and require tailored strategies to sustain demand.

2. Availability and Policies

Listings with higher availability and flexible minimum night policies perform better in terms of occupancy and reviews.

3. Geographic Impact

Location significantly influences pricing and demand, with boroughs like Manhattan commanding higher prices.

4. Reviews and Guest Satisfaction

Listings with frequent and positive reviews maintain steady demand, indicating the importance of guest engagement and feedback.

5. Host Engagement

Active and responsive hosts significantly improve listing performance and guest satisfaction.

6. Outliers and Opportunities

High-priced or restrictive listings often underperform, highlighting opportunities to optimize underutilized properties.

### Business Impact

By implementing the recommendations derived from this analysis, hosts and the platform can:

* Increase occupancy rates and revenue.
* Enhance the guest experience through better availability and improved policies.
* Strengthen the platform’s position in the competitive short-term rental market by leveraging location-based marketing and targeted pricing.

### Final Note

This project demonstrates the power of data-driven decision-making. By continuously analyzing trends and updating strategies, Airbnb and its hosts can remain agile and adaptive to market demands, ensuring sustained growth and profitability.










### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***