# **Project Name**    -



##### **Project Type**    - EDA with Python, Power BI
##### **Contribution**    - Individual
##### **Name**  - Tanvi Pawar


# **Project Summary -**

For this EDA project, we have chosen the "Airbnb Listings Data" dataset from two major cities: Chicago and New Orleans. This dataset provides a comprehensive snapshot of various attributes related to Airbnb listings, such as property type, neighborhood, pricing, availability, and more. The dataset is ideal for conducting an in-depth exploration of the local Airbnb market and deriving actionable insights.

**Steps to Proceed with the Dashboard**

**Data Cleaning**
The first step was to address the disorder and inconsistency within the dataset. Google Colab Notebook and Power BI were used to clean the data systematically: rectifying discrepancies, eliminating duplicates, and standardizing formats.

**Data Transformation**
Supplementary columns were generated from pre-existing categorical data. They were derived from long descriptive text that, in its original form, was hard to read and unsuitable for visualization. The additional columns gave a much clearer view of the data and made effective visualizations possible.
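As an illustration of this kind of derivation, here is a minimal pandas sketch that pulls numeric attributes out of descriptive text. The sample strings, the use of the `name` column, and the new column names `bedrooms` and `baths` are assumptions made for the example; the real text format and the columns actually derived for the dashboard may differ.

```python
import pandas as pd

# Hypothetical descriptive strings; the real column and format may differ.
listings = pd.DataFrame(
    {
        "name": [
            "Rental unit in Chicago, 2 bedrooms, 3 beds, 1.5 baths",
            "Home in New Orleans, 1 bedroom, 1 bed, 1 bath",
            "Cozy studio near the lake",
        ]
    }
)

# Capture the number that precedes each keyword.
# Rows without a match are left as NaN rather than guessed.
bedrooms_text = listings["name"].str.extract(r"(\d+)\s+bedrooms?", expand=False)
baths_text = listings["name"].str.extract(r"(\d+(?:\.\d+)?)\s+baths?", expand=False)

# Convert the captured text to numbers so the columns can be charted.
listings["bedrooms"] = pd.to_numeric(bedrooms_text)
listings["baths"] = pd.to_numeric(baths_text)

print(listings[["bedrooms", "baths"]])
```

Leaving unmatched rows as NaN keeps the derived columns honest: a listing whose text does not mention bedrooms is reported as unknown rather than as zero.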

**Power BI**
Power BI's data preparation features were used to resolve inconsistencies. For instance, in the "Neighborhood" column, the same neighborhood appeared under several different values because of differences in letter casing, spelling variations, or phonetically similar names. Power BI’s built-in transformation tools, such as "Replace Values", were instrumental in standardizing the data for analysis and visualization.
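For readers who prefer to do this standardization in the notebook, the following is a rough pandas analogue of the "Replace Values" step, not the Power BI procedure itself. The sample values and the `spelling_fixes` mapping are hypothetical and only illustrate the casing and spelling cases mentioned above.

```python
import pandas as pd

# Hypothetical sample: the same neighbourhood written several ways.
listings = pd.DataFrame(
    {"neighbourhood": ["Lake View", "lake view ", "LAKE VIEW", "Lakeview", "Treme"]}
)

# Step 1: trim stray whitespace and normalise letter casing.
cleaned = listings["neighbourhood"].str.strip().str.title()

# Step 2: map known spelling variants to one canonical label.
# This mapping is illustrative and not taken from the dataset.
spelling_fixes = {"Lakeview": "Lake View"}
listings["neighbourhood_clean"] = cleaned.replace(spelling_fixes)

print(listings["neighbourhood_clean"].unique())
```

Writing the result to a new column keeps the original values available for auditing which variants were merged.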

# **GitHub Link -**

https://github.com/Tanvipawar10/Air_BNB_PowerBI-Data-visualization

# **Problem Statement**


The operations of Airbnb vary significantly across different urban environments due to diverse demographic, economic, and cultural factors. This project aims to conduct a comprehensive comparative examination of Airbnb's presence in two major cities, Chicago and New Orleans, using Python for data exploration and analysis, and Power BI for advanced data cleaning, transformation, and visualization.

The objective is to uncover shared attributes, key disparities, and unique patterns in the Airbnb market between these cities. By leveraging Power BI's robust data modeling and visual analytics capabilities, the study seeks to present actionable insights and highlight trends in property types, pricing strategies, neighborhood dynamics, and availability. This will provide a deeper understanding of the factors shaping Airbnb's operations in these distinct urban landscapes.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
*   Will the gained insights help create a positive business impact? Are there any insights that lead to negative growth? Justify with a specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
from google.colab import drive
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
# Load Dataset
# Clone the repository files from GitHub into the Colab session.
!git clone https://github.com/Tanvipawar10/Air_BNB_PowerBI-Data-visualization.git

In [None]:
# Giving a path to the files
Chicago_listings_path = "/content/Air_BNB_PowerBI-Data-visualization/Chicago listing.csv"
New_Orleans_listings_path =  "/content/Air_BNB_PowerBI-Data-visualization/New orleans listing.csv"

In [None]:
# Load the datasets from the cloned repository files
Chicago_listing = pd.read_csv(Chicago_listings_path)
New_orleans_listing = pd.read_csv(New_Orleans_listings_path)
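The guidelines above mention exception handling as a plus. One possible way to guard the loading step is sketched below; `load_listing` is a hypothetical helper introduced for illustration and is not part of the original notebook.

```python
from pathlib import Path

import pandas as pd


def load_listing(csv_path):
    """Read a listings CSV, failing with a clear message if it is missing or empty."""
    path = Path(csv_path)
    if not path.is_file():
        # Typically means the clone step did not run or the file name changed.
        raise FileNotFoundError(f"Listings file not found: {path}")
    try:
        return pd.read_csv(path)
    except pd.errors.EmptyDataError as exc:
        # pandas raises EmptyDataError when the file has no columns to parse.
        raise ValueError(f"Listings file is empty: {path}") from exc


# Example usage with the paths defined earlier in the notebook:
# Chicago_listing = load_listing(Chicago_listings_path)
# New_orleans_listing = load_listing(New_Orleans_listings_path)
```

Failing early with the offending path in the message makes a broken clone step easier to diagnose than a later `NameError` or an empty chart.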

### Dataset First View

In [None]:
# Dataset First Look
New_orleans_listing.head()

In [None]:
Chicago_listing.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
New_orleans_listing.shape


In [None]:
Chicago_listing.shape

### Dataset Information

In [None]:
# Dataset Info
New_orleans_listing.info()

In [None]:
Chicago_listing.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
print("Number of duplicate rows in New Orleans:" , New_orleans_listing.duplicated().sum())

New_orleans_listing.drop_duplicates(inplace=True)

In [None]:
print("Number of duplicate rows in Chicago:" , Chicago_listing.duplicated().sum())

Chicago_listing.drop_duplicates(inplace=True)

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
New_orleans_listing.isnull().sum()

In [None]:
Chicago_listing.isnull().sum()

In [None]:
# Visualizing the missing values
sns.heatmap(New_orleans_listing.isnull() , cbar=False , cmap='viridis')
plt.title("Visualizing missing values")
plt.show()

In [None]:
sns.heatmap(Chicago_listing.isnull() , cbar=False , cmap='viridis')
plt.title("Visualizing missing values")
plt.show()

### What did you know about your dataset?

The Airbnb Listings Data contains information about properties available for rent on Airbnb in Chicago and New Orleans. Each record represents a unique listing and includes attributes such as property type, neighbourhood, number of bedrooms, pricing, availability, host information, and more.

There are no duplicate rows in our dataset.

The last_review and reviews_per_month columns may contain null values where number_of_reviews is 0: a listing with no reviews has no last review date and no reviews-per-month value.

The last_review column is stored with the object dtype, whereas it should be a datetime dtype.
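A minimal sketch of the dtype conversion, using a small hypothetical sample rather than the real data. `pd.to_datetime` with `errors="coerce"` is a standard pandas call; whether coercion is the right policy for this dataset is an assumption.

```python
import pandas as pd

# Hypothetical sample mimicking the column as loaded from CSV (object dtype).
sample = pd.DataFrame({"last_review": ["2023-05-14", None, "2022-11-02"]})

# errors="coerce" turns missing or unparseable entries into NaT instead of raising.
sample["last_review"] = pd.to_datetime(sample["last_review"], errors="coerce")

print(sample["last_review"].dtype)
print(sample["last_review"].isna().sum(), "listing(s) without a review date")
```

Note that a datetime column represents missing dates as NaT. Filling such a column with 0 later would mix integers with dates, so the conversion and the null-handling strategy need to be chosen together.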

## ***2. Understanding Your Variables***

In [None]:
# New Orleans Columns
New_orleans_listing.columns

In [None]:
# Chicago Columns
Chicago_listing.columns

In [None]:
# New Orleans Describe
New_orleans_listing.describe()

In [None]:
# Chicago Describe
Chicago_listing.describe()

### Variables Description

Key Attributes:

id: Unique identifier for each listing.

name: The title or name of the listing.

host_id: Unique identifier for the host of the property.

host_name: Name of the host.

neighbourhood_group: The broader area or group that the neighbourhood belongs to.

neighbourhood: Specific neighbourhood where the property is located.

latitude: Latitude coordinate of the property.

longitude: Longitude coordinate of the property.

room_type: Type of room (e.g., Private room, Entire home/apt, Shared room).

price: Price of the listing per night.

minimum_nights: Minimum number of nights required for booking.

number_of_reviews: Total number of reviews received for the listing.

last_review: Date of the last review.

reviews_per_month: Average number of reviews per month.

availability_365: Number of days the listing is available for booking in a year.

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
unique_counts_New_orleans = New_orleans_listing.nunique()
print(unique_counts_New_orleans)

In [None]:
unique_counts_Chicago = Chicago_listing.nunique()
print(unique_counts_Chicago)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Append the two datasets vertically (row-wise)
Both_listings = pd.concat([New_orleans_listing, Chicago_listing], axis=0 , ignore_index=True)
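Because the comparison is between two cities, it can help to record which city each row came from before appending. The sketch below assumes the source files do not already carry such a column; the `city` column name and the tiny frames are hypothetical.

```python
import pandas as pd

# Tiny hypothetical stand-ins for the two city frames.
new_orleans = pd.DataFrame({"id": [1, 2], "price": [120, 80]})
chicago = pd.DataFrame({"id": [3], "price": [95]})

# Tag each frame with its city, then append row-wise as in the cell above.
both = pd.concat(
    [new_orleans.assign(city="New Orleans"), chicago.assign(city="Chicago")],
    axis=0,
    ignore_index=True,
)

print(both["city"].value_counts())
```

`assign` returns a copy, so the original per-city frames stay unchanged.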

In [None]:
# Appending the datasets increases the number of rows
Both_listings.shape

In [None]:
# Check that no duplicate id values are present after appending (expected count: 0)
Both_listings["id"].duplicated().sum()

In [None]:
# Total null values in the dataset
Both_listings.isnull().sum()

In [None]:
# Dropped the unnecessary columns
Both_listings.drop(["neighbourhood_group","number_of_reviews_ltm","license"], axis=1 , inplace=True)

In [None]:
# Removing Null values in Price column
Both_listings.dropna(subset=["price"], axis= 0 , inplace = True)

In [None]:
# Updated null values count
Both_listings.isnull().sum()

In [None]:
# Count the listings that have 0 reviews on Airbnb.
# number_of_reviews == 0 is the likely reason for the null values in the last_review and reviews_per_month columns.
zero_reviews_listing = Both_listings[Both_listings["number_of_reviews"]==0][["number_of_reviews"]].count()
print(zero_reviews_listing)
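The count above suggests the link between zero reviews and null values but does not confirm it row by row. A minimal sketch of a direct check follows, on a hypothetical sample; on the real data the same comparison would be run against `Both_listings`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the three columns involved.
sample = pd.DataFrame(
    {
        "number_of_reviews": [0, 12, 0, 3],
        "last_review": [None, "2023-05-14", None, "2022-11-02"],
        "reviews_per_month": [np.nan, 1.2, np.nan, 0.4],
    }
)

no_reviews = sample["number_of_reviews"] == 0
null_last_review = sample["last_review"].isnull()
null_per_month = sample["reviews_per_month"].isnull()

# Each check is True only if the null rows are exactly the zero-review rows.
print((no_reviews == null_last_review).all())
print((no_reviews == null_per_month).all())
```

If either check prints False on the real data, some nulls have a different cause and deserve a separate look before being filled.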

In [None]:
# Replace the remaining null values with 0. These are mainly in last_review and reviews_per_month,
# but note that fillna(0) applies to every column that still contains nulls.
Both_listings.fillna(0, inplace=True)

In [None]:
# Updated dataset null values
Both_listings.isnull().sum()

In [None]:
# Updated Dataset Shape
Both_listings.shape

In [None]:
# Save the DataFrame to a CSV file, excluding the index values
Both_listings.to_csv('AirBNB_PowerBI_Prep.csv', index= False)

### What manipulations have you done and what insights did you find?

*   The datasets from Chicago and New Orleans have been appended together.
*   The neighbourhood_group, number_of_reviews_ltm, and license columns have been removed to focus on relevant attributes.
*   Rows with null values in the price column have been removed.
*   The remaining null values, mainly in the last_review and reviews_per_month columns, have been replaced with zero.
*   The appended dataset has been saved as AirBNB_PowerBI_Prep.csv and is ready for transformation and analysis in Power BI.
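The wrangling cells above can also be gathered into a single function, which makes the preparation easier to rerun in one go. This is a sketch mirroring the notebook's steps, not the author's original code; `prepare_listings` and the small input frames are hypothetical.

```python
import numpy as np
import pandas as pd


def prepare_listings(city_frames,
                     drop_cols=("neighbourhood_group", "number_of_reviews_ltm", "license")):
    """Append city frames, drop unused columns, drop rows without a price, fill other nulls with 0."""
    combined = pd.concat(city_frames, axis=0, ignore_index=True)
    # errors="ignore" skips any listed column that is absent from the input.
    combined = combined.drop(columns=list(drop_cols), errors="ignore")
    combined = combined.dropna(subset=["price"])
    return combined.fillna(0).reset_index(drop=True)


# Tiny hypothetical frames standing in for the two city datasets.
new_orleans = pd.DataFrame(
    {"id": [1, 2], "price": [120.0, np.nan], "reviews_per_month": [np.nan, 0.5], "license": ["A", "B"]}
)
chicago = pd.DataFrame(
    {"id": [3], "price": [95.0], "reviews_per_month": [2.0], "license": ["C"]}
)

prepared = prepare_listings([new_orleans, chicago])
print(prepared)

# To export for Power BI, as in the notebook:
# prepared.to_csv("AirBNB_PowerBI_Prep.csv", index=False)
```

Returning a new frame instead of using `inplace=True` leaves the inputs untouched, so the function can be rerun safely while experimenting.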





## **5. Solution to Business Objective**

What do you suggest the client do to achieve the business objective?

Explain briefly.

To help the client improve Airbnb operations in Chicago and New Orleans, here are some simple and effective suggestions:

Adjust Pricing Smartly: Use data to set prices based on demand. Charge higher prices in popular neighborhoods or during busy seasons and offer discounts when demand is low.

Focus on Popular Property Types: Find out which types of properties (like apartments or houses) are most in demand and encourage hosts to offer more of those.

Target High-Demand Areas: Concentrate on neighborhoods that attract the most visitors. Promote listings in these areas to increase bookings.

Improve Listings: Suggest hosts add better photos, detailed descriptions, and popular amenities to make their properties more attractive to customers.

Pay Attention to Reviews: Use customer reviews to find areas to improve, like cleanliness or communication. Satisfied guests are more likely to leave good reviews and book again.

Plan for Seasonal Demand: Look at booking trends during different times of the year and run special promotions during festivals or events in each city to attract more guests.

Stay Flexible: Keep an eye on market trends and update strategies as needed. Use Power BI dashboards to track important data and make quick decisions.

These simple steps will help the client attract more customers, improve operations, and grow their Airbnb business in both cities.
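The pricing and high-demand-area suggestions above could be backed by a simple grouped summary. The sketch below uses hypothetical rows, assumes a `city` column exists (it is not confirmed in the source data), and treats review counts as a rough proxy for demand, which is an assumption rather than a finding of this project.

```python
import pandas as pd

# Hypothetical rows; values are illustrative only.
listings = pd.DataFrame(
    {
        "city": ["Chicago", "Chicago", "Chicago", "New Orleans", "New Orleans"],
        "neighbourhood": ["Lake View", "Lake View", "Loop", "Treme", "Treme"],
        "price": [100, 140, 200, 90, 110],
        "number_of_reviews": [10, 30, 5, 40, 20],
    }
)

# Median price, listing count, and total reviews per neighbourhood within each city.
summary = (
    listings.groupby(["city", "neighbourhood"])
    .agg(
        median_price=("price", "median"),
        listing_count=("price", "count"),
        total_reviews=("number_of_reviews", "sum"),
    )
    .reset_index()
    .sort_values("total_reviews", ascending=False)
)

print(summary)
```

The median is used instead of the mean because nightly prices tend to include a few very expensive listings that would skew an average.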

# **Conclusion**

By carefully analyzing Airbnb data from Chicago and New Orleans, we gained valuable insights into how the market works in these two cities. The data showed us patterns in pricing, property types, customer preferences, and neighborhood popularity. Using Power BI helped us clean, transform, and visualize the data effectively, making it easier to understand these trends.

With these insights, the client can focus on improving listings, targeting popular neighborhoods, adjusting prices, and meeting customer needs better. These steps will help attract more guests, increase bookings, and grow their Airbnb presence in both cities.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***