<a href="https://colab.research.google.com/github/SankhadipSN99/Data-Visualisation-Challenge/blob/main/Airbnb_Data_Visualisation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **The DataViz Challenge - Transforming EDA Projects to Dashboards**
## **Team Member: Sankhadip Nath**

## **Link to Tableau Dashboard: https://public.tableau.com/shared/S9W5Y5HKH?:display_count=n&:origin=viz_share_link**

## **GitHub Link**



**Create a stunning Tableau dashboard from the same dataset on which EDA was performed for the capstone project.**


**Problem Statement:**

In the context of Airbnb operations, how can Tableau support a comprehensive comparison between Chicago and New Orleans, two very different urban environments?

This inquiry seeks to leverage Tableau's visual analytics capabilities to uncover and illustrate the shared attributes, disparities, and distinctive patterns inherent to Airbnb's presence in these cities, thus elevating the depth and insightfulness of the study.

**Dataset Selection:**

For this EDA project, we have chosen the "Airbnb Listings Data" dataset for two major cities: Chicago and New Orleans. This dataset provides a comprehensive snapshot of various attributes related to Airbnb listings, such as property type, neighbourhood, pricing, availability, and more. The dataset is well suited to an in-depth exploration of the local Airbnb market and to deriving actionable insights.

**Why Airbnb:**

Airbnb, a prominent online platform, enables individuals to reserve accommodations ranging from beds and rooms to apartments and entire homes across the globe. The platform lets hosts rent out property directly, without intermediaries or substantial capital outlays. Users can often secure lodgings at competitive rates relative to traditional hotels. Airbnb also reaches regions where conventional hotel presence might be limited, offering lodging in underserved locales. Moreover, travellers who want immersive local experiences often choose accommodations embedded within local communities, reflecting a preference for authenticity and cultural engagement.

**Airbnb Statistics:**

- Over 4 million listings worldwide
- 150 million users in 191 countries
- Worldwide value of $32 billion
- Global growth rate since 2009: 153%

**Dataset Details:**

- Dataset Name: Airbnb Listings Data
- Source: [Link to dataset](http://insideairbnb.com/get-the-data/)
- Cities: Chicago & New Orleans
- Description: The Airbnb Listings Data contains information about different properties available for rent on Airbnb in a specific city. Each record represents a unique listing and includes attributes such as property type, neighbourhood, number of bedrooms, pricing, availability, host information, and more.

**Key Attributes**:

1. id: Unique identifier for each listing.

2. name: The title or name of the listing.

3. host_id: Unique identifier for the host of the property.

4. host_name: Name of the host.

5. neighbourhood_group: The broader area or group that the neighbourhood belongs to.

6. neighbourhood: Specific neighbourhood where the property is located.

7. latitude: Latitude coordinate of the property.

8. longitude: Longitude coordinate of the property.

9. room_type: Type of room (e.g., Private room, Entire home/apt, Shared room).

10. price: Price of the listing per night.

11. minimum_nights: Minimum number of nights required for booking.

12. number_of_reviews: Total number of reviews received for the listing.

13. last_review: Date of the last review.

14. reviews_per_month: Average number of reviews per month.

15. availability_365: Number of days the listing is available for booking in a year.

**Problem Areas to Explore:**

1. Which are the popular neighbourhoods, and what are their average prices and number of listings?

2. What is the percentage share of different property types and room types?

3. How does pricing vary with location, property type, and reviews?

4. How do host types correlate with factors such as reviews and price?

The visualisation findings are divided into 4 categories:

- Overview of Airbnb
- Property analysis
- Pricing analysis
- Host analysis
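
The first two questions above map onto simple aggregations. The following is a minimal sketch on made-up rows that reuse the dataset's column names (`neighbourhood`, `room_type`, `price`); the values and the names `listings`, `neighbourhood_summary` and `room_type_share` are illustrative assumptions, not results from the real files.

```python
import pandas as pd

# Toy rows using the same column names as the listings files (values are illustrative).
listings = pd.DataFrame({
    "neighbourhood": ["West Town", "West Town", "Lincoln Park", "Hyde Park"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt", "Private room"],
    "price": [90.0, 110.0, 106.0, 70.0],
})

# Number of listings and average price per neighbourhood, busiest first.
neighbourhood_summary = (
    listings.groupby("neighbourhood")
    .agg(listings_count=("price", "size"), avg_price=("price", "mean"))
    .sort_values("listings_count", ascending=False)
)

# Percentage share of each room type.
room_type_share = listings["room_type"].value_counts(normalize=True) * 100

print(neighbourhood_summary)
print(room_type_share)
```

The same summaries can be built directly in Tableau; computing them in pandas first is one way to sanity-check the dashboard numbers.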

In [1]:
#Importing necessary libraries
import numpy as np
import pandas as pd

In [2]:
#Importing the downloaded csv files into pandas dataframe for further analysis
chicago = pd.read_csv('/content/drive/MyDrive/Visual Project Data Almabetter/Chicago listing.csv')
new_orelans = pd.read_csv('/content/drive/MyDrive/Visual Project Data Almabetter/New_orelans listings.csv')

In [3]:
#Top 5 rows and columns view of Chicago dataset
chicago.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
0,2384,Condo in Chicago · ★4.99 · 1 bedroom · 1 bed ·...,2613,Rebecca,,Hyde Park,41.7879,-87.5878,Private room,70.0,3,229,2023-10-31,2.1,1,362,19,R17000015609
1,7126,Rental unit in Chicago · ★4.70 · 1 bedroom · 1...,17928,Sarah,,West Town,41.90166,-87.68021,Entire home/apt,90.0,2,512,2023-10-09,2.91,1,335,36,R21000075737
2,10945,Rental unit in Chicago · ★4.63 · 2 bedrooms · ...,33004,At Home Inn,,Lincoln Park,41.91196,-87.63981,Entire home/apt,106.0,4,78,2023-11-06,0.66,5,225,19,2209984
3,1461451,Rental unit in Chicago · ★4.60 · 1 bedroom · 1...,2907254,Joe,,West Ridge,42.01653,-87.68788,Shared room,28.0,1,188,2023-11-21,1.49,15,180,10,R21000075752
4,1502674,Rental unit in Chicago · ★4.79 · 2 bedrooms · ...,33004,At Home Inn,,Lincoln Park,41.91175,-87.63834,Entire home/apt,146.0,4,110,2023-12-10,0.88,5,0,27,2209985


In [4]:
#Bottom 5 rows and columns view of Chicago dataset
chicago.tail()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
8944,1044986838764233788,Rental unit in Chicago · ★New · 3 bedrooms · 5...,345779031,Sofia,,West Town,41.907587,-87.690124,Entire home/apt,113.0,2,0,,,3,262,0,R23000108515
8945,1045186641282805256,Rental unit in Chicago · ★New · 1 bedroom · 1 ...,38986401,Matt,,Near North Side,41.89301,-87.62954,Entire home/apt,118.0,32,0,,,1,77,0,
8946,1045430795416239989,Home in Chicago · ★New · 2 bedrooms · 2 beds ·...,22608682,Ryan,,East Garfield Park,41.882076,-87.710654,Private room,64.0,1,0,,,12,350,0,R22000094288
8947,1045611315361653973,Rental unit in Chicago · ★New · 4 bedrooms · 6...,464407683,Hubbard Holdings,,Lincoln Park,41.91447,-87.65216,Entire home/apt,327.0,1,0,,,10,254,0,R23000109167
8948,1045650409760380671,Condo in Chicago · ★New · 7 bedrooms · 7 beds ...,169297663,William,,West Town,41.8925,-87.668218,Entire home/apt,178.0,1,0,,,21,127,0,R18000036353


In [None]:
#Getting information about all the null values across the chicago dataset
chicago.isnull().sum()

id                                   0
name                                 0
host_id                              0
host_name                            0
neighbourhood_group               8949
neighbourhood                        0
latitude                             0
longitude                            0
room_type                            0
price                              611
minimum_nights                       0
number_of_reviews                    0
last_review                       1907
reviews_per_month                 1907
calculated_host_listings_count       0
availability_365                     0
number_of_reviews_ltm                0
license                           1671
dtype: int64
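
Raw null counts are easier to compare across the two cities when expressed as a share of rows. The sketch below shows one way to do that; the helper `null_report` and the sample frame are hypothetical and only mirror the column names used above.

```python
import numpy as np
import pandas as pd

def null_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return null counts and percentages for columns that contain nulls."""
    counts = df.isnull().sum()
    report = pd.DataFrame({
        "null_count": counts,
        "null_pct": (counts / len(df) * 100).round(2),
    })
    # Keep only columns with at least one null, most affected first.
    return report[report["null_count"] > 0].sort_values("null_count", ascending=False)

# Illustrative sample, not the real data.
sample = pd.DataFrame({
    "neighbourhood_group": [np.nan, np.nan, np.nan, np.nan],
    "price": [70.0, np.nan, 106.0, 28.0],
    "room_type": ["Private room", "Entire home/apt", "Entire home/apt", "Shared room"],
})

report = null_report(sample)
print(report)
```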

In [None]:
#Getting the description of all the numerical columns of Chicago dataset
chicago.describe()

Unnamed: 0,id,host_id,neighbourhood_group,latitude,longitude,price,minimum_nights,number_of_reviews,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm
count,8949.0,8949.0,0.0,8949.0,8949.0,8338.0,8949.0,8949.0,7042.0,8949.0,8949.0,8949.0
mean,4.189903e+17,182997300.0,,41.892335,-87.661974,173.167786,15.881439,46.773159,1.834915,55.985361,189.932283,12.933624
std,4.246817e+17,170826400.0,,0.062265,0.043951,1112.349527,41.144839,89.977542,1.858998,164.076829,140.088536,20.119904
min,2384.0,2153.0,,41.646767,-87.847243,7.0,1.0,0.0,0.01,1.0,0.0,0.0
25%,37432440.0,35342460.0,,41.86274,-87.68536,70.25,2.0,1.0,0.48,1.0,45.0,0.0
50%,5.585691e+17,108211000.0,,41.896218,-87.65724,115.0,2.0,14.0,1.49,3.0,180.0,4.0
75%,8.471068e+17,327709600.0,,41.93515,-87.63025,178.0,32.0,55.0,2.71,19.0,339.0,21.0
max,1.048046e+18,549378000.0,,42.0222,-87.52842,99998.0,1125.0,3501.0,56.2,643.0,365.0,596.0
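
The summary above shows a maximum price of 99998 against a 75th percentile of 178, which points to extreme values in `price`. One common way to flag such values is the 1.5 × IQR rule; with the quartiles above, the upper fence would be 178 + 1.5 × (178 − 70.25) = 339.625. The sketch below applies the same rule to made-up prices; the helper `iqr_upper_fence` is hypothetical, and this check is not part of the original analysis.

```python
import pandas as pd

def iqr_upper_fence(prices: pd.Series, k: float = 1.5) -> float:
    """Upper outlier fence: Q3 + k * (Q3 - Q1)."""
    q1 = prices.quantile(0.25)
    q3 = prices.quantile(0.75)
    return q3 + k * (q3 - q1)

# Illustrative prices, including one extreme value like the maximum seen above.
prices = pd.Series([70.0, 90.0, 106.0, 146.0, 178.0, 99998.0], name="price")

fence = iqr_upper_fence(prices)
outliers = prices[prices > fence]

print("upper fence:", fence)
print("flagged prices:", outliers.tolist())
```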


In [None]:
#Extracting the datatype of all the columns of the Chicago dataset
chicago.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8949 entries, 0 to 8948
Data columns (total 18 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   id                              8949 non-null   int64  
 1   name                            8949 non-null   object 
 2   host_id                         8949 non-null   int64  
 3   host_name                       8949 non-null   object 
 4   neighbourhood_group             0 non-null      float64
 5   neighbourhood                   8949 non-null   object 
 6   latitude                        8949 non-null   float64
 7   longitude                       8949 non-null   float64
 8   room_type                       8949 non-null   object 
 9   price                           8338 non-null   float64
 10  minimum_nights                  8949 non-null   int64  
 11  number_of_reviews               8949 non-null   int64  
 12  last_review                     70

In [None]:
#Previewing 'neighbourhood_group' cast from float to string
#(the result is not assigned back, so the dataframe itself is unchanged; NaN shows up as the text 'nan')
chicago['neighbourhood_group'].astype('str')

0       nan
1       nan
2       nan
3       nan
4       nan
       ... 
8944    nan
8945    nan
8946    nan
8947    nan
8948    nan
Name: neighbourhood_group, Length: 8949, dtype: object
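
The output above shows every missing value rendered as the text `nan`. Had that result been assigned back to the column, a later `fillna` would likely find nothing to fill, because text is not treated as missing. The sketch below contrasts the two orders on a toy column; it assumes the same behaviour as the pandas version that produced the output above, and casting to `object` before filling is one option rather than the notebook's method.

```python
import numpy as np
import pandas as pd

col = pd.Series([np.nan, np.nan, np.nan], name="neighbourhood_group")

# Casting first: in the output above the missing values became the text 'nan',
# which isnull()/fillna() would no longer treat as missing.
as_text = col.astype("str")
print("nulls after astype('str'):", as_text.isnull().sum())

# Filling on an object column keeps the missing values recognisable until they are replaced.
filled = col.astype("object").fillna("Chicago")
print(filled.tolist())
```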

In [None]:
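#Filling the null values with the city name
#(assigning the result back avoids calling fillna in place on a column selection, which newer pandas versions warn about)
chicago['neighbourhood_group'] = chicago['neighbourhood_group'].astype('object').fillna('Chicago')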

In [None]:
#Filling the 'price' column with the mean price wherever null values are present
chicago['price'] = chicago['price'].fillna(chicago['price'].mean())
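
Because the Chicago summary reports a mean price of about 173 against a median of 115, with a maximum of 99998, the mean is pulled upwards by a few extreme listings. The sketch below compares mean and median imputation on made-up prices; median imputation is shown as an alternative worth considering, not as what the notebook does.

```python
import numpy as np
import pandas as pd

# Illustrative prices with one missing value and one extreme value.
prices = pd.Series([70.0, 90.0, 106.0, np.nan, 99998.0], name="price")

# Mean imputation: the fill value is dominated by the extreme listing.
filled_mean = prices.fillna(prices.mean())

# Median imputation: the fill value stays close to a typical listing.
filled_median = prices.fillna(prices.median())

print("mean fill:", filled_mean.iloc[3])
print("median fill:", filled_median.iloc[3])
```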

In [None]:
#Saving the cleaned Chicago dataset in my google drive as csv file
chicago.to_csv('/content/drive/MyDrive/Visual Project Data Almabetter/Chicago final.csv')
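
With default settings, `to_csv` also writes the row index, which comes back as an extra `Unnamed: 0` column when the file is read again and would appear as an additional field in Tableau. The sketch below shows the difference using in-memory buffers instead of Google Drive paths; the small frame is illustrative.

```python
import io
import pandas as pd

df = pd.DataFrame({"id": [2384, 7126], "price": [70.0, 90.0]})

# Default: the index is written as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
columns_with_index = pd.read_csv(with_index).columns.tolist()

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
columns_without_index = pd.read_csv(without_index).columns.tolist()

print(columns_with_index)     # ['Unnamed: 0', 'id', 'price']
print(columns_without_index)  # ['id', 'price']
```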

In [5]:
#Top 5 rows and columns view of New_orelans dataset
new_orelans.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
0,19091,Rental unit in New Orleans · ★4.89 · 1 bedroom...,72880,John,,Leonidas,29.961,-90.1195,Entire home/apt,60,1,503,2023-11-06,3.03,1,0,7,"22-RSTR-14107, 22-OSTR-14105"
1,71624,Rental unit in New Orleans · ★4.94 · 1 bedroom...,367223,Susan,,Bywater,29.96153,-90.04364,Entire home/apt,150,3,293,2023-10-23,1.88,1,336,17,"21-RSTR-18609, 22-OSTR-20720"
2,74498,Rental unit in New Orleans · ★4.90 · 1 bedroom...,391462,Georgia,,St. Roch,29.96986,-90.05172,Entire home/apt,91,3,603,2023-11-19,3.89,3,237,40,"Exempt: This listing is a licensed hotel, mote..."
3,79536,Rental unit in New Orleans · ★4.89 · 1 bedroom...,428362,Miriam,,Seventh Ward,29.97803,-90.0745,Entire home/apt,31,14,702,2023-11-26,4.58,1,0,102,"21-RSTR-18550, 21-OSTR-18392"
4,79609,Guest suite in New Orleans · ★4.96 · 2 bedroom...,428909,Stephen,,St. Claude,29.96448,-90.03667,Private room,119,3,487,2023-11-28,3.14,1,296,39,"23-ISTR-02823, 23-OSTR-02823"


In [6]:
#Bottom 5 rows and columns view of New_orelans dataset
new_orelans.tail()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
7070,1036328873581338848,Home in New Orleans · ★New · 6 bedrooms · 7 be...,548874676,Danielle,,Uptown,29.930358,-90.109556,Entire home/apt,999,2,0,,,1,171,0,"23RSTR-12345, 23OSTR-12345"
7071,1036503430674158280,Home in New Orleans · ★New · 3 bedrooms · 4 be...,77804807,Hope-Sutton,,East Carrollton,29.950222,-90.121077,Entire home/apt,168,30,0,,,1,263,0,"23RSTR-11753, 20-OSTR-00398"
7072,1036910390307329631,Home in New Orleans · ★New · 2 bedrooms · 2 be...,2807218,John,,Bywater,29.965295,-90.044793,Private room,450,3,0,,,2,117,0,"23-NSTR-15801, 23-OSTR-12406"
7073,1037067216535892554,Home in New Orleans · ★New · 1 bedroom · 1 bed...,546068095,Leslie,,Central City,29.934353,-90.083836,Entire home/apt,110,1,0,,,2,262,0,"22-CSTR-17419, 22-OSTR-20367"
7074,1037102444868484575,Home in New Orleans · ★New · 2 bedrooms · 3 be...,57793139,Rachel,,Dillard,29.993696,-90.059242,Entire home/apt,112,30,0,,,1,347,0,


In [None]:
#Getting information about all the null values across the new_orelans dataset
new_orelans.isna().sum()

id                                   0
name                                 0
host_id                              0
host_name                            0
neighbourhood_group               7075
neighbourhood                        0
latitude                             0
longitude                            0
room_type                            0
price                                0
minimum_nights                       0
number_of_reviews                    0
last_review                       1131
reviews_per_month                 1131
calculated_host_listings_count       0
availability_365                     0
number_of_reviews_ltm                0
license                           1299
dtype: int64
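
`last_review` and `reviews_per_month` have the same null count, and the rows previewed earlier with empty values in these columns all have `number_of_reviews` equal to 0. That suggests, without proving it, that these nulls simply mark listings with no reviews yet. The sketch below checks that assumption on toy rows and, if it holds, fills `reviews_per_month` with 0; this step is not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Illustrative rows using the dataset's column names.
sample = pd.DataFrame({
    "number_of_reviews": [503, 0, 293, 0],
    "reviews_per_month": [3.03, np.nan, 1.88, np.nan],
})

no_reviews = sample["number_of_reviews"].eq(0)

# Check the assumption: nulls occur exactly where there are no reviews.
nulls_match = bool((sample["reviews_per_month"].isnull() == no_reviews).all())
print("nulls coincide with zero reviews:", nulls_match)

# Only fill when the assumption holds for the data at hand.
if nulls_match:
    sample.loc[no_reviews, "reviews_per_month"] = 0.0

print(sample)
```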

In [7]:
#Getting the description of all the numerical columns of new_orelans dataset
new_orelans.describe()

Unnamed: 0,id,host_id,neighbourhood_group,latitude,longitude,price,minimum_nights,number_of_reviews,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm
count,7075.0,7075.0,0.0,7075.0,7075.0,7075.0,7075.0,7075.0,5944.0,7075.0,7075.0,7075.0
mean,3.31198e+17,189468900.0,,29.958587,-90.074019,206.593922,12.376961,63.018799,1.752934,13.148975,162.713074,13.595053
std,4.035353e+17,172836300.0,,0.024083,0.030882,281.714131,19.891839,94.675786,1.69359,23.187869,130.67764,21.203788
min,19091.0,971.0,,29.89768,-90.13748,11.0,1.0,0.0,0.01,1.0,0.0,0.0
25%,25865720.0,37614660.0,,29.942259,-90.089765,95.0,2.0,3.0,0.57,1.0,28.0,0.0
50%,49127440.0,121682500.0,,29.958955,-90.073486,139.0,2.0,27.0,1.42,3.0,152.0,5.0
75%,7.42369e+17,342910000.0,,29.9698,-90.062535,235.0,30.0,88.0,2.4625,12.0,293.0,21.0
max,1.037102e+18,548874700.0,,30.16104,-89.73709,10000.0,365.0,1605.0,38.85,111.0,365.0,522.0


In [8]:
#Extracting the datatype of all the columns of the new_orelans dataset
new_orelans.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7075 entries, 0 to 7074
Data columns (total 18 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   id                              7075 non-null   int64  
 1   name                            7075 non-null   object 
 2   host_id                         7075 non-null   int64  
 3   host_name                       7075 non-null   object 
 4   neighbourhood_group             0 non-null      float64
 5   neighbourhood                   7075 non-null   object 
 6   latitude                        7075 non-null   float64
 7   longitude                       7075 non-null   float64
 8   room_type                       7075 non-null   object 
 9   price                           7075 non-null   int64  
 10  minimum_nights                  7075 non-null   int64  
 11  number_of_reviews               7075 non-null   int64  
 12  last_review                     59

In [9]:
#Previewing 'neighbourhood_group' cast from float to string
#(the result is not assigned back, so the dataframe itself is unchanged; NaN shows up as the text 'nan')
new_orelans['neighbourhood_group'].astype('str')

0       nan
1       nan
2       nan
3       nan
4       nan
       ... 
7070    nan
7071    nan
7072    nan
7073    nan
7074    nan
Name: neighbourhood_group, Length: 7075, dtype: object

In [10]:
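#Filling the null values with the city name
#(assigning the result back avoids calling fillna in place on a column selection, which newer pandas versions warn about)
new_orelans['neighbourhood_group'] = new_orelans['neighbourhood_group'].astype('object').fillna('New Orleans')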

In [None]:
#Saving the cleaned new_orelans dataset in my google drive as csv file
new_orelans.to_csv('/content/drive/MyDrive/Visual Project Data Almabetter/New Orelans Final.csv')
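
For a side-by-side comparison of the two cities, one option is to stack both cleaned tables into a single file with a `city` column, so that Tableau can use one data source. The notebook does not say whether this was done, so the sketch below is an assumption; the sample frames and the names `combined` and `city_summary` are illustrative.

```python
import pandas as pd

# Illustrative stand-ins for the two cleaned dataframes.
chicago_sample = pd.DataFrame({
    "neighbourhood": ["Hyde Park", "West Town"],
    "price": [70.0, 90.0],
})
new_orleans_sample = pd.DataFrame({
    "neighbourhood": ["Leonidas", "Bywater"],
    "price": [60.0, 150.0],
})

# Tag each table with its city, then stack them into one table.
combined = pd.concat(
    [
        chicago_sample.assign(city="Chicago"),
        new_orleans_sample.assign(city="New Orleans"),
    ],
    ignore_index=True,
)

# Quick per-city check of listing counts and median price.
city_summary = combined.groupby("city")["price"].agg(["count", "median"])
print(city_summary)
```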