<h2>Airbnb Listings in New York City</h2>

<b>Problem Statement:</b>

This analysis examines Airbnb listings in New York City to help hosts, guests, and other stakeholders make informed decisions. Using host experience, neighborhood characteristics, property type, room type, price, and review scores, it addresses the following questions:

 - How does host experience impact review scores?
 - Which neighborhoods have the highest average prices and the best review scores?
 - How do property types and room types influence pricing and review scores?
 - What is the relationship between the number of listings and the number of reviews across different neighborhoods?
 - Is there an optimal price point that maximizes review scores?
 - How does the number of beds in a listing affect its price and review scores?
 - Which zip codes have the highest average prices and review scores?
 - How have prices and review scores trended over time?

Together, the answers give an overview of the Airbnb market in New York City and point to concrete ways to improve listing performance and guest experience.
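As a rough illustration of how a question such as the neighborhood comparison might be approached, the sketch below averages price and review score per neighbourhood. It runs on a tiny hand-made frame whose values are invented for illustration (they are not taken from the dataset); only the column names follow the dataset, and `toy` and `summary` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the listings table; the numbers are made up for illustration.
toy = pd.DataFrame({
    "Neighbourhood": ["Brooklyn", "Brooklyn", "Manhattan", "Manhattan"],
    "Price": [100, 200, 300, 500],
    "Review Scores Rating": [90.0, np.nan, 96.0, 94.0],
})

# Average price and rating per neighbourhood, plus the number of listings.
# "mean" skips missing ratings; "size" counts every row, rated or not.
summary = (
    toy.groupby("Neighbourhood")
       .agg(avg_price=("Price", "mean"),
            avg_rating=("Review Scores Rating", "mean"),
            listings=("Price", "size"))
       .sort_values("avg_price", ascending=False)
)
print(summary)
```

Keeping the listing count next to the averages helps flag groups whose means rest on very few rows.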

In [1]:
# import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [3]:
# Load the dataset
data = pd.read_excel("archive/airbnb.xlsx")  # forward slash works on Windows, macOS and Linux
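If the notebook needs to run on more than one operating system, the path can be built from its parts instead of being written with a fixed separator. The following is a minimal sketch, assuming the workbook lives in an `archive` folder next to the notebook; `DATA_PATH` and `load_listings` are hypothetical names introduced here.

```python
from pathlib import Path

import pandas as pd

# Build the path from parts so the separator matches the operating system.
DATA_PATH = Path("archive") / "airbnb.xlsx"


def load_listings(path=DATA_PATH):
    """Read the workbook, failing early with a readable message if it is missing."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Expected the Airbnb workbook at {path.resolve()}")
    return pd.read_excel(path)
```

Calling `data = load_listings()` would then take the place of the direct `pd.read_excel` call.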

In [5]:
# check the shape of data
data.shape

(30478, 13)

In [7]:
# view some records from data
data.head()

,Host Id,Host Since,Name,Neighbourhood,Property Type,Review Scores Rating (bin),Room Type,Zipcode,Beds,Number of Records,Number Of Reviews,Price,Review Scores Rating
0,5162530,NaT,1 Bedroom in Prime Williamsburg,Brooklyn,Apartment,,Entire home/apt,11249.0,1.0,1,0,145,
1,33134899,NaT,"Sunny, Private room in Bushwick",Brooklyn,Apartment,,Private room,11206.0,1.0,1,1,37,
2,39608626,NaT,Sunny Room in Harlem,Manhattan,Apartment,,Private room,10032.0,1.0,1,1,28,
3,500,2008-06-26,Gorgeous 1 BR with Private Balcony,Manhattan,Apartment,,Entire home/apt,10024.0,3.0,1,0,199,
4,500,2008-06-26,Trendy Times Square Loft,Manhattan,Apartment,95.0,Private room,10036.0,3.0,1,39,549,96.0
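The preview shows `Host Since` as a date, with `NaT` where it is missing. One possible way to turn it into a measure of host experience is to compute tenure in years relative to a reference date. The sketch below is only an illustration on a toy frame: the choice of reference date is an assumption, and `host_years` is a hypothetical column name.

```python
import pandas as pd

# Toy frame mimicking the "Host Since" column, including a missing value.
toy = pd.DataFrame({
    "Host Since": pd.to_datetime(["2008-06-26", None, "2014-08-31"]),
})

# Assumption: tenure is measured up to a fixed reference date.
reference_date = pd.Timestamp("2015-08-31")

# Days between the two dates, converted to years; NaT propagates as NaN.
toy["host_years"] = (reference_date - toy["Host Since"]).dt.days / 365.25
print(toy)
```

Rows with a missing `Host Since` end up with a missing tenure, so they drop out of any later correlation or grouping rather than being counted as zero experience.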


In [6]:
# datatypes present
data.dtypes

Host Id                                int64
Host Since                    datetime64[ns]
Name                                  object
Neighbourhood                         object
Property Type                         object
Review Scores Rating (bin)           float64
Room Type                             object
Zipcode                              float64
Beds                                 float64
Number of Records                      int64
Number Of Reviews                      int64
Price                                  int64
Review Scores Rating                 float64
dtype: object
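The listing above shows `Zipcode` stored as `float64`, which is why values print as `11249.0`. Since zip codes are labels, not quantities, one option is to convert them to text. This is a hedged sketch on a toy frame; padding to five characters is an assumption about the intended format, not something the dataset confirms.

```python
import pandas as pd

# Toy frame with float zip codes and a missing value, as in the dataset.
toy = pd.DataFrame({"Zipcode": [11249.0, 10032.0, None]})

# float -> nullable integer (drops the ".0") -> string, left-padded to 5 characters.
toy["Zipcode"] = (
    toy["Zipcode"]
    .astype("Int64")
    .astype("string")
    .str.zfill(5)
)
print(toy["Zipcode"].tolist())
```

The nullable `Int64` and `string` dtypes keep missing zip codes as `<NA>` instead of turning them into the text `"nan"`.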

In [8]:
# descriptive statistics
data.describe(include='all').T


,count,unique,top,freq,first,last,mean,std,min,25%,50%,75%,max
Host Id,30478.0,,,,NaT,NaT,12731707.882243,11902702.99266,500.0,2701298.5,8551693.0,21206171.75,43033067.0
Host Since,30475.0,2240.0,2014-02-10 00:00:00,70.0,2008-06-26,2015-08-31,,,,,,,
Name,30478.0,29416.0,Charming West Village studio,15.0,NaT,NaT,,,,,,,
Neighbourhood,30478.0,5.0,Manhattan,16033.0,NaT,NaT,,,,,,,
Property Type,30475.0,19.0,Apartment,27102.0,NaT,NaT,,,,,,,
Review Scores Rating (bin),22155.0,,,,NaT,NaT,90.738659,9.059519,20.0,85.0,90.0,100.0,100.0
Room Type,30478.0,3.0,Entire home/apt,17024.0,NaT,NaT,,,,,,,
Zipcode,30344.0,,,,NaT,NaT,10584.854831,921.299397,1003.0,10017.0,10065.0,11216.0,99135.0
Beds,30393.0,,,,NaT,NaT,1.530089,1.015359,0.0,1.0,1.0,2.0,16.0
Number of Records,30478.0,,,,NaT,NaT,1.0,0.0,1.0,1.0,1.0,1.0,1.0
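Two things stand out in this summary: several columns have a count below 30478, which indicates missing values, and `Number of Records` has a standard deviation of 0, so it carries no information. A quick check for both could look like the sketch below, shown on a toy frame with invented values; `missing_share` and `constant_cols` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value and one constant column.
toy = pd.DataFrame({
    "Beds": [1.0, np.nan, 3.0, 2.0],
    "Number of Records": [1, 1, 1, 1],
    "Price": [145, 37, 28, 199],
})

# Share of missing values per column (0.0 means the column is complete).
missing_share = toy.isna().mean()

# Columns that hold a single distinct value and are candidates for dropping.
constant_cols = [col for col in toy.columns if toy[col].nunique(dropna=False) == 1]

print(missing_share)
print(constant_cols)
```

On the full dataset the same two lines would be applied to `data` in place of `toy`.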


In [9]:
# concise summary
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30478 entries, 0 to 30477
Data columns (total 13 columns):
 #   Column                      Non-Null Count  Dtype         
---  ------                      --------------  -----         
 0   Host Id                     30478 non-null  int64         
 1   Host Since                  30475 non-null  datetime64[ns]
 2   Name                        30478 non-null  object        
 3   Neighbourhood               30478 non-null  object        
 4   Property Type               30475 non-null  object        
 5   Review Scores Rating (bin)  22155 non-null  float64       
 6   Room Type                   30478 non-null  object        
 7   Zipcode                     30344 non-null  float64       
 8   Beds                        30393 non-null  float64       
 9   Number of Records           30478 non-null  int64         
 10  Number Of Reviews           30478 non-null  int64         
 11  Price                       30478 non-null  int64     