# Airbnb Listing Analysis

Airbnb has changed the way people travel, putting pressure on the traditional hospitality sector and giving property owners a new way to generate income. As the platform has grown and its data has become widely available, it has become a rich ground for analytical exploration.

This project analyzes an Airbnb listing dataset comprising features such as geographical location, host details, property specifics, pricing, and review activity. The data consists of 6,636 listings, each described by 18 attributes. By examining this data, I hope to uncover insights that may help property owners optimize their listing features, price their offerings effectively, and better understand what influences traveler decisions.

My specific objective is to answer the question: "What factors significantly influence the pricing of an Airbnb listing?" To accomplish this, I will employ various data analysis techniques and compare the performance of different data structures for storing the data while conducting my investigation.

In the subsequent sections, I will clean and preprocess the data, perform exploratory data analysis, implement my chosen models, and finally evaluate their performance. I will also walk through the specific operations on the data and explain why they are essential to my investigation. Throughout this project, I aim to offer a comprehensive understanding of the data, the analysis process, and the final conclusions.

## Creating the DataFrame

I start by reading in a flat file (CSV) of Airbnb listing data for Seattle, Washington.

In [548]:
import pandas as pd
import numpy as np
import time

Airbnb = pd.read_csv('./listings.csv')
Airbnb

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
0,13226114,Home in Seattle · ★4.79 · 2 bedrooms · 4 beds ...,1884549,Denise & Sean,Rainier Valley,Columbia City,47.56555,-122.29385,Entire home/apt,240,4,24,2022-12-01,0.28,1,9,5,STR-OPLI-19-000171
1,12518952,Guest suite in Seattle · ★5.0 · 2 bedrooms · 6...,12677600,Joe,Other neighborhoods,Green Lake,47.68243,-122.33086,Entire home/apt,200,3,60,2023-06-13,0.71,1,113,9,STR-OPLI-19-002061
2,521597880867717063,Serviced apartment in Seattle · Studio · 1 bath,48005494,Zeus,Ballard,Adams,47.66646,-122.37650,Entire home/apt,81,30,2,2023-04-10,0.44,36,191,2,
3,17889172,Rental unit in Seattle · ★5.0 · 1 bedroom · 1 ...,66909032,Randy,Other neighborhoods,Wallingford,47.65480,-122.34042,Entire home/apt,125,30,28,2023-05-06,0.38,4,307,4,STR-OPLI-19-000865
4,15917796,Home in Seattle · ★4.60 · Studio · 3 beds · 1 ...,38021932,Rocky,Capitol Hill,Montlake,47.64017,-122.32271,Entire home/apt,128,30,31,2022-07-31,0.38,1,316,2,STR-OPLI-19-003016
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6631,33833730,Serviced apartment in Seattle · ★3.0 · 1 bedro...,124558729,Matt,Downtown,Central Business District,47.60874,-122.33744,Entire home/apt,215,30,3,2019-06-25,0.06,4,0,0,
6632,33845521,Serviced apartment in Seattle · 1 bedroom · 1 ...,124558729,Matt,Downtown,Central Business District,47.60888,-122.33545,Entire home/apt,200,30,0,,,4,0,0,
6633,4400811,Home in Seattle · ★4.85 · 2 bedrooms · 2 beds ...,4199445,Liam,Rainier Valley,Mount Baker,47.57129,-122.29086,Entire home/apt,189,4,119,2023-06-10,1.15,2,2,9,STR-OPLI-19-001423
6634,36519711,Home in Seattle · 1 bedroom · 1 bed · 1 privat...,6645582,John,Other neighborhoods,Wedgwood,47.69405,-122.30236,Private room,70,30,0,,,1,0,0,


## Inspecting the data

#### What are the column names?

In [549]:
Airbnb.columns

Index(['id', 'name', 'host_id', 'host_name', 'neighbourhood_group',
       'neighbourhood', 'latitude', 'longitude', 'room_type', 'price',
       'minimum_nights', 'number_of_reviews', 'last_review',
       'reviews_per_month', 'calculated_host_listings_count',
       'availability_365', 'number_of_reviews_ltm', 'license'],
      dtype='object')

#### How many rows and columns are there?

In [550]:
Airbnb.shape

(6636, 18)

#### What are the data types of each column?

In [551]:
Airbnb.dtypes

id                                  int64
name                               object
host_id                             int64
host_name                          object
neighbourhood_group                object
neighbourhood                      object
latitude                          float64
longitude                         float64
room_type                          object
price                               int64
minimum_nights                      int64
number_of_reviews                   int64
last_review                        object
reviews_per_month                 float64
calculated_host_listings_count      int64
availability_365                    int64
number_of_reviews_ltm               int64
license                            object
dtype: object
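
The last_review column is read as object (plain strings) rather than as dates. A minimal sketch of how it could be parsed, assuming a small hand-built frame (the name `sample` is hypothetical; the two date strings are copied from the listing output above) instead of the full file:

```python
import pandas as pd

# Hypothetical three-row sample; None stands in for a listing with no reviews
sample = pd.DataFrame({"last_review": ["2022-12-01", None, "2023-06-10"]})

# errors="coerce" turns anything unparseable into NaT instead of raising
sample["last_review"] = pd.to_datetime(sample["last_review"], errors="coerce")

print(sample.dtypes)
print(sample["last_review"].isna().sum())  # number of missing dates
```

Parsed dates would allow comparisons such as "reviewed within the last year", which plain strings do not support directly.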

#### The head of the data

In [552]:
Airbnb.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
0,13226114,Home in Seattle · ★4.79 · 2 bedrooms · 4 beds ...,1884549,Denise & Sean,Rainier Valley,Columbia City,47.56555,-122.29385,Entire home/apt,240,4,24,2022-12-01,0.28,1,9,5,STR-OPLI-19-000171
1,12518952,Guest suite in Seattle · ★5.0 · 2 bedrooms · 6...,12677600,Joe,Other neighborhoods,Green Lake,47.68243,-122.33086,Entire home/apt,200,3,60,2023-06-13,0.71,1,113,9,STR-OPLI-19-002061
2,521597880867717063,Serviced apartment in Seattle · Studio · 1 bath,48005494,Zeus,Ballard,Adams,47.66646,-122.3765,Entire home/apt,81,30,2,2023-04-10,0.44,36,191,2,
3,17889172,Rental unit in Seattle · ★5.0 · 1 bedroom · 1 ...,66909032,Randy,Other neighborhoods,Wallingford,47.6548,-122.34042,Entire home/apt,125,30,28,2023-05-06,0.38,4,307,4,STR-OPLI-19-000865
4,15917796,Home in Seattle · ★4.60 · Studio · 3 beds · 1 ...,38021932,Rocky,Capitol Hill,Montlake,47.64017,-122.32271,Entire home/apt,128,30,31,2022-07-31,0.38,1,316,2,STR-OPLI-19-003016


#### The tail end of the data

In [553]:
Airbnb.tail()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,license
6631,33833730,Serviced apartment in Seattle · ★3.0 · 1 bedro...,124558729,Matt,Downtown,Central Business District,47.60874,-122.33744,Entire home/apt,215,30,3,2019-06-25,0.06,4,0,0,
6632,33845521,Serviced apartment in Seattle · 1 bedroom · 1 ...,124558729,Matt,Downtown,Central Business District,47.60888,-122.33545,Entire home/apt,200,30,0,,,4,0,0,
6633,4400811,Home in Seattle · ★4.85 · 2 bedrooms · 2 beds ...,4199445,Liam,Rainier Valley,Mount Baker,47.57129,-122.29086,Entire home/apt,189,4,119,2023-06-10,1.15,2,2,9,STR-OPLI-19-001423
6634,36519711,Home in Seattle · 1 bedroom · 1 bed · 1 privat...,6645582,John,Other neighborhoods,Wedgwood,47.69405,-122.30236,Private room,70,30,0,,,1,0,0,
6635,51107857,Serviced apartment in Seattle · ★4.30 · 1 bedr...,122723469,Pierre,Other neighborhoods,Fremont,47.65952,-122.34785,Entire home/apt,280,30,10,2021-10-24,0.43,1,0,0,Exempt


#### Getting information on the DataFrame

In [554]:
Airbnb.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6636 entries, 0 to 6635
Data columns (total 18 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   id                              6636 non-null   int64  
 1   name                            6636 non-null   object 
 2   host_id                         6636 non-null   int64  
 3   host_name                       6636 non-null   object 
 4   neighbourhood_group             6636 non-null   object 
 5   neighbourhood                   6636 non-null   object 
 6   latitude                        6636 non-null   float64
 7   longitude                       6636 non-null   float64
 8   room_type                       6636 non-null   object 
 9   price                           6636 non-null   int64  
 10  minimum_nights                  6636 non-null   int64  
 11  number_of_reviews               6636 non-null   int64  
 12  last_review                     55

## Extracting subsets

#### The name column stood out for its odd characters and string of information.

In [555]:
# The name column appears to consist of star ratings, separating dots, and several categories of information
Airbnb.name


0       Home in Seattle · ★4.79 · 2 bedrooms · 4 beds ...
1       Guest suite in Seattle · ★5.0 · 2 bedrooms · 6...
2         Serviced apartment in Seattle · Studio · 1 bath
3       Rental unit in Seattle · ★5.0 · 1 bedroom · 1 ...
4       Home in Seattle · ★4.60 · Studio · 3 beds · 1 ...
                              ...                        
6631    Serviced apartment in Seattle · ★3.0 · 1 bedro...
6632    Serviced apartment in Seattle · 1 bedroom · 1 ...
6633    Home in Seattle · ★4.85 · 2 bedrooms · 2 beds ...
6634    Home in Seattle · 1 bedroom · 1 bed · 1 privat...
6635    Serviced apartment in Seattle · ★4.30 · 1 bedr...
Name: name, Length: 6636, dtype: object

#### Looking at a specific row of the name column to see full detail.

In [556]:
Airbnb.loc[81, 'name']

'Boutique hotel in Seattle · ★5.0 · 1 bedroom · 3 beds · 1 private bath'
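
Since the name packs several fields into one string, it helps to see how it breaks apart. A small sketch, using only the string shown above, that splits it on the middle dot (the variable names `name` and `parts` are illustrative):

```python
# Full name string of row 81, copied from the output above
name = 'Boutique hotel in Seattle · ★5.0 · 1 bedroom · 3 beds · 1 private bath'

# Split on the middle dot and trim the surrounding spaces from each part
parts = [part.strip() for part in name.split('·')]
print(parts)
```

Each element is one candidate field: home type, rating, bedrooms, beds, and baths. Not every listing has all five, so position alone is not a reliable key.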

#### Inspecting the last_review column since it showed null values when looking at the information on the data.

In [557]:
Airbnb.last_review.tail(10)

6626           NaN
6627           NaN
6628           NaN
6629    2023-03-12
6630           NaN
6631    2019-06-25
6632           NaN
6633    2023-06-10
6634           NaN
6635    2021-10-24
Name: last_review, dtype: object

#### Inspecting the reviews_per_month column since it showed null values when looking at the information on the data.

In [558]:
Airbnb.reviews_per_month.tail(30)

6606    2.78
6607    0.50
6608     NaN
6609     NaN
6610    1.32
6611    1.66
6612     NaN
6613    0.40
6614    0.07
6615    0.16
6616     NaN
6617    0.97
6618     NaN
6619    3.39
6620    2.05
6621     NaN
6622     NaN
6623     NaN
6624    0.06
6625    0.39
6626     NaN
6627     NaN
6628     NaN
6629    2.38
6630     NaN
6631    0.06
6632     NaN
6633    1.15
6634     NaN
6635    0.43
Name: reviews_per_month, dtype: float64

#### Inspecting the license column since it showed null values when looking at the information on the data.

In [559]:
Airbnb.license.tail(30)

6606    STR-OPLI-22-000717
6607    STR-OPLI-19-002092
6608                   NaN
6609                   NaN
6610    STR-OPLI-19-000121
6611                Exempt
6612                   NaN
6613                   NaN
6614                   NaN
6615                   NaN
6616                   NaN
6617    STR-OPLI-19-000701
6618                   NaN
6619    STR-OPLI-21-001395
6620                Exempt
6621                   NaN
6622                   NaN
6623                   NaN
6624                   NaN
6625    STR-OPLI-22-001283
6626                   NaN
6627    STR-OPLI-19-152434
6628                   NaN
6629                Exempt
6630                   NaN
6631                   NaN
6632                   NaN
6633    STR-OPLI-19-001423
6634                   NaN
6635                Exempt
Name: license, dtype: object
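
The three inspections above can also be condensed into a single null count per column. A sketch on a hypothetical three-row frame named `sample` (its values mirror rows 6629 to 6631 of the outputs above); on the real data the same call would be made on the full DataFrame:

```python
import pandas as pd

# Hypothetical sample mirroring the three columns inspected above
sample = pd.DataFrame({
    "last_review": ["2023-03-12", None, "2019-06-25"],
    "reviews_per_month": [2.38, None, 0.06],
    "license": ["Exempt", None, None],
})

# Count of missing values per column, largest first
null_counts = sample.isna().sum().sort_values(ascending=False)
print(null_counts)
```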

## Cleaning the data
The 'last_review', 'reviews_per_month', and 'license' columns contain null values, which I plan to address using the .fillna() and .drop() methods.

#### Replacing the null values in the last_review column with N/A.

In [560]:
# A listing without reviews has no last-review date, so missing dates are marked N/A
Airbnb['last_review'] = Airbnb['last_review'].fillna('N/A')
# Checking the last_review column
Airbnb[['last_review']]

Unnamed: 0,last_review
0,2022-12-01
1,2023-06-13
2,2023-04-10
3,2023-05-06
4,2022-07-31
...,...
6631,2019-06-25
6632,
6633,2023-06-10
6634,


#### Replacing the null values in the reviews_per_month column with 0.

In [561]:
# Filling with the number 0 (not the string '0') keeps the column numeric
Airbnb['reviews_per_month'] = Airbnb['reviews_per_month'].fillna(0)
Airbnb[['reviews_per_month']]

Unnamed: 0,reviews_per_month
0,0.28
1,0.71
2,0.44
3,0.38
4,0.38
...,...
6631,0.06
6632,0
6633,1.15
6634,0
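
The type of the fill value matters for a numeric column. A quick sketch, on a hypothetical three-value series named `rpm`, of how a string fill and a numeric fill differ:

```python
import pandas as pd

# Hypothetical sample of reviews_per_month with one missing value
rpm = pd.Series([0.28, None, 1.15])

filled_with_text = rpm.fillna('0')   # string fill value
filled_with_number = rpm.fillna(0)   # numeric fill value

print(filled_with_text.dtype)
print(filled_with_number.dtype)
print(filled_with_number.mean())
```

With the string fill, the column holds a mix of floats and text, so later numeric operations such as a mean are no longer safe. The numeric fill keeps the column as float64.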


#### Dropping the license column

In [562]:
Airbnb.drop(columns=['license'], inplace=True)
Airbnb.head()

Unnamed: 0,id,name,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,number_of_reviews,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm
0,13226114,Home in Seattle · ★4.79 · 2 bedrooms · 4 beds ...,1884549,Denise & Sean,Rainier Valley,Columbia City,47.56555,-122.29385,Entire home/apt,240,4,24,2022-12-01,0.28,1,9,5
1,12518952,Guest suite in Seattle · ★5.0 · 2 bedrooms · 6...,12677600,Joe,Other neighborhoods,Green Lake,47.68243,-122.33086,Entire home/apt,200,3,60,2023-06-13,0.71,1,113,9
2,521597880867717063,Serviced apartment in Seattle · Studio · 1 bath,48005494,Zeus,Ballard,Adams,47.66646,-122.3765,Entire home/apt,81,30,2,2023-04-10,0.44,36,191,2
3,17889172,Rental unit in Seattle · ★5.0 · 1 bedroom · 1 ...,66909032,Randy,Other neighborhoods,Wallingford,47.6548,-122.34042,Entire home/apt,125,30,28,2023-05-06,0.38,4,307,4
4,15917796,Home in Seattle · ★4.60 · Studio · 3 beds · 1 ...,38021932,Rocky,Capitol Hill,Montlake,47.64017,-122.32271,Entire home/apt,128,30,31,2022-07-31,0.38,1,316,2


## Data Transformations
I will use the .str.extract(), .str.replace(), .fillna(), and .drop() methods, together with the .loc indexer, to parse the 'name' column and create new columns from its details.

#### Creating a rating column

In [563]:
# Extracting the rating (the number after the star) from the name column into a rating column
Airbnb['rating'] = Airbnb['name'].str.extract(r'★(\d+\.\d+)', expand=False)
# Replacing null values with the placeholder N/A (the column therefore holds strings)
Airbnb['rating'] = Airbnb['rating'].fillna('N/A')

# Viewing the new rating column
Airbnb.rating.tail()

6631     3.0
6632     N/A
6633    4.85
6634     N/A
6635    4.30
Name: rating, dtype: object
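
Because the rating column ends up with dtype object, it cannot be averaged or compared numerically as it stands. A hedged sketch of a numeric alternative, assuming missing ratings are left as NaN instead of 'N/A' (the series `names` holds the visible parts of three names from the outputs above; `rating_num` is an illustrative name):

```python
import pandas as pd

# Visible parts of three names from the outputs above
names = pd.Series([
    'Home in Seattle · ★4.79 · 2 bedrooms · 4 beds',
    'Serviced apartment in Seattle · Studio · 1 bath',
    'Guest suite in Seattle · ★5.0 · 2 bedrooms',
])

# Capture only the digits after the star; unrated listings become NaN
rating_num = pd.to_numeric(names.str.extract(r'★(\d+\.\d+)', expand=False))

print(rating_num)
print(rating_num.mean())  # NaN values are skipped by default
```

This is an alternative to the string placeholder, not what the cells above do. It trades a readable 'N/A' for a column that supports arithmetic.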

#### Creating a bedrooms column

In [564]:
# Extracting the bedroom count from the name column into a bedrooms column
Airbnb['bedrooms'] = Airbnb['name'].str.extract(r'(\d+)\sbedrooms?', expand=False)
# Ensuring specific values are accounted for (e.g. Studio = no bedrooms)
Airbnb.loc[Airbnb['name'].str.contains('studio', case=False), 'bedrooms'] = '0'
# Replacing NaN values with '0' in the 'bedrooms' column
Airbnb['bedrooms'] = Airbnb['bedrooms'].fillna('0')
# Converting 'bedrooms' column to integer
Airbnb['bedrooms'] = Airbnb['bedrooms'].astype(int)
# Viewing the new bedrooms column
Airbnb.bedrooms.tail()


6631    1
6632    1
6633    2
6634    1
6635    1
Name: bedrooms, dtype: int32

#### Creating a beds column

In [565]:
# Extracting the bed count from the name column into a beds column
# Caveat: this pattern also matches the "bed" in "bedrooms", so "2 bedrooms · 4 beds" yields 2;
# ending the pattern with a word boundary, r'(\d+)\sbeds?\b', would avoid this
Airbnb['beds'] = Airbnb['name'].str.extract(r'(\d+)\sbeds?', expand=False)
# Null values replaced with '0'
Airbnb['beds'] = Airbnb['beds'].fillna('0')
# Converting 'beds' column to integer
Airbnb['beds'] = Airbnb['beds'].astype(int)
# Viewing the new beds column
Airbnb.beds.tail()

6631    1
6632    1
6633    2
6634    1
6635    1
Name: beds, dtype: int32
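
The beds pattern deserves a closer look, because the word "bedrooms" also begins with "bed". The sketch below uses the standard `re` module on the visible part of the first listing's name. The `loose` pattern is equivalent to the one in the cell above, and the `strict` pattern is an assumed alternative, not part of the original analysis:

```python
import re

# Visible part of the first listing's name, copied from the outputs above
name = 'Home in Seattle · ★4.79 · 2 bedrooms · 4 beds'

# Equivalent to the pattern in the cell above: stops at the first "bed" it finds
loose = re.search(r'(\d+)\sbed[s]*', name)
# Assumed stricter variant: \b demands a word boundary right after "bed" or "beds"
strict = re.search(r'(\d+)\sbeds?\b', name)

print(loose.group(0), '->', loose.group(1))
print(strict.group(0), '->', strict.group(1))
```

The loose pattern picks up the bedroom count, which would explain why the beds column in the head of the table shows 2 for a listing described as having 4 beds.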

#### Creating a baths column

In [566]:
# Extracting the bath count from the name column into a baths column
# Caveat: a qualifier between the number and "bath" (e.g. "1 private bath") prevents a match,
# and a fractional count such as "1.5 baths" would be read as 5
Airbnb['baths'] = Airbnb['name'].str.extract(r'(\d+)\sbaths?', expand=False)
# Null values replaced with '0'
Airbnb['baths'] = Airbnb['baths'].fillna('0')
# Converting 'baths' column to integer
Airbnb['baths'] = Airbnb['baths'].astype(int)
# Viewing the new baths column
Airbnb.baths.tail()

6631    1
6632    1
6633    2
6634    0
6635    1
Name: baths, dtype: int32
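
Row 6634 above ends in "1 privat..." yet shows 0 baths, which suggests that qualifiers such as "private" defeat the pattern. A hedged sketch of a more tolerant parser follows. The helper `parse_baths`, the constant `BATH_PATTERN`, and the list of qualifiers are assumptions for illustration, and the third test string is hypothetical:

```python
import re

# Assumed pattern: optional decimal part, optional "private"/"shared" qualifier
BATH_PATTERN = re.compile(r'(\d+(?:\.\d+)?)\s(?:(?:private|shared)\s)?baths?\b')

def parse_baths(name):
    """Return the bath count as a float, or 0.0 when no count is found."""
    match = BATH_PATTERN.search(name)
    return float(match.group(1)) if match else 0.0

# The first two names are copied from the outputs above; the third is hypothetical
private_bath = parse_baths('Boutique hotel in Seattle · ★5.0 · 1 bedroom · 3 beds · 1 private bath')
plain_bath = parse_baths('Serviced apartment in Seattle · Studio · 1 bath')
hypothetical = 'Condo in Seattle · 1 bedroom · 1 bed · 1.5 baths'
fractional_bath = parse_baths(hypothetical)

print(private_bath, plain_bath, fractional_bath)
# For comparison, the simpler digits-space-"bath" pattern on the hypothetical name
print(re.search(r'\d+\sbath[s]*', hypothetical).group(0))
```

Returning a float keeps half baths intact. Whether fractional counts actually occur in this dataset is not visible in the truncated outputs, so this remains a precaution.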

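#### Creating a 'home_type' column from the 'name' column

In [567]:
# The text before the first '·' is the home type (each value keeps its trailing space)
Airbnb['home_type'] = Airbnb['name'].str.split('·').str[0]
# df is a copy without the 'name' column; Airbnb itself still contains it
df = Airbnb.drop(columns=['name'])
df.head()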

Unnamed: 0,id,host_id,host_name,neighbourhood_group,neighbourhood,latitude,longitude,room_type,price,minimum_nights,...,last_review,reviews_per_month,calculated_host_listings_count,availability_365,number_of_reviews_ltm,rating,bedrooms,beds,baths,home_type
0,13226114,1884549,Denise & Sean,Rainier Valley,Columbia City,47.56555,-122.29385,Entire home/apt,240,4,...,2022-12-01,0.28,1,9,5,4.79,2,2,2,Home in Seattle
1,12518952,12677600,Joe,Other neighborhoods,Green Lake,47.68243,-122.33086,Entire home/apt,200,3,...,2023-06-13,0.71,1,113,9,5.0,2,2,1,Guest suite in Seattle
2,521597880867717063,48005494,Zeus,Ballard,Adams,47.66646,-122.3765,Entire home/apt,81,30,...,2023-04-10,0.44,36,191,2,,0,0,1,Serviced apartment in Seattle
3,17889172,66909032,Randy,Other neighborhoods,Wallingford,47.6548,-122.34042,Entire home/apt,125,30,...,2023-05-06,0.38,4,307,4,5.0,1,1,1,Rental unit in Seattle
4,15917796,38021932,Rocky,Capitol Hill,Montlake,47.64017,-122.32271,Entire home/apt,128,30,...,2022-07-31,0.38,1,316,2,4.6,0,3,1,Home in Seattle


#### Viewing info on Cleaned and Transformed DataFrame

In [568]:
# Only the rating column still has missing entries (stored as 'N/A'), so it is not worth deleting any rows
Airbnb.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6636 entries, 0 to 6635
Data columns (total 22 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   id                              6636 non-null   int64  
 1   name                            6636 non-null   object 
 2   host_id                         6636 non-null   int64  
 3   host_name                       6636 non-null   object 
 4   neighbourhood_group             6636 non-null   object 
 5   neighbourhood                   6636 non-null   object 
 6   latitude                        6636 non-null   float64
 7   longitude                       6636 non-null   float64
 8   room_type                       6636 non-null   object 
 9   price                           6636 non-null   int64  
 10  minimum_nights                  6636 non-null   int64  
 11  number_of_reviews               6636 non-null   int64  
 12  last_review                     66

## What factors significantly influence the pricing of an Airbnb listing?

#### Question 1: Average price of listings in each neighborhood

In [569]:
avg_price_neighborhood = Airbnb.groupby('neighbourhood')['price'].mean().sort_values(ascending=False)
a = avg_price_neighborhood.head(20)

print("Average price of listings in each neighborhood:\n", a)

Average price of listings in each neighborhood:
 neighbourhood
Industrial District          675.000000
Briarcliff                   361.529412
Greenwood                    299.721854
Harrison/Denny-Blaine        291.481481
West Queen Anne              287.508621
Fremont                      286.735178
Pike-Market                  278.778947
Laurelhurst                  275.884615
East Queen Anne              269.803922
Central Business District    265.060403
View Ridge                   251.200000
Madrona                      249.000000
Alki                         245.537037
West Woodland                232.421569
Southeast Magnolia           230.875000
North Queen Anne             228.184211
Phinney Ridge                226.762376
Windermere                   221.900000
Leschi                       221.875000
Stevens                      221.126984
Name: price, dtype: float64
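
A mean on its own hides how many listings stand behind it, and a neighborhood with only a handful of listings can top the ranking because of a single expensive property. Whether that applies to the Industrial District cannot be read from the output above. The sketch below, on hypothetical data (the frame `toy` and its neighborhoods "A" and "B" are made up), shows how the count and median could be reported next to the mean:

```python
import pandas as pd

# Hypothetical listings: neighborhood "A" has one expensive listing, "B" has four
toy = pd.DataFrame({
    "neighbourhood": ["A", "B", "B", "B", "B"],
    "price": [675, 300, 350, 380, 410],
})

summary = (
    toy.groupby("neighbourhood")["price"]
       .agg(["mean", "median", "count"])
       .sort_values("mean", ascending=False)
)
print(summary)
```

Here "A" ranks first on the strength of one listing, which is the kind of result the count column makes visible.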


#### Question 2: What home types are located in Briarcliff and how many listings are there?
Since Briarcliff has the second-highest average listing price (behind only the Industrial District), let's look at what home types are located in this neighborhood.

In [570]:
# Filtering for listings in Briarcliff
Briarcliff_homes = Airbnb.loc[Airbnb['neighbourhood'] == 'Briarcliff']
# Getting unique home types in Briarcliff
unique_home_types = Briarcliff_homes['home_type'].unique()
# Counting the number of listings in Briarcliff
total_listings = Briarcliff_homes['id'].count()

print("The home types in Briarcliff are: ", unique_home_types, " and there are a total of ", total_listings, " listings." )

The home types in Briarcliff are:  ['Home in Seattle ' 'Guest suite in Seattle ' 'Cottage in Seattle '
 'Guesthouse in Seattle ']  and there are a total of  17  listings.


#### Question 3: Which type of home has the highest average price?
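Of the home types found in Briarcliff, 'Home in Seattle' averages about 249 across the city and 'Cottage in Seattle' about 170. The most expensive home types overall are resorts and vacation homes, neither of which appears in Briarcliff.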

In [571]:
avg_price_hometype = Airbnb.groupby('home_type')['price'].mean().sort_values(ascending=False)

print("Average price by home type: \n", avg_price_hometype)

Average price by home type: 
 home_type
Resort in Seattle                          376.000000
Vacation home in Seattle                   334.111111
Home in Tukwila                            296.000000
Villa in Seattle                           294.200000
Home in Seattle                            249.221002
Serviced apartment in Seattle              239.081761
Townhouse in Seattle                       234.533541
Rental unit in Capitol Hill, Seattle       234.000000
Bed and breakfast in Seattle               221.384615
Boutique hotel in Seattle                  209.920000
Condo in Seattle                           206.644128
Bungalow in Seattle                        191.585366
Camper/RV in Seattle                       188.600000
Hotel in Seattle                           186.857143
Houseboat in Seattle                       186.400000
Cottage in Seattle                         169.954545
Loft in Seattle                            167.181818
Cabin in Seattle     

#### Question 4: Which neighborhoods have the most reviews?
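Briarcliff has one of the lowest review totals (431) yet one of the highest average prices, while Broadway, with the most reviews (29,518), is not even in the top 20 neighborhoods by average price. Keep in mind that these are totals, so neighborhoods with more listings naturally accumulate more reviews.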

In [572]:
# Grouping by neighborhood and summing the number of reviews
neighborhood_reviews = Airbnb.groupby('neighbourhood')['number_of_reviews'].sum().sort_values(ascending=False)


print("Total number of reviews per neighborhood: \n", neighborhood_reviews)

Total number of reviews per neighborhood: 
 neighbourhood
Broadway               29518
Belltown               27976
Fremont                23333
Wallingford            16917
Minor                  15527
                       ...  
Briarcliff               431
Meadowbrook              306
Holly Park               301
Industrial District      105
Harbor Island              0
Name: number_of_reviews, Length: 89, dtype: int64
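
Review totals grow with the number of listings, so a per-listing figure is a fairer comparison. Briarcliff's 431 reviews over its 17 listings, for example, come to roughly 25 reviews per listing. A sketch of that calculation on hypothetical data (the frame `toy` and the neighborhoods "Big" and "Small" are made up):

```python
import pandas as pd

# Hypothetical data: "Big" has many listings, "Small" has few
toy = pd.DataFrame({
    "neighbourhood": ["Big", "Big", "Big", "Big", "Small", "Small"],
    "number_of_reviews": [10, 20, 30, 40, 45, 35],
})

# Named aggregation: total reviews and number of listings per neighborhood
reviews = toy.groupby("neighbourhood").agg(
    total_reviews=("number_of_reviews", "sum"),
    listings=("number_of_reviews", "size"),
)
reviews["reviews_per_listing"] = reviews["total_reviews"] / reviews["listings"]
print(reviews.sort_values("reviews_per_listing", ascending=False))
```

In this toy example "Big" leads on total reviews while "Small" leads per listing, so the two rankings can disagree.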


#### Question 5: Average price of listings based on number of bedrooms?

In [573]:
avg_price_bedrooms = Airbnb.groupby('bedrooms')['price'].mean().sort_values(ascending=False)

print("Average price of listings based on number of bedrooms:\n", avg_price_bedrooms)

Average price of listings based on number of bedrooms:
 bedrooms
13    919.000000
6     830.250000
7     561.562500
5     517.115385
8     464.200000
4     395.332061
3     329.811558
2     218.240522
1     138.247322
0     133.299720
Name: price, dtype: float64
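
Group means show the direction of the relationship. A single correlation coefficient summarizes its strength across individual listings. A minimal sketch, assuming a hypothetical listing-level sample named `toy` (the numbers are invented, not taken from the dataset):

```python
import pandas as pd

# Hypothetical listing-level sample (not taken from the dataset)
toy = pd.DataFrame({
    "bedrooms": [0, 1, 1, 2, 2, 3, 4, 6],
    "price":    [120, 130, 150, 210, 230, 320, 400, 800],
})

# Pearson correlation (the pandas default) between the two columns
r = toy["bedrooms"].corr(toy["price"])
print(round(r, 3))
```

On the real data the same call would be made on the Airbnb DataFrame's 'bedrooms' and 'price' columns. Pearson correlation measures only linear association, so a few very large listings can pull it strongly.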


#### Question 6: What is the average number of bedrooms in each neighborhood?
Briarcliff averages 2.35 bedrooms per listing, the third-highest of any neighborhood, which fits the pattern above: more bedrooms generally mean a higher average price.

In [574]:
# Ensure 'bedrooms' is numerical 
Airbnb['bedrooms'] = Airbnb['bedrooms'].astype(float)
# Grouping by neighborhood and calculating the average bedrooms
avg_bedrooms = Airbnb.groupby('neighbourhood')['bedrooms'].mean().sort_values(ascending=False)

print("Average bedrooms per neighborhood: \n", avg_bedrooms)

Average bedrooms per neighborhood: 
 neighbourhood
View Ridge           2.466667
South Beacon Hill    2.361702
Briarcliff           2.352941
Windermere           2.300000
Rainier View         2.230769
                       ...   
Belltown             1.029940
Harbor Island        1.000000
Interbay             0.953488
Pioneer Square       0.816327
Yesler Terrace       0.724138
Name: bedrooms, Length: 89, dtype: float64


#### Question 7: Average price of listings based on number of beds?

In [575]:
avg_price_beds = Airbnb.groupby('beds')['price'].mean().sort_values(ascending=False)

print("Average price of listings based on number of beds:\n", avg_price_beds)

Average price of listings based on number of beds:
 beds
13    919.000000
6     830.250000
7     561.562500
5     517.115385
8     464.200000
4     394.307985
3     327.112346
2     211.818803
0     139.562500
1     136.995817
Name: price, dtype: float64


#### Question 8: What is the average number of beds in each neighborhood?
Briarcliff averages 2.35 beds per listing, the fifth-highest of any neighborhood, which is consistent with listings with more beds having higher average prices.

In [576]:
# Ensure 'beds' is numerical (assigning back to 'beds', not to a new 'bed' column)
Airbnb['beds'] = Airbnb['beds'].astype(float)
# Grouping by neighbourhood and calculating the average beds
avg_beds = Airbnb.groupby('neighbourhood')['beds'].mean().sort_values(ascending=False)

print("Average beds per neighborhood: \n", avg_beds)

Average beds per neighborhood: 
 neighbourhood
View Ridge                2.466667
Windermere                2.400000
Arbor Heights             2.380952
South Beacon Hill         2.361702
Briarcliff                2.352941
                            ...   
Holly Park                1.200000
Pioneer Square            1.142857
International District    1.043478
Harbor Island             1.000000
Interbay                  0.976744
Name: beds, Length: 89, dtype: float64


#### Question 9: Average price of listings based on number of baths?

In [577]:
avg_price_baths = Airbnb.groupby('baths')['price'].mean().sort_values(ascending=False)

print("Average price of listings based on number of baths:\n", avg_price_baths)

Average price of listings based on number of baths:
 baths
6    1061.750000
4     829.727273
3     401.658654
7     361.000000
5     326.996104
2     289.048304
1     160.335686
0      99.901340
Name: price, dtype: float64


#### Question 10: What is the average number of baths in each neighborhood?
Briarcliff has the second-highest average number of baths (2.35), which fits the pattern above: more baths generally mean a higher average price.

In [578]:
# Ensure 'baths' is numerical
Airbnb['baths'] = Airbnb['baths'].astype(float)
# Grouping by neighborhood and calculating the average baths
avg_baths = Airbnb.groupby('neighbourhood')['baths'].mean().sort_values(ascending=False)

print("Average baths per neighborhood: \n", avg_baths)

Average baths per neighborhood: 
 neighbourhood
Industrial District       3.000000
Briarcliff                2.352941
Atlantic                  2.310345
Harrison/Denny-Blaine     2.222222
View Ridge                2.200000
                            ...   
Harbor Island             1.000000
Bitter Lake               1.000000
North College Park        0.950820
Holly Park                0.200000
International District    0.043478
Name: baths, Length: 89, dtype: float64


#### Question 11: Average price of listings based on ratings?

In [579]:
avg_price_ratings = Airbnb.groupby('rating')['price'].mean().sort_values(ascending=False)

print("Average price of listings based on ratings:\n", avg_price_ratings)

Average price of listings based on ratings:
 rating
4.25    1076.727273
4.33    1058.260870
4.37     894.000000
2.60     450.000000
4.67     298.976744
           ...     
3.74      53.000000
2.75      52.000000
3.81      49.000000
4.10      31.000000
0.0       28.000000
Name: price, Length: 129, dtype: float64
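
With 129 distinct rating values, many groups probably contain very few listings, which makes the ranking above noisy. One possible refinement is to convert the ratings to numbers and group them into bands. The sketch below assumes hypothetical data and band edges chosen only for illustration (the frame `toy` and the names `rating_num`, `bands`, and `by_band` are not from the original analysis):

```python
import pandas as pd

# Hypothetical sample; 'N/A' mimics the placeholder used for unrated listings
toy = pd.DataFrame({
    "rating": ["4.25", "4.79", "5.0", "N/A", "3.0", "4.60"],
    "price":  [300, 240, 200, 70, 215, 128],
})

# Convert the text ratings to numbers; 'N/A' becomes NaN
toy["rating_num"] = pd.to_numeric(toy["rating"], errors="coerce")

# Assumed band edges, chosen only for illustration
bands = pd.cut(toy["rating_num"], bins=[0, 4.0, 4.5, 4.8, 5.0], include_lowest=True)

# Mean price and number of listings per rating band (unrated rows are left out)
by_band = toy.groupby(bands, observed=True)["price"].agg(["mean", "count"])
print(by_band)
```

Reporting the count alongside the mean shows how much weight each band's average can bear.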


## Conclusion

Based on this analysis, several factors appear to influence the pricing of an Airbnb listing.

Neighborhood: The neighborhood in which the listing is located plays a crucial role in its price. For example, listings in the Industrial District and Briarcliff are priced significantly higher than in other neighborhoods.

Property type: The type of property also affects the price. For instance, Resorts and Vacation homes in Seattle have higher average prices.

Number of Bedrooms: Generally, the more bedrooms a listing has, the higher the price. Listings with 13 or 6 bedrooms are the most expensive on average, although the trend is not strictly increasing: 7- and 8-bedroom listings average less than 6-bedroom ones.

Number of Beds: Similar to bedrooms, listings with more beds tend to have higher prices.

Number of Baths: Again, listings with more baths tend to have higher prices.

Ratings: Rating appears to play a role, but the relationship is noisy. The highest average prices belong to listings rated 4.25, 4.33, and 4.37, not to those rated 5.0, and a rating of 2.60 ranks fourth.

Interestingly, while Briarcliff has one of the highest average listing prices, it has one of the lowest review totals. In contrast, Broadway, which has the most reviews, does not appear in the top 20 neighborhoods by average price. This suggests that price and number of reviews are not directly related.

Taking all these factors into account, the price of an Airbnb listing appears to be influenced by a combination of its location, property type, number of bedrooms, beds, and baths, and its rating. The number of reviews a listing has, however, does not seem to significantly affect its price.

This information can be used by both hosts and guests to make informed decisions. Hosts can use it to price their listings competitively, taking into account the features of their property and its location. Guests can use it to find listings that offer the best value for their needs and budget.