# Airbnb Amenities Analysis
Eric Ashby

This analysis explores different amenities offered in Airbnb listings in Salem, Oregon. The objective of this analysis is to determine which amenities in Salem are the most common, and which ones are the least commonly observed.

## About the data
The data set, `airbnb.csv`, comes from the Airbnb website and describes about 300 Airbnb listings in the Salem, Oregon area. It covers many aspects of each listing, including information about the host, the listing description, and the amenities offered.

In [None]:
import pandas as pd
df = pd.read_csv("airbnb.csv")

Below, the metadata is displayed. There are 302 entries and 19 columns. Several columns contain missing values. In particular, note the `bathrooms` column, which has no non-null values at all.

In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 302 entries, 0 to 301
Data columns (total 19 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   name                  302 non-null    object 
 1   description           302 non-null    object 
 2   host_name             302 non-null    object 
 3   host_since            302 non-null    object 
 4   host_location         220 non-null    object 
 5   host_about            176 non-null    object 
 6   host_response_time    273 non-null    object 
 7   host_response_rate    273 non-null    object 
 8   host_acceptance_rate  278 non-null    object 
 9   host_verifications    302 non-null    object 
 10  property_type         302 non-null    object 
 11  room_type             302 non-null    object 
 12  accommodates          302 non-null    int64  
 13  bathrooms             0 non-null      float64
 14  bathrooms_text        302 non-null    object 
 15  bedrooms              1

A quick look into the dataset confirms that the bathrooms column was left empty, perhaps unfinished.

In [None]:
df.head()

Unnamed: 0,name,description,host_name,host_since,host_location,host_about,host_response_time,host_response_rate,host_acceptance_rate,host_verifications,property_type,room_type,accommodates,bathrooms,bathrooms_text,bedrooms,beds,amenities,price
0,Guest suite in Salem · ★4.90 · 1 bedroom · 1 b...,"Light and roomy, opening onto garden, this apa...",Sara,8/15/2011,"Salem, OR","An avid traveler and lover of the outdoors, I ...",within an hour,100%,88%,"['email', 'phone']",Entire guest suite,Entire home/apt,2,,1 bath,1.0,1.0,"[""Host greets you"", ""Baby safety gates"", ""Esse...",$103.00
1,Home in Salem · ★5.0 · 2 bedrooms · 2 beds · 1...,Cute fully furnished house! Fresh bright paint...,Sean,1/10/2015,"Salem, OR",,within an hour,100%,73%,"['email', 'phone']",Entire home,Entire home/apt,4,,1 bath,2.0,2.0,"[""Smoke alarm"", ""Fire pit"", ""Kitchen"", ""Carbon...",$120.00
2,Guest suite in Salem · ★4.95 · Studio · 1 bed ...,NEWLY RENOVATED! Our home is in the quaint his...,Julie,7/11/2015,"Salem, OR","We are an active couple that loves our home, n...",within an hour,100%,100%,"['email', 'phone']",Entire guest suite,Entire home/apt,2,,1 bath,,1.0,"[""Free street parking"", ""Essentials"", ""Carbon ...",$103.00
3,Home in Salem · ★5.0 · 1 bedroom · 1 bed · 1 s...,Comfort Home was initially established for tra...,Nancy,5/20/2012,"Salem, OR",I have lived in this home 21 years and have tr...,within a few hours,100%,0%,"['email', 'phone']",Private room in home,Private room,1,,1 shared bath,,1.0,"[""Host greets you"", ""Free street parking"", ""Es...",$65.00
4,Home in Salem · ★4.80 · 3 bedrooms · 3 beds · ...,"The State Capitol, Salem's favorite breakfast ...",Christy,8/9/2012,"Salem, OR",I co-host with my husband Arnaud. I'll be the ...,within an hour,100%,100%,"['email', 'phone', 'work_email']",Entire home,Entire home/apt,6,,1 bath,3.0,3.0,"[""Free street parking"", ""Pets allowed"", ""Essen...",$225.00
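
The preview shows `bathrooms` empty while `bathrooms_text` holds strings such as "1 bath" and "1 shared bath". As a minimal sketch, assuming the text values start with a number when a count is given, one way the numeric column might be populated is shown below. The helper `parse_bathroom_count` and the "Half-bath" sample are hypothetical and not part of the original notebook.

```python
import re

def parse_bathroom_count(text):
    """Return the leading number in a bathrooms_text value, or None if there is none."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None

# The first two values are copied from the preview above;
# "Half-bath" is a hypothetical value with no leading number.
samples = ["1 bath", "1 shared bath", "Half-bath"]
counts = [parse_bathroom_count(s) for s in samples]
print(counts)
```

Values without a leading number come back as `None`, so they would still need manual handling before the column could be used.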


Finally, the descriptive statistics for the numerical columns are displayed below. We can see a clear right skew in the `beds` and `accommodates` columns, as their means (1.93 and 3.36) noticeably exceed their medians (1 and 2). The `bathrooms` column is not useful unless it can be populated.

In [None]:
df.describe()

Unnamed: 0,accommodates,bathrooms,bedrooms,beds
count,302.0,0.0,185.0,299.0
mean,3.360927,,2.07027,1.926421
std,2.302432,,1.137518,1.290623
min,1.0,,1.0,1.0
25%,2.0,,1.0,1.0
50%,2.0,,2.0,1.0
75%,4.0,,3.0,2.0
max,16.0,,6.0,8.0
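
To illustrate why a mean above the median points to a right skew, here is a small sketch using Python's `statistics` module. The `capacities` list is hypothetical and is not taken from `airbnb.csv`. It only mimics the pattern of many small listings plus one very large one.

```python
from statistics import mean, median

def mean_minus_median(values):
    """Positive result: a few large values pull the mean above the median."""
    return mean(values) - median(values)

# Hypothetical guest capacities: mostly small listings and one very large one
capacities = [1, 2, 2, 2, 2, 4, 4, 16]
gap = mean_minus_median(capacities)
print(gap > 0)
```

A single large value is enough to raise the mean while leaving the median unchanged, which is the same pattern seen in `accommodates` (max of 16, median of 2).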


## Analysis Purpose
The goal of this analysis is to determine which amenities are the most common and which are the least common in the Salem, Oregon area. However, because the amenities are stored as lists in a single column of the data set, this information is difficult to analyze without the use of for loops.
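
As a sketch of what "stored as lists in a single column" means in practice: each `amenities` cell is read from the CSV as one string that looks like a list, so it has to be parsed before it can be looped over. The example below uses `ast.literal_eval`, which is a swapped-in alternative to the `eval` call used in this notebook. The `raw_cell` value is a hypothetical, shortened cell built from amenity names in the preview.

```python
import ast

# One amenities cell as it might look after read_csv: a string, not a list.
# Hypothetical shortened cell; the amenity names come from the preview above.
raw_cell = '["Smoke alarm", "Fire pit", "Kitchen"]'

# literal_eval only accepts Python literals, so it cannot run arbitrary code
amenities = ast.literal_eval(raw_cell)

print(type(raw_cell).__name__, type(amenities).__name__)
for amenity in amenities:
    print(amenity)
```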

## Analysis
### How many of each amenity are in Salem listings?
Here, we use nested for loops to count how many times each amenity appears across all Airbnb listings in Salem, Oregon.

In [None]:
import ast

#this code creates and initializes a dictionary with a key-value pair for each amenity
#where the key is the amenity name and the value is initialized to be zero
#ast.literal_eval parses each list string without executing arbitrary code, unlike eval
totalAmenitiesCount = {}
for index , listing in df.iterrows():
  for amenity in ast.literal_eval(listing['amenities']):
    totalAmenitiesCount[amenity] = 0

#this code increments the values for each key found in the `amenities` column
for index , listing in df.iterrows():
  for amenity in ast.literal_eval(listing['amenities']):
    totalAmenitiesCount[amenity] += 1
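
The same tally could also be sketched with `collections.Counter`, which removes the need for a separate pass that initializes every key to zero. This is an alternative to the two-pass approach above, not the method used in this notebook. The `amenity_cells` list is a hypothetical stand-in for the `amenities` column.

```python
import ast
from collections import Counter

# Hypothetical stand-in for df['amenities']: each cell is a list stored as a string
amenity_cells = [
    '["Smoke alarm", "Kitchen", "Wifi"]',
    '["Smoke alarm", "Kitchen"]',
    '["Smoke alarm"]',
]

counts = Counter()
for cell in amenity_cells:
    # update() adds one to the count of every amenity in the parsed list
    counts.update(ast.literal_eval(cell))

print(counts["Smoke alarm"], counts["Kitchen"], counts["Wifi"])
```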

To make this analysis easier to read, let's loop through each amenity in the counter and print only the ones that have more than 200 occurrences.

We find that the most common amenity is a smoke alarm (301). The second and third most common amenities are, respectively, a carbon monoxide alarm (275) and a kitchen (272).

Note the presence of a wildcard amenity: essentials. This categorization is subjective and could represent any number of amenities. It is possible, therefore, that the counts shown here are not fully accurate, as some amenities could be wrapped up in "essentials" rather than listed separately.

In [None]:
for amenity in totalAmenitiesCount:
  if totalAmenitiesCount[amenity] > 200:
    print(amenity , ":" , totalAmenitiesCount[amenity])

Essentials : 269
Carbon monoxide alarm : 275
Refrigerator : 230
Iron : 221
Hair dryer : 219
Fire extinguisher : 256
Microwave : 247
Cooking basics : 229
Smoke alarm : 301
First aid kit : 236
Dishes and silverware : 260
Wifi : 252
Hangers : 259
Hot water : 245
Kitchen : 272
Bed linens : 234
Free parking on premises : 238
Self check-in : 206
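
The printed output is in dictionary order, so the ranking has to be read off by eye. As a rough sketch, the counts could be sorted to make the ranking explicit. The dictionary below is a subset of the counts printed above and the name `printed_counts` is hypothetical. Because only amenities with more than 200 occurrences were printed, the last entry here is the least common of this subset, not of the whole data set.

```python
# A subset of the counts printed above (amenities with more than 200 occurrences)
printed_counts = {
    "Essentials": 269,
    "Carbon monoxide alarm": 275,
    "Smoke alarm": 301,
    "Kitchen": 272,
    "Self check-in": 206,
}

# Sort (amenity, count) pairs from highest count to lowest
ranked = sorted(printed_counts.items(), key=lambda item: item[1], reverse=True)

most_common = ranked[:3]   # top three amenities
least_common = ranked[-1]  # lowest count within this subset only

for amenity, count in most_common:
    print(amenity, ":", count)
print("least common in subset:", least_common[0])
```

Applying the same sort to the full counter and reading from the end of the list would surface the least common amenities overall.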


Note that the smoke alarm count (301) is just shy of the total number of listings (302). As Oregon law mandates that all homes being sold or rented have a smoke alarm (https://www.oregon.gov/osfm/Documents/SmokeAlarmLaw2016.pdf), I hypothesize that the missing smoke alarm is hiding in the "essentials" category.

To check this, we can loop through the data set once again, filtering for entries without smoke alarms. We find that the listing at index 26 does not have a smoke alarm listed under amenities. It does, however, have "essentials" listed. It is reasonable to assume, therefore, that a smoke alarm is included in the "essentials" category for this listing.

In [None]:
import ast

#create a boolean for each entry: True if "Smoke alarm" appears in its amenities
#appending exactly once per listing keeps the list aligned with the index, even for an empty amenities list
hasSmokeAlarm = []
for index , listing in df.iterrows():
  hasSmokeAlarm.append("Smoke alarm" in ast.literal_eval(listing['amenities']))

#identify index of each listing with no smoke alarm
noSmokeAlarmIndex = [] #in case there are more entries without smoke alarms than expected
for index in range(len(hasSmokeAlarm)):
  if not hasSmokeAlarm[index]:
    noSmokeAlarmIndex.append(index)
    # print("index:" , index)

#print the amenities of the first listing without a smoke alarm, if there is one
if noSmokeAlarmIndex:
  for amenity in ast.literal_eval(df.loc[noSmokeAlarmIndex[0] , 'amenities']):
    print(amenity)

Hangers
Kitchen
Essentials
Washer
Private living room
Air conditioning
Indoor fireplace
Heating
Dryer
Iron
Free parking on premises
Lock on bedroom door
Shampoo
Wifi
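
The same kind of check could be repeated for any amenity. Below is a minimal sketch of a reusable helper, assuming the amenities have already been parsed into Python lists. The function `listings_missing` and the `parsed` sample data are hypothetical and not part of the original notebook.

```python
def listings_missing(amenity_lists, amenity):
    """Return the positions of listings whose amenity list lacks `amenity`."""
    return [i for i, amenities in enumerate(amenity_lists) if amenity not in amenities]

# Hypothetical parsed amenity lists, one per listing
parsed = [
    ["Smoke alarm", "Essentials"],
    ["Essentials", "Wifi"],
    ["Smoke alarm", "Kitchen"],
]

missing = listings_missing(parsed, "Smoke alarm")

# Of the listings missing a smoke alarm, which ones do list "Essentials"?
also_essentials = [i for i in missing if "Essentials" in parsed[i]]
print(missing, also_essentials)
```

Comparing `missing` with `also_essentials` shows how often a missing amenity coincides with an "Essentials" entry, though it cannot confirm that the amenity is actually covered by it.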


## Conclusion
In this analysis, we find that the most common amenities are a smoke alarm, a carbon monoxide alarm, and a kitchen. We also find that while a smoke alarm is not explicitly present in the amenities list for every listing, it is likely included in the "essentials" category for the one listing without a smoke alarm listed. This may be true for many other amenities as well, so the counts are not necessarily reliable.