The aim of this analysis was to understand Airbnb prices in Boston and Seattle, and how these relate to location review scores. I asked four questions:
• Question 1: Which is the more expensive city, Boston or Seattle?
• Question 2: What factors are important in pricing an Airbnb listing?
• Question 3: Does geographically mapping the listings shed more light on the cost differences?
• Question 4: Are the location review scores correlated with pricing across different neighbourhoods?
The repository contains Python code covering library imports, data manipulation and tidying, analysis, and a model build to predict the price of Airbnb rooms in each city. It uses the following libraries:
• Pandas
• NumPy
• Matplotlib
• Seaborn
• scikit-learn (sklearn)
This project uses data from the following two Kaggle sources:
https://www.kaggle.com/airbnb/seattle/data
https://www.kaggle.com/datasets/airbnb/boston
Both datasets contain the same three files (listings, reviews, calendar), so save each dataset in its own folder within the working directory, named “boston” and “seattle”, and make sure the files are saved as CSVs. For example, the first file the code imports is “seattle/calendar.csv”. The data files have not been uploaded to this repo for size reasons.
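Before running the analysis, it can help to confirm that the data is laid out as described above. The following is a minimal sketch, not part of the project code: the helper names `expected_paths` and `missing_files` are hypothetical, and the file names `listings.csv` and `reviews.csv` are assumed by analogy with the `seattle/calendar.csv` example.

```python
from pathlib import Path

# Folder and file names taken from the setup description above.
# "listings.csv" and "reviews.csv" are assumed names; only
# "seattle/calendar.csv" is stated explicitly in this README.
CITIES = ("boston", "seattle")
FILES = ("listings", "reviews", "calendar")


def expected_paths(base_dir="."):
    """Return the six CSV paths the layout is expected to contain."""
    base = Path(base_dir)
    return [base / city / f"{name}.csv" for city in CITIES for name in FILES]


def missing_files(base_dir="."):
    """Return the expected CSV paths that do not exist under base_dir."""
    return [path for path in expected_paths(base_dir) if not path.is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing data files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All six CSV files found.")
```

Run from the working directory, this lists any file that still needs to be downloaded from Kaggle or renamed.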
The findings are written up in this blog post:
https://medium.com/@chloe.gillham/sleepless-in-seattle-or-baseball-in-boston-655bc30d743b
# Acknowledgements
As well as the above data, I have also used screenshots from Google Maps (searched by the latitude and longitude provided in the data) as reference points in my blog post.