This project analyses Airbnb data for London, UK, compiled in 2021.
Under the guise of working for a rental property investor, I will aim to answer, or at the very least provide insight into, the following questions.
- On average, where in London are Airbnb's most expensive listings?
- Which of London's boroughs are the most popular for Airbnb listings?
- Can we predict the price for a London Airbnb listing?
I also look at the following questions, but there is little to say about them (the notebook explains why):
- Do visitors prefer to hire the entire house/apartment or a single room?
- Which area in London has the best Airbnb ratings?
For more detailed discussion of the findings, please refer to the notebook or its corresponding Medium article. Below is a summary of the answers to the questions posed above.
- On average, where in London are Airbnb's most expensive listings?
  - Westminster: £258 per night.
  - City of London: £237 per night.
  - Kensington and Chelsea: £222 per night.
- Which of London's boroughs are the most popular for Airbnb listings?
  - Westminster or City of London, depending on the feature used.
- Can we predict the price for a London Airbnb listing?
  - Not very well: an r^2 score of ~0.4, or an RMSE of ~£50, when the data is filtered to include only listings priced below £1000 per night.
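To make the r^2 and RMSE figures above concrete, here is a minimal sketch of how such metrics can be computed after applying the under-£1000 price filter. It runs on synthetic data, and the linear regression is only a stand-in: the model, features and split used in the notebook are not shown here, so every name below is an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the Airbnb dataset): three numeric features
# and a noisy nightly price, plus a few extreme listings.
n = 500
X = rng.normal(size=(n, 3))
price = 100 + 30 * X[:, 0] + rng.normal(scale=40, size=n)
price[:5] = 5000  # extreme listings that the filter should drop

# Keep only listings priced below 1000 per night, as in the summary above.
mask = price < 1000
X_f, y_f = X[mask], price[mask]

X_train, X_test, y_train, y_test = train_test_split(
    X_f, y_f, test_size=0.3, random_state=42
)

# Stand-in model; the notebook may well use something different.
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

r2 = r2_score(y_test, pred)
# RMSE is the square root of the mean squared error, in the same units as price.
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"r^2 = {r2:.2f}, RMSE = {rmse:.1f}")
```

An RMSE is easier to interpret than r^2 here because it is expressed in pounds per night, which is why the price filter matters: a handful of very expensive listings can dominate the squared error.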
I'll be using Airbnb listing data for London, which was compiled in December 2021. You can find links to the full dataset and some Airbnb visualisations. Airbnb also provide a spreadsheet with explanations for each data column and some assumptions in the data.
For this particular analysis I've only used the following three data files (some of which are stored with git lfs):
- listings.csv: Detailed listings data. Size: 150 MB.
- listings_summary.csv: Summary information and metrics for listings in London (good for visualisations). Size: 8.9 MB.
- neighbourhoods.geojson: GeoJSON file of neighbourhoods of the city. Size: 1 MB.
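As a rough illustration of how the average price per borough could be derived from the listings data, the sketch below groups prices by borough with pandas. The column names neighbourhood_cleansed and price, and the assumption that prices are stored as text such as "$1,250.00", are assumptions about the file layout, and the sample rows are made up.

```python
import pandas as pd


def mean_price_by_borough(listings: pd.DataFrame) -> pd.Series:
    """Return the mean nightly price per borough, most expensive first."""
    # Assumed columns: 'neighbourhood_cleansed' (borough name) and 'price'
    # (text such as '$1,250.00'). Strip everything except digits and the dot.
    price = (
        listings["price"]
        .astype(str)
        .str.replace(r"[^\d.]", "", regex=True)
        .astype(float)
    )
    return (
        price.groupby(listings["neighbourhood_cleansed"])
        .mean()
        .sort_values(ascending=False)
    )


# Tiny made-up sample for illustration only (not real listing data).
sample = pd.DataFrame(
    {
        "neighbourhood_cleansed": ["Westminster", "Westminster", "Camden"],
        "price": ["$200.00", "$1,000.00", "$90.00"],
    }
)
result = mean_price_by_borough(sample)
print(result)
```

With the real file, the same function could be applied to pd.read_csv("data/listings.csv"), provided the columns match the assumptions above.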
I've used Anaconda with Python 3.9.2 to create the environment for this work. You can use the requirement.yml file to create the environment locally using:
conda env create -f requirement.yml
You can then activate it with:
conda activate airbnb_london
This will install numpy, pandas, geopandas, matplotlib, sklearn, seaborn, plotly and their dependencies.
To run the Python implementation in src, please enter the following command in your terminal:
python src/process_data.py data/listings.csv data/listings_summary.csv data/neighbourhoods.geojson data/LondonAirbnbDatabase.db
It takes about 3 minutes to run end to end and displays some key findings on price, along with the model output metrics.
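If the final argument (data/LondonAirbnbDatabase.db) is a SQLite database written by the script, which is an assumption based only on its file extension, its contents could be inspected with a small helper like the one below. The function name list_tables is hypothetical and not part of this project.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, list_tables("data/LondonAirbnbDatabase.db") would show which tables exist before you query them with pandas.read_sql_query.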
- airbnb.ipynb: Jupyter notebook containing all the analysis within this project.
- .gitattributes: Details of which file types are tracked by git lfs.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.