Business analytics of Vienna's Airbnb listings and price modelling using geospatial OpenStreetMap features
This project aims to predict a benchmark price that a new host could charge, based on existing Airbnb listings across Vienna. Public information is available for roughly 12,000 Airbnb listings and their hosts.
A machine learning pipeline has been set up in which OpenStreetMap is used for feature generation. These features include the locations of amenities such as shops, bars, restaurants and tourist destinations, and are used for price modelling.
The project includes a web app that displays some statistics and provides an interface for benchmark price prediction based on the geolocation of a new listing. The visualization is implemented as a Plotly Dash app, which is deployed on Azure and accessible here.
app.py
main dash app
requirements.txt
python modules that will be installed for the web application at build.
/assets
this directory serves the CSS files and images for the app. charts.py is used for generating the figures, layout.py defines the HTML web layout, callbacks.py handles all the callbacks, and data_wrangling.py is used for all the data queries and data manipulation.
/data
contains the raw input tar.gz files from Inside Airbnb.
/data/geojson/vienna.geojson
GeoJSON file with the geospatial data of Vienna's neighbourhoods.
/data/osm/
contains all the OSM features, pre-saved as .csv files for better app performance
/model/
contains the model (generated in the notebook) as a pickle file
/nb/airbnb_vienna.ipynb
Jupyter notebook used for data exploration, feature extraction etc., including the analysis outcome
runtime.txt
tells (the Gunicorn HTTP server) which python version to use (only needed for Heroku deployment)
Procfile
defines what type of process is going to run (Gunicorn web process) and the Python app entrypoint (only needed for a deployment on Heroku)
.gitignore
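The /data/osm/ directory above stores pre-computed OSM features as .csv files so the app does not have to rebuild them on every start. The following is a minimal sketch of that caching idea, not the project's actual code: the function name load_or_build_features, the stand-in fake_osm_query and the file name bars.csv are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def load_or_build_features(csv_path, build_rows, fieldnames):
    """Return feature rows from csv_path, building and saving them on first use.

    Rows are always read back from the CSV, so all values come back as strings.
    """
    path = Path(csv_path)
    if not path.exists():
        rows = build_rows()  # the slow step, e.g. querying OSM
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    with path.open(newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


calls = []  # records how often the slow step actually runs


def fake_osm_query():
    """Hypothetical stand-in for an expensive OSM query."""
    calls.append(1)
    return [{"category": "bar", "lat": 48.2075, "lon": 16.3720}]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "bars.csv"  # hypothetical file name
    fields = ["category", "lat", "lon"]
    first = load_or_build_features(cache, fake_osm_query, fields)
    second = load_or_build_features(cache, fake_osm_query, fields)
```

The second call reads the cached file, so the slow query runs only once.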
- Change the current directory to the location where you want to clone the repository and run:
$ git clone https://github.com/AReburg/Airbnb-Price-Prediction.git
- Make sure that the app runs on the local web server before deployment. Set up your virtualenv (or don't) and ensure that all the modules are installed before running the app.
Install the modules from requirements.txt with pip or conda (Anaconda) from a terminal in the project root folder:
pip install -r requirements.txt
conda install --file requirements.txt
Executing the notebook has been tested on Anaconda distribution 6.4.12 with Python 3.9.13. Since the notebook is larger than 100 MB unless its outputs are cleared, it has been converted to HTML and can be viewed here.
- Run the app directly from your IDE, or from the terminal in the project's root directory:
python app.py
- It should then be accessible in the browser at:
http://127.0.0.1:8050/
- Open the app in the browser and start playing around.
The main findings will be published in a Medium post. Feel free to contact me if you have any questions or suggestions. To view the rendered geospatial charts of the Jupyter notebook, go to nbviewer and paste the notebook's link.
OpenStreetMap could be a powerful tool for feature generation. With OSM it is possible to determine how many restaurants, bars, cafes, shops etc. there are within a given radius. The number of amenities based on the geolocation can be used for price modelling. In the XGBoost model, some OSM features are important drivers for prices in Vienna.
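As a rough illustration of counting amenities within a given radius, here is a minimal sketch using only the Python standard library. It is not the notebook's implementation: the function names, the 500 m default radius and the sample coordinates are assumptions made for this example, and distances are computed with the haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_amenities(listing, amenities, radius_m=500.0):
    """Count amenities per category within radius_m of a listing.

    listing is a (lat, lon) tuple; amenities is an iterable of
    (category, lat, lon) tuples.
    """
    counts = {}
    for category, lat, lon in amenities:
        if haversine_m(listing[0], listing[1], lat, lon) <= radius_m:
            counts[category] = counts.get(category, 0) + 1
    return counts


# Hypothetical sample points, not taken from the project's data
listing = (48.2082, 16.3738)
amenities = [
    ("restaurant", 48.2090, 16.3745),  # roughly 100 m away
    ("bar", 48.2075, 16.3720),         # roughly 150 m away
    ("restaurant", 48.2500, 16.4500),  # several kilometres away
]
features = count_amenities(listing, amenities)
print(features)
```

The resulting per-category counts can be used as numeric feature columns for a listing. For thousands of listings and amenities, a spatial index would scale better than this pairwise loop.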
This project is licensed under the terms of the MIT license