How much should I charge for my Airbnb listing? Owners typically weigh several factors, including prices in the neighborhood, the configuration of the property, and the cost of owning and listing the unit. Some existing systems offer pricing recommendations, but they either take fewer input features or do not indicate how competitive the recommended price is. This project aims to build a web application that helps new property owners looking to list on Airbnb find an optimal price point: one that is competitive with the market without undervaluing the property. We also aim to explain why a listing should be priced in the recommended range and which additional features could improve its pricing power.
This project focuses on Los Angeles (LA), using Airbnb data downloaded from http://insideairbnb.com/get-the-data/. LA is one of the most popular locations for property listers, which provides a large dataset and makes the final web application useful to many hosts. The data set contains over 40,000 unique listings with detailed information about the configuration of the properties, amenities, location coordinates, description, price, and more. Because the project takes the perspective of property hosts, particularly new hosts, only features available at the point of sign-up are used.
This dataset is a collection of common amenities in Airbnb listings compiled by a previous project (Lewis, Data cleaning in Python: examples from cleaning Airbnb data, 2019).
This dataset contains 128 everyday objects and is commonly used to train object detection models in the YOLO series (https://github.com/ultralytics/yolov5).
This data set is provided by Redfin (https://www.redfin.com/news/data-center/) and contains housing market data aggregated by ZIP code. The data has been published monthly since 2000; our model uses data up to January 2023.
The project can be broken down into several broad categories (see flow chart):
- Initial exploratory data analysis
- Feature engineering (four parts)
- Supervised machine learning model construction
- Web application
pip install -r requirements.txt
Version 1.0.0
To run the model:
- Download the latest version.
- Go to `Project_Airbnb2/Scripts/the_pipelines`.
- Run `python update_features.py`. This generates the processing pipelines and features used in training.
- Run `python train.py`. This trains all five quantile regression LightGBM models and stores them. Other preprocessing steps, such as imputation and dimensionality reduction, are also performed.
- Go to `../Capstone_app` and run `streamlit run Home.py` to launch the web application. Alternatively, go directly to our web platform: http://18.205.39.151:8502
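
The training step fits one quantile regression model per quantile level, so that the predictions together describe a price range rather than a single number. Quantile regression minimizes the pinball loss; in LightGBM this corresponds to `objective="quantile"` with the `alpha` parameter set to the quantile level. The sketch below illustrates the loss itself in plain Python. The quantile levels, the function names, and the sample prices are assumptions chosen for illustration and are not taken from the project code.

```python
from typing import Dict, Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], alpha: float) -> float:
    """Mean pinball (quantile) loss for a quantile level alpha in (0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


def best_constant(y_true: Sequence[float], alpha: float) -> float:
    """Observed value that minimizes the pinball loss when used as a constant prediction."""
    return min(y_true, key=lambda c: pinball_loss(y_true, [c] * len(y_true), alpha))


# Hypothetical nightly prices in USD (illustrative only).
prices = [80, 95, 100, 120, 150, 200, 400]

# Assumed quantile levels; the source only states that five models are trained.
QUANTILES = [0.1, 0.25, 0.5, 0.75, 0.9]

# One "model" per quantile level; here each model is just a constant.
price_range: Dict[float, float] = {q: best_constant(prices, q) for q in QUANTILES}

for q, value in price_range.items():
    print(f"quantile {q:.2f}: {value}")
```

Minimizing the loss at a low quantile level yields a conservative price and at a high level an ambitious one, which is why a set of such models can express how competitive a given price is.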
Web application: http://18.205.39.151:8502/my_app
GitHub: https://github.com/foye501/Capstone_GMT89
Report: https://docs.google.com/document/d/161fEv0t4Ops9SG5NPMAXZgnTigGPNvgPrR8gCyeM7x0/
Video explanation: https://www.youtube.com/playlist?list=PL-Ih8lEqwhvFGjcMphHh4x3e4OoW4smun
Poster:
Airbnb data source: http://insideairbnb.com/get-the-data/
- Wei Li Tan: Amenities analysis, final report consolidation
- Yangkang Chen: Image analysis, model consolidation
- Masato Ando: Location analysis, GitHub
- Dongyao Wang: Description NLP analysis, Web application