# Predicting Airbnb Listing Prices in Paris  
*Using machine learning to understand key pricing drivers*

## 1. Business Understanding

<div style="text-align: center;">
  <img src="../illustration-images/images.png" width="610" style="display:inline-block; margin-right: 20px;">
  <img src="../illustration-images/paris.jpg" width="340" style="display:inline-block;">
</div>

## 1.1 Business Objectives

Predicting Airbnb listing prices and understanding which features influence them most benefits both sides of a booking.
Renters want listed prices to be attractive, while hosts, who are seeking profit, want pricing guidance that does not leave them shortchanged. Ultimately, both parties aim for a fair price based on the apartment's characteristics and location.

The goal of this project is to develop a *supervised machine learning* model to:

* **Understand** the impact of individual features typically found in listings.
* **Predict** accurate prices based on the characteristics of the apartment.

To carry out this project, we will use Airbnb listing price data for the city of **Paris** from the past 12 months. Using this dataset, we will train a machine learning model, analyze feature importance, and ultimately deploy the model into production as a web app.

<div style='text-align: center;'>
    <img src='../illustration-images/diagram-pipeline.png' width="600">
</div>

This project will interest and benefit the following stakeholders:

* **Renters**: The model will help renters assess whether the listed price is fair by comparing it to the predicted price. For price-sensitive renters, it also offers insight into which features to prioritize or compromise on to stay within their budget.
* **Hosts**: Setting a fair and competitive price can be challenging, especially for new hosts. Our model will assist them by offering data-driven price suggestions based on similar apartments. It can also help hosts identify which features to highlight or improve to increase the value of their listing.

## 1.2 Business Success Criteria

In the scope of this project, we aim to achieve the following:

* **Low RMSE**: We will use the Root Mean Square Error (RMSE) to evaluate the model's performance. Our target is an RMSE below €50, indicating that price estimates are reasonably close to actual values.
* **Identified key features**: At the end of the project, we aim to clearly identify the most important features that influence pricing.
* **Lightweight model**: The final model should be as lightweight as possible to enable fast and efficient deployment.
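
To make the RMSE criterion concrete, here is a minimal sketch of how the metric could be computed and checked against the €50 target. The `rmse` helper and the sample prices are hypothetical and exist only for illustration; they are not taken from the project's dataset or codebase.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences of prices."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up nightly prices in euros (illustrative only, not real listings).
actual_prices = [100.0, 150.0, 200.0]
predicted_prices = [110.0, 140.0, 230.0]

TARGET_RMSE_EUR = 50.0  # success threshold stated in the criteria above

score = rmse(actual_prices, predicted_prices)
meets_target = score < TARGET_RMSE_EUR
print(f"RMSE: {score:.2f} EUR, meets target: {meets_target}")
```

Because the errors are squared before averaging, a few badly mispriced listings raise the RMSE more than many small misses do, which is worth keeping in mind when comparing a score against the target.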




## 2. Data Understanding

## 2.1 Data Collection

In [None]:
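import importlib
import os
import sys

# Add the project root (the parent directory of this notebook) to sys.path
# so that the `src` package can be imported as `src.data_scraper`.
sys.path.append(os.path.abspath(".."))

import src.data_scraper

# Reload the module so that edits to src/data_scraper.py are picked up
# without restarting the kernel.
importlib.reload(src.data_scraper)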