# AirBnB Data Analysis Project

This is a capstone project for the Code:You Data Analysis course. The project 
analyzes Airbnb data to uncover useful correlations and trends and to guide 
investment decisions for short-term rental (STR) properties.

## Project Goal

**Determine the optimal location and house size for a short-term rental investment 
property in Seattle.**

The end result of this project is to classify properties in terms of STR 
Performance vs. Purchase Price:
- High STR Performance, Low Purchase Price
- High STR Performance, High Purchase Price
- Low STR Performance, Low Purchase Price
- Low STR Performance, High Purchase Price
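
The four categories above can be sketched as a simple median split. This is a minimal illustration, not the project's final method: the sample rows, the use of annual revenue as the performance measure, and the `classify` helper are all assumptions made for the example.

```python
from statistics import median

# Hypothetical sample rows: (name, annual STR revenue, purchase price).
# All numbers are made up for illustration only.
properties = [
    ("A", 42000, 650000),
    ("B", 55000, 900000),
    ("C", 30000, 600000),
    ("D", 28000, 950000),
]


def classify(rows):
    """Label each property relative to the median revenue and median price."""
    revenue_cut = median(r[1] for r in rows)
    price_cut = median(r[2] for r in rows)
    labels = {}
    for name, revenue, price in rows:
        performance = "High" if revenue > revenue_cut else "Low"
        cost = "High" if price > price_cut else "Low"
        labels[name] = f"{performance} STR Performance, {cost} Purchase Price"
    return labels


labels = classify(properties)
for name, label in labels.items():
    print(name, "->", label)
```

A median split is only one possible threshold; quartiles or a fixed price ceiling would work the same way with different cut points.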

In this data discovery notebook, I will:
1. Research key metrics for Short Term Rentals.
1. Research the Seattle housing market.
1. Determine the data needed to support the analysis.
1. Review available data.
1. Identify the type of analysis that can be done and list the questions that 
can be answered.
1. List the cleaning steps that will be needed.


## 1. Short Term Rental Metrics Research

A quick Google search surfaces some of the key metrics used to evaluate the 
financial performance of a short-term rental property. Net operating income 
(NOI) and revenue per property (RPP) appear to be the most relevant to our goal 
of finding a good investment property.

> **Net operating income (NOI)** is another key Airbnb metric for Airbnb hosts to evaluate the profitability of their listing. NOI measures the total revenue generated by a property from all sources and subtracts operating expenses, including management, maintenance, and cleaning fees. Hosts can use NOI to accurately determine their net profit and make informed decisions about pricing strategy, cost-saving measures, or investing in property improvements. By focusing on maximizing their NOI, hosts can drive profitability on the Airbnb platform and make sure they achieve sustainable success. Therefore, tracking NOI is an essential factor for any host looking to optimize their financial performance and grow their vacation rental business.

> **Revenue per property (RPP)** calculates the total earnings generated by a property within a defined timeframe. Monitoring RPP allows hosts to discern which listings are the highest revenue generators, enabling them to fine-tune pricing and marketing strategies accordingly. Furthermore, RPP aids hosts in assessing the financial feasibility of investing in new properties or enhancing existing ones. By optimizing RPP, hosts can enhance operational efficiency on Airbnb and thrive in the fiercely competitive vacation rental market. Tracking RPP becomes a pivotal aspect for hosts seeking to maximize financial performance and expand their business endeavors.

Sources: Short Term Rental KPIs

- [Key STR Metrics](https://hostify.com/blog/key-airbnb-performance-metrics-hosts-should-track)
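
As a rough sketch of how the two metrics relate, the snippet below computes RPP and then NOI for a single property. The function names, the sample figures, and the simplification that revenue equals nightly rate times booked nights are assumptions for illustration; real revenue may also include cleaning fees and other sources.

```python
def revenue_per_property(nightly_rate, booked_nights):
    """Booking revenue for one property over a period (simplified)."""
    return nightly_rate * booked_nights


def net_operating_income(total_revenue, operating_expenses):
    """Total revenue minus the sum of all operating expenses."""
    return total_revenue - sum(operating_expenses.values())


# Made-up example values, not taken from the data set.
rpp = revenue_per_property(nightly_rate=150.0, booked_nights=200)
expenses = {
    "management": 0.10 * rpp,  # assumed 10% management fee
    "maintenance": 2400.0,
    "cleaning": 3000.0,
}
noi = net_operating_income(rpp, expenses)
print(f"RPP: {rpp:.2f}, NOI: {noi:.2f}")
```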



## 2. Seattle Real Estate Market Research 

To determine the best location and size of house to buy for a short-term
rental, we need a way to compare the relative price of different
properties. A quick Google search yielded house prices for the Seattle market 
broken down by neighborhood and number of bedrooms.


### House Sale Price by Neighborhood


In [None]:
import pandas as pd

cost_by_neighborhood = pd.read_csv("../data/raw/Home_Cost_By_Neighborhood.csv")
cost_by_neighborhood

### House Sale Price by Number of Bedrooms


In [None]:
import pandas as pd

cost_by_bedrooms = pd.read_csv("../data/raw/Home_Cost_by_Bedrooms.csv")
cost_by_bedrooms

**Sources** 

Seattle Real Estate Market Data

- [Home Price by Neighborhood](https://www.seattlemet.com/home-and-real-estate/how-much-do-homes-cost-in-seattle-area-neighborhoods-real-estate)
- [Home Price by Number of Bedrooms](https://www.myseattlehomesearch.com/blog/seattle-house-prices-versus-the-number-of-bedrooms/#:~:text=Not%20shown%20in%20the%20data,just%2012%25%20of%20the%20market.)



## 3. Data Needed To Support Topic

Housing Market
- Sales Price 
- number of bedrooms
- location

AirBnB Data
- number / percent of booked days 
- number of bedrooms
- location
- nightly rate

Other (needed for NOI calculations)
- mortgage rate: assume 7.5% and adjust if needed
- STR management fee: assume 10%
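
To show how the two assumed rates could feed into cost estimates, here is a minimal sketch using the standard fixed-rate amortization formula. The 30-year term, the 20% down payment, the purchase price, and the revenue figure are hypothetical values chosen for the example, not inputs from the project.

```python
def monthly_mortgage_payment(principal, annual_rate, years=30):
    """Fixed-rate payment: P * r / (1 - (1 + r) ** -n), with monthly r and n."""
    monthly_rate = annual_rate / 12
    n_payments = years * 12
    if monthly_rate == 0:
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


def management_fee(revenue, fee_rate=0.10):
    """Management fee as a flat share of revenue."""
    return revenue * fee_rate


purchase_price = 800_000          # hypothetical purchase price
loan = purchase_price * 0.80      # assumes a 20% down payment
payment = monthly_mortgage_payment(loan, annual_rate=0.075)
fee = management_fee(30_000)      # hypothetical annual revenue

print(f"Monthly payment: {payment:.2f}")
print(f"Annual management fee: {fee:.2f}")
```

Keeping the rates as function arguments makes it easy to adjust the 7.5% and 10% assumptions later.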

## 4. Data Discovery - Kaggle AirBnB Data

The next step is to see what data is available in the Kaggle data set.

Sources:
- [Kaggle Seattle Listing Data Set](https://www.kaggle.com/datasets/airbnb/seattle)


There are three primary data sets available from the Kaggle data.
- AirBnB Listings
- AirBnB Reviews
- AirBnB Calendar

### Listings

The listings data set contains AirBnB listings from Seattle, Washington. To 
start the analysis, we load and preview the listings data.

In [None]:
import pandas as pd

listings = pd.read_excel("../data/raw/Tableau Full Project.xlsx", sheet_name=0)
listings

In [None]:
listings.shape

There are 3,818 AirBnB listings included in the data. Each listing has 92 attributes.

Next we will look at the list of attributes available.

In [None]:
listings.info()

**Numeric Fields**

In [None]:
listings.describe()

In [None]:
listings.price.describe()
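
If `price` turns out to be stored as text (for example `"$1,250.00"`), `describe()` will not report numeric statistics. The sketch below shows one way to convert such strings, using a small made-up Series in place of `listings.price`. Whether the column actually needs this depends on how the file was exported, so treat the string format as an assumption; if the column is already numeric, the `.str` accessor will raise an error and this step should be skipped.

```python
import pandas as pd

# Hypothetical stand-in for listings.price; values are made up.
raw_price = pd.Series(["$85.00", "$1,250.00", None, "$99.00"])

# Remove "$" and "," and convert to float; missing or unparseable
# entries become NaN instead of raising an error.
clean_price = pd.to_numeric(
    raw_price.str.replace(r"[$,]", "", regex=True),
    errors="coerce",
)

print(clean_price.describe())
```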

**Categorical Fields**

In [None]:
listings.neighbourhood_group_cleansed.unique()

In [None]:
listings.neighbourhood_group_cleansed.value_counts()

**Fields to Keep and Clean**

- id
- name
- street
- neighbourhood 
- neighbourhood_cleansed 
- neighbourhood_group_cleansed 
- city    
- state    
- zipcode   
- market    
- smart_location 
- country_code 
- country
- latitude
- longitude
- is_location_exact
- property_type
- room_type
- accommodates
- bathrooms
- bedrooms
- beds  
- bed_type
- amenities
- square_feet
- price
- weekly_price
- monthly_price
- security_deposit
- cleaning_fee
- guests_included
- extra_people
- minimum_nights
- maximum_nights
- calendar_updated
- has_availability
- availability_30
- availability_60
- availability_90
- availability_365
- calendar_last_scraped
- number_of_reviews
- first_review
- last_review
- review_scores_rating
- review_scores_accuracy
- review_scores_cleanliness
- review_scores_checkin
- review_scores_communication
- review_scores_location
- review_scores_value
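
Several of the fields above could support the "number / percent of booked days" metric from section 3. The sketch below derives a rough proxy from `availability_365` and `price` on a made-up sample. It assumes that unavailable days were booked, which overstates bookings whenever a host has simply blocked dates, so the result is an upper-bound estimate rather than a measured occupancy.

```python
import pandas as pd

# Hypothetical rows using column names from the field list; values are made up.
sample = pd.DataFrame({
    "id": [1, 2, 3],
    "price": [100.0, 150.0, 80.0],
    "availability_365": [365, 73, 0],
})

# Share of the next 365 days that is not available to book.
sample["unavailable_share"] = 1 - sample["availability_365"] / 365

# Upper-bound revenue estimate: nightly price times unavailable days.
sample["est_annual_revenue"] = sample["price"] * (365 - sample["availability_365"])

print(sample)
```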

## 5. Questions that can be answered

The purpose of this project is to analyze AirBnB data to determine the most 
attractive attributes of a short-term rental investment.


- Question 1
- Question 2

## 6. Cleaning Needed

- Remove Columns
- Rename Columns
- Filter Rows
- Create Calculated Fields
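
The four cleaning steps above could be chained in pandas roughly as follows. This is a sketch on made-up rows: the `unused_column` name, the renamed `neighborhood_group` column, the choice to drop rows with missing `bedrooms`, and the `unavailable_days` field are all illustrative assumptions, not decisions the project has made.

```python
import pandas as pd

# Hypothetical rows; values are made up and unused_column is a placeholder.
raw = pd.DataFrame({
    "id": [1, 2, 3],
    "neighbourhood_group_cleansed": ["Ballard", "Downtown", "Ballard"],
    "bedrooms": [2.0, None, 3.0],
    "price": [120.0, 200.0, 180.0],
    "availability_365": [100, 200, 365],
    "unused_column": ["x", "y", "z"],
})

keep = ["id", "neighbourhood_group_cleansed", "bedrooms", "price", "availability_365"]

cleaned = (
    raw[keep]  # remove columns: keep only the fields of interest
    .rename(columns={"neighbourhood_group_cleansed": "neighborhood_group"})  # rename columns
    .dropna(subset=["bedrooms"])  # filter rows: drop listings with no bedroom count
    .assign(unavailable_days=lambda df: 365 - df["availability_365"])  # calculated field
)

print(cleaned)
```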