## Problem Statement
#### Compare the return on investment for a rental property offered on the private market versus on Airbnb


## US Markets to Explore
* San Francisco 
* New York
* Las Vegas
* Lake Tahoe
* Miami
* Los Angeles

### Property Type
The property must have a minimum of 1 bedroom and a maximum of 7 bedrooms, and must be one of the following types: Apartment, House, Condominium, Townhouse, Cottage, Guesthouse, Hostel, Bungalow, Boutique hotel, Bed and breakfast, Hotel, Tiny house, Resort, Villa, Timeshare, Cabin, Castle, Earth house.

### Calculation
After selecting the properties, calculate the rental rates of different areas based on the features-to-bedroom ratio from traditional rentals and Airbnb listings.

For the Airbnb occupancy rate, we check the number of bookings on a monthly basis, whereas the occupancy rate displayed on the platform is based on the bookings that took place in the past 10 months. From there, we can project the Airbnb occupancy rates for other properties based on each property’s rental rate and property type. Finally, the estimated Airbnb rental income is calculated by multiplying the rental rate by the occupancy rate.
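
The final multiplication can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the function name `estimate_monthly_income` and the `days_in_month` default of 30 are assumptions, and the occupancy rate is assumed to be a fraction between 0 and 1.

```python
def estimate_monthly_income(nightly_rate, occupancy_rate, days_in_month=30):
    """Estimate monthly Airbnb income as nightly rate x occupancy x days.

    nightly_rate   -- nightly rental rate in dollars
    occupancy_rate -- fraction of nights booked, between 0 and 1
    days_in_month  -- assumed month length (hypothetical default of 30)
    """
    if not 0 <= occupancy_rate <= 1:
        raise ValueError("occupancy_rate must be between 0 and 1")
    return nightly_rate * occupancy_rate * days_in_month


# Hypothetical example: $200 per night, booked 60% of the nights
income = estimate_monthly_income(200, 0.6)
print(income)
```

Expressing occupancy as a fraction keeps the result in dollars per month, which is directly comparable to a traditional monthly rent.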

Data Obtained from:

#### SF, NY
Murray Cox, 2018, “Detailed Listings data for San Francisco”, Inside Airbnb, http://insideairbnb.com/get-the-data.html

#### Explore the following Datasets

* Airbnb
* HomeAway
* Travelocity
* ListHub
* Rent Jungle
* Hotpads
* Zillow

## Ideas to Explore

### Features:
* Investigate Zillow rental prices vs monthly Airbnb prices, for example compare 1-bedroom listings in a specific area
* Seasonal (Destination Type)
* Weather Patterns (Climate)
* Walkability Index
* From the neighborhood, look at Zillow average home prices
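
The first idea, comparing traditional rents with monthly Airbnb prices for the same bedroom count and area, might look like the sketch below. All values are made-up sample data, and the `monthly_rent` column and `airbnb_to_rent_ratio` name are assumptions; the real inputs would come from the Airbnb listings and a traditional rental source such as Zillow.

```python
import pandas as pd

# Hypothetical sample data standing in for the real sources
airbnb = pd.DataFrame({
    'neighbourhood': ['Mission', 'Mission', 'Marina'],
    'bedrooms': [1, 1, 1],
    'monthly_price': [4200.0, 3800.0, 5000.0],
})
traditional = pd.DataFrame({
    'neighbourhood': ['Mission', 'Marina'],
    'bedrooms': [1, 1],
    'monthly_rent': [3000.0, 3500.0],
})

# Median Airbnb monthly price per neighbourhood and bedroom count
airbnb_median = (airbnb
                 .groupby(['neighbourhood', 'bedrooms'], as_index=False)['monthly_price']
                 .median())

# Join the two sources and compute the Airbnb-to-traditional price ratio
comparison = airbnb_median.merge(traditional, on=['neighbourhood', 'bedrooms'])
comparison['airbnb_to_rent_ratio'] = comparison['monthly_price'] / comparison['monthly_rent']
print(comparison)
```

A ratio above 1 only says the listed Airbnb monthly price is higher; it does not account for occupancy, which still has to be applied before comparing income.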

In [1]:
#Import libraries
import pandas as pd
import numpy as np

#Set display options
pd.set_option('display.max_columns', None)

In [2]:

# Create initial DataFrame from the San Francisco listings
# Skip malformed rows while reading (on_bad_lines requires pandas >= 1.3)
sf_df = pd.read_csv('.\\Data\\San Francisco Data\\listings.csv.gz',
                   on_bad_lines='skip')

#ny_df = pd.read_csv('.\\Data\\New York\\listings.csv.gz',
                   #on_bad_lines='skip')



In [3]:
#Function to preprocess data
def preprocess(df):
    
    #Standardize column names
    df.columns = df.columns.str.strip()
    
    #Keep columns of interest; copy so later edits do not modify a slice of the input
    df = df[['bathrooms','bedrooms','accommodates','price','weekly_price','monthly_price','cleaning_fee',
             'property_type','zipcode','neighbourhood_cleansed','transit','access','latitude','longitude',
             'amenities','square_feet']].copy()

    #Update column names
    df = df.rename(columns={'neighbourhood_cleansed': 'neighbourhood','price': 'nightly_price'})
    
    #Update property types
    #Modify listing names that might be considered a condo
    condo_list=['Guesthouse','Bungalow','Tiny house','Cottage']
    df = update_property_value('property_type',condo_list,'Condominium', df)
    
    #Modify listing names that might be considered an apartment
    apartment_list = ['Serviced apartment','Aparthotel','Timeshare']
    df = update_property_value('property_type',apartment_list,'Apartment', df)
    
    #Modify listing names that might be considered a house
    house_list = ['Hostel','Boutique hotel','Bed and breakfast','Villa','Cabin','Earth house']
    df = update_property_value('property_type',house_list,'House', df)
    
    #Keep properties that are an Apartment, Townhouse, Condominium or House
    df = df.loc[df['property_type'].isin(['Apartment','Townhouse','Condominium','House'])]
    
    #Drop rows with missing bedroom values and convert bedrooms to integer type
    df = df[np.isfinite(df['bedrooms'])].copy()
    df['bedrooms'] = df['bedrooms'].astype(int)
    
    #Drop rows with missing bathroom values
    df = df[np.isfinite(df['bathrooms'])]
    
    #Keep rows with at least 1 bedroom and at most 7 bedrooms
    df = df[(df.bedrooms > 0) & (df.bedrooms <= 7)]
  
    return df

def update_property_value(column_name,listing_list, updated_listing_name,df):
    #Replace every property type in listing_list with updated_listing_name
    df[column_name] = df[column_name].replace(listing_list, updated_listing_name)
    return df

In [4]:
df_sf = preprocess(sf_df)



In [5]:
#Determine number of listings per property_type
def number_of_listing(df):
    dicts = {}
    keys = df.property_type.unique()
    for i in keys:
        dicts[i] = len(df[df.property_type==str(i)])
    
    return dicts

sf_listing_dict = number_of_listing(df_sf)

In [6]:
print(sf_listing_dict)

{'Apartment': 2690, 'House': 2162, 'Condominium': 783, 'Townhouse': 138}


# TODO List
* Average square footage of the neighborhoods around SF based on number of bedrooms and bathrooms
* Average price of a property with the above dimensions - Check
* Average monthly price of a property with the above dimensions - Check
* Walkability Index of Data
* Climate Patterns
* Determine the number of days a location was booked
* Update the following neighborhood names
    * Ocean View Terrace
    * Marina District
    * Presidio Heights
    * Civic Center / Van Ness
    * Castro
    * Upper Market
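
As a rough sketch of the average-price items above, the snippet below groups sample listings by neighbourhood, bedrooms and bathrooms. It assumes price columns are stored as strings such as `$1,200.00`; the helper name `price_to_float` and the sample rows are hypothetical and not taken from the notebook.

```python
import pandas as pd

def price_to_float(series):
    """Convert price strings such as '$1,200.00' to floats (assumed format)."""
    cleaned = series.astype(str).str.replace(r'[$,]', '', regex=True)
    # Values that cannot be parsed become NaN instead of raising an error
    return pd.to_numeric(cleaned, errors='coerce')

# Hypothetical rows shaped like the preprocessed listings
sample = pd.DataFrame({
    'neighbourhood': ['Castro', 'Castro', 'Marina'],
    'bedrooms': [1, 1, 2],
    'bathrooms': [1.0, 1.0, 2.0],
    'nightly_price': ['$150.00', '$250.00', '$1,200.00'],
})
sample['nightly_price'] = price_to_float(sample['nightly_price'])

# Average nightly price for each neighbourhood and bedroom/bathroom combination
avg_price = (sample
             .groupby(['neighbourhood', 'bedrooms', 'bathrooms'])['nightly_price']
             .mean())
print(avg_price)
```

The same grouping could be applied to `monthly_price` once it has been converted in the same way.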

# Convert property type to reduce types of homes

### Condominium
* Guesthouse
* Bungalow
* Tiny House
* Cottage

### Apartment
* Serviced Apartment 
* Aparthotel
* Timeshare

### House
* Hostel
* Boutique Hotel 
* Bed and Breakfast
* Villa
* Cabin 
* Earth house

### Drop
* Hotel
* Resort
* Castle
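
One way to express the grouping above is a single lookup table, sketched below. The mapping values come from the lists in this section; the names `PROPERTY_TYPE_MAP`, `DROP_TYPES` and `consolidate_property_types` are hypothetical, and this is an alternative to the list-based `update_property_value` helper rather than the notebook's own code.

```python
import pandas as pd

# Mapping taken from the lists above (names are hypothetical)
PROPERTY_TYPE_MAP = {
    'Guesthouse': 'Condominium',
    'Bungalow': 'Condominium',
    'Tiny house': 'Condominium',
    'Cottage': 'Condominium',
    'Serviced apartment': 'Apartment',
    'Aparthotel': 'Apartment',
    'Timeshare': 'Apartment',
    'Hostel': 'House',
    'Boutique hotel': 'House',
    'Bed and breakfast': 'House',
    'Villa': 'House',
    'Cabin': 'House',
    'Earth house': 'House',
}
DROP_TYPES = ['Hotel', 'Resort', 'Castle']

def consolidate_property_types(df):
    """Drop excluded property types, then map the rest to the reduced set."""
    out = df[~df['property_type'].isin(DROP_TYPES)].copy()
    out['property_type'] = out['property_type'].replace(PROPERTY_TYPE_MAP)
    return out

# Small made-up example
sample = pd.DataFrame({'property_type': ['Villa', 'Castle', 'Cottage',
                                         'Apartment', 'Timeshare']})
result = consolidate_property_types(sample)
print(result['property_type'].tolist())
```

Keeping the mapping in one dictionary makes it easier to audit against the lists in this section. In `preprocess`, the dropped types are removed implicitly by the final filter on the four retained property types.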
