## Problem Statement
#### Compare return on investment for a rental property that is offered on the private market vs on AirBnb


## US Markets to Explore,
* San Francisco 
* New York
* Las Vegas
* Lake Tahoe
* Miami
* Los Angeles

### Property Type
The property must be an Apartment, House, Condominium, Townhouse, Cottage, Guesthouse, Hostel, Bungalow, Boutique hotel, Bed and breakfast, Hotel, Tiny house, Resort, Villa, Timeshare, Cabin, Castle, Earth house with a minimum of 1 bedroom and maximum of 7 bedrooms

### Calculation
After selecting the properties,calculate the rental rates of different areas based on the features to bedroom ratio from traditional rentals and Airbnb listings.

For the Airbnb occupancy rate, we check the number of bookings on a monthly basis, while the Airbnb occupancy rate displayed on the platform is based on the number of bookings that took place in the past 10 months. From there, we are then able to project the Airbnb occupancy rates for other properties based on the property’s rental rate and the property type. Finally, the estimated Airbnb rental income is calculated by multiplying the rental rate by the occupancy rate. 

Data Obtained from:

#### SF, NY
Murray Cox, 2018, “Detailed Listings data for San Francisco”, Inside Airbnb, http://insideairbnb.com/get-the-data.html

#### Explore the following Datasets

* AirBnb
* HomeAway
* Travelocity
* ListHub
* Rent Jungle
* Hotpads
* Zillow

## Ideas to Explore

### Features:
* Investiate Zillow Rental Prices vs Monthyl AirBnb Prices, example compare 1 bedrooms in a specific area
* Seasonal(Destination Type)
* Weather Patters(Climate)
* Walkability Index
* From Neighborhood look at zillow average home prices

In [47]:
#Import libraries
import pandas as pd
import numpy as np

import statsmodels.api as sm
import statsmodels.formula.api as smf
import patsy

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import RidgeCV
%matplotlib inline

#Set display options
pd.set_option('display.max_columns', None)

In [48]:
# Create initial DataFrame
df_sf = pd.DataFrame()
sf_df = pd.read_csv('..\\Data\\San Francisco Data\\listings.csv.gz',
                   error_bad_lines=False)
sf_df = df_sf.append(sf_df)

In [49]:
#Function to preprocess data
def preprocess(df):
    
    #Standardize column names
    df.columns = df.columns.str.strip()
    
    #Keep columns of interest
    df = df[['bathrooms','bedrooms','accommodates','price','monthly_price',\
             'property_type','zipcode','neighbourhood_cleansed','access','amenities','availability_30','availability_60',\
             'availability_90','availability_365','number_of_reviews','review_scores_rating','review_scores_accuracy',\
             'review_scores_cleanliness','review_scores_checkin','review_scores_communication','review_scores_location',\
             'review_scores_value']]
                              
    #Update property types
    #Modify listing names that might be considered a condo
    condo_list=['Guesthouse','Bungalow','Tiny house','Cottage']
    df = update_property_value('property_type',condo_list,'Condominium', df)
    
    #Modify listing names that might be considered an apartment
    apartment_list = ['Serviced apartment','Aparthotel','Timeshare']
    df = update_property_value('property_type',apartment_list,'Apartment', df)
    
    #Modify listing names that might be considered an apartment
    house_list = ['Hostel','Boutique hotel','Bed and breakfast','Villa','Cabin','Earth house']
    df = update_property_value('property_type',house_list,'House', df)
    
    #Keep properties that are an Apartment Townhouse Condominium or House
    df = df.loc[(df['property_type'] == 'Apartment') | (df['property_type'] == 'Townhouse') | (df['property_type'] == 'Condominium')\
                | (df['property_type'] == 'House')]
    
    #Update column names
    df.rename(columns={'neighbourhood_cleansed': 'neighborhood','price': 'nightly_price_$','cleaning_fee':'cleaning_fee_$',\
                       'monthly_price':'monthly_price_$'}, inplace=True)
    
    #Drop bedroom data with non values and convert to integer type
    df = df[np.isfinite(df['bedrooms'])]
    df['bedrooms']= df['bedrooms'].astype(int)
    
    #Drop bedroom data with non values and convert to integer type
    df = df[np.isfinite(df['bathrooms'])]
    
    #Drop rows with 0 bedrooms or bedrooms less than 7 bedrooms
    df = df[(df.bedrooms > 0) & (df.bedrooms <=4)]
    
    #Modify string dollar value of nightly price to float
    df['nightly_price_$']=df['nightly_price_$'].replace('[\$,]', '', regex=True).astype(float)
        
    #Update monthly price by multiplying each value of nightly_price by 30
    df['monthly_price_$']=df['nightly_price_$']*30
    
    #Convert Zipcode values to an integer
    
  
    return df

def update_property_value(column_name,listing_list, updated_listing_name,df):
    for listing in listing_list:
        df[column_name].replace(listing, updated_listing_name, inplace=True)
    return df

In [50]:
df_sf = preprocess(sf_df)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._update_inplace(new_data)


In [51]:
#Determine number of listings per property_type
def number_of_listing(df):
    dicts = {}
    keys = df.property_type.unique()
    for i in keys:
        dicts[i] = len(df[df.property_type==str(i)])
    
    return dicts

sf_listing_dict = number_of_listing(df_sf)

In [52]:
print(sf_listing_dict)

{'Apartment': 2686, 'House': 2133, 'Condominium': 781, 'Townhouse': 137}


In [53]:
#Calculate average of reivew columns
mean_availability_30 = df_sf['availability_30'].mean(skipna=True)
mean_availability_60 = df_sf['availability_60'].mean(skipna=True)
mean_availability_90 = df_sf['availability_90'].mean(skipna=True)
mean_availability_365 = df_sf['availability_365'].mean(skipna=True)
mean_num_reviews = df_sf['number_of_reviews'].mean(skipna=True)
mean_scores_rating = df_sf['review_scores_rating'].mean(skipna=True)
mean_score_accur = df_sf['review_scores_accuracy'].mean(skipna=True)
mean_score_clean = df_sf['review_scores_cleanliness'].mean(skipna=True)
mean_score_checkin = df_sf['review_scores_checkin'].mean(skipna=True)
mean_score_comm = df_sf['review_scores_communication'].mean(skipna=True)
mean_score_loc = df_sf['review_scores_location'].mean(skipna=True)
mean_review_score= df_sf['review_scores_value'].mean(skipna=True)

In [54]:
#Replace column NaN values with mean of column
df_sf['availability_30'].fillna(int(round(mean_availability_30)), inplace=True)
df_sf['availability_60'].fillna(int(round(mean_availability_60)), inplace=True)
df_sf['availability_90'].fillna(int(round(mean_availability_90)), inplace=True)
df_sf['availability_365'].fillna(int(round(mean_availability_365)), inplace=True)

df_sf['number_of_reviews'].fillna(int(round(mean_num_reviews)), inplace=True)
df_sf['review_scores_rating'].fillna(int(round(mean_scores_rating)), inplace=True)
df_sf['review_scores_accuracy'].fillna(int(round(mean_score_accur)), inplace=True)
df_sf['review_scores_cleanliness'].fillna(int(round(mean_score_clean)), inplace=True)
df_sf['review_scores_checkin'].fillna(int(round(mean_score_checkin)), inplace=True)
df_sf['review_scores_communication'].fillna(int(round(mean_score_comm)), inplace=True)

df_sf['review_scores_location'].fillna(int(round(mean_score_loc)), inplace=True)
df_sf['review_scores_value'].fillna(int(round(mean_review_score)), inplace=True)

In [55]:
#Updated neighbourhood names in airbnb dataframe to match those from redfin
old_names = ['Ocean View ','Marina','Presidio','Downtown/Civic Center','Castro/Upper Market']
new_names = ['Ocean View Terrace','Marina District','Presidio Heights','Civic Center / Van Ness','Castro']

df_sf.replace('Ocean View', 'Ocean View Terrace', inplace=True)
df_sf.replace('Marina', 'Marina District',inplace=True)
df_sf.replace('Presidio', 'Presidio Heights',inplace=True)
df_sf.replace('Downtown/Civic Center', 'Civic Center / Van Ness',inplace=True)
df_sf.replace('Castro/Upper Market', 'Castro',inplace=True)


In [39]:
#Bring in zillow home value index and zillow rent index
xlsx_file_z_index = pd.ExcelFile('..\\Data\\San Francisco Data\\bedrooms_average.xlsx')
all_tier_z_index = pd.read_excel(xlsx_file_z_index, 'all_beds')


In [56]:
#Add Zillow Home Value and Rent Index Information for individual neighborhoods
df_sf = pd.merge(df_sf, all_tier_z_index, left_on=['neighborhood','bedrooms'], right_on=['neighborhood','bedrooms']);

In [64]:
df_sf.sample(6)

Unnamed: 0,bathrooms,bedrooms,accommodates,nightly_price_$,monthly_price_$,property_type,zipcode,neighborhood,access,amenities,availability_30,availability_60,availability_90,availability_365,number_of_reviews,review_scores_rating,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value,Zillow Home Value Index $,Zillow Rent Index $
3356,1.0,1,4,118.0,3540.0,House,94122,Outer Sunset,PLENTY FREE STREET PARKING available outside d...,"{TV,""Cable TV"",Wifi,""Free parking on premises""...",0,3,10,45,146,99.0,10.0,10.0,10.0,10.0,9.0,10.0,1304200.0,3948.94
573,1.0,1,1,145.0,4350.0,Apartment,94117,Haight Ashbury,"Guests have access to the dining area, kitchen...","{Wifi,Kitchen,Heating,Washer,Dryer,""Smoke dete...",3,20,20,20,6,97.0,10.0,10.0,10.0,10.0,10.0,9.0,1616300.0,5079.0
135,1.0,1,6,275.0,8250.0,House,94115,Western Addition,Coffeemaker/mini-fridge No kitchen or cooking,"{TV,""Cable TV"",Internet,Wifi,""Free parking on ...",0,0,1,1,16,99.0,10.0,10.0,10.0,10.0,10.0,9.0,1053400.0,3448.0
5216,1.5,1,4,150.0,4500.0,House,94132,Lakeshore,,"{TV,Wifi,Kitchen,""Free street parking"",Heating...",6,11,11,11,25,98.0,10.0,10.0,10.0,10.0,10.0,9.0,1560700.0,4479.0
3173,2.0,2,4,200.0,6000.0,Condominium,94117,Haight Ashbury,You will have access to all of the rooms in my...,"{TV,Wifi,Kitchen,""Indoor fireplace"",Heating,""F...",5,23,53,53,65,99.0,10.0,9.0,10.0,10.0,10.0,10.0,1484600.0,4850.0
4029,2.0,3,6,700.0,21000.0,House,94107,Mission,Guests have access to the entire home aside fr...,"{TV,""Cable TV"",Internet,Wifi,""Air conditioning...",0,0,0,0,11,98.0,10.0,10.0,10.0,10.0,9.0,10.0,1713400.0,5529.0
3106,1.0,1,2,170.0,5100.0,Apartment,94105,Financial District,,"{TV,""Cable TV"",Wifi,Kitchen,Gym,Elevator,""Hot ...",8,38,68,343,0,96.0,10.0,10.0,10.0,10.0,10.0,9.0,1710600.0,4070.0
5468,2.0,3,6,845.0,25350.0,Apartment,94133,Russian Hill,You will have sole access to the full amenitie...,"{TV,""Cable TV"",Internet,Wifi,""Air conditioning...",30,60,90,365,0,96.0,10.0,10.0,10.0,10.0,10.0,9.0,3232100.0,7952.0
3664,1.0,2,5,150.0,4500.0,Apartment,94118,Inner Richmond,Separate apartment entrance on Lake Street.,"{TV,""Cable TV"",Internet,Wifi,Kitchen,""Free str...",9,17,40,294,4,93.0,10.0,10.0,10.0,10.0,9.0,10.0,1334500.0,4511.0
1177,1.0,1,2,160.0,4800.0,Apartment,94107,Potrero Hill,,"{TV,""Cable TV"",Internet,Wifi,Kitchen,""Free str...",7,27,57,147,0,96.0,10.0,10.0,10.0,10.0,10.0,9.0,1516500.0,3841.0


# SF Walkability Info

In [10]:
#Bring in walkability and sales data from redfin
df_sf_info = pd.DataFrame()
sf_df_info = pd.read_csv('..\\Data\\San Francisco Data\\sf_walk_and_sale_df.csv')
sf_df_info = df_sf_info.append(sf_df_info)

In [11]:
#Clean additional Data Points
sf_df_info['walk_score_of_100'][2] = 81
sf_df_info['transit_score_of_100'][2] = 75
sf_df_info['bike_score_of_100'][2] = 71
sf_df_info['median_sale_price'][19] = '$1.25M'
sf_df_info['median_sale_price'][22] = '$1.01M'
sf_df_info['median_sale_price'][22] = '$1.01M'
sf_df_info['median_sale_price'][31] = '$761K'
sf_df_info['median_sale_price'][33] = '$1.35M'

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  after removing the cwd from sys.path.
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://p

In [12]:
#Convert Median Home Prices to Int Value
import re
def convert_string_to_int(df_column):
    for index, row in df_column.iteritems():
        if type(row) == float:
            pass
        elif (row[-1] == 'M') | (row[-1] == 'm'):
            number = re.findall("\d+\.\d+", row)
            df_column[index] = int(float(number[0])*1000000)
                      
        elif (row[-1] == 'K') | (row[-1] == 'k'):
            number = re.findall("\d+", row)
            df_column[index] = int(float(number[0])*100000)

    return df_column

sf_df_info['median_sale_price'] = convert_string_to_int(sf_df_info['median_sale_price'])

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  del sys.path[0]


In [13]:
#Calculate average of sf_df_info columns
mean_walk_score_of_100 = sf_df_info['walk_score_of_100'].mean(skipna=True)
mean_transit_score_of_100 = sf_df_info['transit_score_of_100'].mean(skipna=True)
mean_bike_score_of_100 = sf_df_info['bike_score_of_100'].mean(skipna=True)
mean_median_sale_price = sf_df_info['median_sale_price'].mean(skipna=True)

In [14]:
#Replace column NaN values with mean of column
sf_df_info['walk_score_of_100'].fillna(int(round(mean_walk_score_of_100)), inplace=True)
sf_df_info['transit_score_of_100'].fillna(int(round(mean_transit_score_of_100)), inplace=True)
sf_df_info['bike_score_of_100'].fillna(int(round(mean_bike_score_of_100)), inplace=True)
sf_df_info['median_sale_price'].fillna(int(round(mean_median_sale_price)), inplace=True)

In [15]:
#Save Cleaned SF Info Data to CSV
#Save data to csv file
sf_df_info.to_csv('..\Data\San Francisco Data\cleaned_sf_walk_and_sale_df.csv', index=False)

# One Hot Encoding

In [16]:
#One Hot encode property type data and join to dataframe
df_sf = df_sf.join(pd.get_dummies(df_sf['property_type']));

In [17]:
#One Hot encode neighorhood data and join to dataframe
df_sf = df_sf.join(pd.get_dummies(df_sf['neighborhood']));

In [18]:
#Bring in Walkability and Median Home Value Data
df_sf = pd.merge(df_sf, sf_df_info, left_on='neighborhood', right_on='neighborhood');

In [19]:
#Remove all non values from zipcode column
df_sf = df_sf[(df_sf['zipcode'].notnull())]

In [20]:
#Remove all 'CA' values from zipcode column
df_sf = df_sf[(df_sf['zipcode'] != 'CA')]

In [21]:
df_sf['zipcode'] = df_sf['zipcode'].apply(pd.to_numeric)

In [22]:
df_sf.to_csv('..\Data\San Francisco Data\cleaned_sf_data_with_info.csv', index=False)

In [23]:
df_sf.head(3)

Unnamed: 0,bathrooms,bedrooms,accommodates,nightly_price_$,monthly_price_$,property_type,zipcode,neighborhood,access,amenities,availability_30,availability_60,availability_90,availability_365,number_of_reviews,review_scores_rating,review_scores_accuracy,review_scores_cleanliness,review_scores_checkin,review_scores_communication,review_scores_location,review_scores_value,Apartment,Condominium,House,Townhouse,Bayview,Bernal Heights,Castro,Chinatown,Civic Center / Van Ness,Crocker Amazon,Diamond Heights,Excelsior,Financial District,Glen Park,Golden Gate Park,Haight Ashbury,Inner Richmond,Inner Sunset,Lakeshore,Marina District,Mission,Nob Hill,Noe Valley,North Beach,Ocean View Terrace,Outer Mission,Outer Richmond,Outer Sunset,Pacific Heights,Parkside,Potrero Hill,Presidio Heights,Russian Hill,Seacliff,South of Market,Twin Peaks,Visitacion Valley,West of Twin Peaks,Western Addition,walk_score_of_100,transit_score_of_100,bike_score_of_100,median_sale_price
0,1.0,1,3,170.0,5100.0,Apartment,94117,Western Addition,*Full access to patio and backyard (shared wit...,"{TV,""Cable TV"",Internet,Wifi,Kitchen,""Pets liv...",1,1,2,64,180,97.0,10.0,10.0,10.0,10.0,10.0,10.0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,96.0,89.0,89.0,76500000
1,1.5,2,5,785.0,23550.0,House,94117,Western Addition,Guests have access to everything listed and sh...,"{TV,Internet,Wifi,Kitchen,Heating,""Family/kid ...",30,60,90,90,27,97.0,10.0,10.0,10.0,10.0,10.0,9.0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,96.0,89.0,89.0,76500000
2,1.0,2,6,255.0,7650.0,Apartment,94115,Western Addition,,"{TV,""Cable TV"",Internet,Wifi,Kitchen,""Free par...",5,7,15,25,31,90.0,9.0,8.0,10.0,10.0,9.0,9.0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,96.0,89.0,89.0,76500000


# Master Cell

In [None]:
#Drop Property Type After One Hot Encoding
df_sf.drop(['property_type'], axis=1, inplace=True)

In [None]:
access_key_words = ['patio','backyard','garden','washer','dryer,grill','kitchen',]
amenities = ['Pool','Kitchen','Free Parking','Pet','Toilet paper','Soap','towel','pillow','linens']
             
'TV', 'Internet','Wifi','Kitchen','Paid parking off premises','Free street parking','Indoor fireplace','Heating',\
'Family/kid friendly','Smoke detector','Carbon monoxide detector','Fire extinguisher','Essentials,Shampoo',\
'24-hour check-in','Hangers','Hair dryer','Iron','Laptop friendly', 'workspace','Self check-in','Lockbox',\
'Bathtub','Hot water','Microwave','Coffee maker','Refrigerator','Dishes and silverware','Cooking basics',\
'Patio or balcony','Garden','backyard','Luggage','Long term','Wide hallway clearance','Wide doorway',\
'Flat path to front door','Well-lit','path','Step-free','Wide doorway','Wide clearance','Accessible',\
'Step-free access','Indoor fireplace','Heating','Family/kid friendly','Bathtub','Self check-in','Laundry',\
'Fire extinguisher','deck','fridge','Fireplace','Heating','lock','iron','garage','beach','','','','','','',\
'','','','','','','','','','','','','','','',\
             

In [None]:
len(result)

In [None]:
#TODO
#Figure out a way to create a dictionary from this list, and from this dictionary randomly fill in zip code data in the master data frame.

#For now we are dropping the 157 zipcodes that have no zip code and those with non zipcode values

df_missing_zip_codes = df_sf.loc[df_sf['zipcode'].astype(str).str.contains('nan')]

#List of neighborhoods with missing zipcodies
df_missing_zip_codes = list(df_missing_zip_codes.groupby(['neighborhood','zipcode']).groups.keys())

df_zip_code_list = list(df_sf.groupby(['neighborhood','zipcode']).groups.keys())

In [None]:
df_sf.to_csv('..\Data\San Francisco Data\merge_sf_walk_and_sale_df.csv', index=False)

In [None]:
#df_sf_with_info = df_sf.merge(sf_df_info)
-Forcasting Wine Price per my Data

* Get Dataset, run.info, what type my data is, look at heat map, postively and negatively corelated things, pair plot, general
patterns, box and violen plots, categorical bar charts, Look at how patterns change

Create a hypothesis of what my linear model will tell me

Fit Basic Model, before feature extraction, Look at features, and evaluation 

Linear regression assumptions, residuals are normall distributed

Grid search and standard scalar

# Data Analysis

In [None]:
df_sf_test=df_sf.copy()

In [None]:
df_sf_test.drop(['neighborhood','access','amenities'], axis=1, inplace=True);

In [None]:
df_sf_test.corr()

In [None]:
fig, ax = plt.subplots(figsize=(10,10))
sns.heatmap(df_sf_test.corr(), cmap="seismic", annot=True, vmin=-1, vmax=1, ax=ax);

In [None]:
sns.pairplot(df_sf_test, aspect=1.5);

In [None]:
columns = ['X0' , 'X1' , 'X2' , 'X3' ,'X4', 'X5' ,'X6' ,'X7' ,'X8' , 'X9' , 'X10' , 'X11' , 'X12' , 'X13' , 'X14' , 'X15' , 'X16' , 'X17' , 'X18' , 'X19' , 'X20' , 'X21' , 'X22' , 'X23' , 'X24' , 'X25' , 'X26' , 'X27' , 'X28' , 'X29' , 'X30' , 'X31' , 'X32' , 'X33' , 'X34' , 'X35' , 'X36' , 'X37' , 'X38' , 'X39' , 'X40' , 'X41' , 'X42' , 'X43' , 'X44' , 'X45' , 'X46' , 'X47' , 'X48', 'X49' , 'X50','X51']

In [None]:
y,X = patsy.dmatrices('X4 ~ X0 + X1 + X2 + X5 + X9 + X10 + X11 + X12 + X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 + X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30 + X31 + X32 + X33 + X34 + X35 + X36 + X37 + X38 + X39 + X40 + X41 + X42 + X43 + X44 + X45 + X46 + X47 + X48 + X49 + X50' , data=df_sf_test, return_type="dataframe")

# TODO List
* Average square footage of the neighborhoods around sf based on number of bedrooms and bathrooms
* Average price of a property with the above dimensions-Check
* Average monthly price of a property with the above dimensions-Check
* Walkability Index of Data-Check
* Climate Patterns
* Determine number of days location was booked
* Update the following neighborhood names - Check
    ##### Ocean View Terrace
    ##### Marina District
    ##### Presidio Heights
    ##### Civic Center / Van Ness
    ##### Castro
    ##### Upper Market
* Drop transit data as we have a transit score available to us via redfind

# Convert property type to reduce types of homes

### Condominum
* Guesthouse
* Bungalow
* Tiny House
* Cottage

### Apartment
* Serviced Apartment 
* Aparthotel
* Timeshare

### House
* Hostel
* Boutique Hotel 
* Bed and Breakfast
* Villa
* Cabin 
* Earth house

### Drop
* Hotel
* Resort
* Castle
