*In this notebook:*<br>
CRISP-DM: Steps 1, 2 and 3<br>
Posing questions for **Business Understanding**, then **Understanding the Data** while **Preparing** it for analysis
___

The listings data set contains listings from Seattle with descriptive and rating information about the host and the property.<br>
It also contains transactional information such as the price, fees and guest requirements.<br>
Focusing on the listings data set keeps the option open to add new features from the other two data sets later if necessary.<br>

## The Business Objective

The Airbnb platform is a service to bring two parties together, the host and guest.<br>
*For further analysis let us assume the perspective of a host.*<br>

**What is the host's objective?**

A host usually already has a property and does not need to acquire one to rent it out.<br>
The host wants to make extra money by renting the property out to guests.<br>
A host naturally wants to maximize the income from a property. One way to achieve that is to ask a high price per night.<br>
But how high can the price reasonably go before it becomes unrealistic and drives away potential guests?<br>

**Therefore the questions we want to answer based on the given data are:**
1. Which parameters influence a listing's price?
1. Which parameters can the host use to improve price and value?
1. Can we make a good price estimate for a new offer to assist a (new) host?

In [1]:
import pandas as pd

from ExploreData import value_counter, index_by_key
from TransformData import \
    price_transform, rate_transform, split_column_values, date_transform

# Load the listings document into a data frame
df_listings = pd.read_csv('./data/listings.csv')

## Understanding and Preparing the Data for Further Analysis

Before analyzing the data, some minimal cleaning needs to be done.<br>

1. We already learned that some columns can be neglected.
2. Some columns that are very important for our questions have the wrong data type.

Since our questions are phrased openly, we might want to work with many columns in the data set.
Thus we should take a close look at all categorical columns that might be of interest.

### 1. Drop Columns

It has already been established in notebook number 00 that some columns can be dropped, since they carry either no information for the questions at hand or no information at all.

In [2]:
# Create an overview of the columns
list_column_values = value_counter(df_listings)

# Identify columns with constant values
list_const_drop = list_column_values.loc[
    list_column_values['val_count']==1
    ].index

# Identify columns with (almost) no values: at least 90% missing
list_nan_drop = list_column_values.loc[
    list_column_values['nan_pcnt']>=90
    ].index

# Identify url columns
list_url_drop = df_listings.columns[
    df_listings.columns.to_series().str.contains('url', case=False)]

# Additional columns to drop
list_else_drop = df_listings[[
    'city', 'state', 'smart_location',
    'host_name', 'host_location',
    'neighbourhood', 'neighbourhood_cleansed'
    ]].columns

# Merge all column names into one list
drop_columns_list = list_const_drop.append(
    list_nan_drop).append(
    list_url_drop).append(
    list_else_drop)

# Drop the columns
listings_drop_col = df_listings.drop(columns=drop_columns_list)

# Check what has been dropped
df_listings[drop_columns_list].head()


Unnamed: 0,scrape_id,last_scraped,experiences_offered,market,country_code,country,has_availability,calendar_last_scraped,requires_license,jurisdiction_names,...,host_url,host_thumbnail_url,host_picture_url,city,state,smart_location,host_name,host_location,neighbourhood,neighbourhood_cleansed
0,20160104002432,2016-01-04,none,Seattle,US,United States,t,2016-01-04,f,WASHINGTON,...,https://www.airbnb.com/users/show/956883,https://a0.muscache.com/ac/users/956883/profil...,https://a0.muscache.com/ac/users/956883/profil...,Seattle,WA,"Seattle, WA",Maija,"Seattle, Washington, United States",Queen Anne,West Queen Anne
1,20160104002432,2016-01-04,none,Seattle,US,United States,t,2016-01-04,f,WASHINGTON,...,https://www.airbnb.com/users/show/5177328,https://a0.muscache.com/ac/users/5177328/profi...,https://a0.muscache.com/ac/users/5177328/profi...,Seattle,WA,"Seattle, WA",Andrea,"Seattle, Washington, United States",Queen Anne,West Queen Anne
2,20160104002432,2016-01-04,none,Seattle,US,United States,t,2016-01-04,f,WASHINGTON,...,https://www.airbnb.com/users/show/16708587,https://a1.muscache.com/ac/users/16708587/prof...,https://a1.muscache.com/ac/users/16708587/prof...,Seattle,WA,"Seattle, WA",Jill,"Seattle, Washington, United States",Queen Anne,West Queen Anne
3,20160104002432,2016-01-04,none,Seattle,US,United States,t,2016-01-04,f,WASHINGTON,...,https://www.airbnb.com/users/show/9851441,https://a2.muscache.com/ac/users/9851441/profi...,https://a2.muscache.com/ac/users/9851441/profi...,Seattle,WA,"Seattle, WA",Emily,"Seattle, Washington, United States",Queen Anne,West Queen Anne
4,20160104002432,2016-01-04,none,Seattle,US,United States,t,2016-01-04,f,WASHINGTON,...,https://www.airbnb.com/users/show/1452570,https://a0.muscache.com/ac/users/1452570/profi...,https://a0.muscache.com/ac/users/1452570/profi...,Seattle,WA,"Seattle, WA",Emily,"Seattle, Washington, United States",Queen Anne,West Queen Anne
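
The `value_counter` helper used above comes from the local `ExploreData` module, whose source is not part of this notebook. As a rough idea of what such a helper might compute, here is a minimal sketch. The function name `value_counter_sketch` and the demo frame are hypothetical; the output column names are taken from the overview tables in this notebook, and it is an assumption that missing values are not counted as distinct values.

```python
import pandas as pd

def value_counter_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: distinct values, missing values and dtype."""
    n_rows = len(df)
    summary = pd.DataFrame({
        'val_count': df.nunique(),        # distinct non-missing values (assumption)
        'nan_count': df.isna().sum(),     # number of missing entries
        'col_dtype': df.dtypes.astype(str),
    })
    # Express both counts as a percentage of all rows
    summary['val_pcnt'] = summary['val_count'] / n_rows * 100
    summary['nan_pcnt'] = summary['nan_count'] / n_rows * 100
    return summary[
        ['val_count', 'nan_count', 'val_pcnt', 'nan_pcnt', 'col_dtype']]

# Hypothetical demo data: one constant, one empty and one useful column
demo = pd.DataFrame({
    'country': ['US', 'US', 'US', 'US'],
    'license': [None, None, None, None],
    'price': ['$85.00', '$150.00', None, '$100.00'],
})
overview = value_counter_sketch(demo)

# The same two filters as in the cell above
const_cols = overview.index[overview['val_count'] == 1]
empty_cols = overview.index[overview['nan_pcnt'] >= 90]
```

With a summary of this shape, the constant and the nearly empty columns fall out of two simple boolean filters, which is what the drop step relies on.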


### 2. Transform Object-Columns

In [3]:
# Isolate object column names
cat_listings = listings_drop_col.select_dtypes(include=['object'])

# Create a df with column information
cat_column_values = value_counter(cat_listings)

# How many categorical values are there?
cat_column_values.shape[0]


38

There are 38 object columns. Not all of them are of use for the questions at hand.<br>
So we look at them one by one.

**Starting with columns that have many different values**<br>
Many of these are text and have more than 80% unique values.<br>
But there is also `amenities`, where each entry is essentially a list of values. These can be extracted into a set of amenity columns.<br>
The `price` can also be converted to a numeric type after dropping the $ sign.

In [4]:
# Find categorical columns with more than 200 unique values
columns = cat_column_values.loc[cat_column_values['val_count']>200]

columns

Unnamed: 0,val_count,nan_count,val_pcnt,nan_pcnt,col_dtype
name,3792,0,99.319015,0.0,object
summary,3478,177,91.094814,4.635935,object
space,3119,569,81.691985,14.903091,object
description,3742,0,98.009429,0.0,object
neighborhood_overview,2506,1032,65.636459,27.029859,object
notes,1999,1606,52.357255,42.063908,object
transit,2574,934,67.417496,24.46307,object
host_since,1380,2,36.144578,0.052383,object
host_about,2011,859,52.671556,22.49869,object
street,1442,0,37.768465,0.0,object


In [5]:
# Check the columns in the data-set
listings_drop_col[columns.index].head()

Unnamed: 0,name,summary,space,description,neighborhood_overview,notes,transit,host_since,host_about,street,amenities,price,weekly_price,monthly_price,first_review,last_review
0,Stylish Queen Anne Apartment,,Make your self at home in this charming one-be...,Make your self at home in this charming one-be...,,,,2011-08-11,"I am an artist, interior designer, and run a s...","Gilman Dr W, Seattle, WA 98119, United States","{TV,""Cable TV"",Internet,""Wireless Internet"",""A...",$85.00,,,2011-11-01,2016-01-02
1,Bright & Airy Queen Anne Apartment,Chemically sensitive? We've removed the irrita...,"Beautiful, hypoallergenic apartment in an extr...",Chemically sensitive? We've removed the irrita...,"Queen Anne is a wonderful, truly functional vi...",What's up with the free pillows? Our home was...,"Convenient bus stops are just down the block, ...",2013-02-21,Living east coast/left coast/overseas. Time i...,"7th Avenue West, Seattle, WA 98119, United States","{TV,Internet,""Wireless Internet"",Kitchen,""Free...",$150.00,"$1,000.00","$3,000.00",2013-08-19,2015-12-29
2,New Modern House-Amazing water view,New modern house built in 2013. Spectacular s...,"Our house is modern, light and fresh with a wa...",New modern house built in 2013. Spectacular s...,Upper Queen Anne is a charming neighborhood fu...,Our house is located just 5 short blocks to To...,A bus stop is just 2 blocks away. Easy bus a...,2014-06-12,i love living in Seattle. i grew up in the mi...,"West Lee Street, Seattle, WA 98119, United States","{TV,""Cable TV"",Internet,""Wireless Internet"",""A...",$975.00,,,2014-07-30,2015-09-03
3,Queen Anne Chateau,A charming apartment that sits atop Queen Anne...,,A charming apartment that sits atop Queen Anne...,,,,2013-11-06,,"8th Avenue West, Seattle, WA 98119, United States","{Internet,""Wireless Internet"",Kitchen,""Indoor ...",$100.00,$650.00,"$2,300.00",,
4,Charming craftsman 3 bdm house,Cozy family craftman house in beautiful neighb...,Cozy family craftman house in beautiful neighb...,Cozy family craftman house in beautiful neighb...,We are in the beautiful neighborhood of Queen ...,Belltown,The nearest public transit bus (D Line) is 2 b...,2011-11-29,"Hi, I live in Seattle, Washington but I'm orig...","14th Ave W, Seattle, WA 98119, United States","{TV,""Cable TV"",Internet,""Wireless Internet"",Ki...",$450.00,,,2012-07-10,2015-10-24


#### Amenities
The column has many unique values, but each entry is a list and the set of unique list entries is fairly small.<br>

**Create a column for each unique amenity with values 0 and 1**

In [6]:
# First: identify all possible values.
# Create an auxiliary list to collect the values from all split entries
all_splitted = []
# Split every entry into a list of amenities and append to 'all_splitted'
for i in listings_drop_col.index:
    entry_split = listings_drop_col['amenities'][i].replace(
        '{', '').replace('}', '').replace('"', '').split(sep=",")
    all_splitted = all_splitted+entry_split

# Create 'all_entries' by removing duplicates from 'all_splitted'
all_entries = list(set(all_splitted))
# Remove the empty string (e.g. produced by splitting an empty entry)
all_entries.remove('')

# Second: Use the 'split_column_values' function
# to create a new column for every value
split_column_values(
    listings_drop_col,  # data-frame
    'amenities',  # column name
    all_entries,  # values in a list
    'amenities_'  # prefix for new columns
    )

# Drop the original column 'amenities'
listings_drop_col.drop(columns=['amenities'], inplace=True)

# Check the result
listings_drop_col[index_by_key(listings_drop_col, ['amenities'])].head()

Unnamed: 0,amenities_Other pet(s),amenities_Washer / Dryer,amenities_Indoor Fireplace,amenities_Cat(s),amenities_Air Conditioning,amenities_Free Parking on Premises,amenities_Dog(s),amenities_Gym,amenities_Carbon Monoxide Detector,amenities_TV,...,amenities_First Aid Kit,amenities_Wheelchair Accessible,amenities_Fire Extinguisher,amenities_Suitable for Events,amenities_Wireless Internet,amenities_Heating,amenities_Safety Card,amenities_Pets live on this property,amenities_Hair Dryer,amenities_Pool
0,0,0,0,0,1,0,0,0,0,1,...,0,0,0,0,1,1,0,0,0,0
1,0,0,0,0,0,1,0,0,1,1,...,1,0,1,0,1,1,1,0,0,0
2,0,0,1,1,1,1,1,0,1,1,...,0,0,0,0,1,1,0,1,0,0
3,0,0,1,0,0,0,0,0,1,0,...,0,0,1,0,1,1,1,0,0,0
4,0,0,0,0,0,0,0,0,1,1,...,1,0,1,0,1,1,0,0,0,0
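
The body of `split_column_values` lives in the local `TransformData` module and is not shown here. As an alternative sketch of the same idea (one 0/1 column per amenity), pandas' built-in `Series.str.get_dummies` can be used. This is not the author's helper; the sample strings below are made up in the format seen in the `amenities` column.

```python
import pandas as pd

# Hypothetical sample in the same format as the 'amenities' column
amenities = pd.Series([
    '{TV,"Cable TV",Internet}',
    '{Internet,Kitchen}',
    '{}',
])

# Strip the surrounding braces and the quotes around multi-word values
cleaned = (
    amenities
    .str.strip('{}')
    .str.replace('"', '', regex=False)
)

# One indicator column per distinct value; an empty-name column, if any
# string backend produces one for empty entries, is dropped explicitly
amenity_flags = (
    cleaned.str.get_dummies(sep=',')
    .drop(columns=[''], errors='ignore')
    .astype(int)
    .add_prefix('amenities_')
)
```

The resulting frame could then be joined to the listings with `pd.concat([...], axis=1)`. Like the loop above, this splits on every comma, so it assumes that no amenity name contains a comma itself.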


#### Date Columns

There are three dates in the above extract.
* `first_review`
* `last_review`
* `host_since`

At this point it is not clear whether they are all needed.<br>
But it is no big step to transform them into a date type.<br>
The function `date_transform()` also creates separate columns for day, month and year.

In [7]:
# Transform all dates into three columns
date_transform(listings_drop_col, 'host_since', 'host_since_')
date_transform(listings_drop_col, 'first_review', 'first_review_')
date_transform(listings_drop_col, 'last_review', 'last_review_')
# Drop original columns ('columns=' already implies axis=1)
listings_drop_col = listings_drop_col.drop(
    columns=['host_since','first_review','last_review']
    )
# Check the result:
listings_drop_col[['host_since_year']].head()

Unnamed: 0,host_since_year
0,2011.0
1,2013.0
2,2014.0
3,2013.0
4,2011.0
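
`date_transform` is also imported from `TransformData`, so its implementation is not visible in this notebook. A minimal sketch of a function with the same call signature might look like the following. The name `date_transform_sketch` is hypothetical, and it is an assumption that the helper modifies the frame in place and appends `day`, `month` and `year` to the prefix, as the resulting column names such as `host_since_year` suggest.

```python
import pandas as pd

def date_transform_sketch(df, column, prefix):
    """Parse a date column and add <prefix>day/month/year columns in place."""
    # Unparseable or missing entries become NaT instead of raising
    dates = pd.to_datetime(df[column], errors='coerce')
    df[prefix + 'day'] = dates.dt.day
    df[prefix + 'month'] = dates.dt.month
    df[prefix + 'year'] = dates.dt.year

# Hypothetical demo data with one missing date
demo = pd.DataFrame({'host_since': ['2011-08-11', '2013-02-21', None]})
date_transform_sketch(demo, 'host_since', 'host_since_')
```

Because missing dates turn into `NaN`, the new columns are stored as floats, which matches the `2011.0` style of the output above.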


The columns with more than 200 unique values also include `price`, but there are more columns with currency values.<br>
We check the remaining columns first and then transform all currency values at once.

In [8]:
# Create an overview of the column values
columns = cat_column_values.loc[cat_column_values['val_count']<=200]

columns

Unnamed: 0,val_count,nan_count,val_pcnt,nan_pcnt,col_dtype
host_response_time,4,523,0.104767,13.698271,object
host_response_rate,45,523,1.178628,13.698271,object
host_acceptance_rate,2,773,0.052383,20.246202,object
host_is_superhost,2,2,0.052383,0.052383,object
host_neighbourhood,102,300,2.671556,7.857517,object
host_verifications,116,0,3.03824,0.0,object
host_has_profile_pic,2,2,0.052383,0.052383,object
host_identity_verified,2,2,0.052383,0.052383,object
neighbourhood_group_cleansed,17,0,0.445259,0.0,object
zipcode,28,7,0.733368,0.183342,object


In [9]:
# Inspect the selected columns
listings_drop_col[columns.index].head()

Unnamed: 0,host_response_time,host_response_rate,host_acceptance_rate,host_is_superhost,host_neighbourhood,host_verifications,host_has_profile_pic,host_identity_verified,neighbourhood_group_cleansed,zipcode,...,room_type,bed_type,security_deposit,cleaning_fee,extra_people,calendar_updated,instant_bookable,cancellation_policy,require_guest_profile_picture,require_guest_phone_verification
0,within a few hours,96%,100%,f,Queen Anne,"['email', 'phone', 'reviews', 'kba']",t,t,Queen Anne,98119,...,Entire home/apt,Real Bed,,,$5.00,4 weeks ago,f,moderate,f,f
1,within an hour,98%,100%,t,Queen Anne,"['email', 'phone', 'facebook', 'linkedin', 're...",t,t,Queen Anne,98119,...,Entire home/apt,Real Bed,$100.00,$40.00,$0.00,today,f,strict,t,t
2,within a few hours,67%,100%,f,Queen Anne,"['email', 'phone', 'google', 'reviews', 'jumio']",t,t,Queen Anne,98119,...,Entire home/apt,Real Bed,"$1,000.00",$300.00,$25.00,5 weeks ago,f,strict,f,f
3,,,,f,Queen Anne,"['email', 'phone', 'facebook', 'reviews', 'jum...",t,t,Queen Anne,98119,...,Entire home/apt,Real Bed,,,$0.00,6 months ago,f,flexible,f,f
4,within an hour,100%,,f,Queen Anne,"['email', 'phone', 'facebook', 'reviews', 'kba']",t,t,Queen Anne,98119,...,Entire home/apt,Real Bed,$700.00,$125.00,$15.00,7 weeks ago,f,strict,f,f


#### Currency Columns

There are six columns containing $-values. To transform them into numbers, one can use the `price_transform` function.

In [10]:
# Identify currency columns
search_values = ['price', 'fee', 'deposit', 'extra']
price_column_names = index_by_key(listings_drop_col, search_values)

# Check before transformation
listings_drop_col[price_column_names].head()

Unnamed: 0,price,weekly_price,monthly_price,security_deposit,cleaning_fee,extra_people
0,$85.00,,,,,$5.00
1,$150.00,"$1,000.00","$3,000.00",$100.00,$40.00,$0.00
2,$975.00,,,"$1,000.00",$300.00,$25.00
3,$100.00,$650.00,"$2,300.00",,,$0.00
4,$450.00,,,$700.00,$125.00,$15.00
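
The `index_by_key` helper is not defined in this notebook. Judging from how it is called, it selects column names that contain any of the given keys. A minimal sketch of that behaviour in plain Python could look like this; the name `index_by_key_sketch` is hypothetical, and unlike the original it takes the column names directly rather than the data frame.

```python
import re

def index_by_key_sketch(column_names, keys):
    """Return the column names that contain at least one of the keys."""
    # Keys are joined into one case-insensitive regular expression,
    # so regex metacharacters inside a key would be interpreted as such
    pattern = re.compile('|'.join(keys), flags=re.IGNORECASE)
    return [name for name in column_names if pattern.search(name)]

# Hypothetical subset of the listings column names
names = ['price', 'weekly_price', 'cleaning_fee', 'room_type', 'extra_people']
matches = index_by_key_sketch(names, ['price', 'fee', 'deposit', 'extra'])
```

It would be called as `index_by_key_sketch(listings_drop_col.columns, search_values)`. Matching on substrings is convenient, but it is worth checking the result, since a short key can also hit unrelated columns.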


In [11]:
# The price columns are strings: convert them to float
for col in price_column_names:
    listings_drop_col[col] = price_transform(listings_drop_col[col])

# Check after transformation
listings_drop_col[price_column_names].head()

Unnamed: 0,price,weekly_price,monthly_price,security_deposit,cleaning_fee,extra_people
0,85.0,,,,,5.0
1,150.0,1000.0,3000.0,100.0,40.0,0.0
2,975.0,,,1000.0,300.0,25.0
3,100.0,650.0,2300.0,,,0.0
4,450.0,,,700.0,125.0,15.0
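
Since `price_transform` is imported from `TransformData`, its code is not shown here. One way such a conversion could be written is sketched below; `price_transform_sketch` is a hypothetical name and the sample values only imitate the format of the currency columns.

```python
import pandas as pd

def price_transform_sketch(series: pd.Series) -> pd.Series:
    """Turn strings such as '$1,000.00' into floats; missing values stay NaN."""
    # Remove the dollar sign and the thousands separator
    cleaned = series.str.replace(r'[$,]', '', regex=True)
    return pd.to_numeric(cleaned, errors='coerce')

# Hypothetical sample in the format of the currency columns
prices = pd.Series(['$85.00', '$1,000.00', None])
converted = price_transform_sketch(prices)
```

Using `errors='coerce'` means that any unexpected string silently becomes `NaN`, so counting missing values before and after the conversion is a cheap sanity check.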


#### Rates

Columns containing a rate can be handled just like the currency columns, this time with the `rate_transform` function.

In [12]:
# Identify columns
search_values_rate = ['rate']
rate_column_names  = index_by_key(listings_drop_col, search_values_rate)

listings_drop_col[rate_column_names].head()

Unnamed: 0,host_response_rate,host_acceptance_rate
0,96%,100%
1,98%,100%
2,67%,100%
3,,
4,100%,


In [13]:
# Transform to float and check the result
for col in rate_column_names:
    listings_drop_col[col] = rate_transform(listings_drop_col[col])

listings_drop_col[rate_column_names].head()

Unnamed: 0,host_response_rate,host_acceptance_rate
0,96.0,100.0
1,98.0,100.0
2,67.0,100.0
3,,
4,100.0,
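
As with the other helpers, the body of `rate_transform` is not part of this notebook. A minimal sketch, assuming it only needs to strip the trailing percent sign, could be the following; the name `rate_transform_sketch` is hypothetical.

```python
import pandas as pd

def rate_transform_sketch(series: pd.Series) -> pd.Series:
    """Turn strings such as '96%' into floats; missing values stay NaN."""
    return pd.to_numeric(series.str.rstrip('%'), errors='coerce')

# Hypothetical sample in the format of the rate columns
rates = pd.Series(['96%', '100%', None])
converted_rates = rate_transform_sketch(rates)
```

The values stay on the 0 to 100 scale, as in the output above; dividing by 100 would be an additional step if fractions are preferred.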


#### Binary Object-Columns

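Many columns contain only boolean information stored as the strings 't' and 'f'. These get mapped to 1 and 0.<br>
The selection by value count also picks up `host_acceptance_rate`, which has only two distinct values; it was already converted to a number above, so the dtype check in the mapping step skips it.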

In [14]:
# Identify binary object columns
binary_cols = cat_column_values.loc[cat_column_values['val_count']==2 ].index

listings_drop_col[binary_cols].head()

Unnamed: 0,host_acceptance_rate,host_is_superhost,host_has_profile_pic,host_identity_verified,is_location_exact,instant_bookable,require_guest_profile_picture,require_guest_phone_verification
0,100.0,f,t,t,t,f,f,f
1,100.0,t,t,t,t,f,t,t
2,100.0,f,t,t,t,f,f,f
3,,f,t,t,t,f,f,f
4,,f,t,t,t,f,f,f


In [15]:
# Define a value map
binary_map = {'t': 1, 'f': 0}

# Map binary object-columns to 0 and 1
for col in binary_cols:
    if listings_drop_col[col].dtype == 'object':
        listings_drop_col[col] = listings_drop_col[col].map(binary_map)

listings_drop_col[binary_cols].head()

Unnamed: 0,host_acceptance_rate,host_is_superhost,host_has_profile_pic,host_identity_verified,is_location_exact,instant_bookable,require_guest_profile_picture,require_guest_phone_verification
0,100.0,0.0,1.0,1.0,1,0,0,0
1,100.0,1.0,1.0,1.0,1,0,1,1
2,100.0,0.0,1.0,1.0,1,0,0,0
3,,0.0,1.0,1.0,1,0,0,0
4,,0.0,1.0,1.0,1,0,0,0


#### Ordinal Object-Columns

Two columns, `host_response_time` and `cancellation_policy`, have only a few values, and these have a natural order.<br>
We replace the strings with increasing numbers that follow this order.

In [16]:
listings_drop_col[['cancellation_policy', 'host_response_time']].head()

Unnamed: 0,cancellation_policy,host_response_time
0,moderate,within a few hours
1,strict,within an hour
2,strict,within a few hours
3,flexible,
4,strict,within an hour


In [17]:
# Values for host_response_time
listings_drop_col['host_response_time'].value_counts()

within an hour        1692
within a few hours     968
within a day           597
a few days or more      38
Name: host_response_time, dtype: int64

In [18]:
# Values for cancellation_policy
listings_drop_col['cancellation_policy'].value_counts()

strict      1417
moderate    1251
flexible    1150
Name: cancellation_policy, dtype: int64

In [19]:
# Create value maps for both columns
policy_map = {'strict':2, 'moderate':1, 'flexible':0}
response_map = {'a few days or more':3, 'within a day':2, 'within a few hours':1, 'within an hour':0}

# Map both columns to new values
listings_drop_col['host_response_time'] = listings_drop_col['host_response_time'].map(response_map)
listings_drop_col['cancellation_policy'] = listings_drop_col['cancellation_policy'].map(policy_map)

# Check
listings_drop_col[['cancellation_policy', 'host_response_time']].head()

Unnamed: 0,cancellation_policy,host_response_time
0,1,1.0
1,2,0.0
2,2,1.0
3,0,
4,2,0.0
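
The dictionary mapping above is one way to encode ordered categories. As an alternative sketch, pandas' ordered `Categorical` type derives the same codes from a list that states the order once. The variable names and the sample values here are hypothetical.

```python
import pandas as pd

# The order of this list defines the codes 0, 1, 2
policy_order = ['flexible', 'moderate', 'strict']

# Hypothetical sample including one missing value
policies = pd.Series(['moderate', 'strict', 'flexible', None])

ordered = pd.Categorical(policies, categories=policy_order, ordered=True)

# Categorical codes use -1 for missing values; turn those back into NaN
codes = pd.Series(ordered.codes, index=policies.index)
codes = codes.where(codes >= 0)
```

Either way, the numbers only express rank. Treating the distance between 'flexible' and 'moderate' as equal to the distance between 'moderate' and 'strict' is a modelling assumption.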


#### host_verifications

There is one column left that should be transformed at this point.<br>
`host_verifications` looks similar to `amenities`, with only small differences.

**Create a column for each unique verification method with values 0 and 1**

In [20]:
# What do the host columns look like?
search_values_host = ['host', 'type']
host_column_names  = listings_drop_col.columns[
    listings_drop_col.columns.to_series().str.contains(
        '|'.join(search_values_host),case=False
        )
    ]

listings_drop_col[host_column_names].head()

Unnamed: 0,host_id,host_about,host_response_time,host_response_rate,host_acceptance_rate,host_is_superhost,host_neighbourhood,host_listings_count,host_total_listings_count,host_verifications,host_has_profile_pic,host_identity_verified,property_type,room_type,bed_type,calculated_host_listings_count,host_since_day,host_since_month,host_since_year
0,956883,"I am an artist, interior designer, and run a s...",1.0,96.0,100.0,0.0,Queen Anne,3.0,3.0,"['email', 'phone', 'reviews', 'kba']",1.0,1.0,Apartment,Entire home/apt,Real Bed,2,11.0,8.0,2011.0
1,5177328,Living east coast/left coast/overseas. Time i...,0.0,98.0,100.0,1.0,Queen Anne,6.0,6.0,"['email', 'phone', 'facebook', 'linkedin', 're...",1.0,1.0,Apartment,Entire home/apt,Real Bed,6,21.0,2.0,2013.0
2,16708587,i love living in Seattle. i grew up in the mi...,1.0,67.0,100.0,0.0,Queen Anne,2.0,2.0,"['email', 'phone', 'google', 'reviews', 'jumio']",1.0,1.0,House,Entire home/apt,Real Bed,2,12.0,6.0,2014.0
3,9851441,,,,,0.0,Queen Anne,1.0,1.0,"['email', 'phone', 'facebook', 'reviews', 'jum...",1.0,1.0,Apartment,Entire home/apt,Real Bed,1,6.0,11.0,2013.0
4,1452570,"Hi, I live in Seattle, Washington but I'm orig...",0.0,100.0,,0.0,Queen Anne,2.0,2.0,"['email', 'phone', 'facebook', 'reviews', 'kba']",1.0,1.0,House,Entire home/apt,Real Bed,1,29.0,11.0,2011.0


In [21]:
listings_drop_col['host_verifications'].head()

0                 ['email', 'phone', 'reviews', 'kba']
1    ['email', 'phone', 'facebook', 'linkedin', 're...
2     ['email', 'phone', 'google', 'reviews', 'jumio']
3    ['email', 'phone', 'facebook', 'reviews', 'jum...
4     ['email', 'phone', 'facebook', 'reviews', 'kba']
Name: host_verifications, dtype: object

In [22]:
# Split entries into single values and create one column per value

# Find all values
all_splitted = []
for i in listings_drop_col.index:
    entry_split = listings_drop_col['host_verifications'].loc[i].replace(
        '[', '').replace(']', '').replace("'", "").split(sep=", ")
    all_splitted = all_splitted+entry_split

# Remove duplicates and the placeholders for empty entries
all_entries = list(set(all_splitted))
all_entries.remove('')
all_entries.remove('None')

# Create new columns and add a prefix to new column names
split_column_values(
    listings_drop_col,
    'host_verifications',
    all_entries,
    'host_verifications_'
    )

# Drop the original column
listings_drop_col = listings_drop_col.drop(columns='host_verifications')

# Check the result
listings_drop_col[
    index_by_key(listings_drop_col, ['host_verifications'])
    ].head()


Unnamed: 0,host_verifications_manual_online,host_verifications_facebook,host_verifications_manual_offline,host_verifications_linkedin,host_verifications_phone,host_verifications_amex,host_verifications_reviews,host_verifications_sent_id,host_verifications_photographer,host_verifications_google,host_verifications_email,host_verifications_weibo,host_verifications_jumio,host_verifications_kba
0,0,0,0,0,1,0,1,0,0,0,1,0,0,1
1,0,1,0,1,1,0,1,0,0,0,1,0,1,0
2,0,0,0,0,1,0,1,0,0,1,1,0,1,0
3,0,1,0,0,1,0,1,0,0,0,1,0,1,0
4,0,1,0,0,1,0,1,0,0,0,1,0,0,1
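
The `host_verifications` entries look like Python list literals, so instead of stripping brackets and quotes by hand they could also be parsed with `ast.literal_eval` from the standard library. The sketch below is an alternative to the cell above, not the code that produced the output. It assumes that empty entries are stored as the strings `[]` or `None`, which the two `remove` calls above suggest; the function name and sample rows are hypothetical.

```python
import ast

def parse_verifications(entry):
    """Parse one entry such as "['email', 'phone']" into a list of strings."""
    parsed = ast.literal_eval(entry)
    # "[]" and "None" both end up as an empty list
    return list(parsed) if parsed else []

# Hypothetical sample rows in the format of the column
rows = [
    "['email', 'phone', 'reviews', 'kba']",
    "['email', 'facebook']",
    "[]",
    "None",
]
parsed_rows = [parse_verifications(row) for row in rows]

# All distinct verification methods, sorted for a stable column order
methods = sorted({method for row in parsed_rows for method in row})

# One dict of 0/1 flags per row, ready to become data frame columns
flags = [{m: int(m in row) for m in methods} for row in parsed_rows]
```

Parsing the literal avoids having to remove `''` and `'None'` from the value list afterwards, because both cases never produce a value in the first place.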


## Done

For now, all relevant columns have been transformed into a type and shape that makes them accessible for the following analysis.<br>
A final look at the object columns confirms that their number has been reduced significantly.<br>
The data frame's dimensions, on the other hand, show an increased number of columns.

In [23]:
# The data-frames dimensions
listings_drop_col.shape

(3818, 124)

In [24]:
# How many object columns are left and are they relevant to the problem?

# Isolate object column names:
cat_listings = listings_drop_col.select_dtypes(include=['object'])
# Create a df with column information: 
cat_column_values = value_counter(cat_listings)

cat_column_values

Unnamed: 0,val_count,nan_count,val_pcnt,nan_pcnt,col_dtype
name,3792,0,99.319015,0.0,object
summary,3478,177,91.094814,4.635935,object
space,3119,569,81.691985,14.903091,object
description,3742,0,98.009429,0.0,object
neighborhood_overview,2506,1032,65.636459,27.029859,object
notes,1999,1606,52.357255,42.063908,object
transit,2574,934,67.417496,24.46307,object
host_about,2011,859,52.671556,22.49869,object
host_neighbourhood,102,300,2.671556,7.857517,object
street,1442,0,37.768465,0.0,object
