Airbnb listing price & crime map in Boston
My goal is to help visitors find a safe place to stay in Boston based on Airbnb listing prices and crime incident reports.
I use the Boston crime incident report dataset to draw a crime heatmap with the folium package.
Visitors can choose the neighborhood they are interested in and the offense code group they care about most, then use this project to
build a map. On that map they can see the price of each Airbnb listing in the neighborhood and look for a safer location.
In this project I did not use beds or price as filtering factors; I only plotted the listing locations on the heatmap and
used the price as the popup label for each listing.

Copyright (c) 2018
Licensed
Written by Yidong Zhu

In [1]:
import pandas as pd
import folium
from folium import plugins
from folium.plugins import HeatMap

In [2]:
# Download the datasets.
# dat1 is from https://data.boston.gov/dataset/crime-incident-reports-august-2015-to-date-source-new-system
# The crime report dataset is larger than 25 MB and GitHub doesn't allow me to upload it, so I read it directly from the
# download URL: https://data.boston.gov/dataset/6220d948-eae2-4e4b-8723-2dc8e67722a3/resource/12cb3883-56f5-47de-afa5-3b1cf61b257b/download/crime.csv
# dat2 is from https://www.kaggle.com/airbnb/boston and is already uploaded to GitHub.
dat1 = pd.read_csv('https://data.boston.gov/dataset/6220d948-eae2-4e4b-8723-2dc8e67722a3/resource/12cb3883-56f5-47de-afa5-3b1cf61b257b/download/crime.csv',encoding='windows-1252')
dat2 = pd.read_csv('listings.csv')



In [3]:
# Check dat1
dat1.head()
dat1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 346607 entries, 0 to 346606
Data columns (total 17 columns):
INCIDENT_NUMBER        346607 non-null object
OFFENSE_CODE           346607 non-null int64
OFFENSE_CODE_GROUP     346607 non-null object
OFFENSE_DESCRIPTION    346607 non-null object
DISTRICT               344693 non-null object
REPORTING_AREA         346607 non-null object
SHOOTING               1121 non-null object
OCCURRED_ON_DATE       346607 non-null object
YEAR                   346607 non-null int64
MONTH                  346607 non-null int64
DAY_OF_WEEK            346607 non-null object
HOUR                   346607 non-null int64
UCR_PART               346512 non-null object
STREET                 335365 non-null object
Lat                    324898 non-null float64
Long                   324898 non-null float64
Location               346607 non-null object
dtypes: float64(2), int64(4), object(11)
memory usage: 45.0+ MB


In [4]:
# Use 2016 for the crime dataset, since the Airbnb listings dataset only covers 2016.
# Any OFFENSE_CODE_GROUP can be chosen; Larceny is used here as an example.
data_2016 = dat1.loc[dat1['YEAR'] == 2016]
data_la = data_2016.loc[data_2016['OFFENSE_CODE_GROUP'] == 'Larceny']
# Show how many larceny records there are.
data_la.shape

(7904, 17)
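
The year and offense-group filters above could be wrapped in a small helper so that a visitor can swap in a different offense group. This is only a sketch: the function name `crime_points` and the tiny sample frame are made up for illustration, and only the column names come from the crime dataset.

```python
import pandas as pd

def crime_points(crimes, year, offense_group):
    """Return [lat, long] pairs for one year and one offense group (hypothetical helper)."""
    mask = (crimes["YEAR"] == year) & (crimes["OFFENSE_CODE_GROUP"] == offense_group)
    # Keep only the coordinate columns and drop rows with a missing Lat or Long.
    coords = crimes.loc[mask, ["Lat", "Long"]].dropna()
    return coords.values.tolist()

# Made-up sample rows that reuse the crime dataset's column names.
sample = pd.DataFrame({
    "YEAR": [2016, 2016, 2017, 2016],
    "OFFENSE_CODE_GROUP": ["Larceny", "Larceny", "Larceny", "Vandalism"],
    "Lat": [42.35, None, 42.30, 42.33],
    "Long": [-71.06, -71.05, -71.10, -71.07],
})
points = crime_points(sample, 2016, "Larceny")
print(points)
```

With the real data, `crime_points(dat1, 2016, 'Larceny')` should produce the same kind of list that is passed to `HeatMap` later in the notebook.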

In [5]:
# Extract the location columns and drop rows with missing coordinates.
location_la = data_la[['Lat', 'Long']].dropna()
location_la.head()

Unnamed: 0,Lat,Long
10777,42.287433,-71.125103
82245,42.286654,-71.126654
87418,42.337314,-71.048523
99060,42.382292,-71.040364
101909,42.342414,-71.144605


In [6]:
# Get the center latitude and longitude for the map, using the mean of each column.
c_Lat=location_la.Lat.mean()
c_Long=location_la.Long.mean()
center = [c_Lat,c_Long]
center

[42.247766495052275, -70.94368643788478]
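
The center computed above lies somewhat south-east of the sample coordinates printed earlier. One possible cause, which is an assumption and has not been checked against the data, is that some rows carry placeholder coordinates (for example -1) that pass `dropna()` but pull the mean away from the city. A minimal sketch of a guard, using a rough hand-picked bounding box around Boston (the box limits and the helper name `drop_implausible` are hypothetical):

```python
import pandas as pd

# Rough bounding box around Boston; these limits are an assumption, not taken from the dataset.
LAT_RANGE = (42.2, 42.4)
LONG_RANGE = (-71.2, -70.9)

def drop_implausible(coords):
    """Keep only rows whose Lat/Long fall inside the bounding box (hypothetical helper)."""
    in_lat = coords["Lat"].between(*LAT_RANGE)
    in_long = coords["Long"].between(*LONG_RANGE)
    return coords[in_lat & in_long]

# Made-up sample: two plausible points and one placeholder row.
sample = pd.DataFrame({
    "Lat": [42.35, 42.30, -1.0],
    "Long": [-71.06, -71.10, -1.0],
})
clean = drop_implausible(sample)
center = [clean["Lat"].mean(), clean["Long"].mean()]
print(center)
```

If the real data contains no such rows, the filter leaves `location_la` unchanged and the center stays the same.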

In [7]:
# Use folium.Map to draw the base map.
m = folium.Map(location=center, tiles='stamentoner', zoom_start=10)

In [8]:
m

In [9]:
# Get all larceny locations as a list of [Lat, Long] pairs.
crime_locations = location_la[['Lat','Long']].values.tolist()
crime_locations

[[42.28743327, -71.12510302],
 [42.28665378, -71.12665398],
 [42.33731438, -71.04852327],
 [42.38229155, -71.04036424],
 [42.34241352, -71.14460462],
 [42.35972137, -71.0585236],
 [42.33767954, -71.05715261],
 [42.33219302, -71.04505385],
 [42.2921753, -71.17519031],
 [42.2539599, -71.11962283],
 [42.32829858, -71.05691804],
 [42.35909731, -71.14728364],
 [42.33760828, -71.07333879],
 [42.3635655, -71.06449385],
 [42.28647169, -71.12419976],
 [42.37949215, -71.03685008],
 [42.29714531, -71.07409528],
 [42.32790482, -71.05305583],
 [42.32491307, -71.0767717],
 [42.3021009, -71.07172101],
 [42.31904705, -71.09399992],
 [42.28275134, -71.14216426],
 [42.30540715, -71.06547135],
 [42.38399469, -71.01971252],
 [42.28446743, -71.11183089],
 [42.35044939, -71.06087148],
 [42.30137775, -71.05971354],
 [42.27057076, -71.1045161],
 [42.30684828, -71.06392365],
 [42.33744152, -71.10428096],
 [42.27230624, -71.06721386],
 [42.32053872, -71.11086853],
 [42.34748342, -71.13363158],
 [42.36173448, -7

In [10]:
# Draw the heatmap and save it to map88.html in case the map doesn't show in the notebook.
# I tried three browsers: Chrome and MS Edge may not show the map, but Firefox works very well.
HeatMap(crime_locations).add_to(m)
m.save('map88.html')
m

In [11]:
# check dat2 and see the data structure
dat2.head()

Unnamed: 0,id,listing_url,scrape_id,last_scraped,name,summary,space,description,experiences_offered,neighborhood_overview,...,review_scores_value,requires_license,license,jurisdiction_names,instant_bookable,cancellation_policy,require_guest_profile_picture,require_guest_phone_verification,calculated_host_listings_count,reviews_per_month
0,12147973,https://www.airbnb.com/rooms/12147973,20160906204935,2016-09-07,Sunny Bungalow in the City,"Cozy, sunny, family home. Master bedroom high...",The house has an open and cozy feel at the sam...,"Cozy, sunny, family home. Master bedroom high...",none,"Roslindale is quiet, convenient and friendly. ...",...,,f,,,f,moderate,f,f,1,
1,3075044,https://www.airbnb.com/rooms/3075044,20160906204935,2016-09-07,Charming room in pet friendly apt,Charming and quiet room in a second floor 1910...,Small but cozy and quite room with a full size...,Charming and quiet room in a second floor 1910...,none,"The room is in Roslindale, a diverse and prima...",...,9.0,f,,,t,moderate,f,f,1,1.3
2,6976,https://www.airbnb.com/rooms/6976,20160906204935,2016-09-07,Mexican Folk Art Haven in Boston,"Come stay with a friendly, middle-aged guy in ...","Come stay with a friendly, middle-aged guy in ...","Come stay with a friendly, middle-aged guy in ...",none,The LOCATION: Roslindale is a safe and diverse...,...,10.0,f,,,f,moderate,t,f,1,0.47
3,1436513,https://www.airbnb.com/rooms/1436513,20160906204935,2016-09-07,Spacious Sunny Bedroom Suite in Historic Home,Come experience the comforts of home away from...,Most places you find in Boston are small howev...,Come experience the comforts of home away from...,none,Roslindale is a lovely little neighborhood loc...,...,10.0,f,,,f,moderate,f,f,1,1.0
4,7651065,https://www.airbnb.com/rooms/7651065,20160906204935,2016-09-07,Come Home to Boston,"My comfy, clean and relaxing home is one block...","Clean, attractive, private room, one block fro...","My comfy, clean and relaxing home is one block...",none,"I love the proximity to downtown, the neighbor...",...,10.0,f,,,f,flexible,f,f,1,2.25


In [12]:
# check dat2's shape
dat2.shape

(3585, 95)

In [13]:
# Extract some of the important columns from dat2.
dat2_clean = dat2[['id', 'neighbourhood_cleansed', 'latitude', 'longitude', 'property_type', 'room_type', 'beds', 'price']]
dat2_clean.head()

Unnamed: 0,id,neighbourhood_cleansed,latitude,longitude,property_type,room_type,beds,price
0,12147973,Roslindale,42.282619,-71.133068,House,Entire home/apt,3.0,$250.00
1,3075044,Roslindale,42.286241,-71.134374,Apartment,Private room,1.0,$65.00
2,6976,Roslindale,42.292438,-71.135765,Apartment,Private room,1.0,$65.00
3,1436513,Roslindale,42.281106,-71.121021,House,Private room,2.0,$75.00
4,7651065,Roslindale,42.284512,-71.136258,House,Private room,2.0,$79.00
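
The `price` column is stored as text such as `$250.00`, which is fine for popup labels but cannot be sorted or filtered numerically. If price were later used as a factor, a conversion along these lines might help. This is a sketch only: `price_to_float` is a hypothetical helper, and the comma handling assumes that prices above 999 are written like `$1,250.00`, which the rows shown here do not confirm.

```python
import pandas as pd

def price_to_float(prices):
    """Convert strings such as '$250.00' to floats (hypothetical helper)."""
    # Strip the dollar sign and any thousands separators before casting.
    return prices.str.replace(r"[$,]", "", regex=True).astype(float)

# The first two values appear in the listing output above; the third is made up.
sample = pd.Series(["$250.00", "$65.00", "$1,250.00"])
numeric = price_to_float(sample)
print(numeric.tolist())
```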


In [14]:
# Any neighborhood can be chosen; Downtown is used here as an example.
dat_nb = dat2_clean.loc[dat2_clean['neighbourhood_cleansed'] == 'Downtown'].dropna()
dat_nb

Unnamed: 0,id,neighbourhood_cleansed,latitude,longitude,property_type,room_type,beds,price
2023,3068453,Downtown,42.351880,-71.064181,Apartment,Entire home/apt,1.0,$133.00
2024,9934771,Downtown,42.351977,-71.062727,Apartment,Entire home/apt,1.0,$120.00
2025,2821825,Downtown,42.353186,-71.062596,Apartment,Entire home/apt,2.0,$329.00
2026,21891,Downtown,42.352179,-71.063160,Apartment,Entire home/apt,2.0,$349.00
2027,13686161,Downtown,42.355478,-71.058088,Apartment,Entire home/apt,2.0,$209.00
2028,4759640,Downtown,42.364026,-71.059493,Apartment,Entire home/apt,1.0,$200.00
2029,4021522,Downtown,42.352781,-71.063325,Apartment,Entire home/apt,2.0,$329.00
2030,9905304,Downtown,42.351518,-71.063138,Apartment,Entire home/apt,2.0,$375.00
2031,3182652,Downtown,42.350572,-71.064228,Apartment,Entire home/apt,2.0,$350.00
2032,10742655,Downtown,42.359707,-71.055284,Apartment,Private room,2.0,$70.00


In [15]:
# Extract listing locations and prices from dat_nb.
locations_listing = dat_nb[['latitude','longitude']].values.tolist()
labels = dat_nb['price'].values.tolist()

In [16]:
# Use a for loop to add a marker for each listing to the crime map, then save the map once in case it cannot show in the notebook.
for location, label in zip(locations_listing, labels):
    popup = folium.Popup(label, parse_html=True)
    folium.Marker(location, popup=popup).add_to(m)
m.save('map88.html')


In [17]:
# Display the listing locations on the crime heatmap; visitors can click a marker to see the listing price
# and choose the right place to stay.
m
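
The map leaves the comparison between listings to the eye. As a rough numeric complement, one could count how many incidents fall within a fixed radius of a listing. The sketch below is not part of the original project: the helper names and the 0.5 km radius are assumptions. It uses the haversine great-circle distance with one listing coordinate and three crime coordinates copied from the outputs above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two [lat, long] points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def incidents_within(listing, crimes, radius_km=0.5):
    """Count crime points within radius_km of a listing (hypothetical helper)."""
    return sum(1 for c in crimes if haversine_km(listing, c) <= radius_km)

# One Downtown listing and three larceny locations taken from the printed outputs.
listing = [42.351880, -71.064181]
crimes = [
    [42.35044939, -71.06087148],
    [42.35972137, -71.0585236],
    [42.2539599, -71.11962283],
]
count = incidents_within(listing, crimes)
print(count)
```

In the notebook, the same function could be called with `locations_listing[i]` and `crime_locations`, although a plain loop over every pair may be slow for large inputs.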