# Explore Crime Rate in San Francisco 

The goal is to create a Choropleth map that visualizes crime in San Francisco.

Before you are ready to start building the map, let's restructure the data so that it is in the right format for the Choropleth map. Essentially, you will need to create a dataframe that lists each neighborhood in San Francisco along with the corresponding total number of crimes.

The San Francisco crime dataset assigns every incident to one of 10 police districts (the `PdDistrict` column), which are treated here as the city's 10 main neighborhoods, namely:

    Central,
    Southern,
    Bayview,
    Mission,
    Park,
    Richmond,
    Ingleside,
    Taraval,
    Northern, and,
    Tenderloin.

Convert the San Francisco dataset, which you can also find here, https://cocl.us/sanfran_crime_dataset, into a pandas dataframe, like the one shown below, that represents the total number of crimes in each neighborhood.

In [20]:
# Import pandas and read the dataset
import pandas as pd

dataset = 'https://cocl.us/sanfran_crime_dataset'
SF_crime = pd.read_csv(dataset)

SF_crime.head()

Unnamed: 0,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId
0,120058272,WEAPON LAWS,POSS OF PROHIBITED WEAPON,Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212120
1,120058272,WEAPON LAWS,"FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE",Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212168
2,141059263,WARRANTS,WARRANT ARREST,Monday,04/25/2016 12:00:00 AM,14:59,BAYVIEW,"ARREST, BOOKED",KEITH ST / SHAFTER AV,-122.388856,37.729981,"(37.7299809672996, -122.388856204292)",14105926363010
3,160013662,NON-CRIMINAL,LOST PROPERTY,Tuesday,01/05/2016 12:00:00 AM,23:50,TENDERLOIN,NONE,JONES ST / OFARRELL ST,-122.412971,37.785788,"(37.7857883766888, -122.412970537591)",16001366271000
4,160002740,NON-CRIMINAL,LOST PROPERTY,Friday,01/01/2016 12:00:00 AM,00:30,MISSION,NONE,16TH ST / MISSION ST,-122.419672,37.76505,"(37.7650501214668, -122.419671780296)",16000274071000


In [21]:
# print the dimensions of the dataframe
print(SF_crime.shape)

(150500, 13)


Clean up the data. Only the police district of each incident is needed for the map, so we will drop the other columns from the original dataset to make it easier to create our visualizations.


In [22]:
# clean up the dataset to remove unnecessary columns 

SF_crime.drop(['IncidntNum','Category','Descript','DayOfWeek','Date','Time','Resolution','Address','X','Y','Location','PdId'], 
              axis=1, 
              inplace=True
             )

SF_crime.head(10)

Unnamed: 0,PdDistrict
0,SOUTHERN
1,SOUTHERN
2,BAYVIEW
3,TENDERLOIN
4,MISSION
5,NORTHERN
6,SOUTHERN
7,TENDERLOIN
8,SOUTHERN
9,BAYVIEW
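
As an alternative to dropping twelve columns after loading, `pd.read_csv` accepts a `usecols` argument that keeps only the listed columns. Below is a small sketch on an inline CSV; the three-column sample is a cut-down, illustrative stand-in for the real file, and `csv_text` and `toy` are names made up for this example.

```python
import io

import pandas as pd

# Cut-down stand-in for the real CSV: two rows, three of the original columns.
csv_text = """IncidntNum,Category,PdDistrict
120058272,WEAPON LAWS,SOUTHERN
141059263,WARRANTS,BAYVIEW
"""

# usecols keeps only the named column, so no drop() call is needed afterwards.
toy = pd.read_csv(io.StringIO(csv_text), usecols=['PdDistrict'])

print(list(toy.columns))
print(toy['PdDistrict'].tolist())
```

With the real dataset the same argument could be passed alongside the URL, which also avoids holding the unused columns in memory.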


In [23]:
# let's rename the column so that it makes sense
SF_crime.rename(columns={'PdDistrict': 'Neighborhood'}, inplace=True)

SF_crime.head(10)

Unnamed: 0,Neighborhood
0,SOUTHERN
1,SOUTHERN
2,BAYVIEW
3,TENDERLOIN
4,MISSION
5,NORTHERN
6,SOUTHERN
7,TENDERLOIN
8,SOUTHERN
9,BAYVIEW


In [24]:
# count the number of incidents recorded in each neighborhood
count = SF_crime['Neighborhood'].value_counts()

In [25]:
# Generate a new dataframe with Crimes per Neighborhood
SF_crime = pd.DataFrame(count)
SF_crime.reset_index(inplace=True)
SF_crime.columns = ['Neighborhood', 'Count']
SF_crime

Unnamed: 0,Neighborhood,Count
0,SOUTHERN,28445
1,NORTHERN,20100
2,MISSION,19503
3,CENTRAL,17666
4,BAYVIEW,14303
5,INGLESIDE,11594
6,TARAVAL,11325
7,TENDERLOIN,9942
8,RICHMOND,8922
9,PARK,8699
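
The counts in the table above add up to 150,499, one fewer than the 150,500 rows reported by `SF_crime.shape`. One plausible explanation is that a single row has a missing district, since `value_counts()` skips missing values by default. Here is a minimal sketch of that behavior on toy data; the toy frame and all variable names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the Neighborhood column; None mimics a missing district.
toy = pd.DataFrame(
    {'Neighborhood': ['SOUTHERN', 'MISSION', 'SOUTHERN', None, 'PARK']}
)

# By default value_counts() drops missing values ...
counts = toy['Neighborhood'].value_counts()
# ... while dropna=False keeps them as their own entry.
counts_with_nan = toy['Neighborhood'].value_counts(dropna=False)

n_rows = len(toy)
n_counted = int(counts.sum())
n_missing = int(toy['Neighborhood'].isna().sum())

print(n_rows, n_counted, n_missing)
```

Running the same `isna().sum()` check on the real `Neighborhood` column, before it is replaced by the counts table, would confirm whether a missing value accounts for the difference.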


Now you should be ready to proceed with creating the Choropleth map.

As you learned in the Choropleth maps lab, you will need a GeoJSON file that marks the boundaries of the different neighborhoods in San Francisco. To save you the hassle of looking for the right file, I have already downloaded it and made it available at this link: https://cocl.us/sanfran_geojson.
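
A Choropleth map can only color a region when the key in the GeoJSON matches a key in the dataframe, so it can help to compare the two sets of names before plotting. The sketch below does this on a tiny inline GeoJSON. The `DISTRICT` property name is the one used in `key_on` for this assignment; the two features and the three data keys are illustrative, not the real file contents.

```python
import json

# Toy FeatureCollection; geometry is left null because only the
# properties are inspected here. The districts are illustrative only.
toy_geojson = json.loads('''
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature", "properties": {"DISTRICT": "SOUTHERN"}, "geometry": null},
   {"type": "Feature", "properties": {"DISTRICT": "MISSION"}, "geometry": null}
 ]}
''')

# Names as they would appear in the dataframe's Neighborhood column.
data_keys = {'SOUTHERN', 'MISSION', 'PARK'}

# Names found under feature.properties.DISTRICT in the GeoJSON.
geo_keys = {f['properties']['DISTRICT'] for f in toy_geojson['features']}

missing_in_geojson = sorted(data_keys - geo_keys)  # rows that cannot be drawn
missing_in_data = sorted(geo_keys - data_keys)     # regions left without a value

print(missing_in_geojson, missing_in_data)
```

Both lists being empty for the real data would indicate that every neighborhood can be matched; differences in spelling or letter case would show up here.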

For the map, make sure that:

    it is centred around San Francisco,
    you use a zoom level of 12,
    you use fill_color = 'YlOrRd',
    you define fill_opacity = 0.7,
    you define line_opacity=0.2, and,
    you define a legend and use the default threshold scale.

If you follow the lab on Choropleth maps and use the GeoJSON correctly, you should be able to generate the following map:

In [None]:
# download the world countries GeoJSON file from the lab (not used for the San Francisco map)

!wget --quiet https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DV0101EN/labs/Data_Files/world_countries.json -O world_countries.json

In [None]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium

In [None]:
# GeoJSON file with the San Francisco neighborhood boundaries

world_geo = 'https://cocl.us/sanfran_geojson' 

In [None]:
# create a plain world map (from the lab; the San Francisco map is defined separately)

world_map = folium.Map(location=[0, 0], zoom_start=2, tiles='Mapbox Bright')

In [None]:
# define the San Francisco map 

sanfran_map = folium.Map(location=[37.77, -122.42], zoom_start=12)

In [None]:
# Generate Choropleth Map 
sanfran_map.choropleth(
    geo_data=world_geo,
    data=SF_crime,
    columns=['Neighborhood', 'Count'],
    key_on='feature.properties.DISTRICT',
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='San Francisco Crime'
)

# display San Francisco map
sanfran_map