# Criminality Analysis in and around Eindhoven based on Nightclub density

## Abstract

In this project I analyze crime in Eindhoven and in the communities that surround it (such as Veldhoven, Best, and Son en Breugel). The crime counts are normalized by the total number of citizens and then compared with the nightclub density in each community's center, normalized in the same way. The aim is to see whether the number of nightclubs correlates with the number of crimes in a big city and its surrounding neighbourhoods.
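
A minimal sketch of the normalization and correlation idea described above. The helper `per_capita`, the column names, and all numbers are assumptions for illustration only; they are not the CBS or Foursquare figures.

```python
import pandas as pd


def per_capita(counts, population, per=1000):
    """Return counts per `per` inhabitants, aligned on the community index."""
    return counts / population * per


# Toy values for four hypothetical communities (NOT real data).
toy = pd.DataFrame(
    {
        "population": [200000, 20000, 30000, 40000],
        "crimes": [18000, 700, 1200, 1400],
        "nightclubs": [40, 1, 3, 2],
    },
    index=["A", "B", "C", "D"],
)

# Normalize both quantities by population so large and small communities are comparable.
toy["crime_rate"] = per_capita(toy["crimes"], toy["population"])
toy["club_density"] = per_capita(toy["nightclubs"], toy["population"])

# Pearson correlation coefficient between the two normalized quantities.
r = toy["crime_rate"].corr(toy["club_density"])
print(toy[["crime_rate", "club_density"]])
print("Pearson r = {:.2f}".format(r))
```

With only ten communities in the real analysis, any such coefficient should be read with caution, since a single large city can dominate the result.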

# Business case

The idea is to give community councillors a tool to predict crime development in relation to the number of nightclubs in a given neighbourhood. Based on the results, they might adopt a different security strategy (e.g. police surveillance) per neighbourhood.


# Data Sets

I will need the following datasets:

- Total number of citizens per community
- Crime numbers per community
- Number of nightclubs in the center of each community

The first two datasets are available from Statistics Netherlands (CBS.nl).
The number of nightclubs can be obtained from searches on Foursquare.
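
A hedged sketch of how a nightclub search could be phrased against Foursquare. It assumes the legacy v2 `venues/search` endpoint and its `ll`, `query`, `radius`, `limit` and `v` parameters. The helper name `build_venue_search_url`, the default radius, the version date, and the placeholder credentials are assumptions, not the notebook's actual (hidden) code.

```python
from urllib.parse import urlencode


def build_venue_search_url(client_id, client_secret, lat, lng,
                           query="nightclub", radius=1000, limit=50,
                           version="20180605"):
    """Build a search URL for the (assumed) Foursquare v2 venues/search endpoint."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,                       # API version date, assumed value
        "ll": "{},{}".format(lat, lng),     # latitude,longitude of the search center
        "query": query,
        "radius": radius,                   # search radius in meters, assumed value
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)


# Placeholder credentials; the coordinates are those found for Eindhoven in this notebook.
url = build_venue_search_url("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", 51.439265, 5.478633)
print(url)

# The request itself would then be something like:
# venues = requests.get(url).json()["response"]["venues"]
```

Counting the returned venues per community would give the nightclub number to normalize, provided the radius is chosen consistently for all communities.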


# Visualization

One scatter plot will show the number of citizens on the x-axis and the number of crimes on the y-axis.
A second scatter plot will show the number of nightclubs against the number of crimes.

The number of crimes per nightclub will be presented on a map.

In this first week, only a map of Eindhoven and its neighbourhoods is presented.
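
The two planned scatter plots could be drawn along the following lines. This is only a sketch with matplotlib: the helper `scatter_with_labels`, the column names, and the toy numbers are assumptions, not results.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_with_labels(data, x, y, ax=None):
    """Scatter column `y` against column `x` and label each point with its index."""
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(data[x], data[y])
    for name, row in data.iterrows():
        # Put the community name next to its point.
        ax.annotate(str(name), (row[x], row[y]),
                    textcoords="offset points", xytext=(4, 4))
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    return ax


# Toy values for three hypothetical communities (NOT real data).
toy = pd.DataFrame(
    {"citizens": [200000, 20000, 30000],
     "crimes": [18000, 700, 1200],
     "nightclubs": [40, 1, 3]},
    index=["A", "B", "C"],
)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
scatter_with_labels(toy, "citizens", "crimes", ax=axes[0])
scatter_with_labels(toy, "nightclubs", "crimes", ax=axes[1])
fig.tight_layout()
```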

In [1]:
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
import sys
import types
from botocore.client import Config
import ibm_boto3

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# library for transforming a json file into a pandas dataframe
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    geographiclib: 1.49-py_0   conda-forge
    geopy:         1.17.0-py_0 conda-forge

geographiclib- 100% |################################| Time: 0:00:00  23.44 MB/s
geopy-1.17.0-p 100% |################################| Time: 0:00:00  34.75 MB/s
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    altair:  2.2.2-py35_1 conda-forge
    branca:  0.3.1-py_0   conda-forge
    folium:  0.5.0-py_0   conda-forge
    vincent: 0.4.4-py_1   conda-forge

altair-2.2.2-p 100% |################################| Time: 0:00:00  54.84 MB/s
branca-0.3.1-p 100% |################################| Time: 0:00:00  31.47 MB/s
vincent-0.4.4- 100% |###################

In [2]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: G0J5OAHPW0BXWMYQZZ1OWZHBA52QXULOEWEDSC2RBSVIVLHL
CLIENT_SECRET:QSIB15KATEOQOLA3VDOSJORDYLL5KDE1P5JSHVTHAW0FB2A2


In [3]:
# Make a list of cities/neighbourhoods of Eindhoven:
cities = ["Eindhoven","Eersel","Best","Waalre","Valkenswaard","Geldrop","Nuenen","Son en Breugel", "Oirschot", "Veldhoven"]

# Define search strings for each neighbourhood to be used in getting its longitude and latitude:
addresses = {city: "Center " + city + " The Netherlands" for city in cities}

In [4]:
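# Store location information in a list:
# (user_agent identifies this application to Nominatim; the name used here is an arbitrary placeholder)
geolocator = Nominatim(user_agent="eindhoven-crime-analysis")
locations = []
for address in addresses.values():
    locations.append(geolocator.geocode(address))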

#for item in range(0,len(locations)): print(locations[item].latitude, locations[item].longitude)



In [5]:
# Make dataframe with place & location:
headers = ["place","location"]
df=pd.DataFrame(locations, columns=headers)

# Split location in latitude and Longitude:
df[['lat', 'lng']]=df["location"].apply(pd.Series)
df.drop(["location"], axis=1, inplace=True)

# Clean up place: keep only the part of the address before the first comma
df['place'] = df['place'].str.split(',').str[0]
df.set_index("place", inplace=True)
df.head(10)

place,lat,lng
Waalre,51.38764,5.447687
Veldhoven,51.407706,5.392731
Oirschot,51.503239,5.315549
Nuenen,51.48603,5.544007
Geldrop,51.422199,5.559182
Son en Breugel,51.517813,5.491435
Eersel,51.358637,5.320131
Best,51.510017,5.398662
Eindhoven,51.439265,5.478633
Valkenswaard,51.350688,5.45945


In [6]:
# generate map centred around Eindhoven
Region_map = folium.Map(location=[df.lat['Eindhoven'], df.lng['Eindhoven']], zoom_start=12) 

# add a red circle marker to represent the center of Eindhoven
folium.features.CircleMarker(
    [df.lat['Eindhoven'], df.lng['Eindhoven']],
    radius=10,
    color='red',
    popup='Eindhoven',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(Region_map)

# add the Neighbourhoods as blue circle markers
for lat, lng, label in zip(df.lat, df.lng, df.index):
    if label != 'Eindhoven':
        folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            color='blue',
            popup=label,
            fill = True,
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(Region_map)
# display map
Region_map

In [7]:
# The code was removed by Watson Studio for sharing.

,Crime,Eindhoven,Eersel,Best,Waalre,Valkenswaard,Geldrop,Nuenen,SonenBreugel,Oirschot,Veldhoven
0,"Misdrijven, totaal",18600,700,1100,475,1220,1565,705,665,675,1370
1,1 Vermogensmisdrijven,11535,380,645,330,610,915,435,450,430,775
2,1.1 Diefstal/verduistering en inbraak,10315,325,540,235,475,775,375,375,355,620
3,1.1.1 Diefstal en inbraak met geweld,175,5,10,5,5,5,15,10,5,10
4,1.1.2 Diefstal en inbraak zonder geweld,10140,320,530,225,475,765,360,370,350,610


In [8]:
# Rename one column so the community names match those in the location table:
df_data_1.rename(columns={'SonenBreugel': 'Son en Breugel'}, inplace=True)
df_data_1.set_index("Crime", inplace=True)
# Transpose so that each community becomes a row:
df_data_1 = df_data_1.transpose()
# Keep only the relevant data, the total number of crimes ('Misdrijven, totaal'),
# and rename it in one step to avoid modifying a slice in place:
df_data_1 = df_data_1[['Misdrijven, totaal']].rename(columns={'Misdrijven, totaal': 'Totals'})

In [9]:
df_data_1.head(10)

Crime,Totals
Eindhoven,18600
Eersel,700
Best,1100
Waalre,475
Valkenswaard,1220
Geldrop,1565
Nuenen,705
Son en Breugel,665
Oirschot,675
Veldhoven,1370
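
Because both tables are now indexed by community name, they can be combined with an index join. The sketch below uses a three-row excerpt of the values shown in this notebook. The variable names `coords`, `totals` and `combined` are assumptions, and an inner join is just one possible choice.

```python
import pandas as pd

# Excerpt of the location table (values copied from the output of cell 5).
coords = pd.DataFrame(
    {"lat": [51.439265, 51.510017, 51.38764],
     "lng": [5.478633, 5.398662, 5.447687]},
    index=pd.Index(["Eindhoven", "Best", "Waalre"], name="place"),
)

# Excerpt of the crime totals (values copied from the output of cell 9).
totals = pd.DataFrame(
    {"Totals": [18600, 1100, 475]},
    index=["Eindhoven", "Best", "Waalre"],
)

# Index-on-index join; how="inner" keeps only communities present in both tables,
# so a misspelled or differently named community would silently drop out.
combined = coords.join(totals, how="inner")
print(combined)
```

Comparing `len(combined)` with the number of communities is a quick check that no name mismatch (such as 'SonenBreugel' versus 'Son en Breugel') remains.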
