# Import Libraries

In [2]:
import pandas as pd

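import subprocess
import sys
# Install openpyxl with "python -m pip"; calling pip.main() directly is unsupported
subprocess.check_call([sys.executable, "-m", "pip", "install", "openpyxl"])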

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
from scipy.stats import linregress
import gmaps 
from config import gkey
import requests
import json

# Configure gmaps
gmaps.configure(api_key=gkey)





ModuleNotFoundError: No module named 'config'

West Midlands crime data has already been merged (see 'data story so far.docx'). We had access to 12 CSV files, each with a month's worth of 2019 data, which we merged together. Here we read the merged CSV and view the dataframe.

In [3]:
data = pd.read_csv("reduced_2019.csv")

data_df = data

data_df.head()

FileNotFoundError: [Errno 2] No such file or directory: 'reduced_2019.csv'

# Cleaning data

Viewing shape and contents of data.

In [None]:
data_df.shape

In [None]:
data_df.columns

In [None]:
data_df.info()

In [None]:
# Flag rows that are exact duplicates of an earlier row, then count them
boolean_series_of_duplicates = data_df.duplicated()
len(data_df.loc[boolean_series_of_duplicates, :])

In [None]:
data_df.loc[boolean_series_of_duplicates, :]

We investigated the duplicates and found that the above number was the count of records with null crime IDs in similar locations. They also tended to fall into the crime type of anti-social behaviour.
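One way to check a finding like this is to break the duplicated rows down by crime type and by whether the Crime ID is missing. The sketch below uses a small made-up frame; the rows and the `summarise_duplicates` helper are hypothetical, and only the column names come from the notebook.

```python
import pandas as pd

# Toy data for illustration only; the notebook would pass data_df instead.
toy_df = pd.DataFrame({
    "Crime ID": [None, None, "a1", "b2", None],
    "LSOA code": ["E01", "E01", "E02", "E03", "E01"],
    "Crime type": ["Anti-social behaviour", "Anti-social behaviour",
                   "Burglary", "Drugs", "Anti-social behaviour"],
})

def summarise_duplicates(df):
    """Count fully duplicated rows by crime type and by missing Crime ID."""
    dupes = df[df.duplicated()]
    return (
        dupes.assign(**{"Crime ID missing": dupes["Crime ID"].isnull()})
        .groupby(["Crime type", "Crime ID missing"])
        .size()
        .reset_index(name="Duplicate rows")
    )

duplicate_summary = summarise_duplicates(toy_df)
print(duplicate_summary)
```

If most of the duplicate rows fall under one crime type with a missing Crime ID, that supports treating them as separate incidents without an identifier rather than as repeated records.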

In [None]:
data_df.isnull().sum()

The last outcome column shows many null values. We have decided to change these to 'No outcome recorded' rather than leave them blank.

In [None]:
# Fill in missing values in 'last outcome category'
data_df["Last outcome category"] = data_df["Last outcome category"].fillna('No outcome recorded')

## Filter West Midlands data

We only want the data for the following districts: Birmingham, Coventry, Dudley, Sandwell, Solihull, Walsall, Wolverhampton. We will filter this data into a new dataframe, West_midlands_df.

In [None]:
# Keep only the seven West Midlands districts; .copy() avoids SettingWithCopyWarning on later edits
west_midlands_districts = ["Birmingham", "Coventry", "Dudley", "Sandwell", "Solihull", "Walsall", "Wolverhampton"]
West_midlands_df = data_df[data_df["District name (2019)"].isin(west_midlands_districts)].copy()

West_midlands_df.head()

In [None]:
West_midlands_df.info()

In [None]:
# The Crime ID column shows many null values. We have decided to change these to 'No ID recorded' rather than leave them blank.

In [None]:
# Fill in missing values in 'Crime ID'
West_midlands_df["Crime ID"] = West_midlands_df["Crime ID"].fillna('No ID recorded')

In [None]:
West_midlands_df.info()

In [None]:
West_midlands_df.head()

In [None]:
West_midlands_df.to_csv('clean_2019.csv', index=False)

We have local income deprivation data in an Excel file, which we need to read in:

In [None]:
lidd = pd.read_excel("localincomedeprivationdata.xlsx", engine='openpyxl')

In [None]:
lidd.head()

Renaming columns

In [None]:
lidd = lidd.rename(columns={"Overall Index of Multiple Deprivation (IMD) Score":"LSOA IMD Score"})

In [None]:
lidd = lidd.rename(columns={"Index of Multiple Deprivation (IMD) Rank (where 1 is most deprived)":"LSOA IMD Rank"})

In [None]:
lidd = lidd.rename(columns={"Income Rank (where 1 is most deprived)":"LSOA Income Rank"})

In [None]:
del lidd['LSOA name (2011)']
del lidd['Local Authority District code (2019)']
del lidd['Local Authority District name (2019)']
del lidd['Index of Multiple Deprivation (IMD) Decile (where 1 is most deprived 10% of LSOAs)']
del lidd['Income Score (rate)']
del lidd['Income Decile (where 1 is most deprived 10% of LSOAs)']
del lidd['Total population: mid 2015 (excluding prisoners)']
del lidd['Dependent Children aged 0-15: mid 2015 (excluding prisoners)']
del lidd['Population aged 16-59: mid 2015 (excluding prisoners)']
del lidd['Older population aged 60 and over: mid 2015 (excluding prisoners)']
del lidd['Working age population 18-59/64: for use with Employment Deprivation Domain (excluding prisoners) ']

In [None]:
lidd.head()

In [None]:
lidd = lidd.rename(columns={"LSOA code (2011)":"LSOA code"})
lidd.head()

Reading in LSOA geographical boundary data for local level data analysis.

In [None]:
llso_lat_long_df = pd.read_csv("Lower_Layer_Super_Output_Areas__December_2011__Boundaries_Full_Clipped__BFC__EW_V3.csv")

llso_lat_long_df.head()

In [None]:
llso_lat_long_df1 = llso_lat_long_df.rename(columns={"LSOA11CD":"LSOA code"})

In [None]:
lidd_1 = pd.merge(lidd,llso_lat_long_df1 , on = "LSOA code")
lidd_1.head()
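An inner merge silently drops any LSOA code that appears in only one of the two files. A quick way to see whether that happens is to merge with `indicator=True` and inspect the unmatched rows. The sketch below uses toy frames; the codes, scores and the `LONG`/`LAT` column names are assumptions for illustration, not values taken from the real files.

```python
import pandas as pd

# Toy frames for illustration; the notebook would use lidd and llso_lat_long_df1.
deprivation_df = pd.DataFrame({
    "LSOA code": ["E01", "E02", "E03"],
    "LSOA IMD Score": [12.5, 40.1, 8.3],
})
boundaries_df = pd.DataFrame({
    "LSOA code": ["E01", "E02", "E04"],
    "LONG": [-1.90, -1.85, -1.50],
    "LAT": [52.48, 52.50, 52.41],
})

# An outer merge keeps every code; "_merge" records which side each row came from.
# validate="one_to_one" raises an error if a code is repeated in either frame.
checked = pd.merge(deprivation_df, boundaries_df, on="LSOA code",
                   how="outer", indicator=True, validate="one_to_one")
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[["LSOA code", "_merge"]])
```

An empty `unmatched` frame would indicate that the inner merge used in the notebook loses no LSOAs.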

Deleting columns not needed.

In [None]:
del lidd_1['OBJECTID']
del lidd_1['LSOA11NM']
del lidd_1['BNG_E']
del lidd_1['BNG_N']
del lidd_1['Shape_Leng']
del lidd_1['Shape__Area']
del lidd_1['Shape__Length']

Create new dataframe that merges West Midlands Data to LSOA data.

In [None]:
West_midlands_df1 = pd.merge(West_midlands_df, lidd_1, on = "LSOA code")
West_midlands_df1.head()

In [None]:
West_midlands_df1.columns

One last check for null values and duplicates. 

In [None]:
West_midlands_df1.isnull().sum()

In [None]:
West_midlands_df1.duplicated()
boolean_series_of_duplicates = West_midlands_df1.duplicated()
len(West_midlands_df1.loc[boolean_series_of_duplicates, :])

In [None]:
West_midlands_df1.loc[boolean_series_of_duplicates, :]

The above confirms that the duplicates 

Export to clean CSV file

In [None]:
West_midlands_df1.to_csv('clean_2019_1.csv', index=False)

# Data Visualisation 

Hypothesis: Within the West Midlands during 2019 the Index of Multiple Deprivation (IMD) Score of an area should influence the exposure to certain crimes in that area. The higher the level of crime the higher the IMD score.

## Research Question 1: Does the crime data sourced from the West Midlands Police Database correlate with the Index of Multiple Deprivation (IMD) score?

The IMD score is built from seven domains of deprivation: Income (22.5%), Employment (22.5%), Education (13.5%), Health (13.5%), Crime (9.3%), Barriers to Housing & Services (9.3%), and Living Environment (9.3%). Because crime is one of the domains used to create the IMD score, we expect the crime rates in the LSOAs to correlate with the IMD score.
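To give a feel for how much one domain can move the overall score, the sketch below combines hypothetical domain scores using the weights quoted above. It is a simplified weighted average only: the published methodology also standardises and transforms the domain scores before combining them, which is not reproduced here, and the `weighted_score` helper and example values are invented for illustration.

```python
# Domain weights as quoted in the text (percentages).
IMD_DOMAIN_WEIGHTS = {
    "Income": 22.5,
    "Employment": 22.5,
    "Education": 13.5,
    "Health": 13.5,
    "Crime": 9.3,
    "Barriers to Housing & Services": 9.3,
    "Living Environment": 9.3,
}

def weighted_score(domain_scores, weights=IMD_DOMAIN_WEIGHTS):
    """Weighted average of domain scores, normalising the weights to sum to 1."""
    total_weight = sum(weights.values())
    return sum(domain_scores[name] * w for name, w in weights.items()) / total_weight

# Hypothetical domain scores on a common scale, for illustration only:
# every domain scores 20 except Crime, which scores 60.
example_scores = {name: 20.0 for name in IMD_DOMAIN_WEIGHTS}
example_scores["Crime"] = 60.0

crime_weight_share = IMD_DOMAIN_WEIGHTS["Crime"] / sum(IMD_DOMAIN_WEIGHTS.values())
combined = weighted_score(example_scores)
print(f"Crime weight share: {crime_weight_share:.3f}")
print(f"Weighted score: {combined:.2f}")
```

Because crime carries under a tenth of the weight, an area can have a high crime score and still a modest overall score, so a strong correlation between crime counts and the IMD score is not guaranteed.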

We also wanted to see if there are any outliers and what this would mean for those LSOAs.

We will then use this data to establish which areas are exposed to which crimes.

In [None]:
# Viewing the data we want to work with
West_midlands_df1["LSOA name"].nunique()

In [None]:
West_midlands_df1["LSOA IMD Rank"].nunique()

In [None]:
West_midlands_df1["LSOA IMD Score"].max()

In [None]:
# Group by LSOA: count the crimes and keep the first IMD score, district code and coordinates
f = {"LSOA name": "count", "LSOA IMD Score":'first', "District code (2019)": "first", "Longitude": "first", "Latitude": "first"}
LSOA_crime_count_df = West_midlands_df1.groupby(["LSOA name"]).agg(f)

# The count of "LSOA name" is the number of crimes recorded in each LSOA
LSOA_crime_count_df = LSOA_crime_count_df.rename(columns={"LSOA name":"LSOA crime count"})

LSOA_crime_count_df = LSOA_crime_count_df.reset_index()
LSOA_crime_count_df

In [None]:
LSOA_crime_count_df["LSOA crime count"].max()

In [None]:
# Scatter plot to show results
y_axis = LSOA_crime_count_df["LSOA crime count"]
x_axis = LSOA_crime_count_df["LSOA IMD Score"]

In [None]:
# Create a function to add a linear regression line to the current plot
def plot_linear_regression(x_axis, y_axis, title, text_coordinates):
    
    # Run the regression
    (slope, intercept, rvalue, pvalue, stderr) = linregress(x_axis, y_axis)
    regress_values = x_axis * slope + intercept
    line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))

    plt.plot(x_axis,regress_values,"r-")
    plt.annotate(line_eq,text_coordinates,fontsize=50,color="red")
    # rvalue is Pearson's r; its square is the coefficient of determination
    print(f"The r-squared value is: {rvalue**2}")

In [None]:
# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
# Plot the scatter
plt.title("Crime Count all LSOA's VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(65,400))
# Save the figure
plt.savefig("output/Crime Count all LSOA's VS IMD Score.png")
plt.show()

A scatter plot was chosen as we are working with a large dataset. This plot highlights a few outliers, which can be omitted from the statistical analysis for the sake of finding out whether there is a correlation between the LSOA crime count and the LSOA IMD score. We have therefore reduced the y limit to 600, rather than over 4000, so that we can visualise any correlation and view the linear regression.
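Note that changing the axis limits only hides the outliers from view; `linregress` still receives every point. If the intention is to leave them out of the fit as well, one option is to filter the data before running the regression. A minimal sketch, assuming a cut-off of 600 crimes (the y limit mentioned above) and toy values in place of `LSOA_crime_count_df`; the `drop_high_counts` helper is hypothetical:

```python
import pandas as pd

# Toy values for illustration only; one LSOA has a very high crime count.
toy_counts = pd.DataFrame({
    "LSOA IMD Score": [5.0, 12.0, 20.0, 33.0, 41.0, 55.0],
    "LSOA crime count": [40, 85, 120, 4100, 210, 260],
})

CRIME_COUNT_CUTOFF = 600  # assumed threshold, matching the plot's y-axis limit

def drop_high_counts(df, cutoff=CRIME_COUNT_CUTOFF):
    """Return only the rows whose crime count is at or below the cutoff."""
    return df[df["LSOA crime count"] <= cutoff].copy()

trimmed = drop_high_counts(toy_counts)
print(f"Kept {len(trimmed)} of {len(toy_counts)} LSOAs")
```

The trimmed frame could then be passed to the regression in place of the full one, and the number of excluded LSOAs reported alongside the result.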

The line of linear regression shows that there is a positive linear correlation, but it is quite weak.

The scatterplot shows that our hypothesis may not be correct. Whilst the number of crimes increases with a higher IMD score, this is not uniform. There are also high crime rates in LSOAs with low IMD scores.

We also recognise that crime is only one measurement used to create the IMD score.
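Because the strength of the correlation is central to this section, it helps to be explicit about which statistic is being reported. `linregress` returns Pearson's correlation coefficient as `rvalue`, and the value printed by `plot_linear_regression` is `rvalue**2`, the coefficient of determination. A short sketch on made-up numbers shows the difference:

```python
from scipy.stats import linregress

# Made-up values for illustration only.
imd_scores = [5.0, 12.0, 20.0, 33.0, 41.0, 55.0]
crime_counts = [40, 85, 60, 150, 110, 260]

result = linregress(imd_scores, crime_counts)
r = result.rvalue          # Pearson's r, between -1 and 1
r_squared = r ** 2         # share of variance explained by the fitted line
print(f"r = {r:.3f}, r-squared = {r_squared:.3f}")
```

Since r-squared is never larger in magnitude than r, a correlation described from the squared value will look weaker than the same correlation described from r.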

We also decided to see if there was any difference in correlation when plotting each district individually.

In [None]:
# Create local authority District DataFrames
birmingham_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000025")]
coventry_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000026")]
solihull_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000029")]
sandwell_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000028")]
dudley_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000027")]
wolverhampton_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000031")]
walsall_df = LSOA_crime_count_df.loc[(LSOA_crime_count_df["District code (2019)"] == "E08000030")]
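The same regression is repeated for each district in the cells that follow. As an alternative to one cell per district, the fit could be run in a single loop and the results collected in a table. The sketch below is one hypothetical way to do that, using toy data and a helper name (`r_squared_by_district`) that is not part of the notebook:

```python
import pandas as pd
from scipy.stats import linregress

# Toy data for illustration; the notebook would pass LSOA_crime_count_df.
toy = pd.DataFrame({
    "District code (2019)": ["E08000025"] * 4 + ["E08000026"] * 4,
    "LSOA IMD Score": [10.0, 20.0, 30.0, 40.0, 10.0, 20.0, 30.0, 40.0],
    "LSOA crime count": [50, 70, 90, 110, 80, 60, 100, 90],
})

def r_squared_by_district(df):
    """Fit crime count against IMD score separately for each district."""
    rows = []
    for code, group in df.groupby("District code (2019)"):
        fit = linregress(group["LSOA IMD Score"], group["LSOA crime count"])
        rows.append({
            "District code (2019)": code,
            "slope": fit.slope,
            "r_squared": fit.rvalue ** 2,
        })
    return pd.DataFrame(rows)

district_fits = r_squared_by_district(toy)
print(district_fits)
```

A table like this makes it easier to compare districts side by side than reading the value printed under each plot.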

Birmingham scatterplot to show LSOA crime count Vs LSOA IMD score

In [None]:
# Scatter plot to show results
y_axis = birmingham_df["LSOA crime count"]
x_axis = birmingham_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Birmingham Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(35,250))
plt.savefig("output/Birmingham Crime count VS IMD Score.png")
plt.show()

The r-squared value for the Birmingham scatterplot is lower than the value for all districts combined. This shows an even weaker linear correlation and suggests that a relatively larger share of crime is recorded in areas with lower IMD scores. The plot also shows that, regardless of IMD score, most areas tend to have lower levels of crime.

Coventry scatterplot to show LSOA crime count Vs LSOA IMD score.

In [None]:
# Scatter plot to show results
y_axis = coventry_df["LSOA crime count"]
x_axis = coventry_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Coventry Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(35,250))
plt.savefig("output/Coventry Crime count VS IMD Score.png")
plt.show()

The Coventry scatterplot shows that there are fewer crimes in areas with a low IMD score, but there is also a significant proportion of LSOAs where the number of crimes rises with the IMD score. The r-squared value for this plot is very similar to that of the plot for all districts, which shows a weak positive linear correlation.

Solihull scatterplot to show LSOA crime count Vs LSOA IMD score.

In [None]:
# Scatter plot to show results
y_axis = solihull_df["LSOA crime count"]
x_axis = solihull_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Solihull Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(35,250))
plt.savefig("output/Solihull Crime count VS IMD Score.png")
plt.show()

Solihull has an r-squared value similar to that of the plot for all districts. This shows that there is a weak positive correlation between LSOA IMD score and crime rate. There are more LSOAs that have a lower IMD score and low crime levels.

Sandwell scatterplot to show LSOA crime count Vs LSOA IMD score.

In [None]:
# Scatter plot to show results
y_axis = sandwell_df["LSOA crime count"]
x_axis = sandwell_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Sandwell Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(35,250))
plt.savefig("output/Sandwell Crime count VS IMD Score.png")
plt.show()

The Sandwell scatterplot shows that there aren't any LSOAs with an IMD score below 10. Despite this, the r-squared value shows the weakest correlation of all districts. This suggests that in Sandwell the crime count does not correlate with the IMD score and that crime occurs in all LSOAs, although this plot is similar to those of the other districts in that few LSOAs recorded more than 300 crimes in 2019.

Walsall scatterplot to show LSOA crime count Vs LSOA IMD score.

In [None]:
# Scatter plot to show results
y_axis = walsall_df["LSOA crime count"]
x_axis = walsall_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Walsall Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(30,250))
plt.savefig("output/Walsall Crime count VS IMD Score.png")
plt.show()

The Walsall r-squared value shows the strongest positive linear correlation. This supports our hypothesis that the higher the IMD score, the greater the crime count.

Wolverhampton scatterplot to show LSOA crime count Vs LSOA IMD score.

In [None]:
# Scatter plot to show results
y_axis = wolverhampton_df["LSOA crime count"]
x_axis = wolverhampton_df["LSOA IMD Score"]

# Set the figure size before plotting so that it applies to this figure
plt.rcParams["figure.figsize"] = (50,30)
plt.title("Wolverhampton Crime count VS IMD Score", fontsize=60)
plt.scatter(x_axis, y_axis, marker="o", color="blue")
plt.xlim(0,80)
plt.ylim(0, 600)
plt.xticks(fontsize=35)
plt.yticks(fontsize=35)
plt.xlabel("LSOA IMD Score", fontsize=45)
plt.ylabel("LSOA Crime Count", fontsize=45)
# Reuse plot_linear_regression, defined earlier, to add the regression line
plot_linear_regression(x_axis, y_axis, 'LSOA Crime Count',(30,250))
plt.savefig("output/Wolverhampton Crime count VS IMD Score.png")
plt.show()

Again, the Wolverhampton scatterplot is similar to the plot for all districts. It shows a weak positive correlation, with the majority of LSOAs recording fewer than 300 crimes in 2019.

Overall, these scatterplots show that there is a weak positive correlation between LSOA Index of Multiple Deprivation score and crime count. Walsall has the strongest positive correlation, and this supports our hypothesis. In the other districts, the majority of LSOAs do not have crime counts higher than 300. Sandwell had the weakest linear correlation, with crime occurring across all LSOAs. This highlights that not all areas with a high deprivation score have high crime counts and, vice versa, that areas with lower deprivation levels can still see high levels of crime.

We will continue this analysis by looking into which areas have higher rates of different types of crime. This may allow us to understand why we have not seen a strong positive correlation as expected.

## Research Question 2: Which district and Lower Layer Super Output Areas (LSOAs) have higher rates of crime?

In [None]:
# Groupby district
d = {"LSOA name": "first", "Crime ID": "count", "Population": "first", "Index of Multiple Deprivation (IMD)": "first"}
District_crime_count_df = West_midlands_df1.groupby(["District code (2019)"], as_index=False).agg(d)

# Rename columns
District_crime_count_df = District_crime_count_df.rename(columns={"Crime ID":"District crime count"})

# Total District Crime Count 
Total_crime_sum = District_crime_count_df["District crime count"].sum()

# Total Population for all Districts 
Total_pop = District_crime_count_df["Population"].sum()

# Change District Crime Count and population to percentages to plot them on the same bar plot
District_pop = District_crime_count_df["Population"].div(Total_pop)*100
District_crime = District_crime_count_df["District crime count"].div(Total_crime_sum)*100

# Dataframe with added percentages 
District_crime_count_df["District Population %"] = District_pop
District_crime_count_df["District Crime Count %"] = District_crime

District_crime_count_df


The above dataframe presents an interesting pattern which we did not expect. Solihull has a lower district crime count overall but a higher index of multiple deprivation than districts such as Wolverhampton, Walsall, Dudley, Sandwell, and Coventry, which have higher crime counts but lower IMD scores.

In [None]:
# Visualise data with bar plot 
District_crime_count_df.plot(x = "LSOA name", y = ["District Population %", "District Crime Count %"], kind ="bar", width=1, 
                             edgecolor="white", linewidth=10)

plt.title("Count of District Crime and Population", fontsize=40)
plt.xlabel("West Midlands District", fontsize=40)
plt.ylabel("Count of Crime and Population (%)", fontsize=40)
plt.xticks(rotation=55, fontsize=35)
plt.yticks(fontsize=35)
plt.legend(loc ="upper right", fontsize=40)
plt.tight_layout()
plt.savefig("output/Count of Crime and Population (%).png")
plt.show()

The above bar plot shows that Birmingham sees the highest level of crime by count, and that its percentage of West Midlands crime is higher than its percentage of the population. The same is true of Wolverhampton. This shows that the level of crime in Birmingham and Wolverhampton as districts is disproportionate to the population size.

This bar chart also suggests that Dudley and Solihull are comparatively safer, as their percentage of total crime is lower than their percentage of the total population.
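Comparing percentage shares is one way to relate crime to population; another common way is a rate per 1,000 residents. The sketch below computes both on invented figures. The district names and numbers are placeholders, not the notebook's results, and the two derived column names are assumptions.

```python
import pandas as pd

# Placeholder figures for illustration only.
toy_districts = pd.DataFrame({
    "District": ["District A", "District B", "District C"],
    "District crime count": [120000, 30000, 18000],
    "Population": [1000000, 300000, 200000],
})

# Crimes recorded per 1,000 residents.
toy_districts["Crimes per 1,000 residents"] = (
    toy_districts["District crime count"] / toy_districts["Population"] * 1000
)

# Ratio of crime share to population share: above 1 means the district
# accounts for more of the crime than of the population.
crime_fraction = toy_districts["District crime count"] / toy_districts["District crime count"].sum()
population_fraction = toy_districts["Population"] / toy_districts["Population"].sum()
toy_districts["Crime share / population share"] = crime_fraction / population_fraction

print(toy_districts)
```

A ratio above 1 corresponds to the bars where the crime percentage is taller than the population percentage.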

We have used a heatmap to visualise the areas within districts that have a higher exposure to crime. 

In [None]:
# Crime counts in all LSOA areas
LSOA_crime_count_df
LSOA_crime_locations = LSOA_crime_count_df[["Latitude", "Longitude"]]
incedents = LSOA_crime_count_df["LSOA crime count"].astype(float)
figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig1 = gmaps.figure(layout=figure_layout)
# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations, weights=incedents,
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)
# Add layer
fig1.add_layer(heat_layer)

# Display figure
fig1

This map shows that crime rates vary between areas. Here we can see that crime rates are higher nearer the city centres, by universities, and near hospitals.

## Research Question 3: What type of crime might you be exposed to in certain Districts/LSOAs?

## Birmingham crime data

In [None]:
Birmingham = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000025")]
Birmingham.head()

Birmingham["Crime type"].unique()

In [None]:
b = {"Crime type": "first", "Crime ID": "count"}
Birmingham_crime = Birmingham.groupby(["Crime type"], as_index=False).agg(b)
Birmingham_crime = Birmingham_crime.rename(columns={"Crime ID":"Birmingham Crime count"})
Birmingham_crime

In [None]:
# Take the labels from the grouped data so that they always match the slice order
labels = Birmingham_crime["Crime type"]
sizes = Birmingham_crime["Birmingham Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Birmingham Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Birmingham Crime Type (%) 2019.png")

In Birmingham you are more likely to be exposed to violence and sexual offences, which made up nearly a third of the crime in 2019. Anti-social behaviour and vehicle crime are the next two most significant crime types in Birmingham.

We have chosen the LSOA with the highest crime rate to dive deeper into the data and see if we can find any patterns.

In [None]:
# Find the LSOA with the max crime count for Birmingham
B_LSOA_crime_count_df = LSOA_crime_count_df[(LSOA_crime_count_df["District code (2019)"]=="E08000025")]
print(B_LSOA_crime_count_df[B_LSOA_crime_count_df["LSOA crime count"] == B_LSOA_crime_count_df["LSOA crime count"].max()])
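An equivalent and slightly more direct way to pick out the busiest LSOA is `idxmax`, which can also be applied to every district in one pass. A sketch on toy data: the crime counts are invented, and the column names follow `LSOA_crime_count_df`.

```python
import pandas as pd

# Invented counts for illustration only.
toy_lsoa = pd.DataFrame({
    "LSOA name": ["Birmingham 001A", "Birmingham 138A",
                  "Solihull 009A", "Solihull 010B"],
    "District code (2019)": ["E08000025", "E08000025",
                             "E08000029", "E08000029"],
    "LSOA crime count": [120, 4000, 900, 150],
})

# idxmax returns, for each district, the row label of the highest crime count.
top_rows = toy_lsoa.groupby("District code (2019)")["LSOA crime count"].idxmax()
busiest_lsoas = toy_lsoa.loc[top_rows].reset_index(drop=True)
print(busiest_lsoas)
```

If two LSOAs in a district tie for the highest count, `idxmax` keeps only the first, whereas the equality filter used in the notebook returns both.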

In [None]:
# # Here we have used the 'MapIt API' to find out which area this code represents
# url = mapit.mysociety.org/code/gss/
# query_url = url+code

# code = E08000025
# code/ons/138A

In [None]:
# Show crime types for LSOA
Birmingham_max_crime = West_midlands_df1[(West_midlands_df1["LSOA name"]=="Birmingham 138A")]

bm = {"Crime type": "first", "Crime ID": "count"}
Birmingham_max_crime_count = Birmingham_max_crime.groupby(["Crime type"], as_index=False).agg(bm)
Birmingham_max_crime_count = Birmingham_max_crime_count.rename(columns={"Crime ID":"Birmingham 138A Crime count"})
Birmingham_max_crime_count

# Take the labels from the grouped data so that they always match the slice order
labels = Birmingham_max_crime_count["Crime type"]
sizes = Birmingham_max_crime_count["Birmingham 138A Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Birmingham 138A Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Birmingham 138A Crime Type (%) 2019.png")

Interestingly, whilst violence and sexual offences make up a significant proportion of crimes, shoplifting has a slightly higher percentage. This means that in the LSOA with the max crime count, shoplifting is the crime that you would be most exposed to. Vehicle crime is significantly lower than in Birmingham as a whole. We believe this is because this LSOA covers the area between Birmingham New Street and Birmingham Snow Hill stations, where there is a higher density of shops and fewer cars.

## Coventry crime data

In [None]:
Coventry = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000026")]
c = {"Crime type": "first", "Crime ID": "count"}
Coventry_crime = Coventry.groupby(["Crime type"], as_index=False).agg(c)
Coventry_crime = Coventry_crime.rename(columns={"Crime ID":"Coventry Crime count"})
Coventry_crime

In [None]:
# Take the labels from the grouped data so that they always match the slice order
labels = Coventry_crime["Crime type"]
sizes = Coventry_crime["Coventry Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Coventry Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Coventry Crime Type (%) 2019.png")

The crime types in Coventry follow the same trend as the crime types in Birmingham. One notable difference, however, is the higher rate of bike theft. We will explore this further later on.
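To compare crime-type mixes across districts directly, rather than reading each pie chart in turn, the shares can be tabulated side by side. A minimal sketch with `pd.crosstab` on invented rows; the notebook would pass the corresponding columns of `West_midlands_df1` instead.

```python
import pandas as pd

# Invented rows for illustration only.
toy_crimes = pd.DataFrame({
    "District name (2019)": ["Birmingham"] * 5 + ["Coventry"] * 5,
    "Crime type": [
        "Violence and sexual offences", "Violence and sexual offences",
        "Vehicle crime", "Bicycle theft", "Burglary",
        "Violence and sexual offences", "Bicycle theft",
        "Bicycle theft", "Vehicle crime", "Burglary",
    ],
})

# normalize="index" turns each row into shares of that district's total,
# so multiplying by 100 gives the percentage of crimes by type.
crime_type_share = pd.crosstab(
    toy_crimes["District name (2019)"],
    toy_crimes["Crime type"],
    normalize="index",
) * 100
print(crime_type_share.round(1))
```

Each row sums to 100, so a single column, such as bicycle theft, can be read down the table to compare districts.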

## Dudley crime data

In [None]:
Dudley = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000027")]
du = {"Crime type": "first", "Crime ID": "count"}
Dudley_crime = Dudley.groupby(["Crime type"], as_index=False).agg(du)
Dudley_crime = Dudley_crime.rename(columns={"Crime ID":"Dudley Crime count"})
Dudley_crime

In [None]:
# Take the labels from the grouped data so that they always match the slice order
labels = Dudley_crime["Crime type"]
sizes = Dudley_crime["Dudley Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Dudley Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Dudley Crime Type (%) 2019.png")

This pie chart shows that Dudley has a higher rate of violence and sexual offences and burglary but is the least risky place to leave your bike. Again, this data shows that the crime types and the percentages of these crimes are similar to those in the other districts.

## Sandwell crime data

In [None]:
Sandwell = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000028")]
s = {"Crime type": "first", "Crime ID": "count"}
Sandwell_crime = Sandwell.groupby(["Crime type"], as_index=False).agg(s)
Sandwell_crime = Sandwell_crime.rename(columns={"Crime ID":"Sandwell Crime count"})
Sandwell_crime

In [None]:
# Take the labels from the grouped data so that they always match the slice order
labels = Sandwell_crime["Crime type"]
sizes = Sandwell_crime["Sandwell Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Sandwell Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Sandwell Crime Type (%) 2019.png")

In Sandwell, bike theft is also a very small percentage, but the percentage of violence and sexual offences is the second highest in the West Midlands. It made up over a third of crimes in 2019.

## Solihull crime data

In [None]:
Solihull = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000029")]
so = {"Crime type": "first", "Crime ID": "count"}
Solihull_crime = Solihull.groupby(["Crime type"], as_index=False).agg(so)
Solihull_crime = Solihull_crime.rename(columns={"Crime ID":"Solihull Crime count"})
Solihull_crime

In [None]:
# Take the labels from the grouped data so that they always match the slice order
labels = Solihull_crime["Crime type"]
sizes = Solihull_crime["Solihull Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Solihull Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Solihull Crime Type (%) 2019.png")

The violence and sexual offences rate in Solihull is still the most significant crime but the percentage is much lower than the other districts. The rate of vehicle crime, other theft, and burglary is higher than the other districts.

We decided to look into the LSOAs for Solihull, as the crime data and IMD score did not behave in the way we expected.

In [None]:
# Find the LSOA with the highest crime count in Solihull
B_LSOA_crime_count_df = LSOA_crime_count_df[(LSOA_crime_count_df["District code (2019)"]=="E08000029")]
print(B_LSOA_crime_count_df[B_LSOA_crime_count_df["LSOA crime count"] == B_LSOA_crime_count_df["LSOA crime count"].max()])

In [None]:
# Show crime types for LSOA
Solihull_max_crime = West_midlands_df1[(West_midlands_df1["LSOA name"]=="Solihull 009A")]

# TODO: use an API to find out which area this LSOA covers

sol = {"Crime type": "first", "Crime ID": "count"}
Solihull_max_crime_count = Solihull_max_crime.groupby(["Crime type"], as_index=False).agg(sol)
Solihull_max_crime_count = Solihull_max_crime_count.rename(columns={"Crime ID":"Solihull 009A Crime count"})
Solihull_max_crime_count

labels = ["Anti-social behaviour", "Bicycle theft", "Burglary", "Criminal damage and arson", "Drugs", "Other crime", "Other theft", "Possession of weapons", "Public order", "Robbery", "Shoplifting", "Theft from the person", "Vehicle crime", "Violence and sexual offences"]
sizes = Solihull_max_crime_count["Solihull 009A Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0, 0, 0 , 0.1, 0)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Solihull 009A Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Solihull 009A Crime Type (%) 2019.png")

The LSOA with the highest crime count in Solihull shows something interesting: it does not reflect the overall pattern of crime types in the West Midlands in 2019, but it does match the overall percentages of crime types in Solihull. The percentages of vehicle crime and other theft are much higher than in the other districts and in the LSOA with the highest crime count in Birmingham. Interestingly, possession of weapons is another crime type with a higher percentage than in the other districts.
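
A single LSOA may not record every crime type, and `plt.pie` needs `labels` and `explode` to be the same length as `sizes`, so a hard-coded list of 14 labels can fail for a small area. The sketch below builds all three inputs from the data instead. The helper name `pie_inputs` and the toy records are hypothetical, and it assumes one row per recorded crime; it counts rows with `size()` so that records with a missing `Crime ID` are still included.

```python
import pandas as pd


def pie_inputs(crimes, highlight=("Violence and sexual offences",), offset=0.1):
    """Return (labels, sizes, explode) for plt.pie, built from the data itself."""
    # size() counts rows, so records with a missing Crime ID are still counted
    counts = crimes.groupby("Crime type").size().sort_index()
    labels = counts.index.tolist()
    sizes = counts.tolist()
    # Offset only the highlighted crime types that are actually present
    explode = [offset if label in highlight else 0 for label in labels]
    return labels, sizes, explode


# Hypothetical toy records standing in for one LSOA
toy = pd.DataFrame({
    "Crime type": ["Vehicle crime", "Other theft", "Vehicle crime", "Burglary"],
})
labels, sizes, explode = pie_inputs(toy, highlight=("Vehicle crime", "Other theft"))
print(labels, sizes, explode)
```

The three values can then be passed to `plt.pie(sizes, labels=labels, explode=explode, ...)` in the same way as in the cells above.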

## Walsall crime data 

In [None]:
Walsall = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000030")]
w = {"Crime type": "first", "Crime ID": "count"}
Walsall_crime = Walsall.groupby(["Crime type"], as_index=False).agg(w)
Walsall_crime = Walsall_crime.rename(columns={"Crime ID":"Walsall Crime count"})
Walsall_crime

In [None]:
labels = ["Anti-social behaviour", "Bicycle theft", "Burglary", "Criminal damage and arson", "Drugs", "Other crime", "Other theft", "Possession of weapons", "Public order", "Robbery", "Shoplifting", "Theft from the person", "Vehicle crime", "Violence and sexual offences"]
sizes = Walsall_crime["Walsall Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Walsall Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Walsall Crime Type (%) 2019.png")

The Walsall pie chart reflects the data from the other districts. Nothing is too dissimilar from the crime type percentages in the rest of the West Midlands.

## Wolverhampton crime data

In [None]:
Wolverhampton = West_midlands_df1[(West_midlands_df1["District code (2019)"]=="E08000031")]
wo = {"Crime type": "first", "Crime ID": "count"}
Wolverhampton_crime = Wolverhampton.groupby(["Crime type"], as_index=False).agg(wo)
Wolverhampton_crime = Wolverhampton_crime.rename(columns={"Crime ID":"Wolverhampton Crime count"})
Wolverhampton_crime

In [None]:
labels = ["Anti-social behaviour", "Bicycle theft", "Burglary", "Criminal damage and arson", "Drugs", "Other crime", "Other theft", "Possession of weapons", "Public order", "Robbery", "Shoplifting", "Theft from the person", "Vehicle crime", "Violence and sexual offences"]
sizes = Wolverhampton_crime["Wolverhampton Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Wolverhampton Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Wolverhampton Crime Type (%) 2019.png")

Wolverhampton has the highest percentage of violent and sexual offences in the West Midlands. 

Overall: We found that, despite the differences in IMD scores between districts, the mix of crime types by percentage in 2019 was mostly similar across the West Midlands.

To test this we chose a sample of LSOAs with an IMD score of less than 5 to ascertain what types of crimes were committed in areas with low IMD scores compared with areas with high IMD scores.

In [None]:
# Filter dataframe by IMD scores < 5

low_IMD = West_midlands_df1.loc[(West_midlands_df1["LSOA IMD Score"] < 5)]
l = {"Crime type": "first", "Crime ID": "count"}
low_IMD_crime = low_IMD.groupby(["Crime type"], as_index=False).agg(l)
low_IMD_crime= low_IMD_crime.rename(columns={"Crime ID":"Crime count"})
low_IMD_crime

In [None]:
labels = ["Anti-social behaviour", "Bicycle theft", "Burglary", "Criminal damage and arson", "Drugs", "Other crime", "Other theft", "Possession of weapons", "Public order", "Robbery", "Shoplifting", "Theft from the person", "Vehicle crime", "Violence and sexual offences"]
sizes = low_IMD_crime["Crime count"]
explode = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 , 0 , 0.1)
plt.pie(sizes, 
        labels = labels,
        explode = explode,
        autopct = "%1.1f%%",
       textprops={'fontsize': 18})
plt.title("Low IMD Crime Type (%) 2019", fontsize=30)
plt.savefig("output/Low IMD Crime Type (%) 2019.png")

This pie chart shows that burglary, shoplifting, and criminal damage and arson are higher in LSOAs with lower IMD scores. Vehicle crime is massively higher in LSOAs with lower IMD scores. However, theft from the person and public order crime are much lower than in LSOAs with higher IMD scores.
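
The comparison above reads the low-IMD chart against the separate district charts. One way to put both groups side by side is a cross-tabulation of crime type against an IMD band, normalised within each band. This is only a sketch: the function name `crime_share_by_band`, the band labels, and the toy records are hypothetical, and "high IMD" here simply means every LSOA at or above the threshold of 5 used in the filter.

```python
import pandas as pd


def crime_share_by_band(crimes, threshold=5):
    """Percentage of each crime type within the low and high IMD bands."""
    # Records without an IMD score cannot be placed in either band
    crimes = crimes.dropna(subset=["LSOA IMD Score"])
    band = crimes["LSOA IMD Score"].lt(threshold).map(
        {True: "low IMD", False: "high IMD"}
    )
    # normalize="columns" makes each band's percentages sum to 100
    shares = pd.crosstab(crimes["Crime type"], band, normalize="columns") * 100
    return shares.round(1)


# Hypothetical toy records, one row per crime
toy = pd.DataFrame({
    "Crime type": ["Vehicle crime", "Vehicle crime", "Burglary",
                   "Public order", "Public order", "Burglary"],
    "LSOA IMD Score": [3.0, 4.5, 2.0, 30.0, 41.2, 25.0],
})
shares = crime_share_by_band(toy)
print(shares)
```

Each column of the result is directly comparable with the other, which makes statements such as "vehicle crime is higher in low-IMD LSOAs" easier to check than reading two pie charts.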

##  District heatmaps 

We have used heatmaps to further this point. The heatmaps plot all the crime for each district, so we can see how crime is distributed geographically.

Birmingham crime heatmap 2019:

In [None]:
birmingham_df
LSOA_crime_locations1 = birmingham_df[["Latitude", "Longitude"]]
incedents1 = birmingham_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig1 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations1, weights=incedents1, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig1.add_layer(heat_layer)

# Display figure
fig1

Coventry crime heatmap 2019:

In [None]:
coventry_df
LSOA_crime_locations2 = coventry_df[["Latitude", "Longitude"]]
incedents2 = coventry_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig2 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations2, weights=incedents2, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig2.add_layer(heat_layer)

# Display figure
fig2

Solihull crime heatmap 2019:

In [None]:
solihull_df
LSOA_crime_locations3 = solihull_df[["Latitude", "Longitude"]]
incedents3 = solihull_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig3 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations3, weights=incedents3, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig3.add_layer(heat_layer)

# Display figure
fig3

Walsall crime heatmap 2019:

In [None]:
walsall_df
LSOA_crime_locations4 = walsall_df[["Latitude", "Longitude"]]
incedents4 = walsall_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig4 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations4, weights=incedents4, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig4.add_layer(heat_layer)

# Display figure
fig4

Sandwell crime heatmap 2019:

In [None]:
sandwell_df
LSOA_crime_locations5 = sandwell_df[["Latitude", "Longitude"]]
incedents5 = sandwell_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig5 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations5, weights=incedents5, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig5.add_layer(heat_layer)

# Display figure
fig5

Wolverhampton crime heatmap 2019:

In [None]:
wolverhampton_df
LSOA_crime_locations6 = wolverhampton_df[["Latitude", "Longitude"]]
incedents6 = wolverhampton_df["LSOA crime count"].astype(float)

figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig6 = gmaps.figure(layout=figure_layout)



# Create heat layer
heat_layer = gmaps.heatmap_layer(LSOA_crime_locations6, weights=incedents6, 
                                 dissipating=False, max_intensity=600,
                                 point_radius=.01)


# Add layer
fig6.add_layer(heat_layer)

# Display figure
fig6

The above heatmaps show that crime was concentrated in the centres of the districts, where we can assume there are city centres and shopping centres. The heatmaps also show smaller pockets of high crime in other areas within the districts. Birmingham is a partial exception: its city centre follows this trend, but there are pockets of high crime in many other areas as well. This supports the other tests and visualisations so far, which highlight that high crime rates tend to occur in city centres and around universities.
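
The six heatmap cells above repeat the same preparation steps for each district. A minimal sketch of a shared helper follows. The name `heatmap_inputs` and the toy data are hypothetical, and dropping rows with missing coordinates is an added assumption rather than something the notebook does. The quantile-based scale at the end is also only a suggestion, shown as an alternative to the fixed `max_intensity=600`.

```python
import pandas as pd


def heatmap_inputs(district_df, weight_col="LSOA crime count"):
    """Return (locations, weights) ready to hand to a heatmap layer."""
    # Rows without coordinates or a count cannot be plotted, so drop them first
    clean = district_df.dropna(subset=["Latitude", "Longitude", weight_col])
    locations = clean[["Latitude", "Longitude"]]
    weights = clean[weight_col].astype(float)
    return locations, weights


# Hypothetical toy district with one row missing its latitude
toy = pd.DataFrame({
    "Latitude": [52.48, 52.41, None],
    "Longitude": [-1.90, -1.51, -2.13],
    "LSOA crime count": [120, 45, 60],
})
locations, weights = heatmap_inputs(toy)

# A per-district colour scale (an assumption, not the notebook's fixed 600)
suggested_max = weights.quantile(0.95)
print(len(locations), suggested_max)
```

The returned pair can be passed to `gmaps.heatmap_layer(locations, weights=weights, ...)` as in the cells above. Keeping one fixed `max_intensity` makes colours comparable between districts, whereas a per-district value shows more structure within a single district.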

Interestingly, this pattern is particularly clear for one crime type: bicycle theft.

## Research Question 4: Where is the safest place to park your bike?

We decided to focus on bike theft because it makes up a much smaller share of crime than the other crime types in the West Midlands. We also noticed on the heatmaps that there seemed to be a pattern in the location of bike thefts.
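
When looking for the safest place to park, LSOAs with no recorded bike thefts matter most, and they only appear if the theft counts are joined onto the full list of LSOAs and a missing count is treated as zero. The sketch below shows one way to do that. The function name `thefts_per_lsoa`, the toy LSOA codes, and the toy names are hypothetical.

```python
import pandas as pd


def thefts_per_lsoa(crimes, lsoas):
    """Bike theft count for every LSOA, including those with none recorded."""
    thefts = crimes.loc[crimes["Crime type"] == "Bicycle theft", "LSOA code"]
    counts = (
        thefts.value_counts()
        .rename_axis("LSOA code")
        .rename("Bike theft count")
    )
    # A left join keeps every LSOA; a missing count means no thefts recorded
    merged = lsoas.merge(counts.reset_index(), on="LSOA code", how="left")
    merged["Bike theft count"] = merged["Bike theft count"].fillna(0).astype(int)
    return merged.sort_values(["Bike theft count", "LSOA code"]).reset_index(drop=True)


# Hypothetical toy data
crimes = pd.DataFrame({
    "Crime type": ["Bicycle theft", "Bicycle theft", "Burglary", "Bicycle theft"],
    "LSOA code": ["E01", "E01", "E02", "E03"],
})
lsoas = pd.DataFrame({
    "LSOA code": ["E01", "E02", "E03"],
    "LSOA name": ["Toy 001A", "Toy 001B", "Toy 001C"],
})
ranked = thefts_per_lsoa(crimes, lsoas)
print(ranked)
```

A low count is not necessarily a low risk, since fewer bikes may be parked in that LSOA to begin with, so the ranking is best read alongside the map.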

In [None]:
# Create IMD score dataframe
IMD_score_df = pd.read_csv('IMD Score.csv', index_col = 0)
IMD_score_df.dropna(inplace = True) 
IMD_score_df.head()

In [None]:
West_midlands_df1.columns

In [None]:
# Collect a list of relevant column names and save to a variable
columns = [
    "LSOA code",
    "LSOA name",
    "LSOA IMD Score",
    "LONG_",
    "LAT",
    "Crime type"
]

# Keep only the bicycle theft records, restricted to the relevant columns
bike_theft_df = West_midlands_df1.loc[West_midlands_df1["Crime type"] == "Bicycle theft", columns]
# Count the bike thefts in each LSOA
bike_theft_counts = bike_theft_df["LSOA code"].value_counts()
# Name the index and the count column explicitly, because the default
# names produced by value_counts() differ between pandas versions
bike_theft_counts_df = bike_theft_counts.rename_axis("LSOA code").to_frame("Bike theft count")
# Merge two dataframes using an outer join and save to a variable
merge1_df = pd.merge(bike_theft_counts_df,IMD_score_df, on = "LSOA code",how="outer")
# Drop the null values
merge1_df.dropna(inplace = True)
# re-order the columns and save to a variable
bike_theft_final_df = merge1_df[["LSOA name", "LSOA IMD Score",
                            "LONG__y", "LAT_y", "Bike theft count"]]
# display the dataframe
bike_theft_final_df.head()

In [None]:
mean_value7 = bike_theft_final_df["Bike theft count"].mean()
mean_value7

In [None]:
locations7 = bike_theft_final_df[["LAT_y", "LONG__y"]]
incedents7 = bike_theft_final_df["Bike theft count"].astype(float)

# Collect the LSOA names for the marker info boxes

IMD = IMD_score_df["LSOA name"].tolist()
# Create a map using coordinates to set markers
marker_locations7 = IMD_score_df[["LAT_y", "LONG__y"]]

# Customize the size of the figure
figure_layout = {
    'width': '800px',
    'height': '600px',
    'padding': '1px',
    'margin': '0 auto 0 auto'
}
fig7 = gmaps.figure(layout=figure_layout)

markers = gmaps.symbol_layer(
    marker_locations7, fill_color='rgba(0, 0, 255, 0.1)',
    stroke_color='rgba(0, 0, 255, 0.1)', scale=2,
    info_box_content=[f"LSOA NAME: {name}" for name in IMD]
)

fig7.add_layer(markers)

# Create heat layer
heat_layer = gmaps.heatmap_layer(locations7, weights=incedents7, 
                                 dissipating=False, max_intensity=mean_value7*4,
                                 point_radius=.01)


# Add layer
fig7.add_layer(heat_layer)

# Display figure
fig7


In [None]:
# geocoordinates
target_coordinates = "52.489471, -1.898575"
target_search = "university"
target_radius = 50000  # metres; Nearby Search accepts a radius of at most 50,000
target_type = "university"

# set up a parameters dictionary
params = {
    "location": target_coordinates,
    "keyword": target_search,
    "radius": target_radius,
    "type": target_type,
    "key": gkey
}


# base url
base_url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"

# run a request using our params dictionary
response = requests.get(base_url, params=params)
# convert response to json
places_data = response.json()


In [None]:
places_data

We found that bike thefts tended to occur around universities and student accommodation. 
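
One way to make this finding measurable is to compute the distance from each LSOA to the nearest university in the Places response. The sketch below uses the haversine formula and reads the `results[i]["geometry"]["location"]` fields of a Nearby Search response. The helper names and the toy response are hypothetical, and the coordinates in it are made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km is the mean Earth radius


def nearest_place(lat, lon, places_data):
    """Return (name, distance_km) of the closest result, or None if empty."""
    best = None
    for place in places_data.get("results", []):
        loc = place["geometry"]["location"]
        dist = haversine_km(lat, lon, loc["lat"], loc["lng"])
        if best is None or dist < best[1]:
            best = (place["name"], dist)
    return best


# Hypothetical response with the same nesting as a Nearby Search result
toy_places = {"results": [
    {"name": "Campus A", "geometry": {"location": {"lat": 52.4500, "lng": -1.9300}}},
    {"name": "Campus B", "geometry": {"location": {"lat": 52.4860, "lng": -1.8900}}},
]}
name, km = nearest_place(52.489471, -1.898575, toy_places)
print(name, round(km, 2))
```

Applying `nearest_place` to the latitude and longitude of each LSOA in `bike_theft_final_df` would give a distance column that could be compared against `Bike theft count`.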

## Conclusion

We found the following:
- There is a weak positive correlation between LSOA Index of Multiple Deprivation score and crime count. Walsall has the strongest positive correlation, and this supports our hypothesis. The other districts do not tend to have a significant majority of LSOAs with crime counts higher than 300. Sandwell had the weakest linear correlation, with higher counts of crime in all LSOAs. This highlights that not all areas with a high deprivation score have high crime counts and, vice versa, that areas with lower deprivation levels still see high levels of crime.
- A lower IMD score does not mean that there will be less crime and a higher IMD score does not mean that there will be a higher crime rate. 
- There are higher crime rates nearer the city centres, by universities, and near hospitals.
- Each district in the West Midlands has a similar crime trend. Violence and sexual offences make up the largest proportion of crime in all districts, and the other types of crime also follow similar patterns.
- Depending on which district and LSOA you are in, you will witness different crimes, e.g. in Solihull you are more likely to witness vehicle crime or theft, whereas in Wolverhampton there are more violent offences.
- Wolverhampton has the highest rate of violent and sexual offences in the West Midlands.
- Burglary, shoplifting, and criminal damage and arson are higher in LSOAs with lower IMD scores. Vehicle crime is massively higher in LSOAs with lower IMD scores. However, theft from the person and public order crime are much lower than in LSOAs with higher IMD scores.
- Bike thefts tended to occur around universities and student accommodation. 
- High crime rates were concentrated in the centres of the districts and there were smaller pockets of high crime rates in other areas within the districts.
