<a href="https://colab.research.google.com/github/KimD86/Assessment/blob/main/Coding_Assessment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#AN ANALYSIS OF VIOLENT OFFENCES IN THE NORTH EAST ON FOOTBALL GAME DAYS.
DO VIOLENT OFFENCES INCREASE ON THE DAYS NEWCASTLE UNITED PLAY AT HOME?

This report analyses recorded violent offences in the North East of England and asks whether the number of cases rises when the local football team plays at home. The area examined is Newcastle, in the months when Newcastle United played games.
Violent crime covers a variety of offences, ranging from common assault to murder. It also encompasses the use of weapons such as firearms, knives and corrosive substances like acid.
The data was retrieved from [data.police.uk](https://data.police.uk/) and covers both the 2020-2021 and the 2021-2022 football seasons. The months covered for each season run from July to May.
Spreadsheets were built from the information on the football club's website and were pulled in to create tables, graphs and maps showing where violent offences are a problem.  

#LIMITATIONS

The data from the police data website has several limitations:

1. In the data.police.uk records, violence and sexual offences are combined into a single category, so it is not possible to tell whether a recorded crime was violent, sexual, or both. For the purpose of this report, these crimes have been assumed to be violent offences.
2. Not all crimes are reported, so it is uncertain how accurate the statistics are.
3. Many factors usually contribute to crime in a given area, so no single reason can be stated with certainty as the main cause.
4. A month with an increase in recorded crime does not necessarily mean that more crimes were committed; it could mean that more people reported offences.

In [None]:
import pandas as pd

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import folium
from folium import plugins

In [None]:
from folium.plugins import HeatMap
from folium.plugins import MarkerCluster
from folium.plugins import MeasureControl
import numpy as np

The Python library pandas has been used and imported as `pd`; the name is derived from "panel data". Pandas is used for data manipulation and analysis and offers data structures and operations for manipulating numerical tables and time series.

##BRIEFING

Football in the North East has a large following and many fans attend home and away games. When there have been home games in the Newcastle area there have also been a lot of violent crimes recorded. The report analyses the data to look for a link between home football games and violent offences. One example of a violent offence at a home game is reported by Northumbria Police: "Four men given banning orders after disorder flared before Newcastle United home game". https://beta.northumbria.police.uk/latest-news/2023/march/four-men-given-banning-orders-after-disorder-flared-before-newcastle-united-home-game/

##Methodology

The data used in this report was collected from several sources, including the police data website, where crime statistics were downloaded for the months needed for comparison. Pandas has been used to load the data, build new data frames and analyse the crime type of interest, which here is violent offences.  

The following tables were made as Excel spreadsheets and show the Newcastle United results for the two seasons examined, 2020/2021 and 2021/2022.
They show the date of each game, the opponent, whether it was a home or away game, and the score.  

In [None]:
df_newcastle_season_20202021=pd.read_excel('/content/drive/MyDrive/Colab Notebooks/Season 2020 2021.xlsx')
df_newcastle_season_20202021

A spreadsheet was created to show the scores for all the Newcastle United football games in the 2020-2021 season. Pandas then read the file from Google Drive into a new data frame called df_newcastle_season_20202021, which is displayed above.

In [None]:
df_newcastle_season_20212022=pd.read_excel('/content/drive/MyDrive/Season 2021 2022 games.xlsx')
df_newcastle_season_20212022

A second spreadsheet was created for the 2021-2022 season, and pandas was used to load it into a new data frame in the same way.

The two tables above show the Newcastle United games for the 2020-2021 and 2021-2022 seasons. They show the dates of the games, whether they were played at home or away, the teams played and whether the game was a win or a loss. From these tables and the crime data, charts, tables and graphs will be made to show which months have a higher rate of violent offences. Months with more home games will be compared with months with more away games to see whether offences increase.
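
The comparison described above needs a count of home games per month. The following is a minimal sketch of one way to derive it, not the method used in this report. The column names `Date` and `Venue` are assumptions about how the spreadsheet is laid out, and the rows are made-up placeholders rather than real fixtures.

```python
import pandas as pd

# Hypothetical fixtures table; the column names 'Date' and 'Venue' are
# assumptions and the rows are made-up placeholders, not real fixtures.
fixtures = pd.DataFrame({
    "Date": ["2020-09-12", "2020-09-20", "2020-10-03", "2020-10-17", "2020-10-25"],
    "Venue": ["Away", "Home", "Home", "Home", "Away"],
})

# Convert to datetimes and derive a YYYY-MM key in the same style as the
# 'Month' column of the police files.
fixtures["Date"] = pd.to_datetime(fixtures["Date"])
fixtures["Month"] = fixtures["Date"].dt.strftime("%Y-%m")

# Keep home fixtures only and count them per month.
home_games = (
    fixtures[fixtures["Venue"].str.strip().str.lower() == "home"]
    .groupby("Month")
    .size()
    .rename("Home games")
    .reset_index()
)
print(home_games)
```

A table in this shape can later be joined to monthly crime counts on the `Month` column.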

Once the fixture spreadsheets had been loaded, the police data website was used to download the crime files for Northumbria so that the crime type could be analysed. Each month being analysed was loaded using the following code and method.

Working through the code gives an understanding of what the investigation set out to do.


In [None]:
df_crime_july_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-07/2020-07-northumbria-street.csv')

In [None]:
df_crime_july_2020.head()

In [None]:
len(df_crime_july_2020)

`len` returns the total number of items in a container. Here it gives the number of rows in the data frame, which is the total number of crimes recorded in the specified area for July 2020.

In [None]:
df_crime_july_2020['Crime type'].value_counts()

In [None]:
#@title Chart showing crime types for July 2020
g = sns.catplot(x="Crime type",
                data=df_crime_july_2020,
                row= 'Month',
                kind="count",
                palette="Reds", # see this link https://seaborn.pydata.org/tutorial/color_palettes.html
                aspect =3,
                height=4,
                order=df_crime_july_2020['Crime type'].value_counts().sort_values(ascending=False).index
                )
g.set_xticklabels(rotation=70, horizontalalignment='right')

The bar chart shows the crime types and how many of each were reported in July 2020. Violent offences are the most frequently reported crime type for this month.
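
To put the counts behind a chart like this into proportion, `value_counts` can also return shares rather than raw counts. The sketch below is illustrative only: the rows are made-up placeholders standing in for one month of police data, and only the `Crime type` column name is taken from the files used above.

```python
import pandas as pd

# Made-up placeholder rows standing in for one month of police data.
crimes = pd.DataFrame({
    "Crime type": [
        "Violence and sexual offences",
        "Violence and sexual offences",
        "Anti-social behaviour",
        "Burglary",
    ]
})

counts = crimes["Crime type"].value_counts()                # absolute counts, largest first
shares = crimes["Crime type"].value_counts(normalize=True)  # fraction of all records

# Combine both views into one table, with shares shown as percentages.
summary = pd.DataFrame({"Count": counts, "Share (%)": (shares * 100).round(1)})
print(summary)
```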

In [None]:
df_crime_august_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-08/2020-08-northumbria-street.csv')

In [None]:
df_crime_august_2020.head()

In [None]:
df_crime_august_2020['Crime type'].value_counts()

In [None]:
#@title Chart showing crime types from August 2020
g = sns.catplot(x="Crime type",
                data=df_crime_august_2020,
                row= 'Month',
                kind="count",
                palette="Reds", # see this link https://seaborn.pydata.org/tutorial/color_palettes.html
                aspect =3,
                height=4,
                order=df_crime_august_2020['Crime type'].value_counts().sort_values(ascending=False).index
                )
g.set_xticklabels(rotation=70, horizontalalignment='right')

In [None]:
df_crime_september_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-09/2020-09-northumbria-street.csv')

In [None]:
df_crime_september_2020.head()

In [None]:
df_crime_october_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-10/2020-10-northumbria-street.csv')

In [None]:
df_crime_october_2020.head()

In [None]:
df_crime_november_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-11/2020-11-northumbria-street.csv')

In [None]:
df_crime_november_2020.head()

In [None]:
df_december_crime_2020=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2020-12/2020-12-northumbria-street.csv')

In [None]:
df_december_crime_2020.head()

In [None]:
df_january_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2021-01/2021-01-northumbria-street.csv')
df_january_crime_2021.head()

In [None]:
df_february_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2021-02/2021-02-northumbria-street.csv')
df_february_crime_2021.head()

In [None]:
df_march_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2021-03/2021-03-northumbria-street.csv')
df_march_crime_2021.head()

In [None]:
df_april_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2021-04/2021-04-northumbria-street.csv')
df_april_crime_2021.head()

In [None]:
df_may_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/2021-05/2021-05-northumbria-street.csv')
df_may_crime_2021.head()

The tables and graphs above show the recorded crime for the Northumbria area, including the crime type and the location where each crime happened.
The graphs help to visualise the data, making it easier to understand what is happening and to identify any trends.
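
The monthly files above are loaded one cell at a time. As a possible alternative, the file paths could be generated in a loop. This is only a sketch: the folder layout is copied from the `read_csv` calls above, and the helper name `monthly_paths` is hypothetical.

```python
import pandas as pd

# Assumed folder layout, copied from the read_csv calls above.
BASE = "/content/drive/MyDrive/Colab Notebooks"

def monthly_paths(start, end, force="northumbria", base=BASE):
    """Return {month: path} for every month from start to end inclusive."""
    months = pd.period_range(start=start, end=end, freq="M")
    return {
        str(m): f"{base}/{str(m)}/{str(m)}-{force}-street.csv"
        for m in months
    }

paths = monthly_paths("2020-07", "2021-05")
print(len(paths))        # number of monthly files expected
print(paths["2020-07"])  # path built for the first month

# In the notebook, the files could then be read in one pass:
# frames = [pd.read_csv(p) for p in paths.values()]
```

Generating the paths this way keeps the naming of the data frames consistent and avoids copy-and-paste slips between months.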

In [None]:
df_allcrime_northumbria_20202021=pd.concat([df_crime_july_2020, df_crime_august_2020, df_crime_september_2020, df_crime_october_2020, df_crime_november_2020, df_december_crime_2020, df_january_crime_2021, df_february_crime_2021, df_march_crime_2021, df_april_crime_2021, df_may_crime_2021])
df_allcrime_northumbria_20202021.sample(6)

In [None]:
df_allcrime_northumbria_20202021.to_csv('/content/drive/MyDrive/Colab Notebooks/crime data for july 2020 - may 2021.csv')

In [None]:
len(df_allcrime_northumbria_20202021)

All the months in the football season were imported, and `pd.concat` was then used to concatenate the monthly data frames into a single data frame. The new data frame was saved to Google Drive so that it was ready for further use in the analysis. Graphs were made to visualise the different types of crime and how they compare in the number of offences committed.  
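
The sketch below illustrates what concatenation does, using two tiny made-up frames in place of the monthly police files. The options `ignore_index=True` and `index=False` are suggestions rather than what the cells above use.

```python
import pandas as pd

# Two tiny made-up frames standing in for two monthly police files.
july = pd.DataFrame({"Month": ["2020-07", "2020-07"],
                     "Crime type": ["Burglary", "Violence and sexual offences"]})
august = pd.DataFrame({"Month": ["2020-08"],
                       "Crime type": ["Anti-social behaviour"]})

# ignore_index=True renumbers the rows 0..n-1 instead of repeating each
# file's own 0-based index in the combined frame.
combined = pd.concat([july, august], ignore_index=True)
print(combined)

# Writing with index=False avoids an extra 'Unnamed: 0' column appearing
# when the saved CSV is read back in with pd.read_csv.
csv_text = combined.to_csv(index=False)
print(csv_text)
```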

In [None]:
sns.catplot(y="Crime type", kind="count", row="Month", height=5, aspect=4,
            palette="flare", edgecolor=".6",
            data=df_allcrime_northumbria_20202021, order = df_allcrime_northumbria_20202021['Crime type'].value_counts().index);

The charts above show the data for all the months examined in the 2020/2021 season, broken down by month and by the crime types reported over that period.

The months examined for this season were July 2020 to May 2021. These months of data were saved and a new data frame was made. At this point it was decided that the area to be analysed was Newcastle upon Tyne.

In [None]:
df=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/crime data for july 2020 - may 2021.csv')

In [None]:
df.sample(6)

Now that all the data for the Northumbria area has been imported, a new data frame was made that shows only the crimes in the Newcastle area. This is where Newcastle United play their home games, so it is the area that was analysed.  

In [None]:
df_Newcastle=df.loc[df['LSOA name'].str.contains('Newcastle upon Tyne 006C', na=False)]

In [None]:
df_Newcastle.shape

In [None]:
df_Newcastle.head(4)

In [None]:
df_Newcastle.tail(4)

In [None]:
df_gp_each_month = df_Newcastle.groupby(by='Month')['Crime type'].count()
df_gp_each_month

In [None]:
df_month_count = df_gp_each_month.to_frame(name='Count').reset_index()
df_month_count

In [None]:
g = sns.catplot(data=df_month_count, x="Month", y="Count", kind="bar", palette="copper", height=4, aspect=2)
g.set_xticklabels(rotation=70, horizontalalignment='right')

A new data frame was made to show violent offences only. Pandas was asked to find the rows whose crime type column contains the string "Violence and sexual offences" and put them into a data frame called df_violent_crime_Newcastle.
This was the data frame used for the analysis of the link between violent offences and home football games.
Heat maps were made using the coordinates of the offences to show visually where violent offences were recorded. The heat map uses three colours: red marks areas with a high number of offences, yellow marks areas where crime is more moderate, and green marks areas where violent offences are less frequent.  
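
Note that the filter in the next cell is applied to `df`, which holds the whole Northumbria file. If the intention is to keep only violent offences that also fall inside the chosen LSOA, the two conditions can be combined. The following is a sketch of that idea on made-up placeholder rows; only the column names are taken from the police files.

```python
import pandas as pd

# Made-up placeholder rows; only the column names follow the police CSV files.
df_demo = pd.DataFrame({
    "Month": ["2020-10", "2020-10", "2020-11", "2020-11"],
    "LSOA name": ["Newcastle upon Tyne 006C", "Sunderland 001A",
                  "Newcastle upon Tyne 006C", None],
    "Crime type": ["Violence and sexual offences", "Violence and sexual offences",
                   "Burglary", "Violence and sexual offences"],
})

# na=False treats rows with a missing LSOA name as non-matching.
in_area = df_demo["LSOA name"].str.contains("Newcastle upon Tyne 006C", na=False)
is_violent = df_demo["Crime type"] == "Violence and sexual offences"

# Both conditions must hold, so only violent offences inside the area remain.
violent_in_area = df_demo.loc[in_area & is_violent]

# Count the remaining offences per month.
monthly = violent_in_area.groupby("Month").size().rename("Count").reset_index()
print(monthly)
```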

In [None]:
df_violent_crime_Newcastle=df.loc[df['Crime type'].str.contains('Violence and sexual offences')]

In [None]:
df_violent_crime_Newcastle.head(5)

In [None]:
violence_coordinates=df_violent_crime_Newcastle[['Latitude','Longitude']].to_numpy()

In [None]:
crime_map = folium.Map(width=800, height=600, location=[54.903395, -1.676736], control_scale=True, zoom_start=12)
crime_map.add_child(MeasureControl())
crime_map

In [None]:
crime_map.add_child(plugins.MarkerCluster(violence_coordinates))
crime_map

In [None]:
crime_map.add_child(plugins.HeatMap(violence_coordinates))
crime_map

The heat map above shows the Newcastle area and the concentration of violent offences reported there. There are many red areas on the map, meaning that a large number of offences were recorded in the months analysed.
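
Some police records may have no recorded location, which leaves the `Latitude` and `Longitude` values empty, and map layers may raise an error when given missing coordinates. The sketch below shows one way to remove such rows before building the coordinate array. The coordinates are made-up placeholders, and whether the files used here contain empty locations is an assumption.

```python
import pandas as pd

# Made-up placeholder coordinates; one row has no recorded location.
points = pd.DataFrame({
    "Latitude": [54.975, None, 54.970],
    "Longitude": [-1.621, None, -1.610],
})

# Drop rows where either coordinate is missing before building map layers.
clean = points.dropna(subset=["Latitude", "Longitude"])
coords = clean[["Latitude", "Longitude"]].to_numpy()
print(coords.shape)

# The cleaned array could then be passed to the layers used in this notebook,
# for example plugins.HeatMap(coords) or plugins.MarkerCluster(coords).
```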

In [None]:
df_july_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-07/2021-07-northumbria-street.csv')

In [None]:
df_august_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-08/2021-08-northumbria-street.csv')

In [None]:
df_september_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-09/2021-09-northumbria-street.csv')

In [None]:
df_october_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-10/2021-10-northumbria-street.csv')

In [None]:
df_november_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-11/2021-11-northumbria-street.csv')

In [None]:
df_december_crime_2021=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2021-12/2021-12-northumbria-street.csv')

In [None]:
df_january_crime_2022=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2022-01/2022-01-northumbria-street.csv')

In [None]:
df_february_crime_2022=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2022-02/2022-02-northumbria-street.csv')

In [None]:
df_march_crime_2022=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2022-03/2022-03-northumbria-street.csv')

In [None]:
df_april_crime_2022=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2022-04/2022-04-northumbria-street.csv')

In [None]:
df_may_crime_2022=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Assessment/2022-05/2022-05-northumbria-street.csv')

The code above loads the crime data for the months of the 2021/2022 season.

In [None]:
df_allcrime_northumbria_20212022=pd.concat([df_july_crime_2021, df_august_crime_2021, df_september_crime_2021, df_october_crime_2021, df_november_crime_2021, df_december_crime_2021, df_january_crime_2022, df_february_crime_2022, df_march_crime_2022, df_april_crime_2022, df_may_crime_2022])
df_allcrime_northumbria_20212022.head()

In [None]:
sns.catplot(y="Crime type", kind="count", row="Month", height=5, aspect=4,
            palette="flare", edgecolor=".6",
            data=df_allcrime_northumbria_20212022, order = df_allcrime_northumbria_20212022['Crime type'].value_counts().index);

The charts above show the crimes reported in the months of the 2021/2022 season. They show that violent offences are the second most frequently reported crime type for the area in the months examined.

In [None]:
df_allcrime_northumbria_20212022.to_csv('/content/drive/MyDrive/Colab Notebooks/crime data for july 2021 - may 2022.csv')

In [None]:
df=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/crime data for july 2021 - may 2022.csv')

In [None]:
df_Newcastle1=df.loc[df['LSOA name'].str.contains('Newcastle upon Tyne 006C', na=False)]

In [None]:
df_Newcastle1.head()

In [None]:
df_violent_crime_Newcastle1=df.loc[df['Crime type'].str.contains('Violence and sexual offences')]

In [None]:
df_violent_crime_Newcastle1.head()

In [None]:
df_gp_each_month = df_Newcastle1.groupby(by='Month')['Crime type'].count()
df_gp_each_month

In [None]:
df_month_count = df_gp_each_month.to_frame(name='Count').reset_index()
df_month_count

The output above shows that the months with the highest number of violent offences reported in the 2021/2022 season are March, May and August. April had 24 reported; this is one of the months with a higher number of home games played in the season.


In [None]:
violence_coordinates1=df_violent_crime_Newcastle1[['Latitude','Longitude']].to_numpy()

In [None]:
crime_map = folium.Map(width=800, height=600, location=[54.903395, -1.676736], control_scale=True, zoom_start=12)
crime_map.add_child(MeasureControl())
crime_map

In [None]:
crime_map.add_child(plugins.MarkerCluster(violence_coordinates1))
crime_map

The heat map shows the number of violent offences reported in the area. At St James' Park, where Newcastle play, 9 offences were reported in the stadium itself, and the number increases the further you move away from the stadium. The numbers are high in and around housing estates, which suggests that people are being violent in their homes or other residences.
In and around the town centre the heat map shows a lot of red areas, indicating a high number of reported violent offences. There are a lot of pubs and clubs in the Newcastle town area where supporters usually go after a match. Supporters are likely to be drinking alcohol, which has a big impact on the behaviour of some people.

From the data analysed, it can be seen that there is a high number of violent offences when there are football games in the Newcastle area. Violent offences are the second highest crime type in the area over these months, second only to anti-social behaviour (ASB). ASB is also very high in the area, which could likewise be an effect of football games being held there. Fans can become very emotional about the sport, and if the team they have come to see does not win, this can lead some to become violent, not only towards other fans but also towards family members.
Across all the football games, violent offences were at their highest in October and April in the 2020/2021 season. During these months there were 6 home games played at the Newcastle ground, and the number of offences rose quite sharply, which could support the hypothesis that there are more violent offences in the area on days when there are home football games. In the 2021/2022 season the month with the highest number of violent offences was March, when there was only one home game. The question asked was whether violent offences are higher in months with home games. April had the most home games, and the number of violent offences was 24. This is a high number of offences and could help support the view that offences are higher when there are a lot of home games.
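
The comparison discussed above could be made more systematic by placing monthly home-game counts next to monthly offence counts. The following is only a sketch of that approach: the numbers are made-up placeholders and are NOT results from this report, and the column names `Home games` and `Violent offences` are hypothetical. Because the crime records are grouped by month in this notebook, the comparison is made per month rather than per match day, and a correlation on its own does not show that home games cause offences.

```python
import pandas as pd

# Made-up placeholder numbers for illustration only; they are NOT results
# from this report.
home_games = pd.DataFrame({"Month": ["2021-01", "2021-02", "2021-03", "2021-04"],
                           "Home games": [1, 2, 1, 3]})
offences = pd.DataFrame({"Month": ["2021-01", "2021-02", "2021-03", "2021-04"],
                         "Violent offences": [10, 14, 9, 20]})

# One row per month with both measures side by side.
merged = home_games.merge(offences, on="Month", how="inner")

# Pearson correlation between the two monthly series (ranges from -1 to 1).
r = merged["Home games"].corr(merged["Violent offences"])
print(merged)
print(round(r, 2))
```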

##Conclusion

Looking at the results from the maps and charts that have been made, we can see that there is a slight increase in violent offences in and around the Newcastle stadium when there has been a home game played. This is seen to happen during both of the seasons that have been looked at.

##References

Four men given banning orders after disorder flared before Newcastle United Home Game (no date) Home : Northumbria Police. Available at: https://beta.northumbria.police.uk/latest-news/2023/march/four-men-given-banning-orders-after-disorder-flared-before-newcastle-united-home-game/ (Accessed: May 4, 2023).

MacKenzie, N. (no date) Newcastle United Football Club - fixtures 2020-21. Available at: https://www.nufc.com/html/2020-21html/fixtures.html (Accessed: March 5, 2023).

Violent crime (no date) The Crown Prosecution Service. Available at: https://www.cps.gov.uk/crime-info/violent-crime#:~:text=Violent%20crime%20covers%20a%20variety,and%20corrosive%20substances%20like%20acid.&text=Murder%20and%20manslaughter%20are%20crimes,can%20be%20described%20as%20homicide. (Accessed: April 24, 2023).

(no date) Your Download. Available at: https://data.police.uk/data/fetch/652c3307-e9cc-43f4-beb4-fa23d55d10f2/ (Accessed: April 20, 2023).



