# 'It takes a walk to be stopped and searched'

#### A research project on the impact of the 'Stop and Search' Police policy in Liverpool

# Part I

## The story

The modern history of Liverpool is intertwined both with ethnic diversity (the city has been called the "Harlem of Europe") and with police brutality against minorities, as in the Toxteth riots of 1981 (Moody, 2020). Section 60 is a controversial power that allows UK police to stop and search a person without reasonable grounds. Currently, Black people are 9 times more likely to be stopped and searched than White people (Home Office, 2021).
The map shows the locations of the Stop and Search (S&S) activities carried out by the police in Liverpool in 2020, with a further focus on Black citizens, specifically those aged 18-24. The S&S dataset is retrieved through the UK Police API by selecting an arbitrary polygon around Liverpool city centre. The API is queried month by month and the results are combined into a single ‘2020’ GeoDataFrame.
From the CDRC dataset on Liverpool, based on the 2011 Census, two main variables are shown at the LSOA (Lower Layer Super Output Area) level: the Index of Multiple Deprivation (IMD) and the Black residents in each area.
Seven datasets are displayed across different zoom levels with two different basemaps. The general idea is to start from a wider view of the phenomenon (all the S&S activities in 2020) and to narrow the focus to a smaller subset of data and areas according to the zoom level. A standard OpenStreetMap basemap is chosen to bring out the city's topography.
All the locations of the S&S activities are plotted to show how they are distributed (1). The IMD score is mapped in 7 classes using a natural breaks scheme to highlight the higher values (2). Previous studies have shown a positive relationship between deprivation and the density of S&S activities, but this does not hold for Black people: they are affected regardless of how deprived an area is (Shiner et al., 2018). The S&S policy seems to disproportionately affect Black citizens in Liverpool. While they represent only 2.6% of Liverpool's total population, they account for 5.2% of the total S&S population (Nomis, 2011). Relative to their share of the population, this means that a Black resident is twice as likely to be stopped. These data are consistent with previous results across the years (Statewatch, 2002; Equality and Human Rights Commission, 2010).
The locations of this subset of S&S events are plotted (3) together with the distribution of Black residents (4). Since younger Black citizens (age 18-24) account for 40% of the Black S&S subset (93% of them male), a kernel density estimate (KDE) of the S&S locations is produced to capture the probability of being stopped by a police officer (5). The KDE, together with the residential data, may show that the majority of police activity on this group happens outside their ethnic communities, potentially creating a deterrent to visiting more affluent and whiter zones (Shiner et al., 2018). Using the Mapbox Isochrone API, the areas within a 5- and 10-minute walk of Liverpool Central (the highest KDE value) are plotted to show that a simple walk in the city centre might end in an S&S for a young Black male (6). Lastly, the specific locations of the S&S checks on young Black people are shown (7). Here the map switches to the Mapbox Satellite basemap to display POIs (Points of Interest) and the actual urban landscape. The idea is to finish with a realistic ‘street view’ and to suggest further research paths (e.g. S&S carried out near schools or stations).
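
The KDE step (5) can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a fixed bandwidth expressed in degrees and treats longitude and latitude as planar coordinates, which is only a rough approximation. The function name `kde_at`, the bandwidth and the sample coordinates are illustrative assumptions, not the settings used to produce the map.

```python
import math

def kde_at(x, y, points, bandwidth):
    """Gaussian kernel density estimate at (x, y) from a list of (px, py) points."""
    if not points:
        return 0.0
    # Normalisation constant of a 2D Gaussian kernel, averaged over the points
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * total

# Hypothetical (longitude, latitude) pairs: three clustered stops and one isolated stop
sample_points = [
    (-2.9830, 53.4049),
    (-2.9825, 53.4051),
    (-2.9835, 53.4046),
    (-2.9300, 53.4300),
]

dense = kde_at(-2.9830, 53.4049, sample_points, bandwidth=0.002)
sparse = kde_at(-2.9300, 53.4300, sample_points, bandwidth=0.002)
print(dense > sparse)  # the clustered location has the higher estimated density
```

Evaluating such a function on a regular grid of locations would give a density surface, and its maximum corresponds to what the text calls the highest KDE value.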

## Conceptual background

1) An API, or application programming interface, is a system through which a company or institution makes its data and/or functionality accessible to an external person, company or institution. An API works as a machine-to-machine interface that offers a systematic, programmatic way to access that information, usually following a RESTful design in which resources are exposed at endpoints. It works as a bridge between the client and the server. An API can accept both POST and GET requests: as a programmer, you can send information through the API (e.g. programmatically post a Tweet, as a bot does) or retrieve information (e.g. get all Tweets with the ‘#webmapping’ hashtag).
In this project, the Police API provides the Stop and Search data within a polygon around the Liverpool city area, and the Mapbox Isochrone API calculates the areas within walking distance of a point.
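
As a rough illustration of what accessing an endpoint means, the sketch below assembles the URLs of the two GET requests used in this project from a base address and a set of query parameters, without sending them. The helper name `build_get_url`, the shortened three-vertex polygon and the placeholder token are assumptions made for the example.

```python
from urllib.parse import urlencode

def build_get_url(base_url, params):
    """Return the full URL of a GET request: endpoint plus encoded query string."""
    # safe=':,' keeps the separators used by both APIs readable in the URL
    return f"{base_url}?{urlencode(params, safe=':,')}"

# Police API: stop and searches inside a polygon for one month
police_url = build_get_url(
    "https://data.police.uk/api/stops-street",
    {"poly": "53.44,-3.01:53.36,-2.96:53.44,-2.97", "date": "2020-01"},
)

# Mapbox Isochrone API: areas reachable on foot in 5 and 10 minutes
isochrone_url = build_get_url(
    "https://api.mapbox.com/isochrone/v1/mapbox/walking/-2.982918,53.404856",
    {"contours_minutes": "5,10", "polygons": "true", "access_token": "mapbox_public_token"},
)

print(police_url)
print(isochrone_url)
```

The part before the `?` identifies the endpoint, while the query string carries the parameters that narrow down what the server returns.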

2) As in a mosaic, a large web map appears to be a single image but is in reality composed of multiple tiles. Each tile has a fixed dimension (the Google standard is 256x256 pixels). This is especially important in web mapping, since retrieving and publishing a single map image covering all the required zoom levels would take a very large amount of data. A related problem, and an advantage of tile-based maps, concerns how other data associated with a map (e.g. car traffic) are retrieved and stored. With tiles, only the tiles at a given zoom level store certain data, so the map does not load all the data regardless of the zoom level (e.g. all car traffic in the world vs only car traffic along a route). This makes the map more dynamic, since the content associated with a tileset changes according to the zoom level (e.g. COVID-19 infections per country versus COVID-19 infections by UK district). However, this can also be a drawback, since the data are not accessible as a whole.
Overall, tilesets make web mapping more scalable, efficient and accessible across devices. They are designed for efficient visualisation rather than for analysis or editing (tiles are hard to edit once created).
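
To make the mosaic idea concrete, the sketch below computes which tile contains a given point at several zoom levels. It assumes the common XYZ ("slippy map") tiling scheme in Web Mercator, where zoom level z divides the world into 2^z by 2^z tiles; the function name `tile_for` is hypothetical.

```python
import math

def tile_for(lat, lon, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range
    return min(x, n - 1), min(y, n - 1)

# Liverpool Central, the point used later for the isochrones
for zoom in (0, 6, 12, 18):
    x, y = tile_for(53.404856, -2.982918, zoom)
    print(f"zoom {zoom}: tile ({x}, {y}) out of {4 ** zoom} tiles")
```

The number of tiles quadruples with every zoom level, which is why a client only requests the handful of tiles covering the current view instead of a single image of the whole map.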

In [2]:
# Manipulate spatial data
import geopandas as gpd
# Manipulate json
import json
# Manipulate dataframe
import pandas as pd
# Handle API requests
import requests
# Add basemap to the maps
import contextily as cx
# Plot figures
import matplotlib.pyplot as plt

# Part IIA - Extract data from the UK Police API

In [3]:
# poly format [lat],[lng]:[lat],[lng]:[lat],[lng]
# define the geographical boundaries of the selected area
poly = '53.44389944710078, -3.01025390625:' + '53.40461992848442,-2.9999542236328125:' + '53.366942995161345, -2.967681884765625:' + '53.37800381298034, -2.912750244140625:' + '53.4230367215282,-2.9196166992187496:' + '53.44676215918743,-2.973175048828125:' + '53.44489944710078,-3.012025390625'
# Police API endpoint
url = f'https://data.police.uk/api/stops-street?poly={poly}&date=2020-01'
# Send GET request to the url
r = requests.get(url)
# Handle json data from the GET response
ss = json.loads(r.content)
# Show the first element in the Stop & Search content
ss[0]

{'age_range': 'over 34',
 'outcome': 'Arrest',
 'involved_person': True,
 'self_defined_ethnicity': 'White - English/Welsh/Scottish/Northern Irish/British',
 'gender': 'Male',
 'legislation': 'Misuse of Drugs Act 1971 (section 23)',
 'outcome_linked_to_object_of_search': True,
 'datetime': '2020-01-06T10:11:23+00:00',
 'removal_of_more_than_outer_clothing': False,
 'outcome_object': {'id': 'bu-arrest', 'name': 'Arrest'},
 'location': {'latitude': '53.420561',
  'street': {'id': 914481, 'name': 'On or near Tamar Close'},
  'longitude': '-2.961643'},
 'operation': None,
 'officer_defined_ethnicity': 'White',
 'type': 'Person search',
 'operation_name': None,
 'object_of_search': 'Controlled drugs'}

In [4]:
# Build a function to systematically parse a single S&S event
def parse_ss_event(cr):
    # Create a pandas Series with the selected keys and content
    cr_parsed = pd.Series({
        'age_range': cr['age_range'],
        'outcome': cr['outcome'],
        'involved_person': cr['involved_person'],
        'gender': cr['gender'],
        'self_defined_ethnicity': cr['self_defined_ethnicity'],
        'street_id': cr['location']['street']['id'],
        'longitude': float(cr['location']['longitude']),
        'latitude': float(cr['location']['latitude']),
        'date': cr['datetime'],
        # Ethnicity of the person stopped, as recorded by the officer
        'officer_ethnicity': cr['officer_defined_ethnicity']
    })
    return cr_parsed

In [5]:
# Create a function to parse a list of S&S events into a DataFrame
def parsing(stop_and_search):
    # Start an empty list to store the parsed events
    parsed = []

    # Loop over each S&S event in the list
    for cr in stop_and_search:
        # Parse a single event
        pc = parse_ss_event(cr)
        # Store the parsed event in the list
        parsed.append(pc)

    # Convert the list into a DataFrame
    parsed = pd.DataFrame(parsed)
    return parsed

In [6]:
# Create a function to get the S&S data for a given month of a given year
def get_month_crime(month, year):
    # Police API endpoint
    url = f'https://data.police.uk/api/stops-street?poly={poly}&date={year}-{month}'
    # Send GET request
    r = requests.get(url)
    # Stop early if the API returned an error status
    r.raise_for_status()
    # Load the monthly S&S events from the JSON response
    mcrimes = json.loads(r.content)
    # Parse the events into a DataFrame
    month_db = parsing(mcrimes)
    # Return the monthly DataFrame
    return month_db

In [7]:
# Create a DataFrame concatenating the results of each month of 2020
yearly_crimes_20 = pd.concat(
    [get_month_crime(f'{month:02d}', '2020') for month in range(1, 13)]
)

In [None]:
# Create point geometries from the Stop and Search event coordinates
points = gpd.points_from_xy(yearly_crimes_20["longitude"],
                            yearly_crimes_20["latitude"])
# Create a GeoDataFrame with the point geometries and the S&S 2020 dataset
crimes_geodf_20 = gpd.GeoDataFrame(yearly_crimes_20,
                                   geometry=points,
                                   # The API returns longitude/latitude in degrees, so the CRS is WGS84 (EPSG:4326);
                                   # use .to_crs("EPSG:27700") afterwards if British National Grid is needed
                                   crs="EPSG:4326")

In [None]:
# Show S&S 2020 geodataframe
crimes_geodf_20

In [None]:
# Create a subset of the total S&S dataset about the citizens stopped with a self defined Black identity
black = crimes_geodf_20[crimes_geodf_20['self_defined_ethnicity'].str.contains('Black', na=False)]

In [None]:
# Create a subset of the total S&S dataset about the citizens stopped with a self defined White identity
white = crimes_geodf_20[crimes_geodf_20['self_defined_ethnicity'].str.contains('White', na=False)]
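
One caveat about substring filters like the two above: a label that mentions both words falls into both subsets. The sketch below uses plain Python strings to show the effect, along with a stricter prefix match as one possible alternative. Apart from the 'White - English/...' label seen in the API response earlier, the labels are assumed examples of census-style categories, and `starts_with_group` is a hypothetical helper.

```python
labels = [
    "White - English/Welsh/Scottish/Northern Irish/British",
    "Black/African/Caribbean/Black British - African",  # assumed label
    "Mixed/Multiple ethnic groups - White and Black Caribbean",  # assumed label
    None,  # ethnicity not recorded
]

def contains(label, word):
    """Mimic str.contains(word, na=False): missing values never match."""
    return label is not None and word in label

def starts_with_group(label, group):
    """Stricter rule: the label must begin with the group name."""
    return label is not None and label.startswith(group)

# Labels that the substring filter would count in both subsets
both = [l for l in labels if contains(l, "Black") and contains(l, "White")]
print(both)

strict_black = [l for l in labels if starts_with_group(l, "Black")]
strict_white = [l for l in labels if starts_with_group(l, "White")]
print(len(strict_black), len(strict_white))
```

In pandas, the stricter rule would correspond to `.str.startswith('Black', na=False)`. Whether mixed categories should be counted in either subset is an analytical choice that this sketch does not settle.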

In [None]:
# Proportion of Black people stopped and searched out of the total S&S in 2020
bk_percentage = (len(black)/ len(crimes_geodf_20))
bk_percentage

In [None]:
# Proportion of White people stopped and searched out of the total S&S in 2020
wh_percentage = (len(white)/ len(crimes_geodf_20))
wh_percentage

In [None]:
# Over-representation of Black people in S&S relative to their share of the resident population in Liverpool (2.6% are Black)
bk_percentage / 0.026

In [39]:
# Under-representation of White people in S&S relative to their share of the resident population (88.9% are White)
wh_percentage / 0.889

0.8545256004650322

In [40]:
# Average number of Black people stopped and searched per day (2020 was a leap year, so 366 days)
round(len(black)/366, 2)

3.02

In [41]:
# Total Black citizens 18-24 stopped for a Stop and Search in 2020
black_youth = black[black['age_range'].str.contains('18-24', na=False)]

In [42]:
# Percentage of Black youth (18-24) out of the Black S&S subset
(len(black_youth)/len(black))*100

39.547511312217196

In [43]:
# Total males out of the Black Youth subset
bk_youth_sex = black_youth[black_youth['gender'].str.contains('Male', na=False)]

In [44]:
# Proportion of males in the Black youth subset
len(bk_youth_sex)/len(black_youth)

0.9336384439359268

In [45]:
# Proportion of Black citizens aged 18-24 stopped and searched relative to the total Black 18-24 population (6592, Nomis 2011)
bk_youth_percentage = (len(black_youth)/6592)
bk_youth_percentage

0.06629247572815535

In [46]:
# Total White citizens aged 18-24 stopped for a Stop and Search in 2020
white_youth = white[white['age_range'].str.contains('18-24', na=False)]

In [47]:
# Percentage of White citizens aged 18-24 stopped and searched relative to the total White 18-24 population (202544)
wh_youth_percentage = (len(white_youth)/202544)*100
wh_youth_percentage

2.6685559680859465

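In [48]:
# Ratio of the Black youth stop rate to the White youth stop rate
# bk_youth_percentage is a proportion while wh_youth_percentage is a percentage, so rescale before dividing
round((bk_youth_percentage * 100) / wh_youth_percentage, 2)

2.48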

In [49]:
# Define dataframe with all the citizens arrested after a S&S
arrested = crimes_geodf_20[crimes_geodf_20['outcome'].str.contains('Arrest', na=False)]
# Return total citizens arrested count
len(arrested)

1515

In [50]:
# Define dataframe with all the Black citizens arrested after a S&S
bk_arrested = crimes_geodf_20[crimes_geodf_20['self_defined_ethnicity'].str.contains('Black', na=False) & crimes_geodf_20['outcome'].str.contains('Arrest', na=False)]
# Return total Black citizens arrested count
len(bk_arrested)

89

In [51]:
# Define dataframe with all the White citizens arrested after a S&S
wh_arrested = crimes_geodf_20[crimes_geodf_20['self_defined_ethnicity'].str.contains('White', na=False) & crimes_geodf_20['outcome'].str.contains('Arrest', na=False)]
# Return total White citizens arrested count
len(wh_arrested)

1177

In [52]:
# Percentage of Stop and Search that led to an arrest in 2020
(len(bk_arrested)/len(black))*100 # Black

8.054298642533936

In [53]:
# Percentage of Stop and Search that led to an arrest in 2020
(len(wh_arrested)/len(white))*100 # White

7.40111928566937

In [54]:
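# Percentage of S&S on self-defined Black citizens in 2020 where the officer also recorded the person as Black
# Note: 'officer_ethnicity' holds the officer-defined ethnicity of the person stopped, not the ethnicity of the officer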
b_2_b = crimes_geodf_20[crimes_geodf_20['officer_ethnicity'].str.contains('Black', na=False) & crimes_geodf_20['self_defined_ethnicity'].str.contains('Black', na=False)]

# Calculate percentage
(len(b_2_b)/len(black))*100

91.49321266968326

In [55]:
# Export as GeoPackage the total S&S Police activities in 2020
crimes_geodf_20.to_file("./outputs/crimes_20.gpkg", driver="GPKG")

In [56]:
# Export as GeoPackage the total S&S Police activities in 2020 involving self-declared Black citizens
black.to_file("./outputs/black_20.gpkg", driver="GPKG")
# Export as GeoPackage the total S&S Police activities in 2020 involving self-declared White citizens
white.to_file("./outputs/white_20.gpkg", driver="GPKG")

In [57]:
# Export as GeoPackage the total S&S Police activities in 2020 involving Black young citizens (age 18-24)
black_youth.to_file("./outputs/black_youth_20.gpkg", driver="GPKG")

In [58]:
# Export as GeoPackage the total S&S Police activities in 2020 involving White young citizens (age 18-24)
white_youth.to_file("./outputs/white_youth_20.gpkg", driver="GPKG")

# Part IIB - Extract walking distance around the highest value of the KDE

In [59]:
# Get Longitude and Latitude from Liverpool Central (highest KDE value area)
latitude = '53.404856'
longitude = '-2.982918'

In [1]:
# Define public Mapbox token
TOKEN = 'mapbox_public_token'

In [67]:
# Get endpoint for the area reachable in 5 and 10 minutes walking distance
url_iso = 'https://api.mapbox.com/isochrone/v1/mapbox/walking/' + longitude + ',' + latitude + '?contours_minutes=5,10&contours_colors=6706ce,04e813&polygons=true&access_token=' + TOKEN

In [68]:
# Retrieve content
r = requests.get(url_iso)

In [69]:
# Load content
walk = json.loads(r.content)
walk

{'features': [{'properties': {'fill': '#04e813',
    'fillOpacity': 0.33,
    'fill-opacity': 0.33,
    'fillColor': '#04e813',
    'color': '#04e813',
    'contour': 10,
    'opacity': 0.33,
    'metric': 'time'},
   'geometry': {'coordinates': [[[-2.978886, 53.409888],
      [-2.975918, 53.408865],
      [-2.974918, 53.409182],
      [-2.973336, 53.408856],
      [-2.972172, 53.407856],
      [-2.971969, 53.406856],
      [-2.97068, 53.405856],
      [-2.97165, 53.404856],
      [-2.971143, 53.403856],
      [-2.972649, 53.402856],
      [-2.972188, 53.401856],
      [-2.973918, 53.400065],
      [-2.975918, 53.399147],
      [-2.977918, 53.399598],
      [-2.983918, 53.39852],
      [-2.986918, 53.398461],
      [-2.989608, 53.399856],
      [-2.98975, 53.401024],
      [-2.990918, 53.400963],
      [-2.992679, 53.401856],
      [-2.992338, 53.402856],
      [-2.993605, 53.404856],
      [-2.99281, 53.406856],
      [-2.990744, 53.408856],
      [-2.988103, 53.410041],
      [-2.984

In [71]:
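# Split polygons by walking time using each feature's 'contour' property
# (the first feature returned above is the 10-minute contour, not the 5-minute one)
contours = {f['properties']['contour']: f for f in walk['features']}
walk_5 = contours[5]
walk_10 = contours[10]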

In [None]:
# Create a GeoDataFrame for each isochrone
walk_5 = gpd.GeoDataFrame.from_features([walk_5], crs="EPSG:4326")
walk_10 = gpd.GeoDataFrame.from_features([walk_10], crs="EPSG:4326")

In [80]:
# Export as GeoPackage the 5 and 10 minutes isochrone areas
walk_5.to_file("./outputs/walk_5.gpkg", driver="GPKG")
walk_10.to_file("./outputs/walk_10.gpkg", driver="GPKG")
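
To move from plotting the isochrones to counting how many stops fall inside them, a point-in-polygon test is needed. In practice a spatial join in GeoPandas would be the natural tool. The sketch below is only a dependency-free illustration of the ray casting rule, with a made-up square ring and made-up points standing in for an isochrone and the S&S locations.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting test: is (lon, lat) inside the closed ring of (lon, lat) pairs?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical square ring, closed like a GeoJSON ring (first point repeated last)
ring = [(-2.99, 53.40), (-2.97, 53.40), (-2.97, 53.41), (-2.99, 53.41), (-2.99, 53.40)]
# Hypothetical stop locations as (longitude, latitude)
stops = [(-2.98, 53.405), (-2.96, 53.405), (-2.98, 53.42)]

count_inside = sum(point_in_ring(lon, lat, ring) for lon, lat in stops)
print(count_inside)
```

With the Mapbox response loaded earlier, the outer ring of a contour would be `feature['geometry']['coordinates'][0]`, whose `[longitude, latitude]` pairs can be passed to the function as they are.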

# Literature

+ Equality and Human Rights Commission (2010) 'Stop and think. A critical review of the use of stop and search powers in England and Wales' - https://www.equalityhumanrights.com/sites/default/files/ehrc_stop_and_search_report.pdf
+ Home Office (2021) 'Stop and Search' data - https://www.ethnicity-facts-figures.service.gov.uk/crime-justice-and-the-law/policing/stop-and-search/latest
+ Moody, J. (2020) Black Liverpool: Living with the Legacy of the Past. In The persistence of memory: Remembering slavery in Liverpool, 'slaving capital of the world' (pp. 65-100). Liverpool: Liverpool University Press. doi:10.2307/j.ctv1675bp5.9
+ Nomis (2011) ONS Census - Ethnicities in Liverpool - https://www.nomisweb.co.uk/reports/localarea?compare=E08000012#section_6_4
+ Shiner, M., Carre, Z., Delsol, R. and Eastwood, N. (2018) The colour of injustice: 'race', drugs and law enforcement in England and Wales. StopWatch, London, United Kingdom. ISBN 9781999316303
+ Statewatch (2002) 'UK: Ethnic injustice: More black and Asian people are being stopped and searched than ever before' - https://www.statewatch.org/media/documents/news/2004/aug/stop-and-search.pdf