<a href="https://colab.research.google.com/github/jam244-web/api/blob/main/takehome.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The latest data is available through the public Police Data API:
https://data.police.uk/docs/method/stops-force/
Utilise the API documentation to fetch all historical search data relating to a force. This API
does not require authentication.
The example URL below fetches only one month of data:
https://data.police.uk/api/stops-force?force=avon-and-somerset&date=2023-01
Note: historical data only goes back to a certain time.
Your task is to complete the following steps:
• Write a script in Python to programmatically pull down all stop and search data from the
API for the Metropolitan Police Service (i.e. just one force).
 • Combine all data into a Pandas dataframe.
 • Clean and format the data as you see appropriate.
 • Write the data out to a csv file.
 • Write a process that pulls the latest data at a given time each day and updates the existing
csv file.
 • Prepare a short explanation of how you would store the data in a relational database if
required – think about the structure of the data and what schema you would apply.
 • Suppose entries were updated or deleted in the API – how would your system handle
this? What changes to their API would you recommend to help make the update process
easier?
 • Prepare an example of a SQL statement that creates a view to return an aggregate count
of crime type by date.
 • Use docker to containerise your script to be deployed in any cloud service as a data
fetching service.
 Other things to consider:
 • If there are other technologies that you want to include in your solution, do feel free –
don’t feel

In [None]:
import requests
import pandas as pd
import datetime
import os

FORCE_ID = 'metropolitan'
BASE_URL = 'https://data.police.uk/api/stops-force'

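def fetch_data(force_id, date):
  """Fetch one month (YYYY-MM) of stop and search records for a force."""
  # Let requests build the query string; the timeout stops a stalled call hanging forever
  response = requests.get(BASE_URL, params={'force': force_id, 'date': date}, timeout=60)
  if response.status_code == 200:
    return response.json()
  else:
    # Any non-200 response is treated as "no data for this month"
    return []


def fetch_historical_data(force_id):
  """Fetch every month from the start date up to the current month."""
  data = []
  current_date = datetime.date.today()

  # Hard-coded start month; move it earlier to pull more history
  start_date = datetime.date(2024, 1, 1)
  while start_date < current_date:
    formatted_date = start_date.strftime('%Y-%m')
    print(f'Fetching data for {formatted_date}')
    monthly_data = fetch_data(force_id, formatted_date)
    data.extend(monthly_data)
    # Step to the first day of the next month, rolling over the year in December
    if start_date.month == 12:
      start_date = datetime.date(start_date.year + 1, 1, 1)
    else:
      start_date = datetime.date(start_date.year, start_date.month + 1, 1)

  return data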

met_data = fetch_historical_data(FORCE_ID)

df = pd.DataFrame(met_data)
csv_file= 'met_data.csv'
df.to_csv(csv_file, index=False)
print(f'Data saved to {csv_file}')


Fetching data for 2024-01
Fetching data for 2024-02
Fetching data for 2024-03
Fetching data for 2024-04
Fetching data for 2024-05
Fetching data for 2024-06
Fetching data for 2024-07
Fetching data for 2024-08
Fetching data for 2024-09
Data saved to met_data.csv
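
The loop above starts from a hard-coded January 2024, while the task notes that history only goes back to a certain time. One way to avoid guessing the start month is to ask the API which months it holds for a force. The sketch below is a minimal illustration, assuming the `crimes-street-dates` endpoint returns a list of objects with a `date` string and a `stop-and-search` list of force IDs. The helper name `months_for_force` and the sample payload are hypothetical and not part of the notebook.

```python
def months_for_force(availability, force_id):
  """Return the sorted months (YYYY-MM) in which force_id has stop and search data."""
  return sorted(
    entry['date']
    for entry in availability
    if force_id in entry.get('stop-and-search', [])
  )

# Hand-written payload in the assumed shape. In the notebook it could come from
# requests.get('https://data.police.uk/api/crimes-street-dates').json() (assumed endpoint).
sample = [
  {'date': '2024-02', 'stop-and-search': ['metropolitan', 'avon-and-somerset']},
  {'date': '2024-01', 'stop-and-search': ['metropolitan']},
  {'date': '2023-12', 'stop-and-search': ['avon-and-somerset']},
]

print(months_for_force(sample, 'metropolitan'))  # ['2024-01', '2024-02']
```

Each returned month could then be passed to `fetch_data` in place of the generated dates, so months the API does not list are never requested.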


In [None]:
display(df)

Unnamed: 0,age_range,outcome,involved_person,self_defined_ethnicity,gender,legislation,outcome_linked_to_object_of_search,datetime,removal_of_more_than_outer_clothing,outcome_object,location,operation,officer_defined_ethnicity,type,operation_name,object_of_search
0,,A no further action disposal,False,,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-11T16:44:00+00:00,,"{'id': 'bu-no-further-action', 'name': 'A no f...",,False,,Vehicle search,,Offensive weapons
1,over 34,A no further action disposal,True,Other ethnic group - Not stated,Male,Misuse of Drugs Act 1971 (section 23),,2024-01-05T04:05:00+00:00,,"{'id': 'bu-no-further-action', 'name': 'A no f...",,False,Black,Person search,,Controlled drugs
2,over 34,Arrest,True,Asian/Asian British - Pakistani,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-25T08:02:00+00:00,,"{'id': 'bu-arrest', 'name': 'Arrest'}","{'latitude': '51.469668', 'street': {'id': 165...",False,Asian,Person search,,Stolen goods
3,10-17,Community resolution,True,White - English/Welsh/Scottish/Northern Irish/...,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-06T17:50:00+00:00,,"{'id': 'bu-community-resolution', 'name': 'Com...",,False,White,Person search,,Stolen goods
4,18-24,Community resolution,True,Black/African/Caribbean/Black British - African,Male,Misuse of Drugs Act 1971 (section 23),,2024-01-23T19:00:00+00:00,,"{'id': 'bu-community-resolution', 'name': 'Com...","{'latitude': '51.470361', 'street': {'id': 165...",False,Black,Person search,,Controlled drugs
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
65220,18-24,Community resolution,True,Black/African/Caribbean/Black British - Caribbean,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-26T23:05:00+00:00,False,"{'id': 'bu-community-resolution', 'name': 'Com...","{'latitude': '51.541306', 'street': {'id': 167...",False,Black,Person search,,Controlled drugs
65221,10-17,A no further action disposal,True,Other ethnic group - Not stated,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-03T16:30:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.549928', 'street': {'id': 166...",False,Other,Person search,,Controlled drugs
65222,,A no further action disposal,True,Other ethnic group - Not stated,Other,Criminal Justice and Public Order Act 1994 (se...,False,2024-07-22T22:55:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.539856', 'street': {'id': 166...",False,Black,Person search,,Anything to threaten or harm anyone
65223,18-24,A no further action disposal,True,Black/African/Caribbean/Black British - African,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-21T12:36:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.499354', 'street': {'id': 165...",False,Black,Person search,,Controlled drugs
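
The `outcome_object` and `location` columns in the output above still hold nested dictionaries, which do not translate well to a CSV file. A possible way to expand them into flat columns is `pd.json_normalize`, applied to the raw records before building the DataFrame. This is a sketch only: the helper name `flatten_records` is hypothetical and the sample records are invented values in the shape the API appears to return.

```python
import pandas as pd

def flatten_records(records):
  """Expand nested dicts such as location and outcome_object into flat columns."""
  flat = pd.json_normalize(records, sep='_')
  # Records whose nested value was null leave behind an unexpanded parent column
  leftover = [c for c in ('location', 'outcome_object') if c in flat.columns]
  flat = flat.drop(columns=leftover)
  # Coordinates arrive as strings, so convert them for numeric use
  for col in ('location_latitude', 'location_longitude'):
    if col in flat.columns:
      flat[col] = pd.to_numeric(flat[col], errors='coerce')
  return flat

# Invented sample records; the first has no location, as in some rows above
sample = [
  {'type': 'Vehicle search', 'location': None,
   'outcome_object': {'id': 'bu-no-further-action', 'name': 'A no further action disposal'}},
  {'type': 'Person search',
   'location': {'latitude': '51.469668', 'longitude': '-0.1',
                'street': {'id': 1, 'name': 'On or near Example Road'}},
   'outcome_object': {'id': 'bu-arrest', 'name': 'Arrest'}},
]

flat = flatten_records(sample)
print(sorted(flat.columns))
```

With this, `met_data` could be passed to `flatten_records` instead of `pd.DataFrame`, giving columns such as `location_latitude` and `outcome_object_id`.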


In [None]:
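def clean_data(df):
  # Work on a copy so the caller's DataFrame is left untouched
  df = df.copy()
  df.columns = df.columns.str.lower().str.replace(' ', '_')
  # Parse timestamps; values that cannot be parsed become NaT
  df['datetime'] = pd.to_datetime(df['datetime'], errors='coerce', utc=True)
  # Drop rows without a valid timestamp before filling the remaining gaps
  df = df.dropna(subset=['datetime'])
  # Fill missing values everywhere except the datetime column
  df = df.fillna({col: '' for col in df.columns if col != 'datetime'})
  return df

csv_file = 'met_clean.csv'
df_clean = clean_data(df)
df_clean.to_csv(csv_file, index=False)
display(df_clean)
df_clean.dtypes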


Unnamed: 0,age_range,outcome,involved_person,self_defined_ethnicity,gender,legislation,outcome_linked_to_object_of_search,datetime,removal_of_more_than_outer_clothing,outcome_object,location,operation,officer_defined_ethnicity,type,operation_name,object_of_search
0,,A no further action disposal,False,,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-11 16:44:00+00:00,,"{'id': 'bu-no-further-action', 'name': 'A no f...",,False,,Vehicle search,,Offensive weapons
1,over 34,A no further action disposal,True,Other ethnic group - Not stated,Male,Misuse of Drugs Act 1971 (section 23),,2024-01-05 04:05:00+00:00,,"{'id': 'bu-no-further-action', 'name': 'A no f...",,False,Black,Person search,,Controlled drugs
2,over 34,Arrest,True,Asian/Asian British - Pakistani,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-25 08:02:00+00:00,,"{'id': 'bu-arrest', 'name': 'Arrest'}","{'latitude': '51.469668', 'street': {'id': 165...",False,Asian,Person search,,Stolen goods
3,10-17,Community resolution,True,White - English/Welsh/Scottish/Northern Irish/...,Male,Police and Criminal Evidence Act 1984 (section 1),,2024-01-06 17:50:00+00:00,,"{'id': 'bu-community-resolution', 'name': 'Com...",,False,White,Person search,,Stolen goods
4,18-24,Community resolution,True,Black/African/Caribbean/Black British - African,Male,Misuse of Drugs Act 1971 (section 23),,2024-01-23 19:00:00+00:00,,"{'id': 'bu-community-resolution', 'name': 'Com...","{'latitude': '51.470361', 'street': {'id': 165...",False,Black,Person search,,Controlled drugs
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
65220,18-24,Community resolution,True,Black/African/Caribbean/Black British - Caribbean,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-26 23:05:00+00:00,False,"{'id': 'bu-community-resolution', 'name': 'Com...","{'latitude': '51.541306', 'street': {'id': 167...",False,Black,Person search,,Controlled drugs
65221,10-17,A no further action disposal,True,Other ethnic group - Not stated,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-03 16:30:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.549928', 'street': {'id': 166...",False,Other,Person search,,Controlled drugs
65222,,A no further action disposal,True,Other ethnic group - Not stated,Other,Criminal Justice and Public Order Act 1994 (se...,False,2024-07-22 22:55:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.539856', 'street': {'id': 166...",False,Black,Person search,,Anything to threaten or harm anyone
65223,18-24,A no further action disposal,True,Black/African/Caribbean/Black British - African,Male,Misuse of Drugs Act 1971 (section 23),False,2024-07-21 12:36:00+00:00,False,"{'id': 'bu-no-further-action', 'name': 'A no f...","{'latitude': '51.499354', 'street': {'id': 165...",False,Black,Person search,,Controlled drugs


Unnamed: 0,0
age_range,object
outcome,object
involved_person,bool
self_defined_ethnicity,object
gender,object
legislation,object
outcome_linked_to_object_of_search,object
datetime,"datetime64[ns, UTC]"
removal_of_more_than_outer_clothing,object
outcome_object,object
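
The task also asks how the data could be stored in a relational database and for a view returning an aggregate count of crime type by date. The sketch below uses an in-memory SQLite database for illustration. The table name `stop_search`, the view name `daily_search_counts`, and the choice of `object_of_search` as the closest available stand-in for "crime type" are all assumptions, and the rows are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE stop_search (
  id INTEGER PRIMARY KEY,   -- surrogate key; the records shown carry no identifier
  datetime TEXT NOT NULL,   -- timestamp stored as text
  type TEXT,
  object_of_search TEXT,
  outcome TEXT
);

-- Count of searches per calendar day and object of search
CREATE VIEW daily_search_counts AS
SELECT date(datetime) AS search_date,
       object_of_search,
       COUNT(*) AS search_count
FROM stop_search
GROUP BY date(datetime), object_of_search;
""")

# Invented sample rows for the demonstration
rows = [
  ('2024-01-05 04:05:00', 'Person search', 'Controlled drugs', 'A no further action disposal'),
  ('2024-01-05 19:00:00', 'Person search', 'Controlled drugs', 'Community resolution'),
  ('2024-01-05 08:02:00', 'Person search', 'Stolen goods', 'Arrest'),
  ('2024-01-06 17:50:00', 'Person search', 'Stolen goods', 'Community resolution'),
]
conn.executemany(
  'INSERT INTO stop_search (datetime, type, object_of_search, outcome) VALUES (?, ?, ?, ?)',
  rows,
)

result = conn.execute(
  'SELECT * FROM daily_search_counts ORDER BY search_date, object_of_search'
).fetchall()
print(result)
# [('2024-01-05', 'Controlled drugs', 2), ('2024-01-05', 'Stolen goods', 1), ('2024-01-06', 'Stolen goods', 1)]
```

A fuller schema might move repeated text values such as legislation, outcome and ethnicity into lookup tables referenced by foreign keys, with location held in its own table. The single table here is only meant to show the view.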


In [None]:
!pip install schedule

import schedule
import time

def daily_update():
  # Request the current month; the API may not have published it yet
  current_date = datetime.date.today()
  formatted_date = current_date.strftime('%Y-%m')

  new_data = fetch_data(FORCE_ID, formatted_date)

  if new_data:
    new_df = pd.DataFrame(new_data)
    new_df_clean = clean_data(new_df)
    # Append without a header row because the file already has one
    new_df_clean.to_csv(csv_file, mode='a', header=False, index=False)
    print(f'Appended new data for {formatted_date}')
  else:
    print('No new data found')

# Run the update once a day at midnight
schedule.every().day.at('00:00').do(daily_update)

# Keep the process alive and check once a minute for due jobs
while True:
  schedule.run_pending()
  time.sleep(60)
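
Appending the current month's response every day writes the same rows repeatedly, and it cannot reflect entries that were updated or deleted in the API. Since the records shown earlier carry no unique identifier, one possible approach is to treat each month as a snapshot: remove that month's rows from the stored data and replace them with the latest response. The helper `replace_month` below is hypothetical, the sample frames are invented, and months are assigned by UTC timestamp, which is an assumption about how the API groups records.

```python
import pandas as pd

def replace_month(existing, fresh, month):
  """Swap all rows for `month` (YYYY-MM) in `existing` with the rows in `fresh`."""
  stamps = pd.to_datetime(existing['datetime'], errors='coerce', utc=True)
  # Keep every stored row that does not belong to the month being refreshed
  keep = existing[stamps.dt.strftime('%Y-%m') != month]
  return pd.concat([keep, fresh], ignore_index=True)

# Stored data: one June row and two July rows
existing = pd.DataFrame({
  'datetime': ['2024-06-30T23:00:00+00:00', '2024-07-01T10:00:00+00:00', '2024-07-02T11:00:00+00:00'],
  'outcome': ['Arrest', 'Arrest', 'Community resolution'],
})
# Latest response for July: one row changed upstream, one row deleted
fresh = pd.DataFrame({
  'datetime': ['2024-07-01T10:00:00+00:00'],
  'outcome': ['A no further action disposal'],
})

updated = replace_month(existing, fresh, '2024-07')
print(updated)
```

In the daily job, the existing CSV would be read with `pd.read_csv`, passed through a helper like this, and written back in full rather than appended to. A unique record ID and a last-modified timestamp in the API would make it possible to update individual rows instead of whole months.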