## Cleaner demo

Changes to the Cleaner module should only be pushed to main if the code below runs without errors.

The Cleaner class is primarily responsible for correcting spelling errors in PFD reports. It also standardises coroner names into _Initial. LastName_ format, which we use to assist with coroner-level filtering.
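
To make the target name format concrete, here is a minimal rule-based sketch. It is not how the Cleaner works internally (the Cleaner delegates this to the LLM); `to_initial_lastname` is a hypothetical helper written only to show what _Initial. LastName_ looks like for simple "First Last" inputs.

```python
def to_initial_lastname(full_name: str) -> str:
    """Format a name such as 'Lydia Brown' as 'L. Brown'.

    Hypothetical helper for illustration only; it assumes the first token
    is the given name and the last token is the surname.
    """
    parts = full_name.split()
    if len(parts) < 2:
        # Nothing to abbreviate: return the input with outer whitespace removed
        return full_name.strip()
    return f"{parts[0][0].upper()}. {parts[-1]}"


print(to_initial_lastname("Lydia Brown"))
print(to_initial_lastname("Xavier Mooyaart"))
```

A rule like this breaks on titles, middle names and multi-word surnames, which is one reason to leave the real standardisation to the LLM-backed Cleaner.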

In [1]:
from pfd_toolkit import Cleaner, LLM
from dotenv import load_dotenv
import os
import pandas as pd

# Read unclean reports from file (these were scraped with the Scraper class)
unclean_reports = pd.read_csv('../data/testreports.csv')

# Get API key
load_dotenv("api.env")
openai_api_key = os.getenv("OPENAI_API_KEY")

# Set up LLM client
llm_client = LLM(api_key=openai_api_key, 
                 max_workers=50)

# Run cleaner
cleaner = Cleaner(
    llm=llm_client,
    reports=unclean_reports)


cleaned_reports = cleaner.clean_reports()

Processing Fields: 100%|██████████| 6/6 [00:29<00:00,  4.98s/it]


In [2]:
cleaned_reports.head(n=10)

# Optionally save the cleaned reports to file
# cleaned_reports.to_csv('../data/testreports_cleaned.csv')

Unnamed: 0.1,Unnamed: 0,URL,ID,Date,CoronerName,Area,Receiver,InvestigationAndInquest,CircumstancesOfDeath,MattersOfConcern
0,0,https://www.judiciary.uk/prevention-of-future-...,2025-0140,2025-02-01,L. Brown,West London,Revon Healthcare,On 18 December 2023 I commenced an investigati...,James was found deceased in his room at Surbit...,During the inquest the court was advised that ...
1,1,https://www.judiciary.uk/prevention-of-future-...,2025-0136,2025-11-03,S. Ridge,Surrey,HMPPS,N/A: Not found,During the course of the inquest the court hea...,Probation staff are not always aware of or hav...
2,2,https://www.judiciary.uk/prevention-of-future-...,2025-0121,2025-04-03,N. Walker,"Hampshire, Portsmouth and Southampton",National Institute for Health and Care Excelle...,On 19th September 2023 an investigation was co...,Chloe Elizabeth Burgess was found deceased at ...,The inquest heard evidence that the potential ...
3,3,https://www.judiciary.uk/prevention-of-future-...,2025-0115,2025-02-28,A. Cox,Cornwall and the Isles of Scilly,MP; Secretary of State for Health & Social Care,"On 27 February 2025, I concluded a four-day ju...",Despite appropriate treatment by paramedics an...,Delay in ambulance response attributable to de...
4,4,https://www.judiciary.uk/prevention-of-future-...,2025-0114,2025-02-28,A. Cox,Cornwall and the Isles of Scilly,"Chief Constable, Devon & Cornwall Constabulary...","On 27th February 2025, I concluded a four-day ...",Mr Campbell had a history of recreational drug...,Delays in ambulance attendance. I have written...
5,5,https://www.judiciary.uk/prevention-of-future-...,2025-0113,2025-02-28,H. Westerman,"Shropshire, Telford & Wrekin",NHS England; Chief Executive of Shrewsbury and...,"On 12 July 2023 Mr Ellery, H.M. Senior Coroner...",Mr Green was admitted to The Royal Shrewsbury ...,(1) Once any patient at the Royal Shrewsbury H...
6,6,https://www.judiciary.uk/prevention-of-future-...,2025-0110,2025-02-27,R. Middleton,Dorset,The Home Office,"On the 13th June 2024, an investigation was co...",Mr Leatham-Prosser had started misusing ketami...,N/A: Not found
7,7,https://www.judiciary.uk/prevention-of-future-...,2025-0057,2025-01-31,J. Turner,"West Sussex, Brighton and Hove",Ministry of Defence,On 1 November 2023 I commenced an investigatio...,Mr Taylor had rapidly fallen into drug addicti...,When found to have taken illicit drugs months ...
8,8,https://www.judiciary.uk/prevention-of-future-...,2025-0055,2025-01-31,N. Parsley,Suffolk,Secretary of State Department of Health and So...,On 13th May 2024 I commenced an investigation ...,Kim Robinson's death was recognised at 05:16 o...,Following Kim's tragic death the GP who had pr...
9,9,https://www.judiciary.uk/prevention-of-future-...,2025-0048,2025-01-24,X. Mooyaart,Inner South London,NHS England,On 1 July 2021 an investigation into the death...,Mr Marriage had a longstanding diagnosis of id...,There are cohorts of patients who are medicati...


Above, we can see the output of our cleaning instance.

Let's compare it with the original, unclean reports that we imported earlier. Even though the content below is truncated, we can see that the cleaner has correctly standardised each coroner's name into the desired format. There are also a couple of instances in the longer sections where stray spaces have been removed (e.g. "On 19 th September" has been changed to "On 19th September").

In [3]:
unclean_reports.head(n=10)

Unnamed: 0.1,Unnamed: 0,URL,ID,Date,CoronerName,Area,Receiver,InvestigationAndInquest,CircumstancesOfDeath,MattersOfConcern
0,0,https://www.judiciary.uk/prevention-of-future-...,2025-0140,2025-02-01,Lydia Brown,West London,Revon Healthcare,On 18 December 2023 I commenced an investigati...,James was found deceased in his room at [REDAC...,During the course of the inquest the evidence ...
1,1,https://www.judiciary.uk/prevention-of-future-...,2025-0136,2025-11-03,Susan Ridge,Surrey,HMPPS,N/A: Not found,During the course of the inquest the court hea...,The MATTERS OF CONCERN are: a.Probation staff ...
2,2,https://www.judiciary.uk/prevention-of-future-...,2025-0121,2025-04-03,Nicholas Walker,"Hampshire, Portsmouth and Southampton",1. National Institute for Health and Care Exce...,On 19 th September 2023 an investigation was c...,Chloe Elizabeth Burgess was found deceased at ...,During the inquest the evidence revealed matte...
3,3,https://www.judiciary.uk/prevention-of-future-...,2025-0115,2025-02-28,Andrew Cox,Cornwall and the Isles of Scilly,"1. , MP, Secretary of State for Health & Socia...","On 27 February 2025, I concluded a four-day ju...",The jury recorded the following: Despite appro...,"During the course of these inquests, the evide..."
4,4,https://www.judiciary.uk/prevention-of-future-...,2025-0114,2025-02-28,Andrew Cox,Cornwall and the Isles of Scilly,"1. , Chief Constable, Devon & Cornwall Constab...","On 27/2/25, I concluded a four-day jury inques...",The relevant background circumstances are that...,"During the course of these inquests, the evide..."
5,5,https://www.judiciary.uk/prevention-of-future-...,2025-0113,2025-02-28,Heath Westerman,"Shropshire, Telford & Wrekin","1. NHS England, Wellington House, 133-155 Wate...","On 12 July 2023 Mr Ellery, H.M. Senior Coroner...",Mr Green was admitted to The Royal Shrewsbury ...,During the course of the inquest the evidence ...
6,6,https://www.judiciary.uk/prevention-of-future-...,2025-0110,2025-02-27,Richard Middleton,Dorset,The Home Office,"On the 13 th June 2024, an investigation was c...",Mr Leatham-Prosser had started misusing ketami...,N/A: Not found
7,7,https://www.judiciary.uk/prevention-of-future-...,2025-0057,2025-01-31,Joseph Turner,"West Sussex, Brighton and Hove",Ministry of Defence,On 01 November 2023 I commenced an investigati...,Mr Taylor had rapidly fallen into drug addicti...,During the course of the investigation my inqu...
8,8,https://www.judiciary.uk/prevention-of-future-...,2025-0055,2025-01-31,Nigel Parsley,Suffolk,Secretary of State Department of Health and So...,On 13 th May 2024 I commenced an investigation...,Kim Robinson's death was recognised at 05:16 o...,During the course of the inquest the evidence ...
9,9,https://www.judiciary.uk/prevention-of-future-...,2025-0048,2025-01-24,Xavier Mooyaart,Inner South London,NHS England,On 1 July 2021 an investigation into the death...,Mr Marriage had a longstanding diagnosis of id...,During the course of the inquest the evidence ...
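
The spacing repair visible in these rows ("On 19 th September" in the raw data, "On 19th September" after cleaning) can be approximated with a regular expression. This is only a sketch for spot-checking cleaned output: the Cleaner itself relies on the LLM rather than on a rule like this, and `join_ordinals` is a hypothetical name.

```python
import re

# A one- or two-digit day, whitespace, then a standalone ordinal suffix
_ORDINAL_GAP = re.compile(r"\b(\d{1,2})\s+(st|nd|rd|th)\b")


def join_ordinals(text: str) -> str:
    """Remove stray whitespace between a day number and its ordinal suffix."""
    return _ORDINAL_GAP.sub(r"\1\2", text)


print(join_ordinals("On 19 th September 2023 an investigation was commenced"))
print(join_ordinals("On 1 July 2021 an investigation into the death"))
```

The second call returns its input unchanged, because the pattern only fires when a standalone `st`, `nd`, `rd` or `th` follows the number.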


In [1]:
from pfd_toolkit import load_reports, Cleaner, LLM
from dotenv import load_dotenv
import os

# Get API key
load_dotenv("api.env")
openai_api_key = os.getenv("OPENAI_API_KEY")

# Set up LLM client
llm_client = LLM(api_key=openai_api_key, 
                 max_workers=50)

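# Load a sample of 10 reports
reports_samp = load_reports(n_reports=10)

# Set up the cleaner with the sampled reports
cleaner = Cleaner(
    llm=llm_client,
    reports=reports_samp)

# Summarise each report; the result gains a 'summary' column
summarised_reports = cleaner.summarise()
summarised_reports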

                                                                   

Unnamed: 0,URL,ID,Date,CoronerName,Area,Receiver,InvestigationAndInquest,CircumstancesOfDeath,MattersOfConcern,summary
0,https://www.judiciary.uk/prevention-of-future-...,2025-0248,2025-05-28,Clare Bailey,Teesside and Hartlepool,1 Department of Health and Social Care 2 Chief...,Mr Dean Bradley died on 15 th October 2021 at ...,At approximately 0300 on 15 th October 2021 Mr...,During the course of the investigation my inqu...,Mr Dean Bradley died by hanging on 15 October ...
1,https://www.judiciary.uk/prevention-of-future-...,2025-0243,2025-05-27,Andrew Cousins,Blackpool & Fylde,BARCHESTER HEALTHCARE LIMITED 1,"On 30 April 2025 and 23 May 2025, at an inques...",I returned the following in box 4 of the Recor...,During the course of the inquest the evidence ...,"Mr Keith Inseon, a resident at Glenroyd Care H..."
2,https://www.judiciary.uk/prevention-of-future-...,2025-0244,2025-05-27,Peter Merchant,West Yorkshire West,"1 , Chief Constable West Yorkshire Police 1",On 15 February 2024 the death of Paul Andrew A...,"As identified above, Paul Alexander had a long...",During the course of the investigation my inqu...,"Paul Andrew Alexander, who had a long history ..."
3,https://www.judiciary.uk/prevention-of-future-...,2025-0245,2025-05-27,Nadia Persaud,East London,", Chief Executive Officer, Barts Health NHS Fo...",On the 13 June 2024 I commenced an investigati...,Abdirahman Afrah began to suffer from chest pa...,During the course of the inquest the evidence ...,"Abdirahman Abdirizaq Afrah, aged 17, died from..."
4,https://www.judiciary.uk/prevention-of-future-...,2025-0246,2025-05-27,Rebecca Sutton,Durham and Darlington,"1. Deputy Chief Constable , Durham Constabular...",On 7 January 2025 an investigation into the de...,The Deceased had a long history of mental heal...,During the course of the inquest the evidence ...,"Sophie Ann Louise Cotton, aged 24, with a hist..."
5,https://www.judiciary.uk/prevention-of-future-...,2025-0241,2025-05-23,Mary Hassell,Inner North London,1. Commissioner Metropolitan Police Service (M...,"On 12 February 2016, I commenced an investigat...",Lewis Johnson died as a consequence of a road ...,"2 During the course of the inquest, the eviden...","Lewis Johnson, aged 18, died in a road traffic..."
6,https://www.judiciary.uk/prevention-of-future-...,2025-0242,2025-05-23,Mary Hassell,Inner North London,1. Director General Independent Office for Pol...,"On 12 February 2016, I commenced an investigat...",Lewis Johnson died as a consequence of a road ...,"2 During the course of the inquest, the eviden...","Lewis Johnson, aged 18, died in a road traffic..."
7,https://www.judiciary.uk/prevention-of-future-...,2025-0247,2025-05-23,Nadia Persaud,East London,"1. , CEO, North East London Foundation Trust (...",On 27 November 2024 I commenced an investigati...,Mr. Fraser was a 37-year-old gentleman who had...,During the course of the inquest the evidence ...,"George Kenneth Fraser, aged 37, with schizophr..."
8,https://www.judiciary.uk/prevention-of-future-...,2025-0236,2025-05-21,Kate Robertson,North West Wales,Betsi Cadwaladr University Health Board (BCUHB) 1,On 20 May 2024 I commenced an investigation in...,The circumstances of the death are as follows ...,"During the course of the inquest, the evidence...",Etta-Lili Stockwell-Parry was born at 40+13 we...
9,https://www.judiciary.uk/prevention-of-future-...,2025-0240,2025-05-21,Andrew Morse,South Wales Central,The Chief Executive Cardiff & Vale University ...,On 30 October 2023 I commenced an investigatio...,These were recorded as follows Robert Maxwell ...,During the course of the inquest the evidence ...,Robert Maxwell Smith died by suicide on 26 Oct...


In [2]:
summarised_reports.to_csv('../data/summarised_reports.csv')