## Topic identification

This notebook demonstrates how to use PFD Toolkit to identify recurring themes across Prevention of Future Deaths (PFD) reports.

In [1]:
from pfd_toolkit import load_reports, Extractor, LLM
from dotenv import load_dotenv
import os

# Load reports
reports = load_reports(n_reports=100)

# Set up LLM client
load_dotenv("api.env")
openai_api_key = os.getenv("OPENAI_API_KEY")

llm_client = LLM(api_key=openai_api_key, max_workers=40)

# Set up Extractor
extractor = Extractor(
    llm=llm_client,
    reports=reports
)

# Summarise reports
extractor.summarise()

Unnamed: 0,URL,ID,Date,CoronerName,Area,Receiver,InvestigationAndInquest,CircumstancesOfDeath,MattersOfConcern,summary
0,https://www.judiciary.uk/prevention-of-future-...,2025-0248,2025-05-28,Clare Bailey,Teesside and Hartlepool,1 Department of Health and Social Care 2 Chief...,Mr Dean Bradley died on 15 th October 2021 at ...,At approximately 0300 on 15 th October 2021 Mr...,During the course of the investigation my inqu...,Mr Dean Bradley died by hanging on 15 October ...
1,https://www.judiciary.uk/prevention-of-future-...,2025-0243,2025-05-27,Andrew Cousins,Blackpool & Fylde,BARCHESTER HEALTHCARE LIMITED 1,"On 30 April 2025 and 23 May 2025, at an inques...",I returned the following in box 4 of the Recor...,During the course of the inquest the evidence ...,"Mr Keith Inseon, a resident at Glenroyd Care H..."
2,https://www.judiciary.uk/prevention-of-future-...,2025-0244,2025-05-27,Peter Merchant,West Yorkshire West,"1 , Chief Constable West Yorkshire Police 1",On 15 February 2024 the death of Paul Andrew A...,"As identified above, Paul Alexander had a long...",During the course of the investigation my inqu...,"Paul Andrew Alexander, who had a long history ..."
3,https://www.judiciary.uk/prevention-of-future-...,2025-0245,2025-05-27,Nadia Persaud,East London,", Chief Executive Officer, Barts Health NHS Fo...",On the 13 June 2024 I commenced an investigati...,Abdirahman Afrah began to suffer from chest pa...,During the course of the inquest the evidence ...,"Abdirahman Abdirizaq Afrah, aged 17, died from..."
4,https://www.judiciary.uk/prevention-of-future-...,2025-0246,2025-05-27,Rebecca Sutton,Durham and Darlington,"1. Deputy Chief Constable , Durham Constabular...",On 7 January 2025 an investigation into the de...,The Deceased had a long history of mental heal...,During the course of the inquest the evidence ...,"Sophie Ann Louise Cotton, 24, died by suicide ..."
...,...,...,...,...,...,...,...,...,...,...
95,https://www.judiciary.uk/prevention-of-future-...,2025-0152,2025-03-18,Joanne Andrews,"West Sussex, Brighton and Hove",1. The President of the Royal College of Obste...,On 3 October 2023 I commenced an investigation...,At 28 weeks of gestation it was noted on scans...,During the course of the inquest the evidence ...,"Alonzo Christopher Andrew Wood, born on 23 Sep..."
96,https://www.judiciary.uk/prevention-of-future-...,2025-0149,2025-03-18,Andrew Hetherington,Northumberland,"CHIEF EXECUTIVE, NORTHUMBRIA HEALTHCARE NHS FO...",On 1 May 2024 I commenced an investigation int...,The deceased had considerable underlying natur...,During the course of the inquest the evidence ...,Renate Mark died from a head injury sustained ...
97,https://www.judiciary.uk/prevention-of-future-...,2025-0144,2025-03-17,Sean Horstead,Essex,1. Chief Executive Officer of Essex Partnershi...,On 31 st October 2023 I commenced an investiga...,On the 23 rd September 2023 after concerns wer...,: A significant number of the serious causativ...,"Darren Neil Turner, aged 37, died by suicide o..."
98,https://www.judiciary.uk/prevention-of-future-...,2025-0145,2025-03-17,Rachel Knight,South Wales Central,The Chief Executive Cardiff & Vale University ...,On 24 October 2023 I commenced an investigatio...,Mr Colley was left unsupervised with cot sides...,During the course of the inquest the evidence ...,"Colin Colley, aged 87 with dementia, frailty, ..."


In [2]:
# Estimate the total number of tokens in the summary column
extractor.estimate_tokens()

18347

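The estimate above is useful for checking that the summaries are likely to fit within a model's context window before themes are discovered. As a rough cross-check that does not depend on PFD Toolkit, the sketch below applies the common rule of thumb of about four characters per token for English text. The helper `rough_token_estimate` and the sample strings are hypothetical, and this heuristic is an assumption for illustration, not how `estimate_tokens()` computes its result.

```python
import math


def rough_token_estimate(texts, chars_per_token=4.0):
    """Approximate the total token count of an iterable of strings.

    Uses the ~4 characters-per-token rule of thumb for English text;
    a real tokenizer will give different numbers.
    """
    # Skip missing or empty entries rather than failing on them.
    total_chars = sum(len(t) for t in texts if t)
    return math.ceil(total_chars / chars_per_token)


# Hypothetical stand-ins for the values in the `summary` column.
sample_summaries = [
    "A resident fell after being left unsupervised.",
    "Delays in ambulance response contributed to the death.",
    None,  # missing summaries are skipped
]

print(rough_token_estimate(sample_summaries))
```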
In [3]:
# Discover up to five recurring themes across the reports
extractor.discover_themes(max_themes=5)

pfd_toolkit.extractor.DiscoveredThemes

In [4]:
print(extractor.identified_themes)

```json
{
  "mental_health": {
    "type": "bool",
    "description": "Concerns related to mental health care including risk assessments, communication failures, inadequate crisis response, and insufficient mental health service provision."
  },
  "communication_failures": {
    "type": "bool",
    "description": "Failures in communication and information sharing between healthcare providers, emergency services, social care, families, and other agencies leading to missed or delayed interventions."
  },
  "care_quality": {
    "type": "bool",
    "description": "Issues involving poor clinical care, inadequate monitoring, incomplete or inaccurate record-keeping, and failure to follow protocols or guidelines in healthcare and social care settings."
  },
  "systemic_delays": {
    "type": "bool",
    "description": "Delays caused by systemic factors such as ambulance response times, hospital overcrowding, social care shortages, and lack of available beds or resources impacting timely treat
```

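The printed schema can also be read programmatically. This is a minimal sketch, assuming `extractor.identified_themes` is a string formatted as printed above (a JSON object wrapped in a Markdown code fence). `parse_theme_schema` is a hypothetical helper, not part of PFD Toolkit, and the sample descriptions are shortened stand-ins for the real ones.

```python
import json

# Built indirectly so this example can itself sit inside a code fence.
FENCE = "`" * 3


def parse_theme_schema(raw):
    """Parse a theme schema string, tolerating a surrounding Markdown code fence."""
    lines = raw.strip().splitlines()
    # Drop an opening fence (with or without a language tag) if present.
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    # Drop a closing fence if present.
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return json.loads("\n".join(lines))


# Abbreviated stand-in for the printed schema (descriptions shortened).
raw_schema = "\n".join([
    FENCE + "json",
    "{",
    '  "mental_health": {"type": "bool", "description": "Mental health care concerns."},',
    '  "communication_failures": {"type": "bool", "description": "Communication failures."}',
    "}",
    FENCE,
])

themes = parse_theme_schema(raw_schema)
for name, spec in themes.items():
    print(f"{name} ({spec['type']}): {spec['description']}")
```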
In [5]:
# Reset the extractor, then tag each report against the discovered themes
assigned_reports = extractor.reset().extract_features(
    force_assign=True,
    allow_multiple=True,
)

Extracting features: 100%|██████████| 100/100 [00:05<00:00, 17.64it/s]


In [6]:
assigned_reports.head(10)

Unnamed: 0,URL,ID,Date,CoronerName,Area,Receiver,InvestigationAndInquest,CircumstancesOfDeath,MattersOfConcern,mental_health,communication_failures,care_quality,systemic_delays,safety_management
0,https://www.judiciary.uk/prevention-of-future-...,2025-0248,2025-05-28,Clare Bailey,Teesside and Hartlepool,1 Department of Health and Social Care 2 Chief...,Mr Dean Bradley died on 15 th October 2021 at ...,At approximately 0300 on 15 th October 2021 Mr...,During the course of the investigation my inqu...,True,True,False,False,False
1,https://www.judiciary.uk/prevention-of-future-...,2025-0243,2025-05-27,Andrew Cousins,Blackpool & Fylde,BARCHESTER HEALTHCARE LIMITED 1,"On 30 April 2025 and 23 May 2025, at an inques...",I returned the following in box 4 of the Recor...,During the course of the inquest the evidence ...,False,False,True,False,True
2,https://www.judiciary.uk/prevention-of-future-...,2025-0244,2025-05-27,Peter Merchant,West Yorkshire West,"1 , Chief Constable West Yorkshire Police 1",On 15 February 2024 the death of Paul Andrew A...,"As identified above, Paul Alexander had a long...",During the course of the investigation my inqu...,True,True,False,True,False
3,https://www.judiciary.uk/prevention-of-future-...,2025-0245,2025-05-27,Nadia Persaud,East London,", Chief Executive Officer, Barts Health NHS Fo...",On the 13 June 2024 I commenced an investigati...,Abdirahman Afrah began to suffer from chest pa...,During the course of the inquest the evidence ...,False,True,True,True,False
4,https://www.judiciary.uk/prevention-of-future-...,2025-0246,2025-05-27,Rebecca Sutton,Durham and Darlington,"1. Deputy Chief Constable , Durham Constabular...",On 7 January 2025 an investigation into the de...,The Deceased had a long history of mental heal...,During the course of the inquest the evidence ...,True,True,False,True,True
5,https://www.judiciary.uk/prevention-of-future-...,2025-0241,2025-05-23,Mary Hassell,Inner North London,1. Commissioner Metropolitan Police Service (M...,"On 12 February 2016, I commenced an investigat...",Lewis Johnson died as a consequence of a road ...,"2 During the course of the inquest, the eviden...",False,True,False,True,True
6,https://www.judiciary.uk/prevention-of-future-...,2025-0242,2025-05-23,Mary Hassell,Inner North London,1. Director General Independent Office for Pol...,"On 12 February 2016, I commenced an investigat...",Lewis Johnson died as a consequence of a road ...,"2 During the course of the inquest, the eviden...",False,False,False,False,True
7,https://www.judiciary.uk/prevention-of-future-...,2025-0247,2025-05-23,Nadia Persaud,East London,"1. , CEO, North East London Foundation Trust (...",On 27 November 2024 I commenced an investigati...,Mr. Fraser was a 37-year-old gentleman who had...,During the course of the inquest the evidence ...,True,True,True,False,False
8,https://www.judiciary.uk/prevention-of-future-...,2025-0236,2025-05-21,Kate Robertson,North West Wales,Betsi Cadwaladr University Health Board (BCUHB) 1,On 20 May 2024 I commenced an investigation in...,The circumstances of the death are as follows ...,"During the course of the inquest, the evidence...",False,True,True,False,True
9,https://www.judiciary.uk/prevention-of-future-...,2025-0240,2025-05-21,Andrew Morse,South Wales Central,The Chief Executive Cardiff & Vale University ...,On 30 October 2023 I commenced an investigatio...,These were recorded as follows Robert Maxwell ...,During the course of the inquest the evidence ...,True,True,False,False,False

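Once each report carries a boolean flag per theme, a natural next step is to count how often each theme occurs. The sketch below does this in plain Python for the ten rows shown above, with the flags copied from that output. The helper `theme_counts` is hypothetical and not part of PFD Toolkit.

```python
from collections import Counter

THEMES = [
    "mental_health",
    "communication_failures",
    "care_quality",
    "systemic_delays",
    "safety_management",
]

# Theme flags for the ten reports shown by `head(10)`, keyed by report ID.
# Each tuple follows the order of THEMES.
assignments = {
    "2025-0248": (True, True, False, False, False),
    "2025-0243": (False, False, True, False, True),
    "2025-0244": (True, True, False, True, False),
    "2025-0245": (False, True, True, True, False),
    "2025-0246": (True, True, False, True, True),
    "2025-0241": (False, True, False, True, True),
    "2025-0242": (False, False, False, False, True),
    "2025-0247": (True, True, True, False, False),
    "2025-0236": (False, True, True, False, True),
    "2025-0240": (True, True, False, False, False),
}


def theme_counts(rows, themes):
    """Count how many rows have each theme flagged as True."""
    counts = Counter({theme: 0 for theme in themes})
    for flags in rows:
        for theme, flag in zip(themes, flags):
            if flag:
                counts[theme] += 1
    return counts


counts = theme_counts(assignments.values(), THEMES)
for theme, n in counts.most_common():
    print(f"{theme}: {n}/{len(assignments)}")
```

In these ten rows, `communication_failures` is flagged most often (8 of 10). On the full DataFrame, and assuming the theme columns hold booleans, `assigned_reports[THEMES].sum()` should give the equivalent counts for all 100 reports.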