# HES Outpatients Dataset
## 1. Summary
The information below is retrieved from the Health Data Gateway API developed by NHS England, with additional fields added by UK LLC.

In [1]:
# define target dataset to document
schema = 'nhsd'
table = 'HESOP'
version = 'v0002'
# make the helper scripts importable, then import the helper class
import sys
script_fp = "../../../../scripts/"
sys.path.insert(0, script_fp)
from data_doc_helper import DocHelper
# create a helper instance for the target dataset
document = DocHelper(schema, table, version, script_fp)
# needed to render Markdown from code cells
from IPython.display import display, Markdown

In [2]:
# get api data
dataset = document.get_api_data()
display(Markdown("**NHS England title of dataset:** "+dataset['datasetfields']['datautility']['title']))
display(Markdown(f"**Dataset name in UK LLC TRE**: {schema}.{table}"))
display(Markdown("**Nested datasets**: HESAPC encompasses the following three datasets: Maternity dataset (HESAPC_MAT), Critical Care minimum dataset (HESCC) and the retired Augmented Care Periods dataset (HESAPC_ACP)"))  
display(Markdown("**Short abstract:** "+dataset['datasetfields']['abstract']))
display(Markdown("**Extended abstract:** "+dataset['datasetv2']['documentation']['description']))
display(Markdown("**Geographical coverage:** "+dataset['datasetfields']['geographicCoverage'][0]))
display(Markdown("**Temporal coverage:** "+dataset['datasetfields']['datasetStartDate']))
display(Markdown("**Data available from**: 01/04/1997 onwards"))
display(Markdown("**Typical age range:** "+dataset['datasetfields']['ageBand']))
display(Markdown("**Collection situation:** "+dataset['datasetv2']['provenance']['origin']['collectionSituation'][0]))
display(Markdown("**Purpose:** "+dataset['datasetv2']['provenance']['origin']['purpose'][0]))
display(Markdown("**Source:** "+dataset['datasetv2']['provenance']['origin']['source'][0]))
display(Markdown("**Pathway:** "+dataset['datasetv2']['coverage']['pathway']))
display(Markdown("**Information collected**: Patient demographics, date and source of admission, waiting time, reason for admission, clinical diagnosis and procedures performed, and date and destination of discharge"))  
display(Markdown("**Structure of dataset**: Each record represents one outpatient appointment"))
display(Markdown("**Semantic annotations:** "+dataset['datasetv2']['accessibility']['formatAndStandards']['vocabularyEncodingScheme'][0]))
display(Markdown("**Data models:** "+dataset['datasetv2']['accessibility']['formatAndStandards']['conformsTo'][0]))
display(Markdown("**Language:** "+dataset['datasetv2']['accessibility']['formatAndStandards']['language'][0]))
display(Markdown("**Update frequency in UK LLC TRE**: Quarterly"))  
display(Markdown("**Dataset version in UK LLC TRE**: TBC"))
display(Markdown("**Summary of changes between dataset versions**: TBC"))  
display(Markdown("**Data quality issues**: TBC"))  
display(Markdown("**Restrictions to data usage**: Medical purposes only"))  
display(Markdown("**Further information**: https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/hospital-episode-statistics"))


**NHS England title of dataset:** Hospital Episode Statistics Outpatients

**Dataset name in UK LLC TRE**: nhsd.HESOP

**Nested datasets**: HESAPC encompasses the following three datasets: Maternity dataset (HESAPC_MAT), Critical Care minimum dataset (HESCC) and the retired Augmented Care Periods dataset (HESAPC_ACP)

**Short abstract:** Record-level patient data set of patients attending outpatient clinics at NHS hospitals in England. A record represents one appointment.

**Extended abstract:** Hospital Episode Statistics (HES) is a database containing details of all admissions, A and E attendances and outpatient appointments at NHS hospitals in England.

Initially this data is collected during a patient's time at hospital as part of the Commissioning Data Set (CDS). This is submitted to NHS Digital for processing and is returned to healthcare providers as the Secondary Uses Service (SUS) data set and includes information relating to payment for activity undertaken. It allows hospitals to be paid for the care they deliver. 

This same data can also be processed and used for non-clinical purposes, such as research and planning health services. Because these uses are not to do with direct patient care, they are called 'secondary uses'. This is the HES data set.

HES data covers all NHS Clinical Commissioning Groups (CCGs) in England, including:

- private patients treated in NHS hospitals
- patients resident outside of England
- care delivered by treatment centres (including those in the independent sector) funded by the NHS

Each HES record contains a wide range of information about an individual patient admitted to an NHS hospital, including:

- clinical information about diagnoses and operations
- patient information, such as age group, gender and ethnicity
- administrative information, such as dates and methods of admission and discharge
- geographical information such as where patients are treated and the area where they live

We apply a strict statistical disclosure control in accordance with the NHS Digital protocol, to all published HES data. This suppresses small numbers to stop people identifying themselves and others, to ensure that patient confidentiality is maintained.

https://digital.nhs.uk/data-and-information/publications/statistical/hospital-outpatient-activity

**Geographical coverage:** United Kingdom, England

**Temporal coverage:** 2003-04-01

**Data available from**: 01/04/1997 onwards

**Typical age range:** 0-150

**Collection situation:** OUTPATIENTS

**Purpose:** CARE

**Source:** EPR

**Pathway:** Secondary Care pathway. This dataset covers outpatient appointments at hospitals in England. It includes information on the treatment and outcome of the appointment.

**Information collected**: Patient demographics, date and source of admission, waiting time, reason for admission, clinical diagnosis and procedures performed, and date and destination of discharge

**Structure of dataset**: Each record represents one outpatient appointment

**Semantic annotations:** OPCS4

**Data models:** NHS DATA DICTIONARY

**Language:** en

**Update frequency in UK LLC TRE**: Quarterly

**Dataset version in UK LLC TRE**: TBC

**Summary of changes between dataset versions**: TBC

**Data quality issues**: TBC

**Restrictions to data usage**: Medical purposes only

**Further information**: https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/hospital-episode-statistics
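
The summary cell above indexes several levels into the API response, so a single missing key or empty list would raise an error and stop the page from rendering. One way to guard against that is sketched below. `get_nested` and the `sample` dictionary are hypothetical illustrations, not part of `DocHelper` or the API; the sample only mimics the shape of the fields used above.

```python
from typing import Any

def get_nested(data: Any, path: list, default: str = "Not available") -> Any:
    """Follow dict keys / list indexes in `path`; return `default` if any step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Hypothetical stand-in for the API response (assumed shape, illustrative values)
sample = {
    "datasetv2": {
        "provenance": {"origin": {"purpose": ["CARE"], "source": []}},
    },
}

purpose = get_nested(sample, ["datasetv2", "provenance", "origin", "purpose", 0])
source = get_nested(sample, ["datasetv2", "provenance", "origin", "source", 0])  # empty list
pathway = get_nested(sample, ["datasetv2", "coverage", "pathway"])  # missing key

print(f"**Purpose:** {purpose}")
print(f"**Source:** {source}")
print(f"**Pathway:** {pathway}")
```

With a lookup like this, a field that is absent from the API response is shown as "Not available" instead of interrupting the rest of the summary.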

## 2. Metrics
Below we include tables that summarise the HESOP dataset in the UK LLC TRE.

In [3]:
# count records by extract date
gb = document.get_extract_count()
print(gb.to_markdown(index=False, tablefmt="fancy_grid"))

╒════════════════╤═════════╕
│ extract_date   │   count │
╞════════════════╪═════════╡
│ 2022-01-07     │ 5592065 │
├────────────────┼─────────┤
│ 2022-02-11     │  312950 │
├────────────────┼─────────┤
│ 2022-03-04     │     128 │
├────────────────┼─────────┤
│ 2022-06-10     │  617775 │
├────────────────┼─────────┤
│ 2022-06-30     │    4792 │
├────────────────┼─────────┤
│ 2022-08-25     │  513438 │
├────────────────┼─────────┤
│ 2022-12-21     │  590237 │
├────────────────┼─────────┤
│ 2023-04-13     │  382208 │
├────────────────┼─────────┤
│ total          │ 8013593 │
╘════════════════╧═════════╛


**Table 1** The number of HESOP records in the UK LLC TRE by extract date
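
As a quick consistency check on Table 1, the per-extract counts should sum to the `total` row. The sketch below hard-codes the values printed above rather than calling `get_extract_count()`, because the structure of its return value is not shown here.

```python
# Counts copied from Table 1 (extract_date -> number of records)
extract_counts = {
    "2022-01-07": 5592065,
    "2022-02-11": 312950,
    "2022-03-04": 128,
    "2022-06-10": 617775,
    "2022-06-30": 4792,
    "2022-08-25": 513438,
    "2022-12-21": 590237,
    "2023-04-13": 382208,
}
reported_total = 8013593  # the "total" row in Table 1

computed_total = sum(extract_counts.values())
print(f"computed: {computed_total}, reported: {reported_total}")
assert computed_total == reported_total, "total row does not match the sum of the extracts"
```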

In [4]:
# count participants by cohort
gb_cohort = document.get_cohort_count()
print(gb_cohort.to_markdown(index=False, tablefmt="fancy_grid"))

╒════════════════╤═════════╕
│ cohort         │   count │
╞════════════════╪═════════╡
│ ALSPAC         │    5677 │
├────────────────┼─────────┤
│ BCS70          │    5681 │
├────────────────┼─────────┤
│ BIB            │   26088 │
├────────────────┼─────────┤
│ ELSA           │    6952 │
├────────────────┼─────────┤
│ EPICN          │   14684 │
├────────────────┼─────────┤
│ EXCEED         │    9144 │
├────────────────┼─────────┤
│ FENLAND        │   10027 │
├────────────────┼─────────┤
│ GLAD           │   44330 │
├────────────────┼─────────┤
│ MCS            │   17080 │
├────────────────┼─────────┤
│ NCDS58         │    6143 │
├────────────────┼─────────┤
│ NEXTSTEP       │    4433 │
├────────────────┼─────────┤
│ NIHRBIO_COPING │   16052 │
├────────────────┼─────────┤
│ NSHD46         │    2861 │
├────────────────┼─────────┤
│ TRACKC19       │   13247 │
├────────────────┼─────────┤
│ TWINSUK        │   12553 │
├────────────────┼─────────┤
│ UKHLS          │    6609 │
├─────────────

**Table 2** The number of participants from each Longitudinal Population Study (LPS) represented in the HESOP dataset in the UK LLC TRE
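
Table 2 counts participants rather than records, so a participant with many outpatient appointments is counted once within their cohort. A minimal sketch of that distinction follows, using made-up rows. The `cohort` and `participant_id` names are assumptions for illustration and may not match the column names in the UK LLC TRE or the logic inside `get_cohort_count()`.

```python
from collections import Counter, defaultdict

# Made-up (cohort, participant_id) pairs, one pair per outpatient record
records = [
    ("ALSPAC", "p1"), ("ALSPAC", "p1"), ("ALSPAC", "p2"),
    ("BIB", "p3"), ("BIB", "p3"), ("BIB", "p3"),
]

# Number of records per cohort
record_counts = Counter(cohort for cohort, _ in records)

# Number of distinct participants per cohort
participants = defaultdict(set)
for cohort, participant_id in records:
    participants[cohort].add(participant_id)
participant_counts = {cohort: len(ids) for cohort, ids in participants.items()}

print(dict(record_counts))
print(participant_counts)
```

In this toy data the record counts and participant counts differ for both cohorts, which is why per-cohort figures should not be expected to add up to the record totals in Table 1.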

## 3. Helpful syntax
Below we will include syntax that may be helpful to other researchers in the UK LLC TRE. For longer scripts, we will include a snippet of the code plus a link to Git where you can find the full script. 