## ECDS emergency care admission vs SUS hospital admission validation

There is a delay between a patient being admitted to hospital with COVID-19 and the admission data becoming available. SUS-APCS data is the 'gold standard' for COVID-19 hospital admissions, but a record is only created once the patient is discharged (home, elsewhere, or died). Hospital spells still ongoing at the time of the SUS extract are therefore missing, creating an ascertainment bias against longer and more recent spells.

This means we cannot rapidly evaluate vaccine effectiveness in reducing hospital admissions.

As a large proportion of hospital admissions come through A&E attendance, emergency care data from ECDS may provide a good enough picture of hospital admissions. The validation below tests this against SUS.

### Methods

* Patients hospitalised due to COVID-19 are first identified using SUS.
* Patients admitted via emergency care due to COVID-19 are identified using the criteria below:
    * All patients who attended A&E AND were then discharged to a hospital ward or ICU are selected.
    * Of these patients, those admitted due to COVID-19 are taken to be those who
        * were admitted with a positive COVID-19 code OR
        * had a positive COVID-19 test in the 2 weeks prior to admission OR
        * were recorded as COVID-19 positive in primary care in the 2 weeks prior to admission (primary care positive test, primary care COVID-19 code, primary care COVID-19 sequelae).
* The ability of ECDS data to identify those admitted to hospital due to COVID-19 is assessed against SUS data.
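
The selection criteria above can be sketched as two boolean masks combined with AND. This is a minimal sketch on made-up rows, not the study's actual extraction logic: the column `ae_discharge_destination` is a hypothetical name used only for illustration, and the value `0` stands in for any other discharge destination. The remaining column names and the two discharge codes are the ones used in this notebook's code.

```python
import pandas as pd

# Discharge destination codes used in this notebook (ward, ICU).
AE_DISCHARGE_CODES = [306706006, 1066391000000105]

# Made-up rows; `ae_discharge_destination` is a hypothetical column name
# and 0 is a placeholder for "any other destination".
toy = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "ae_discharge_destination": [306706006, 306706006, 1066391000000105, 0],
    "ae_attendance_covid_status": [1, 0, 0, 1],
    "positive_covid_test_before_ae_attendance": [0, 0, 1, 1],
    "covid_primary_care_before_ae_attendance": [0, 0, 0, 0],
})

# Criterion 1: attended A&E and was discharged to a ward or ICU.
admitted = toy["ae_discharge_destination"].isin(AE_DISCHARGE_CODES)

# Criterion 2: at least one piece of COVID-19 evidence.
covid_evidence = (
    (toy["ae_attendance_covid_status"] == 1)
    | (toy["positive_covid_test_before_ae_attendance"] == 1)
    | (toy["covid_primary_care_before_ae_attendance"] == 1)
)

# A patient counts as an ECDS COVID-19 admission only if both criteria hold.
ecds_covid_admission = admitted & covid_evidence
selected_ids = toy.loc[ecds_covid_admission, "patient_id"].tolist()
print(selected_ids)
```

In this toy example patient 2 is admitted without COVID-19 evidence and patient 4 has evidence but is not admitted, so neither is selected.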


In [46]:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib.colors import ListedColormap


%matplotlib inline

pd.options.display.float_format = '{:.0f}'.format

In [40]:
df = pd.read_csv('../output/input.csv')

# Codes for the A&E discharge destinations that count as a hospital admission
ae_discharge_dict = {"discharged_to_ward": 306706006, "discharged_to_icu": 1066391000000105}
ae_discharge_list = list(ae_discharge_dict.values())


# SUS: a recorded primary COVID-19 hospital admission marks the patient as positive
positive_covid_patients_sus = df[df['primary_covid_hospital_admission'].notna()]
negative_covid_patients_sus = df[df['primary_covid_hospital_admission'].isna()]

# ECDS: A&E attendance plus at least one piece of COVID-19 evidence
ae_attended = df['ae_attendance'] == 1
covid_evidence = (
    (df['ae_attendance_covid_status'] == 1)
    | (df['positive_covid_test_before_ae_attendance'] == 1)
    | (df['covid_primary_care_before_ae_attendance'] == 1)
)

positive_covid_patients_ecds = df[ae_attended & covid_evidence]
# `~` applies to the attendance flag only, so this group is patients with COVID-19
# evidence but no A&E attendance; it is not the full complement of the ECDS-positive group.
negative_covid_patients_ecds = df[~ae_attended & covid_evidence]


sus_patients_positive = list(positive_covid_patients_sus['patient_id'])
ecds_patients_positive = list(positive_covid_patients_ecds['patient_id'])
sus_patients_negative = list(negative_covid_patients_sus['patient_id'])
ecds_patients_negative = list(negative_covid_patients_ecds['patient_id'])

In [50]:
sus_pos_ecds_pos = len(set(sus_patients_positive) & set(ecds_patients_positive))
sus_pos_ecds_neg = len(set(sus_patients_positive) & set(ecds_patients_negative))
sus_neg_ecds_pos = len(set(sus_patients_negative) & set(ecds_patients_positive))
sus_neg_ecds_neg = len(set(sus_patients_negative) & set(ecds_patients_negative))

pd.DataFrame(
    [[sus_pos_ecds_pos, sus_neg_ecds_pos], [sus_pos_ecds_neg, sus_neg_ecds_neg]],
    columns=["SUS-positive", "SUS-negative"],
    index=["ECDS-positive", "ECDS-negative"],
)


| | SUS-positive | SUS-negative |
|---|---|---|
| ECDS-positive | 1153 | 2744 |
| ECDS-negative | 1778 | 4107 |
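
One way to make sure the four cells of such a table cover every patient exactly once is to hold each classification as a boolean column and cross-tabulate the two. The following is a minimal sketch on made-up rows; the column names `in_sus` and `in_ecds` are hypothetical and do not appear in the study data.

```python
import pandas as pd

# Made-up rows; `in_sus` and `in_ecds` are hypothetical flag columns.
toy = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "in_sus": [True, True, True, False, False, False],
    "in_ecds": [True, True, False, True, False, False],
})

# Label each flag, then count patients in every combination.
ecds_label = toy["in_ecds"].map({True: "ECDS-positive", False: "ECDS-negative"})
sus_label = toy["in_sus"].map({True: "SUS-positive", False: "SUS-negative"})
table = pd.crosstab(ecds_label, sus_label)

# Fix the row and column order; fill any missing combination with 0.
table = table.reindex(
    index=["ECDS-positive", "ECDS-negative"],
    columns=["SUS-positive", "SUS-negative"],
    fill_value=0,
)
print(table)
```

Because every patient has exactly one value for each flag, the cell counts always sum to the number of rows.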


In [41]:
# Sensitivity: proportion of SUS-positive patients who are also ECDS-positive
# Specificity: proportion of SUS-negative patients who are also ECDS-negative

sensitivity = len(set(sus_patients_positive) & set(ecds_patients_positive)) / len(sus_patients_positive)
print(f"Sensitivity: {sensitivity}")

specificity = len(set(sus_patients_negative) & set(ecds_patients_negative)) / len(sus_patients_negative)
print(f"Specificity : {specificity}")


Sensitivity: 0.38433333333333336
Specificity : 0.5867142857142857
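
For reference, sensitivity and specificity, along with the related predictive values, can be written directly in terms of the four cells of a 2x2 table. The helper below is a minimal sketch: `diagnostic_metrics` is a hypothetical function, not part of this analysis, and the counts passed to it are made up.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 metrics, treating SUS as the reference standard.

    tp: SUS-positive and ECDS-positive
    fn: SUS-positive and ECDS-negative
    fp: SUS-negative and ECDS-positive
    tn: SUS-negative and ECDS-negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true admissions detected
        "specificity": tn / (tn + fp),  # share of non-admissions correctly excluded
        "ppv": tp / (tp + fp),          # share of ECDS-positives that are true admissions
        "npv": tn / (tn + fn),          # share of ECDS-negatives that are true non-admissions
    }


# Made-up counts, for illustration only.
metrics = diagnostic_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

These formulas assume the four cells partition the whole population, so each denominator is the sum of two cells.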


In [42]:
# Among patients identified by both SUS and ECDS, find the share meeting each ECDS criterion
correct_patient_ids = list(set(sus_patients_positive) & set(ecds_patients_positive))
correct_patients = positive_covid_patients_ecds[positive_covid_patients_ecds['patient_id'].isin(correct_patient_ids)]
selected_variables = ["ae_attendance_covid_status", "positive_covid_test_before_ae_attendance", "covid_primary_care_before_ae_attendance"]

vars_dict = {}

for var in selected_variables:
    nums = (correct_patients[var] == 1).sum()
    total = correct_patients.shape[0]
    vars_dict[var] = nums/total

vars_dict

{'ae_attendance_covid_status': 0.9245446660884649,
 'positive_covid_test_before_ae_attendance': 0.5125758889852559,
 'covid_primary_care_before_ae_attendance': 0.49176062445793584}
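
The three proportions above sum to more than 1, so the criteria overlap: a patient can meet several at once. A minimal sketch of how that overlap might be summarised, using made-up rows with the same three column names:

```python
import pandas as pd

criteria = [
    "ae_attendance_covid_status",
    "positive_covid_test_before_ae_attendance",
    "covid_primary_care_before_ae_attendance",
]

# Made-up rows, for illustration only.
toy = pd.DataFrame({
    "ae_attendance_covid_status": [1, 1, 0, 1],
    "positive_covid_test_before_ae_attendance": [1, 0, 1, 1],
    "covid_primary_care_before_ae_attendance": [0, 0, 0, 1],
})

met = toy[criteria].eq(1)

# Share of patients meeting each criterion (same quantity as vars_dict).
share_meeting_each = met.mean()

# Number of criteria met per patient, then how many patients fall in each bucket.
n_criteria_met = met.sum(axis=1)
patients_by_n_criteria = n_criteria_met.value_counts().sort_index()

print(share_meeting_each)
print(patients_by_n_criteria)
```

A breakdown like `patients_by_n_criteria` would show how many patients are identified by a single criterion alone, which the per-criterion proportions cannot show.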