## ECDS emergency care admission vs SUS hospital admission validation

There is a delay between a patient being admitted to hospital with COVID-19 and the admission data becoming available. SUS-APCS data is the 'gold standard' for COVID-19 hospital admissions, but a record is only created once the patient is discharged (home, elsewhere, or died). Hospital spells that are still ongoing at the time of the SUS extract are therefore missing, creating an ascertainment bias against longer spells and more recent spells.

This means we cannot rapidly evaluate vaccine effectiveness with respect to reducing hospital admission.

As a large proportion of hospital admissions come through A&E attendance, emergency admission data from ECDS may provide a good enough picture of hospital admissions. Below is a validation of this.

### Methods

* Patients hospitalised due to COVID-19 are first identified using SUS.
* Patients admitted to emergency care due to COVID-19 are identified using the criteria below:
    * All patients who attended A&E AND were then discharged to hospital or ICU are selected.
    * Of these patients, those admitted due to COVID-19 are identified as those who:
        * Are admitted with a positive COVID-19 code OR
        * Have had a positive COVID-19 test in the 2 weeks prior to admission OR
        * Have been recorded as COVID-19 positive in primary care in the 2 weeks prior to admission (primary care positive test, primary care covid code, primary care covid sequelae).
* The ability of ECDS data to identify those admitted to hospital due to COVID-19 is assessed against SUS data.
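
The ECDS criteria above can be sketched as a single boolean mask. This is a minimal illustration on invented toy data, not the notebook's own cell: the column names are taken from the study extract, the function name `ecds_covid_admission` is hypothetical, and it assumes `ae_attendance` is 1 only for attendances that ended in discharge to hospital or ICU.

```python
import pandas as pd

# Toy data: column names follow the study extract, values are invented.
toy = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5],
    "ae_attendance": [1, 1, 1, 0, 0],
    "ae_attendance_covid_status": [1, 0, 0, 1, 0],
    "positive_covid_test_before_ae_attendance": [0, 1, 0, 0, 0],
    "covid_primary_care_before_ae_attendance": [0, 0, 0, 1, 0],
})


def ecds_covid_admission(df: pd.DataFrame) -> pd.Series:
    """Return True for rows that meet the ECDS COVID-19 admission criteria."""
    # Any one of the three COVID-19 markers is sufficient.
    any_covid_marker = (
        (df["ae_attendance_covid_status"] == 1)
        | (df["positive_covid_test_before_ae_attendance"] == 1)
        | (df["covid_primary_care_before_ae_attendance"] == 1)
    )
    # The marker only counts when there is a qualifying A&E attendance.
    return (df["ae_attendance"] == 1) & any_covid_marker


flag = ecds_covid_admission(toy)
ecds_positive_ids = set(toy.loc[flag, "patient_id"])
# Negating the whole mask gives the exact complement of the positive group.
ecds_negative_ids = set(toy.loc[~flag, "patient_id"])
```

Negating the complete mask, rather than only the attendance test, places every patient in exactly one of the two groups.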


In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib.colors import ListedColormap


%matplotlib inline

pd.options.display.float_format = '{:.0f}'.format

In [2]:
df = pd.read_csv('../output/input.csv')


# A&E discharge destination codes: discharged to ward, discharged to ICU
ae_discharge_dict = {"discharged_to_ward": 306706006, "discharged_to_icu": 1066391000000105}
ae_discharge_list = list(ae_discharge_dict.values())


# SUS: a recorded primary COVID-19 hospital admission marks a patient as positive
positive_covid_patients_sus = df[df['primary_covid_hospital_admission'].notna()]
negative_covid_patients_sus = df[df['primary_covid_hospital_admission'].isna()]

# Any of the three COVID-19 markers recorded for the patient
covid_marker = (
    (df['ae_attendance_covid_status'] == 1)
    | (df['positive_covid_test_before_ae_attendance'] == 1)
    | (df['covid_primary_care_before_ae_attendance'] == 1)
)

positive_covid_patients_ecds = df[(df['ae_attendance'] == 1) & covid_marker]
# NOTE: `~` applies only to the attendance test, so this selects patients WITHOUT
# an A&E attendance who have a COVID-19 marker. It is not the complement of the
# ECDS-positive group: patients with no marker fall into neither group.
negative_covid_patients_ecds = df[~(df['ae_attendance'] == 1) & covid_marker]


sus_patients_positive = set(positive_covid_patients_sus['patient_id'])
ecds_patients_positive = set(positive_covid_patients_ecds['patient_id'])
sus_patients_negative = set(negative_covid_patients_sus['patient_id'])
ecds_patients_negative = set(negative_covid_patients_ecds['patient_id'])

In [3]:
# Count patients in each cell of the 2x2 table (SUS status x ECDS status)
sus_pos_ecds_pos = len(sus_patients_positive & ecds_patients_positive)
sus_pos_ecds_neg = len(sus_patients_positive & ecds_patients_negative)
sus_neg_ecds_pos = len(sus_patients_negative & ecds_patients_positive)
sus_neg_ecds_neg = len(sus_patients_negative & ecds_patients_negative)

pd.DataFrame(
    [
        [sus_pos_ecds_pos, sus_neg_ecds_pos, sus_pos_ecds_pos + sus_neg_ecds_pos],
        [sus_pos_ecds_neg, sus_neg_ecds_neg, sus_pos_ecds_neg + sus_neg_ecds_neg],
        [sus_pos_ecds_pos + sus_pos_ecds_neg, sus_neg_ecds_pos + sus_neg_ecds_neg,
         sus_pos_ecds_pos + sus_pos_ecds_neg + sus_neg_ecds_pos + sus_neg_ecds_neg],
    ],
    columns=["SUS-positive", "SUS-negative", "Total"],
    index=["ECDS-positive", "ECDS-negative", "Total"],
)


| | SUS-positive | SUS-negative | Total |
|---|---|---|---|
| ECDS-positive | 1172 | 2738 | 3910 |
| ECDS-negative | 1770 | 4095 | 5865 |
| Total | 2942 | 6833 | 9775 |
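
A 2×2 table with totals of this shape can also be built with `pandas.crosstab`. The sketch below is an alternative to the set-intersection approach, shown on invented toy data; it assumes each patient row carries one boolean flag per data source, and the column names `sus_positive` and `ecds_positive` are hypothetical.

```python
import pandas as pd

# Toy data: one row per patient, one boolean flag per data source (invented values).
toy = pd.DataFrame({
    "sus_positive": [True, True, False, False, False, True],
    "ecds_positive": [True, False, True, False, False, True],
})

# Map the booleans to readable labels, then cross-tabulate with marginal totals.
table = pd.crosstab(
    toy["ecds_positive"].map({True: "ECDS-positive", False: "ECDS-negative"}),
    toy["sus_positive"].map({True: "SUS-positive", False: "SUS-negative"}),
    margins=True,
    margins_name="Total",
)

# crosstab sorts labels alphabetically, so reorder to put positives first.
table = table.loc[
    ["ECDS-positive", "ECDS-negative", "Total"],
    ["SUS-positive", "SUS-negative", "Total"],
]
print(table)
```

Because every row is counted exactly once, the grand total always equals the number of patients, which gives a quick check that the positive and negative groups are complementary.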


In [4]:
# Sensitivity: percentage of SUS-identified admissions that are also identified by ECDS
# Specificity: percentage of patients without a SUS admission who are also not identified by ECDS

sensitivity = (sus_pos_ecds_pos/(sus_pos_ecds_pos + sus_pos_ecds_neg))*100
print(f"Sensitivity: {sensitivity:.2f}%")

specificity = (sus_neg_ecds_neg/(sus_neg_ecds_pos + sus_neg_ecds_neg))*100
print(f"Specificity : {specificity:.2f}%")


Sensitivity: 39.84%
Specificity : 59.93%
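
As a hand check, both measures can be recomputed directly from the four cell counts of the ECDS vs SUS table, treating SUS as the reference standard. This is a small sketch; the function `sensitivity_specificity` and its argument names are illustrative and not part of the original notebook.

```python
def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) as percentages.

    tp: SUS-positive and ECDS-positive    fn: SUS-positive and ECDS-negative
    fp: SUS-negative and ECDS-positive    tn: SUS-negative and ECDS-negative
    """
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    return sensitivity, specificity


# Cell counts copied from the ECDS vs SUS table.
sens, spec = sensitivity_specificity(tp=1172, fn=1770, fp=2738, tn=4095)
print(f"Sensitivity: {sens:.2f}%")  # 39.84%
print(f"Specificity: {spec:.2f}%")  # 59.93%
```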


## Variable Breakdown 

### AE Attendance


In [5]:
# Split on A&E attendance alone
positive_ae_covid_ecds = df[df['ae_attendance'] == 1]
negative_ae_covid_ecds = df[df['ae_attendance'] == 0]

ecds_patients_positive = set(positive_ae_covid_ecds['patient_id'])
ecds_patients_negative = set(negative_ae_covid_ecds['patient_id'])

sus_pos_ecds_pos = len(sus_patients_positive & ecds_patients_positive)
sus_pos_ecds_neg = len(sus_patients_positive & ecds_patients_negative)
sus_neg_ecds_pos = len(sus_patients_negative & ecds_patients_positive)
sus_neg_ecds_neg = len(sus_patients_negative & ecds_patients_negative)

pd.DataFrame(
    [
        [sus_pos_ecds_pos, sus_neg_ecds_pos, sus_pos_ecds_pos + sus_neg_ecds_pos],
        [sus_pos_ecds_neg, sus_neg_ecds_neg, sus_pos_ecds_neg + sus_neg_ecds_neg],
        [sus_pos_ecds_pos + sus_pos_ecds_neg, sus_neg_ecds_pos + sus_neg_ecds_neg,
         sus_pos_ecds_pos + sus_pos_ecds_neg + sus_neg_ecds_pos + sus_neg_ecds_neg],
    ],
    columns=["SUS-positive", "SUS-negative", "Total"],
    index=["AE attendance +", "AE attendance -", "Total"],
)


| | SUS-positive | SUS-negative | Total |
|---|---|---|---|
| AE attendance + | 1191 | 2809 | 4000 |
| AE attendance - | 1809 | 4191 | 6000 |
| Total | 3000 | 7000 | 10000 |


### AE Attendance + AE Covid Status

In [6]:
# Among A&E attenders, split on the COVID-19 status recorded at attendance
positive_ae_covid_ecds = df[(df['ae_attendance'] == 1) & (df['ae_attendance_covid_status'] == 1)]
negative_ae_covid_ecds = df[(df['ae_attendance'] == 1) & (df['ae_attendance_covid_status'] == 0)]

ecds_patients_positive = set(positive_ae_covid_ecds['patient_id'])
ecds_patients_negative = set(negative_ae_covid_ecds['patient_id'])

sus_pos_ecds_pos = len(sus_patients_positive & ecds_patients_positive)
sus_pos_ecds_neg = len(sus_patients_positive & ecds_patients_negative)
sus_neg_ecds_pos = len(sus_patients_negative & ecds_patients_positive)
sus_neg_ecds_neg = len(sus_patients_negative & ecds_patients_negative)

pd.DataFrame(
    [
        [sus_pos_ecds_pos, sus_neg_ecds_pos, sus_pos_ecds_pos + sus_neg_ecds_pos],
        [sus_pos_ecds_neg, sus_neg_ecds_neg, sus_pos_ecds_neg + sus_neg_ecds_neg],
        [sus_pos_ecds_pos + sus_pos_ecds_neg, sus_neg_ecds_pos + sus_neg_ecds_neg,
         sus_pos_ecds_pos + sus_pos_ecds_neg + sus_neg_ecds_pos + sus_neg_ecds_neg],
    ],
    columns=["SUS-positive", "SUS-negative", "Total"],
    index=["AE Covid +", "AE Covid -", "Total"],
)



| | SUS-positive | SUS-negative | Total |
|---|---|---|---|
| AE Covid + | 1078 | 2514 | 3592 |
| AE Covid - | 113 | 295 | 408 |
| Total | 1191 | 2809 | 4000 |


In [7]:
sensitivity = (sus_pos_ecds_pos/(sus_pos_ecds_pos + sus_pos_ecds_neg))*100
print(f"Sensitivity: {sensitivity:.2f}%")

specificity = (sus_neg_ecds_neg/(sus_neg_ecds_pos + sus_neg_ecds_neg))*100
print(f"Specificity : {specificity:.2f}%")


### AE Attendance + Recent Positive Covid Test

In [8]:
# Among A&E attenders, split on a positive COVID-19 test before attendance
positive_cov_test_ecds = df[(df['ae_attendance'] == 1) & (df['positive_covid_test_before_ae_attendance'] == 1)]
negative_cov_test_ecds = df[(df['ae_attendance'] == 1) & (df['positive_covid_test_before_ae_attendance'] == 0)]

ecds_patients_positive = set(positive_cov_test_ecds['patient_id'])
ecds_patients_negative = set(negative_cov_test_ecds['patient_id'])

sus_pos_ecds_pos = len(sus_patients_positive & ecds_patients_positive)
sus_pos_ecds_neg = len(sus_patients_positive & ecds_patients_negative)
sus_neg_ecds_pos = len(sus_patients_negative & ecds_patients_positive)
sus_neg_ecds_neg = len(sus_patients_negative & ecds_patients_negative)

pd.DataFrame(
    [
        [sus_pos_ecds_pos, sus_neg_ecds_pos, sus_pos_ecds_pos + sus_neg_ecds_pos],
        [sus_pos_ecds_neg, sus_neg_ecds_neg, sus_pos_ecds_neg + sus_neg_ecds_neg],
        [sus_pos_ecds_pos + sus_pos_ecds_neg, sus_neg_ecds_pos + sus_neg_ecds_neg,
         sus_pos_ecds_pos + sus_pos_ecds_neg + sus_neg_ecds_pos + sus_neg_ecds_neg],
    ],
    columns=["SUS-positive", "SUS-negative", "Total"],
    index=["Covid Test +", "Covid Test -", "Total"],
)


| | SUS-positive | SUS-negative | Total |
|---|---|---|---|
| Covid Test + | 609 | 1418 | 2027 |
| Covid Test - | 582 | 1391 | 1973 |
| Total | 1191 | 2809 | 4000 |


In [9]:
sensitivity = (sus_pos_ecds_pos/(sus_pos_ecds_pos + sus_pos_ecds_neg))*100
print(f"Sensitivity: {sensitivity:.2f}%")

specificity = (sus_neg_ecds_neg/(sus_neg_ecds_pos + sus_neg_ecds_neg))*100
print(f"Specificity : {specificity:.2f}%")


### AE Attendance + Covid Positive Primary Care

In [10]:
# Among A&E attenders, split on a primary care COVID-19 record before attendance
positive_primary_care_ecds = df[(df['ae_attendance'] == 1) & (df['covid_primary_care_before_ae_attendance'] == 1)]
negative_primary_care_ecds = df[(df['ae_attendance'] == 1) & (df['covid_primary_care_before_ae_attendance'] == 0)]

ecds_patients_positive = set(positive_primary_care_ecds['patient_id'])
ecds_patients_negative = set(negative_primary_care_ecds['patient_id'])

sus_pos_ecds_pos = len(sus_patients_positive & ecds_patients_positive)
sus_pos_ecds_neg = len(sus_patients_positive & ecds_patients_negative)
sus_neg_ecds_pos = len(sus_patients_negative & ecds_patients_positive)
sus_neg_ecds_neg = len(sus_patients_negative & ecds_patients_negative)

pd.DataFrame(
    [
        [sus_pos_ecds_pos, sus_neg_ecds_pos, sus_pos_ecds_pos + sus_neg_ecds_pos],
        [sus_pos_ecds_neg, sus_neg_ecds_neg, sus_pos_ecds_neg + sus_neg_ecds_neg],
        [sus_pos_ecds_pos + sus_pos_ecds_neg, sus_neg_ecds_pos + sus_neg_ecds_neg,
         sus_pos_ecds_pos + sus_pos_ecds_neg + sus_neg_ecds_pos + sus_neg_ecds_neg],
    ],
    columns=["SUS-positive", "SUS-negative", "Total"],
    index=["Primary Care Covid +", "Primary Care Covid -", "Total"],
)


| | SUS-positive | SUS-negative | Total |
|---|---|---|---|
| Primary Care Covid + | 610 | 1399 | 2009 |
| Primary Care Covid - | 581 | 1410 | 1991 |
| Total | 1191 | 2809 | 4000 |


In [11]:
sensitivity = (sus_pos_ecds_pos/(sus_pos_ecds_pos + sus_pos_ecds_neg))*100
print(f"Sensitivity: {sensitivity:.2f}%")

specificity = (sus_neg_ecds_neg/(sus_neg_ecds_pos + sus_neg_ecds_neg))*100
print(f"Specificity : {specificity:.2f}%")
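
The breakdown cells in this section repeat the same set intersections and table construction for each marker. One possible way to factor that out is sketched below. The helper `two_by_two` and its parameters are hypothetical names, not part of the original notebook, and the patient ids in the usage example are invented.

```python
import pandas as pd


def two_by_two(sus_pos: set, sus_neg: set, marker_pos: set, marker_neg: set,
               labels: tuple[str, str]) -> pd.DataFrame:
    """Cross-tabulate SUS status against a candidate marker, with totals."""
    pos_pos = len(sus_pos & marker_pos)  # SUS-positive, marker-positive
    neg_pos = len(sus_neg & marker_pos)  # SUS-negative, marker-positive
    pos_neg = len(sus_pos & marker_neg)  # SUS-positive, marker-negative
    neg_neg = len(sus_neg & marker_neg)  # SUS-negative, marker-negative
    return pd.DataFrame(
        [
            [pos_pos, neg_pos, pos_pos + neg_pos],
            [pos_neg, neg_neg, pos_neg + neg_neg],
            [pos_pos + pos_neg, neg_pos + neg_neg,
             pos_pos + neg_pos + pos_neg + neg_neg],
        ],
        columns=["SUS-positive", "SUS-negative", "Total"],
        index=[labels[0], labels[1], "Total"],
    )


# Usage with invented patient ids.
table = two_by_two(
    sus_pos={1, 2, 3},
    sus_neg={4, 5, 6, 7},
    marker_pos={1, 2, 4},
    marker_neg={3, 5, 6, 7},
    labels=("Covid Test +", "Covid Test -"),
)
print(table)
```

Each breakdown would then reduce to building the two id sets and calling the helper with the appropriate row labels.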
