## Predictive Models

In this notebook, we try several methods to predict A&E demand.

In [1]:
import pandas as pd
import geopandas as gpd

In [2]:
url1 = 'https://www.opendata.nhs.scot/dataset/0d57311a-db66-4eaa-bd6d-cc622b6cbdfa/resource/a5f7ca94-c810-41b5-a7c9-25c18d43e5a4/download/weekly_ae_activity_20240714.csv'
df_week_AE = pd.read_csv(url1)

url2 = 'https://www.opendata.nhs.scot/dataset/997acaa5-afe0-49d9-b333-dcf84584603d/resource/022c3b27-6a58-48dc-8038-8f1f93bb0e78/download/opendata_monthly_ae_when_202405.csv'
df_month_AE = pd.read_csv(url2)

url3 = 'https://www.opendata.nhs.scot/dataset/997acaa5-afe0-49d9-b333-dcf84584603d/resource/c4622324-f59c-4011-a67b-83b59c59ca94/download/opendata_monthly_ae_discharge_202405.csv'
df_discharge = pd.read_csv(url3)

url4 = 'https://www.opendata.nhs.scot/dataset/997acaa5-afe0-49d9-b333-dcf84584603d/resource/37ba17b1-c323-492c-87d5-e986aae9ab59/download/monthly_ae_activity_202405.csv'
df_month_attendances = pd.read_csv(url4)

url_hospital = 'https://www.opendata.nhs.scot/dataset/cbd1802e-0e04-4282-88eb-d7bdcfb120f0/resource/c698f450-eeed-41a0-88f7-c1e40a568acc/download/hospitals.csv'
df_hospital = pd.read_csv(url_hospital)
# df_hospital: a list of all NHS hospitals across Scotland with associated geographic information. Note that it contains all hospitals in Scotland, not only acute hospitals.

url_demographics = 'https://www.opendata.nhs.scot/dataset/997acaa5-afe0-49d9-b333-dcf84584603d/resource/6abbf8e4-e4e0-4a56-a7b9-f7c7b4171ff3/download/opendata_monthly_ae_demographics_202405.csv'
df_demographics = pd.read_csv(url_demographics)

url_multiple_attendance = 'https://www.opendata.nhs.scot/dataset/997acaa5-afe0-49d9-b333-dcf84584603d/resource/0ca3b959-b758-4532-bb55-aa86da28679e/download/opendata_monthly_ae_multiple_attendances_202405.csv'
df_multi_attendance = pd.read_csv(url_multiple_attendance)
# df_multi_attendance: multiple-attendance statistics on new and unplanned return attendances at Accident and Emergency (A&E) services across Scotland for the latest 12-month period.

In [3]:
shapefile_path = "SG_NHS_HealthBoards_2019"

# Read the shapefile into a GeoDataFrame
gdf = gpd.read_file(shapefile_path)

# Information about locations: health board codes and names
location_df = gdf[["HBCode", "HBName"]]

In [4]:
df_week_AE.head()

Unnamed: 0,WeekEndingDate,Country,HBT,TreatmentLocation,DepartmentType,NumberOfAttendancesEpisode,NumberWithin4HoursEpisode,NumberOver4HoursEpisode,PercentageWithin4HoursEpisode,NumberOver8HoursEpisode,PercentageOver8HoursEpisode,NumberOver12HoursEpisode,PercentageOver12HoursEpisode
0,20150222,S92000003,S08000015,A210H,Emergency Department,814,624,190,76.7,21,2.6,2,0.2
1,20150222,S92000003,S08000015,A111H,Emergency Department,1347,1115,232,82.8,31,2.3,2,0.1
2,20150222,S92000003,S08000016,B120H,Emergency Department,517,463,54,89.6,1,0.2,0,0.0
3,20150222,S92000003,S08000017,Y146H,Emergency Department,604,578,26,95.7,0,0.0,0,0.0
4,20150222,S92000003,S08000017,Y144H,Emergency Department,196,185,11,94.4,1,0.5,0,0.0


In [5]:
df_month_AE.head()

Unnamed: 0,Month,Country,HBT,TreatmentLocation,DepartmentType,Day,Week,Hour,InOut,NumberOfAttendances
0,201801,S92000003,S08000015,A111H,Emergency Department,Friday,Weekday,00:00 to 00:59,Out of Hours,20
1,201801,S92000003,S08000015,A111H,Emergency Department,Friday,Weekday,01:00 to 01:59,Out of Hours,14
2,201801,S92000003,S08000015,A111H,Emergency Department,Friday,Weekday,02:00 to 02:59,Out of Hours,6
3,201801,S92000003,S08000015,A111H,Emergency Department,Friday,Weekday,03:00 to 03:59,Out of Hours,5
4,201801,S92000003,S08000015,A111H,Emergency Department,Friday,Weekday,04:00 to 04:59,Out of Hours,5


In [6]:
df_month_attendances.head()

Unnamed: 0,Month,Country,HBT,TreatmentLocation,DepartmentType,NumberOfAttendancesAll,NumberWithin4HoursAll,NumberOver4HoursAll,PercentageWithin4HoursAll,NumberOfAttendancesEpisode,...,PercentageWithin4HoursEpisode,PercentageWithin4HoursEpisodeQF,NumberOver8HoursEpisode,NumberOver8HoursEpisodeQF,PercentageOver8HoursEpisode,PercentageOver8HoursEpisodeQF,NumberOver12HoursEpisode,NumberOver12HoursEpisodeQF,PercentageOver12HoursEpisode,PercentageOver12HoursEpisodeQF
0,200707,S92000003,S08000015,A101H,Minor Injury Unit or Other,252,252,0,100.0,,...,,z,,z,,z,,z,,z
1,200707,S92000003,S08000015,A111H,Emergency Department,5414,5290,124,97.7,5414.0,...,97.7,,26.0,,0.5,,24.0,,0.4,
2,200707,S92000003,S08000015,A207H,Minor Injury Unit or Other,92,92,0,100.0,,...,,z,,z,,z,,z,,z
3,200707,S92000003,S08000015,A210H,Emergency Department,3530,3355,175,95.0,3530.0,...,95.0,,3.0,,0.1,,1.0,,0.0,
4,200707,S92000003,S08000016,B103H,Minor Injury Unit or Other,20,20,0,100.0,,...,,z,,z,,z,,z,,z


In [7]:
df_discharge.head()

Unnamed: 0,Month,Country,HBT,TreatmentLocation,DepartmentType,Age,AgeQF,Discharge,DischargeQF,NumberOfAttendances
0,201801,S92000003,S08000015,A111H,Emergency Department,18-24,,Admission to same Hospital,,85
1,201801,S92000003,S08000015,A111H,Emergency Department,18-24,,Discharged Home or to usual Place of Residence,,386
2,201801,S92000003,S08000015,A111H,Emergency Department,18-24,,Transferred to Other Hospital/Service,,5
3,201801,S92000003,S08000015,A111H,Emergency Department,18-24,,,:,18
4,201801,S92000003,S08000015,A111H,Emergency Department,25-39,,Admission to same Hospital,,206


In [8]:
df_hospital.head()

Unnamed: 0,HospitalCode,HospitalName,AddressLine1,AddressLine2,AddressLine2QF,AddressLine3,AddressLine3QF,AddressLine4,AddressLine4QF,Postcode,HealthBoard,HSCP,CouncilArea,IntermediateZone,DataZone
0,A101H,Arran War Memorial Hospital,Lamlash,Isle of Arran,,,z,,z,KA278LF,S08000015,S37000020,S12000021,S02002097,S01011176
1,A103H,Ayrshire Central Hospital,Kilwinning Road,Irvine,,,z,,z,KA128SS,S08000015,S37000020,S12000021,S02002105,S01011213
2,A105H,Kirklandside Hospital,Kirklandside,Kilmarnock,,Ayrshire,,,z,KA1 5LH,S08000015,S37000008,S12000008,S02001492,S01007961
3,A110H,Lady Margaret Hospital,College St,Millport,,Isle of Cumbrae,,,z,KA280HF,S08000015,S37000020,S12000021,S02002128,S01011328
4,A111H,University Hospital Crosshouse,Kilmarnock Road,Kilmarnock,,Ayrshire,,,z,KA2 0BE,S08000015,S37000008,S12000008,S02001504,S01008027


In [9]:
df_demographics.head()

Unnamed: 0,Month,Country,HBT,DepartmentType,Age,AgeQF,Sex,SexQF,Deprivation,DeprivationQF,NumberOfAttendances
0,201801,S92000003,S08000015,Emergency Department,18-24,,Female,,1.0,,158
1,201801,S92000003,S08000015,Emergency Department,18-24,,Female,,2.0,,112
2,201801,S92000003,S08000015,Emergency Department,18-24,,Female,,3.0,,50
3,201801,S92000003,S08000015,Emergency Department,18-24,,Female,,4.0,,39
4,201801,S92000003,S08000015,Emergency Department,18-24,,Female,,5.0,,27


In [10]:
df_multi_attendance.head()

Unnamed: 0,YearEnd,Country,HBT,DepartmentType,OneAttendance,TwoAttendances,ThreeAttendances,FourAttendances,FivePlusAttendances
0,202405,S92000003,S08000015,Emergency Department,50073,11022,3062,1060,986
1,202405,S92000003,S08000016,Emergency Department,14984,3573,1145,426,465
2,202405,S92000003,S08000017,Emergency Department,22579,5419,1804,681,782
3,202405,S92000003,S08000019,Emergency Department,33852,6435,1766,648,652
4,202405,S92000003,S08000019,Minor Injury Unit or Other,9782,639,68,17,2


**Objective:** Build a model that predicts the discharge outcome (Admission to same Hospital, Discharged Home or to usual Place of Residence, or Transferred to Other Hospital/Service) from the hour and day of arrival.
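
Before fitting any model, a simple reference point is a majority-outcome baseline: for each arrival slot, predict the outcome that occurred most often. The sketch below is only an illustration under stated assumptions. It assumes a table that carries `Day`, `Hour`, and `Discharge` together, which is a hypothetical layout (in the previews above these columns appear in different frames), and the frame `toy` and its counts are invented.

```python
import pandas as pd

# Hypothetical toy rows: the counts are invented for illustration only.
# Assumption: one table holds Day, Hour and Discharge together.
toy = pd.DataFrame({
    "Day": ["Friday"] * 4 + ["Sunday"] * 2,
    "Hour": ["00:00 to 00:59"] * 2 + ["12:00 to 12:59"] * 2 + ["00:00 to 00:59"] * 2,
    "Discharge": [
        "Admission to same Hospital",
        "Discharged Home or to usual Place of Residence",
    ] * 3,
    "NumberOfAttendances": [12, 8, 5, 30, 4, 9],
})

# Total attendances for each (Day, Hour, Discharge) combination.
counts = toy.groupby(["Day", "Hour", "Discharge"], as_index=False)["NumberOfAttendances"].sum()

# Keep the most frequent outcome for each (Day, Hour) arrival slot.
baseline = (
    counts.sort_values("NumberOfAttendances", ascending=False)
    .drop_duplicates(["Day", "Hour"])
    .set_index(["Day", "Hour"])["Discharge"]
    .sort_index()
)
print(baseline)
```

Any model tried later would need to beat this lookup on held-out months to show that it adds value.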

The motivation is that, as the exploratory analysis showed, demand is not equally distributed across the health boards, so it would be useful to predict the outcome of an emergency visit. For example, a board that is more likely to transfer patients may lack resources. Similarly, a board that admits most patients may have sufficient capacity to deal with emergency patients, and a board where most attendances end in discharge to the patient's residence may be handling its resources more efficiently.
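
The comparison between boards can be made concrete by computing, for each health board, the share of attendances ending in each discharge outcome. The following is a minimal sketch, not the notebook's own analysis: it uses a small invented frame `toy` with the same column names as `df_discharge` (`HBT`, `Discharge`, `NumberOfAttendances`), and the counts are made up.

```python
import pandas as pd

# Hypothetical toy rows shaped like df_discharge; the counts are invented.
toy = pd.DataFrame({
    "HBT": ["S08000015"] * 3 + ["S08000016"] * 3,
    "Discharge": [
        "Admission to same Hospital",
        "Discharged Home or to usual Place of Residence",
        "Transferred to Other Hospital/Service",
    ] * 2,
    "NumberOfAttendances": [30, 60, 10, 20, 70, 10],
})

# Attendances per board (rows) and discharge outcome (columns).
outcome_counts = toy.pivot_table(
    index="HBT",
    columns="Discharge",
    values="NumberOfAttendances",
    aggfunc="sum",
    fill_value=0,
)

# Divide each row by its total so every board's shares sum to 1.
outcome_share = outcome_counts.div(outcome_counts.sum(axis=1), axis=0)
print(outcome_share.round(2))
```

On the real data, rows with a missing `Discharge` value (such as the fourth row of the preview above) would need to be handled explicitly, since `pivot_table` drops missing column keys by default.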

The factors to be considered include 