## PRMT-2333 Hypothesis: The practices responsible for EMIS-EMIS "sender not LM compliant" errors are the same in August as in July

Hypothesis

We believe that the practices that cause EMIS-EMIS error code 23s in July will be the same (or nearly the same) as the practices causing these errors in August.

We will know this to be true when we have compared the two data sets and found that the lists of practices largely overlap.

Scope

Generate a list of practices that caused EMIS-EMIS error 23s in July, and a separate list for August.

Compare the two lists to identify whether the same practices appear in both.
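
As a rough way to put a number on "nearly the same", the overlap between the two lists can be summarised with set arithmetic. This is a minimal sketch, not part of the original analysis: the function name `practice_overlap` and the example practice names are hypothetical, and the Jaccard index is just one possible overlap measure.

```python
def practice_overlap(july_practices, august_practices):
    """Summarise how much two collections of practice names overlap."""
    july, august = set(july_practices), set(august_practices)
    both = july & august
    either = july | august
    return {
        "in_both": len(both),
        # Share of each month's practices that also appear in the other month
        "share_of_july": len(both) / len(july) if july else 0.0,
        "share_of_august": len(both) / len(august) if august else 0.0,
        # Jaccard index: size of intersection divided by size of union
        "jaccard": len(both) / len(either) if either else 0.0,
    }


# Hypothetical example lists, not taken from the transfer data
july_example = ["PRACTICE A", "PRACTICE B", "PRACTICE C"]
august_example = ["PRACTICE B", "PRACTICE C", "PRACTICE D", "PRACTICE E"]

overlap_summary = practice_overlap(july_example, august_example)
print(overlap_summary)
```

In the notebook, the inputs would be the `sending_practice_name` columns of the July and August grouped tables.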



In [1]:
import pandas as pd 
import numpy as np
from datetime import datetime

In [2]:
transfer_files = [
    "s3://prm-gp2gp-transfer-data-preprod/v4/2021/7/transfers.parquet",
    "s3://prm-gp2gp-notebook-data-prod/PRMT-2324-2-weeks-august-data/transfers/v4/2021/8/transfers.parquet"
]

transfers_raw = pd.concat((
    pd.read_parquet(f)
    for f in transfer_files
))

transfers = transfers_raw.copy()

In [3]:
asid_lookup_file_location = "s3://prm-gp2gp-asid-lookup-preprod/"
asid_lookup_files = [
    "2021/7/asidLookup.csv.gz",
    "2021/8/asidLookup.csv.gz"    
]
asid_lookup_input_files = [asid_lookup_file_location + f for f in asid_lookup_files]
asid_lookup = pd.concat((
    pd.read_csv(f)
    for f in asid_lookup_input_files
)).drop_duplicates()
lookup = asid_lookup[["ASID", "NACS","OrgName"]]

transfers = transfers.merge(lookup, left_on='requesting_practice_asid',right_on='ASID',how='left')
transfers = transfers.rename({'ASID': 'requesting_supplier_asid', 'NACS': 'requesting_ods_code','OrgName':'requesting_practice_name'}, axis=1)
transfers = transfers.merge(lookup, left_on='sending_practice_asid',right_on='ASID',how='left')
transfers = transfers.rename({'ASID': 'sending_supplier_asid', 'NACS': 'sending_ods_code','OrgName':'sending_practice_name'}, axis=1)

### EMIS - EMIS: Sender not Large Message compliant (error 23)

In [4]:
emis_sender_bool = transfers["sending_supplier"]=="EMIS"
emis_requester_bool = transfers["requesting_supplier"]=="EMIS"
sender_error_23_bool = transfers["sender_error_codes"].apply(lambda error_codes: 23 in error_codes)
emis_transfers_with_error_23 = transfers[emis_sender_bool & emis_requester_bool & sender_error_23_bool].copy()

grouped_emis_transfers_with_error_23 = emis_transfers_with_error_23.groupby(by='sending_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()

In [5]:
grouped_emis_transfers_with_error_23

Unnamed: 0,sending_practice_name,conversation_id
0,ST CLEMENTS PARTNERSHIP,4
1,ARCHWAY MEDICAL CENTRE,4
2,UNIVERSITY HEALTH SERVICE,3
3,KESTON MEDICAL PRACTICE,3
4,GODIVA GROUP PRACTICE,2
...,...,...
397,HIGHPARKS MEDICAL PRACTICE,1
398,HIGHERLAND SURGERY,1
399,HIGH STREET SURGERY,1
400,HETHERINGTON AT THE PAVILION,1
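
Practice names are not guaranteed to be unique (several practices can share a name such as "HIGH STREET SURGERY"), so grouping by name alone may merge distinct practices. A possible alternative, sketched below on hypothetical rows, is to group by the ODS code that the ASID lookup merge already adds (`sending_ods_code`) alongside the name. The ODS codes and the `transfer_count` column name here are made up for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like emis_transfers_with_error_23
example_transfers = pd.DataFrame({
    "conversation_id": ["c1", "c2", "c3", "c4"],
    "sending_ods_code": ["A00001", "A00001", "B00002", "C00003"],
    "sending_practice_name": [
        "HIGH STREET SURGERY",
        "HIGH STREET SURGERY",
        "HIGH STREET SURGERY",  # same name, different ODS code
        "EXAMPLE PRACTICE",
    ],
})

# Group on the ODS code as well as the name so same-named practices stay separate
counts_by_practice = (
    example_transfers
    .groupby(["sending_ods_code", "sending_practice_name"])
    .agg(transfer_count=("conversation_id", "count"))
    .sort_values(by="transfer_count", ascending=False)
    .reset_index()
)
print(counts_by_practice)
```

Grouping the same rows by name only would report two practices instead of three.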


In [6]:
july_bool = emis_transfers_with_error_23["date_requested"] < datetime(2021, 8, 1)
july_emis_transfers_with_error_23 = emis_transfers_with_error_23[july_bool]

grouped_july_emis_transfers_with_error_23=july_emis_transfers_with_error_23.groupby(by='sending_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()
grouped_july_emis_transfers_with_error_23

Unnamed: 0,sending_practice_name,conversation_id
0,ST CLEMENTS PARTNERSHIP,3
1,ARCHWAY MEDICAL CENTRE,3
2,CLARENCE MEDICAL CENTRE,2
3,UNIVERSITY MEDICAL GROUP,2
4,CENTRAL GATESHEAD MEDICAL GROUP,2
...,...,...
233,HIGHERLAND SURGERY,1
234,HOLLIES MEDICAL CENTRE,1
235,HOLLYMOOR MEDICAL CENTRE,1
236,HORIZON HEALTH CENTRE,1


In [7]:
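# Use the same boundary as the July filter so transfers requested on 31 July are not counted in both months
august_bool = emis_transfers_with_error_23["date_requested"] >= datetime(2021, 8, 1)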
august_emis_transfers_with_error_23 = emis_transfers_with_error_23[august_bool]

grouped_august_emis_transfers_with_error_23=august_emis_transfers_with_error_23.groupby(by='sending_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()
grouped_august_emis_transfers_with_error_23

Unnamed: 0,sending_practice_name,conversation_id
0,WOOSEHILL PRACTICE,2
1,BALMORE PARK SURGERY,2
2,ELIZABETH STREET SURGERY,2
3,KESTON MEDICAL PRACTICE,2
4,THE ROYTON & CROMPTON FAMILY PRACTICE,2
...,...,...
180,KNOLL MEDICAL PRACTICE,1
181,LANE ENDS SURGERY,1
182,LANGLEY HEALTH CENTRE,1
183,LAUNCESTON CLOSE SURGERY,1
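
When splitting transfers by month, it helps to use one shared boundary so every transfer falls into exactly one month. `datetime(2021, 7, 31)` is midnight at the start of 31 July, so a strict `>` comparison against it still admits transfers requested later that day. The sketch below uses hypothetical timestamps to show the difference.

```python
from datetime import datetime

import pandas as pd

# Hypothetical transfers; the second one is requested during 31 July
example = pd.DataFrame({
    "conversation_id": ["c1", "c2", "c3"],
    "date_requested": pd.to_datetime(
        ["2021-07-15 09:00", "2021-07-31 14:30", "2021-08-02 10:00"]
    ),
})

august_start = datetime(2021, 8, 1)

# Half-open windows: July is everything before the boundary, August everything from it
is_july = example["date_requested"] < august_start
is_august = example["date_requested"] >= august_start

# For comparison: a strict lower bound of 31 July also matches the 31 July transfer
is_after_31_july_midnight = example["date_requested"] > datetime(2021, 7, 31)

in_both_months = int((is_july & is_august).sum())
print(int(is_july.sum()), int(is_august.sum()), in_both_months)
print(int(is_after_31_july_midnight.sum()))
```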


#### Practices that had error code 23 in both July and August as senders

In [8]:
practices_with_error_23_july_and_august = grouped_july_emis_transfers_with_error_23.merge(grouped_august_emis_transfers_with_error_23, how='inner', on='sending_practice_name')
practices_with_error_23_july_and_august.rename({'conversation_id_x': 'No of transfers in July', 'conversation_id_y': 'No of transfers in August'}, axis=1)

Unnamed: 0,sending_practice_name,No of transfers in July,No of transfers in August
0,ST CLEMENTS PARTNERSHIP,3,1
1,ARCHWAY MEDICAL CENTRE,3,1
2,UNIVERSITY HEALTH SERVICE,2,1
3,SHEPPEY NHS HEALTHCARE CENTRE,1,1
4,SOUTHERN GROUP PRACTICE,1,1
5,ST PAUL'S SURGERY,1,1
6,STONEFIELD STREET SURGERY,1,1
7,THE SURGERY KINGSTONE,1,1
8,WHITSTABLE MEDICAL PRACTICE,1,1
9,MILLGATE HEALTHCARE PARTNERSHIP,1,1
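
The inner merge only shows practices present in both months. To see how many practices appear in one month only, an outer merge with `indicator=True` can be used instead. This is a minimal sketch on hypothetical counts; the frames `july_counts` and `august_counts` stand in for the grouped July and August tables.

```python
import pandas as pd

# Hypothetical stand-ins for the grouped July and August tables
july_counts = pd.DataFrame({
    "sending_practice_name": ["PRACTICE A", "PRACTICE B", "PRACTICE C"],
    "conversation_id": [3, 1, 1],
})
august_counts = pd.DataFrame({
    "sending_practice_name": ["PRACTICE B", "PRACTICE D"],
    "conversation_id": [2, 1],
})

# indicator=True adds a "_merge" column: left_only (July), right_only (August) or both
month_comparison = july_counts.merge(
    august_counts,
    how="outer",
    on="sending_practice_name",
    suffixes=("_july", "_august"),
    indicator=True,
)

membership_counts = month_comparison["_merge"].value_counts()
print(membership_counts)
```

Comparing the `both` count with the `left_only` and `right_only` counts gives a direct view of how far the two months' lists agree.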


### EMIS - EMIS: Requester not Large Message compliant (error 14)

Checking whether the requesting practices are also the same in both months, since both the sender and the requester need to have LM (Large Message) enabled for a large transfer to succeed.

In [9]:
sender_error_14_bool = transfers["sender_error_codes"].apply(lambda error_codes: 14 in error_codes)
emis_transfers_with_error_14 = transfers[emis_sender_bool & emis_requester_bool & sender_error_14_bool].copy()

grouped_emis_transfers_with_error_14 = emis_transfers_with_error_14.groupby(by='requesting_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()

In [10]:
grouped_emis_transfers_with_error_14

Unnamed: 0,requesting_practice_name,conversation_id
0,THE GREYSWOOD PRACTICE,4
1,NEXUS HEALTH GROUP,4
2,AMHERST MEDICAL PRACTICE,3
3,WHITSTABLE MEDICAL PRACTICE,3
4,THE PRACTICE ALBERT ROAD,3
...,...,...
350,GRAHAM ROAD SURGERY,1
351,GOODMAN'S FIELD HEALTH CENTRE,1
352,GLOUCESTER ROAD MEDICAL CENTRE,1
353,GLASTONBURY SURGERY,1


In [11]:
july_bool = emis_transfers_with_error_14["date_requested"] < datetime(2021, 8, 1)
july_emis_transfers_with_error_14 = emis_transfers_with_error_14[july_bool]

grouped_july_emis_transfers_with_error_14=july_emis_transfers_with_error_14.groupby(by='requesting_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()
grouped_july_emis_transfers_with_error_14.rename({'conversation_id': 'No of transfers in July'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers in July
0,NEXUS HEALTH GROUP,3
1,MARTINS OAK SURGERY,2
2,STOKENCHURCH MEDICAL CTRE,2
3,THE GILL MEDICAL PRACTICE,2
4,THE GREYSWOOD PRACTICE,2
...,...,...
172,HARTINGTON SURGERY,1
173,HEATON MOOR MEDICAL GROUP,1
174,HESWALL & PENSBY GROUP PRACTICE,1
175,HIGH GLADES MEDICAL CENTRE,1


In [12]:
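# Use the same boundary as the July filter so transfers requested on 31 July are not counted in both months
august_bool = emis_transfers_with_error_14["date_requested"] >= datetime(2021, 8, 1)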
august_emis_transfers_with_error_14 = emis_transfers_with_error_14[august_bool]

grouped_august_emis_transfers_with_error_14=august_emis_transfers_with_error_14.groupby(by='requesting_practice_name').agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()
grouped_august_emis_transfers_with_error_14.rename({'conversation_id': 'No of transfers in August'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers in August
0,THE PRACTICE ALBERT ROAD,3
1,E HARLING & KENNINGHALL MEDICAL PRACTICE,2
2,WHITSTABLE MEDICAL PRACTICE,2
3,GOSBERTON MEDICAL CENTRE,2
4,THE GREYSWOOD PRACTICE,2
...,...,...
188,GRAHAM ROAD SURGERY,1
189,GREEN LANE MEDICAL CENTRE,1
190,GREENBANK MEDICAL PRACTICE,1
191,GUILDOWNS GROUP PRACTICE,1


#### Practices that had error code 14 in both July and August as requesters

In [13]:
practices_with_error_14_july_and_august = grouped_july_emis_transfers_with_error_14.merge(grouped_august_emis_transfers_with_error_14, how='inner', on='requesting_practice_name')
practices_with_error_14_july_and_august.rename({'conversation_id_x': 'No of transfers in July', 'conversation_id_y': 'No of transfers in August'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers in July,No of transfers in August
0,NEXUS HEALTH GROUP,3,1
1,THE GREYSWOOD PRACTICE,2,2
2,AMHERST MEDICAL PRACTICE,2,1
3,ARCHWAY MEDICAL CENTRE,2,1
4,WHITSTABLE MEDICAL PRACTICE,1,2
5,WITTON STREET SURGERY,1,1
6,MARKET QUARTER MEDICAL PRACTICE,1,1
7,THE FORUM HEALTH CENTRE,1,1
8,THORNTON & VALLEY PARK SURGERY,1,1
9,MANOR ROAD SURGERY,1,1


### Checking if practices appear consistently with LM errors as both senders and requesters

In [14]:
july_recurring_LM_error_practices = grouped_july_emis_transfers_with_error_14.merge(grouped_july_emis_transfers_with_error_23, how='inner', left_on='requesting_practice_name', right_on='sending_practice_name')
july_recurring_LM_error_practices.rename({'conversation_id_x': 'No of transfers with error 14', 'conversation_id_y': 'No of transfers with error 23'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers with error 14,sending_practice_name,No of transfers with error 23
0,MARTINS OAK SURGERY,2,MARTINS OAK SURGERY,1
1,MIDDLEWOOD PARTNERSHIP,2,MIDDLEWOOD PARTNERSHIP,1
2,ARCHWAY MEDICAL CENTRE,2,ARCHWAY MEDICAL CENTRE,3
3,ST MARY'S SURGERY,1,ST MARY'S SURGERY,1
4,ST MARYS ISLAND GROUP PRACTICES,1,ST MARYS ISLAND GROUP PRACTICES,1
5,ROCKY LANE MEDICAL CENTRE,1,ROCKY LANE MEDICAL CENTRE,1
6,WEST HAMPSTEAD MEDICAL CENTRE,1,WEST HAMPSTEAD MEDICAL CENTRE,1
7,WHITSTABLE MEDICAL PRACTICE,1,WHITSTABLE MEDICAL PRACTICE,1
8,MARKET QUARTER MEDICAL PRACTICE,1,MARKET QUARTER MEDICAL PRACTICE,1
9,THE EUXTON MEDICAL CENTRE,1,THE EUXTON MEDICAL CENTRE,1


In [15]:
august_recurring_LM_error_practices = grouped_august_emis_transfers_with_error_14.merge(grouped_august_emis_transfers_with_error_23, how='inner', left_on='requesting_practice_name', right_on='sending_practice_name')
august_recurring_LM_error_practices.rename({'conversation_id_x': 'No of transfers with error 14', 'conversation_id_y': 'No of transfers with error 23'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers with error 14,sending_practice_name,No of transfers with error 23
0,WHITSTABLE MEDICAL PRACTICE,2,WHITSTABLE MEDICAL PRACTICE,1
1,ST CLEMENTS PARTNERSHIP,1,ST CLEMENTS PARTNERSHIP,1
2,SAXONBURY HOUSE SURGERY,1,SAXONBURY HOUSE SURGERY,1
3,NEXUS HEALTH GROUP,1,NEXUS HEALTH GROUP,1
4,RAINBOW MEDICAL CENTRE,1,RAINBOW MEDICAL CENTRE,1
5,VICTORIA MEDICAL CENTRE,1,VICTORIA MEDICAL CENTRE,1
6,THE ROYTON & CROMPTON FAMILY PRACTICE,1,THE ROYTON & CROMPTON FAMILY PRACTICE,2
7,THE CEDARS SURGERY,1,THE CEDARS SURGERY,1
8,THE MILLER PRACTICE,1,THE MILLER PRACTICE,1
9,CHILCOTE PRACTICE,1,CHILCOTE PRACTICE,1


### Checking across the full dataset

In [16]:
grouped_practices_with_error_14 = emis_transfers_with_error_14.groupby("requesting_practice_name").agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()
grouped_practices_with_error_23 = emis_transfers_with_error_23.groupby("sending_practice_name").agg({'conversation_id': 'count'}).sort_values(by='conversation_id', ascending=False).reset_index()

recurring_LM_error_practices = grouped_practices_with_error_14.merge(grouped_practices_with_error_23, how='inner', left_on='requesting_practice_name', right_on='sending_practice_name')
recurring_LM_error_practices.rename({'conversation_id_x': 'No of transfers with error 14', 'conversation_id_y': 'No of transfers with error 23'}, axis=1)

Unnamed: 0,requesting_practice_name,No of transfers with error 14,sending_practice_name,No of transfers with error 23
0,NEXUS HEALTH GROUP,4,NEXUS HEALTH GROUP,1
1,WHITSTABLE MEDICAL PRACTICE,3,WHITSTABLE MEDICAL PRACTICE,2
2,ARCHWAY MEDICAL CENTRE,3,ARCHWAY MEDICAL CENTRE,4
3,MARKET QUARTER MEDICAL PRACTICE,2,MARKET QUARTER MEDICAL PRACTICE,1
4,MARTINS OAK SURGERY,2,MARTINS OAK SURGERY,1
5,MIDDLEWOOD PARTNERSHIP,2,MIDDLEWOOD PARTNERSHIP,2
6,SAXONBURY HOUSE SURGERY,1,SAXONBURY HOUSE SURGERY,1
7,ROCKY LANE MEDICAL CENTRE,1,ROCKY LANE MEDICAL CENTRE,1
8,RINGMEAD MEDICAL PRACTICE,1,RINGMEAD MEDICAL PRACTICE,1
9,RAINBOW MEDICAL CENTRE,1,RAINBOW MEDICAL CENTRE,1


### Checking if practices with a "not LM compliant" error had any LM transfers

In [17]:
gp2gp_messages_files = [
    "s3://prm-gp2gp-raw-spine-data-preprod/v2/messages/2021/7/2021-7_spine_messages.csv.gz",
]

gp2gp_messages_raw = pd.concat((
    pd.read_csv(f, parse_dates=["_time"], dtype={"messageRecipient": str, "messageSender": str})
    for f in gp2gp_messages_files
))

gp2gp_messages = gp2gp_messages_raw.copy()

In [18]:
gp2gp_messages = gp2gp_messages.merge(lookup, left_on='messageRecipient',right_on='ASID',how='left')
gp2gp_messages = gp2gp_messages.rename({'ASID': 'requesting_supplier_asid', 'NACS': 'requesting_ods_code','OrgName':'requesting_practice_name'}, axis=1)
gp2gp_messages = gp2gp_messages.merge(lookup, left_on='messageSender',right_on='ASID',how='left')
gp2gp_messages = gp2gp_messages.rename({'ASID': 'sending_supplier_asid', 'NACS': 'sending_ods_code','OrgName':'sending_practice_name'}, axis=1)

In [19]:
is_copc = gp2gp_messages["interactionID"] == "urn:nhs:names:services:gp2gp/COPC_IN000001UK01"
gp2gp_copc_messages = gp2gp_messages[is_copc]

In [20]:
copc_messages_from_practices_with_error_23 = gp2gp_copc_messages.merge(grouped_july_emis_transfers_with_error_23, how="inner", on="sending_practice_name")
grouped_copc_messages_from_practices_with_error_23 = copc_messages_from_practices_with_error_23.groupby("sending_practice_name").agg({'conversationID': lambda x: x.nunique()}).sort_values(by='conversationID', ascending=False).reset_index()
practices_with_lm_error_and_copc_messages = grouped_copc_messages_from_practices_with_error_23.merge(grouped_july_emis_transfers_with_error_23, how="inner", on="sending_practice_name")
practices_with_lm_error_and_copc_messages = practices_with_lm_error_and_copc_messages.rename({'conversationID': 'No of transfers with at least one COPC message', "conversation_id": "No of transfers with error 23"}, axis=1)
practices_with_lm_error_and_copc_messages

Unnamed: 0,sending_practice_name,No of transfers with at least one COPC message,No of transfers with error 23
0,LEEDS STUDENT MEDICAL PRACTICE,428,1
1,LANCASTER MEDICAL PRACTICE,347,1
2,MEDICUS HEALTH PARTNERS,339,1
3,ARCHWAY MEDICAL CENTRE,331,3
4,HAZELDENE MEDICAL CENTRE,302,1
...,...,...,...
233,TRENTHAM MEDICAL CENTRE,10,1
234,SEAFORTH VILLAGE SURGERY,10,1
235,THE DURU PRACTICE,9,1
236,PARKWOOD FAMILY PRACTICE,5,1


In [21]:
# Checking if there are any practices that had error code 23 and no COPC messages

# Left merge keeps every error 23 practice; rows with no matching COPC message have a missing GUID
practices_with_lm_error_and_no_copc_messages = grouped_july_emis_transfers_with_error_23.merge(gp2gp_copc_messages, how="left", on="sending_practice_name")
practices_with_lm_error_and_no_copc_messages[practices_with_lm_error_and_no_copc_messages["GUID"].isna()]

Unnamed: 0,sending_practice_name,conversation_id,_time,conversationID,GUID,interactionID,messageSender,messageRecipient,messageRef,jdiEvent,toSystem,fromSystem,requesting_supplier_asid,requesting_ods_code,requesting_practice_name,sending_supplier_asid,sending_ods_code
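
The empty table above indicates that every practice with an error code 23 in July also sent at least one COPC message. The same check can be written without building a row per COPC message, by testing membership with `isin`. A minimal sketch on hypothetical data, where `error_practices` and `copc_messages` stand in for the grouped July table and the COPC messages:

```python
import pandas as pd

# Hypothetical stand-ins for the grouped error 23 table and the COPC messages
error_practices = pd.DataFrame({
    "sending_practice_name": ["PRACTICE A", "PRACTICE B", "PRACTICE C"],
    "conversation_id": [3, 1, 1],
})
copc_messages = pd.DataFrame({
    "sending_practice_name": ["PRACTICE A", "PRACTICE A", "PRACTICE C"],
    "conversationID": ["x1", "x2", "x3"],
})

# True for practices that sent at least one COPC message
has_copc = error_practices["sending_practice_name"].isin(
    copc_messages["sending_practice_name"]
)

# Keep only the practices with an error and no COPC messages at all
practices_without_copc = error_practices[~has_copc]
print(practices_without_copc)
```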
