# PRMT-2477 Pre GP2GP failures from MI data

## Context

We have been doing some research with practices to understand issues that occur during registration and prevent GP2GP from happening. There are known scenarios, such as patients coming from abroad, but we want to see whether unknown scenarios also contribute, e.g. issues with PDS or SDS.

Questions to be answered using MI data:

For a given month (October or November):

1. How many registrations failed with any of the following process failure points:

- 10 = PDS trace
- 20 = PDS update
- 30 = SDS lookup Practice (not used)
- 40 = SDS lookup ASID

2. Are there any registrations that have one of these failure points and still eventually go to GP2GP (i.e. have a conversation ID)?

3. Do these process failure points correlate with any of the specific failure types?

- 0 = Attempted,
- 1 = Sent,
- 2 = Not Sent - Patient at current practice,
- 3 = Not Sent - Patient known at current practice transferring from non-GP2GP practice,
- 4 = Not Sent - Patient not known at current practice transferring from a non-GP2GP practice,
- 5 = Not Sent - Patient has no previous practice registered,
- 6 = Negative acknowledgement received.

4. Can we tell which failed registrations could have gone via GP2GP, and which were not eligible for GP2GP, e.g. newborns, or patients coming from Scotland or Wales, the Army, prison, or abroad?

## Notes

Data downloaded from Splunk using the following query:
```
index="gp2gp_nms_prod" sourcetype="gp2gpmi-rr"
| table *
```

In [1]:
import pandas as pd
import numpy as np

In [2]:
def convert_to_int(val):
    # Numeric error codes become ints; non-numeric codes (e.g. "IU030")
    # and missing values are returned unchanged.
    try:
        return int(val)
    except (ValueError, TypeError):
        return val

# Reading directly from S3 requires the s3fs package to be installed.
mi_data_file_location = "s3://prm-gp2gp-notebook-data-prod/PRMT-2477-pre-gp2gp-failures/MI_RR-Nov_2021.csv"

dates_fields = ["RegistrationTime", "RequestFailureTime", "RequestTime", "ExtractTime", "ExtractAckTime", "ExtractAckFailureTime"]
practice_registrations = pd.read_csv(mi_data_file_location, parse_dates=dates_fields)

practice_registrations["RequestErrorCode"] = practice_registrations["RequestErrorCode"].apply(convert_to_int)
practice_registrations = practice_registrations.fillna("None")

practice_registrations = (
    practice_registrations
        .sort_values(by="_time", ascending=True)
        .drop_duplicates(subset=["RegistrationTime", "RegistrationSmartcardUID"], keep="last")
    )
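
To illustrate the deduplication step above, here is a minimal sketch on made-up rows (the values are invented for illustration and are not taken from the MI data). Sorting by `_time` and keeping the last duplicate retains the most recently indexed event for each combination of `RegistrationTime` and `RegistrationSmartcardUID`.

```python
import pandas as pd

# Hypothetical toy rows: two Splunk events for registration A1, one for B2.
toy = pd.DataFrame({
    "_time": ["2021-11-02T10:00:00", "2021-11-01T09:00:00", "2021-11-01T12:00:00"],
    "RegistrationTime": ["2021-11-01 08:55", "2021-11-01 08:55", "2021-11-01 11:50"],
    "RegistrationSmartcardUID": ["A1", "A1", "B2"],
    "RequestFailurePoint": [0, 10, 40],
})

# Same pattern as the cell above: order events by time, keep the latest per registration.
deduplicated = (
    toy
    .sort_values(by="_time", ascending=True)
    .drop_duplicates(subset=["RegistrationTime", "RegistrationSmartcardUID"], keep="last")
)
print(deduplicated)
```

In this toy example the earlier A1 event (failure point 10) is dropped and the later one (failure point 0) is kept, so a registration is counted by its latest reported state only.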

In [3]:
# A registration triggered GP2GP if it has a conversation ID
# (missing values were filled with the string "None" above).
practice_registrations["TriggeredGP2GP"] = practice_registrations["ConversationID"] != "None"

failure_points_of_interest = [10, 20, 30, 40]
is_failure_point_of_interest = practice_registrations["RequestFailurePoint"].isin(failure_points_of_interest)
registrations_with_failure_points_of_interest = practice_registrations[is_failure_point_of_interest]

registrations_grouped_by_failures = (
    registrations_with_failure_points_of_interest
        .groupby(by=["RequestFailurePoint", "RequestFailureType", "RequestErrorCode", "TriggeredGP2GP"])
        .agg({"RegistrationTime": "count"})
        .rename(columns={"RegistrationTime": "count"})
        .sort_values(by="count", ascending=False)
    )
registrations_grouped_by_failures

| RequestFailurePoint | RequestFailureType | RequestErrorCode | TriggeredGP2GP | count |
|---|---|---|---|---|
| 10 | | 20 | False | 22395 |
| 40 | 3.0 | 24 | False | 3142 |
| 40 | 4.0 | 24 | False | 2749 |
| 20 | 0.0 | 20 | False | 1121 |
| 40 | | 20 | False | 509 |
| 20 | 0.0 | IU030 | False | 70 |
| 20 | 0.0 | -8 | True | 38 |
| 20 | 0.0 | IU056 | False | 10 |
| 20 | 0.0 | | False | 5 |
| 20 | 0.0 | -3 | False | 2 |
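
Questions 2 and 3 can also be read from a pivoted view of the same columns. The sketch below is one possible approach, not part of the original analysis: it uses `pd.crosstab` and a grouped sum on a small made-up frame (`toy` is hypothetical). On the real data, `registrations_with_failure_points_of_interest` would take its place.

```python
import pandas as pd

# Hypothetical toy data using the same column names as the notebook.
toy = pd.DataFrame({
    "RequestFailurePoint": [40, 40, 40, 20, 20],
    "RequestFailureType": [3, 4, 4, 0, 0],
    "TriggeredGP2GP": [False, False, False, False, True],
})

# Question 3: count of registrations for each failure point / failure type pair.
point_by_type = pd.crosstab(toy["RequestFailurePoint"], toy["RequestFailureType"])

# Question 2: number of registrations per failure point that still triggered GP2GP
# (summing a boolean column counts the True values).
triggered_by_point = toy.groupby("RequestFailurePoint")["TriggeredGP2GP"].sum()

print(point_by_type)
print(triggered_by_point)
```

A crosstab makes empty combinations visible as zeros, which the grouped table above omits.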


**RequestFailurePoint:**
- 0 = No failure
- 10 = PDS trace
- 20 = PDS update
- 30 = SDS lookup Practice (not used)
- 40 = SDS lookup ASID
- 50 = SDS lookup Contract Props
- 60 = Send Request
- 70 = Manual Request

**RequestFailureType:**
- 0 = Attempted
- 1 = Sent
- 2 = Not Sent - Patient at current practice
- 3 = Not Sent - Patient known at current practice transferring from non-GP2GP practice
- 4 = Not Sent - Patient not known at current practice transferring from a non-GP2GP practice
- 5 = Not Sent - Patient has no previous practice registered
- 6 = Negative acknowledgement received

**RequestErrorCode:**
- 3 = Record available but cannot be sent - DEPRECATED
- 8 = The system’s configuration prevents it from processing this message - DEPRECATED
- 20 = Spine system responded with an error
- 24 = SDS lookup provided zero or more than one result to the query for each interaction
- 25 = Large messages rejected due to timeout duration reached of overall transfer
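
The code lists above can be attached to a table as readable labels. This is a minimal sketch, assuming the codes are stored as integers. The dictionaries restate the lists above, and the `toy` frame and the `*Label` column names are made up for illustration.

```python
import pandas as pd

# Lookup tables restating the RequestFailurePoint and RequestFailureType lists.
REQUEST_FAILURE_POINTS = {
    0: "No failure",
    10: "PDS trace",
    20: "PDS update",
    30: "SDS lookup Practice (not used)",
    40: "SDS lookup ASID",
    50: "SDS lookup Contract Props",
    60: "Send Request",
    70: "Manual Request",
}
REQUEST_FAILURE_TYPES = {
    0: "Attempted",
    1: "Sent",
    2: "Not Sent - Patient at current practice",
    3: "Not Sent - Patient known at current practice transferring from non-GP2GP practice",
    4: "Not Sent - Patient not known at current practice transferring from a non-GP2GP practice",
    5: "Not Sent - Patient has no previous practice registered",
    6: "Negative acknowledgement received",
}

# Hypothetical rows; real data would come from the grouped table above.
toy = pd.DataFrame({"RequestFailurePoint": [10, 40, 20], "RequestFailureType": [0, 3, 0]})

# Series.map looks each code up in the dictionary; unknown codes become NaN.
labelled = toy.assign(
    FailurePointLabel=toy["RequestFailurePoint"].map(REQUEST_FAILURE_POINTS),
    FailureTypeLabel=toy["RequestFailureType"].map(REQUEST_FAILURE_TYPES),
)
print(labelled)
```

In the notebook itself, missing values are replaced with the string "None" and some columns hold mixed types, so the codes may need converting to integers before a mapping like this applies cleanly.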