# PRMT-2414: Look at spread of EMIS spike errors across practices

## Context

We have learnt from the CIS team that the EMIS spikes are down to individual practices not being on Windows 10. We want to do some data analysis to understand the spread of the errors we’re seeing and see if it’s the same practices responsible for the different scenarios. We will then compare this to a list EMIS will provide to us of the practices that aren’t on Windows 10. 

### Scope
Look at the spread of the error scenarios below across practices.

Questions to answer:

List of practices nationally and the occurrences of each error, broken down by month for the last three months

We would like to know if the same practices are responsible for each error code

### Error scenarios
EMIS sending: Core extract not sent - error code 20 sender error

EMIS - EMIS: final error 25

EMIS sending or receiving: COPCs not acknowledged - no error code

EMIS to TPP: final error 31

EMIS sending: Core extract not sent - sender error 19

EMIS sending: Request not acknowledged - no error

EMIS sending: Contains fatal sender error - sender error 14

In [None]:
import pandas as pd 
import numpy as np
import paths
from data.practice_metadata import read_asid_metadata

In [None]:
asid_lookup=read_asid_metadata("prm-gp2gp-ods-metadata-preprod", "v2/2021/9/organisationMetadata.json")

transfer_file_location = "s3://prm-gp2gp-transfer-data-preprod/v5/2021/"

transfer_files = [
    "7/2021-7-transfers.parquet",
    "8/2021-8-transfers.parquet",
    "9/2021-9-transfers.parquet",
]
transfer_input_files = [transfer_file_location + f for f in transfer_files]

transfers_raw = pd.concat((
    pd.read_parquet(f)
    for f in transfer_input_files
))

# Attach practice metadata for both the requesting and the sending practice
transfers = (
    transfers_raw
    .join(asid_lookup.add_prefix("requesting_"), on="requesting_practice_asid", how="left")
    .join(asid_lookup.add_prefix("sending_"), on="sending_practice_asid", how="left")
)

transfers['month']=transfers['date_requested'].dt.to_period('M')

# Supplier name mapping
supplier_renaming = {
    "SystmOne":"TPP",
    None: "Unknown"
}

transfers["sending_supplier"] = transfers["sending_supplier"].replace(supplier_renaming.keys(), supplier_renaming.values())
transfers["requesting_supplier"] = transfers["requesting_supplier"].replace(supplier_renaming.keys(), supplier_renaming.values())
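
The prefixed joins above can be illustrated on toy data. This is only a sketch: it assumes the lookup returned by `read_asid_metadata` is indexed by ASID and has columns such as `supplier` and `practice_name`; the ASIDs and practice names below are hypothetical.

```python
import pandas as pd

# Hypothetical lookup, indexed by ASID (assumed shape of the real asid_lookup)
toy_asid_lookup = pd.DataFrame(
    {"supplier": ["EMIS", "SystmOne"], "practice_name": ["Practice A", "Practice B"]},
    index=pd.Index(["asid-1", "asid-2"], name="asid"),
)

# Hypothetical transfers; "asid-9" is deliberately missing from the lookup
toy_transfers_raw = pd.DataFrame({
    "conversation_id": ["c1", "c2"],
    "requesting_practice_asid": ["asid-2", "asid-9"],
    "sending_practice_asid": ["asid-1", "asid-1"],
})

# add_prefix turns "supplier" into "requesting_supplier" / "sending_supplier",
# so the same lookup can be joined twice without column name clashes
toy_joined = (
    toy_transfers_raw
    .join(toy_asid_lookup.add_prefix("requesting_"), on="requesting_practice_asid", how="left")
    .join(toy_asid_lookup.add_prefix("sending_"), on="sending_practice_asid", how="left")
)
print(toy_joined)
```

Because both joins are left joins, a transfer whose ASID is not in the lookup is kept with missing metadata, which is why an "Unknown" supplier label is useful.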

In [None]:
monthly_transfers_per_sending_practice = transfers.pivot_table(
        index=["sending_practice_name", "sending_practice_ods_code"], 
        columns="month", 
        values="conversation_id", 
        aggfunc="count"
)


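def generate_sender_error_scenario_overview(monthly_transfers_with_error_scenario, error_scenario_name):
    # The right merge keeps only practices with at least one transfer in the error scenario.
    # Month columns present in both tables get the "_x" (all transfers) and "_y" (error scenario) suffixes.
    monthly_error_scenario_overview = monthly_transfers_per_sending_practice.merge(
        monthly_transfers_with_error_scenario,
        how="right",
        on=["sending_practice_name", "sending_practice_ods_code"]
    )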


    monthly_error_scenario_overview = (
        monthly_error_scenario_overview
            .rename(
                {"2021-07_x" : "Total transfers: Jul", 
                 "2021-08_x" : "Total transfers: Aug",
                 "2021-09_x" : "Total transfers: Sept",
                 "2021-07_y" : f"Transfers with {error_scenario_name}: Jul",
                 "2021-08_y" : f"Transfers with {error_scenario_name}: Aug",
                 "2021-09_y" : f"Transfers with {error_scenario_name}: Sept"}, axis=1)
            .reset_index()
    )

    # Percentage of each practice's monthly transfers that fall into the error scenario
    for month in ["Jul", "Aug", "Sept"]:
        monthly_error_scenario_overview[f"% of transfers in {month}"] = (
            (monthly_error_scenario_overview[f"Transfers with {error_scenario_name}: {month}"] / monthly_error_scenario_overview[f"Total transfers: {month}"])
            .multiply(100)
            .round(2)
        )

    return monthly_error_scenario_overview.fillna(0).sort_values(by="% of transfers in Sept", ascending=False)
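
The function above relies on the suffixes that `merge` adds to month columns appearing in both pivot tables. The sketch below shows that behaviour on toy data; the practices, conversation IDs and the `has_error` flag are hypothetical and only stand in for a real error scenario filter.

```python
import pandas as pd

# Toy data: hypothetical practices and transfers, not real GP2GP figures
toy_transfers = pd.DataFrame({
    "sending_practice_name": ["Practice A", "Practice A", "Practice A", "Practice B", "Practice B"],
    "sending_practice_ods_code": ["A1", "A1", "A1", "B2", "B2"],
    "conversation_id": ["c1", "c2", "c3", "c4", "c5"],
    "date_requested": pd.to_datetime(
        ["2021-07-01", "2021-07-15", "2021-08-02", "2021-07-03", "2021-08-09"]
    ),
    "has_error": [True, False, True, False, False],
})
toy_transfers["month"] = toy_transfers["date_requested"].dt.to_period("M")

index_cols = ["sending_practice_name", "sending_practice_ods_code"]

# Transfers per practice per month: all transfers, then only those with the error
toy_totals = toy_transfers.pivot_table(
    index=index_cols, columns="month", values="conversation_id", aggfunc="count"
)
toy_errors = toy_transfers[toy_transfers["has_error"]].pivot_table(
    index=index_cols, columns="month", values="conversation_id", aggfunc="count"
)

# Right merge: only practices present in toy_errors survive
toy_overview = toy_totals.merge(toy_errors, how="right", on=index_cols)
toy_columns = [str(column) for column in toy_overview.columns]
print(toy_columns)
```

One consequence worth keeping in mind: a month only receives a suffix if it appears in both tables, so an error scenario with no transfers at all in one of the three months would leave that month's column unsuffixed and the later renamed columns would be missing.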

### EMIS sending: Core extract not sent - error code 20 sender error

In [None]:
emis_sender_bool = transfers["sending_supplier"] == "EMIS"
core_extract_not_sent_bool = transfers["failure_reason"] == "Core extract not sent"
sender_error_code_20_bool = transfers["sender_error_codes"].apply(lambda error_codes: 20 in error_codes)
emis_transfers_with_error_20 = transfers[emis_sender_bool & core_extract_not_sent_bool & sender_error_code_20_bool].copy()

monthly_emis_transfers_with_error_20 = emis_transfers_with_error_20.pivot_table(
        index=["sending_practice_name", "sending_practice_ods_code"], 
        columns="month", values="conversation_id", 
        aggfunc="count"
)

monthly_error_20_overview = generate_sender_error_scenario_overview(monthly_emis_transfers_with_error_20, "error code 20")
monthly_error_20_overview
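
The error code filter above tests membership in a list-valued column row by row. A minimal sketch of that pattern on toy data follows; the helper `contains_error_code` is hypothetical, and treating a missing list as "no error codes" is an assumption rather than something the transfer data is known to need.

```python
import pandas as pd

# Toy data: each row holds a list of sender error codes (or None)
toy = pd.DataFrame({
    "conversation_id": ["c1", "c2", "c3", "c4"],
    "sender_error_codes": [[20], [], [19, 20], None],
})


def contains_error_code(error_codes, code):
    # Hypothetical helper: a missing list is treated as containing no codes
    return error_codes is not None and code in error_codes


# Series.apply forwards the extra keyword argument to the helper
has_error_20 = toy["sender_error_codes"].apply(contains_error_code, code=20)
toy_with_error_20 = toy[has_error_20]
print(toy_with_error_20)
```

The same boolean Series can be combined with other conditions using `&`, as is done with the supplier and failure reason filters in the cell above.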

### EMIS - EMIS: final error 25 from sender

In [None]:
emis_sender_bool = transfers["sending_supplier"] == "EMIS"
emis_requesting_bool = transfers["requesting_supplier"] == "EMIS"
final_error_code_25_bool = transfers["final_error_codes"].apply(lambda error_codes: 25 in error_codes)
emis_transfers_with_final_error_25 = transfers[emis_sender_bool & emis_requesting_bool & final_error_code_25_bool].copy()

monthly_emis_transfers_with_final_error_25 = emis_transfers_with_final_error_25.pivot_table(
        index=["sending_practice_name", "sending_practice_ods_code"], 
        columns="month", values="conversation_id", 
        aggfunc="count"
)

monthly_error_25_overview = generate_sender_error_scenario_overview(monthly_emis_transfers_with_final_error_25, "error code 25")
monthly_error_25_overview

In [None]:
with pd.ExcelWriter("Emis Error Code Scenarios by Practice PRMT-2414.xlsx") as writer:
    monthly_error_20_overview.to_excel(writer, sheet_name="Error 20 overview",index=False)
    monthly_error_25_overview.to_excel(writer, sheet_name="Error 25 overview",index=False)
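
To address the question of whether the same practices are responsible for each error code, one possible approach is an outer merge of the practice lists with `indicator=True`. The sketch below uses hypothetical stand-ins for `monthly_error_20_overview` and `monthly_error_25_overview`; the ODS codes and names are made up.

```python
import pandas as pd

practice_cols = ["sending_practice_name", "sending_practice_ods_code"]

# Hypothetical stand-ins for the two overview tables
toy_error_20_overview = pd.DataFrame({
    "sending_practice_name": ["Practice A", "Practice B", "Practice C"],
    "sending_practice_ods_code": ["A1", "B2", "C3"],
})
toy_error_25_overview = pd.DataFrame({
    "sending_practice_name": ["Practice B", "Practice C", "Practice D"],
    "sending_practice_ods_code": ["B2", "C3", "D4"],
})

# indicator=True adds a "_merge" column: left_only, right_only or both
practice_overlap = toy_error_20_overview[practice_cols].merge(
    toy_error_25_overview[practice_cols],
    how="outer",
    on=practice_cols,
    indicator=True,
)

# Human-readable labels for each practice's combination of errors
practice_overlap["scenario"] = practice_overlap["_merge"].astype(str).map({
    "left_only": "error 20 only",
    "right_only": "error 25 only",
    "both": "both errors",
})
print(practice_overlap["scenario"].value_counts())
```

The resulting table could be written to an additional sheet of the same workbook if a per-practice overlap view is wanted.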