# Codelists for Inclusion Criteria
 
**Description** This notebook creates the codelists for aortic stenosis (AS), surgical aortic valve replacement (SAVR) and transcatheter aortic valve implantation (TAVI).
 
**Authors** Fionna Chalmers, Anna Stevenson (Health Data Science Team, BHF Data Science Centre)

**Reviewers** ⚠ UNREVIEWED

**Notes** TAVI code combinations contain SAVR codes, so a SAVR case can only be derived once it has been established that no TAVI combination exists.

**Data Output**
- **`ccu056_out_codelists_inclusions`** : codelist for AS, SAVR and TAVI

# 0. Setup

In [0]:
# pyspark libraries
import pyspark.sql.functions as f
import pyspark.sql.types as t
from pyspark.sql import Window

from functools import reduce

import databricks.koalas as ks
import pandas as pd
import numpy as np

import re
import io
import datetime

# plotting libraries
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import dates as mdates
import seaborn as sns


# versions
print("Matplotlib version: ", matplotlib.__version__)
print("Seaborn version: ", sns.__version__)
_datetimenow = datetime.datetime.now() # .strftime("%Y%m%d")
print(f"_datetimenow:  {_datetimenow}")


## 0.1 Helpers

### Common Functions

In [0]:
%run "/Repos/shds/common/functions"

### Help Functions

In [0]:
%run "/Repos/shds/Fionna/help_functions"

## 0.2 Parameters

In [0]:
%run "./CCU056-01-parameters"

## 0.3 Data

In [0]:
icd10 = spark.table(path_ref_icd10)
opcs = spark.table(path_ref_opcs4)
gdppr_ref = spark.table(path_ref_gdppr_refset)

# 1. Codelists

## Aortic Stenosis


Supplied by Anvesha:

| ICD10 Code | Definition |
| ---------- | ---------- |
| I06.0	| Rheumatic aortic stenosis |
| I06.2	| Rheumatic aortic stenosis with insufficiency |
| I35.0	| Nonrheumatic aortic (valve) stenosis |
| I35.2	| Aortic (valve) stenosis with insufficiency |
| Q23.0	| Congenital stenosis of aortic valve |


Codes now removed:

| ICD10 Code | Definition |
| ---------- | ---------- |
| I06.8	| Other rheumatic aortic valve diseases |
| I06.9	| Rheumatic aortic valve disease, unspecified |
| I08.0	| Disorders of both mitral and aortic valves |
| I35.8	| Other aortic valve disorders |
| I35.9	| Aortic valve disorder, unspecified |
| I39.1*	| Aortic valve disorders in diseases classified elsewhere |

In [0]:
# # working to copy and paste codes
# display(icd10.filter(f.col("CODE").startswith('Q23')))

In [0]:
codelist_as = spark.createDataFrame(
  [
    ('aortic_stenosis',  'ICD10',  'I060',   'Rheumatic aortic stenosis','',''),
    ('aortic_stenosis',  'ICD10',  'I062',   'Rheumatic aortic stenosis with insufficiency','',''),
    # ('aortic_stenosis',  'ICD10',  'I068',   'Other rheumatic aortic valve diseases','',''),
    # ('aortic_stenosis',  'ICD10',  'I069',   'Rheumatic aortic valve disease, unspecified','',''),

    # ('aortic_stenosis',  'ICD10',  'I080',   'Disorders of both mitral and aortic valves','',''),

    ('aortic_stenosis',  'ICD10',  'I350',   'Aortic (valve) stenosis','',''),
    ('aortic_stenosis',  'ICD10',  'I352',   'Aortic (valve) stenosis with insufficiency','',''),
    # ('aortic_stenosis',  'ICD10',  'I358',   'Other aortic valve disorders','',''),
    # ('aortic_stenosis',  'ICD10',  'I359',   'Aortic valve disorder, unspecified','',''),

    # ('aortic_stenosis',  'ICD10',  'I391',   'Aortic valve disorders in diseases classified elsewhere','',''),
    
    ('aortic_stenosis',  'ICD10',  'Q230',   'Congenital stenosis of aortic valve','','')
    
  ],
  
  ['name', 'terminology', 'code', 'term', 'code_type', 'RecordDate']  
)


# Reformat
# remove trailing X's, decimal points, dashes, and spaces
codelist_as = (
  codelist_as
  .withColumn('_code_old', f.col('code'))
  .withColumn('code', f.when(f.col('terminology') == 'ICD10', f.regexp_replace('code', r'X$', '')).otherwise(f.col('code')))
  .withColumn('code', f.when(f.col('terminology') == 'ICD10', f.regexp_replace('code', r'[\.\-\s]', '')).otherwise(f.col('code')))
  .withColumn('_code_diff', f.when(f.col('code') != f.col('_code_old'), 1).otherwise(0))
)

# check
tmpt = tab(codelist_as, '_code_diff'); print()
print(codelist_as.where(f.col('_code_diff') == 1).orderBy('name', 'terminology', 'code').toPandas().to_string()); print()
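
The reformatting step above can be illustrated outside Spark. This is a minimal sketch in plain Python, assuming the same two regular expressions applied in the same order; the function name `normalise_code` is hypothetical and not part of the notebook or its helper functions.

```python
import re

def normalise_code(terminology, code):
    """Mirror the reformat step: only ICD10 codes are changed.

    First a single trailing 'X' is removed, then any dots, dashes
    and whitespace characters anywhere in the code.
    """
    if terminology != "ICD10":
        return code
    code = re.sub(r"X$", "", code)        # trailing X filler
    return re.sub(r"[.\-\s]", "", code)   # decimal points, dashes, spaces

for raw in ["I35.0", "I10X", "Q23-0", "I060"]:
    print(raw, "->", normalise_code("ICD10", raw))
```

The codes in `codelist_as` are already entered without punctuation, so the step is expected to leave them unchanged, which is what the `_code_diff` check reports.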

In [0]:
display(codelist_as)

## SAVR


Supplied by Anvesha:
OPCS-4.10 codes for Aortic Valve Replacement<br>
https://classbrowser.nhs.uk/#/book/OPCS-4.10/volume1-p2-4.html+K26.4

In the document from Anvesha, codes K265 to K269 are in brackets; this needs to be queried.<br>
Anvesha also included the K26 parent code, but it is omitted from the codelist because all K26 child codes are listed individually.

| OPCS4 Code | Definition |
| ---------- | ---------- |
| K26	| Plastic repair of aortic valves |
| K261	| Allograft replacement of aortic valve |
| K262	| Xenograft replacement of aortic valve |
| K263	| Prosthetic replacement of aortic valve |
| K264	| Replacement of aortic valve NEC |
| K265	| Aortic valve repair NEC Includes: Aortic valvuloplasty NEC |
| K268	| Other specified plastic repair of aortic valve |
| K269	| Unspecified plastic repair of aortic valve |

In [0]:
# # working to copy and paste codes
# display(opcs.filter(f.col("OPCS_CODE").startswith('Y49'))
#         .filter(f.col("OPCS_VERSION")=="4_10").orderBy("OPCS_CODE")
#         .select("OPCS_CODE","ALT_OPCS_CODE","OPCS_CODE_DESC_FULL","CATEGORY_CODE","CATEGORY_CODE_DESC_FULL")
#         )

In [0]:
codelist_savr = spark.createDataFrame(
  [
    ('savr',  'OPCS4',  'K261',   'Allograft replacement of aortic valve','',''),
    ('savr',  'OPCS4',  'K262',   'Xenograft replacement of aortic valve','',''),
    ('savr',  'OPCS4',  'K263',   'Prosthetic replacement of aortic valve','',''),
    ('savr',  'OPCS4',  'K264',   'Replacement of aortic valve NEC','',''),
    ('savr',  'OPCS4',  'K265',   'Aortic valve repair NEC','',''),
    ('savr',  'OPCS4',  'K268',   'Other specified plastic repair of aortic valve','',''),
    ('savr',  'OPCS4',  'K269',   'Unspecified plastic repair of aortic valve','','')

  ],
  
  ['name', 'terminology', 'code', 'term', 'code_type', 'RecordDate']  
)

In [0]:
display(codelist_savr)

## TAVI

Supplied by Anvesha:
https://classbrowser.nhs.uk/ref_books/OPCS-4.9_NCCS-2021.pdf

| OPCS4 Code | Definition |
| ---------- | ---------- |
| PCSK1	| Transcatheter aortic valve implantation (K26) |

Using the classbrowser above there are 2 TAVI code combinations:
    
For transcatheter aortic valve implantation (TAVI) using a **surgical approach** through left
ventricle (transapical or transventricular approach) the following codes must be assigned:
* K26.- Plastic repair of aortic valve
* Y49.4 - Transapical approach to heart
* Y53.- Approach to organ under image control **or**<br>
  Y68.- Other approach to organ under image control
  
For TAVI using a **transluminal approach** through an artery (i.e. femoral, subclavian, axillary
or aorta) the following codes must be assigned:
* K26.- Plastic repair of aortic valve
* Y79.- Approach to organ through artery
* Y53.- Approach to organ under image control **or**<br>
 Y68.- Other approach to organ under image control

 **TAVI will be defined using the above 2 combinations. That is, we will use the surgical approach TAVI definition and the transluminal approach TAVI definition.**

TAVI combinations will be identified and named as follows:<br>
 **TAVI 1** - K26 & Y49 & Y53<br>
 **TAVI 2** - K26 & Y49 & Y68<br>
 **TAVI 3** - K26 & Y79 & Y53<br>
 **TAVI 4** - K26 & Y79 & Y68
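
The four combinations, together with the rule that a SAVR case requires no TAVI combination, can be sketched in plain Python. The code sets are copied from the TAVI codelists in this notebook. The function names are hypothetical, and the sketch assumes the caller has already grouped the OPCS-4 codes that belong to one procedure; how that grouping is done is not defined here.

```python
# Codes per parent group, as listed in the TAVI codelists in this notebook.
TAVI_GROUPS = {
    "K26": {"K261", "K262", "K263", "K264", "K265", "K268", "K269"},
    "Y49": {"Y494"},
    "Y53": {f"Y53{d}" for d in "123456789"},
    "Y68": {"Y681", "Y688", "Y689"},
    "Y79": {"Y791", "Y792", "Y793", "Y794", "Y795", "Y798", "Y799"},
}

TAVI_COMBINATIONS = {
    "TAVI 1": ("K26", "Y49", "Y53"),
    "TAVI 2": ("K26", "Y49", "Y68"),
    "TAVI 3": ("K26", "Y79", "Y53"),
    "TAVI 4": ("K26", "Y79", "Y68"),
}

def tavi_combinations(codes):
    """Return the names of every TAVI combination present in `codes`."""
    codes = set(codes)
    present = {parent for parent, members in TAVI_GROUPS.items() if members & codes}
    return [
        name
        for name, parents in TAVI_COMBINATIONS.items()
        if all(parent in present for parent in parents)
    ]

def classify(codes):
    """'tavi' if any combination matches, else 'savr' if a K26 code is present."""
    if tavi_combinations(codes):
        return "tavi"
    if TAVI_GROUPS["K26"] & set(codes):
        return "savr"
    return None

print(tavi_combinations(["K262", "Y793", "Y531"]))
print(classify(["K263"]))
```

A K26 code accompanied by an approach code but no image-control code (for example `K263` with `Y793`) matches none of the four combinations, so under this sketch it falls back to `savr`.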

In [0]:
# surgical approach
codelist_tavi_sa = spark.createDataFrame(
    [
        ('tavi', 'OPCS4', 'K261', 'K26', 'Allograft replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K262', 'K26', 'Xenograft replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K263', 'K26', 'Prosthetic replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K264', 'K26', 'Replacement of aortic valve NEC', '', ''),
        ('tavi', 'OPCS4', 'K265', 'K26', 'Aortic valve repair NEC', '', ''),
        ('tavi', 'OPCS4', 'K268', 'K26', 'Other specified plastic repair of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K269', 'K26', 'Unspecified plastic repair of aortic valve', '', ''),

        ('tavi', 'OPCS4', 'Y494', 'Y49', 'Transapical approach to heart', '', ''),

        ('tavi', 'OPCS4', 'Y531', 'Y53', 'Approach to organ under radiological control', '', ''),
        ('tavi', 'OPCS4', 'Y532', 'Y53', 'Approach to organ under ultrasonic control', '', ''),
        ('tavi', 'OPCS4', 'Y533', 'Y53', 'Approach to organ under computed tomography scan control', '', ''),
        ('tavi', 'OPCS4', 'Y533', 'Y53', 'Approach to organ under CT scan control', '', ''),
        ('tavi', 'OPCS4', 'Y534', 'Y53', 'Approach to organ under fluoroscopic control', '', ''),
        ('tavi', 'OPCS4', 'Y535', 'Y53', 'Approach to organ under image intensifier', '', ''),
        ('tavi', 'OPCS4', 'Y536', 'Y53', 'Approach to organ under video control', '', ''),
        ('tavi', 'OPCS4', 'Y537', 'Y53', 'Approach to organ under MRI control', '', ''),
        ('tavi', 'OPCS4', 'Y537', 'Y53', 'Approach to organ under magnetic resonance imaging control', '', ''),
        ('tavi', 'OPCS4', 'Y538', 'Y53', 'Other specified approach to organ under image control', '', ''),
        ('tavi', 'OPCS4', 'Y539', 'Y53', 'Unspecified approach to organ under image control', '', ''),

        ('tavi', 'OPCS4', 'Y681', 'Y68', 'Approach to organ under contrast enhanced ultrasonic control', '', ''),
        ('tavi', 'OPCS4', 'Y688', 'Y68', 'Other specified other approach to organ under image control', '', ''),
        ('tavi', 'OPCS4', 'Y689', 'Y68', 'Unspecified other approach to organ under image control', '', ''),
    ],

    ['name', 'terminology', 'code', 'parent', 'term', 'code_type', 'RecordDate']
)

# transluminal approach
codelist_tavi_ta = spark.createDataFrame(
    [
        ('tavi', 'OPCS4', 'K261', 'K26', 'Allograft replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K262', 'K26', 'Xenograft replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K263', 'K26', 'Prosthetic replacement of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K264', 'K26', 'Replacement of aortic valve NEC', '', ''),
        ('tavi', 'OPCS4', 'K265', 'K26', 'Aortic valve repair NEC', '', ''),
        ('tavi', 'OPCS4', 'K268', 'K26', 'Other specified plastic repair of aortic valve', '', ''),
        ('tavi', 'OPCS4', 'K269', 'K26', 'Unspecified plastic repair of aortic valve', '', ''),

        ('tavi', 'OPCS4', 'Y791', 'Y79', 'Transluminal approach to organ through subclavian artery', '', ''),
        ('tavi', 'OPCS4', 'Y792', 'Y79', 'Transluminal approach to organ through brachial artery', '', ''),
        ('tavi', 'OPCS4', 'Y793', 'Y79', 'Transluminal approach to organ through femoral artery', '', ''),
        ('tavi', 'OPCS4', 'Y794', 'Y79', 'Transluminal approach to organ through aortic artery', '', ''),
        ('tavi', 'OPCS4', 'Y795', 'Y79', 'Transluminal approach to organ through radial artery', '', ''),
        ('tavi', 'OPCS4', 'Y798', 'Y79', 'Other specified approach to organ through artery', '', ''),
        ('tavi', 'OPCS4', 'Y799', 'Y79', 'Unspecified approach to organ through artery', '', ''),

        ('tavi', 'OPCS4', 'Y531', 'Y53', 'Approach to organ under radiological control', '', ''),
        ('tavi', 'OPCS4', 'Y532', 'Y53', 'Approach to organ under ultrasonic control', '', ''),
        ('tavi', 'OPCS4', 'Y533', 'Y53', 'Approach to organ under computed tomography scan control', '', ''),
        ('tavi', 'OPCS4', 'Y533', 'Y53', 'Approach to organ under CT scan control', '', ''),
        ('tavi', 'OPCS4', 'Y534', 'Y53', 'Approach to organ under fluoroscopic control', '', ''),
        ('tavi', 'OPCS4', 'Y535', 'Y53', 'Approach to organ under image intensifier', '', ''),
        ('tavi', 'OPCS4', 'Y536', 'Y53', 'Approach to organ under video control', '', ''),
        ('tavi', 'OPCS4', 'Y537', 'Y53', 'Approach to organ under MRI control', '', ''),
        ('tavi', 'OPCS4', 'Y537', 'Y53', 'Approach to organ under magnetic resonance imaging control', '', ''),
        ('tavi', 'OPCS4', 'Y538', 'Y53', 'Other specified approach to organ under image control', '', ''),
        ('tavi', 'OPCS4', 'Y539', 'Y53', 'Unspecified approach to organ under image control', '', ''),

        ('tavi', 'OPCS4', 'Y681', 'Y68', 'Approach to organ under contrast enhanced ultrasonic control', '', ''),
        ('tavi', 'OPCS4', 'Y688', 'Y68', 'Other specified other approach to organ under image control', '', ''),
        ('tavi', 'OPCS4', 'Y689', 'Y68', 'Unspecified other approach to organ under image control', '', ''),
    ],

    ['name', 'terminology', 'code', 'parent', 'term', 'code_type', 'RecordDate']
)

In [0]:
display(codelist_tavi_sa)

In [0]:
display(codelist_tavi_ta)

# 2. Save

In [0]:
codelists_inclusions = (
    codelist_as.select("name","terminology","code","term")
    .union(codelist_savr.select("name","terminology","code","term"))
    .union(codelist_tavi_sa.select("name","terminology","code","term"))
    .union(codelist_tavi_ta.select("name","terminology","code","term"))
    .distinct()
    .withColumn("parent",f.when(f.col("name")=="tavi",f.substring(f.col("code"), 1, 3)).otherwise(f.lit("")))
    .orderBy("name","code")
    )

display(codelists_inclusions)
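
Because `.distinct()` compares whole rows, a code that appears with two different terms (such as `Y533` and `Y537`) is kept twice, while rows shared by the two TAVI lists collapse to one. The following is a small plain-Python sketch of that behaviour and of the `parent` derivation, using a handful of rows copied from the codelists above; the variable names are illustrative only.

```python
from collections import Counter

# (name, terminology, code, term): a small sample of the unioned rows.
rows = [
    ("savr", "OPCS4", "K261", "Allograft replacement of aortic valve"),
    ("tavi", "OPCS4", "K261", "Allograft replacement of aortic valve"),  # surgical list
    ("tavi", "OPCS4", "K261", "Allograft replacement of aortic valve"),  # transluminal list
    ("tavi", "OPCS4", "Y533", "Approach to organ under computed tomography scan control"),
    ("tavi", "OPCS4", "Y533", "Approach to organ under CT scan control"),
]

# Whole-row de-duplication, then ordering by name and code.
distinct_rows = sorted(set(rows), key=lambda r: (r[0], r[2]))

# parent = first three characters of the code for TAVI rows, empty otherwise.
with_parent = [(*r, r[2][:3] if r[0] == "tavi" else "") for r in distinct_rows]

# Codes that still occur more than once within a name after de-duplication.
code_counts = Counter((name, code) for name, _, code, _ in distinct_rows)
repeated = {key: n for key, n in code_counts.items() if n > 1}

print(len(distinct_rows), repeated)
```

Whether the repeated codes matter depends on how the saved table is used: a join on `code` alone would return more than one row per match for those codes.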

In [0]:
save_table(df=codelists_inclusions, out_name=f'{proj}_out_codelists_inclusions', save_previous=False)

In [0]:
codelists_inclusions = spark.table(f'{dsa}.{proj}_out_codelists_inclusions')

In [0]:
display(codelists_inclusions)