## Charlson Comorbidity Index

Automated calculation of the Charlson Comorbidity Index (CCMI), using the weightings and calculation from: https://www.mdcalc.com/calc/3917/charlson-comorbidity-index-cci


ICD-10 code lists taken from: 
https://journals.lww.com/lww-medicalcare/_layouts/15/oaks.journals/ImageView.aspx?k=lww-medicalcare:2005:11000:00010&i=T1-10&year=2005&issue=11000&article=00010&type=Fulltext

Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, Saunders LD, Beck CA, Feasby TE, Ghali WA. Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data. Medical Care. November 2005;43(11):1130-1139.
doi: 10.1097/01.mlr.0000182534.19832.83

SNOMED-CT code lists developed using Expression Constraint Language with clinical review.


| Disease                       | ICD-10 | SNOMED-CT (ECL)                                                                                                                                                                                                                                                     |
|-------------------------------|-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| MyocardialInfarction          | I21, I22, I252 | << 22298006 \| Myocardial infarction (disorder) \| |
| CongestiveHeartFailure        | I099, I110, I13[02], I255, I42[056789], I43, I50, P290      | << 42343007 \| Congestive heart failure (disorder) \|                                                                                                                                                                                                               |
| PeripheralVascularDisease     | I7[01], I73[189], I771, I79[02], K55[189], Z95[89]       | << 400047006 \| Peripheral vascular disease (disorder) \|                                                                                                                                                                                                           |
| CVA                           | G4[56], H340, I6 | << 230690007 \| Cerebrovascular accident (disorder) \| or << 266257000 \| Transient ischemic attack (disorder) \| or << 432504007 \| Cerebral infarction (disorder) \| |
| Dementia                      | F0[0-3], F051, G30, G311       | << 52448006 \| Dementia (disorder) \|                                                                                                                                                                                                                               |
| COPD                          |  I27[89], J4[0-7], J6[0-7], J684, J70[13]      | << 13645005 \| Chronic obstructive lung disease (disorder) \|                                                                                                                                                                                                       |
| ConnectiveTissueDisease       |  M0[56], M315, M3[2-4], M35[3], M360      | (<< 55464009\|Systemic lupus erythematosus\| OR << 69896004\|Rheumatoid arthritis\| OR << 156370009\|Psoriatic arthritis\| OR << 239821006\|Secondary inflammatory arthritis\| OR < 9631008\|Ankylosing spondylitis\| OR << 128460000\|Systemic sclerosis, diffuse\|) MINUS 1148595001 \| Subacute arthritis due to and following rheumatic fever (disorder) \| |
| PepticUlcer                   |  K2[5-8]      | <<13200003 \| Peptic ulcer (disorder) \| or <<266998003 \| History of peptic ulcer (situation) \| or << 397825006 \| Gastric ulcer (disorder)\|                                                                                                                                                               |
| MildLiverDisease              | B18, K70[0-39], K717, K7[34], K76[02-489], Z944       | << 128241005 \| Inflammatory disease of liver (disorder) \| :  << 263502005 \| Clinical course (attribute) \| != << 424124008 \| Sudden onset AND/OR short duration (qualifier value)\|                                                                                      |
| ModerateSevereLiverDisease    | I85[09], I864, I982, K704, K7[12]1, K729, K76[567]       | << 235856003 \| Disorder of liver (disorder) \| MINUS 773113008 \| Acute hepatitis caused by infection (disorder) \|                                                                                                                                                                                                                   |
| DiabetesNoComplications       |  E1[0-4][1689]      | << 73211009 \| Diabetes mellitus (disorder) \| MINUS << 703136005 \| Diabetes mellitus in remission (disorder)\|                                                                                                                                                    |
| DiabetesComplications         | E1[0-4][23457]       | << 116223007\|Complication\| : 42752001\|Due to\| = << 73211009\|Diabetes mellitus\|                                                                                                                                                                                |
| Hemiplegia                    | G041, G114, G80[12], G8[12], G83[0-49] | << 50582007 \| Hemiplegia (disorder) \| |
| ChronicKidneyDisease          |  I120, I131, N0[35][2-7], N1[89], N25, Z49[012], Z940, Z992      | << 709044004 \| Chronic kidney disease (disorder) \|                                                                                                                                                                                                                |
| MalignantHematologicalDisease |  C[016], C2[0-6], C3[7-9], C4[013-9], C5[0-8], C7[0-34-6]      | << 64572001\|Disease\| : << 116676008\|Associated morphology\| = << 86049000 \|Malignant neoplasm, primary (morphologic abnormality)\|                                                                                                                              |
| MetastaticDisease             |  C7[789], C80      | << 128462008 \| Secondary malignant neoplastic disease (disorder) \| OR (64572001\|Disease\| : 116676008\|Associated morphology\| = 14799000\|Neoplasm, metastatic\|)                                                                                               |
| LeukaemiaMultipleMyeloma      |  C9[0-7]      | << 93143009 \|Leukemia, disease (disorder)\|                                                                                                                                                               |
| Lymphoma                      |  C8[1-58]      |   << 118600007 \| Malignant lymphoma (disorder)\|                                                                                                                                                       |
| AIDS                          |  B2[0-24]      | << 763713000 \| Idiopathic CD4 lymphocytopenia (disorder) \| OR ( \*: << 246075003 \|Causative agent (attribute)\|= <<19030005 \| Human immunodeficiency virus (organism) \|)                                                                                     |
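
The ICD-10 column uses a compact bracket notation: each entry is a code prefix with the dot removed, and a bracketed group lists the characters allowed at that position. As a minimal sketch of how such a cell can be read, the hypothetical helper `matches_icd10` below (not part of the original notebook) turns a cell into one regular expression and anchors it at the start of the code with `re.match`. The notebook's own functions use `str.contains`, which is unanchored.

```python
import re

def matches_icd10(code, cell):
    """Hypothetical helper: True when an ICD-10 code (dot removed) starts
    with any pattern from a table cell such as 'I21, I22, I252'."""
    patterns = [p.strip() for p in cell.split(",") if p.strip()]
    combined = re.compile("|".join(patterns))
    # re.match only matches at the beginning of the string
    return combined.match(code) is not None

print(matches_icd10("I219", "I21, I22, I252"))        # I21 prefix
print(matches_icd10("K709", "B18, K70[0-39], K717"))  # K70 followed by 9
print(matches_icd10("K704", "B18, K70[0-39], K717"))  # K704 is not in the mild list
```

In the third call the code K704 is rejected because `[0-39]` allows only the digits 0 to 3 and 9 in the fourth position.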


## Set up

Read in packages, define disorders of interest, assign weightings

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pyodbc
import datetime
import functools as ft
import requests

pd.set_option('max_colwidth', None)

In [None]:
def MyocardialInfarction():   
    codelist = ECL_to_conceptid('<< 22298006')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I21|I22|I252', regex=True, na=False)].mrn
    c=pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["MyocardialInfarction"] = 1
    return d

def CongestiveHeartFailure():
    codelist=ECL_to_conceptid('<<42343007')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I099|I110|I13[02]|I255|I42[05-9]|I43|I50|P290', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["CongestiveHeartFailure"] = 1
    return d

def PeripheralVascularDisease():   
    codelist = ECL_to_conceptid('<< 400047006')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I7[01]|I73[189]|I771|I79[02]|K55[189]|Z95[89]', regex=True, na=False)].mrn
    c=pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["PeripheralVascularDisease"] = 1
    return d

def CVA():
    codelist=ECL_to_conceptid('<< 230690007 |Cerebrovascular accident (disorder)| or << 266257000 |Transient ischemic attack (disorder)| or << 432504007 |Cerebral infarction (disorder)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('G4[56]|H340|I6', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["Stroke"] = 1
    return d
 
def Dementia():
    codelist = ECL_to_conceptid('<< 52448006')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('F0[0-3]|F051|G30|G311', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["Dementia"] = 1
    return d

def COPD():
    codelist = ECL_to_conceptid('<< 13645005')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I27[89]|J4[0-7]|J6[0-7]|J684|J70[13]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["RespiratoryDisease"] = 1
    return d    

def ConnectiveTissueDisease():
    codelist = ECL_to_conceptid('(<< 55464009|Systemic lupus erythematosus| OR << 69896004|Rheumatoid arthritis| OR << 156370009|Psoriatic arthritis| OR << 239821006|Secondary inflammatory arthritis| OR < 9631008|Ankylosing spondylitis| OR << 128460000|Systemic sclerosis, diffuse|) MINUS 1148595001 | Subacute arthritis due to and following rheumatic fever (disorder) |')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('M0[56]|M315|M3[2-4]|M35[3]|M360', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["ConnectiveTissueDisease"] = 1
    return d 

def PepticUlcer():
    codelist = ECL_to_conceptid('<<13200003 | Peptic ulcer (disorder) | or <<266998003 | History of peptic ulcer (situation) | or << 397825006 | Gastric ulcer (disorder)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('K2[5-8]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["PepticUlcer"] = 1
    return d 

def MildLiverDisease():
    codelist = ECL_to_conceptid('<< 128241005 |Inflammatory disease of liver (disorder)| :  << 263502005 |Clinical course (attribute)| != << 424124008 |Sudden onset AND/OR short duration (qualifier value)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('B18|K70[0-39]|K717|K7[34]|K76[02-489]|Z944', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["MildLiverDisease"] = 1
    return d 

def ModerateSevereLiverDisease():
    codelist = ECL_to_conceptid('<< 235856003 MINUS 773113008 | Acute hepatitis caused by infection (disorder) |')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I85[09]|I864|I982|K704|K7[12]1|K729|K76[567]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["ModerateSevereLiverDisease"] = 1
    return d 

def DiabetesNoComplications():
    codelist = ECL_to_conceptid('<< 73211009 MINUS << 703136005')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('E1[0-4][1689]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["DiabetesNoComplications"] = 1
    return d 

def DiabetesComplications():
    codelist = ECL_to_conceptid('<< 116223007|Complication| : 42752001|Due to| = << 73211009|Diabetes mellitus|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('E1[0-4][23457]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["DiabetesComplications"] = 1
    return d
    
def Hemiplegia():
    codelist = ECL_to_conceptid('<< 50582007')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('G041|G114|G80[12]|G8[12]|G83[0-49]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["Hemiplegia"] = 1
    return d

def ChronicKidneyDisease():
    codelist = ECL_to_conceptid('<< 709044004')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('I120|I131|N0[35][2-7]|N1[89]|N25|Z49[012]|Z940|Z992', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["ChronicKidneyDisease"] = 1
    return d

def MalignantHematologicalDisease():
    codelist = ECL_to_conceptid('<< 64572001|Disease| : << 116676008|Associated morphology| = << 86049000 |Malignant neoplasm, primary (morphologic abnormality)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('C[016]|C2[0-6]|C3[7-9]|C4[013-9]|C5[0-8]|C7[0-34-6]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["MalignantHematologicalDisease"] = 1
    return d

def MetastaticDisease():
    codelist = ECL_to_conceptid('<< 128462008 | Secondary malignant neoplastic disease (disorder) | OR (64572001|Disease| : 116676008|Associated morphology| = 14799000|Neoplasm, metastatic|)')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('C7[789]|C80', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["MetastaticDisease"] = 1
    return d

def LeukaemiaMultipleMyeloma():
    codelist = ECL_to_conceptid('<< 93143009 |Leukemia, disease (disorder)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('C9[0-7]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["LeukaemiaMultipleMyeloma"] = 1
    return d

def Lymphoma():
    codelist = ECL_to_conceptid('  << 118600007 |Malignant lymphoma (disorder)|')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('C8[1-58]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["Lymphoma"] = 1
    return d

def AIDS():
    codelist = ECL_to_conceptid('<< 763713000 | Idiopathic CD4 lymphocytopenia (disorder) | OR (*: << 246075003 |Causative agent (attribute)|= <<19030005 | Human immunodeficiency virus (organism)|)')
    a=data[data.ConceptID.isin(codelist.code)].mrn    
    b=ICDdata[ICDdata.ICD_Diagnosis_Cd.str.contains('B2[0-24]', regex=True, na=False)].mrn
    c =pd.concat([a,b], ignore_index= True).drop_duplicates()
    d=pd.DataFrame(c)
    d["AIDS"] = 1
    return d
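
The functions above all follow one pattern: look up SNOMED CT concepts, match ICD-10 codes, combine the patient identifiers, and add a flag column. A minimal sketch of that pattern as a single helper is shown here on toy data. The name `flag_patients` and its parameters are hypothetical, and the data frames are passed in explicitly rather than read from the `data` and `ICDdata` globals.

```python
import pandas as pd

def flag_patients(name, concept_ids, icd_regex, snomed_df, icd_df):
    """Hypothetical generic version of the per-disease functions."""
    # patients with a matching SNOMED CT concept
    a = snomed_df[snomed_df.ConceptID.isin(concept_ids)].mrn
    # patients with a matching ICD-10 code (missing codes never match)
    b = icd_df[icd_df.ICD_Diagnosis_Cd.str.contains(icd_regex, regex=True, na=False)].mrn
    d = pd.concat([a, b], ignore_index=True).drop_duplicates().to_frame()
    d[name] = 1
    return d

# toy data, invented for illustration only
snomed_df = pd.DataFrame({"mrn": [1, 2], "ConceptID": ["22298006", "52448006"]})
icd_df = pd.DataFrame({"mrn": [2, 3, 4], "ICD_Diagnosis_Cd": ["I219", "F03", None]})

mi = flag_patients("MyocardialInfarction", ["22298006"], "I21|I22|I252", snomed_df, icd_df)
print(mi)
```

Patient 1 is found through the SNOMED CT concept and patient 2 through the ICD-10 code, so both are flagged once.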

In [None]:
CoMorb_weightings = {'MyocardialInfarction': 1,
    'CongestiveHeartFailure':1,
    'PeripheralVascularDisease': 1,
    'Stroke': 1,
    'Dementia': 1,
    'RespiratoryDisease': 1,
    'ConnectiveTissueDisease': 1,
    'PepticUlcer': 1,
    'MildLiverDisease' : 1,
    'ModerateSevereLiverDisease': 3,
    #'Diabetes diet controlled': 0, # no weighting so no need to include
    'DiabetesNoComplications': 1,
    'DiabetesComplications': 2,
    'Hemiplegia': 2,
    'ChronicKidneyDisease': 2,
    'MalignantHematologicalDisease': 2,
    'MetastaticDisease': 6,
    'LeukaemiaMultipleMyeloma': 2,
    'Lymphoma': 2,
    'AIDS': 6}
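
As a worked example of how these weightings combine, the sketch below scores three invented patients using a subset of the weights. It assumes the same representation the notebook uses later (1 where a condition was found, NaN otherwise) and applies the rule that the more severe category replaces the milder one.

```python
import numpy as np
import pandas as pd

# subset of the weightings, for illustration
weights = {"MildLiverDisease": 1, "ModerateSevereLiverDisease": 3, "Dementia": 1}

# 1 = condition found, NaN = not found; one row per (invented) patient
flags = pd.DataFrame(
    {
        "MildLiverDisease": [1, 1, np.nan],
        "ModerateSevereLiverDisease": [np.nan, 1, np.nan],
        "Dementia": [1, np.nan, np.nan],
    },
    index=pd.Index(["p1", "p2", "p3"], name="mrn"),
)

# multiply each flag column by its weight (the Series index aligns to columns)
weighted = flags * pd.Series(weights)

# moderate/severe liver disease supersedes mild liver disease
has_severe = weighted["ModerateSevereLiverDisease"].notna()
weighted.loc[has_severe, "MildLiverDisease"] = np.nan

score = weighted.sum(axis=1)  # NaN values are skipped
print(score)
```

Here `p1` scores 2 (mild liver disease plus dementia), `p2` scores 3 (only the moderate/severe weighting counts), and `p3` scores 0.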

## Data extraction

`cohort` is the name of the table that defines the patient cohort.

In [None]:
conn = pyodbc.connect('Driver={xxxx};'
                     'Server=xxxx;'
                     'Database=xxxx;'
                     'Trusted_Connection=xxxx;')
cursor = conn.cursor()

In [None]:
cohort = 'patientlist'

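# SNOMED problems and diagnoses
# columns are aliased to the names used by the comorbidity functions (mrn, ConceptID)
sql = '''
SELECT distinct patientID AS mrn, SNOMED_conceptID AS ConceptID, SNOMED_humanreadable FROM DWH_problems_list p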
where (confirmation = 'Confirmed' or Confirmation = 'Probable') and (Status_Lifecycle = 'Active' or Status_Lifecycle = 'Resolved')
and exists (select 1 from ''' + cohort + ''' c where p.patientID = c.patientID)'''

data = pd.read_sql(sql, conn)
data.drop_duplicates(inplace=True)

In [None]:
# ICD10 code lists
sql = '''
select distinct icd.patientID AS mrn, icd.ICD_Diagnosis_Cd from DWH_ICD10_list icd
where exists ( select 1 from ''' + cohort + ''' c where icd.patientID=c.patientID)'''

ICDdata = pd.read_sql(sql, conn)

In [None]:
# age
sql = '''
select distinct d.patientID AS mrn, d.Birth_Dt from DWH_demographics d 
where exists (select 1 from ''' + cohort + '''  c where d.patientID=c.patientID)'''

agesql = pd.read_sql(sql, conn)

# close server connection
cursor.close()
conn.close()

## NHS Digital terminology server to run ECL expressions and produce code lists for disorders of interest

In [None]:
# NHS Terminology server sign-in details
client_id = "name"
client_secret = "xxxxx"

#API endpoints
authoring_server = "https://ontology.nhs.uk/authoring/fhir"
production1_server = "https://ontology.nhs.uk/production1/fhir"
production2_server = "https://ontology.nhs.uk/production2/fhir"
token_server = "https://ontology.nhs.uk/authorisation/auth/realms/nhs-digital-terminology/protocol/openid-connect/token"
authorisation_server = "https://ontology.nhs.uk/authorisation/auth/realms/nhs-digital-terminology/protocol/openid-connect/auth"

# get access_token from token server
auth_data =  {'grant_type': 'client_credentials', 'client_id': client_id, 'client_secret': client_secret} 
token_response_json = requests.post(token_server, data=auth_data).json()
access_token=token_response_json['access_token']
expires_in=token_response_json['expires_in'] # could be used to time new token request or refresh
refresh_token=token_response_json.get('refresh_token') # can be used to refresh access_token; may be absent for client_credentials grants

## Expression Constraint Language as function
def ECL_to_conceptid(ecl):
    lookup_url=production1_server + '/ValueSet/$expand'
    request_headers = {'Authorization': 'Bearer ' + access_token}
    # note: 'count' caps the expansion at 1000 concepts, so larger value sets are truncated
    query={'url':'http://snomed.info/sct?fhir_vs=ecl/'+ ecl, 'count':1000}
    lookup_response = requests.get(lookup_url, headers=request_headers, params=query)
    lookup_response.raise_for_status()  # fail loudly on authentication or ECL syntax errors
    # an expansion with no matching concepts has no 'contains' element
    contains = lookup_response.json()['expansion'].get('contains', [])

    # convert to a pandas dataframe with one row per concept
    dataset = pd.DataFrame([(a['code'], a['display']) for a in contains],
                  columns =['code', 'display'])
    
    return dataset
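
Some of the ECL expressions used here, such as the one for primary malignant neoplasms, may expand to more concepts than a single `count` allows. The FHIR `$expand` operation defines `offset` and `count` parameters for paging, so one option is to request pages until the expansion is exhausted. The sketch below shows the loop only; `expand_all` and `fetch_page` are hypothetical names, and whether the terminology server honours `offset` as described is an assumption to verify.

```python
def expand_all(fetch_page, page_size=1000):
    """Collect (code, display) pairs across all pages of an expansion.

    fetch_page(offset, count) is a hypothetical callable that returns the
    parsed JSON of one $expand response for the given page.
    """
    concepts = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        expansion = page.get("expansion", {})
        contains = expansion.get("contains", [])
        concepts.extend((c["code"], c.get("display")) for c in contains)
        offset += len(contains)
        total = expansion.get("total")
        # stop on an empty or short page, or once the reported total is reached
        if not contains or len(contains) < page_size:
            break
        if total is not None and offset >= total:
            break
    return concepts

# stand-in for the server: 25 invented concepts served in pages
_fake_concepts = [{"code": str(i), "display": "concept %d" % i} for i in range(25)]

def fake_fetch_page(offset, count):
    return {"expansion": {"total": 25, "contains": _fake_concepts[offset:offset + count]}}

result = expand_all(fake_fetch_page, page_size=10)
print(len(result))
```

In the notebook, `fetch_page` could wrap the existing `requests.get` call and add `'offset': offset` and `'count': count` to the query parameters.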

## Find comorbidities in patient lists, add age weighting, calculate CCMI

In [None]:
dfs=[MyocardialInfarction(), CongestiveHeartFailure(), PeripheralVascularDisease(), CVA(), Dementia(), COPD(), ConnectiveTissueDisease(), PepticUlcer(), MildLiverDisease(), ModerateSevereLiverDisease(), DiabetesNoComplications(), DiabetesComplications(), Hemiplegia(), ChronicKidneyDisease(), MalignantHematologicalDisease(), MetastaticDisease(),LeukaemiaMultipleMyeloma() , Lymphoma(),AIDS()]

In [None]:
df_final = ft.reduce(lambda left, right: pd.merge(left, right, how='outer',on='mrn'), dfs)
df_final.set_index('mrn', inplace=True)
df_weighted= df_final.assign(**CoMorb_weightings).mul(df_final)

# For conditional weightings in: liver disease, Diabetes Mellitus, solid tumour
df_weighted.loc[~df_weighted['ModerateSevereLiverDisease'].isnull(), 'MildLiverDisease'] = np.nan
df_weighted.loc[~df_weighted['DiabetesComplications'].isnull(), 'DiabetesNoComplications'] = np.nan
df_weighted.loc[~df_weighted['MetastaticDisease'].isnull(), 'MalignantHematologicalDisease'] = np.nan


CCMI= df_weighted.sum(axis=1).to_frame(name='comorbidities')

In [None]:
now = pd.Timestamp('now')
agesql.set_index('mrn', inplace=True)
agesql['Birth_Dt']=pd.to_datetime(agesql['Birth_Dt'])
# age in completed years; the 'Y' timedelta unit is ambiguous and rejected by recent pandas versions
agesql['years'] = np.floor((now - agesql['Birth_Dt']).dt.days / 365.25)

# age points: under 50 = 0, 50-59 = 1, 60-69 = 2, 70-79 = 3, 80 and over = 4
# right=False makes the bands left-closed, so a 50-year-old scores 1 and age 0 is kept
agesql['band']=pd.cut(agesql['years'], bins=[0, 50, 60, 70, 80, np.inf], right=False, labels=[0, 1, 2, 3, 4])
agesql.dropna(inplace=True)
agesql['band']=agesql['band'].astype(int)
age=agesql.drop(columns=['years','Birth_Dt'])
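
The age component is sensitive to how the band edges are handled. As a quick check, the sketch below scores a few boundary ages, assuming the bands are under 50, 50 to 59, 60 to 69, 70 to 79, and 80 or over (0 to 4 points). The ages are invented for illustration.

```python
import numpy as np
import pandas as pd

# completed years of age, chosen to sit on the band edges
ages = pd.Series([0, 49, 50, 59, 60, 79, 80, 95], name="years")

# right=False gives left-closed bands: [0, 50), [50, 60), ..., [80, inf)
points = pd.cut(
    ages,
    bins=[0, 50, 60, 70, 80, np.inf],
    right=False,
    labels=[0, 1, 2, 3, 4],
).astype(int)

print(points.tolist())
```

With left-closed bands a patient aged exactly 50 falls in the 50 to 59 band and a newborn (age 0) is retained; with the default right-closed bands both would be handled differently.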

In [None]:
# join age with other ccmi
df = age.merge(CCMI, left_index=True, right_index=True, how = 'left') # both frames are indexed by mrn; require an age

# for comorbidities replace nan with 0
df['comorbidities'] = df['comorbidities'].fillna(0)
df=df.astype('int32')
df['CCMI']=df.sum(axis=1)
df.drop(columns=['band','comorbidities'], inplace=True)

In [None]:
df.head()