## Framingham Risk Calculator - Jupyter

INSTRUCTIONS (<i>from the MDCalc version of the calculator found [here](https://www.mdcalc.com/framingham-risk-score-hard-coronary-heart-disease)</i>):

There are several distinct Framingham risk models. MDCalc uses the 'Hard' coronary Framingham outcomes model, which is intended for use in non-diabetic patients age 30-79 years with no prior history of coronary heart disease or intermittent claudication, as it is the most widely applicable to patients without previous cardiac events. See the official Framingham [website](https://www.framinghamheartstudy.org/fhs-risk-functions/hard-coronary-heart-disease-10-year-risk/) for additional Framingham risk models.

### Overview

This notebook will include the following:
* We will use Python to program our own version of the Framingham Risk Calculator. 
* We will pull patient resources from a FHIR server to test our calculator.

A final version of the calculator is included in its own cell at the end of the notebook for those who don't want to go through the entire notebook.

### Calculating the risk

The formula for both men and women can be found [here](https://www.mdcalc.com/framingham-risk-score-hard-coronary-heart-disease#evidence)

In [1]:
import math
male_avg = {range(30,35):'1%',range(35,45):'4%',range(45,50):'8%',range(50,55):'10%',range(55,60):'13%',range(60,65):'20%',range(65,70): '22%',range(70,75):'25%'}
female_avg = {range(30,35):'<1%',range(35,40):'<1%',range(40,45):'1%',range(45,50):'2%',range(50,55):'3%',range(55,60):'7%',range(60,65):'8%',range(65,70): '8%',range(70,75):'11%'}

In [2]:
def male_fram(age,smoker,total_chol,hdl_chol,sys,treated):
    if age < 30 or age > 79:
        return ['old']
    if age > 70:
        age_smoke = 70
    else:
        age_smoke = age
            
    l = 52.000961*math.log(age) + 20.014077*math.log(total_chol) + -0.905964*math.log(hdl_chol) + \
    1.305784*math.log(sys) + 0.241549*treated + 12.096316*smoker + -4.605038*math.log(age)*math.log(total_chol) + \
    -2.84367*math.log(age_smoke)*smoker + -2.93323*math.log(age)*math.log(age) - 172.300168
    
    prob = 1 - 0.9402**math.exp(l)
    if age >= 75:
        return [prob*100,'old']
    else:
        for group in male_avg:
            if age in group:
                return [prob*100, male_avg[group]]

In [3]:
def female_fram(age,smoker,total_chol,hdl_chol,sys,treated):
    if age < 30 or age > 79:
        return ['old']
    if age > 78:
        age_smoke = 78
    else:
        age_smoke = age
            
    l = 31.764001*math.log(age) + 22.465206*math.log(total_chol) + -1.187731*math.log(hdl_chol) + \
    2.552905*math.log(sys) + 0.420251*treated + 13.07543*smoker + -5.060998*math.log(age)*math.log(total_chol) + \
    -2.996945*math.log(age_smoke)*smoker - 146.5933061
    
    prob = 1 - 0.98767**math.exp(l)
    if age >= 75:
        return [prob*100,'old']
    else:
        for group in female_avg:
            if age in group:
                return [prob*100, female_avg[group]]
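
Both functions above end with the same step: the linear predictor `l` is turned into a 10-year probability with `1 - S0**exp(l)`, where `S0` is the baseline survival constant (0.9402 in the male function and 0.98767 in the female function). The following is a minimal sketch of that step in isolation. The helper name `risk_from_predictor` is hypothetical and is not part of the notebook.

```python
import math

def risk_from_predictor(l, baseline_survival):
    """Convert a linear predictor into a 10-year risk, as a percentage."""
    return (1 - baseline_survival ** math.exp(l)) * 100

# With l = 0, exp(l) = 1, so the risk is simply 1 - baseline survival.
print(round(risk_from_predictor(0, 0.9402), 2))
# A larger predictor always yields a larger risk.
print(risk_from_predictor(1, 0.9402) > risk_from_predictor(0, 0.9402))
```

Separating this step out can make it easier to check the constants by hand before trusting the full calculation.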

### Pulling FHIR data

We need the following data points for each patient:
* Age
* Gender
* Smoking status (Yes = 1, No = 0)
* Systolic blood pressure
* Total cholesterol
* HDL cholesterol
* Is the patient's blood pressure being treated with medicines? (Yes = 1, No = 0)

We will use the Python requests library for executing our queries against the server. This means we need the appropriate URL for each of the resources we want. A helpful tool for building request URLs is the clinFHIR [Server Query](http://clinfhir.com/query.html) tool. It will help you see which request fields are available for each resource. The alternative would be to find request example URLs in the FHIR specification for each resource.

In [5]:
import requests,json,datetime,pandas
base = 'https://api.logicahealth.org/stratfhireducation/open/'

#### Patient resource - name, age, and gender

Typically, a FHIR app is launched within the EHR and, as a result, will have the ID of the patient resource for the subject.

In [6]:
r = requests.get(base + 'Patient?_id=fake')

A query to a FHIR server will typically return a bundle resource. If it did not find the resource you were asking for then the bundle won't contain any resources and will look something like this.

In [7]:
print(r.text)

{
  "resourceType": "Bundle",
  "id": "1501eb6b-9e9b-4e9b-b5b0-1f0027dc5a18",
  "meta": {
    "lastUpdated": "2019-10-23T18:35:23.985+00:00"
  },
  "type": "searchset",
  "total": 0,
  "link": [
    {
      "relation": "self",
      "url": "https://api.logicahealth.org/stratfhireducation/open/Patient?_id=fake"
    }
  ]
}
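
An empty result like the one above has no "entry" key at all, so indexing `bundle['entry'][0]` would raise a `KeyError`. The sketch below shows one way to guard against that, assuming the bundle has already been parsed into a Python dictionary. The helper name `first_resource` and the two sample bundles are illustrative only. Checking for "entry" rather than "total" is a deliberate choice here, since the FHIR specification makes `Bundle.total` optional.

```python
import json

def first_resource(bundle):
    """Return the first resource in a search Bundle, or None if it is empty."""
    entries = bundle.get('entry', [])
    if not entries:
        return None
    return entries[0].get('resource')

# Illustrative bundles: one empty search result and one with a single Patient.
empty = json.loads('{"resourceType": "Bundle", "type": "searchset", "total": 0}')
found = json.loads(
    '{"resourceType": "Bundle", "type": "searchset", "total": 1, '
    '"entry": [{"resource": {"resourceType": "Patient", "id": "422"}}]}'
)

print(first_resource(empty))
print(first_resource(found)['id'])
```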


FHIR resources are in the JSON format. This means that all data fields have a key and a value. However, be aware that some values can be lists of additional key:value pairs. You can identify lists because they use square brackets [ ]. We will print out the result of a valid query so you can get an idea for what this looks like.

In [8]:
#Available patient IDs: 422, 378, 338, 672, 260, 588, 538, 191, 467, 3
patient_id = '422'
r = requests.get(base + 'Patient?_id=' + patient_id)
#print(r.text)

Remember, if the value of a key is a list, you must specify which item in the list you want. As an example, we will extract the actual patient resource from the bundle resource which is the default return type for server queries. Since this is the only patient resource with the ID we used, there is only one item in the "entry" list. This means we want to take the item at position 0 of the "entry" list (the 1st item). 

Python currently sees our response as a string. Before we can extract fields we must parse it into a Python dictionary using the json.loads() command. 

In [9]:
bundle = json.loads(r.text)
p = bundle['entry'][0]['resource']
#print(p)

We can extract the gender of the patient using a similar command.

In [10]:
gender = p['gender']
print(gender)

male


The 'name' field can be confusing at first. The 'given' field (first name) is usually a list while the 'family' field (last name) is usually just a string. The numbers in this example are common to all patients on this server since it's synthetic data.

In [11]:
first = p['name'][0]['given'][0]
last = p['name'][0]['family']
name = first + ' ' + last
print(name)

Dmitri 1Fram


Age will also present some difficulty since we are only provided with the date of birth. Thankfully, Python's datetime library has some helpful tools for working with dates. We can extract the birth year by splitting the string on '-' and selecting the first item in the resulting list. Then, if we convert that to an integer, we can calculate the age.

In [12]:
birth_year = p['birthDate'].split('-')[0]
current_year = datetime.date.today().year
age = current_year - int(birth_year)
print(age)

37
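
Subtracting years alone overstates the age by one whenever this year's birthday has not yet occurred. A minimal sketch of a more precise calculation follows, assuming `birthDate` is a full `YYYY-MM-DD` string (FHIR also allows partial dates such as `YYYY`, which this sketch does not handle). The function name `age_on` and the sample dates are hypothetical.

```python
import datetime

def age_on(birth_date, today):
    """Age in whole years for a full YYYY-MM-DD birth date string."""
    born = datetime.date.fromisoformat(birth_date)
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - before_birthday

# Hypothetical dates: the day before and the day of a 37th birthday.
print(age_on('1982-10-24', datetime.date(2019, 10, 23)))  # 36
print(age_on('1982-10-23', datetime.date(2019, 10, 23)))  # 37
```

Passing `today` as an argument, rather than calling `datetime.date.today()` inside the function, keeps the result reproducible when testing.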


#### Observation resource - smoking status, systolic BP, total cholesterol, HDL cholesterol

Observation resources are very generic and can represent many different things. To make sure that everyone takes a standardized approach to using observations, profiles have been created to constrain how observations are used to represent clinical concepts.

A good first step when building the query for a concept you haven't worked with before is to consult the US Core Profile [guide](https://build.fhir.org/ig/HL7/US-Core-R4/). The guidelines change depending on the FHIR version you are using. This example uses the R4 version of FHIR.

If we look into the guide we'll see that there is a defined SmokingStatus [profile](https://build.fhir.org/ig/HL7/US-Core-R4/StructureDefinition-us-core-smokingstatus.html). As we explore the profile we find that a SmokingStatus observation should be given the LOINC code 72166-2. We'll also see that the possible values for each SmokingStatus observation are constrained to a ValueSet of the possible statuses: Current every day smoker, Current some day smoker, Former smoker, Never smoker, Smoker - current status unknown, Unknown if ever smoked, Current Heavy tobacco smoker, Current Light tobacco smoker.

We can then query for observations whose 'subject' equals the patient ID we already used to pull the patient resource and whose 'code' is 72166-2. Note that the query does not request any particular ordering, so reading the first entry assumes the server returns the relevant observation first.

If the SmokingStatus observation is 'Never smoker' then we will give the patient a 'No' value for the calculator. Otherwise, we will assume that they are a smoker and will give a 'Yes' value.

In [13]:
url = base + 'Observation?subject=' + patient_id + '&code=72166-2'
smoker = 1

#Send the query
r = requests.get(url)
#Create a JSON object of the result
bundle = json.loads(r.text)
#Check to make sure a resource was returned
if bundle['total'] != 0:
    #Parse the contents of that resource
    status = bundle['entry'][0]['resource']['valueCodeableConcept']['text']
    if status == 'Never smoker':
        smoker = 0
    else:
        print(status)
else:
    print('None found')
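
To ask the server explicitly for the newest observation, the standard FHIR search parameters `_sort` and `_count` can be added to the query. The sketch below builds such a URL and applies the same smoker rule as the cell above. It assumes the server honors `_sort=-date`, which has not been verified for this sandbox. The helper names `latest_observation_url` and `smoker_flag` are hypothetical.

```python
from urllib.parse import urlencode

base = 'https://api.logicahealth.org/stratfhireducation/open/'

def latest_observation_url(patient_id, loinc_code):
    """Build a search URL that asks for only the newest matching Observation."""
    params = {
        'subject': patient_id,
        'code': loinc_code,
        '_sort': '-date',  # newest first (assumes the server supports sorting)
        '_count': 1,       # one result per page
    }
    return base + 'Observation?' + urlencode(params)

def smoker_flag(status_text):
    """Return 0 only for 'Never smoker'; treat every other status as 1."""
    return 0 if status_text.strip().lower() == 'never smoker' else 1

print(latest_observation_url('422', '72166-2'))
print(smoker_flag('Never smoker'), smoker_flag('Former smoker'))
```

Comparing the status case-insensitively avoids a silent mismatch if a server capitalizes the text differently.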

Blood pressure is part of the Vital Signs Panel [profile](http://hl7.org/fhir/R4/observation-vitalsigns.html). From that profile we see that there is a defined structure for representing blood pressure levels. We will query for the patient's blood pressure observation and extract the systolic measurement, which is stored as a component of that observation.

In [14]:
url = base + 'Observation?subject=' + patient_id + '&code=55284-4'
sys = ''
#Send the query
r = requests.get(url)
#Create a JSON object of the result
bundle = json.loads(r.text)
#Check to make sure a resource was returned
if bundle['total'] != 0:
    #Parse the contents of that resource
    for component in bundle['entry'][0]['resource']['component']:
        #Check for the systolic value (we don't want diastolic)
        if component['code']['text'] == 'Systolic Blood Pressure':
            sys = component['valueQuantity']['value']
            print(sys)
else:
    print('None found')

125
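
Matching on the component's display text works for this server, but the text is free-form and may differ elsewhere. A sketch of an alternative is to match on the LOINC code for systolic blood pressure (8480-6) inside `code.coding`. The sample observation below is illustrative and trimmed to the fields the function reads. The helper name `systolic_from_observation` is hypothetical.

```python
SYSTOLIC_LOINC = '8480-6'

def systolic_from_observation(observation):
    """Return the systolic value from a blood pressure Observation, or None."""
    for component in observation.get('component', []):
        codings = component.get('code', {}).get('coding', [])
        if any(c.get('code') == SYSTOLIC_LOINC for c in codings):
            return component['valueQuantity']['value']
    return None

# Illustrative observation with a systolic and a diastolic component.
sample = {
    'resourceType': 'Observation',
    'component': [
        {'code': {'coding': [{'system': 'http://loinc.org', 'code': '8480-6'}]},
         'valueQuantity': {'value': 125, 'unit': 'mmHg'}},
        {'code': {'coding': [{'system': 'http://loinc.org', 'code': '8462-4'}]},
         'valueQuantity': {'value': 80, 'unit': 'mmHg'}},
    ],
}

print(systolic_from_observation(sample))
```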


Cholesterol values are stored together as part of a single lipid panel, as described in the lipid lab report [profile](https://www.hl7.org/fhir/lipidprofile.html). The actual results are stored as Observation resources. We will search for the most recent values.

In [15]:
#Total cholesterol
url = base + 'Observation?subject=' + patient_id + '&code=2093-3'
total_chol = ''
#Send the query
r = requests.get(url)
#Create a JSON object of the result
bundle = json.loads(r.text)
#Check to make sure a resource was returned
if bundle['total'] != 0:
    #Parse the contents of that resource
    total_chol = bundle['entry'][0]['resource']['valueQuantity']['value']
    print(total_chol)
else:
    print('None found')

200


In [16]:
#HDL cholesterol
url = base + 'Observation?subject=' + patient_id + '&code=2085-9'
hdl_chol = ''
#Send the query
r = requests.get(url)
#Create a JSON object of the result
bundle = json.loads(r.text)
#Check to make sure a resource was returned
if bundle['total'] != 0:
    #Parse the contents of that resource
    hdl_chol = bundle['entry'][0]['resource']['valueQuantity']['value']
    print(hdl_chol)
else:
    print('None found')

58


#### MedicationRequest resource - Is the patient's blood pressure being treated with medicines? 

This last piece of needed information will give a good introduction to working with patient medications. There are a few resources which each represent different aspects of how medications are used in a clinical setting. They are each described in the [R4 Medications Module](http://hl7.org/fhir/medications-module.html). A MedicationRequest resource is generated when the provider creates the instruction for a patient to take a certain medication. As you can imagine, this is usually the most convenient way to see which medications a patient should be taking or should have taken at some point in time. Whether or not the patient actually takes the medication is another story. We will assume that if a MedicationRequest resource exists for a certain medication, the patient took that medication as prescribed.

We will also need to determine whether a given medication is being used to treat blood pressure or not. To do this we will use a ValueSet resource. Explained [here](https://www.hl7.org/fhir/valueset.html), these are handy resources for constraining possible values which, in our case, are blood pressure medications. A ValueSet resource was created for this example and is stored on the server. It contains 5 values for possible blood pressure medications which are common to our synthetic patients. We will query the ValueSet resource by its title, "blood-pressure-medications", and will extract the associated medication codes. We will then build our URL from this ValueSet to search for any MedicationRequest resources with codes in that ValueSet.

In [17]:
#ValueSet
url = base + 'MedicationRequest?subject=' + patient_id + '&code='
r = requests.get(base + 'ValueSet?title=blood-pressure-medications')
bundle = json.loads(r.text)
valueset = bundle['entry'][0]['resource']['expansion']['contains']
for meds in valueset:
    #Add each code to the URL we will use for our MedicationRequest query
    url += meds['code'] + ','
#Take out trailing comma
url = url[:-1]

#MedicationRequest
treated = 0
#Send the query
r = requests.get(url)
#Create a JSON object of the result
bundle = json.loads(r.text)
#If any resources were returned, then they have been treated
if bundle['total'] > 0:
    treated = 1
    print(treated)
else:
    print('None found')

1
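
The comma-separated code list can also be built with `str.join`, which avoids trimming a trailing comma afterwards. The sketch below assumes the ValueSet expansion has the same shape as in the cell above. The codes `code-a` and `code-b` are placeholders, not real medication codes, and the helper name `medication_request_url` is hypothetical.

```python
def medication_request_url(base, patient_id, contains):
    """Build a MedicationRequest search URL from a ValueSet expansion list."""
    codes = [item['code'] for item in contains]
    # A comma between values means "match any of these codes" in a FHIR search.
    return base + 'MedicationRequest?subject=' + patient_id + '&code=' + ','.join(codes)

# Placeholder expansion entries standing in for the real ValueSet contents.
contains = [{'code': 'code-a'}, {'code': 'code-b'}]
url = medication_request_url(
    'https://api.logicahealth.org/stratfhireducation/open/', '422', contains
)
print(url)
```

With an empty expansion this version leaves `&code=` intact, whereas slicing off the last character would remove the `=` instead.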


### Putting it all together

The last step is to send each of the variables which we have collected through FHIR resources to the calculating functions. To do this, we will create functions for each of the processes described above (so that you don't need to run each cell in this notebook sequentially) and will call the appropriate calculating function. 

In [3]:
import requests,json,datetime,pandas,math

base = 'https://api.logicahealth.org/stratfhireducation/open/'
patient_ID = '422'
male_avg = {range(30,35):'1%',range(35,45):'4%',range(45,50):'8%',range(50,55):'10%',range(55,60):'13%',range(60,65):'20%',range(65,70): '22%',range(70,75):'25%'}
female_avg = {range(30,35):'<1%',range(35,40):'<1%',range(40,45):'1%',range(45,50):'2%',range(50,55):'3%',range(55,60):'7%',range(60,65):'8%',range(65,70): '8%',range(70,75):'11%'}

def launch():
    result = ''
    d = demographics()
    if d[0] == 'error':
        print('Malformed patient resource for: ' + patient_ID)
        return
    name = d[0]
    gender = d[1]
    age = d[2]
    result += 'Name:\t\t\t' + d[0]
    result += '\nGender:\t\t\t' + d[1]
    result += '\nAge:\t\t\t' + str(d[2])
    smoker = smoking_status()
    result += '\nSmoking status:\t\t' + str(smoker)
    sys = systolic()
    if sys == 'error':
        print('Systolic BP not found for patient: ' + patient_ID)
        return
    result += '\nSystolic BP:\t\t' + str(sys)
    c = cholesterol()
    if c[0] == 'error':
        print('Total Cholesterol or HDL Cholesterol not found for patient: ' + patient_ID)
        return
    total_chol = c[0]
    hdl_chol = c[1]
    result += '\nTotal Cholesterol:\t' + str(c[0])
    result += '\nHDL Cholesterol:\t' + str(c[1])  
    treated = bp_med_status()
    if treated == 1:
        result += '\nTreated with BP meds:\tYES'
    else:
        result += '\nTreated with BP meds:\tNO'
    #Calculate risk
    r = ''
    if gender == 'female':
        r = female_fram(age,smoker,total_chol,hdl_chol,sys,treated)
    else:
        r = male_fram(age,smoker,total_chol,hdl_chol,sys,treated)
    risk = r[0]
    if risk == 'old':
        print('Risk can only be calculated for ages 30-79\n Age: ' + str(age))
        return
    result += '\n--------RESULT--------'
    avg = r[1]
    if avg == 'old':
        result += '\n10-year risk of MI or death: ' + str(round(risk, 2)) + ' (No average reference above 74 years old)'
    else:
        result += '\n10-year risk of MI or death: ' + str(round(risk, 2)) + '  (Average: ' + avg + ')'
    print(result)

def demographics(): 
    r = requests.get(base + 'Patient?_id=' + patient_ID)
    bundle = json.loads(r.text)
    try:
        p = bundle['entry'][0]['resource']
        #Name
        first = p['name'][0]['given'][0]
        last = p['name'][0]['family']
        name = first + ' ' + last
        #Gender
        gender = p['gender']
        #Age
        birth_year = p['birthDate'].split('-')[0]
        current_year = datetime.date.today().year
        age = current_year - int(birth_year)
        return [name,gender,age]
    except Exception:
        return ['error']

def smoking_status():
    url = base + 'Observation?subject=' + patient_ID + '&code=72166-2'
    smoker = 1
    #Send the query
    r = requests.get(url)
    #Create a JSON object of the result
    bundle = json.loads(r.text)
    #Check to make sure a resource was returned
    if bundle['total'] != 0:
        #Parse the contents of that resource
        status = bundle['entry'][0]['resource']['valueCodeableConcept']['text']
        if status == 'Never smoker':
            smoker = 0
    return smoker

def systolic():
    url = base + 'Observation?subject=' + patient_ID + '&code=55284-4'
    sys = ''
    #Send the query
    r = requests.get(url)
    #Create a JSON object of the result
    bundle = json.loads(r.text)
    #Check to make sure a resource was returned
    if bundle['total'] != 0:
        #Parse the contents of that resource
        for component in bundle['entry'][0]['resource']['component']:
            #Check for the systolic value (we don't want diastolic)
            if component['code']['text'] == 'Systolic Blood Pressure':
                sys = component['valueQuantity']['value']
                return sys
    #No blood pressure observation or no systolic component was found
    return 'error'
    
def cholesterol():
    #Total cholesterol
    url = base + 'Observation?subject=' + patient_ID + '&code=2093-3'
    total_chol = ''
    #Send the query
    r = requests.get(url)
    #Create a JSON object of the result
    bundle = json.loads(r.text)
    #Check to make sure a resource was returned
    if bundle['total'] != 0:
        #Parse the contents of that resource
        total_chol = bundle['entry'][0]['resource']['valueQuantity']['value']
    else:
        return ['error']
    #HDL cholesterol
    url = base + 'Observation?subject=' + patient_ID + '&code=2085-9'
    hdl_chol = ''
    #Send the query
    r = requests.get(url)
    #Create a JSON object of the result
    bundle = json.loads(r.text)
    #Check to make sure a resource was returned
    if bundle['total'] != 0:
        #Parse the contents of that resource
        hdl_chol = bundle['entry'][0]['resource']['valueQuantity']['value']
    else:
        return ['error'] 
    return [total_chol,hdl_chol]


def bp_med_status():
    #ValueSet
    url = base + 'MedicationRequest?subject=' + patient_ID + '&code='
    r = requests.get(base + 'ValueSet?title=blood-pressure-medications')
    bundle = json.loads(r.text)
    for meds in bundle['entry'][0]['resource']['expansion']['contains']:
        #Add each code to the URL we will use for our MedicationRequest queries
        url += meds['code'] + ','
    #Take out trailing comma
    url = url[:-1]
    #MedicationRequest
    treated = 0
    #Send the query
    r = requests.get(url)
    #Create a JSON object of the result
    bundle = json.loads(r.text)
    #If any resources were returned, then they have been treated
    if bundle['total'] != 0:
        treated = 1
    return treated

def male_fram(age,smoker,total_chol,hdl_chol,sys_bp,treated):
    if age < 30 or age > 79:
        return ['old']
    if age > 70:
        age_smoke = 70
    else:
        age_smoke = age
            
    l = 52.000961*math.log(age) + 20.014077*math.log(total_chol) + -0.905964*math.log(hdl_chol) + \
    1.305784*math.log(sys_bp) + 0.241549*treated + 12.096316*smoker + -4.605038*math.log(age)*math.log(total_chol) + \
    -2.84367*math.log(age_smoke)*smoker + -2.93323*math.log(age)*math.log(age) - 172.300168
    
    prob = 1 - 0.9402**math.exp(l)
    if age >= 75:
        return [prob*100,'old']
    else:
        for group in male_avg:
            if age in group:
                return [prob*100, male_avg[group]]
            
def female_fram(age,smoker,total_chol,hdl_chol,sys_bp,treated):
    if age < 30 or age > 79:
        return ['old']
    if age > 78:
        age_smoke = 78
    else:
        age_smoke = age
            
    l = 31.764001*math.log(age) + 22.465206*math.log(total_chol) + -1.187731*math.log(hdl_chol) + \
    2.552905*math.log(sys_bp) + 0.420251*treated + 13.07543*smoker + -5.060998*math.log(age)*math.log(total_chol) + \
    -2.996945*math.log(age_smoke)*smoker - 146.5933061
    
    prob = 1 - 0.98767**math.exp(l)
    if age >= 75:
        return [prob*100,'old']
    else:
        for group in female_avg:
            if age in group:
                return [prob*100, female_avg[group]]
            
launch()

Name:			Dmitri 1Fram
Gender:			male
Age:			37
Smoking status:		0
Systolic BP:		125
Total Cholesterol:	200
HDL Cholesterol:	58
Treated with BP meds:	YES
--------RESULT--------
10-year risk of MI or death: 0.86  (Average: 4%)
