# ICEES API Notebook

## https://icees.renci.org/apidocs/

<div class="alert alert-block alert-success">
<b>NOTE: </b> 'verify = False' is passed in each call to the API to avoid an SSL error; it disables TLS certificate verification for that request.
    'requests.packages.urllib3.disable_warnings()' suppresses the warnings that 'verify = False' would otherwise trigger.
</div>

<div class="alert alert-block alert-success">
<b>NOTE: </b> We'll start with the "accept" : "text/tabular" header/output, as it is more human-readable.
    The "accept" : "application/json" header/output will be demonstrated in parts '1c' and '2c'
</div>

<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1a)</b> First, create a cohort using the '/cohort' endpoint, linked below. This will define
    COHORT:45
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 1a Inputs:</b> <br> version = 1.0.0, <br> table = patient, <br> year = 2010, <br> {"AvgDailyPM2.5Exposure": {
        "value": 2,
        "operator": ">"
      }}
</div>
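The request body can also be built from a Python dictionary and serialized with `json.dumps`, which avoids quoting mistakes in a hand-written JSON string. This is a minimal sketch; the helper name `feature_condition` is hypothetical and not part of the ICEES API.

```python
import json

def feature_condition(name, operator, value):
    """Return a one-feature condition in the shape used by the Part 1a inputs."""
    return {name: {"value": value, "operator": operator}}

# Same content as the Part 1a inputs, serialized to a JSON string.
payload_1a = json.dumps(feature_condition("AvgDailyPM2.5Exposure", ">", 2))
print(payload_1a)
```

The resulting string can be passed as `data=` in the same way as the hand-written `data_1a` string.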

In [1]:
import requests
import json

requests.packages.urllib3.disable_warnings()

tabular_headers = {"Content-Type" : "application/json", "accept": "text/tabular"}

data_1a = '{"AvgDailyPM2.5Exposure": { "value": 2, "operator": ">" }}'

response_1a = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort', data=data_1a, headers = tabular_headers, verify = False)

output_1a = response_1a.text
print(output_1a)

The Translator Integrated Clinical and Environmental Exposures Service (ICEES) is providing you with Data that have been de-identified in accordance with 45 C.F.R. §§ 164.514(a) and (b) and that UNC Health Care System (UNCHCS) is permitted to provide under 45 C.F.R. § 164.502(d)(2). Recipient agrees to notify UNCHCS via NC TraCS in the event that Recipient receives any identifiable data in error and to take such measures to return the identifiable data and/or destroy it at the direction of UNCHCS.

Restrictions on Recipient's Use of Data. Recipient further agrees to use the data exclusively for the purposes and functionalities provided by the ICEES: cohort discovery; feature-rich cohort discovery; hypothesis-driven queries; and exploratory queries. Recipient agrees to use appropriate safeguards to protect the Data from misuse and unauthorized access or disclosure. Recipient will report to UNCHCS any unauthorized access, use, or disclosure of the Data not provided for by the Servic

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1b)</b> With COHORT:45 (n = 21492) defined, let's look at the '/associations_to_all_features' endpoint/functionality for this cohort
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 1b Inputs:</b> <br>version = 1.0.0, <br>table = patient, 
<br>year = 2010, <br>cohort_id = COHORT:45,
<br>a feature variable and maximum p value = {"feature":{"TotalEDInpatientVisits":{"operator":"&lt;", "value":2}}, "maximum_p_value":0.1}
</div>
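The cohort id appears in the URL path, so the colon in `COHORT:45` has to be percent-encoded as `%3A`. The sketch below does this with the standard library; `associations_url` is a hypothetical helper, and the path layout is simply copied from the calls used in this notebook.

```python
from urllib.parse import quote

BASE_URL = "https://icees.renci.org"

def associations_url(version, table, year, cohort_id):
    # safe="" makes quote() encode ":" as well, giving COHORT%3A45
    encoded_id = quote(cohort_id, safe="")
    return f"{BASE_URL}/{version}/{table}/{year}/cohort/{encoded_id}/associations_to_all_features"

print(associations_url("1.0.0", "patient", 2010, "COHORT:45"))
```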

In [2]:
data_1b = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}'
response_1b = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A45/associations_to_all_features', headers = tabular_headers, data=data_1b, verify = False)
output_1b = response_1b.text
output_1b_disclaimer_removed = output_1b.split('Green Team members who contributed to the work.')[1]
print(output_1b_disclaimer_removed)



+-----------------------+------------------------------+-------------------------------+---------+
| feature               | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AgeStudyStart = 0-2   | 1705   94.41%                | 101    5.59%                  | 1806    |
|                       | 9.52%  7.93%                 | 2.81%  0.47%                  | 8.40%   |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 3-17  | 3319    96.20%               | 131    3.80%                  | 3450    |
|                       | 18.54%  15.44%               | 3.65%  0.61%                  | 16.05%  |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 18-34 | 2434    73.05%               | 898     26.95%                | 3332    |
|                       | 13.60%  11.33%               | 25.01%  4.18%                 | 15.50%  |
+-------
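The cell above drops the disclaimer by splitting on its final sentence and taking index `[1]`, which raises an `IndexError` if that sentence is ever missing from the response. A more defensive sketch uses `str.partition`; the function name and the sample text are made up for illustration.

```python
DISCLAIMER_END = "Green Team members who contributed to the work."

def strip_disclaimer(text, marker=DISCLAIMER_END):
    """Return the text after the marker, or the whole text if the marker is absent."""
    before, found, after = text.partition(marker)
    return after if found else text

# Made-up stand-in for a tabular response body.
sample = "Some disclaimer. " + DISCLAIMER_END + "\n+---+\n| a |\n+---+"
print(strip_disclaimer(sample))
```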

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1c)</b> The above COHORT:45 output format, produced by using the "accept":"text/tabular" header, is human-readable but more difficult to manipulate programmatically. <br><br>Now, we change the header to "accept":"application/json" to retrieve the same results as a Python dictionary.<br><br>This output type is useful, for example, for looking at medication data in a raw format, printed below:
</div>

In [3]:
json_headers = {"Content-Type" : "application/json", "accept": "application/json"}

data_1c = data_1b

response_1c_json = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A45/associations_to_all_features', headers = json_headers, data=data_1c, verify = False)
output_1c_json = response_1c_json.json()

In [4]:
asthma_drug_list = ['Prednisone', 'Mometasone', 'Salmeterol', 'Budesonide', 'Albuterol', 'Diphenhydramine', 'Cetirizine', 'Ipratropium']

medication_data_list_1 = []
return_value_1c = output_1c_json['return value']

print('List of Medication features:')
print()
for section in return_value_1c:
    feature_name = section['feature_b']['feature_name']
    if feature_name in asthma_drug_list:
        medication_data_list_1.append(section)
        print('Medication:', feature_name)
        print()
        print(section)
        print()

List of Medication features:

Medication: Prednisone

{'feature_b': {'feature_qualifiers': [{'value': 0, 'operator': '='}, {'value': 1, 'operator': '='}], 'feature_name': 'Prednisone'}, 'columns': [{'percentage': 0.8329611018053229, 'frequency': 17902}, {'percentage': 0.1670388981946771, 'frequency': 3590}], 'feature_a': {'feature_qualifiers': [{'value': 2, 'operator': '<'}, {'value': 2, 'operator': '>='}], 'feature_name': 'TotalEDInpatientVisits'}, 'rows': [{'percentage': 0.8951702959240647, 'frequency': 19239}, {'percentage': 0.10482970407593523, 'frequency': 2253}], 'p_value': 1.1180903063022454e-23, 'feature_matrix': [[{'total_percentage': 0.7538153731621068, 'row_percentage': 0.8420915848017049, 'column_percentage': 0.9049826834990504, 'frequency': 16201}, {'total_percentage': 0.14135492276195794, 'row_percentage': 0.15790841519829513, 'column_percentage': 0.8462395543175487, 'frequency': 3038}], [{'total_percentage': 0.07914572864321608, 'row_percentage': 0.7549933422103862, 'col

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1d)</b> Examine the first entry in the list above, Prednisone. For the two Prednisone groups, control and medicated, compare the percentage of patients with TotalEDInpatientVisits &lt; 2, as follows.</div>

In [5]:
print()
prednisone_control_group_ED_under_2 = medication_data_list_1[0]['feature_matrix'][0][0]['row_percentage']
prednisone_control_group_ED_under_2_percentage = round(prednisone_control_group_ED_under_2*100, 2)
print('Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone control group:', prednisone_control_group_ED_under_2_percentage)
print()
prednisone_medicated_group_ED_under_2 = medication_data_list_1[0]['feature_matrix'][1][0]['row_percentage']
prednisone_medicated_group_ED_under_2_percentage = round(prednisone_medicated_group_ED_under_2*100, 2)
print('Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone medicated group:', prednisone_medicated_group_ED_under_2_percentage)


Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone control group: 84.21

Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone medicated group: 75.5
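Indexing `medication_data_list_1[0]` assumes that Prednisone is the first match in the response. The sketch below looks the entry up by name instead; `find_feature` is a hypothetical helper, and `sample` is a trimmed stand-in that keeps only the `row_percentage` values visible in the Prednisone output above.

```python
def find_feature(sections, feature_name):
    """Return the first section whose feature_b is named feature_name, else None."""
    for section in sections:
        if section["feature_b"]["feature_name"] == feature_name:
            return section
    return None

# Trimmed stand-in for medication_data_list_1 (only the fields used here).
sample = [{
    "feature_b": {"feature_name": "Prednisone"},
    "feature_matrix": [
        [{"row_percentage": 0.8420915848017049}, {"row_percentage": 0.15790841519829513}],
        [{"row_percentage": 0.7549933422103862}],
    ],
}]

prednisone = find_feature(sample, "Prednisone")
# Row 0 is the control group, row 1 the medicated group; column 0 is TotalEDInpatientVisits < 2.
control = round(prednisone["feature_matrix"][0][0]["row_percentage"] * 100, 2)
medicated = round(prednisone["feature_matrix"][1][0]["row_percentage"] * 100, 2)
print(control, medicated)
```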


## Above, we have a programmatic way to pull details from the ICEES API call and produce scripts for looking at and comparing those details.

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2a)</b> Now, define a second cohort with "MaxDailyPM2.5Exposure > 2" whereas in the previous example
    we were looking at "AvgDailyPM2.5Exposure > 2". This second cohort will be COHORT:46
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 2a Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>{"MaxDailyPM2.5Exposure": {
        "value": 2,
        "operator": ">"
      }}
</div>

In [6]:
data_2a = '{"MaxDailyPM2.5Exposure": {"value": 2, "operator": ">" }}'

response_2a = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort', data=data_2a, headers = tabular_headers, verify = False)

output_2a = response_2a.text
print(output_2a)

The Translator Integrated Clinical and Environmental Exposures Service (ICEES) is providing you with Data that have been de-identified in accordance with 45 C.F.R. §§ 164.514(a) and (b) and that UNC Health Care System (UNCHCS) is permitted to provide under 45 C.F.R. § 164.502(d)(2). Recipient agrees to notify UNCHCS via NC TraCS in the event that Recipient receives any identifiable data in error and to take such measures to return the identifiable data and/or destroy it at the direction of UNCHCS.

Restrictions on Recipient's Use of Data. Recipient further agrees to use the data exclusively for the purposes and functionalities provided by the ICEES: cohort discovery; feature-rich cohort discovery; hypothesis-driven queries; and exploratory queries. Recipient agrees to use appropriate safeguards to protect the Data from misuse and unauthorized access or disclosure. Recipient will report to UNCHCS any unauthorized access, use, or disclosure of the Data not provided for by the Servic

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2b)</b> With COHORT:46 (n = 4950) defined, let's also look at the '/associations_to_all_features' endpoint/functionality for this cohort. The output here (as opposed to 1b, above) will be limited to the first 1000 characters, for brevity.
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 2b Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:46,
    <br>{"feature":{"TotalEDInpatientVisits":{"operator":"&lt;", "value":2}},"maximum_p_value":0.1}
</div>

In [7]:
data_2b = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}'
response_2b = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A46/associations_to_all_features', headers = tabular_headers, data=data_2b, verify = False)

In [8]:
output_2b = response_2b.text
output_2b_disclaimer_removed = output_2b.split('Green Team members who contributed to the work.')[1]
print(output_2b_disclaimer_removed[0:1000])



+-----------------------+------------------------------+-------------------------------+---------+
| feature               | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AgeStudyStart = 0-2   | 392    95.15%                | 20     4.85%                  | 412     |
|                       | 9.90%  7.92%                 | 2.02%  0.40%                  | 8.32%   |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 3-17  | 717     95.86%               | 31     4.14%                  | 748     |
|                       | 18.12%  14.48%               | 3.12%  0.63%                  | 15.11%  |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 18-34 | 572     66.74%               | 285     33.26%                | 857     |
|       
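Slicing by character count, as in `[0:1000]`, can cut the table in the middle of a row. If whole rows are preferred, the text can be limited by line count instead. This is only a sketch; `first_lines` and the sample table are hypothetical.

```python
def first_lines(text, max_lines):
    """Keep at most max_lines complete lines of text."""
    return "\n".join(text.splitlines()[:max_lines])

# Made-up table used only to show the truncation.
sample_table = "+---+---+\n| a | b |\n+---+---+\n| 1 | 2 |\n+---+---+"
print(first_lines(sample_table, 3))
```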


<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2c)</b> Request the same associations for COHORT:46 again, this time with the "accept":"application/json" header to get the dictionary output format. We'll skip printing the list of medication features (as was previously demonstrated in section 1c). </div>

In [9]:
json_headers = {"Content-Type" : "application/json", "accept": "application/json"}
data_2c = data_2b
response_2c_json = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A46/associations_to_all_features', headers = json_headers, data=data_2c, verify = False)

In [10]:
output_2c_json = response_2c_json.json()
asthma_drug_list = ['Prednisone', 'Mometasone', 'Salmeterol', 'Budesonide', 'Albuterol', 'Diphenhydramine', 'Cetirizine', 'Ipratropium']

medication_data_list_2 = []
return_value_2c = output_2c_json['return value'] #modified for section 2c, from section 1c

for section in return_value_2c:     
    feature_name = section['feature_b']['feature_name']
    if feature_name in asthma_drug_list:
        medication_data_list_2.append(section)

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2d)</b> Analogous to section 1d, make the same Prednisone control/medicated comparison with COHORT:46 </div>

In [11]:
print()
prednisone_control_group_ED_under_2 = medication_data_list_2[0]['feature_matrix'][0][0]['row_percentage']
prednisone_control_group_ED_under_2_percentage = round(prednisone_control_group_ED_under_2*100, 2)
print('Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone control group:', prednisone_control_group_ED_under_2_percentage)
print()
prednisone_medicated_group_ED_under_2 = medication_data_list_2[0]['feature_matrix'][1][0]['row_percentage']
prednisone_medicated_group_ED_under_2_percentage = round(prednisone_medicated_group_ED_under_2*100, 2)
print('Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone medicated group:', prednisone_medicated_group_ED_under_2_percentage)


Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone control group: 81.18

Percentage of Patients in TotalEDInpatientVisits < 2, Prednisone medicated group: 70.27


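## For both cohorts, COHORT:45 (AvgDailyPM2.5Exposure > 2) and COHORT:46 (MaxDailyPM2.5Exposure > 2), the percentage of patients with 'TotalEDInpatientVisits &lt; 2' is about 10 percentage points lower in the Prednisone medicated group than in the control group (84.21% vs. 75.5% and 81.18% vs. 70.27%, respectively).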
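The comparison can be made explicit as a difference in percentage points, using the four values printed in sections 1d and 2d. The dictionary layout below is just one convenient way to hold them.

```python
# Row percentages printed in sections 1d and 2d, as (control, medicated).
results = {
    "COHORT:45": (84.21, 75.5),
    "COHORT:46": (81.18, 70.27),
}

# Control minus medicated, in percentage points.
differences = {
    cohort: round(control - medicated, 2)
    for cohort, (control, medicated) in results.items()
}
print(differences)
```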

<br> <div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>3)</b> Consider a cohort of all patients, COHORT:22. We will now try out the '/feature_association' endpoint
    using this full cohort.
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 3 Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:22, <br>"feature_a":{"TotalEDInpatientVisits":{"operator":"&lt;", "value":2}}, <br>"feature_b":{"AvgDailyPM2.5Exposure":{"operator":"&lt;", "value":3}}
</div>

In [12]:
data_3 = '{"feature_a":{"TotalEDInpatientVisits":{"operator":"<", "value":2}}, "feature_b":{"AvgDailyPM2.5Exposure":{"operator":"<", "value":3}}}'

response_3 = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A22/feature_association', headers = tabular_headers, data=data_3, verify = False)

output_3 = response_3.text

output_3_disclaimer_removed = output_3.split('Green Team members who contributed to the work.')[1]
print(output_3_disclaimer_removed)



+----------------------------+------------------------------+-------------------------------+---------+
| feature                    | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AvgDailyPM2.5Exposure < 3  | 1474   92.07%                | 127    7.93%                  | 1601    |
|                            | 7.61%  6.38%                 | 3.42%  0.55%                  | 6.93%   |
+----------------------------+------------------------------+-------------------------------+---------+
| AvgDailyPM2.5Exposure >= 3 | 17902   83.30%               | 3590    16.70%                | 21492   |
|                            | 92.39%  77.52%               | 96.58%  15.55%                | 93.07%  |
+----------------------------+------------------------------+-------------------------------+---------+
|                            | 19376                        | 3717                          | 23093   |
|                            | 83.90%                       | 
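Each cell of the tabular output holds four numbers: the frequency and the row percentage on the first line, then the column and total percentages on the second. The sketch below recomputes them from the four frequencies in the table above; `cell_percentages` is a hypothetical helper, not an ICEES function.

```python
def cell_percentages(matrix):
    """For a table of counts, return frequency plus row/column/total percentages per cell."""
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    grand_total = sum(row_totals)
    return [
        [
            {
                "frequency": count,
                "row_percentage": round(100 * count / row_totals[i], 2),
                "column_percentage": round(100 * count / col_totals[j], 2),
                "total_percentage": round(100 * count / grand_total, 2),
            }
            for j, count in enumerate(row)
        ]
        for i, row in enumerate(matrix)
    ]

# Rows: AvgDailyPM2.5Exposure < 3, >= 3; columns: TotalEDInpatientVisits < 2, >= 2.
counts = [[1474, 127], [17902, 3590]]
cells = cell_percentages(counts)
print(cells[0][0])
```

The field names mirror those in the JSON `feature_matrix` entries, although the JSON output reports fractions rather than percentages.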

<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>4)</b> Again using COHORT:22, take a look at the endpoint '/associations_to_all_features'
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 4 Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:22,
        <br>{"feature":{"TotalEDInpatientVisits":{"operator":"&lt;", "value":2}},"maximum_p_value":0.1}
</div>

In [13]:
data_4 = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}'

response_4 = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A22/associations_to_all_features', headers = tabular_headers, data=data_4, verify = False)

output_4 = response_4.text

output_4_disclaimer_removed = output_4.split('Green Team members who contributed to the work.')[1]
print(output_4_disclaimer_removed[0:1000])



+-----------------------+------------------------------+-------------------------------+---------+
| feature               | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AgeStudyStart = 0-2   | 1908   94.78%                | 105    5.22%                  | 2013    |
|                       | 9.85%  8.26%                 | 2.82%  0.45%                  | 8.72%   |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 3-17  | 3614    96.14%               | 145    3.86%                  | 3759    |
|                       | 18.65%  15.65%               | 3.90%  0.63%                  | 16.28%  |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 18-34 | 2615    73.83%               | 927     26.17%                | 3542    |
|       
