# ICEES API Notebook - RENCI Green Team


<br>
<br>


<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1a)</b> First, create a cohort using the '/cohort' endpoint, linked below. This will define
    COHORT:45
</div>

https://icees.renci.org/apidocs/#/default/post__version___table___year__cohort

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 1a Inputs:</b> <br> version = 1.0.0, <br> table = patient, <br> year = 2010, <br> {"AvgDailyPM2.5Exposure": {
        "value": 2,
        "operator": ">"
      }}
</div>
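The request body listed above can also be assembled from a Python dict and serialized with `json.dumps`, which avoids hand-written line continuations inside a string literal. This is a minimal sketch; the helper names `feature_filter` and `cohort_url` are illustrative and not part of the ICEES API.

```python
import json

BASE_URL = "https://icees.renci.org"  # host used throughout this notebook


def feature_filter(name, operator, value):
    """Return the {feature: {"operator": ..., "value": ...}} mapping used in request bodies."""
    return {name: {"operator": operator, "value": value}}


def cohort_url(version, table, year):
    """Build the '/cohort' endpoint URL from the three path inputs."""
    return f"{BASE_URL}/{version}/{table}/{year}/cohort"


payload_1a = json.dumps(feature_filter("AvgDailyPM2.5Exposure", ">", 2))
url_1a = cohort_url("1.0.0", "patient", 2010)
print(url_1a)
print(payload_1a)
```

The resulting string can be passed as `data=` to `requests.post` in the same way as a hand-written JSON string.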

<div class="alert alert-block alert-success">
<b>NOTE: </b> 'verify = False' is passed in each call to the API to avoid an SSL error; it turns off TLS certificate verification for that request.
    'requests.packages.urllib3.disable_warnings()' suppresses the warnings that 'verify = False' would otherwise trigger.
</div>

<div class="alert alert-block alert-success">
<b>NOTE: </b> We'll start with the "accept" : "text/tabular" header/output, as it is more human-readable.
    The "accept" : "application/json" header/output will be demonstrated in parts '1c' and '2c'
</div>

In [31]:
import requests
requests.packages.urllib3.disable_warnings() 

tabular_headers = {"Content-Type" : "application/json", "accept": "text/tabular"}

data_1a = '{"AvgDailyPM2.5Exposure": { \
        "value": 2, \
        "operator": ">" \
      }}'

response_1a = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort', \
    data=data_1a, headers = tabular_headers, verify = False)

output_1a = response_1a.text
print(output_1a)

The Translator Integrated Clinical and Environmental Exposures Service (ICEES) is providing you with Data that have been de-identified in accordance with 45 C.F.R. §§ 164.514(a) and (b) and that UNC Health Care System (UNCHCS) is permitted to provide under 45 C.F.R. § 164.502(d)(2). Recipient agrees to notify UNCHCS via NC TraCS in the event that Recipient receives any identifiable data in error and to take such measures to return the identifiable data and/or destroy it at the direction of UNCHCS.

Restrictions on Recipient’s Use of Data. Recipient further agrees to use the data exclusively for the purposes and functionalities provided by the ICEES: cohort discovery; feature-rich cohort discovery; hypothesis-driven queries; and exploratory queries. Recipient agrees to use appropriate safeguards to protect the Data from misuse and unauthorized access or disclosure. Recipient will report to UNCHCS any unauthorized access, use, or disclosure of the Data not provided for by the Servic

<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1b)</b> With COHORT:45 (n = 21492) defined, let's look at the '/associations_to_all_features' endpoint/functionality for this cohort
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 1b Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:45,
    <br>{"feature":{"TotalEDInpatientVisits":{"operator":"<",
                                            "value":2}},"maximum_p_value":0.1}
</div>
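The cohort ID contains a colon, which appears percent-encoded (`COHORT%3A45`) in the request URL. A small sketch of producing that encoding with the standard library follows; the function name `associations_url` is an illustrative helper, not something ICEES provides.

```python
from urllib.parse import quote


def associations_url(version, table, year, cohort_id):
    """Build the '/associations_to_all_features' URL for one cohort."""
    # safe="" makes quote() percent-encode every reserved character, including ':'
    encoded_id = quote(cohort_id, safe="")
    return (
        f"https://icees.renci.org/{version}/{table}/{year}"
        f"/cohort/{encoded_id}/associations_to_all_features"
    )


url_1b = associations_url("1.0.0", "patient", 2010, "COHORT:45")
print(url_1b)
```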

<div class="alert alert-block alert-success">
<b>NOTE: </b> From here on, we strip the opening terms-of-use text from each response (using '.split()') once it is received, for brevity
</div>
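Splitting on the last sentence of the terms-of-use text works as long as that sentence is present in the response. A slightly more defensive variant, sketched here with `str.partition`, falls back to the full text when the marker is missing. The function name `strip_disclaimer` and the sample string are stand-ins for illustration; only the marker sentence comes from this notebook.

```python
DISCLAIMER_END = "Green Team members who contributed to the work."


def strip_disclaimer(text, marker=DISCLAIMER_END):
    """Return the text after the first occurrence of marker, or the text unchanged if absent."""
    head, found, tail = text.partition(marker)
    return tail if found else text


# Stand-in for a real response body: placeholder terms text followed by a tiny table
sample = "Terms of use ... " + DISCLAIMER_END + "\n+-----+\n| row |\n+-----+"
print(strip_disclaimer(sample))
```

Unlike `.split(...)[1]`, this returns the input unchanged instead of raising `IndexError` when the marker is not found.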



In [34]:
data_1b = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}'
response_1b = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A45/associations_to_all_features',\
    headers = tabular_headers, data=data_1b, verify = False)
output_1b = response_1b.text
output_1b_disclaimer_removed = output_1b.split('Green Team members who contributed to the work.')[1]
print(output_1b_disclaimer_removed)



+-----------------------+------------------------------+-------------------------------+---------+
| feature               | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AgeStudyStart = 0-2   | 1705   94.41%                | 101    5.59%                  | 1806    |
|                       | 9.52%  7.93%                 | 2.81%  0.47%                  | 8.40%   |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 3-17  | 3319    96.20%               | 131    3.80%                  | 3450    |
|                       | 18.54%  15.44%               | 3.65%  0.61%                  | 16.05%  |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 18-34 | 2434    73.05%               | 898     26.95%                | 3332    |
|                       | 13.60%  11.33%               | 25.01%  4.18%                 | 15.50%  |
+-------
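Each cell of the table carries four numbers. Checking the arithmetic for the first cell suggests they are the count, the share of the row, the share of the column, and the share of the whole cohort. The sketch below recomputes them from counts shown in this notebook; the column total of 17902 is taken from the JSON output in part 1c, and this reading of the layout is inferred from the numbers rather than from ICEES documentation.

```python
def cell_percentages(count, row_total, column_total, cohort_total):
    """Return (row %, column %, total %) for one cell, rounded to two decimals."""
    return (
        round(100 * count / row_total, 2),
        round(100 * count / column_total, 2),
        round(100 * count / cohort_total, 2),
    )


# Cell: AgeStudyStart = 0-2 and TotalEDInpatientVisits < 2 in COHORT:45
first_cell = cell_percentages(count=1705, row_total=1806, column_total=17902, cohort_total=21492)
print(first_cell)  # (94.41, 9.52, 7.93), matching the three percentages printed in that cell
```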

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>1c)</b> The COHORT:45 output above, requested with the "accept":"text/tabular" header, is human-readable but hard to manipulate in code. Now we change the header to "accept":"application/json" to retrieve the same content in a form that is easier to process programmatically.
</div>

In [26]:
import json

json_headers = {"Content-Type" : "application/json", "accept": "application/json"}

# Reuse the part 1b request body; only the "accept" header changes
response_1c_json = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A45/associations_to_all_features', headers = json_headers, data=data_1b, verify = False)
output_1c_json = response_1c_json.json()

In [27]:
print(output_1c_json)

{'return value': [{'feature_b': {'feature_qualifiers': [{'value': '0-2', 'operator': '='}, {'value': '3-17', 'operator': '='}, {'value': '18-34', 'operator': '='}, {'value': '35-50', 'operator': '='}, {'value': '51-69', 'operator': '='}, {'value': '70+', 'operator': '='}], 'feature_name': 'AgeStudyStart'}, 'columns': [{'percentage': 0.8329611018053229, 'frequency': 17902}, {'percentage': 0.1670388981946771, 'frequency': 3590}], 'feature_a': {'feature_qualifiers': [{'value': 2, 'operator': '<'}, {'value': 2, 'operator': '>='}], 'feature_name': 'TotalEDInpatientVisits'}, 'rows': [{'percentage': 0.08403126744835288, 'frequency': 1806}, {'percentage': 0.1605248464544947, 'frequency': 3450}, {'percentage': 0.15503443141634096, 'frequency': 3332}, {'percentage': 0.21440536013400335, 'frequency': 4608}, {'percentage': 0.27721943048576214, 'frequency': 5958}, {'percentage': 0.10878466406104598, 'frequency': 2338}], 'p_value': 2.2695809715411036e-220, 'feature_matrix': [[{'total_percentage': 0.
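Because the JSON form is a plain nested structure, individual fields can be read by key. The sketch below uses a hand-copied excerpt of the output above, keeping only the frequencies and leaving out the `feature_matrix`, which is truncated in the printed output. It pulls out the feature name, the p-value, and the row counts.

```python
# Excerpt of the part 1c response, copied from the printed output.
# Percentages and 'feature_matrix' are omitted to keep the example short.
excerpt = {
    "return value": [
        {
            "feature_a": {"feature_name": "TotalEDInpatientVisits"},
            "feature_b": {"feature_name": "AgeStudyStart"},
            "columns": [{"frequency": 17902}, {"frequency": 3590}],
            "rows": [
                {"frequency": 1806},
                {"frequency": 3450},
                {"frequency": 3332},
                {"frequency": 4608},
                {"frequency": 5958},
                {"frequency": 2338},
            ],
            "p_value": 2.2695809715411036e-220,
        }
    ]
}

for association in excerpt["return value"]:
    name = association["feature_b"]["feature_name"]
    row_counts = [row["frequency"] for row in association["rows"]]
    column_counts = [column["frequency"] for column in association["columns"]]
    print(name, association["p_value"], row_counts)
    # Row and column totals should both add up to the cohort size
    print(sum(row_counts), sum(column_counts))
```

Both sums come to 21492, the cohort size quoted for COHORT:45 in part 1b.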

<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2a)</b> Now define a second cohort using "MaxDailyPM2.5Exposure > 2", in contrast to the "AvgDailyPM2.5Exposure > 2" filter used in part 1a.
    This second cohort will be COHORT:46
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 2a Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>{"MaxDailyPM2.5Exposure": {
        "value": 2,
        "operator": ">"
      }}
</div>

In [20]:
data_2a = '{"MaxDailyPM2.5Exposure": { \
        "value": 2, \
        "operator": ">" \
      }}'

response_2a = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort', \
    data=data_2a, headers = tabular_headers, verify = False)

output_2a = response_2a.text
print(output_2a)

The Translator Integrated Clinical and Environmental Exposures Service (ICEES) is providing you with Data that have been de-identified in accordance with 45 C.F.R. §§ 164.514(a) and (b) and that UNC Health Care System (UNCHCS) is permitted to provide under 45 C.F.R. § 164.502(d)(2). Recipient agrees to notify UNCHCS via NC TraCS in the event that Recipient receives any identifiable data in error and to take such measures to return the identifiable data and/or destroy it at the direction of UNCHCS.

Restrictions on Recipient’s Use of Data. Recipient further agrees to use the data exclusively for the purposes and functionalities provided by the ICEES: cohort discovery; feature-rich cohort discovery; hypothesis-driven queries; and exploratory queries. Recipient agrees to use appropriate safeguards to protect the Data from misuse and unauthorized access or disclosure. Recipient will report to UNCHCS any unauthorized access, use, or disclosure of the Data not provided for by the Servic

<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2b)</b> With COHORT:46 (n = 4950) defined, let's also look at the '/associations_to_all_features' endpoint/functionality for this cohort
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 2b Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:46,
    <br>{"feature":{"TotalEDInpatientVisits":{"operator":"<",
                                            "value":2}},"maximum_p_value":0.1}
</div>

In [22]:
data_2b = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", \
                                            "value":2}},"maximum_p_value":0.1}'

response_2b = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A46/associations_to_all_features', \
                           headers = tabular_headers, data=data_2b, verify = False)

# Read from response_2b (COHORT:46), not the earlier COHORT:45 response
output_2b = response_2b.text

output_2b_disclaimer_removed = output_2b.split('Green Team members who contributed to the work.')[1]
print(output_2b_disclaimer_removed)




<br><div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>2c)</b> Once more, we repeat the request with the "accept":"application/json" header,
        this time for COHORT:46.</div>

In [29]:
json_headers = {"Content-Type" : "application/json", "accept": "application/json"}

# Reuse the part 2b request body; only the "accept" header changes
response_2c_json = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A46/associations_to_all_features', headers = json_headers, data=data_2b, verify = False)
output_2c_json = response_2c_json.json()

In [30]:
print(output_2c_json)

{'return value': [{'feature_b': {'feature_qualifiers': [{'value': '0-2', 'operator': '='}, {'value': '3-17', 'operator': '='}, {'value': '18-34', 'operator': '='}, {'value': '35-50', 'operator': '='}, {'value': '51-69', 'operator': '='}, {'value': '70+', 'operator': '='}], 'feature_name': 'AgeStudyStart'}, 'columns': [{'percentage': 0.7995959595959596, 'frequency': 3958}, {'percentage': 0.20040404040404042, 'frequency': 992}], 'feature_a': {'feature_qualifiers': [{'value': 2, 'operator': '<'}, {'value': 2, 'operator': '>='}], 'feature_name': 'TotalEDInpatientVisits'}, 'rows': [{'percentage': 0.08323232323232323, 'frequency': 412}, {'percentage': 0.1511111111111111, 'frequency': 748}, {'percentage': 0.17313131313131314, 'frequency': 857}, {'percentage': 0.22646464646464645, 'frequency': 1121}, {'percentage': 0.28383838383838383, 'frequency': 1405}, {'percentage': 0.08222222222222222, 'frequency': 407}], 'p_value': 2.3385527315697134e-59, 'feature_matrix': [[{'total_percentage': 0.079191
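With both cohorts available as JSON, their column totals can be compared directly. The sketch below copies the `columns` frequencies from the two printed JSON outputs (parts 1c and 2c) and computes, for each cohort, the share of patients with TotalEDInpatientVisits >= 2. The dictionary name `columns_by_cohort` is just an illustrative container.

```python
# 'columns' frequencies copied from the printed JSON outputs:
# first entry is TotalEDInpatientVisits < 2, second is TotalEDInpatientVisits >= 2.
columns_by_cohort = {
    "COHORT:45": [17902, 3590],
    "COHORT:46": [3958, 992],
}

shares = {}
for cohort_id, (under_two, two_or_more) in columns_by_cohort.items():
    total = under_two + two_or_more
    shares[cohort_id] = two_or_more / total
    print(f"{cohort_id}: n = {total}, share with >= 2 visits = {shares[cohort_id]:.2%}")
```

The totals reproduce the cohort sizes quoted earlier (21492 and 4950), which is a quick check that the values were copied correctly.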

<br> <div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>3)</b> Consider a cohort of all patients, COHORT:22. We will now try out the '/feature_association' endpoint
    using this full cohort.
</div>

https://icees.renci.org/apidocs/#/default/post__version___table___year__cohort__cohort_id__feature_association

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 3 Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:22, <br>{"feature_a":{"TotalEDInpatientVisits":{"operator":"<", "value":2}}, <br>"feature_b":{"AvgDailyPM2.5Exposure":{"operator":"<", "value":3}}}
</div>
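The '/feature_association' body pairs two feature filters under the keys `feature_a` and `feature_b`. As a sketch, it can be built from dicts and serialized with `json.dumps`; the helper name `feature_association_body` is hypothetical.

```python
import json


def feature_association_body(feature_a, feature_b):
    """Serialize a two-feature request body with the keys 'feature_a' and 'feature_b'."""
    return json.dumps({"feature_a": feature_a, "feature_b": feature_b})


body_3 = feature_association_body(
    {"TotalEDInpatientVisits": {"operator": "<", "value": 2}},
    {"AvgDailyPM2.5Exposure": {"operator": "<", "value": 3}},
)
print(body_3)
```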

In [5]:
data_3 = '{"feature_a":{"TotalEDInpatientVisits":{"operator":"<", "value":2}}, \
                                  "feature_b":{"AvgDailyPM2.5Exposure":{"operator":"<", "value":3}}}'

response_3 = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A22/feature_association', \
                           headers = tabular_headers, data=data_3, verify = False)

output_3 = response_3.text

output_3_disclaimer_removed = output_3.split('Green Team members who contributed to the work.')[1]
print(output_3_disclaimer_removed)



+----------------------------+------------------------------+-------------------------------+---------+
| feature                    | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AvgDailyPM2.5Exposure < 3  | 1474   92.07%                | 127    7.93%                  | 1601    |
|                            | 7.61%  6.38%                 | 3.42%  0.55%                  | 6.93%   |
+----------------------------+------------------------------+-------------------------------+---------+
| AvgDailyPM2.5Exposure >= 3 | 17902   83.30%               | 3590    16.70%                | 21492   |
|                            | 92.39%  77.52%               | 96.58%  15.55%                | 93.07%  |
+----------------------------+------------------------------+-------------------------------+---------+
|                            | 19376                        | 3717                          | 23093   |
|                            | 83.90%                       | 
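To get a feel for the strength of the association in this 2x2 table, the counts can be run through a Pearson chi-squared test by hand. This is only a sketch using the standard library: whether ICEES computes its reported p-value with this exact test, or applies a continuity correction, is an assumption that this notebook does not confirm.

```python
import math

# Observed counts from the part 3 table:
# rows are AvgDailyPM2.5Exposure < 3 and >= 3,
# columns are TotalEDInpatientVisits < 2 and >= 2.
observed = [[1474, 127], [17902, 3590]]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-squared statistic, without continuity correction
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * column_totals[j] / grand_total
        chi_squared += (count - expected) ** 2 / expected

# For one degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_squared / 2))
print(row_totals, column_totals, grand_total)
print(chi_squared, p_value)
```

The row, column, and grand totals printed by the sketch match the margins shown in the table (1601 and 21492; 19376 and 3717; 23093).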

<div class="alert alert-block alert-info" style="font-size:20px; font-weight: normal">
<b>4)</b> Again using COHORT:22, take a look at the endpoint '/associations_to_all_features'
</div>

<div class="alert alert-block alert-warning" style="font-size:16px">
<b>Part 4 Inputs:</b> <br>version = 1.0.0, <br>table = patient, <br>year = 2010, <br>cohort_id = COHORT:22,
        <br>{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}
</div>

In [6]:
data_4 = '{"feature":{"TotalEDInpatientVisits":{"operator":"<", \
                                            "value":2}},"maximum_p_value":0.1}'

response_4 = requests.post('https://icees.renci.org/1.0.0/patient/2010/cohort/COHORT%3A22/associations_to_all_features', \
                           headers = tabular_headers, data=data_4, verify = False)

output_4 = response_4.text

output_4_disclaimer_removed = output_4.split('Green Team members who contributed to the work.')[1]
print(output_4_disclaimer_removed)



+-----------------------+------------------------------+-------------------------------+---------+
| feature               | TotalEDInpatientVisits < 2   | TotalEDInpatientVisits >= 2   |         |
| AgeStudyStart = 0-2   | 1908   94.78%                | 105    5.22%                  | 2013    |
|                       | 9.85%  8.26%                 | 2.82%  0.45%                  | 8.72%   |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 3-17  | 3614    96.14%               | 145    3.86%                  | 3759    |
|                       | 18.65%  15.65%               | 3.90%  0.63%                  | 16.28%  |
+-----------------------+------------------------------+-------------------------------+---------+
| AgeStudyStart = 18-34 | 2615    73.83%               | 927     26.17%                | 3542    |
|                       | 13.50%  11.32%               | 24.94%  4.01%                 | 15.34%  |
+-------