test/test.sh
Deployment of ICEES API services has been migrated to the Kubernetes infrastructure as part of the translator-devops repo. The Helm Charts for deploying different instances of ICEES API services are detailed in the README file. An updated Docker image is built on each new release and is pulled automatically when ICEES API services are deployed by the Helm Charts as part of the Kubernetes infrastructure. The subsections below document how to update configurations and code to build a Docker image for automated Kubernetes deployment of services.
ICEES API lets you define a custom schema. The schema is stored at config/features.yml. Edit it to fit your dataset.
ICEES API makes the following assumptions:
- Each table named <table> should have a column named <Table>Id as the identifier, where <Table> is <table> capitalized. For example, for table patient, the id column is PatientId.
- Each table has a column named year.
These columns do not need to be specified in features.yml.
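The naming assumptions above can be sketched in a few lines of Python. The helper names `id_column` and `implicit_columns` are hypothetical and are not part of ICEES API; "capitalized" is read here as upper-casing only the first character of the table name.

```python
def id_column(table):
    """Return the identifier column assumed for a table, e.g. patient -> PatientId."""
    if not table:
        raise ValueError("table name must be non-empty")
    # Upper-case only the first character; the rest of the name is left untouched.
    return table[0].upper() + table[1:] + "Id"


def implicit_columns(table):
    """Columns each table is assumed to have without being listed in features.yml."""
    return [id_column(table), "year"]


print(implicit_columns("patient"))  # ['PatientId', 'year']
print(implicit_columns("visit"))    # ['VisitId', 'year']
```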
Data for the SQLite database file named example.db is created in a separate icees-db repo.
The path of the created SQLite database is set by the DB_PATH environment variable in .env.
The .env file contains environment variables that control the services. Edit it to fit your application.
- ICEES_PORT: the database port in the container
- ICEES_HOST: the database host in the container
- ICEES_API_LOG_PATH: the path where logs are stored on the host
- ICEES_API_HOST_PORT: the port that ICEES API listens on on the host
- OPENAPI_TITLE: the title for the OpenAPI schema (default "ICEES API")
- OPENAPI_HOST: the host where ICEES API is deployed
- OPENAPI_SCHEME: the protocol under which ICEES API is deployed
- OPENAPI_SERVER_MATURITY: the server maturity (i.e. 'development' or 'production')
- DB_PATH: the path to the SQLite database file on the host
- CONFIG_PATH: the directory where the schema is stored
- ICEES_API_INSTANCE_NAME: ICEES API instance name
- ICEES_INFORES_CURIE: ICEES instance identifier (see https://docs.google.com/spreadsheets/d/1Ak1hRqlTLr1qa-7O0s5bqeTHukj9gSLQML1-lg6xIHM)
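Before starting the services it can help to check that the .env file defines the variables you rely on. The following is a minimal sketch: the `REQUIRED` tuple is an assumption (this README does not say which variables are mandatory), the sample values are illustrative only, and the parser handles just plain KEY=VALUE lines.

```python
# Assumption: which variables are mandatory is not stated in this README.
REQUIRED = ("DB_PATH", "CONFIG_PATH")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and one layer of optional quotes.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_variables(env, required=REQUIRED):
    """Return the required names that are absent or empty."""
    return [name for name in required if not env.get(name)]


# Illustrative values only; they are not taken from a real deployment.
sample = """
# example .env content
ICEES_API_HOST_PORT=8080
DB_PATH=./db/example.db
OPENAPI_TITLE="ICEES API"
"""

env = parse_env(sample)
print(env["OPENAPI_TITLE"])
print(missing_variables(env))
```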
run
docker-compose up --build -d
docker build . -t icees-api:0.4.0
A feature qualifier limits values of a feature
<operator> ::= <
| >
| <=
| >=
| =
| <>
<feature_qualifier> ::= {"operator":<operator>, "value":<value>}
| {"operator":"in", "values":[<value>, ..., <value>]}
| {"operator":"between", "value_a":<value>, "value_b":<value>}
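The grammar above can be checked mechanically. The sketch below is a hypothetical client-side validator, not the validation ICEES API itself performs; it only checks the shape of a qualifier (operator plus the matching value keys), not whether the values make sense for a given feature.

```python
COMPARISON_OPERATORS = {"<", ">", "<=", ">=", "=", "<>"}


def is_valid_qualifier(qualifier):
    """Return True if the dict matches one of the three <feature_qualifier> forms."""
    if not isinstance(qualifier, dict):
        return False
    operator = qualifier.get("operator")
    if not isinstance(operator, str):
        return False
    keys = set(qualifier)
    if operator in COMPARISON_OPERATORS:
        return keys == {"operator", "value"}
    if operator == "in":
        return keys == {"operator", "values"} and isinstance(qualifier["values"], list)
    if operator == "between":
        return keys == {"operator", "value_a", "value_b"}
    return False


print(is_valid_qualifier({"operator": "=", "value": "0-2"}))          # True
print(is_valid_qualifier({"operator": "in", "values": ["35-50"]}))    # True
print(is_valid_qualifier({"operator": "in", "value": "35-50"}))       # False
```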
There are two ways to specify a feature or a set of features, using a list or a dict. We show the schema for the former first, then show the schema for the latter.
<feature> ::= {
"feature_name": "<feature name>",
"feature_qualifier": <feature_qualifier>
[,"year": <year>]
}
where
feature name: see config/features.yml
year is optional. When year is specified, the features from that year are used; otherwise the year is taken from the context.
Example:
{
"feature_name": "AgeStudyStart",
"feature_qualifier": {
"operator": "=",
"value": "0-2"
}
}
<features> ::= [<feature>, ..., <feature>]
Example:
[{
"feature_name": "AgeStudyStart",
"feature_qualifier": {
"operator": "=",
"value": "0-2"
}
}, {
"feature_name": "ObesityBMI",
"feature_qualifier": {
"operator": "=",
"value": 0
}
}]
In the APIs that allow aggregation of bins, we can specify multiple feature qualifiers for each feature.
<feature2> ::= {
"feature_name": "<feature name>",
"feature_qualifiers": [<feature_qualifier>, ..., <feature_qualifier>]
[,"year": <year>]
}
Example:
{
"feature_name": "AgeStudyStart",
"feature_qualifiers": [
{
"operator":"=",
"value":"0-2"
}, {
"operator":"between",
"value_a":"3-17",
"value_b":"18-34"
}, {
"operator":"in",
"values":["35-50","51-69"]
}, {
"operator":"=",
"value":"70+"
}
]
}
Similarly for a set of features
<features2> ::= [<feature2>, ..., <feature2>]
Example:
[{
"feature_name": "AgeStudyStart",
"feature_qualifiers": [
{
"operator":"=",
"value":"0-2"
}, {
"operator":"between",
"value_a":"3-17",
"value_b":"18-34"
}, {
"operator":"in",
"values":["35-50","51-69"]
},{
"operator":"=",
"value":"70+"
}
]
}, {
"feature_name": "EstResidentialDensity",
"feature_qualifiers": [
{
"operator": "<",
"value": 1
}
]
}]
The in and between operators are currently only supported in <feature2>.
We now turn to defining a feature or a feature set using a dict.
<feature> ::= {"<feature name>": <feature_qualifier>}
<features> ::= {"<feature name>": <feature_qualifier>, ..., "<feature name>": <feature_qualifier>}
<feature2> ::= {"<feature name>": [<feature_qualifier>, ..., <feature_qualifier>]}
<features2> ::= {"<feature name>": [<feature_qualifier>, ..., <feature_qualifier>], ..., "<feature name>": [<feature_qualifier>, ..., <feature_qualifier>]}
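The two notations carry the same information, so one can be derived from the other. The sketch below converts the list form into the dict form. `features_to_dict` is a hypothetical helper, not part of ICEES API. Because the dict grammar above shows no place for a per-feature year, the helper refuses entries that carry one rather than silently dropping it.

```python
def features_to_dict(features):
    """Convert a <features> or <features2> list into the equivalent dict form."""
    result = {}
    for feature in features:
        if "year" in feature:
            raise ValueError("the dict form has no slot for a per-feature year")
        name = feature["feature_name"]
        if name in result:
            raise ValueError("duplicate feature name: " + name)
        if "feature_qualifiers" in feature:
            # <feature2>: keep the whole list of qualifiers.
            result[name] = list(feature["feature_qualifiers"])
        else:
            # <feature>: a single qualifier.
            result[name] = feature["feature_qualifier"]
    return result


as_list = [
    {"feature_name": "AgeStudyStart",
     "feature_qualifier": {"operator": "=", "value": "0-2"}},
    {"feature_name": "ObesityBMI",
     "feature_qualifier": {"operator": "=", "value": 0}},
]
print(features_to_dict(as_list))
```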
method
POST
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort
schema
<features>
method
GET
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>
method
GET
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>/features
method
GET
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/dictionary
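The cohort routes above share one pattern: a table, a year, the literal segment cohort, and then an optional cohort id and suffix. A small client-side sketch of that pattern follows. `cohort_route` is a hypothetical helper, and the allowed tables and years are copied from the route patterns shown above.

```python
TABLES = ("patient", "visit")
YEARS = range(2010, 2017)  # 2010 through 2016, as in the route patterns


def cohort_route(table, year, cohort_id=None, suffix=None):
    """Build a cohort route such as /patient/2010/cohort/COHORT:10/features."""
    if table not in TABLES:
        raise ValueError("unknown table: %r" % (table,))
    if year not in YEARS:
        raise ValueError("unsupported year: %r" % (year,))
    parts = ["", table, str(year), "cohort"]
    if cohort_id is not None:
        parts.append(cohort_id)
    if suffix is not None:
        parts.append(suffix)
    return "/".join(parts)


print(cohort_route("patient", 2010))                         # /patient/2010/cohort
print(cohort_route("patient", 2010, "COHORT:10", "features"))  # /patient/2010/cohort/COHORT:10/features
print(cohort_route("visit", 2016, suffix="dictionary"))      # /visit/2016/cohort/dictionary
```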
method
POST
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>/feature_association
schema
{"feature_a":<feature>,"feature_b":<feature>}
method
POST
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>/feature_association2
schema
{"feature_a":<feature2>,"feature_b":<feature2>[,"check_coverage_is_full":<boolean>]}
example
{
"feature_a": {
"feature_name": "AgeStudyStart",
"feature_qualifiers": [
{
"operator":"=",
"value":"0-2"
}, {
"operator":"between",
"value_a":"3-17",
"value_b":"18-34"
}, {
"operator":"in",
"values":["35-50","51-69"]
},{
"operator":"=",
"value":"70+"
}
]
},
"feature_b": {
"feature_name": "ObesityBMI",
"feature_qualifiers": [
{
"operator":"=",
"value":0
}, {
"operator":"<>",
"value":0
}
]
}
}
method
POST
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>/associations_to_all_features
schema
{
"feature": <feature>,
"maximum_p_value": <maximum p value>,
"correction": {
"method": <correction method>
[,"alpha": <correction alpha>]
}
}
where correction is optional and, within it, alpha is optional. method and alpha are specified here: https://www.statsmodels.org/dev/generated/statsmodels.stats.multitest.multipletests.html
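As an illustration of the optional parts of this schema, the sketch below assembles a request body and adds correction only when a method is given. `associations_request` is a hypothetical helper. The method name "bonferroni" is one of the values listed in the statsmodels documentation linked above; the alpha of 0.05 is an arbitrary example value.

```python
import json


def associations_request(feature, maximum_p_value, method=None, alpha=None):
    """Build the JSON body, including "correction" only when a method is given."""
    if alpha is not None and method is None:
        raise ValueError("alpha belongs inside correction, which requires a method")
    body = {"feature": feature, "maximum_p_value": maximum_p_value}
    if method is not None:
        correction = {"method": method}
        if alpha is not None:
            correction["alpha"] = alpha
        body["correction"] = correction
    return body


feature = {
    "feature_name": "AgeStudyStart",
    "feature_qualifier": {"operator": "=", "value": "0-2"},
}
print(json.dumps(associations_request(feature, 0.1), indent=2))
print(json.dumps(associations_request(feature, 0.1, method="bonferroni", alpha=0.05), indent=2))
```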
method
POST
route
/(patient|visit)/(2010|2011|2012|2013|2014|2015|2016)/cohort/<cohort id>/associations_to_all_features2
schema
{
"feature": <feature>,
"maximum_p_value": <maximum p value>
[,"check_coverage_is_full": <boolean>],
"correction": {
"method": <correction method>
[,"alpha": <correction alpha>]
}
}
where correction is optional and, within it, alpha is optional. method and alpha are specified here: https://www.statsmodels.org/dev/generated/statsmodels.stats.multitest.multipletests.html
example
{
"feature":{
"AgeStudyStart":[
{
"operator":"=",
"value":"0-2"
}, {
"operator":"between",
"value_a":"3-17",
"value_b":"18-34"
}, {
"operator":"in",
"values":["35-50","51-69"]
},{
"operator":"=",
"value":"70+"
}
]
},
"maximum_p_value": 0.1
}
method
POST
route
/knowledge_graph?reasoner=&verbose=
input parameters:
query_options
- table: ICEES table
- year: ICEES year
- cohort_features: features for defining the cohort
- feature: a feature, operator, and value for splitting the cohort into two subcohorts
- maximum_p_value: ICEES maximum p value. The p value is calculated for each ICEES feature in table, using a 2 * n contingency table where the rows are subcohorts and the columns are individual values of that feature. Any feature with a p value greater than the maximum p value is filtered out.
- regex: filter target node names by regex.
If reasoner is specified, a Reasoner API response is returned.
example
{
"query_options": {
"table": "patient",
"year": 2010,
"cohort_features": {
"AgeStudyStart": {
"operator": "=",
"value": "0-2"
}
},
"feature": {
"EstResidentialDensity": {
"operator": "<",
"value": 1
}
},
"maximum_p_value":1
},
"message": {
"query_graph": {
"nodes": {
"n00": {
"categories": ["biolink:PopulationOfIndividualOrganisms"]
},
"n01": {
"categories": ["biolink:ChemicalSubstance"]
}
},
"edges": {
"e00": {
"predicates": ["biolink:correlated_with"],
"subject": "n00",
"object": "n01"
}
}
}
}
}
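To make the maximum_p_value filtering for /knowledge_graph more concrete, the sketch below computes a test statistic for a 2 * n contingency table and then applies the cutoff. The text does not say which statistical test ICEES API uses; Pearson's chi-squared test is assumed here because it is a common choice for contingency tables. Both function names are hypothetical, the counts and p values are made up, and turning the statistic into a p value (via the chi-squared distribution with the returned degrees of freedom) is left out to keep the example dependency-free.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts.

    Rows are subcohorts, columns are the individual values of one feature.
    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def keep_features(p_values, maximum_p_value):
    """Drop every feature whose p value is greater than maximum_p_value."""
    return {name: p for name, p in p_values.items() if p <= maximum_p_value}


# Made-up counts: two subcohorts (rows) by two feature values (columns).
statistic, dof = chi_squared_statistic([[10, 20], [20, 10]])
print(round(statistic, 4), dof)

# Made-up p values, for illustration only.
print(keep_features({"ObesityBMI": 0.03, "EstResidentialDensity": 0.4}, 0.1))
```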
method
POST
route
/knowledge_graph_overlay?reasoner=&verbose=
input parameters:
<query_options> ::= {
"table": <string>,
"year": <integer>,
"cohort_features": <features>
}
| {
"cohort_id": <string>
}
{
"query_options": <query_options>,
"message": {
"knowledge_graph": <knowledge_graph>
}
}
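The <query_options> grammar for this route has two alternatives: either define a cohort inline or refer to an existing one by id. A client could tell them apart as sketched below. `query_options_kind` and the labels it returns are hypothetical and only illustrate the either/or structure.

```python
def query_options_kind(options):
    """Classify query_options as an inline cohort definition or a cohort reference."""
    keys = set(options)
    if keys == {"cohort_id"}:
        return "existing_cohort"
    if keys == {"table", "year", "cohort_features"}:
        return "new_cohort"
    raise ValueError(
        "query_options must contain either cohort_id alone "
        "or table, year and cohort_features"
    )


print(query_options_kind({"cohort_id": "COHORT:10"}))  # existing_cohort
print(query_options_kind({
    "table": "patient",
    "year": 2010,
    "cohort_features": {"AgeStudyStart": {"operator": "=", "value": "0-2"}},
}))  # new_cohort
```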
method
POST
route
/query?reasoner=&verbose=
If reasoner is specified, a Reasoner API response is returned.
input parameters:
{
"query_options": <query_options>,
"message": {
"query_graph": <query_graph>
}
}
get cohort of all patients
curl -k -XPOST https://localhost:8080/patient/2010/cohort -H "Content-Type: application/json" -H "Accept: application/json" -d '{}'
get cohort of all patients active in a year
curl -k -XPOST https://localhost:8080/patient/2010/cohort -H "Content-Type: application/json" -H "Accept: application/json" -d '[{
"feature_name": "Active_In_Year",
"feature_qualifier": {
"operator": "=",
"value": 1
}
}]'
get cohort of patients with AgeStudyStart = 0-2
curl -k -XPOST https://localhost:8080/patient/2010/cohort -H "Content-Type: application/json" -H "Accept: application/json" -d '[{
"feature_name": "AgeStudyStart",
"feature_qualifier": {
"operator":"=",
"value":"0-2"
}
}]'
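The same cohort request can be prepared from Python with the standard library. This is a sketch that only builds the request object and does not send it. `cohort_request` is a hypothetical helper, and the base URL is the one used in the curl examples above. To actually send the request you would pass it to urllib.request.urlopen; matching curl's -k flag would additionally require an SSL context with certificate verification disabled, which is only appropriate for local testing.

```python
import json
import urllib.request


def cohort_request(base_url, table, year, features):
    """Build a POST request for /<table>/<year>/cohort with a JSON body."""
    url = "%s/%s/%s/cohort" % (base_url, table, year)
    data = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


features = [{
    "feature_name": "AgeStudyStart",
    "feature_qualifier": {"operator": "=", "value": "0-2"},
}]
req = cohort_request("https://localhost:8080", "patient", 2010, features)
print(req.get_method(), req.full_url)  # POST https://localhost:8080/patient/2010/cohort
```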
Assuming we have cohort id COHORT:10
get definition of cohort
curl -k -XGET https://localhost:8080/patient/2010/cohort/COHORT:10 -H "Accept: application/json"
get features of cohort
curl -k -XGET https://localhost:8080/patient/2010/cohort/COHORT:10/features -H "Accept: application/json"
get cohort dictionary
curl -k -XGET https://localhost:8080/patient/2010/cohort/dictionary -H "Accept: application/json"
get feature association
curl -k -XPOST https://localhost:8080/patient/2010/cohort/COHORT:10/feature_association -H "Content-Type: application/json" -d '{
"feature_a": {
"feature_name": "AgeStudyStart",
"feature_qualifier": {"operator":"=", "value":"0-2"}
},
"feature_b": {
"feature_name": "ObesityBMI",
"feature_qualifier": {"operator":"=", "value":0}
}
}'
get association to all features
curl -k -XPOST https://localhost:8080/patient/2010/cohort/COHORT:10/associations_to_all_features -H "Content-Type: application/json" -d '{
"feature": {
"feature_name": "AgeStudyStart",
"feature_qualifier": {"operator":"=", "value":"0-2"}
},
"maximum_p_value":0.1
}' -H "Accept: application/json"
knowledge graph
curl -X POST -k "http://localhost:5000/knowledge_graph" -H "accept: application/json" -H "Content-Type: application/json" -d '
{
"query_options": {
"table": "patient",
"year": 2010,
"cohort_features": {
"AgeStudyStart": {
"operator": "=",
"value": "0-2"
}
},
"feature": {
"EstResidentialDensity": {
"operator": "<",
"value": 1
}
},
"maximum_p_value":1
},
"message": {
"query_graph": {
"nodes": {
"n00": {
"categories": ["biolink:PopulationOfIndividualOrganisms"]
},
"n01": {
"categories": ["biolink:ChemicalSubstance"]
}
},
"edges": {
"e00": {
"predicates": ["biolink:correlated_with"],
"subject": "n00",
"object": "n01"
}
}
}
}
}
'
knowledge graph schema
curl -X GET -k "http://localhost:5000/knowledge_graph/schema" -H "accept: application/json"
The qc tool is under the qctool directory. The following commands are run in the qctool directory.
pip install -r requirements.txt
Example:
python src/qc.py \
--a_type features \
--a ../config/all_features.yaml \
--b_type mapping \
--b ../config/FHIR_mappings.yml \
--update_a ../config/all_features_update.yaml \
--update_b ../config/FHIR_mappings.yml \
--number_entries 10 \
--similarity_threshold 0.5 \
--table patient visit \
--ignore_suffix Table _flag_first _flag_last
Usage:
python src/qc.py --help