# OMOP query API tutorial

This notebook shows how to use the `cyclops.query` API to query EHR databases that follow the OMOP Common Data Model. The examples use:

1. [Synthea](https://github.com/synthetichealth/synthea) in OMOP format.

    * First, generate Synthea data using one of their releases. We used [v2.7.0](https://github.com/synthetichealth/synthea/releases/tag/v2.7.0) to generate the data.
    * Then follow the instructions in [ETL-Synthea](https://github.com/OHDSI/ETL-Synthea) to load the CSV data into a PostgreSQL database and run the ETL that converts it to OMOP format.

## Import and instantiate `OMOPQuerier`

In [1]:
import pandas as pd

import cyclops.query.ops as qo
from cyclops.query import OMOPQuerier

synthea = OMOPQuerier(
    dbms="postgresql",
    port=5432,
    host="localhost",
    database="synthea_fl",
    user="postgres",
    password="pwd",
    schema_name="cdm_synthea10",
)
# List all tables.
synthea.list_tables()

2023-01-30 14:47:55,129 INFO cyclops.query.orm - Database setup, ready to run queries!


['visit_occurrence',
 'visit_detail',
 'person',
 'measurement',
 'observation',
 'concept',
 'care_site',
 'provider']

## Example 1. Get all patient visits in or after 2010.

In [2]:
ops = qo.Sequential([qo.ConditionAfterDate("visit_start_date", "2010-01-01")])
visits = synthea.visit_occurrence(ops=ops).run()
print(f"{len(visits)} rows extracted!")
pd.to_datetime(visits["visit_start_date"]).dt.year.value_counts().sort_index()

2023-01-30 14:48:00,694 INFO cyclops.query.orm - Query returned successfully!
2023-01-30 14:48:00,695 INFO cyclops.utils.profile - Finished executing function run_query in 5.527511 s


355294 rows extracted!


2010     7968
2011     8380
2012    10250
2013    29625
2014    44476
2015    30647
2016    30158
2017    30804
2018    31251
2019    32257
2020    39073
2021    29654
2022    30542
2023      209
Name: visit_start_date, dtype: int64
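
For readers without a database at hand, the filter-then-count logic of Example 1 can be sketched in plain pandas. This is only an illustration on a hypothetical toy table (`toy_visits` and its rows are made up), not how `cyclops.query` executes the query, which runs in the database. It assumes the date condition includes the boundary date, as the "in or after" wording of the example suggests.

```python
import pandas as pd

# Hypothetical toy data standing in for the visit_occurrence table.
toy_visits = pd.DataFrame(
    {
        "visit_occurrence_id": [1, 2, 3, 4],
        "visit_start_date": [
            "2009-12-31",
            "2010-01-01",
            "2010-06-15",
            "2011-03-02",
        ],
    }
)

# Keep visits that start on or after the cutoff (assumed inclusive).
start_dates = pd.to_datetime(toy_visits["visit_start_date"])
recent = toy_visits[start_dates >= pd.Timestamp("2010-01-01")]

# Count the remaining visits per calendar year, ordered by year.
year_counts = (
    pd.to_datetime(recent["visit_start_date"]).dt.year.value_counts().sort_index()
)
print(year_counts)
```

The last expression mirrors the final line of the cell above: parse the dates, extract the year, count occurrences, and sort by year rather than by frequency.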

## Example 2. Get measurements for all visits in or after 2020, limit to first 100 rows.

In [3]:
ops = qo.Sequential([qo.ConditionAfterDate("visit_start_date", "2020-01-01")])
visits = synthea.visit_occurrence(ops=ops)
measurements = synthea.measurement(
    join=qo.JoinArgs(join_table=visits.query, on="visit_occurrence_id")
).run(limit=100)
print(f"{len(measurements)} rows extracted!")

2023-01-30 14:48:00,923 INFO cyclops.query.orm - Query returned successfully!
2023-01-30 14:48:00,924 INFO cyclops.utils.profile - Finished executing function run_query in 0.073149 s


100 rows extracted!
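
Conceptually, Example 2 filters visits by date, joins measurements to the remaining visits on `visit_occurrence_id`, and caps the number of returned rows. The sketch below reproduces that shape in plain pandas on hypothetical toy tables (`toy_visits`, `toy_measurements`, and their values are invented for illustration). It assumes an inner join and an inclusive date boundary; the actual join type used by `qo.JoinArgs` should be checked against the cyclops documentation.

```python
import pandas as pd

# Hypothetical toy tables standing in for visit_occurrence and measurement.
toy_visits = pd.DataFrame(
    {
        "visit_occurrence_id": [10, 11, 12],
        "visit_start_date": pd.to_datetime(
            ["2019-05-01", "2020-01-01", "2021-07-04"]
        ),
    }
)
toy_measurements = pd.DataFrame(
    {
        "measurement_id": [1, 2, 3, 4],
        "visit_occurrence_id": [10, 11, 12, 12],
        "value_as_number": [98.6, 120.0, 7.4, 80.0],
    }
)

# Step 1: keep visits that start on or after the cutoff (assumed inclusive).
cutoff = pd.Timestamp("2020-01-01")
recent_visits = toy_visits[toy_visits["visit_start_date"] >= cutoff]

# Step 2: keep only measurements that belong to one of those visits
# (assumed inner join on the shared key).
joined = toy_measurements.merge(
    recent_visits, on="visit_occurrence_id", how="inner"
)

# Step 3: cap the number of rows, analogous to run(limit=...).
limited = joined.head(2)
print(f"{len(limited)} rows extracted!")
```

In the notebook cell itself, `visits.query` is passed to the join without calling `.run()`, so the visit filter is not materialized as a DataFrame first, unlike in this in-memory sketch.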
