## PyQE Download Dataframe Query Demo

1) Set up the imports

In [1]:
# Path has to be set before importing pyqe
import sys, os
sys.path.append(os.path.join(sys.path[0],'..', '..'))

import json
from pyqe import *

2) Define authenticated query

In [4]:
query = Query('All_Patients_Dataframe_Query')
query.set_study('703c5d8a-a1d9-4d42-a314-5b9aad513390')

### NOTE: only one filter card type can be added for dataframe download
# Add Patient
query.add_filters([Person.Patient()])
# Add Interaction (OMOP)
query.add_filters([Interactions.ConditionOccurrence('condition')])


23-Nov-2023 15:38:55 - INFO - Username is: admin
23-Nov-2023 15:38:55 - INFO - Authenticating with CFUAA..
23-Nov-2023 15:38:57 - INFO - Authentication successful


3) Generate download cohort request

In [5]:
request = query.get_dataframe_cohort()
# Print request
# print(f'\nRequest: {json.dumps(request)}')
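
Before downloading, it can help to inspect the request in a readable form rather than as a one-line dump. The following is a minimal sketch, assuming the request is a JSON-serializable dict (which the commented-out `json.dumps(request)` call suggests). `example_request` and its keys are hypothetical placeholders, not the real pyqe payload, and `format_request` is a helper introduced here for illustration.

```python
import json

# Hypothetical stand-in for the object returned by query.get_dataframe_cohort();
# the real payload structure is not shown in this notebook.
example_request = {
    "name": "All_Patients_Dataframe_Query",
    "columns": [],
}


def format_request(request: dict) -> str:
    """Return the request as indented JSON with sorted keys for easy diffing."""
    return json.dumps(request, indent=2, sort_keys=True)


print(format_request(example_request))
```

In the notebook itself, the same helper could be called on `request` in place of the commented-out print.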

4) Download dataframe with request

In [6]:
patient_dataframe = Result().download_dataframe(request, "patient.csv", limit=50)
print(f'\nPatient dataframe: \n{patient_dataframe}')


Patient dataframe: 
       0  age  county               ethnicity  gender  \
0    999   85   23750  Not Hispanic or Latino  FEMALE   
1   1011   96   42250  Not Hispanic or Latino    MALE   
2   1001   82   32190  Not Hispanic or Latino    MALE   
3    963   85   45431  Not Hispanic or Latino    MALE   
4   1000  105   45720  Not Hispanic or Latino  FEMALE   
5    968   99   11660  Not Hispanic or Latino  FEMALE   
6    971   81   31150  Not Hispanic or Latino  FEMALE   
7    984   91   37570  Not Hispanic or Latino  FEMALE   
8   1012   70   49430  Not Hispanic or Latino    MALE   
9   1014   85   10280  Not Hispanic or Latino    MALE   
10   975   85    5380  Not Hispanic or Latino    MALE   
11   973   84   26140  Not Hispanic or Latino    MALE   
12   982   71   44270  Not Hispanic or Latino  FEMALE   
13   997   75   15190  Not Hispanic or Latino  FEMALE   
14   979   98   31000  Not Hispanic or Latino  FEMALE   
15  1010   82   10650  Not Hispanic or Latino    MALE   
16  1004  
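
The printed frame above wraps its columns (note the trailing `\`), which makes wide results hard to read. A small sketch of one way around this, assuming `download_dataframe` returns a regular pandas `DataFrame` as the output suggests: `DataFrame.to_string()` renders every column on one line per row. The `sample` frame is a stand-in built from the first rows shown above, not a real download.

```python
import pandas as pd

# Stand-in rows copied from the first lines of the output above.
sample = pd.DataFrame(
    {
        "0": [999, 1011, 1001],
        "age": [85, 96, 82],
        "county": [23750, 42250, 32190],
        "ethnicity": ["Not Hispanic or Latino"] * 3,
        "gender": ["FEMALE", "MALE", "MALE"],
    }
)

# to_string() does not wrap columns unless line_width is set,
# so each row stays on a single line regardless of frame width.
text = sample.to_string()
print(text)
```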

5) Download a portion of the dataframe with request

In [7]:
patient_dataframe_subset = Result().download_dataframe(request, "patient.csv", limit=50, offset=15)
print(f'\nPatient dataframe: \n{patient_dataframe_subset}')


Patient dataframe: 
       0  age  county               ethnicity  gender  \
0    999   85   23750  Not Hispanic or Latino  FEMALE   
1   1011   96   42250  Not Hispanic or Latino    MALE   
2   1001   82   32190  Not Hispanic or Latino    MALE   
3   1000  105   45720  Not Hispanic or Latino  FEMALE   
4   1032   81   21030  Not Hispanic or Latino    MALE   
5   1031  104    5530  Not Hispanic or Latino  FEMALE   
6    984   91   37570  Not Hispanic or Latino  FEMALE   
7   1012   70   49430  Not Hispanic or Latino    MALE   
8   1033  104    5030  Not Hispanic or Latino  FEMALE   
9   1029   88    2110  Not Hispanic or Latino  FEMALE   
10  1014   85   10280  Not Hispanic or Latino    MALE   
11  1025   93   21040  Not Hispanic or Latino  FEMALE   
12   982   71   44270  Not Hispanic or Latino  FEMALE   
13   997   75   15190  Not Hispanic or Latino  FEMALE   
14  1010   82   10650  Not Hispanic or Latino    MALE   
15  1004   85   44780  Not Hispanic or Latino  FEMALE   
16   988  
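
The `limit` and `offset` arguments suggest that large cohorts can be fetched in pages. The sketch below assumes SQL-style semantics (`offset` skips rows, `limit` caps the number returned) and a stable row order across calls; the notebook confirms neither, and the output above starting with the same first rows hints that ordering may vary. `iter_pages` and `fake_fetch` are hypothetical helpers; in practice `fetch_page` would wrap a call such as `Result().download_dataframe(request, "patient.csv", limit=..., offset=...)`.

```python
from typing import Callable, Iterator

import pandas as pd


def iter_pages(
    fetch_page: Callable[[int, int], pd.DataFrame], page_size: int
) -> Iterator[pd.DataFrame]:
    """Yield pages until the fetcher returns an empty or short page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        if page.empty:
            return
        yield page
        if len(page) < page_size:
            return  # short page: nothing left to fetch
        offset += page_size


# Local stand-in for the remote download, so the sketch runs offline.
_all_rows = pd.DataFrame({"pid": range(1, 8)})


def fake_fetch(limit: int, offset: int) -> pd.DataFrame:
    return _all_rows.iloc[offset : offset + limit]


pages = list(iter_pages(fake_fetch, page_size=3))
combined = pd.concat(pages, ignore_index=True)
print([len(p) for p in pages], len(combined))
```

Stopping on a short page saves one round trip, at the cost of relying on the server always filling a page when more rows exist.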

6) Define authenticated query with female filter

In [10]:
female_query = Query('Female_Patients_Dataframe_Query')
female_query.set_study('703c5d8a-a1d9-4d42-a314-5b9aad513390')
female_patient = Person.Patient()
female_constraint = Constraint().add(Expression(ComparisonOperator.EQUAL, 'Female'))
female_patient.add_gender([female_constraint])

female_query.add_filters([female_patient])

7) Generate download cohort request with specific column config paths for one filter card type

In [11]:
specific_columns_request = female_query.get_dataframe_cohort(['patient.attributes.pid', 'patient.attributes.Age', 'patient.attributes.Gender'])
# Print request
# print(f'\nRequest: {json.dumps(specific_columns_request)}')
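
The download in the next step returns columns named `age`, `gender` and `pid` for the three config paths requested here. A rough sketch of that relationship, assuming the column name is the last path segment in lower case; this rule is inferred from this single example and is not documented pyqe behaviour. `expected_column_name` is a hypothetical helper.

```python
def expected_column_name(config_path: str) -> str:
    """Guess the dataframe column name for a column config path."""
    # Assumption: last dot-separated segment, lower-cased.
    return config_path.rsplit(".", 1)[-1].lower()


paths = [
    "patient.attributes.pid",
    "patient.attributes.Age",
    "patient.attributes.Gender",
]
expected = sorted(expected_column_name(p) for p in paths)
print(expected)
```

Comparing `expected` with `sorted(df.columns)` after the download is a quick way to spot a mistyped path.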

8) Download dataframe with request

In [12]:
female_patient_dataframe = Result().download_dataframe(specific_columns_request, "female_patient.csv")
print(f'\nFemale patient dataframe: \n{female_patient_dataframe}')


Female patient dataframe: 
     age  gender  pid
0     91  FEMALE  964
1     86  FEMALE  966
2     92  FEMALE  967
3     99  FEMALE  968
4    101  FEMALE  969
..   ...     ...  ...
534   87  FEMALE  119
535  110  FEMALE  124
536   95  FEMALE  126
537   81  FEMALE  127
538   83  FEMALE  129

[539 rows x 3 columns]
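
A quick sanity check on the downloaded frame can confirm that the gender constraint was applied. The sketch below is a local check only, assuming the result is a pandas `DataFrame` with a `gender` column as shown above. It compares case-insensitively because the constraint uses `'Female'` while the output shows `FEMALE`. `gender_violations` is a hypothetical helper, and `sample` is a stand-in built from the first rows of the output.

```python
import pandas as pd


def gender_violations(df: pd.DataFrame, expected: str = "FEMALE") -> pd.DataFrame:
    """Return rows whose gender differs from `expected` (case-insensitive).

    Rows with a missing gender are also returned, since they do not match.
    """
    return df[df["gender"].str.upper() != expected.upper()]


# Stand-in rows copied from the first lines of the output above.
sample = pd.DataFrame(
    {"age": [91, 86, 92], "gender": ["FEMALE"] * 3, "pid": [964, 966, 967]}
)

violations = gender_violations(sample)
print(f"Rows violating the filter: {len(violations)}")
```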


9) Download a portion of the dataframe with request

In [13]:
female_patient_dataframe_subset = Result().download_dataframe(specific_columns_request, "female_patient.csv", limit=100, offset=10)
print(f'\nFemale patient dataframe: \n{female_patient_dataframe_subset}')


Female patient dataframe: 
    age  gender   pid
0    89  FEMALE  1094
1    89  FEMALE  1040
2    94  FEMALE  1084
3    91  FEMALE   984
4    90  FEMALE  1083
..  ...     ...   ...
95   96  FEMALE   844
96   93  FEMALE   689
97   83  FEMALE  1069
98  101  FEMALE  1022
99  104  FEMALE  1033

[100 rows x 3 columns]
