### Running queries on the Starter Kit server from Python

A demonstration of running the Postman example queries via Python.

First, we use the requests module directly.

In [8]:
import requests, json
q2 = {
  "query": "select sample_name, sex, population_code, population_name, cram_drs_uri, crai_drs_uri, bundle_drs_uri from one_thousand_genomes_sample where population_code='PUR' and sex='female';",
  "parameters": []
}
r = requests.post("http://localhost:4800/search", json=q2)
print(json.dumps(r.json(), indent=3))

{
   "data": [
      {
         "sample_name": "HG00740",
         "sex": "female",
         "population_code": "PUR",
         "population_name": "Puerto Rican",
         "cram_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.cram",
         "crai_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.crai",
         "bundle_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.bundle"
      },
      {
         "sample_name": "HG01070",
         "sex": "female",
         "population_code": "PUR",
         "population_name": "Puerto Rican",
         "cram_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.cram",
         "crai_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.crai",
         "bundle_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.bundle"
      },
      {
         "sample_name": "HG01326",
         "sex": "female",
         "population_code": "PUR",
   

### A useful query function
It works, but the SQL query is long. In many cases we can split the query over several lines for readability.

We'll define a function to take a query and reformat it so the server can execute it.

In [16]:
def run_param_query(query, debug=False):
    """Flatten a multi-line query onto one line, post it to the server and print the JSON response."""
    # remove tabs and new-lines
    query_text = query['query'].replace("\n", " ").replace("\t", " ")
    # strip any leading and trailing spaces
    query_text = query_text.strip()
    query2 = {"query":query_text, "parameters":query['parameters']}
    if debug:
        print("Query: {}".format(query2))
    response = requests.post("http://localhost:4800/search", json=query2)
    print(json.dumps(response.json(), indent=3))
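
The reformatting step can also be pulled out into a small helper of its own, which makes it easy to check without a running server. The following is only a sketch: `normalize_query` is a hypothetical name, not part of the Starter Kit or fasp. Unlike the function above, it collapses every run of whitespace into a single space, so it would also alter whitespace inside a quoted SQL string literal.

```python
def normalize_query(query):
    """Return a new query dict with the SQL text collapsed onto one line.

    Hypothetical helper for illustration; not part of fasp or the Starter Kit.
    """
    # str.split() with no arguments splits on any run of spaces, tabs or
    # new-lines and discards leading and trailing whitespace.
    query_text = " ".join(query["query"].split())
    # Build a new dict so the caller's query is left unchanged.
    return {"query": query_text, "parameters": query.get("parameters", [])}


example = {
    "query": """select sample_name, sex
    from one_thousand_genomes_sample
    where sex='female';""",
    "parameters": [],
}
print(normalize_query(example)["query"])
```

The returned dict has the same shape as the one posted above, so it could be passed to requests.post through the json argument in the same way.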

### Rewrite our query over multiple lines
Note the ''' (triple quotes) that Python uses to let a string span several lines.

In [22]:
multi_line_query = {
  "query": '''select sample_name, sex, population_code, population_name,
  cram_drs_uri, crai_drs_uri, bundle_drs_uri 
  from one_thousand_genomes_sample 
  where population_code='PUR' and sex='female';''',
  "parameters": []
}


run_param_query(multi_line_query)

{
   "data": [
      {
         "sample_name": "HG00740",
         "sex": "female",
         "population_code": "PUR",
         "population_name": "Puerto Rican",
         "cram_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.cram",
         "crai_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.crai",
         "bundle_drs_uri": "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.bundle"
      },
      {
         "sample_name": "HG01070",
         "sex": "female",
         "population_code": "PUR",
         "population_name": "Puerto Rican",
         "cram_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.cram",
         "crai_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.crai",
         "bundle_drs_uri": "drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.bundle"
      },
      {
         "sample_name": "HG01326",
         "sex": "female",
         "population_code": "PUR",
   

### Using the DataConnect Client
We can use the Data Connect client from sessions 2 and 3, which does the above and more for us.

In [23]:
from fasp.search import DataConnectClient
dc_client = DataConnectClient("http://localhost:4800")

In [24]:
df = dc_client.run_param_query(multi_line_query)
df

Retrieving the query
____Page1_______________


[['HG00740',
  'female',
  'PUR',
  'Puerto Rican',
  'drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.cram',
  'drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.crai',
  'drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.bundle'],
 ['HG01070',
  'female',
  'PUR',
  'Puerto Rican',
  'drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.cram',
  'drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.crai',
  'drs://localhost:5000/HG01070.1000genomes.lowcov.downsampled.bundle'],
 ['HG01326',
  'female',
  'PUR',
  'Puerto Rican',
  'drs://localhost:5000/HG01326.1000genomes.lowcov.downsampled.cram',
  'drs://localhost:5000/HG01326.1000genomes.lowcov.downsampled.crai',
  'drs://localhost:5000/HG01326.1000genomes.lowcov.downsampled.bundle']]
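
The result above is a list of rows without column names. If named fields are more convenient, the rows can be paired with the column names from the select clause. This is a sketch only: `rows_to_records` and `COLUMNS` are hypothetical names, and it assumes the values in each row arrive in the same order as the columns in the query, which is what the output above suggests.

```python
# Column names copied from the select clause of the query, in order.
COLUMNS = [
    "sample_name", "sex", "population_code", "population_name",
    "cram_drs_uri", "crai_drs_uri", "bundle_drs_uri",
]


def rows_to_records(rows, columns=COLUMNS):
    """Convert a list of row lists into a list of dicts keyed by column name.

    Hypothetical helper for illustration; not part of fasp.
    """
    records = []
    for row in rows:
        # Fail loudly rather than silently dropping or misaligning values.
        if len(row) != len(columns):
            raise ValueError(
                "expected {} values, got {}".format(len(columns), len(row))
            )
        records.append(dict(zip(columns, row)))
    return records


# First row of the result shown above.
rows = [[
    "HG00740", "female", "PUR", "Puerto Rican",
    "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.cram",
    "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.crai",
    "drs://localhost:5000/HG00740.1000genomes.lowcov.downsampled.bundle",
]]
records = rows_to_records(rows)
print(records[0]["sample_name"], records[0]["cram_drs_uri"])
```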

In [25]:
dc_client.list_tables()

Retrieving the table list
____Page1_______________
phenopacket_v1
one_thousand_genomes_sample


['phenopacket_v1', 'one_thousand_genomes_sample']
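
The ____Page1____ lines in the client output suggest that results are retrieved page by page. As far as we understand the GA4GH Data Connect specification, a response may carry a pagination object whose next_page_url points at the next page. The sketch below shows how pages could be followed under that assumption. `iter_rows` and `fetch_page` are hypothetical names, and this is not necessarily how DataConnectClient does it. The fetching function is passed in so the loop can be tried without a server.

```python
def iter_rows(first_page, fetch_page):
    """Yield every row from first_page and from any pages that follow it.

    first_page -- a parsed JSON response (dict) with a "data" list
    fetch_page -- callable taking a URL and returning the next parsed page
    Assumes the "pagination"/"next_page_url" field names described above.
    """
    page = first_page
    while page is not None:
        yield from page.get("data", [])
        # "pagination" may be missing or null on the last page.
        next_url = (page.get("pagination") or {}).get("next_page_url")
        page = fetch_page(next_url) if next_url else None


# Stand-in for a server: two made-up pages keyed by made-up URLs.
fake_pages = {
    "page2": {"data": [{"sample_name": "B"}], "pagination": None},
}
first = {
    "data": [{"sample_name": "A"}],
    "pagination": {"next_page_url": "page2"},
}
all_rows = list(iter_rows(first, fake_pages.__getitem__))
print(all_rows)
```

Against a real server, fetch_page could be something like `lambda url: requests.get(url).json()`.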

In [26]:
dc_client.list_table_info('one_thousand_genomes_sample', verbose=True)

_Schema for tableone_thousand_genomes_sample_
{
   "name": "one_thousand_genomes_sample",
   "description": "Table / directory containing JSON files for one thousand genomes sample from https://www.internationalgenome.org",
   "data_model": {
      "$id": "/table/one_thousand_genomes_sample/info",
      "$schema": "http://json-schema.org/draft-07/schema#",
      "description": "one thousand genomes sample JSON data model",
      "properties": {
         "sample_name": {
            "type": "string",
            "description": "An identifier specific for this genome sample"
         },
         "sex": {
            "type": "string",
            "enum": [
               "male",
               "female"
            ]
         },
         "biosample_id": {
            "type": "string",
            "description": "bio sample identifier"
         },
         "population_code": {
            "type": "string",
            "enum": [
               "ITU",
               "ASW",
               "JPT

<fasp.search.data_connect_client.SearchSchema at 0x12d6a0580>