# GA4GH GenomicInterpretation

This notebook demonstrates how to use the oncopacket Python package (imported below as `oncoexporter`) to create GA4GH GenomicInterpretation messages from Cancer Data Aggregator (CDA) data.
We first extract the mutation data for a CDA cohort and then use the package to create one GA4GH [GenomicInterpretation](https://phenopacket-schema.readthedocs.io/en/latest/genomic-interpretation.html) message per mutation row.

In [1]:
from oncoexporter.cda import CdaTableImporter, CdaMutationFactory

In [2]:
from cdapython import Q, set_default_project_dataset, set_host_url, set_table_version

set_default_project_dataset("gdc-bq-sample.dev")
set_host_url("http://35.192.60.10:8080/")
set_table_version("all_merged_subjects_v3_2_final")

In [6]:
cohort_name = "lung cancer cohort"
# Select records whose treatment anatomic site is the lung.
query = 'treatment_anatomic_site = "Lung"'
Tsite = Q(query)
tableImporter = CdaTableImporter(cohort_name=cohort_name, query_obj=Tsite)
# Retrieve the mutation table for the cohort.
mutation_df = tableImporter.get_mutation_df()

In [7]:
mutation_df.head()

In [5]:
mutation_factory = CdaMutationFactory()
# Convert each row of the mutation table into one GA4GH GenomicInterpretation message.
ga4gh_genomic_interpretations = []
for _, row in mutation_df.iterrows():
    ga4gh_genomic_interpretations.append(mutation_factory.to_ga4gh(row=row))
print(f"We extracted {len(ga4gh_genomic_interpretations)} GA4GH GenomicInterpretation messages")
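
To make the row-to-message step more concrete, here is a minimal sketch of what such a conversion could look like, using plain dictionaries in the JSON shape of a Phenopacket Schema v2 GenomicInterpretation. It is not the implementation of `CdaMutationFactory`. The function name `row_to_genomic_interpretation`, the MAF-style column names (`cda_subject_id`, `Hugo_Symbol`, `HGNC_ID`, `Chromosome`, `Start_Position`, `Reference_Allele`, `Tumor_Seq_Allele2`), the `GRCh38` assembly, and the two rows are all assumptions made up for illustration; the real CDA mutation table may be laid out differently.

```python
import json


def row_to_genomic_interpretation(row):
    """Map one mutation row (a dict) to a GenomicInterpretation-shaped dict.

    Hypothetical helper: the column names are assumed, not taken from CDA.
    """
    chrom = row["Chromosome"]
    pos = int(row["Start_Position"])
    ref = row["Reference_Allele"]
    alt = row["Tumor_Seq_Allele2"]
    return {
        # Identifier of the individual or biosample the variant was found in.
        "subjectOrBiosampleId": row["cda_subject_id"],
        # No clinical assessment is available in this sketch, so the status is left unknown.
        "interpretationStatus": "UNKNOWN_STATUS",
        "variantInterpretation": {
            "variationDescriptor": {
                # Illustrative identifier built from the variant coordinates.
                "id": f"{chrom}-{pos}-{ref}-{alt}",
                "geneContext": {
                    "valueId": row["HGNC_ID"],
                    "symbol": row["Hugo_Symbol"],
                },
                "vcfRecord": {
                    "genomeAssembly": "GRCh38",  # assumed assembly
                    "chrom": chrom,
                    "pos": pos,
                    "ref": ref,
                    "alt": alt,
                },
                "moleculeContext": "genomic",
            }
        },
    }


# Placeholder rows with made-up values; these are not real CDA records.
rows = [
    {"cda_subject_id": "subject-1", "Hugo_Symbol": "GENE1", "HGNC_ID": "HGNC:0001",
     "Chromosome": "chr1", "Start_Position": 100,
     "Reference_Allele": "A", "Tumor_Seq_Allele2": "T"},
    {"cda_subject_id": "subject-2", "Hugo_Symbol": "GENE2", "HGNC_ID": "HGNC:0002",
     "Chromosome": "chr2", "Start_Position": 200,
     "Reference_Allele": "G", "Tumor_Seq_Allele2": "C"},
]

messages = [row_to_genomic_interpretation(row) for row in rows]
print(json.dumps(messages[0], indent=2))
```

Plain dictionaries keep the sketch free of third-party dependencies. The objects returned by `to_ga4gh` in the cell above come from the package itself and may expose a different interface.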