# Molecular Oncology Almanac workshop
This notebook is a demonstration for the [Cancer Genomics Consortium 2023](https://cancergenomics.org/meetings/cgc_annual_meeting_2023.php) Bioinformatics Workshop on Exploring the Clinical Interpretation Resource Landscape. It can be found on GitHub at [github.com/vanallenlab/2023-cgc-moalmanac](https://github.com/vanallenlab/2023-cgc-moalmanac) and you can learn more at [moalmanac.org](https://moalmanac.org).

<img src="img/moalmanac-browser.png" alt="Molecular Oncology Almanac browser" width="750"/>

<a id='toc'></a>
## Table of contents
- <a href="#API">Example usage of Application Programming Interface (API)</a>
    - <a href="#assertions">Get entire database</a>
    - <a href="#assertion">Get specific database record</a>
    - <a href="#genes">Get genes cataloged in database</a>
    - <a href="#therapies">Get therapies cataloged in database</a>
    - <a href="#sources">Get evidence sources cataloged in database</a>
- <a href="#conclusion">Conclusion</a>

<a id='API'></a>
## Application Programming Interface (API)
The full API specification is available on [SwaggerHub](https://app.swaggerhub.com/apis-docs/vanallenlab/almanac-browser/0.2#/). The primary endpoints, all accessed with GET requests, are:
- `/api/assertions`, to get the entire database as a JSON
- `/api/assertions/{assertion_id}`, to get a specific database record
- `/api/genes`, to list all genes contained within the database
- `/api/therapies`, to list all therapies within the database
- `/api/sources`, to list all sources cited within the database
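
The endpoint list above can be wrapped in a small helper. The sketch below is illustrative only: `BASE_URL`, `ENDPOINTS`, `endpoint_url`, and `fetch_json` are hypothetical names, not part of MOAlmanac. It uses only the standard library (`urllib`) so that it runs without extra dependencies; the `requests.get` calls used in the rest of this notebook work equally well.

```python
import json
import urllib.request

# Base URL taken from the requests shown in this notebook.
BASE_URL = "https://moalmanac.org"

# Collection endpoints listed above (hypothetical lookup table).
ENDPOINTS = {
    "assertions": "/api/assertions",
    "genes": "/api/genes",
    "therapies": "/api/therapies",
    "sources": "/api/sources",
}


def endpoint_url(name, record_id=None):
    """Build the URL for a named endpoint, optionally for a single assertion."""
    if name not in ENDPOINTS:
        raise ValueError(f"Unknown endpoint: {name!r}")
    url = BASE_URL + ENDPOINTS[name]
    if record_id is not None:
        # Only /api/assertions/{assertion_id} is listed with an id.
        if name != "assertions":
            raise ValueError("A record id is only supported for 'assertions'")
        url = f"{url}/{int(record_id)}"
    return url


def fetch_json(url, timeout=30):
    """GET a URL and decode its JSON body; urlopen raises on HTTP error codes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_json(endpoint_url("assertions", 628))` would request the same record that is retrieved with `record_id = 628` in this notebook. Setting an explicit timeout keeps a stalled connection from hanging the notebook.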

In [1]:
! cd /Users/brendan/Github/moalmanac/moalmanac

In [2]:
import json
import requests

def indent_dictionary(dictionary):
    return json.dumps(dictionary, indent=4)

<a id='assertions'></a>
### Entire database
The `/api/assertions` endpoint may be used to get the entire database as a JSON object.

In [3]:
request = "https://moalmanac.org/api/assertions"
response = requests.get(request)
print(f"Response code: {response.status_code}")
print('')
print(f"Molecular Oncology Almanac contains {len(response.json())} records in its 2023 July 6th database content release.")
print('')

Response code: 200

Molecular Oncology Almanac contains 894 records in its 2023 July 6th database content release.
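
Once the full list is downloaded, it can be filtered locally rather than queried record by record. This is a minimal sketch, assuming every element of the `/api/assertions` response has the same shape as the single record printed in this notebook (a `features` list whose items hold an `attributes` list with a `gene` key). The function `assertions_for_gene` and the toy data, including its assertion ids, are hypothetical.

```python
def assertions_for_gene(assertions, gene):
    """Return the assertions that mention `gene` in any feature attribute."""
    matches = []
    for assertion in assertions:
        for feature in assertion.get("features", []):
            genes = {attr.get("gene") for attr in feature.get("attributes", [])}
            if gene in genes:
                matches.append(assertion)
                break  # avoid adding the same assertion twice
    return matches


# Toy stand-in for response.json(); only the keys used above are included.
toy_assertions = [
    {"assertion_id": 1, "features": [{"attributes": [{"gene": "BRCA2"}]}]},
    {"assertion_id": 2, "features": [{"attributes": [{"gene": "ABL1"}]}]},
    {"assertion_id": 3, "features": [{"attributes": [{"gene": "BRCA2"}, {"gene": "ATM"}]}]},
]

brca2_ids = [a["assertion_id"] for a in assertions_for_gene(toy_assertions, "BRCA2")]
print(brca2_ids)
```

With real data, `toy_assertions` would be replaced by `response.json()` from the request above.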



<a id='assertion'></a>
### Specific database record
An individual record can be retrieved by its id using the `/api/assertions/{assertion_id}` endpoint. Each record has three main parts: the genomic alteration(s), the clinical context and implication, and the citation. Full details are described in our [Standard Operating Procedure](https://github.com/vanallenlab/moalmanac-db/blob/main/docs/sop.md) for curation.

In [4]:
record_id = 628
request = f"https://moalmanac.org/api/assertions/{record_id}"
response = requests.get(request)
print(f"Response code: {response.status_code}")
print('')
record = response.json()

Response code: 200



In [5]:
record

{'assertion_id': 628,
 'context': 'Advanced, treated with three or more prior lines of chemotherapy',
 'created_on': '08/11/23',
 'description': 'The U.S. Food and Drug Administration (FDA) granted approval for niraparib for the treatment of adult patients with advanced ovarian, fallopian tube, or primary peritoneal cancer who have been treated with three or more prior chemotherapy regimens and whose cancer is associated with homologous recombination deficiency (HRD) positive status defined by either a deleterious or suspected deleterious BRCA mutation or genominc instability and who have progressed more than six months after response to the last platinum-based chemotherapy.',
 'disease': 'Ovarian Epithelial Tumor',
 'favorable_prognosis': '',
 'features': [{'attributes': [{'alternate_allele': None,
     'cdna_change': None,
     'chromosome': None,
     'end_position': None,
     'exon': None,
     'feature_type': 'germline_variant',
     'gene': 'BRCA2',
     'pathogenic': '1',
     

In [6]:
evidence = {
    'associated_evidence': record['predictive_implication'],
    'description': record['description'],
    'citation': record['sources'][0]['citation'],
    'url': record['sources'][0]['url'],
    'last_updated': record['last_updated']
}

print(indent_dictionary(evidence))

{
    "associated_evidence": "FDA-Approved",
    "description": "The U.S. Food and Drug Administration (FDA) granted approval for niraparib for the treatment of adult patients with advanced ovarian, fallopian tube, or primary peritoneal cancer who have been treated with three or more prior chemotherapy regimens and whose cancer is associated with homologous recombination deficiency (HRD) positive status defined by either a deleterious or suspected deleterious BRCA mutation or genominc instability and who have progressed more than six months after response to the last platinum-based chemotherapy.",
    "citation": "GlaxoSmithKline. Zejula (niraparib) [package insert]. U.S. Food and Drug Administration website. www.accessdata.fda.gov/drugsatfda_docs/label/2020/208447s015s017lbledt.pdf. Revised April 2020. Accessed October 15, 2020.",
    "url": "https://www.accessdata.fda.gov/drugsatfda_docs/label/2020/208447s015s017lbledt.pdf",
    "last_updated": "2020-10-15"
}


In [7]:
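genomics = {
    'feature_type': record['features'][0]['feature_type'],
    'gene': record['features'][0]['attributes'][0]['gene'],
    # 'pathogenic' is returned as the string '1'; bool() is True for any non-empty string, including '0'
    'pathogenic': str(record['features'][0]['attributes'][0]['pathogenic']) == '1'
}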

print(indent_dictionary(genomics))

{
    "feature_type": "germline_variant",
    "gene": "BRCA2",
    "pathogenic": true
}


In [8]:
clinical_context_and_implication = {
    'disease': record['disease'],
    'oncotree_term': record['oncotree_term'],
    'oncotree_code': record['oncotree_code'],
    'clinical context': record['context'],
    'therapy_name': record['therapy_name'],
    'therapy_strategy': record['therapy_strategy'],
    'therapy_type': record['therapy_type'],
    'therapy_sensitivity': record['therapy_sensitivity'],
    'therapy_resistance': record['therapy_resistance'],
    'favorable_prognosis': record['favorable_prognosis']
}

print(indent_dictionary(clinical_context_and_implication))

{
    "disease": "Ovarian Epithelial Tumor",
    "oncotree_term": "Ovarian Epithelial Tumor",
    "oncotree_code": "OVT",
    "clinical context": "Advanced, treated with three or more prior lines of chemotherapy",
    "therapy_name": "Niraparib",
    "therapy_strategy": "PARP inhibition",
    "therapy_type": "Targeted therapy",
    "therapy_sensitivity": 1,
    "therapy_resistance": "",
    "favorable_prognosis": ""
}


<a href="#toc">Return to Table of Contents</a>

<a id='genes'></a>
### All genes within database
All genes contained within the database can be queried using the `/api/genes` endpoint. As of this presentation, 149 genes are cataloged in the database.

In [9]:
request = "https://moalmanac.org/api/genes"
response = requests.get(request)
print(f"Response code: {response.status_code}")
print('')
print(f"Molecular Oncology Almanac contains {len(response.json())} genes in its 2023 July 6th database content release.")
print('')

print(response.json()[:10])

Response code: 200

Molecular Oncology Almanac contains 149 genes in its 2023 July 6th database content release.

['ABL1', 'AKT1', 'AKT2', 'AKT3', 'ALK', 'AR', 'ARAF', 'ARID1A', 'ASXL1', 'ATM']
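
A common use of the gene list is to check which genes from a panel of interest are cataloged. The sketch below is one way to do that; `split_panel` and the example panel are hypothetical, and the gene list is just the first ten entries printed above.

```python
def split_panel(panel, cataloged_genes):
    """Split a gene panel into genes that are and are not cataloged."""
    cataloged = set(cataloged_genes)
    present = sorted(gene for gene in panel if gene in cataloged)
    absent = sorted(gene for gene in panel if gene not in cataloged)
    return present, absent


# First ten genes from the output above, standing in for response.json().
cataloged_genes = ['ABL1', 'AKT1', 'AKT2', 'AKT3', 'ALK', 'AR', 'ARAF', 'ARID1A', 'ASXL1', 'ATM']

# Hypothetical panel; "NOT_A_GENE" is a placeholder for a symbol that is not cataloged.
present, absent = split_panel(["ALK", "ATM", "NOT_A_GENE"], cataloged_genes)
print(f"Cataloged: {present}")
print(f"Not cataloged: {absent}")
```

Converting the list to a set makes each membership check constant time, which matters little for 149 genes but keeps the helper usable on larger panels.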


<a id='therapies'></a>
### All therapies within the database

Likewise, all therapies and sources can be obtained using the `/api/therapies` and `/api/sources` endpoints, respectively.

In [10]:
request = "https://moalmanac.org/api/therapies"
response = requests.get(request)
print(f"Response code: {response.status_code}")
print('')
print(f"Molecular Oncology Almanac contains {len(response.json())} therapies in its 2023 July 6th database content release.")
print('')

print(response.json()[:10])

Response code: 200

Molecular Oncology Almanac contains 155 therapies in its 2023 July 6th database content release.

['5-Fluorouracil', 'AMG 510', 'AZD3759', 'AZD8186', 'Abiraterone', 'Abiraterone + Prednisone + Olaparib', 'Adagrasib', 'Ado-Trastuzumab Emtansine', 'Afatinib', 'Alectinib']
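
Some entries in the therapy list are combinations, such as `'Abiraterone + Prednisone + Olaparib'`. The sketch below separates combinations into individual agents. It assumes that `' + '` is the delimiter for every combination, which is inferred only from the single example in the output above; `therapy_components` is a hypothetical helper.

```python
def therapy_components(therapy_name):
    """Split a therapy name on ' + ' into its individual agents."""
    return [part.strip() for part in therapy_name.split(" + ")]


# A few names from the output above, standing in for response.json().
therapies = [
    '5-Fluorouracil',
    'AMG 510',
    'Abiraterone',
    'Abiraterone + Prednisone + Olaparib',
    'Adagrasib',
]

# Keep only the entries that split into more than one agent.
combinations = [name for name in therapies if len(therapy_components(name)) > 1]
print(combinations)
print(therapy_components(combinations[0]))
```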


<a id='sources'></a>
### All sources cited within the database

In [11]:
request = "https://moalmanac.org/api/sources"
response = requests.get(request)
print(f"Response code: {response.status_code}")
print('')
print(f"Molecular Oncology Almanac contains {len(response.json())} sources in its 2023 July 6th database content release.")
print('')
print(indent_dictionary(response.json()[0]))

Response code: 200

Molecular Oncology Almanac contains 276 sources in its 2023 July 6th database content release.

{
    "citation": "Pfizer Inc. Bosulif (bosutinib) [package insert]. U.S. Food and Drug Administration website. https://www.accessdata.fda.gov/drugsatfda_docs/label/2021/203341s020lbl.pdf. Revised May 2021. Accessed September 16, 2021.",
    "doi": "",
    "nct": "",
    "pmid": "",
    "source_id": 1,
    "source_type": "FDA",
    "url": "https://www.accessdata.fda.gov/drugsatfda_docs/label/2021/203341s020lbl.pdf"
}
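
Each source carries a `source_type` field, so the citations can be summarized by type. This is a minimal sketch using `collections.Counter`; `count_source_types` is a hypothetical helper, and in the toy data only `"FDA"` comes from the output above, while `"Journal"` is a made-up placeholder value.

```python
from collections import Counter


def count_source_types(sources):
    """Count sources by their 'source_type' field (missing values count as '')."""
    return Counter(source.get("source_type", "") for source in sources)


# Toy stand-in for response.json(); only the keys used above are included.
toy_sources = [
    {"source_id": 1, "source_type": "FDA"},
    {"source_id": 2, "source_type": "Journal"},  # hypothetical value
    {"source_id": 3, "source_type": "FDA"},
]

counts = count_source_types(toy_sources)
for source_type, n in counts.most_common():
    print(f"{source_type}: {n}")
```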


<a href="#toc">Return to Table of Contents</a>

<a id='conclusion'></a>
## In conclusion

**Thank you very much for inviting us to give this workshop today. Please reach out with any additional questions or comments!**

- Email: moalmanac@ds.dfci.harvard.edu
- Twitter: [@moalmanac](https://twitter.com/moalmanac), [@vanallenlab](https://twitter.com/vanallenlab)

Slides and code from this workshop are available on GitHub at [https://github.com/vanallenlab/2023-cgc-moalmanac](https://github.com/vanallenlab/2023-cgc-moalmanac).

<img src="img/conclusion-slide.png" alt="Thank you!" width="750"/>

Molecular Oncology Almanac (MOAlmanac) is a clinical interpretation algorithm paired with an underlying knowledge base for precision oncology. The primary objective of MOAlmanac is to identify and associate molecular alterations with therapeutic sensitivity and resistance as well as disease prognosis. This is done for “first-order” genomic alterations (individual events such as somatic variants, copy number alterations, fusions, and germline variants) as well as “second-order” events (those that are not associated with a single mutation and may describe global processes in the tumor, such as tumor mutational burden, microsatellite instability, mutational signatures, and whole-genome doubling). In addition to clinical insights, MOAlmanac annotates and evaluates first-order events based on their presence in numerous other well-established data sources and highlights connections between them. This method is currently geared towards hg19/b37 reference files and whole-exome or targeted sequencing data.

There are several other resources within [the Molecular Oncology Almanac ecosystem](https://github.com/topics/molecular-oncology-almanac): 
- Molecular Oncology Almanac, the clinical interpretation algorithm for precision cancer medicine [[GitHub](https://github.com/vanallenlab/moalmanac)].
- [Molecular Oncology Almanac Browser](https://moalmanac.org), a website to browse our underlying knowledge base [[GitHub](https://github.com/vanallenlab/moalmanac-browser)].
- [Molecular Oncology Almanac Connector](https://chrome.google.com/webstore/detail/molecular-oncology-almana/jliaipolchffpaccagodphgjpfdpcbcm?hl=en), a Google Chrome extension to quickly suggest literature for cataloging [[GitHub](https://github.com/vanallenlab/moalmanac-extension)].
- Molecular Oncology Almanac Database, the content and release notes of our underlying knowledge base [[GitHub](https://github.com/vanallenlab/moalmanac-db)].
- [Molecular Oncology Almanac Portal](https://portal.moalmanac.org), a website to launch this method on the Broad Institute's Google Cloud platform called Terra [[GitHub](https://github.com/vanallenlab/moalmanac-portal)].

This method is also available on [Docker](https://hub.docker.com/repository/docker/vanallenlab/moalmanac) and [Terra](https://portal.firecloud.org/#methods/vanallenlab/moalmanac/). We have also released [code on GitHub to help facilitate analyses of multiple samples that have been interpreted using the Molecular Oncology Almanac](https://github.com/vanallenlab/moalmanac-cohort). 

If you use this method, please cite our publication:
> [Reardon, B., Moore, N.D., Moore, N.S., *et al*. Integrating molecular profiles into clinical frameworks through the Molecular Oncology Almanac to prospectively guide precision oncology. *Nat Cancer* (2021). https://doi.org/10.1038/s43018-021-00243-3](https://www.nature.com/articles/s43018-021-00243-3)
