# Hands on: Interacting with human-curated networks in pathway databases

In [8]:
!pip install -q -r requirements.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
conda-repo-cli 1.0.75 requires requests_mock, which is not installed.
streamlit 1.30.0 requires pandas<3,>=1.3.0, but you have pandas 1.2.5 which is incompatible.
conda-repo-cli 1.0.75 requires clyent==1.2.1, but you have clyent 1.2.2 which is incompatible.
xarray 2023.6.0 requires pandas>=1.4, but you have pandas 1.2.5 which is incompatible.

## 1. Introduction

### Recap - Example Dataset

We use an example dataset produced by an MSstats differential abundance analysis. It is a small-molecule dataset with known inhibition targets, and it includes 8 small-molecule inhibitors and a DMSO control holdout.

In [2]:
import pandas as pd

DATA_PATH = "dataProcessOutput.csv"

def import_data(filename):
    pandas_df = pd.read_csv(filename)
    return pandas_df

input_data = import_data(DATA_PATH)
input_data

Unnamed: 0,RUN,Protein,LogIntensities,originalRUN,GROUP,SUBJECT,TotalGroupMeasurements,NumMeasuredFeature,MissingPercentage,more50missing,NumImputedFeature
0,1,1433B_HUMAN,12.873423,230719_THP-1_Chrom_end2end_Plate1_DMSO_A02_DIA,DMSO,2,1210,10,0.0,False,0
1,2,1433B_HUMAN,12.866217,230719_THP-1_Chrom_end2end_Plate1_DMSO_A05_DIA,DMSO,5,1210,10,0.0,False,0
2,3,1433B_HUMAN,12.686827,230719_THP-1_Chrom_end2end_Plate1_DMSO_A10_DIA,DMSO,10,1210,10,0.0,False,0
3,4,1433B_HUMAN,12.625462,230719_THP-1_Chrom_end2end_Plate1_DMSO_A12_DIA,DMSO,12,1210,10,0.0,False,0
4,5,1433B_HUMAN,12.538365,230719_THP-1_Chrom_end2end_Plate1_DMSO_B01_DIA,DMSO,13,1210,10,0.0,False,0
...,...,...,...,...,...,...,...,...,...,...,...
1189821,266,ZZZ3_HUMAN,10.384438,230719_THP-1_Chrom_end2end_Plate3_DMSO_A10,VTP50469,202,170,10,0.0,False,0
1189822,267,ZZZ3_HUMAN,10.231615,230719_THP-1_Chrom_end2end_Plate3_DMSO_B03,VTP50469,207,170,10,0.0,False,0
1189823,268,ZZZ3_HUMAN,10.502691,230719_THP-1_Chrom_end2end_Plate3_DbET6_C07,VTP50469,223,170,10,0.0,False,0
1189824,269,ZZZ3_HUMAN,10.674776,230719_THP-1_Chrom_end2end_Plate3_DMSO_C11,VTP50469,227,170,10,0.0,False,0


### Experimental Factors:
| Treatment    | Target |
| :-------- | :------- |
| DMSO  | Control    |
| VTP50469  | MEN1    |
| PF477736 | Chk1    |
| Jakafi    | JAK1/2    |
| K-975  | TEAD1   |
| VE-821 | ATR    |
| dBET6    | BRD2/3/4   |


Our next goal is to pull pathway data from a pathway database. For instance, we could take one of the drugs or proteins in our dataset and query Pathway Commons for its neighborhood.
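Before querying, the target labels in the table above need to become individual names. The sketch below is one way to do that, assuming that shorthand such as `JAK1/2` means `JAK1` and `JAK2`. The dictionary copies the table (DMSO is left out because it has no target), and `expand_targets` is a hypothetical helper, not part of any library. Names are kept exactly as written, so an alias such as `Chk1` may still need grounding (for example with Gilda, as in the earlier module) before it matches a database entry.

```python
import re

# Treatment -> target label, copied from the Experimental Factors table
TREATMENT_TARGETS = {
    "VTP50469": "MEN1",
    "PF477736": "Chk1",
    "Jakafi": "JAK1/2",
    "K-975": "TEAD1",
    "VE-821": "ATR",
    "dBET6": "BRD2/3/4",
}


def expand_targets(label):
    """Expand shorthand such as 'BRD2/3/4' into ['BRD2', 'BRD3', 'BRD4']."""
    first, *rest = label.split("/")
    # Stem = the first name without its trailing digits, e.g. 'BRD' from 'BRD2'
    stem = re.sub(r"\d+$", "", first)
    names = [first]
    for part in rest:
        # A bare number reuses the stem; anything else is taken as a full name
        names.append(stem + part if part.isdigit() else part)
    return names


# One list of candidate query sources per treatment
query_sources = {drug: expand_targets(label) for drug, label in TREATMENT_TARGETS.items()}
```

Each list in `query_sources` can then serve as the `source` of a Pathway Commons query.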

## 2. How can we query pathway data from a pathway database?

PyBioPAX implements the BioPAX level 3 object model (http://www.biopax.org/release/biopax-level3-documentation.pdf) as a set of Python classes. It exposes API functions to read OWL files into this object model, and to dump OWL files from this object model. This allows for the processing and creation of BioPAX models natively in Python.

Gyori BM, Hoyt CT (2022). PyBioPAX: biological pathway exchange in Python. Journal of Open Source Software, 7(71), 4136, https://doi.org/10.21105/joss.04136
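To give a feel for what "reading an OWL file into an object model" involves, here is a rough standard-library sketch. It is not how PyBioPAX is implemented, and it ignores almost everything a real parser has to handle (references between elements, nested properties, and so on). The OWL snippet is a hand-written toy, not real Pathway Commons output, and `read_elements` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Standard namespaces used by BioPAX level 3 OWL files
BP = "http://www.biopax.org/release/biopax-level3.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hand-written toy document (an assumption for illustration only)
OWL_SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bp="{BP}">
  <bp:SmallMolecule rdf:ID="SmallMolecule_1">
    <bp:displayName>Ruxolitinib</bp:displayName>
  </bp:SmallMolecule>
  <bp:Protein rdf:ID="Protein_1">
    <bp:displayName>JAK1</bp:displayName>
  </bp:Protein>
</rdf:RDF>"""


def read_elements(owl_str):
    """Map each element's rdf:ID to (BioPAX class name, display name)."""
    root = ET.fromstring(owl_str)
    elements = {}
    for child in root:
        # Skip anything that is not in the BioPAX namespace
        if not child.tag.startswith("{" + BP + "}"):
            continue
        class_name = child.tag.split("}", 1)[1]
        uid = child.get("{" + RDF + "}ID")
        name = child.findtext("{" + BP + "}displayName")
        elements[uid] = (class_name, name)
    return elements


elements = read_elements(OWL_SNIPPET)
```

PyBioPAX goes much further: it turns each element into an instance of the matching Python class and links related objects together, which is what lets us work with the model directly in the cells below.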

In [11]:
import pybiopax
import requests

We can query Pathway Commons for the neighborhood around JAK1. We expect the small molecule Jakafi to appear in that neighborhood. Note that Ruxolitinib is another name for Jakafi, which we determined with Gilda in a previous module.

In [12]:
def _get_query_entity(ent):
    if isinstance(ent, str):
        return ent
    elif isinstance(ent, (list, tuple)):
        return ','.join(ent)
    else:
        raise ValueError('Invalid query entity: %s' % str(ent))

In [61]:
kind = 'neighborhood'
source = ['JAK1']
# source = ['Ruxolitinib']
# target = ['Ruxolitinib']
pc2_url = 'https://www.pathwaycommons.org/pc2/'
params = {}
params['format'] = 'BIOPAX'
params['organism'] = '9606'  # NCBI Taxonomy ID for human
params['datasource'] = None  # requests omits None values, so no data source filter is sent
# Get the "kind" string and validate its normalized form
kind_str = kind.lower()
if kind_str not in ['neighborhood', 'pathsbetween', 'pathsfromto']:
    raise ValueError('Invalid query type %s' % kind_str)
params['kind'] = kind_str
# Get the source string
params['source'] = _get_query_entity(source)
params['limit'] = 1  # graph search distance limit
if kind_str == 'pathsfromto':
    # Requires `target` to be defined (see the commented-out line above)
    params['target'] = _get_query_entity(target)

res = requests.get(pc2_url + 'graph', params=params)
res.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
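
The query above could be wrapped in a small reusable function. The sketch below only builds the request URL and does not contact the server, so it can be checked offline. `build_graph_query` and its argument names are hypothetical; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

PC2_URL = "https://www.pathwaycommons.org/pc2/"
GRAPH_KINDS = ("neighborhood", "pathsbetween", "pathsfromto")


def build_graph_query(kind, source, target=None, limit=1, organism="9606"):
    """Build the URL of a Pathway Commons graph query (no request is sent)."""
    kind = kind.lower()
    if kind not in GRAPH_KINDS:
        raise ValueError("Invalid query type %s" % kind)
    params = {
        "format": "BIOPAX",
        "organism": organism,
        "kind": kind,
        "source": ",".join(source),
        "limit": limit,
    }
    if kind == "pathsfromto":
        # This query type needs both ends of the path
        if not target:
            raise ValueError("A pathsfromto query requires a target")
        params["target"] = ",".join(target)
    return PC2_URL + "graph?" + urlencode(params)


url = build_graph_query("neighborhood", ["JAK1"])
```

Passing the resulting URL to `requests.get` should behave like the cell above. Validating `target` up front avoids the `NameError` that the inline version raises when `kind` is `'pathsfromto'` and `target` was never defined.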

In [62]:
model = pybiopax.model_from_owl_str(res.text)


In [66]:
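# Collect every SmallMolecule whose display name mentions Ruxolitinib
# (a missing display name is treated as an empty string)
matches = [
    obj for obj in model.objects.values()
    if isinstance(obj, pybiopax.biopax.SmallMolecule)
    and 'Ruxolitinib' in (obj.display_name or '')
]
# Show the first match, or None if nothing was found
matches[0] if matches else None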

SmallMolecule(Ruxolitinib)
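
The same lookup pattern can be generalized to any class and name. The sketch below uses two stand-in dataclasses so that it runs without a downloaded model. They are hypothetical and only mimic the `display_name` attribute used above. `find_by_name` is also a hypothetical helper, and matching is case-insensitive by assumption.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in classes (hypothetical, not part of PyBioPAX)
@dataclass
class SmallMolecule:
    display_name: Optional[str]


@dataclass
class Protein:
    display_name: Optional[str]


def find_by_name(objects, cls, text):
    """Return every object of type `cls` whose display name contains `text`."""
    text = text.lower()
    return [
        obj for obj in objects.values()
        if isinstance(obj, cls) and text in (obj.display_name or "").lower()
    ]


# Toy uid -> object mapping, shaped like the dictionary iterated over above
toy_objects = {
    "sm1": SmallMolecule("Ruxolitinib"),
    "sm2": SmallMolecule(None),  # no display name, so it never matches
    "p1": Protein("JAK1"),
}
hits = find_by_name(toy_objects, SmallMolecule, "ruxolitinib")
```

With a real model, the call would presumably look like `find_by_name(model.objects, pybiopax.biopax.SmallMolecule, 'Ruxolitinib')`.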

In [None]:
# model = pybiopax.model_from_pc_query('pathsfromto', ['MAP2K1'], ['MAPK1'])