# MELODI Presto Use Cases

In [1]:
import json
import pandas as pd
import requests
import matplotlib.pyplot as plt
%matplotlib inline 
import numpy as np
from utils import enrich, overlap, sentence, API_URL

## Configure parameters

In [3]:
requests.get(f"{API_URL}/status").json()

True

# Run the overlap function on some examples

### Genetic basis of psoriasis

https://www.cell.com/ajhg/fulltext/S0002-9297(12)00157-7

In [35]:
q1=['AP1S3','IL36RN','CARD14']
q2=['Psoriasis']

overlap_df = overlap(q1,q2)

set_x   object_type_x  object_name_x                
ap1s3   gngm           CARD14                             5
card14  aapp           NF-kappa B                         3
        dsyn           Psoriasis                        861
                       Autoimmune Diseases               10
                       skin disorder                      8
        gngm           NF-kappa B                        21
                       CARD14                             5
il36rn  aapp           Interleukin-1 beta                 5
                       Interleukin Receptor               1
        dsyn           Arthritis, Psoriatic              33
                       Pustulosis of Palms and Soles      3
        gngm           CARD14                             5
                       Interleukin Receptor               1
Name: object_name_x, dtype: int64

In [5]:
overlap_counts = overlap_df.groupby(['set_x','object_type_x'])['object_name_x'].value_counts()
overlap_counts

set_x   object_type_x  object_name_x                
ap1s3   gngm           CARD14                             5
card14  aapp           NF-kappa B                         3
        dsyn           Psoriasis                        861
                       Autoimmune Diseases               10
                       skin disorder                      8
        gngm           NF-kappa B                        21
                       CARD14                             5
il36rn  aapp           Interleukin Receptor               1
        dsyn           Arthritis, Psoriatic              33
                       Pustulosis of Palms and Soles      3
        gngm           CARD14                             5
                       Interleukin Receptor               1
Name: object_name_x, dtype: int64

### Drug repurposing

https://www.medrxiv.org/content/10.1101/2020.05.07.20093286v1

In [32]:
q1=['DHODH', 'ITGB5', 'JAK2']
q2=['Leflunomide','Cilengitide','Baricitinib']

overlap_df = overlap(q1,q2)

In [33]:
overlap_counts = overlap_df.groupby(['set_x','set_y','object_type_x'])['object_name_x'].value_counts()
overlap_counts

set_x  set_y        object_type_x  object_name_x                     
dhodh  leflunomide  aapp           Phosphotransferases                    1
                    dsyn           Rheumatoid Arthritis                   9
                    gngm           Dihydroorotate dehydrogenase          60
                                   Dihydroorotate dehydrogenase|DHODH     1
                    orch           leflunomide                           92
                                   Pyrimidine                            12
jak2   baricitinib  aapp           Janus kinase                           4
                                   Janus kinase 1|JAK1                    4
                                   cytokine                               3
                    gngm           Janus kinase                          10
                                   Janus kinase 1|JAK1                    4
                                   cytokine                               2
                  

### Obesity and thyroid cancer

https://academic.oup.com/jcem/article/105/7/dgaa250/5835841

In [36]:
q1=['obesity']
q2=['thyroid cancer']

overlap_df = overlap(q1,q2)
overlap_counts = overlap_df.groupby(['object_type_x'])['object_name_x'].value_counts()
overlap_counts

object_type_x  object_name_x                     
aapp           Insulin                               17
               ghrelin                               11
               Proto-Oncogene Proteins c-akt|AKT1     5
               PPAR gamma                             4
               FRAP1 protein, human|MTOR              2
               Sex Hormone-Binding Globulin           2
               Corticotropin-Releasing Hormone        1
               Somatostatin                           1
dsyn           Syndrome                              30
               Hypothyroidism                         3
               Crohn's disease                        2
               Thrombophilia                          2
               Thrombus                               2
gngm           Insulin                                7
               PPAR gamma                             7
               Proto-Oncogene Proteins c-akt|AKT1     5
               ghrelin                                

### Coronavirus and dexamethasone

Recent work (https://www.recoverytrial.net/) has demonstrated a potential beneficial effect of dexamethasone on COVID-19. Here we explore the potential intermediates connecting the two, including genes, diseases and hormones. Passing `coronavirus` and `covid-19` as separate query terms also lets us see which semantic terms are associated with each.

In [37]:
q1=['dexamethasone']
q2=['coronavirus','covid-19']

overlap_df = overlap(q1,q2)
overlap_counts = overlap_df.groupby(['set_y','object_type_x'])['object_name_x'].value_counts()
overlap_counts

set_y        object_type_x  object_name_x                            
coronavirus  aapp           cytokine                                      12
                            TNF protein, human|TNF                         6
                            TGFB1 protein, human|TGFB1                     3
                            Tumor Necrosis Factor-alpha|TNF                3
                            Endopeptidases                                 1
                            Pulmonary Surfactant-Associated Protein A      1
                            Pulmonary Surfactant-Associated Protein D      1
             dsyn           Infection                                     24
                            Pneumonia                                     10
                            Syndrome                                       8
                            Virus Diseases                                 8
                            Hypertensive disease                           7
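
To make the comparison between the two query terms explicit, the intermediates can be split into those shared by every query and those unique to one. The following is a minimal sketch, not part of MELODI Presto: the helper `split_by_query` and the placeholder term names are hypothetical, and it assumes the per-query sets were built from `overlap_df`, for example with `overlap_df.groupby('set_y')['object_name_x'].apply(set).to_dict()`.

```python
def split_by_query(terms_by_query):
    """Split terms into those shared by all queries and those unique to one query."""
    all_sets = list(terms_by_query.values())
    shared = set.intersection(*all_sets) if all_sets else set()
    unique = {}
    for query, terms in terms_by_query.items():
        # Union of the terms seen for every other query
        others = set().union(
            *(t for q, t in terms_by_query.items() if q != query)
        )
        unique[query] = terms - others
    return shared, unique


# Hypothetical placeholder data, not real MELODI Presto output
toy_terms = {
    "query_a": {"term_1", "term_2", "term_3"},
    "query_b": {"term_2", "term_4"},
}

shared, unique = split_by_query(toy_terms)
print("shared:", sorted(shared))
for query, terms in unique.items():
    print(query, "only:", sorted(terms))
```

Working with plain sets keeps the comparison independent of how the counts are displayed; the counts themselves stay available in `overlap_counts`.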
      

# Identifying risk factors for a disease

We can explore one disease in detail to identify risk factors. In this case, `breast cancer`.

In [None]:
q='breast cancer'
enrich_df=enrich(q)

#keep only rows whose object name contains the query term (literal, case-insensitive match)
enrich_df = enrich_df[enrich_df['object_name'].str.contains(q,case=False,regex=False)]
#print(enrich_df)
print(enrich_df.shape)

#list of risk factor predicates
rf_preds=['CAUSES','PREDISPOSES','PRECEDES','STIMULATES']
#take a copy so that assigning to a column later does not raise a SettingWithCopyWarning
rf=enrich_df[enrich_df['predicate'].isin(rf_preds)].copy()

#make sure pval is a float
rf['pval']=rf['pval'].astype(float)

#look at the top 20 ordered by enrichment pvalue
rf.sort_values(by='pval',ascending=True).head(20)[['subject_name','subject_type','predicate','object_name','pval','localCount']]

# All against all

MELODI Presto is fast enough that we can run an all-against-all analysis over a set of terms, e.g. a list of genes and diseases. The results can be assembled into a network that highlights potential shared mechanisms of action.

As an example we examine the relationships between 

In [17]:
#run enrich on each term, then build a network from the results
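
One possible way to build such a network is to link two query terms whenever they share at least one enriched term. The sketch below shows only the network-building step under stated assumptions: the helper `shared_term_edges` and the placeholder data are hypothetical, and it assumes each query has already been reduced to a set of term names, for example `set(enrich(q)['subject_name'])`.

```python
from itertools import combinations


def shared_term_edges(terms_by_query):
    """Return {(query_a, query_b): sorted shared terms} for each pair that overlaps."""
    edges = {}
    # Compare every unordered pair of queries exactly once
    for a, b in combinations(sorted(terms_by_query), 2):
        shared = terms_by_query[a] & terms_by_query[b]
        if shared:
            edges[(a, b)] = sorted(shared)
    return edges


# Hypothetical placeholder data, not real MELODI Presto output.
# Assumption: in practice each set would come from set(enrich(q)['subject_name']).
toy_terms = {
    "query_1": {"term_a", "term_b"},
    "query_2": {"term_b", "term_c"},
    "query_3": {"term_d"},
}

edges = shared_term_edges(toy_terms)
for (a, b), shared in edges.items():
    print(f"{a} -- {b}: {shared}")
```

The resulting edge dictionary can be passed to any graph library for layout and plotting. Weighting each edge by the number of shared terms, or by enrichment p-value, would be a natural refinement.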