##### Canned question: Mechanism of Action (MoA) cross-results

This simple demo shows how an LLM can be used to group Translator results according to their mechanisms of action.

In [20]:
from graphwerk import trapimsg
import summarizers.trapi_tools as tools
import summarizers.common_utils as cu
from driver import load_trapi_response

In [22]:
trapi_path = 'data/101635ed-6bb7-46fc-a2f1-96d814c34040.json'
trapi_response_msg = load_trapi_response(trapi_path)

In [57]:
orig_msg = trapi_response_msg['fields']['data']['message']
orig_kg = orig_msg['knowledge_graph']
orig_ag = orig_msg['auxiliary_graphs']
orig_qg = orig_msg['query_graph']
retval = ''
object_node_id, object_node_data = tools.get_object_node_data(orig_qg, orig_kg)
res_nodes = {} # The node section is universal, not per-result

per_result = ''
counter = 1
# Take the first 10 results (or fewer, if there are not that many) for manual review
idx_range = range(0, min(10, len(orig_msg['results'])))
pub_cutoff = 0
for i in idx_range:
    cur_res = orig_msg['results'][i]
    subject_id = cur_res['node_bindings']['sn'][0]['id']
    subject_name = orig_kg['nodes'][subject_id]['name']
    per_result += f"** Result {counter}: {subject_name}\n"
    res_edges = {}
    res_sgs = {}
    trapimsg.collect_edges_and_sgs_for_res_elem(cur_res, orig_kg, orig_ag, res_edges, res_sgs)
    trapimsg.collect_nodes_for_edge_collection(res_edges, orig_kg, res_nodes)
    presum_edges = tools.create_edge_presummary_raw_data(res_edges, res_nodes, pub_cutoff)
    presum_nodes = tools.create_node_presummary_raw_data(res_nodes)
    per_result += cu.create_edge_data_summary(presum_edges, 1) + '\n'
    counter += 1

In [1]:
# print(per_result.rstrip())

In [69]:
def generate_prompt(results):
    if results:
        prompt = f"""I’m not looking for detailed mechanism of actions of each drug/chemical. Please analyze the cross-result patterns across all inferred paths in the provided results: what common MoA themes, gene targets, or pathways emerge? Group the drugs/chemicals into categories based on shared mechanisms rather than listing them individually.
Results: 
{results}
        """
    else:
        prompt = None

    return prompt

In [72]:
prompt = generate_prompt(per_result.rstrip())
print(prompt)

I’m not looking for detailed mechanism of actions of each drug/chemical. Please analyze the cross-result patterns across all inferred paths in the provided results: what common MoA themes, gene targets, or pathways emerge? Group the drugs/chemicals into categories based on shared mechanisms rather than listing them individually.
Results: 
** Result 1: Prednisone
| <SUBJECT> | <PREDICATE> | <OBJECT> | <PUBMED IDS> | <CLINICAL TRIAL IDS> |
| Prednisone | treats_or_applied_or_studied_to_treat | Multiple Pulmonary Nodules |  |  |

** Result 2: dexamethasone sodium phosphate
| <SUBJECT> | <PREDICATE> | <OBJECT> | <PUBMED IDS> | <CLINICAL TRIAL IDS> |
| dexamethasone sodium phosphate | in_clinical_trials_for | Nodule of lung |  | NCT00906503 |
| NCF2 | gene_associated_with_condition | Nodule of lung |  |  |
| STAT1 | gene_associated_with_condition | Nodule of lung |  |  |
| CTLA4 | gene_associated_with_condition | Nodule of lung |  |  |
| HLA-DPA1 | gene_associated_with_condition | Nodule of

In [71]:
from ollama import Client

client = Client()

if prompt:
    messages = [
        {
            'role': 'user',
            'content': prompt,
        },
    ]

    response = client.chat(
        model='gpt-oss:20b',
        messages=messages,
        options={'num_ctx': 8192},  # 8192 is the recommended lower limit for the context window
    )

    response_content = response['message']['content']
else:
    response_content = "No prompt is provided."

print(response_content)

**Cross‑pathway overview**

| Category | Core mechanism(s) | Shared gene‑targets / pathways | Representative drugs (listed as *type* rather than each name) |
|----------|-------------------|---------------------------------|--------------------------------------------------------------|
| **1. Glucocorticoid‑mediated transcriptional re‑pression** | Activation of the glucocorticoid receptor (GR) → recruitment of histone deacetylases & co‑repressors → suppression of NF‑κB, AP‑1, and STAT1‑driven transcription. | **STAT1, CTLA4, NCF2** (down‑regulated); also dampens HLA‑DP expression indirectly. | Prednisone, Dexamethasone (and analogues) |
| **2. Cytokine/Immune‑checkpoint modulators that act through JAK‑STAT / CTLA4** | Exogenous cytokines or receptor antagonists alter the JAK‑STAT cascade (primarily STAT1) and/or directly influence T‑cell costimulatory molecules. | **STAT1, CTLA4, PTPN22, PRTN3** (modulated); also influence HLA‑DP genes. | Interferon‑γ, IL‑10, Tocilizumab (IL‑6R blocka

In [None]:
# TODO:
# - Review the response from the LLM.
# - Propose improvements based on that review.
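
One possible starting point for the review is a simple coverage check: which result names from the prompt never appear in the LLM response? The sketch below is an illustration only. The helper name `find_unmentioned_results` is hypothetical, the toy strings stand in for `per_result` and `response_content`, and the header pattern assumes the `** Result N: name` lines built earlier in this notebook.

```python
import re


def find_unmentioned_results(results_text: str, response_text: str) -> list[str]:
    """Return result names from the prompt that the response never mentions.

    Hypothetical helper: uses a case-insensitive exact substring match only.
    """
    # Result headers in this notebook look like "** Result 1: Prednisone".
    names = re.findall(r"^\*\* Result \d+: (.+)$", results_text, flags=re.MULTILINE)
    response_lower = response_text.lower()
    return [n.strip() for n in names if n.strip().lower() not in response_lower]


# Toy strings standing in for per_result and response_content (assumptions).
example_results = (
    "** Result 1: Prednisone\n| ... |\n\n"
    "** Result 2: dexamethasone sodium phosphate\n| ... |"
)
example_response = "Glucocorticoids such as Prednisone suppress NF-kB signalling."

missing = find_unmentioned_results(example_results, example_response)
print(missing)  # ['dexamethasone sodium phosphate']
```

Because the match is an exact substring test, a response that shortens "dexamethasone sodium phosphate" to "Dexamethasone" would still be flagged, so the output is a list of candidates for manual review rather than a verdict.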

In [None]:
# Next steps:
# - Limit the input to roughly 10-20 results(?) because of context-length / attention limitations(?).
# - Include node details in the prompt if we do not want the LLM to rely on knowledge from its training data.
# - Let the user choose which results to include?
# - Develop tools that let the LLM retrieve better background knowledge about the nodes and relations
#   from the Translator result graph, e.g. gene information, gene enrichment / related genes(?).
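
Two of the next steps (capping the number of input results and letting the user choose them) could be sketched roughly as follows. Everything here is hypothetical: the helper names, the default cap of 10, the 4-characters-per-token heuristic, and the 2048 tokens reserved for the answer are all assumptions for illustration, not measured values or part of any existing module in this notebook.

```python
def select_result_indices(n_results, requested=None, max_results=10):
    """Pick which result indices to summarize, capped at max_results."""
    if requested is None:
        candidates = range(n_results)
    else:
        # Drop duplicates and out-of-range indices while keeping the user's order.
        candidates = [i for i in dict.fromkeys(requested) if 0 <= i < n_results]
    return list(candidates)[:max_results]


def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; a character-count heuristic, not a real tokenizer."""
    return int(len(text) / chars_per_token) + 1


def fits_context(prompt, num_ctx=8192, reserve_for_answer=2048):
    """Check whether the prompt plus a reserved answer budget fits in num_ctx."""
    return estimate_tokens(prompt) + reserve_for_answer <= num_ctx


default_pick = select_result_indices(25)
user_pick = select_result_indices(25, requested=[3, 3, 40, 1])
print(default_pick)
print(user_pick)
print(fits_context("x" * 1000), fits_context("x" * 40000))
```

The list returned by `select_result_indices` could replace `idx_range` in the result loop above, and `fits_context` could be called on the generated prompt before sending it to the model. A real tokenizer for the chosen model would give a more reliable budget than the character heuristic.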