## Path Setup
Add the parent directory to the Python path so that the notebook can import the project modules.

In [2]:
import sys
import os

cwd = os.getcwd()  # Current working directory
dirname = os.path.dirname(cwd)  # Parent directory
print(cwd)
print(dirname)
sys.path.append(dirname)  # Add the parent directory to the Python path
print(sys.path)

/Users/rudi/Documents/GitHub/agent_evaluation/notebooks
/Users/rudi/Documents/GitHub/agent_evaluation
['/Users/rudi/Documents/GitHub/agent_evaluation/notebooks', '/Users/rudi/anaconda3/envs/dengue/lib/python311.zip', '/Users/rudi/anaconda3/envs/dengue/lib/python3.11', '/Users/rudi/anaconda3/envs/dengue/lib/python3.11/lib-dynload', '', '/Users/rudi/anaconda3/envs/dengue/lib/python3.11/site-packages', '/Users/rudi/Documents/GitHub/agent_evaluation']
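The cell above appends the parent directory every time it runs, so re-running it adds a duplicate entry to `sys.path`. A minimal sketch of an idempotent variant follows; the helper name `add_parent_to_path` is hypothetical and not part of the project.

```python
import sys
from pathlib import Path


def add_parent_to_path(start=None):
    """Append the parent of `start` (default: the current working directory) to sys.path once."""
    base = Path(start) if start is not None else Path.cwd()
    parent = str(base.resolve().parent)
    if parent not in sys.path:  # skip the append when the entry is already present
        sys.path.append(parent)
    return parent


project_root = add_parent_to_path()
print(project_root)
```

Calling the helper a second time leaves `sys.path` unchanged, which keeps the printed path list readable after repeated runs of the cell.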


## Get a Hierarchy


In [3]:
from agent_evaluation.hierarchy import Hierarchy
import json
import ndex2 
from ndex2.cx2 import RawCX2NetworkFactory

# Create NDEx2 python client
client = ndex2.client.Ndex2()

# Create CX2Network factory
factory = RawCX2NetworkFactory()

# Download BioGRID: Protein-Protein Interactions (SARS-CoV) from NDEx
# https://www.ndexbio.org/viewer/networks/669f30a3-cee6-11ea-aaef-0ac135e8bacf
# client_resp = client.get_network_as_cx2_stream('669f30a3-cee6-11ea-aaef-0ac135e8bacf')

# Dengue string interactome network c223d6db-b0e2-11ee-8a13-005056ae23aa
client_resp = client.get_network_as_cx2_stream('c223d6db-b0e2-11ee-8a13-005056ae23aa')

# Convert downloaded interactome network to CX2Network object
interactome = factory.get_cx2network(json.loads(client_resp.content))

# Dengue hierarchy
# https://www.ndexbio.org/viewer/networks/59bbb9f1-e029-11ee-9621-005056ae23aa
client_resp = client.get_network_as_cx2_stream('59bbb9f1-e029-11ee-9621-005056ae23aa')

# Convert downloaded hierarchy network to CX2Network object
hierarchy = factory.get_cx2network(json.loads(client_resp.content))

# Display information about the hierarchy network
print('Name: ' + hierarchy.get_name())
print('Number of nodes: ' + str(len(hierarchy.get_nodes())))
print('Number of edges: ' + str(len(hierarchy.get_edges())))

# Display information about the interactome network
print('Name: ' + interactome.get_name())
print('Number of nodes: ' + str(len(interactome.get_nodes())))
print('Number of edges: ' + str(len(interactome.get_edges())))


Name: Dengue model - hidef string 12.0 0.7 (GPT-4 annotated) - L2R
Number of nodes: 203
Number of edges: 249
Name: dengue string 12.0 0.7
Number of nodes: 1375
Number of edges: 2792


## Get Datasets

In [4]:
dengue_hierarchy = Hierarchy(hierarchy, interactome)
print(dengue_hierarchy.get_experiment_description())
datasets = dengue_hierarchy.get_datasets(member_attributes=["name", "DV3_24h-Mock_24h"],
                                         filter={"max_size": 6})[32:33]  # keep a single dataset; alternative slice: [1:33:31]
for dataset in datasets:
    print(dataset.data)

None
[{'name': 'SP110', 'DV3_24h-Mock_24h': 0.842989962}, {'name': 'PARP9', 'DV3_24h-Mock_24h': 0.971403167}, {'name': 'SAMD9L', 'DV3_24h-Mock_24h': 1.172951102}, {'name': 'DTX3L'}, {'name': 'PARP15'}]
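The printed dataset shows that only some members carry a `DV3_24h-Mock_24h` value. As a small sketch, the rows can be split into measured and unmeasured proteins; the rows are copied from the output above, and the variable names are illustrative only.

```python
# Rows copied from the dataset printed above.
rows = [
    {"name": "SP110", "DV3_24h-Mock_24h": 0.842989962},
    {"name": "PARP9", "DV3_24h-Mock_24h": 0.971403167},
    {"name": "SAMD9L", "DV3_24h-Mock_24h": 1.172951102},
    {"name": "DTX3L"},
    {"name": "PARP15"},
]
column = "DV3_24h-Mock_24h"

# Proteins with a recorded measurement, keyed by name.
measured = {row["name"]: row[column] for row in rows if column in row}
# Proteins that are members of the set but have no measurement.
unmeasured = [row["name"] for row in rows if column not in row]
# Under the sign convention used in the analyst prompt, values > 0 mean increased abundance.
increased = [name for name, value in measured.items() if value > 0]

print("measured:", measured)
print("unmeasured:", unmeasured)
print("increased:", increased)
```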


## Analyst Agents

In [5]:
from agent_evaluation.analyst import Analyst
from agent_evaluation.llm import OpenAI_LLM

gpt3_5 = OpenAI_LLM("gpt-3.5-turbo-1106")
gpt4 = OpenAI_LLM("gpt-4-0125-preview")

Model: gpt-3.5-turbo-1106, Temperature: 0, Max Tokens: 2048, Seed: 42
Model: gpt-4-0125-preview, Temperature: 0, Max Tokens: 2048, Seed: 42


In [6]:
# Analyst 1 > Jane (GPT-3.5-turbo-1106)

analyst_1_context = """
You are a helpful analyst of genomic, proteomic, and other biological data. 
"""

analyst_1_prompt_template = """ 
The provided proteomics "dataset" includes interacting proteins and the measurements of their differential abundance as a ratio between treated and non-treated samples, where the treatment is the infection of human cells with Dengue virus. 
Not all proteins in the dataset have differential abundance measurements.

The dataset has 2 columns with the following headers: name, DV3_24h-Mock_24h. 
The first column contains the protein names and the last column contains the abundance measurements.
Please note that measurements <0 reflect a "decreased abundance" while measurements >0 indicate an "increased abundance".

Your task is to leverage this dataset to analyze a subset of interacting proteins that are defined as "proteins of interest".

First, determine what proteins of interest show a differential abundance recorded in the dataset. 
Then, based on this information and on the known functions of all other proteins of interest, 
I want you to generate a hypothesis describing the mechanisms that may contribute to the disease state 
and could potentially be targeted by drug therapies.

Your hypothesis should meet the following criteria:
1) Include one or more molecular mechanisms involving one or more proteins of interest
2) Be plausible - grounded in known molecular functions and interactions
3) Be novel - proposing mechanisms either not known or not known to be relevant to the experimental context
4) Be actionable - can be validated with relatively low-cost experimental techniques

When presenting your results, please adhere to the following guidelines:

- Avoid including any code.
- Do not describe the analytical steps you took.
- Do not merely list the proteins of interest, regardless of whether they show a differential abundance recorded in the dataset or not.
- Build your hypotheses taking into consideration the interplay among all proteins of interest, not only those that show a differential abundance in the dataset.

- Your output should consist solely of the identified proteins of interest with changed abundance levels, and the hypothesis you propose.

Here is the set of proteins of interest: 
{data}
"""

analyst_1 = Analyst(gpt3_5, analyst_1_context, analyst_1_prompt_template, "Jane", "The first analyst")
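The template ends with a `{data}` placeholder. How `Analyst` fills it is not shown in this notebook; the sketch below assumes plain `str.format` substitution and renders the rows as a two-column, tab-separated table. The helper `render_rows` and the shortened template are hypothetical stand-ins, not the package's implementation.

```python
def render_rows(rows, columns):
    """Render rows as a tab-separated table; missing values are left blank."""
    lines = ["\t".join(columns)]
    for row in rows:
        lines.append("\t".join(str(row.get(col, "")) for col in columns))
    return "\n".join(lines)


# Shortened stand-in for the full prompt template defined above.
prompt_template = """Here is the set of proteins of interest:
{data}
"""

rows = [
    {"name": "SP110", "DV3_24h-Mock_24h": 0.842989962},
    {"name": "DTX3L"},  # no measurement recorded
]

table = render_rows(rows, ["name", "DV3_24h-Mock_24h"])
prompt = prompt_template.format(data=table)
print(prompt)
```

If substitution does go through `str.format`, any literal braces added to the template later would have to be doubled (`{{` and `}}`) so they are not read as placeholders.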


In [7]:
# Analyst 2 > John (GPT-4-0125-preview)

analyst_2_context = analyst_1_context

analyst_2_prompt_template = analyst_1_prompt_template


analyst_2 = Analyst(gpt4, analyst_2_context, analyst_2_prompt_template, "John", "The second analyst")

## The TestPlan

In [8]:
from agent_evaluation.test import TestPlan

test_plan = TestPlan(analysts=[analyst_1, analyst_2], datasets=datasets)


## Run the Test

The OpenAI Python package must not be newer than 0.28:

- https://github.com/openai/openai-python

- https://github.com/openai/openai-python/discussions/742

If the Genai package is used, the OpenAI package must be 0.27.x.
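A quick check of the installed version before running the test might look like the sketch below. It reads the constraint as "major.minor no higher than 0.28", which is an interpretation of the note above; the function names are illustrative.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def major_minor(version_string):
    """Return (major, minor) as integers from a version string such as '0.28.1'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def openai_is_supported(version_string, ceiling=(0, 28)):
    """True when the major.minor part of the version does not exceed `ceiling`."""
    return major_minor(version_string) <= ceiling


try:
    installed = version("openai")
    status = "supported" if openai_is_supported(installed) else "too new for this notebook"
    print(f"openai {installed}: {status}")
except PackageNotFoundError:
    print("openai is not installed")
```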

In [9]:
from agent_evaluation.test import Test

test = Test(test_plan)
test.run()

Generating hypothesis by Jane on [{'name': 'SP110', 'DV3_24h-Mock_24h': 0.842989962}, {'name': 'PARP9', 'DV3_24h-Mock_24h': 0.971403167}, {'name': 'SAMD9L', 'DV3_24h-Mock_24h': 1.172951102}, {'name': 'DTX3L'}, {'name': 'PARP15'}]
Generating hypothesis by John on [{'name': 'SP110', 'DV3_24h-Mock_24h': 0.842989962}, {'name': 'PARP9', 'DV3_24h-Mock_24h': 0.971403167}, {'name': 'SAMD9L', 'DV3_24h-Mock_24h': 1.172951102}, {'name': 'DTX3L'}, {'name': 'PARP15'}]


In [10]:
for hypothesis in test.hypotheses:
    print(f"{hypothesis.analyst.name} ({hypothesis.analyst.llm.model_name}):")
    print(hypothesis.description)
    print("---")

Jane (gpt-3.5-turbo-1106):
('The proteins of interest with differential abundance recorded in the dataset are:\n- SP110 (DV3_24h-Mock_24h: 0.842989962, indicating decreased abundance)\n- PARP9 (DV3_24h-Mock_24h: 0.971403167, indicating decreased abundance)\n- SAMD9L (DV3_24h-Mock_24h: 1.172951102, indicating increased abundance)\n\nBased on the known functions of these proteins and the other proteins of interest, a hypothesis can be proposed:\n\nHypothesis:\nThe differential abundance of SP110, PARP9, and SAMD9L in response to Dengue virus infection suggests a potential mechanism involving the regulation of host immune response and antiviral defense. SP110 and PARP9 are known to be involved in innate immune response and have been implicated in antiviral activities. The decreased abundance of SP110 and PARP9 may indicate a subversion of host immune response by the virus, leading to compromised antiviral defense. On the other hand, the increased abundance of SAMD9L may reflect a compensa

In [11]:
# Check the number of hypotheses generated (should be just 2, 1 by Jane and 1 by John)

len(test.hypotheses)

2

## Reviewers

In [12]:
from agent_evaluation.reviewer import Reviewer


# Reviewer 1 > James Watson (GPT-3.5-turbo-1106)

reviewer_1_context = "You are a full professor with extensive knowledge of molecular mechanisms in biology and human diseases"

reviewer_1_prompt_template = """
Starting from an experimental dataset and a list of proteins of interest, our analysts have generated 2 hypotheses
that might explain the observed data upon infection of a human cell line with the Dengue virus.

Your task is to carefully review the 2 hypotheses provided, and choose the best one based on the following evaluation criteria:

1) Mechanistic - The hypothesis includes one or more molecular mechanisms involving one or more proteins of interest.
2) Plausible - The hypothesis is plausible and grounded in known molecular functions and interactions.
3) Novel - The hypothesis proposes mechanisms either not known or not known to be relevant to the experimental context.
4) Actionable - The hypothesis is actionable and can be validated with relatively simple, low-cost experimental techniques.

You must execute your evaluation using only the information provided in the 2 hypotheses.

When presenting your output, only include the following info:
1) Which analyst's hypothesis you deem to be the best one ({analyst_a} or {analyst_b}).
2) What are the reasons that dictated your decision.
3) If the 2 hypotheses are of equivalent quality, don't make a choice and provide a brief explanation supporting your decision.

Here are the hypotheses:
{analyst_a}: {hypothesis_a}
{analyst_b}: {hypothesis_b}
"""

reviewer_1 = Reviewer(gpt3_5, reviewer_1_context, reviewer_1_prompt_template, "James Watson", "The first reviewer")


In [13]:
# Reviewer 2 > Francis Crick (GPT-4-0125-preview)

reviewer_2_context = reviewer_1_context
reviewer_2_prompt_template = reviewer_1_prompt_template

reviewer_2 = Reviewer(gpt4, reviewer_2_context, reviewer_2_prompt_template, "Francis Crick", "The second reviewer")

## The ReviewPlan

In [14]:
from agent_evaluation.review import ReviewPlan

review_plan = ReviewPlan(reviewers=[reviewer_1, reviewer_2], test=test)


## Run the Review

In [15]:
from agent_evaluation.review import Review

review = Review(review_plan)    
review.run()


Generating comparison by James Watson...
Generating comparison by James Watson...
Generating comparison by Francis Crick...
Generating comparison by Francis Crick...
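The log above shows two comparisons per reviewer for a single pair of hypotheses. One possible reading, an assumption rather than something this notebook states, is that each pair is presented in both orders so that neither hypothesis always appears first. The sketch below fills the four placeholders of the reviewer template for both orderings, using a shortened stand-in template and placeholder hypothesis texts.

```python
from itertools import permutations

# Shortened stand-in for the reviewer prompt template defined above.
template = """Here are the hypotheses:
{analyst_a}: {hypothesis_a}
{analyst_b}: {hypothesis_b}
"""

# Placeholder texts; the real hypotheses come from test.hypotheses.
hypotheses = {
    "Jane": "<Jane's hypothesis text>",
    "John": "<John's hypothesis text>",
}

# One prompt per ordered pair, i.e. (Jane, John) and (John, Jane).
prompts = [
    template.format(
        analyst_a=a, hypothesis_a=hypotheses[a],
        analyst_b=b, hypothesis_b=hypotheses[b],
    )
    for a, b in permutations(hypotheses, 2)
]

for prompt in prompts:
    print(prompt)
```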


In [16]:
for comparison in review.comparisons:
    print(f"{comparison.reviewer.name} ({comparison.reviewer.llm.model_name})")
    print(comparison.comment)
    print("----")

James Watson (gpt-3.5-turbo-1106)
("Based on the evaluation criteria provided, I would choose John's hypothesis as the best one. Here are the reasons for my decision:\n\n1) Mechanistic: John's hypothesis provides a detailed molecular mechanism involving the interplay of SP110, PARP9, and SAMD9L in response to Dengue virus infection, including their roles in antiviral responses, DNA repair activation, and inflammation. It also discusses the potential consequences of the increased abundance of these proteins on the disease state.\n\n2) Plausible: The hypothesis is grounded in known molecular functions and interactions of the proteins of interest, linking their increased abundance to specific cellular responses and potential disease outcomes.\n\n3) Novel: John's hypothesis proposes a novel mechanism by which the interplay of SP110, PARP9, and SAMD9L could exacerbate disease during Dengue virus infection, providing new insights into the potential consequences of the host's antiviral respon

## Creating a mock analyst & hypothesis

TODO: wrap all of the steps below in a single function.
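One way such a wrapper could look is sketched here. It only mirrors the constructor calls that appear in this notebook and takes the classes through an `api` object so it can be exercised without the package; the function name and the `api` parameter are hypothetical, and the real signatures in `agent_evaluation` may differ.

```python
def review_mock_hypothesis(api, description, name, dataset, baseline, reviewers):
    """Compare a hand-written hypothesis against `baseline` and return the Review.

    `api` is any object exposing the Analyst, Hypothesis, TestPlan, Test,
    ReviewPlan and Review classes used in this notebook (an assumption of
    this sketch, not a feature of the package).
    """
    # A mock analyst has no LLM; it only carries the author's name.
    mock_analyst = api.Analyst(llm=None, context=None, prompt_template=None,
                               name=name, description=None)
    mock_hypothesis = api.Hypothesis(dataset, mock_analyst, description)

    # Build a Test without running it, then set its hypotheses by hand.
    plan = api.TestPlan(analysts=[baseline.analyst, mock_analyst],
                        datasets=[dataset])
    mock_test = api.Test(plan)
    mock_test.hypotheses = [mock_hypothesis, baseline]

    # Run the reviewers on the hand-assembled test.
    review = api.Review(api.ReviewPlan(reviewers=reviewers, test=mock_test))
    review.run()
    return review


# Possible usage in this notebook, once the classes have been imported:
# from types import SimpleNamespace
# api = SimpleNamespace(Analyst=Analyst, Hypothesis=Hypothesis, TestPlan=TestPlan,
#                       Test=Test, ReviewPlan=ReviewPlan, Review=Review)
# mock_review = review_mock_hypothesis(api, mock_description, "Rudi", datasets[0],
#                                      test.hypotheses[1], [reviewer_1, reviewer_2])
```

Passing the classes in explicitly keeps the helper free of import-time dependencies, at the cost of a slightly longer call.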

In [18]:
mock_analyst = Analyst(llm=None, context=None, prompt_template=None, name="Rudi", description=None)

In [20]:
mock_analyst.name

'Rudi'

In [21]:
mock_description = (
    "Identified Proteins of Interest with Changed Abundance Levels:\n"
    "- SP110: Increased abundance (0.842989962)\n"
    "The interaction and differential abundance of SP110, in response to Dengue virus (DV) infection suggest a novel molecular mechanism that could contribute to the disease state and offer potential targets for drug therapy. \n\n"
    "SP110 is a nuclear body protein involved in innate immunity and has been implicated in viral defense mechanisms. Its increased abundance upon DV infection suggests an enhanced cellular attempt to mount an antiviral response. SP110 and its interacting partner DTX3L, although DTX3L's abundance was not measured, form a complex known to be involved in the regulation of DNA damage responses and inflammation. The increased abundance of PARP9 could indicate an activation of DNA repair mechanisms and inflammatory responses following DV infection. SAMD9L, similarly, plays a role in antiviral responses and has been shown to be involved in the negative regulation of cell proliferation.\n\n"
    "Given the roles of these proteins, we propose that the increased abundance of SP110, PARP9, and SAMD9L upon DV infection leads to a heightened state of antiviral response, DNA repair activation, and inflammation. However, this response might inadvertently contribute to the disease state by promoting excessive inflammation and potentially interfering with normal cell functions, leading to cellular stress and damage. This imbalance could facilitate viral replication or exacerbate disease symptoms.\n\n"
    "To validate this hypothesis, we suggest the following low-cost experimental approaches:\n"
    "1. **siRNA Knockdown Experiments**: Use siRNA to knock down SP110, PARP9, and SAMD9L in DV-infected human cell lines to assess changes in viral replication and cell viability. A decrease in viral load or alleviation of cell damage upon knockdown would support the hypothesis that these proteins, while part of the antiviral response, contribute to the disease state.\n"
    "2. **Inflammatory Cytokine Profiling**: Measure the levels of inflammatory cytokines in DV-infected cells with and without the knockdown of SP110. An increase in these proteins should correlate with higher cytokine levels, supporting their role in inflammation during DV infection.\n"
    "3. **DNA Damage Assays**: Perform assays to assess DNA damage (e.g., comet assay, γ-H2AX foci formation) in DV-infected cells with altered expression of PARP9. This would help to elucidate the role of PARP9 in DNA damage response during DV infection.\n\n"
    "This hypothesis not only proposes a novel mechanism by which DV infection could exacerbate disease through the interplay of SP110, PARP9, and SAMD9L but also suggests that targeting these proteins could mitigate disease severity, offering a new avenue for therapeutic intervention."
)

In [23]:
from agent_evaluation.hypothesis import Hypothesis
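# Use datasets[0] explicitly instead of the `dataset` variable left over from the earlier for loop
mock_hypothesis = Hypothesis(datasets[0], mock_analyst, mock_description)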

In [24]:
mock_test_plan = TestPlan(analysts=[analyst_2, mock_analyst], datasets=datasets)

In [29]:
mock_test = Test(mock_test_plan)
mock_test.hypotheses=[mock_hypothesis, test.hypotheses[1]]

In [30]:
mock_review_plan = ReviewPlan(reviewers=[reviewer_1, reviewer_2], test=mock_test)

In [31]:
mock_review = Review(mock_review_plan)

In [32]:
mock_test

<agent_evaluation.test.Test at 0x1171bdb50>

In [37]:
mock_review.run()

Generating comparison by James Watson...
Generating comparison by James Watson...
Generating comparison by Francis Crick...
Generating comparison by Francis Crick...


In [38]:
for comparison in mock_review.comparisons:
    print(f"{comparison.reviewer.name} ({comparison.reviewer.llm.model_name})")
    print(comparison.comment)
    print("----")

James Watson (gpt-3.5-turbo-1106)
("Based on the evaluation criteria provided, I would choose Rudi's hypothesis as the best one. Here are the reasons for my decision:\n\n1) Mechanistic: Rudi's hypothesis includes a clear molecular mechanism involving the proteins of interest (SP110, PARP9, and SAMD9L) and their roles in antiviral response, DNA repair activation, and inflammation.\n\n2) Plausible: The hypothesis is grounded in known molecular functions and interactions of the proteins involved in innate immunity, viral defense mechanisms, DNA damage responses, and inflammation.\n\n3) Novel: Rudi's hypothesis proposes mechanisms that are relevant to the experimental context and suggests a novel interplay of the identified proteins in exacerbating disease through the heightened state of antiviral response and inflammation.\n\n4) Actionable: The hypothesis provides low-cost experimental approaches for validation, including siRNA knockdown experiments, inflammatory cytokine profiling, and D