In [2]:
import os
from dotenv import load_dotenv
# Load environment variables from .env file

load_dotenv()

openai_api_key = os.getenv("OPENAI_API_KEY")



https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_stateflow.ipynb - see here for examples of coder and paper summariser agent

1. initialiser (GC manager)
2. Coder - get papers API (why is it a coder)
3. Execute the code? - human input 
4. Scientist read the papers and write a summary



In [3]:
import os
import autogen

load_dotenv() 

openai_api_key = os.getenv("OPENAI_API_KEY")


In [18]:
import json
import autogen
import os

# Get the API key from environment variable
openai_api_key = os.environ.get("OPENAI_API_KEY")

# Load the config list from JSON
config_list = autogen.config_list_from_json(
    "OA_config_list.json",
    filter_dict={"model": ["gpt-4"]}
)

# Add the API key to each config in the list
for config in config_list:
    config["api_key"] = openai_api_key

In [30]:
import tempfile
from autogen.coding import LocalCommandLineCodeExecutor
from autogen import GroupChat
from APIs.pubmed import PubMedAPI # Import the PubMedAPI class for literature search


pubmed_api =PubMedAPI(email="sanazkazemi@hotmail.com") # for enterez to contact you if neccessary

    # Add other database APIs as needed
def use_pubmed_api(query, max_results=10):
    """
        Search PubMed for scientific literature related to the given query.

        This function should be used when you need to retrieve recent, peer-reviewed scientific 
        information from biomedical literature. It's particularly useful for:
        - Finding evidence to support scientific claims
        - Gathering information on recent advancements in a specific area of biomedical research
        - Identifying key papers or authors in a particular field
        - Checking the current state of knowledge on a specific topic

        Use this function when:
        - You need up-to-date, scientifically validated information
        - You want to cite specific papers to support your arguments
        - You need to explore the current research landscape on a topic

        Do not use this function when:
        - You need information from non-scientific sources
        - The query is not related to biomedical or life sciences
        - You require full-text articles (this API only provides abstracts)
        - You need information from a specific paper (use DOI or PMID instead)
        - You're looking for general knowledge that doesn't require scientific citation

        Args:
        query (str): The search query. Can include Boolean operators (AND, OR, NOT) and 
                    field tags (e.g., [Title], [Author], [Journal]).
        max_results (int, optional): Maximum number of results to return. Default is 5.
                                    Increasing this number will increase the API call duration.

        Returns:
        str: A formatted string containing details of the found papers, including titles, 
            authors, journal names, publication years, DOIs, and abstracts.

        Example:
        >>> result = use_pubmed_api("ACE inhibitors hypertension", 3)
        >>> print(result)
        Title: Comparative Effectiveness of ACE Inhibitors and ARBs in Hypertension Treatment
        Authors: Smith J, Johnson M, Williams R
        Journal: Journal of Hypertension, 2023
        DOI: 10.1000/jht.2023.1234
        Abstract: This study compares the effectiveness of ACE inhibitors and ARBs in...

        Note:
        - This function makes real-time API calls to PubMed. Use it judiciously to avoid 
        overwhelming the server or exceeding usage limits.
        - The results are based on PubMed's relevance ranking and may not always return 
        the most recent papers first.
        - Always critically evaluate the returned information and cross-reference when necessary.
        """
    
    print(f"PubMed API called with query: {query}, max_results: {max_results}")
    papers = pubmed_api.query(query, max_results)
    print(f"Retrieved {len(papers)} papers from PubMed")
    result = pubmed_api.format_results(papers)
    print("PubMed API call completed")
    
    return result
    

temp_dir = tempfile.TemporaryDirectory()
executor = LocalCommandLineCodeExecutor(
    timeout=10,  # Timeout for each code execution in seconds.
    work_dir=temp_dir.name,  # Use the temporary directory to store the code files.
)

gpt4_config = {
    "cache_seed": False,  # change the cache_seed for different trials
    "temperature": 0,
    "config_list": config_list,
    "timeout": 120,
}

Moderator = autogen.UserProxyAgent(
    name="Moderator",
    code_execution_config=False,
    human_input_mode="NEVER",
    system_message="""
                    You are the Moderator. You receive prompts from the 
    human and coordinate the discussion between the other 
    agents. After receiving input from all agents, you summarize 
    and present the final output. You have access to a PubMed search function. 
    Use it to find relevant scientific literature when needed.""",)


scientific_rational = autogen.AssistantAgent(
    name="Scientific_Rational",
    llm_config=config_list[0],
    system_message="""You provide scientific reasoning and explanations for the given 
                    prompt using articles and evidence. You reference the sources and 
                    provide a summary of the key points. Use the search_pubmed function
                    to find relevant scientific literature when needed.""",)


safety_officer = autogen.AssistantAgent(
    name="Safety_Assistant",
    system_message="""You are the safety assistant. You ensure that the chosen drug target is safe for human use. 
                    You provide a safety assessment of the drug target and suggest any necessary modifications.Use the search_pubmed function
                    to find relevant scientific literature when needed.",)
                     human_input_mode="NEVER""",
    llm_config=config_list[0],
)
target_assessment = autogen.AssistantAgent(
    name="Target_Assessment",
    llm_config=config_list[0],
    system_message="""You develop strategies to assess and evaluate the target or goal mentioned in the prompt. 
                    You Provide a detailed plan for the assessment and evaluation of the target.Use the search_pubmed function
                    to find relevant scientific literature when needed.""",
)

literature_agent = autogen.AssistantAgent(
    name="Literature_Agent",
    llm_config=config_list[0],
    system_message="""You provide relevant literature and references related to the prompt. 
                    You summarize the key points and provide a list of references.Use the search_pubmed function
                    to find relevant scientific literature when needed.""",
)

def state_transition(last_speaker, groupchat):
    print(f"checking if {last_speaker.name} used the pubmed api...")
    if hasattr(last_speaker, 'last_function_call') and last_speaker.last_function_call:
        if "use_pubmed_api" in last_speaker.last_function_call:
            print(f"{last_speaker.name} used the PubMed API")
            print(f"Query: {last_speaker.last_function_call['use_pubmed_api']['args'][0]}")
            print(f"Results: {last_speaker.last_function_call['use_pubmed_api']['return_value']}")
        else:
            print(f"{last_speaker.name} did not use the PubMed API")

    if last_speaker == Moderator:
        return scientific_rational
    elif last_speaker == scientific_rational:
        return safety_officer
    elif last_speaker == safety_officer:
        return target_assessment
    elif last_speaker == target_assessment:
        return Moderator
    else:
        return None


agent_list=[Moderator, scientific_rational, safety_officer, target_assessment]

for agent in agent_list:
    agent.register_function(
        function_map = {
            "use_pubmed_api": use_pubmed_api,
        })
    print(f"Registered use_pubmed_api function for agent: {agent.name}")
    

groupchat = autogen.GroupChat(
    agents=agent_list,
    messages=[],
    max_round=30,
    speaker_selection_method=state_transition,
    send_introductions=True,
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=config_list[0])

Registered use_pubmed_api function for agent: Moderator
Registered use_pubmed_api function for agent: Scientific_Rational
Registered use_pubmed_api function for agent: Safety_Assistant
Registered use_pubmed_api function for agent: Target_Assessment


In [31]:
chat_result = Moderator.initiate_chat(
    manager, message=
    
    """You are a AI assistant with a background in drug discovery.

Given target: intasomes
Given disease: HIV
Given mode of action: Inhibition of HIV intasomes

Context:
Intasome inhibitors are widely used for the treatment of HIV. The intasome is a complex of viral and cellular proteins that mediates the integration of the viral DNA into the host genome. Inhibiting intasomes can prevent the integration of the viral DNA, thereby blocking the replication of the virus.
 This approach has been successful in the development of antiretroviral drugs for HIV treatment.

Task 1: Develop a scientific rationale for Intasome inhibition in HIV.

Highlight the working hypothesis for the clinical target rationale and human biology evidence by minimum 2000 words.

Describe as much as possible the evidence in humans or in human tissue that link the target, target space or approach to the pathogenesis of interest.
If known, also describe here the wanted mode of action with regards to desired clinical outcome.
Please avoid including only pre-clinical data in this section.

Use the following structure and provide a detailed description for each point:
- Working hypothesis:
  - Create a detailed description of the following idea: inhibiting ACE to reduce blood pressure
  - Is there are significant unmet medical need?
  - Is it suitable for combination therapy?
  - Which predictive biomarkers exist for the target related to the disease?
    - Provide a detailed description of existing clinical relevant biomarkers.

- Clinical target rationale:
  - How relevant is the target location to the disease biology?
  - How it the target expression altered in human disease?
  - How is the target involved in the physiological process relevant to the disease?
  - Which phenotypes and genotypes were identified for the target?
  - How is the genetic link between the target and the disease?
  - Describe the evidence provided in clinics or by tools acting on the pathway where the target is involved.
  - Which kind of target modulation is required to treat the disease?

- Challenges for the drug discovery program related to the target.
  - Check the following idea for details on small molecule compounds: idea.
  - Is a 'information driven approach' (IDA) strategy based on available small molecules possible?
    - Which small molecular modulators of the target known?
    - Which inhibitors, antagonists, agonists, negative allosteric modulators (NAM), positive allosteric modulators (PAM) are required for target modulation in the given disease? 
  - Which patients would respond the therapy?
  - Is the proposed mode of action on the target desirable and commercially viable in a clinical setting?
  - What are advantages and disadvantages of different therapeutic modalities (antibodies, small molecules, antisense oligonucleotides, PROTACs, molecular glue, peptide macrocycles, and so on) for tackling the target?

- Alternative indications:
  - Describe alternative indication for modulators of the target and explain why.

Task 2: Develop a target assessment strategy for ACE in hypertension in maximal 500 words.

Outline a 1-year Target Assessment (TA) to Lead Identification (LI) plan. Describe High Level TA-LI plans.
- Make an emphasis on key inflection points that will inform the feasibility of the project. 
- Address status of in-vitro platforms, translational in vivo models (mechanistic models, not necessarily so called 'disease models')
  and describe what needs to be established. Elaborate on tractability and major challenges for advancement in a drug discovery portfolio.
- Discuss potential biomarkers and readouts for efficacy and target engagement.

Task 3: Safety assessment
- Does the target show bias towards expression in the desired organ (e.g. CNS)?
- Is it specifically expressed in the organ (e.g. brain)?
- Are there disease specific expression databases?
- Are there tissue-selective isoforms of the target?
- Are there condition-specific isoforms of the target?
- What regulates the alternative splicing that makes one isoform versus the other?
- How large is the expression of the target in the mouse model intended for in vivo tests?
- Is major phenotype reported in target knockouts and/or expression of rodent models?
- Are there published differences in expression between human and rodent models.
- What are the species differences that could be used to interpret rodent safety data on the target?
- What are the peripherial safety risks (oncogensis)?
- Can the modulation of the target promote tumor formation?
- Is there a way to assess on-target safety concerns?
- What are the safety concerns in case of exaggerated pharmacology?
- Will it disrupt cellular functions (e.g. endosomes, lysosomes, nuclear, mitochondrial) function with all its safety liability?
- How large is the risk for immunogenicity (related to biologics/antibody based approaches)?
- If the target is an enzyme, do polymorphisms in the human gene alter the protein enzyme activity?

Provide the corresponding literature references in the format (First author, Journal, Year, Volume, Issue, Pages, DOI) if any information is not there provide "N/A".

Let's work this out in a step by step way to be sure we have the right answer.

The moderator should terminate the conversation when it feels the question has been answered sufficiently.""",
"""
)

[33mModerator[0m (to chat_manager):

You are a AI assistant with a background in drug discovery.

Given target: intasomes
Given disease: HIV
Given mode of action: Inhibition of HIV intasomes

Context:
Intasome inhibitors are widely used for the treatment of HIV. The intasome is a complex of viral and cellular proteins that mediates the integration of the viral DNA into the host genome. Inhibiting intasomes can prevent the integration of the viral DNA, thereby blocking the replication of the virus.
 This approach has been successful in the development of antiretroviral drugs for HIV treatment.

Task 1: Develop a scientific rationale for Intasome inhibition in HIV.

Highlight the working hypothesis for the clinical target rationale and human biology evidence by minimum 2000 words.

Describe as much as possible the evidence in humans or in human tissue that link the target, target space or approach to the pathogenesis of interest.
If known, also describe here the wanted mode of action w

In [57]:
print(chat_result.summary)

Having heard from our team, we can outline a cohesive response:

ACE inhibitors, crucial for conditions like hypertension and heart failure, have side effects ranging from a persistent dry cough to serious issues like renal dysfunction and angioedema (Mayo Clinic, MedlinePlus). 

The Safety_Assistant stressed the need for a comprehensive patient history, possible drug interactions, dose adjustments, and the contraindication of ACE inhibitors in pregnancy. Additional safety measures include educating the patient about potential side effects and actions required in case they occur, advising against unmonitored intake of potassium supplements, coordinating with other healthcare providers involved with the patient and arranging a clear workflow for patient monitoring and follow-up. Medication safety involves healthcare providers, scientific data, and direct patient participation in full circle.

The Target_Assessment outlines a systematic strategy for evaluating ACE inhibitors safety, whic