# A1 Agent Testing Notebook
This notebook demonstrates how to use the A1 agent to query biomedical and drug databases, with Aspirin as the worked example.

In [2]:
# Import required libraries
from biomni.agent import A1


In [3]:
# Initialize the A1 agent
agent = A1(
    path='./data',
    llm='gpt-4.1',
    source='OpenAI',
    load_datalake=False
)

Skipping datalake download (load_datalake=False)
Note: Some tools may require datalake files to function properly.


## Test: Query Clinical Trials and Drug Databases

In [9]:
# Query drug and clinical trial databases for Aspirin
agent.go("""
For Aspirin (acetylsalicylic acid):
1. Using PubChem:
   - Get chemical structure and properties
   - List all known synonyms
   - Find compound classification
   
2. Using DrugCentral:
   - Get primary mechanisms of action
   - List approved indications
   - Get ATC classification
   - Extract pharmacological class information
   
3. Using ChEMBL:
   - Find binding targets and their affinities
   - List documented biological activities
   - Get drug-like properties
   
4. Using ClinicalTrials.gov:
   - Search for trials using Aspirin
   - List relevant interventions
   - Get safety profiles
   
5. Using openFDA:
   - Review adverse event reports
""")

Using prompt-based retrieval with the agent's LLM


parsing error...

Each response must include thinking process followed by either <execute> or <solution> tag. But there are no tags in the current response. Please follow the instruction, fix and regenerate the response again.
parsing error...

 'Thinking:\nThe openFDA query returned adverse event data for Aspirin, highlighting the most common reported reactions and a high proportion of serious reports. This completes all possible steps from the original plan, with the exception of DrugCentral and ChEMBL (which could not be accessed for this compound).\n\nHere is a summary of all findings, organized by database:\n\n<solution>\nAspirin (acetylsalicylic acid): Multi-Database Summary\n\n1. PubChem\n   - Chemical structure and properties:\n     - Molecular Formula: C9H8O4\n     - Molecular Weight: 180.16\n     - SMILES: CC(=O)OC1=CC=CC=C1C(=O)O\n     - InChI: InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)\n   - Synonyms: Over 100 names, including "acetylsalicylic acid", "2-Acetoxybenzoic acid", "Aspirin", and numerous trade names.\n   - Compound classification: Classified as a "Salicylate", "Nonsteroidal Anti-inflammatory Drug (NSAID)", "Benzene and substituted derivatives", and "Carboxylic acid derivatives"
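The parsing errors above come from responses that lack an `<execute>` or `<solution>` tag, and the final answer arrives wrapped in `<solution>`. A minimal sketch of checking for those tags and pulling out the solution text, assuming the response is available as a plain string. `has_required_tag` and `extract_solution` are hypothetical helpers written for this notebook, not part of biomni.

```python
import re
from typing import Optional

# Tag names taken from the parsing-error message shown above.
REQUIRED_TAGS = ("execute", "solution")


def has_required_tag(response: str) -> bool:
    """Return True if the response opens an <execute> or <solution> block."""
    return any(f"<{tag}>" in response for tag in REQUIRED_TAGS)


def extract_solution(response: str) -> Optional[str]:
    """Return the text after <solution>, up to </solution> when it is present.

    A truncated response with no closing tag still yields whatever text
    follows the opening tag. Returns None if there is no <solution> tag.
    """
    match = re.search(r"<solution>(.*?)(?:</solution>|\Z)", response, flags=re.DOTALL)
    return match.group(1).strip() if match else None
```

Because the closing tag is optional in the pattern, the helper also works on outputs that were cut off, like the one above.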

## Comprehensive Drug Analysis: Aspirin Case Study
We'll perform a detailed analysis of Aspirin using multiple databases to understand its properties, interactions, and clinical applications.
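The prompts in this section share one layout: a subject line, then numbered databases, each with a bulleted task list. A minimal sketch of generating that layout from a dictionary, so the task lists can be edited in one place. `build_prompt` is a hypothetical helper, not part of biomni.

```python
def build_prompt(subject: str, tasks: dict[str, list[str]]) -> str:
    """Render a numbered, per-database task list in this notebook's prompt layout."""
    lines = [f"For {subject}:"]
    for number, (database, items) in enumerate(tasks.items(), start=1):
        lines.append(f"{number}. Using {database}:")
        lines.extend(f"   - {item}" for item in items)
        lines.append("")  # blank line between database sections
    return "\n".join(lines).rstrip() + "\n"


prompt = build_prompt(
    "Aspirin (acetylsalicylic acid)",
    {
        "PubChem": ["Get chemical structure and properties", "List all known synonyms"],
        "ChEMBL": ["Find binding targets and their affinities"],
    },
)
print(prompt)
```

The resulting string can be passed to `agent.go(prompt)` in the same way as the hand-written prompts.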

In [None]:
# 1. Chemical and Basic Drug Information
agent.go("""
For Aspirin (acetylsalicylic acid):
1. Using PubChem:
   - Get chemical structure and properties
   - List all known synonyms
   - Find compound classification
   
2. Using DrugCentral:
   - Get primary mechanisms of action
   - List approved indications
   - Get ATC classification
   - Extract pharmacological class information
   
3. Using ChEMBL:
   - Find binding targets and their affinities
   - List documented biological activities
   - Get drug-like properties
""")

 'My thinking:\nThe ChEMBL query returned detailed information for Aspirin, including:\n- Drug-like properties (Lipinski analysis, logP, H-bond donors/acceptors, etc.)\n- Synonyms and identifiers\n- ATC classifications\n- Indication class\n- ChEMBL ID\n\nHowever, the output is lengthy and detailed binding target/affinity and biological activity information may need to be explicitly extracted. I will summarize the main findings and indicate if a follow-up query is needed for specific targets and affinities.\n\nUpdated checklist:\n1. [✓] PubChem: structure, properties, synonyms, classification\n2. [✗] DrugCentral: (not directly accessible via current API)\n3. [✓] ChEMBL: drug-like properties, synonyms, ATC codes, indication class (targets/affinity may need further expansion)\n\nLet’s summarize the information gathered so far and clarify if you want a deeper listing of specific targets/affinities from ChEMBL.\n\n<solution>\nSummary for Aspirin (acetylsalicylic acid):\n\n1. PubChem:\n   - 
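The agent reports progress as a checklist with `[✓]`, `[✗]`, and `[ ]` markers. A small sketch of turning those lines into structured records, which makes it easier to see which databases failed. It assumes the response text has real newlines (not the escaped `\n` shown in the repr above); `parse_checklist` is a hypothetical helper.

```python
import re

# Matches lines such as "2. [✗] DrugCentral: (not directly accessible via current API)".
CHECKLIST_LINE = re.compile(
    r"^[ \t]*(\d+)\.[ \t]*\[(✓|✗| )\][ \t]*(.+)$", flags=re.MULTILINE
)
STATUS = {"✓": "done", "✗": "failed", " ": "pending"}


def parse_checklist(response: str) -> list[tuple[int, str, str]]:
    """Return (step number, status, description) for each checklist line."""
    return [
        (int(number), STATUS[mark], text.strip())
        for number, mark, text in CHECKLIST_LINE.findall(response)
    ]


# Lines copied from the agent output above.
sample = (
    "Updated checklist:\n"
    "1. [✓] PubChem: structure, properties, synonyms, classification\n"
    "2. [✗] DrugCentral: (not directly accessible via current API)\n"
)
for step in parse_checklist(sample):
    print(step)
```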

In [None]:
# 2. Clinical Trials Analysis
agent.go("""
Analyze Aspirin clinical trials:
1. Using ClinicalTrials.gov:
   - Find all cancer-related trials (active and completed)
   - Group trials by cancer types
   - Extract dosage information
   - Identify combination therapy trials
   - List primary and secondary outcomes
   
2. Cross-reference with DrugCentral:
   - Compare approved vs. investigational uses
   - Identify potential repurposing opportunities
""")

In [5]:
# 3. Safety and Pharmacovigilance
result = agent.go("""
Comprehensive safety analysis of Aspirin:
1. Using openFDA:
   - Extract adverse event reports
   - Calculate reporting odds ratios
   - Identify black box warnings
   - List drug-drug interactions
   - Get patient demographic patterns

2. Using DailyMed:
   - Get current labeling information
   - Extract contraindications
   - List special populations warnings
   - Compare different formulation safety profiles
""")
print(result)

Using prompt-based retrieval with the agent's LLM


Comprehensive safety analysis of Aspirin:
1. Using openFDA:
   - Extract adverse event reports
   - Calculate reporting odds ratios
   - List drug-drug interactions
   - Get patient demographic patterns

2. Using DailyMed:
   - Get current labeling information
   - Extract contraindications
   - Compare different formulation safety profiles

parsing error...

Each response must include thinking process followed by either <execute> or <solution> tag. But there are no tags in the current response. Please follow the instruction, fix and regenerate the response again.

Thank you for pointing out the formatting requirement. Here’s my revised approach:

Thinking process:  
To begin a comprehensive safety analysis of Aspirin, I need to first extract adverse event reports from openFDA. This will provide an overview of the types and frequencies of adverse events reported in association with Aspirin, which is foundational for the subsequent ste
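The prompt above asks for reporting odds ratios. For reference, the ROR is computed from a 2x2 table of report counts as (a/b) / (c/d), with a 95% confidence interval taken on the log scale. The sketch below uses the standard textbook definition and made-up counts; it is not the agent's implementation, and `reporting_odds_ratio` is a hypothetical helper.

```python
import math


def reporting_odds_ratio(a: int, b: int, c: int, d: int) -> tuple[float, float, float]:
    """Return (ROR, lower 95% bound, upper 95% bound) for a 2x2 report table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper


# Made-up counts for illustration only; not taken from openFDA.
ror, lower, upper = reporting_odds_ratio(a=10, b=90, c=20, d=880)
print(f"ROR = {ror:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A lower bound above 1 is the usual signal threshold, though spontaneous-report data cannot establish causation on its own.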

In [3]:
# 4. Drug Interactions and Cross-References
agent.go("""
Map Aspirin across databases:
1. Using UniChem:
   - Get all database identifiers
   - Cross-reference with other systems

2. Using DrugCentral and ChEMBL:
   - List all known drug interactions
   - Categorize by severity
   - Identify mechanism-based interactions
   - Find structural analogs
""")

Using prompt-based retrieval with the agent's LLM


parsing error...

Each response must include thinking process followed by either <execute> or <solution> tag. But there are no tags in the current response. Please follow the instruction, fix and regenerate the response again.

Thank you for your feedback. Here is my revised response including the required tags:

To address the mapping of Aspirin across databases, I will follow this checklist:

1. [ ] Query UniChem for Aspirin to get all database identifiers and cross-references.
2. [ ] Query DrugCentral for Aspirin to retrieve all known drug interactions.
3. [ ] Query ChEMBL for Aspirin to obtain more drug interaction data and structural analog

 'ChEMBL returned a list of structural analogs for Aspirin (CHEMBL25), including its derivatives and close analogs with at least 70% structural similarity. These analogs include compounds such as Aspirin DL-lysine (CHEMBL1697753) and others.\n\nHere’s the updated checklist:\n1. [✗] Query UniChem for Aspirin to get all database identifiers and cross-references (failed).\n2. [✗] Query DrugCentral for Aspirin to retrieve all known drug interactions (requires direct database/API access).\n3. [✓] Query ChEMBL for Aspirin to obtain mechanism of action data (no direct interactions, but mechanism and analogs available).\n4. [✗] Combine and categorize drug interactions by severity (not possible without interaction data).\n5. [ ] Identify mechanism-based interactions (partial: mechanism known, interactions not directly listed).\n6. [✓] List structural analogs found in ChEMBL.\n\nSummary of findings:\n- UniChem and DrugCentral could not return data due to API or access limitations.\n- ChEMBL prov
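The "at least 70% structural similarity" cut-off mentioned in the output is typically a Tanimoto coefficient computed over molecular fingerprints. Whether the agent's ChEMBL query used exactly this measure is an assumption here. A toy sketch of the calculation, with fingerprints represented as sets of on-bit indices:

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Toy fingerprints for illustration; not derived from real molecules.
query = {1, 4, 7, 9, 12, 15, 20}
candidate = {1, 4, 7, 9, 12, 15, 33}

score = tanimoto(query, candidate)
print(f"similarity = {score:.2f}, passes 70% cut-off: {score >= 0.70}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.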

In [None]:
# 5. Molecular and Pathway Analysis
agent.go("""
Analyze molecular aspects:
1. Using QuickGO:
   - Get GO terms for Aspirin targets
   - Analyze biological processes affected

2. Using OLS:
   - Map to relevant pathways
   - Find disease associations
   - Identify molecular functions

3. Combine with ChEMBL data:
   - Analyze target protein families
   - Map to signaling pathways
""")