# Notice

To run this notebook, install the project's dependencies (including the development packages) with pipenv:

```
pipenv install --dev
```

In [1]:
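import sys
import os

# Add the repository root (the parent of the current working directory) to the import path
current_dir = os.getcwd()
REPO_BASE_DIR = os.path.dirname(os.path.abspath(current_dir))
sys.path.append(REPO_BASE_DIR)

# Load API credentials from the repository's .env file
from dotenv import load_dotenv
load_dotenv(os.path.join(REPO_BASE_DIR, ".env"))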

True
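
Before constructing the agent, it can help to confirm that the Azure credentials were actually picked up from `.env`. The following is a minimal sketch, assuming the two variable names read in the next cell; `missing_env_vars` is a hypothetical helper and not part of biodsa.

```python
import os

# Variable names taken from the agent configuration in the next cell.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Azure OpenAI credentials found.")
```

Passing an explicit `environ` mapping keeps the helper easy to check without touching the real environment.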

In [2]:
from biodsa.agents import DeepEvidenceAgent
agent = DeepEvidenceAgent(
    model_name="gpt-5",
    api_type="azure",
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    model_kwargs={
        "max_completion_tokens": 5000,
        "reasoning_effort": "minimal",
    },
    subagent_action_rounds_budget=5,  # action-round budget for each research sub-agent
    main_search_rounds_budget=2,  # search-round budget for the main orchestrator agent
    main_action_rounds_budget=15,  # action-round budget for the main orchestrator agent
    light_mode=False,  # set to True to run a light-mode agent that does not use the memory graph
    llm_timeout=120,
)

# run the agent
execution_results = agent.go(
    "Summarizing the cutting-edge immunotherapy drugs in late clinical trial phase or have been approved for NSCLC?",
    knowledge_bases=["pubmed_papers", "clinical_trials", "drug", "disease"], # select the knowledge bases to use
)

2025-12-23 14:41:31,976 - INFO - Installing biodsa.tools module in sandbox...
2025-12-23 14:41:32,236 - INFO - Uploaded biodsa.tools module to sandbox
2025-12-23 14:41:32,302 - INFO - Successfully extracted biodsa.tools module
2025-12-23 14:41:32,395 - INFO - Created .pth file at /usr/local/lib/python3.12/site-packages/biodsa_tools.pth
2025-12-23 14:41:32,395 - INFO - biodsa.tools module installed in sandbox at /workdir/biodsa
2025-12-23 14:41:32,396 - INFO - You can now use 'from biodsa.tools import xxx' in your sandbox code
2025-12-23 14:41:32,396 - INFO - Sandbox initialized successfully and biodsa.tools installed


👤 Human
Summarizing the cutting-edge immunotherapy drugs in late clinical trial phase or have been approved for NSCLC?



2025-12-23 14:42:05,927 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 0/2
Current action round (orchestrator calls): 1/15
🤖 AI Assistant
1. [ ] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search.
2. [ ] Run a breadth-first search across drug and clinical trial knowledge bases to list immunotherapy agents approved or in late-phase (Phase III) for NSCLC, including key checkpoint classes and leading candidates.
3. [ ] Run a depth-first search to pull Phase III trial names, approvals, and line-of-therapy specifics (e.g., PD-L1 thresholds, histology/stage, combinations).
4. [ ] Add the curated disease and drug entities plus key observations (approval status, trial names, indications) into the evidence graph.
5. [ ] Retrieve the evidence graph to review the compiled findings and ensure coverage.
6. [ ] Provide a concise, up-to-date summary of cutting-edge immunotherapies approved or late-phase for NSCLC, grouped by mechanism and key indications.

Proceeding w

2025-12-23 14:42:38,103 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 1/2
Current action round (orchestrator calls): 2/15
🤖 AI Assistant
1. [✓] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search. (completed)
2. [ ] Run a breadth-first search across drug and clinical trial knowledge bases to list immunotherapy agents approved or in late-phase (Phase III) for NSCLC, including key checkpoint classes and leading candidates.
3. [ ] Run a depth-first search to pull Phase III trial names, approvals, and line-of-therapy specifics (e.g., PD-L1 thresholds, histology/stage, combinations).
4. [ ] Add the curated disease and drug entities plus key observations (approval status, trial names, indications) into the evidence graph.
5. [ ] Retrieve the evidence graph to review the compiled findings and ensure coverage.
6. [ ] Provide a concise, up-to-date summary of cutting-edge immunotherapies approved or late-phase for NSCLC, grouped by mechanism and key indications.

2025-12-23 14:43:18,077 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the breadth-first search agent: 1/3
🤖 AI Assistant

📞 Tool Calls (1):
  [1] search_clinical_trials
      task_name: NSCLC IO P3
      max_pages: 1
      conditions: ['non-small cell lung cancer', 'NSCLC']
      terms: ['KEYNOTE', 'CheckMate', 'IMpower', 'EMPOWER- ... PER', 'JAVELIN', 'RELATIVITY', 'ARC', 'STAR']
      interventions: ['pembrolizumab', 'nivolumab', 'cemiplimab',  ...  'tiragolumab', 'domvanalimab', 'relatlimab']
      recruiting_status: ANY
      study_type: INTERVENTIONAL
      phase: PHASE3
      page_size: 200
      sort: RELEVANCE

🔧 Tool Response (search_clinical_trials)
### Executed Code:
```python

import pandas as pd
from biodsa.tools.clinical_trials.trials import search_trials

# Search for clinical trials
next_page_token = None
all_df = []
for page in range(1, 1 + 1):
    df, md_str, to

2025-12-23 14:44:30,529 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the breadth-first search agent: 2/3
🤖 AI Assistant

📞 Tool Calls (1):
  [1] code_execution
      code: import pandas as pd
rows = [
    {
        'D ... 'EMA_NSCLC_Approval','Key_PhaseIII_Trials']])

🔧 Tool Response (code_execution)
### Executed Code:
```python
import pandas as pd
rows = [
    {
        'Drug': 'pembrolizumab',
        'Target': 'PD-1',
        'FDA_NSCLC_Approval': 'Approved',
        'EMA_NSCLC_Approval': 'Approved',
        'Indications': 'Metastatic 1L monotherapy (PD-L1 TPS ≥50% KEYNOTE-024; expanded to ≥1% KEYNOTE-042); 1L + chemo (non-squamous: pemetrexed+platinum KEYNOTE-189; squamous: carboplatin+taxane KEYNOTE-407); Adjuvant post-resection IB(≥4 cm)-IIIA (KEYNOTE-091); Perioperative (neoadjuvant+adjuvant) resectable stage II–IIIB (N2) (KEYNOTE-671). No EGFR/ALK.',
        'Histology_PD-L1_Notes': 'Mono requires PD-L1; combos all histologies; adjuvant/perio

2025-12-23 14:44:45,088 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the breadth-first search agent: 3/3
🤖 AI Assistant
# Files saved:  
- /workdir/NSCLC_IO_P3.csv: Phase III NSCLC immunotherapy trial hits for specified checkpoint inhibitors and programs (KEYNOTE, CheckMate, IMpower, EMPOWER-Lung, PACIFIC, POSEIDON, etc.).
- /workdir/NSCLC_IO_Approvals_Summary.csv: Consolidated approvals/indications for NSCLC IO agents with PD-L1 thresholds, histology, combos, and key Phase III trial IDs.

Main findings:  
Approved NSCLC IOs include PD-1 (pembrolizumab, nivolumab, cemiplimab), PD-L1 (atezolizumab, durvalumab) and CTLA-4 combos (ipilimumab, tremelimumab); avelumab not approved, TIGIT/LAG-3 programs in Phase III. PD-L1 cutoffs and histology/combination contexts captured per agent; trial identifiers mapped for major programs.

🔧 Tool Response (go_breadth_first_search)
# Files saved:  
- /workdir/NSCLC_IO_P3.csv: Phase III NSCLC immunotherapy trial hits for specified checkpoint inhibito

2025-12-23 14:45:40,242 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 3/15
🤖 AI Assistant
1. [✓] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search. (completed)
2. [✓] Run a breadth-first search across drug and clinical trial knowledge bases to list immunotherapy agents approved or in late-phase (Phase III) for NSCLC, including key checkpoint classes and leading candidates. (completed)
3. [ ] Run a depth-first search to pull Phase III trial names, approvals, and line-of-therapy specifics (e.g., PD-L1 thresholds, histology/stage, combinations).
4. [ ] Load and consolidate BFS/DFS outputs from /workdir to curate key approvals, indications, and trial identifiers.
5. [ ] Add the curated disease and drug entities plus key observations (approval status, trial names, indications) into the evidence graph.
6. [ ] Retrieve the evidence graph to review the compiled findings and ensure coverage.
7. [ ] Provide a co

2025-12-23 14:46:21,592 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the depth-first search agent: 1/3
🤖 AI Assistant

📞 Tool Calls (1):
  [1] find_entities
      query_text: non-small cell lung cancer
      concept_type: DISEASE
      limit: 10

🔧 Tool Response (find_entities)
### Executed Code:
```python

from biodsa.tools.pubmed.pubtator_api import pubtator_api_find_entities
import pandas as pd

# Find entities
results = pubtator_api_find_entities(
    query_text='non-small cell lung cancer',
    concept_type='DISEASE',
    limit=10,
    max_retries=3,
    max_requests_per_second=3.0
)

# Output results
if len(results) == 0:
    print("No entities found. Please try again with different query.")
else:
    if isinstance(results, pd.DataFrame):
        print(results.to_markdown())
    else:
        print("No entities found. Please try again with different query.")

```

### Output:
```
Successfully retrieved 1 entity suggestions
|    | PubT

2025-12-23 14:47:33,386 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the depth-first search agent: 2/3
🤖 AI Assistant

📞 Tool Calls (1):
  [1] code_execution
      code: import pandas as pd
from datetime import date ... print('Saved files:', trial_path, tigit_path)

🔧 Tool Response (code_execution)
### Executed Code:
```python
import pandas as pd
from datetime import datetime
trials = [
    {'drug':'Pembrolizumab','trial':'KEYNOTE-024','phase':'III','approval_FDA':'Approved','approval_EMA':'Approved','line_of_therapy':'1L','setting':'Metastatic','PD-L1_thresholds':'≥50%','histology_constraints':'All (exclude EGFR/ALK)','key_outcomes':'OS/PFS significant'},
    {'drug':'Pembrolizumab','trial':'KEYNOTE-042','phase':'III','approval_FDA':'Approved','approval_EMA':'Approved','line_of_therapy':'1L','setting':'Metastatic','PD-L1_thresholds':'≥1% (subgroups ≥20%, ≥50%)','histology_constraints':'All (exclude EGFR/ALK)','key_outcomes':'OS significant; PFS mixe

2025-12-23 14:47:56,126 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current round of the depth-first search agent: 3/3
🤖 AI Assistant
# Files saved:  
- /workdir/nsclc_phase3_immunotherapy_map.csv: Phase III NSCLC immunotherapy trials mapped with approvals, line, setting, PD-L1 thresholds, histology, and key OS/PFS outcomes.  
- /workdir/nsclc_tigit_lag3_phase3_status.csv: TIGIT and LAG-3 Phase III NSCLC program status summarized (trial names, settings, PD-L1, approvals).  

Main findings:  
Phase III trials and regulatory approvals for pembrolizumab, nivolumab±ipilimumab, atezolizumab, durvalumab±tremelimumab, and cemiplimab are mapped across metastatic, adjuvant, neoadjuvant, perioperative, and unresectable stage III settings with PD-L1/histology details. TIGIT (tiragolumab, domvanalimab) and LAG-3 (relatlimab combos) have Phase III activity but no NSCLC approvals to date.

🔧 Tool Response (go_depth_first_search)
# Files saved:  
- /workdir/nsclc_phase3_immunotherapy_map.csv: Phase III NSCLC im

2025-12-23 14:48:29,627 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 4/15
🤖 AI Assistant
1. [✓] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search. (completed)
2. [✓] Run a breadth-first search across drug and clinical trial knowledge bases to list immunotherapy agents approved or in late-phase (Phase III) for NSCLC, including key checkpoint classes and leading candidates. (completed)
3. [✓] Run a depth-first search to pull Phase III trial names, approvals, and line-of-therapy specifics (e.g., PD-L1 thresholds, histology/stage, combinations). (completed)
4. [ ] Load and consolidate BFS/DFS outputs from /workdir to curate key approvals, indications, and trial identifiers.

Proceeding with Step 4.

📞 Tool Calls (1):
  [1] code_execution
      code: import os
import pandas as pd

files = [
     ... d per drug summary:')
print(summary.head(50))

🔧 Tool Re

2025-12-23 14:49:53,373 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 5/15
🤖 AI Assistant

📞 Tool Calls (1):
  [1] add_to_graph
      entities: [{'name': 'non-small cell lung carcinoma', 'e ... KG:clinical_trials@2025-12; drug@2025-12.']}]

🔧 Tool Response (add_to_graph)
{"success": true, "results": {"entities_created": {"count": 10, "entities": [{"name": "non-small cell lung carcinoma", "entityType": "DISEASE", "observations": ["Grounded to MONDO:0005233; DOID:3908; KEGG ds:H00014.", "Provenance: DERIVED_FROM_KG:biothings@2025-12; opentargets@2025-12."]}, {"name": "pembrolizumab", "entityType": "DRUG", "observations": ["FDA/EMA approved 1L monotherapy for metastatic NSCLC with PD-L1 TPS \u226550%, no EGFR/ALK; KEYNOTE-024; OS/PFS benefit.", "Approved 1L with platinum chemo (non-squamous: KEYNOTE-189; squamous: KEYNOTE-407); OS/PFS benefit regardless of PD-L1.", "Adjuvant approval after resec

2025-12-23 14:50:23,512 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 6/15
🤖 AI Assistant
1. [✓] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search. (completed)
2. [✓] Run a breadth-first search across drug and clinical trial knowledge bases to list immunotherapy agents approved or in late-phase (Phase III) for NSCLC, including key checkpoint classes and leading candidates. (completed)
3. [✓] Run a depth-first search to pull Phase III trial names, approvals, and line-of-therapy specifics (e.g., PD-L1 thresholds, histology/stage, combinations). (completed)
4. [✓] Load and consolidate BFS/DFS outputs from /workdir to curate key approvals, indications, and trial identifiers. (completed)
5. [✓] Add the curated disease and drug entities plus key observations (approval status, trial names, indications) into the evidence graph. (completed)
6. [ ] Retrieve the evidence graph to review the compiled finding

2025-12-23 14:50:48,792 - INFO - HTTP Request: POST https://zifeng-gpt-2.openai.azure.com/openai/deployments/gpt-5/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 7/15
🤖 AI Assistant
Thinking and approach:
- I grounded NSCLC to standard disease identifiers to avoid ambiguity.
- I ran breadth- and depth-focused searches across drug and clinical trial knowledge bases to collect approvals and Phase III programs for NSCLC immunotherapies.
- I loaded and curated the outputs, then added core findings to an internal evidence graph and reviewed it to ensure coverage across settings (metastatic, unresectable stage III, adjuvant/neoadjuvant/perioperative).
- Below is a concise, mechanism-organized summary of cutting-edge immunotherapies either approved or in late Phase III for NSCLC, including PD-L1 thresholds, histology, key trials, and line-of-therapy.

Updated plan:
1. [✓] Standardize the disease term "NSCLC" and retrieve ontology-backed details to ground the search. (completed)
2. [âœ“] Run a breadth-first search across drug and clinical trial k

In [3]:
execution_results

ExecutionResults(messages=14, executions=0, has_sandbox=True)
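
The progress lines printed during the run follow a `Current <label>: <current>/<total>` pattern, which makes it straightforward to check how much of each budget was consumed. Below is a minimal sketch for extracting those counters from captured output; `parse_round_counters` is a hypothetical helper written for this notebook, not a biodsa function, and the sample text is copied from the log above.

```python
import re

# Matches lines such as "Current action round (orchestrator calls): 7/15".
ROUND_LINE = re.compile(r"^Current (?P<label>.+?): (?P<current>\d+)/(?P<total>\d+)$")


def parse_round_counters(text):
    """Return (label, current, total) tuples for every progress line in text."""
    counters = []
    for line in text.splitlines():
        match = ROUND_LINE.match(line.strip())
        if match:
            counters.append(
                (match["label"], int(match["current"]), int(match["total"]))
            )
    return counters


sample = """\
Current search round (BFS/DFS calls): 2/2
Current action round (orchestrator calls): 7/15
Current round of the depth-first search agent: 3/3
"""

for label, current, total in parse_round_counters(sample):
    print(f"{label}: {current} of {total}")
```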

In [4]:
# Generate PDF report following the structured format:
# 1. User query
# 2. Agent exploration trajectories (messages only, no code)
# 3. Final response with embedded artifacts
# 4. Supplementary materials with code blocks and execution results
execution_results.to_pdf(output_dir="test_artifacts")
agent.clear_workspace()
print("Done!")

2025-12-23 14:50:48,913 - INFO - Downloaded NSCLC_IO_Approvals_Summary.csv to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:48,922 - INFO - Downloaded NSCLC_IO_P3.csv to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:48,926 - INFO - Downloaded NSCLC_grounding.json to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:49,042 - INFO - Downloaded biodsa to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:49,050 - INFO - Downloaded biodsa_tools.tar.gz to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:49,053 - INFO - Downloaded nsclc_phase3_immunotherapy_map.csv to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_artifacts_4x68ndq2
2025-12-23 14:50:49,058 - INFO - Downloaded nsclc_tigit_lag3_phase3_status.csv to /var/folders/rb/nj5lt0x53pj4nt6j_b459_p80000gn/T/biodsa_

f3170d78f9c0
f3170d78f9c0
Done!
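
The downloaded CSV artifacts can be inspected with the standard library. The sketch below assumes the file uses the column names visible in the depth-first search code above (`drug`, `trial`, `line_of_therapy`, `PD-L1_thresholds`, and so on); the actual file layout is not shown in the log, so the inline rows stand in for a real file, and `trials_by_drug` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Stand-in for nsclc_phase3_immunotherapy_map.csv (assumed layout).
# The two rows reuse values that are fully visible in the log above.
SAMPLE_CSV = """\
drug,trial,phase,line_of_therapy,setting,PD-L1_thresholds
Pembrolizumab,KEYNOTE-024,III,1L,Metastatic,≥50%
Pembrolizumab,KEYNOTE-042,III,1L,Metastatic,"≥1% (subgroups ≥20%, ≥50%)"
"""


def trials_by_drug(csv_file):
    """Group trial names by drug from an open CSV file object."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["drug"]].append(row["trial"])
    return dict(grouped)


print(trials_by_drug(io.StringIO(SAMPLE_CSV)))
```

To read a real artifact, pass `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` wrapper.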
