BioPLEASE: A General-Purpose Biomedical AI Agent

Overview

BioPLEASE is a general-purpose biomedical AI agent designed to autonomously execute a wide range of research tasks across diverse biomedical subfields. By integrating cutting-edge large language model (LLM) reasoning with retrieval-augmented planning and code-based execution, BioPLEASE helps scientists dramatically enhance research productivity and generate testable hypotheses.

Quick Start

Installation

The software environment is large, so we provide a single setup.sh script to set it up. Follow this file to set up the environment first.

Then activate the environment E1:

conda activate bioplease_e1

Then install the official bioplease pip package:

pip install bioplease --upgrade

For the latest updates, install directly from the GitHub source:

pip install git+https://github.com/snap-stanford/BioPLEASE.git@main

Lastly, configure your API keys using one of the following methods:

Option 1: Using .env file (Recommended)

Create a .env file in your project directory:

# Copy the example file
cp .env.example .env

# Edit the .env file with your actual API keys

Your .env file should look like:

# Required: Anthropic API Key for Claude models
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Optional: OpenAI API Key (if using OpenAI models)
OPENAI_API_KEY=your_openai_api_key_here

# Optional: Azure OpenAI API Key (if using Azure OpenAI models)
OPENAI_API_KEY=your_azure_openai_api_key
OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/

# Optional: AI Studio Gemini API Key (if using Gemini models)
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: Groq API Key (if using Groq as the model provider)
GROQ_API_KEY=your_groq_api_key_here

# Optional: Set the source of your LLM, for example:
# "OpenAI", "AzureOpenAI", "Anthropic", "Ollama", "Gemini", "Bedrock", "Groq", "Custom"
LLM_SOURCE=your_LLM_source_here

# Optional: AWS Bedrock Configuration (if using AWS Bedrock models)
AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key_here
AWS_REGION=us-east-1

# Optional: Custom model serving configuration
# CUSTOM_MODEL_BASE_URL=http://localhost:8000/v1
# CUSTOM_MODEL_API_KEY=your_custom_api_key_here

# Optional: BioPLEASE data path (defaults to ./data)
# BIOPLEASE_DATA_PATH=/path/to/your/data

# Optional: Timeout settings (defaults to 600 seconds)
# BIOPLEASE_TIMEOUT_SECONDS=600
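
Note that the sample above assigns OPENAI_API_KEY twice (once for OpenAI, once for Azure OpenAI). A loader that reads the file top to bottom into a dictionary would keep only the last value. The following is a minimal sketch, not part of BioPLEASE, of a standard-library check that parses .env-style text and reports repeated keys. The function name parse_env_text is hypothetical, and the parsing rules (skip comments and blank lines, split on the first "=") are a simplifying assumption rather than the exact behavior of any particular dotenv loader.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines; return (values, duplicate_keys).

    Simplified rules (an assumption, not a full dotenv grammar):
    blank lines and lines starting with '#' are ignored, and each
    remaining line is split on its first '='.
    """
    values = {}
    duplicates = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in values:
            duplicates.append(key)  # a later assignment overrides the earlier one
        values[key] = value.strip().strip('"').strip("'")
    return values, duplicates


sample = """
# OpenAI
OPENAI_API_KEY=your_openai_api_key_here
# Azure OpenAI
OPENAI_API_KEY=your_azure_openai_api_key
OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
"""

values, duplicates = parse_env_text(sample)
print("Duplicate keys:", duplicates)
print("Effective OPENAI_API_KEY:", values["OPENAI_API_KEY"])
```

If a key is reported as a duplicate, keep only the assignment for the provider you actually use.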

Option 2: Using shell environment variables

Alternatively, configure your API keys in your bash profile (~/.bashrc):

export ANTHROPIC_API_KEY="YOUR_API_KEY"
export OPENAI_API_KEY="YOUR_API_KEY" # optional if you just use Claude
export OPENAI_ENDPOINT="https://your-resource-name.openai.azure.com/" # optional unless you are using Azure
export AWS_BEARER_TOKEN_BEDROCK="YOUR_BEDROCK_API_KEY" # optional for AWS Bedrock models
export AWS_REGION="us-east-1" # optional, defaults to us-east-1 for Bedrock
export GEMINI_API_KEY="YOUR_GEMINI_API_KEY" # optional, only needed for Gemini models
export GROQ_API_KEY="YOUR_GROQ_API_KEY" # optional, only needed for models served by Groq
export LLM_SOURCE="Groq" # optional, selects Groq as the LLM source
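
Before starting the agent, it can help to confirm that the variables for your chosen LLM_SOURCE are actually set. The sketch below is not part of BioPLEASE. The mapping from source name to variable names is an assumption based on the variables listed in this README, and the name missing_variables is hypothetical. Adjust the table if your setup differs (for example, a local custom server that needs no API key).

```python
import os

# Assumed mapping, derived from the variables shown in this README.
REQUIRED_BY_SOURCE = {
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "OpenAI": ["OPENAI_API_KEY"],
    "AzureOpenAI": ["OPENAI_API_KEY", "OPENAI_ENDPOINT"],
    "Gemini": ["GEMINI_API_KEY"],
    "Groq": ["GROQ_API_KEY"],
    "Bedrock": ["AWS_BEARER_TOKEN_BEDROCK"],
    "Custom": ["CUSTOM_MODEL_BASE_URL", "CUSTOM_MODEL_API_KEY"],
    "Ollama": [],
}


def missing_variables(source, environ=None):
    """Return the variables that are unset or empty for the given LLM source."""
    environ = os.environ if environ is None else environ
    if source not in REQUIRED_BY_SOURCE:
        raise ValueError(f"Unknown LLM source: {source!r}")
    return [name for name in REQUIRED_BY_SOURCE[source] if not environ.get(name)]


if __name__ == "__main__":
    source = os.environ.get("LLM_SOURCE", "Anthropic")
    missing = missing_variables(source)
    if missing:
        print(f"Missing for {source}: {', '.join(missing)}")
    else:
        print(f"All variables for {source} are set.")
```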

⚠️ Known Package Conflicts

Some Python packages are not installed by default in the BioPLEASE environment due to dependency conflicts. If you need these features, you must install the packages manually and may need to uncomment relevant code in the codebase. See the up-to-date list and details in docs/known_conflicts.md.

Basic Usage

Once inside the environment, you can start using BioPLEASE:

from bioplease.agent import A1

# Initialize the agent with a data path. The data lake (~11GB) is downloaded automatically on first run.
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Execute biomedical tasks using natural language
agent.go("Plan a CRISPR screen to identify genes that regulate T cell exhaustion, generate 32 genes that maximize the perturbation effect.")
agent.go("Perform scRNA-seq annotation at [PATH] and generate meaningful hypothesis")
agent.go("Predict ADMET properties for this compound: CC(C)CC1=CC=C(C=C1)C(C)C(=O)O")

If you plan on using Azure for your model, always prefix the model name with azure- (e.g. llm='azure-gpt-4o').
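
The prefix rule above can be applied with a small helper so that model names are not prefixed twice. This is an illustrative sketch only. The function azure_model_name is hypothetical and not part of the bioplease package.

```python
def azure_model_name(model: str) -> str:
    """Return the model name with the 'azure-' prefix, adding it only if absent."""
    prefix = "azure-"
    return model if model.startswith(prefix) else prefix + model


print(azure_model_name("gpt-4o"))
print(azure_model_name("azure-gpt-4o"))
```

The result can then be passed as the llm argument, for example llm=azure_model_name("gpt-4o").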

MCP (Model Context Protocol) Support

BioPLEASE supports MCP servers for external tool integration:

from bioplease.agent import A1

agent = A1()
agent.add_mcp(config_path="./mcp_config.yaml")
agent.go("Find FDA active ingredient information for ibuprofen")

Built-in MCP Servers: For usage and implementation details, see the MCP Integration Documentation and examples in tutorials/examples/add_mcp_server/ and tutorials/examples/expose_bioplease_server/.

🤝 Contributing to BioPLEASE

BioPLEASE is an open-science initiative that thrives on community contributions. We welcome:

🔧 New Tools: Specialized analysis functions and algorithms
📊 Datasets: Curated biomedical data and knowledge bases
💻 Software: Integration of existing biomedical software packages
📋 Benchmarks: Evaluation datasets and performance metrics
📚 Misc: Tutorials, examples, and use cases
🔧 Updates to existing tools: Many current tools are not optimized, so fixes and replacements are welcome!

Check out this Contributing Guide on how to contribute to the BioPLEASE ecosystem.

If you have a particular tool, database, or software package in mind that you want added, you can also submit it through this form and the BioPLEASE team will implement it.

🔬 Call for Contributors: Help Build BioPLEASE-E2

BioPLEASE-E1 only scratches the surface of what’s possible in the biomedical action space.

Now, we’re building BioPLEASE-E2 — a next-generation environment developed with and for the community.

We believe that by collaboratively defining and curating a shared library of standard biomedical actions, we can accelerate science for everyone.

Join us in shaping the future of biomedical AI agents.

Contributors with significant impact (e.g., 10+ significant & integrated tool contributions or equivalent) will be invited as co-authors on our upcoming paper in a top-tier journal or conference.
All contributors will be acknowledged in our publications.
More contributor perks...

Let’s build it together.

Tutorials and Examples

BioPLEASE 101 - Basic concepts and first steps

Memory System Demo - Efficient memory management for long conversations

More to come!

📝 Documentation

Core Documentation

Enhanced Memory System - Complete guide to the two-tier memory system that reduces token usage by 75-90%
Memory Quick Reference - Quick command reference for memory management
Memory Migration Guide - How to upgrade existing code (zero changes required!)
Memory Architecture - Visual diagrams and technical details

Key Features: Enhanced Memory System

BioPLEASE now includes an intelligent memory management system that dramatically reduces token usage:

Two-tier memory: Recent messages kept verbatim + compressed summaries of older messages
Token savings: 75-90% reduction in long conversations
Automatic: No code changes required - works out of the box
Configurable: Fine-tune for your specific use case
Persistent: Save/load memory across sessions

from bioplease.agent.a1 import A1

# Memory is automatic!
agent = A1(path="./data", llm="gpt-4o-mini")
result = agent.go("Your task here")

# Optional: Configure for your needs
agent.configure_memory(
    short_window=4,        # Keep fewer messages
    compression_ratio=0.2  # More aggressive compression
)

# Monitor usage
# Note: `state` is not defined in this snippet; pass the agent state object from your own run
stats = agent.get_memory_stats(state)
print(f"Using ~{stats['estimated_tokens']} tokens")

See MEMORY_SYSTEM_SUMMARY.md for complete implementation details.
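
To make the two-tier idea concrete, here is a toy sketch of the concept: the most recent messages are kept verbatim, and older ones are moved into a compressed tier. It is not the BioPLEASE implementation. The class TwoTierMemory is hypothetical, truncation stands in for whatever summarization the real system uses, and the four-characters-per-token estimate is a rough heuristic assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TwoTierMemory:
    short_window: int = 4           # number of recent messages kept verbatim
    compression_ratio: float = 0.2  # fraction of each older message that is kept
    recent: list = field(default_factory=list)
    summary: list = field(default_factory=list)

    def add(self, message: str) -> None:
        """Append a message, moving overflow from the recent tier to the summary tier."""
        self.recent.append(message)
        while len(self.recent) > self.short_window:
            oldest = self.recent.pop(0)
            keep = max(1, int(len(oldest) * self.compression_ratio))
            # Truncation is a stand-in for real summarization.
            self.summary.append(oldest[:keep])

    def context(self) -> list:
        """Compressed older messages followed by verbatim recent ones."""
        return self.summary + self.recent

    def estimated_tokens(self) -> int:
        """Rough estimate, assuming about four characters per token."""
        return sum(len(m) for m in self.context()) // 4


memory = TwoTierMemory(short_window=2, compression_ratio=0.5)
for text in ["aaaaaaaa", "bbbb", "cccc"]:
    memory.add(text)
print(memory.context())
print(memory.estimated_tokens())
```

A smaller short_window or compression_ratio trades context fidelity for fewer tokens, which is the same trade-off the configure_memory parameters above expose.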

Release schedule

8 Real-world research task benchmark/leaderboard release
A tutorial on how to contribute to BioPLEASE
A tutorial on baseline agents
MCP support
BioPLEASE A1+E1 release

Important Note

Security warning: BioPLEASE currently executes LLM-generated code with full system privileges. For production use, run it in an isolated or sandboxed environment. The agent can access files, the network, and system commands, so be careful with sensitive data and credentials.
This release was frozen as of April 15, 2025, so it differs from the current web platform.
BioPLEASE itself is Apache 2.0-licensed, but certain integrated tools, databases, or software may carry more restrictive commercial licenses. Review each component carefully before any commercial use.
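
As one small precaution related to the security warning, untrusted code can be run in a child process with a timeout and with credential-like variables removed from its environment. The sketch below uses only the Python standard library and is not part of BioPLEASE. The names scrubbed_env and run_isolated, and the list of name markers treated as secrets, are assumptions for illustration. This is not a sandbox: the child process can still read files and use the network, so a container or virtual machine remains the safer choice.

```python
import os
import subprocess
import sys

# Assumed markers for variable names that likely hold credentials.
SECRET_MARKERS = ("API_KEY", "TOKEN", "SECRET", "PASSWORD")


def scrubbed_env(environ):
    """Copy an environment mapping, dropping variables whose names look like secrets."""
    return {
        key: value
        for key, value in environ.items()
        if not any(marker in key.upper() for marker in SECRET_MARKERS)
    }


def run_isolated(code, timeout_seconds=600, environ=None):
    """Run Python code in a child process without credential-like variables.

    Raises subprocess.TimeoutExpired if the child exceeds the timeout.
    """
    env = scrubbed_env(os.environ if environ is None else environ)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )


if __name__ == "__main__":
    result = run_isolated("import os; print(os.environ.get('ANTHROPIC_API_KEY'))")
    print(result.stdout.strip())
```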

Cite Us

@article{huang2025bioplease,
  title={BioPLEASE: A General-Purpose Biomedical AI Agent},
  author={Huang, Kexin and Zhang, Serena and Wang, Hanchen and Qu, Yuanhao and Lu, Yingzhou and Roohani, Yusuf and Li, Ryan and Qiu, Lin and Zhang, Junze and Di, Yin and others},
  journal={bioRxiv},
  pages={2025--05},
  year={2025},
  publisher={Cold Spring Harbor Laboratory}
}
