# MCP Server Guide for Biomni

Model Context Protocol (MCP) servers expose external APIs to Biomni as tools the agent can call. This guide shows how to use the built-in servers and how to create your own.

In [None]:
import os

from biomni.agent import A1

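# Create a personal access token at https://github.com/settings/personal-access-tokens,
# then replace the "XXX" placeholder (setdefault keeps a token already set in the environment).
os.environ.setdefault("GITHUB_TOKEN", "XXX")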

agent = A1("/dfs/project/bioagentos")
agent.add_mcp(config_path="./mcp_config.yaml")

Checking and downloading missing data lake files...
Discovered 80 tools from github MCP server
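
Because the cell above falls back to the placeholder "XXX" when GITHUB_TOKEN is unset, a small guard can catch a missing token before the agent starts calling the GitHub server. The following is a minimal sketch and not part of Biomni: `require_token` and `PLACEHOLDER` are hypothetical names introduced here for illustration only.

```python
import os

# Hypothetical helper, not part of Biomni: fail fast when the token is
# missing or still set to the "XXX" placeholder used in this guide.
PLACEHOLDER = "XXX"


def require_token(name="GITHUB_TOKEN", env=None):
    """Return the token stored under `name`, or raise if it is unusable."""
    # Allow a plain dict to be passed in so the check can be tried in isolation.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value or value == PLACEHOLDER:
        raise RuntimeError(
            f"{name} is not set to a real token; export it before calling add_mcp()."
        )
    return value
```

Calling `require_token()` just before `agent.add_mcp(...)` would raise immediately on a missing or placeholder token, rather than leaving the problem to surface later as a failed GitHub call in the middle of an analysis.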


In [3]:
result = agent.go(
    "Perform a complete end-to-end analysis on /dfs/project/bioagentos/data/singlecell/skin_10k.h5ad and push the entire analysis and report to the mcp_gene repository"
)

Using prompt-based retrieval with the agent's LLM

Perform a complete end-to-end analysis on /dfs/project/bioagentos/data/singlecell/skin_10k.h5ad and push the entire analysis and report to the mcp_gene repository

I'll perform a complete end-to-end single-cell RNA-seq analysis on the skin dataset and push the results to the mcp_gene repository. Let me start by creating a detailed plan.

## Plan for Complete scRNA-seq Analysis

1. [ ] Load and explore the dataset structure
2. [ ] Perform quality control analysis
3. [ ] Data preprocessing (filtering, normalization)
4. [ ] Dimensionality reduction and clustering
5. [ ] Cell type annotation using multiple methods
6. [ ] Differential expression analysis
7. [ ] Gene set enrichment analysis
8. [ ] Generate comprehensive visualizations
9. [ ] Create analysis report
10. [ ] Push all results to mcp_gene repository

Let me start with the analysis:

<execute>
import scanpy as sc
import pandas as pd
import numpy as np
import matplotlib.pyplot as p

21:45:50 - INFO - Performing Enrichr analysis using database GO_Biological_Process_2021. 
        Please note that there might be a more appropriate database for your application. 
        Go to https://maayanlab.cloud/Enrichr/#libraries for a full list of supported databases.
        



<execute>
print("Step 8: Gene Set Enrichment Analysis")
print("=" * 50)

from biomni.tool.genomics import gene_set_enrichment_analysis

# Perform GSEA for top marker genes of major cell types
enrichment_results = {}

if 'cell_type' in adata.obs.columns:
    major_cell_types = ['fibroblast', 'cardiac muscle cell', 'myeloid cell']
    
    for cell_type in major_cell_types:
        print(f"\nPerforming GSEA for {cell_type}...")
        
        # Get top 50 marker genes for this cell type
        if cell_type in adata.uns['rank_genes_groups']['names'].dtype.names:
            marker_genes = list(adata.uns['rank_genes_groups']['names'][cell_type][:50])
            
            try:
                # Perform enrichment analysis
                result = gene_set_enrichment_analysis(
                    genes=marker_genes,
                    top_k=10,
                    database='ontology',
                    plot=False
                )
                enrichment_results[cell_type] = re

21:45:51 - INFO - Performing Enrichr analysis using database GO_Biological_Process_2021. 
        Please note that there might be a more appropriate database for your application. 
        Go to https://maayanlab.cloud/Enrichr/#libraries for a full list of supported databases.
        



<observation>The output is too long to be added to context. Here are the first 10K characters...
Step 8: Gene Set Enrichment Analysis

Performing GSEA for fibroblast...
GSEA completed for fibroblast
Starting enrichment analysis for genes: DCN, COL6A2, C7_ENSG00000112936, ABCA8, MGP, COL6A3, COL1A2, COL6A1, FBLN1, PTN, FBN1, SERPINF1, BICC1, VCAN, PCOLCE, SRPX, MFAP4, MMP2, COL5A2, LUM, FBLN2, COL3A1, LTBP4, COL1A1, ABCA6, CCDC80, NEGR1, NFIA, PDGFRA, GSN, LAMA2, LRP1, TIMP1, IGFBP7, PLAC9, SCN7A, NFIB, CFD, CDH19, COL5A1, SPARCL1, ABCA9, FN1, EMILIN1, MIR99AHG, MXRA8, MGST1, SERPINE2, COL15A1, MEIS2 using ontology database and top_k: 10
Performing enrichment analysis using gget.enrichr with the ontology database...
Filtering the top 10 enrichment results...
Rank: 1
Path Name: extracellular matrix organization (GO:0030198)
P-value: 1.04e-20
Z-score: 39.231383
Combined Score: 1805.220094
Overlapping Genes: COL15A1, GSN, LAMA2, LUM, MMP2, FN1, DCN, COL1A1, VCAN, COL3A1, COL1A2, COL5A1, C