# DNALLM MCP Client with LangChain Agents

This notebook demonstrates how to integrate DNALLM (DNA Large Language Model Toolkit) with LangChain agents using the Model Context Protocol (MCP). 

## Overview

This example shows how to:
1. **Start a DNALLM MCP server** with streamable HTTP transport
2. **Connect a LangChain client** to the MCP server
3. **Create an agent** that can use DNALLM's DNA analysis tools
4. **Perform comprehensive DNA sequence analysis** using multiple specialized models

## Key Components

- **DNALLM MCP Server**: Exposes DNA analysis tools over MCP
- **LangChain MCP Adapters**: Let LangChain discover and call tools on MCP servers
- **Ollama Integration**: Runs a local LLM (Qwen3) that drives the agent
- **Multi-Model Analysis**: Combines promoter, conservation, and open chromatin predictions

## Prerequisites

- DNALLM installed and configured
- Ollama running locally with the Qwen3 model available
- Required Python packages installed (see the first code cell)

## Workflow

1. Install dependencies
2. Start DNALLM MCP server
3. Configure async environment
4. Initialize MCP client connection
5. Create LangChain agent with MCP tools
6. Analyze DNA sequence using the agent
7. Display comprehensive results


In [None]:
# Install required dependencies for LangChain MCP integration
# This cell installs the necessary packages to use LangChain with MCP (Model Context Protocol)
# and Ollama for local LLM inference

!uv pip install -U langchain                    # Core LangChain framework
!uv pip install -U langchain-mcp-adapters       # MCP adapters for LangChain integration
!uv pip install -U langchain-ollama            # Ollama integration for local LLM inference
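
After installation, it can be useful to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of DNALLM or LangChain.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# The three distributions installed in the cell above
required = ["langchain", "langchain-mcp-adapters", "langchain-ollama"]
for name, ver in installed_versions(required).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```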



In [None]:
# Start the DNALLM MCP server
# This command starts the server in the foreground with streamable HTTP transport,
# serving at http://localhost:8000/mcp
# Note: a `!` shell command blocks this cell until the process exits, so start the
# server in a separate terminal (or as a background process) before running the client cells

!dnallm mcp-server --transport streamable-http
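
Before connecting a client, a quick reachability check can save some debugging. The sketch below only tests that something accepts TCP connections on the port mentioned above (8000); it does not verify that the listener actually speaks MCP. The helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts
        return False


print("MCP server port reachable:", is_port_open())
```

If this prints `False`, the server is probably not running yet or is listening on a different port.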

In [None]:
# Configure asyncio for Jupyter notebook compatibility
# nest_asyncio allows nested event loops, which is necessary for running async code in Jupyter
import nest_asyncio
import asyncio
import time

# Apply nest_asyncio to enable nested event loops
nest_asyncio.apply()
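
To see the problem that `nest_asyncio` works around, the following standard-library sketch reproduces the error raised when `asyncio.run()` is called while an event loop is already running, which is the situation inside a Jupyter kernel. It is meant to be run as a plain Python script, without `nest_asyncio` applied; the function names are illustrative only.

```python
import asyncio


async def inner():
    return "inner finished"


async def outer():
    # Simulates library code calling asyncio.run() from inside a running loop
    coro = inner()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"


def nested_run_result():
    """Run outer() in a fresh event loop and return what it reports."""
    return asyncio.run(outer())


if __name__ == "__main__":
    print(nested_run_result())
```

In a stock interpreter the nested call fails with a `RuntimeError`; patching the loop with `nest_asyncio.apply()` is what lets such nested calls succeed in a notebook.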

In [None]:
# Initialize MCP client to connect to DNALLM server
# This creates a connection to the DNALLM MCP server running on localhost:8000
from langchain_mcp_adapters.client import MultiServerMCPClient  
from langchain.agents import create_agent

# Create MCP client with DNALLM server configuration
# The server should be running on port 8000 with streamable HTTP transport
client = MultiServerMCPClient(  
    {
        "dnallm": {
            "transport": "streamable_http",  # HTTP-based remote server transport
            "url": "http://localhost:8000/mcp",  # DNALLM MCP server endpoint
        }
    }
)
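
If the server runs on a different host or port, the connection mapping can be generated rather than edited by hand. This is a minimal sketch that only builds the dictionary in the same shape as the cell above; `build_mcp_config` and its parameters are hypothetical names, and the result would still need to be passed to `MultiServerMCPClient`.

```python
def build_mcp_config(name="dnallm", host="localhost", port=8000, path="/mcp"):
    """Build a one-server connection mapping in the shape used above."""
    if not path.startswith("/"):
        path = "/" + path
    return {
        name: {
            "transport": "streamable_http",
            "url": f"http://{host}:{port}{path}",
        }
    }


config = build_mcp_config()
print(config)
```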

In [None]:
# Create LangChain agent with MCP tools
# This retrieves available tools from the DNALLM MCP server and creates a LangChain agent
# that can use these tools for DNA sequence analysis

# Get available tools from the MCP server
tools = await client.get_tools()  

# Create a LangChain agent using Ollama's Qwen3 model with MCP tools
# The agent can now use DNALLM's DNA analysis capabilities through the MCP tools
agent = create_agent(
    "ollama:qwen3:latest",  # Local LLM model via Ollama
    tools                   # MCP tools from DNALLM server
)
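
It is often worth inspecting which tools the server exposed before handing them to the agent. The sketch below assumes each tool object has `name` and `description` attributes, as LangChain tool objects do. The helper `summarize_tools` is hypothetical, and the demo uses invented stand-in objects so it runs without a server; in the notebook, `summarize_tools(tools)` would be called on the real list.

```python
from types import SimpleNamespace


def summarize_tools(tools, width=60):
    """Return one 'name: description' line per tool, truncating long descriptions."""
    lines = []
    for tool in tools:
        # Collapse whitespace so multi-line descriptions fit on one line
        description = " ".join((tool.description or "").split())
        if len(description) > width:
            description = description[: width - 3] + "..."
        lines.append(f"{tool.name}: {description}")
    return lines


# Stand-in objects with invented names; not actual DNALLM tool names
demo_tools = [
    SimpleNamespace(name="example_tool_a", description="Illustrative placeholder tool."),
    SimpleNamespace(name="example_tool_b", description=None),
]
for line in summarize_tools(demo_tools):
    print(line)
```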

In [None]:
# Perform DNA sequence analysis using the LangChain agent
# This demonstrates how to use the agent to analyze a DNA sequence using DNALLM's models
# The agent will automatically select and use the appropriate MCP tools for analysis

# Define the DNA sequence to analyze
dna_sequence = """AGAAAAAACATGACAAGAAATCGATAATAATACAAAAGCTATGATGGTGTGCAATGTCCGTGTGCATGCGTGCACGCATTGCAACCGGCCCAAATCAAGGCCCATCGATCAGTGAATACTCATGGGCCGGCGGCCCACCACCGCTTCATCTCCTCCTCCGACGACGGGAGCACCCCCGCCGCATCGCCACCGACGAGGAGGAGGCCATTGCCGGCGGCGCCCCCGGTGAGCCGCTGCACCACGTCCCTGA"""

# Invoke the agent to analyze the DNA sequence
# The agent will use DNALLM's specialized models to provide comprehensive analysis
question = (
    "What is the function of the following DNA sequence? "
    "Please analyze it thoroughly using all available models:\n"
    f"{dna_sequence}"
)
dnallm_response = await agent.ainvoke(
    {"messages": [{"role": "user", "content": question}]}
)
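
Because the sequence is embedded directly in the prompt, a quick sanity check before invoking the agent can catch stray whitespace or non-nucleotide characters. The following is a minimal sketch; `summarize_sequence` is a hypothetical helper, and the accepted alphabet (`ACGTN`) is an assumption rather than a DNALLM requirement.

```python
def summarize_sequence(sequence, alphabet="ACGTN"):
    """Normalize a DNA string and report its length and GC fraction."""
    # Remove all whitespace (including newlines) and uppercase the bases
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("sequence is empty")
    invalid = sorted(set(cleaned) - set(alphabet))
    if invalid:
        raise ValueError(f"unexpected characters: {''.join(invalid)}")
    gc = sum(cleaned.count(base) for base in "GC")
    return {
        "sequence": cleaned,
        "length": len(cleaned),
        "gc_fraction": gc / len(cleaned),
    }


summary = summarize_sequence("acgt\nGGCC")
print(summary["length"], summary["gc_fraction"])  # 8 0.75
```

In the notebook, the same helper could be applied to `dna_sequence` and the cleaned string used in the prompt.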

In [None]:
# Display the analysis results
# This prints the comprehensive DNA sequence analysis provided by the LangChain agent
# The analysis includes insights from multiple DNALLM models (promoter, conservation, chromatin)
# The results show detailed functional interpretation of the DNA sequence

print(dnallm_response['messages'][-1].content)

The provided DNA sequence has been analyzed using three specialized models, revealing key functional insights:

1. **Promoter Analysis**:
   - **Core Promoter**: The sequence is confidently identified as a core promoter region (score: 94.0%). Core promoters are critical for initiating transcription by RNA polymerase.

2. **Conservation Analysis**:
   - **Conserved Region**: The sequence shows strong conservation across species (score: 92.0%), indicating evolutionary importance. This suggests it likely serves a regulatory function preserved through evolution.

3. **Open Chromatin Analysis**:
   - **Full Open Chromatin**: The sequence is classified as "Full open" (score: 94.6%), indicating it resides in an actively accessible chromatin state. Open chromatin is typically associated with enhancers, promoters, or regulatory elements.

**Functional Interpretation**:
This sequence represents a **highly conserved core promoter region** in an **actively transcribed genomic locus**. The combinat