# RAG Analysis & MCP Server Prototype

This notebook serves three purposes:
1. Validating the existing RAG pipeline (`scripts/ai/rag`) against the `ebook_library` Qdrant collection.
2. Prototyping an MCP Server to expose this RAG functionality to Antigravity agents.
3. Testing the running MCP Server (Client Mode) using LangChain's MultiServerMCPClient.

In [None]:
# Install FastMCP and LangChain MCP Adapters
# %pip install "fastmcp>=3.0.0" langchain-mcp-adapters

In [3]:
import sys
import os
import logging

# Ensure we can import from project root
project_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
if project_root not in sys.path:
    sys.path.append(project_root)

from qdrant_client import QdrantClient
from langchain_qdrant import QdrantVectorStore
from scripts.ai.rag.rag_optimized import get_rag, OptimizedRAG

# Configure basic logging
logging.basicConfig(level=logging.INFO)

## 1. Direct Qdrant Inspection
First, let's verify we can connect to the Docker instance and that the collection exists.

In [4]:
client = QdrantClient(url="http://localhost:6333")
collections = client.get_collections()

print("Available Collections:")
for c in collections.collections:
    print(f"- {c.name}")

count = client.count(collection_name="ebook_library")
print(f"\nDocument Count in 'ebook_library': {count.count}")

INFO:httpx:HTTP Request: GET http://localhost:6333 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:6333/collections "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:6333/collections/ebook_library/points/count "HTTP/1.1 200 OK"


Available Collections:
- ebook_library

Document Count in 'ebook_library': 28295


## 2. Test Existing RAG Pipeline
We will use the `get_rag()` factory from `scripts/ai/rag/rag_optimized.py`. This handles the embedding model, parent-child retrieval, and reranking logic automatically.
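
To make the parent-child idea concrete, here is a minimal, self-contained sketch. It is not the code in `rag_optimized.py`: the toy data, the word-overlap score (a stand-in for vector similarity), and the names `PARENTS`, `CHILDREN`, `overlap_score`, and `retrieve_parents` are all assumptions made for illustration. The point is only the shape of the technique: small child chunks are ranked, and the larger parent documents they belong to are returned, de-duplicated.

```python
from collections import OrderedDict

# Toy data: each child chunk points at the parent document it was cut from.
PARENTS = {
    "p1": "Agents are suited to workflows that need complex decision-making.",
    "p2": "Tools extend an agent's capabilities through external APIs.",
}
CHILDREN = [
    {"text": "agents complex decision-making", "parent_id": "p1"},
    {"text": "workflows that resist automation", "parent_id": "p1"},
    {"text": "tools call external APIs", "parent_id": "p2"},
]


def overlap_score(query: str, text: str) -> int:
    """Stand-in for vector similarity: count shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_parents(query: str, top_k: int = 2) -> list[str]:
    """Rank child chunks, then return their de-duplicated parent texts."""
    ranked = sorted(
        CHILDREN, key=lambda c: overlap_score(query, c["text"]), reverse=True
    )
    seen: "OrderedDict[str, str]" = OrderedDict()
    for child in ranked:
        if overlap_score(query, child["text"]) == 0:
            continue  # ignore chunks that share no words with the query
        seen.setdefault(child["parent_id"], PARENTS[child["parent_id"]])
        if len(seen) == top_k:
            break
    return list(seen.values())


print(retrieve_parents("which tools call APIs"))
```

Two children of the same parent collapse into a single result, which is why the number of hydrated parent documents can be much smaller than the number of indexed points.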

In [5]:
# Initialize RAG (warmup=True will load models)
rag = get_rag(warmup=True)

query = "What are the key principles of agentic workflows?"
print(f"\nQuerying: '{query}'...\n")

results = rag.query(query)

for i, doc in enumerate(results, 1):
    source = doc.metadata.get("source", "Unknown")
    print(f"Result {i} [{source}]:\n{doc.page_content[:200]}...\n")

INFO:scripts.ai.rag.rag_optimized:Connecting to Qdrant Docker (http://localhost:6333)...
INFO:httpx:HTTP Request: GET http://127.0.0.1:6333 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://127.0.0.1:6333/collections "HTTP/1.1 200 OK"
INFO:scripts.ai.rag.rag_optimized:Connected to Qdrant Docker Service.
INFO:httpx:HTTP Request: GET http://127.0.0.1:6333/collections "HTTP/1.1 200 OK"
INFO:scripts.ai.rag.rag_optimized:Loading FastEmbed model: sentence-transformers/all-MiniLM-L6-v2
INFO:httpx:HTTP Request: GET http://127.0.0.1:6333/collections/ebook_library "HTTP/1.1 200 OK"
INFO:scripts.ai.rag.rag_optimized:Initializing In-Memory document store...
INFO:scripts.ai.rag.rag_optimized:Loading from fast JSON cache: d:\Users\wpoga\Documents\Python Scripts\antigravity-agent-factory\data\rag\parent_store_cache.json
INFO:scripts.ai.rag.rag_optimized:Hydrated 5839 documents from cache.
INFO:httpx:HTTP Request: POST http://127.0.0.1:6333/collections/ebook_library/points/count "HTTP/1.1 200 OK"



Querying: 'What are the key principles of agentic workflows?'...

Result 1 [D:\Users\wpoga\Documents\Ebooks\Artificial Intelligence\a-practical-guide-to-building-agents.pdf]:
When should you 
build an agent?
Building agents requires rethinking how your systems make decisions and handle complexity. 
Unlike conventional automation, agents are uniquely suited to workflows whe...

Result 2 [D:\Users\wpoga\Documents\Ebooks\Artificial Intelligence\a-practical-guide-to-building-agents.pdf]:
24
25
26
27
28
29
30
32
32
33
)

 
 main():
    msg = input(
)

    orchestrator_output = await Runner.run(
        manager_agent,msg)

    
 message 
 orchestrator_output.new_messages:
        
(f"  ...

Result 3 [D:\Users\wpoga\Documents\Ebooks\Artificial Intelligence\a-practical-guide-to-building-agents.pdf]:
Defining tools
Tools extend your agent’s capabilities by using APIs from underlying applications or systems. For 
legacy systems without APIs, agents can rely on computer-use models to interact di

## 3. MCP Server Prototype with FastMCP

We use the `fastmcp` library to define the server tools. Its decorator-based API keeps each tool definition compact: a typed function plus a docstring.
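
For readers unfamiliar with decorator-based registration, the sketch below shows the general pattern in plain Python. `ToolRegistry` and `echo` are hypothetical names invented here, and this is not how FastMCP is implemented internally; it only illustrates how a decorator can record a function under its name while handing the function back unchanged, so it stays directly callable.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in showing how a decorator can register tools."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., str]] = {}

    def tool(self) -> Callable[[Callable[..., str]], Callable[..., str]]:
        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            # Record the function under its own name ...
            self.tools[func.__name__] = func
            # ... and return it unchanged, so it is still directly callable.
            return func

        return decorator


registry = ToolRegistry("demo")


@registry.tool()
def echo(text: str) -> str:
    """Return the input unchanged."""
    return text


print(sorted(registry.tools))
print(echo("hello"))
```

In this pattern the docstring and type hints travel with the registered function, which is what lets a server describe each tool to a client.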

In [4]:
from fastmcp import FastMCP

# Create the MCP Server instance
mcp = FastMCP("RAG Agent Server")


@mcp.tool()
def query_rag(query: str) -> str:
    """Semantically search the ebook library for technical concepts."""
    rag = get_rag(warmup=False)
    docs = rag.query(query)
    if not docs:
        return "No relevant information found."

    return "\n\n".join(
        [
            f"Source: {d.metadata.get('source', 'Unknown')}\nContent: {d.page_content}"
            for d in docs
        ]
    )


@mcp.tool()
def ingest_document(path: str) -> str:
    """Ingest a PDF document into the RAG library."""
    rag = get_rag(warmup=False)
    try:
        rag.ingest_ebook(path)
        return f"Successfully ingested {path}"
    except Exception as e:
        return f"Error ingesting document: {str(e)}"


# Verify Tools are Registered
# Note: mcp.list_tools() is typically handled by the server loop.
# However, the decorators register them internally.
print("Registered tools:", query_rag, ingest_document)

# We can call the decorated functions directly to test logic
print("\nTesting 'query_rag' function directly:")
print(query_rag("explain RAG retrieval")[:500] + "...")

# To run the server (blocking), one would normally do:
# mcp.run()
# For this notebook, we just demonstrate the definition.

INFO:httpx:HTTP Request: POST http://127.0.0.1:6333/collections/ebook_library/points/query "HTTP/1.1 200 OK"


Registered tools: <function query_rag at 0x000002A0F529C900> <function ingest_document at 0x000002A0FB16E840>

Testing 'query_rag' function directly:
Source: D:\Users\wpoga\Documents\Ebooks\Artificial Intelligence\AI Fluency_ Key Terminology Cheat Sheet-OCR.pdf
Content: A type of error when AI confidently states something that 
sounds plausible, but is actually incorrect. 
Knowledge cutoff date 
The point after which an AI model has no built-in knowledge 
of the world, based on when it was trained. 
Reasoning or thinking models 
Types of AI models specifically designed to think step-by-
step through complex problems, showing improved 
capabil...


## 4. Test Existing RAG MCP Server (Client Mode)

**Prerequisite:** Make sure the RAG MCP Server is running (e.g., via `start_rag_server.bat`).
This section uses LangChain's `MultiServerMCPClient` to connect to the RAG server.

In [6]:
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

SERVER_URL = "http://127.0.0.1:8000/sse"


async def test_langchain_mcp_client():
    print(
        f"Connecting to MCP Server at {SERVER_URL} via LangChain MultiServerMCPClient..."
    )

    try:
        # Initialize client with server configuration
        client = MultiServerMCPClient(
            connections={"rag_server": {"url": SERVER_URL, "transport": "sse"}}
        )

        print("Client Initialized. Fetching tools...")

        # 1. List Available Tools
        # get_tools() retrieves tools from all connected servers
        tools = await client.get_tools()
        print(f"\nAvailable Tools: {[tool.name for tool in tools]}")

        # 2. Invoke 'search_library' tool
        # NOTE: LangChain wraps tools as Runnable objects
        # We can find the tool by name and invoke it
        search_tool = next((t for t in tools if t.name == "search_library"), None)

        if search_tool:
            query = "Inhaltsübersicht Künstliche Intelligenz Russell?"
            print(f"\nInvoking 'search_library' with query: '{query}'")

            # Invoke the tool
            result = await search_tool.ainvoke({"query": query})
            # The result may be a list of content blocks, so stringify before truncating
            print(f"\n--- Response ---\n{str(result)[:500]}...\n----------------")
        else:
            print("Warning: 'search_library' tool not found on server.")

        # 3. Invoke 'list_library_sources' tool
        list_tool = next((t for t in tools if t.name == "list_library_sources"), None)

        if list_tool:
            print("\nInvoking 'list_library_sources'...")
            sources_result = await list_tool.ainvoke({})
            print(f"\n--- Sources ---\n{sources_result}\n---------------")
        else:
            print("Warning: 'list_library_sources' tool not found on server.")

    except Exception as e:
        print(f"LangChain MCP Client Error: {e}")


# Run the async test
await test_langchain_mcp_client()

INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=cb3f9f01e8094df987f32bfba54bed18 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=cb3f9f01e8094df987f32bfba54bed18 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=cb3f9f01e8094df987f32bfba54bed18 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=72e328e65a544c8f90f1f7ef08465412 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=72e328e65a544c8f90f1f7ef08465412 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=72e328e65a544c8f90f1f7ef08465412 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=7

Connecting to MCP Server at http://127.0.0.1:8000/sse via LangChain MultiServerMCPClient...
Client Initialized. Fetching tools...

Available Tools: ['search_library', 'get_ebook_toc', 'ingest_document', 'list_library_sources']

Invoking 'search_library' with query: 'Inhaltsübersicht Künstliche Intelligenz Russell?'

--- Response ---
[{'type': 'text', 'text': '### Result 1 (Source: Stuart Russell_Künstliche_Intelligenz-_Ein_moderner_Ansatz_(3.,_aktualisierte_Auflage).pdf)\n\nKünstliche Intelligenz\n\n---\n\n### Result 2 (Source: Stuart Russell_Künstliche_Intelligenz-_Ein_moderner_Ansatz_(3.,_aktualisierte_Auflage).pdf)\n\nKünstliche Intelligenz\n\n---\n\n### Result 3 (Source: Stuart Russell_Künstliche_Intelligenz-_Ein_moderner_Ansatz_(3.,_aktualisierte_Auflage).pdf)\n\n39\n1.3  Die Geschichte der künstlichen Intelligenz \nwurde eine Besprechung dieses Buches genauso bekannt wie das eigentliche Buch und \nführte dazu, das Interesse am Behaviorismus so gut wie verschwinden zu lassen. Der 

INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=5f3e3ae2d05e454fb5b58a79aaa4a558 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=5f3e3ae2d05e454fb5b58a79aaa4a558 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=5f3e3ae2d05e454fb5b58a79aaa4a558 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=5f3e3ae2d05e454fb5b58a79aaa4a558 "HTTP/1.1 202 Accepted"



--- Sources ---
[{'type': 'text', 'text': 'Indexed Documents:\n- 2026 Agentic Coding Trends Report.pdf\n- AI Fluency_ Key Terminology Cheat Sheet-OCR.pdf\n- Anthropic-enterprise-ebook-digital.pdf\n- Stuart Russell_Künstliche_Intelligenz-_Ein_moderner_Ansatz_(3.,_aktualisierte_Auflage).pdf\n- The-Complete-Guide-to-Building-Skill-for-Claude.pdf\n- WEF_AI_Agents_in_Action_Foundations_for_Evaluation_and_Governance_2025.pdf\n- a-practical-guide-to-building-agents.pdf\n- claudes-constitution_webPDF_26-01.26a.pdf\n- practices-for-governing-agentic-ai-systems.pdf\n- woodridge_intelligent_agents.pdf', 'id': 'lc_467153b4-aeaa-4272-b81c-6bb5947cbe27'}]
---------------
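
The tool results above arrive as a list of content blocks (`[{'type': 'text', 'text': ...}]`) rather than as a plain string. A small helper can flatten them before printing or truncating. This is a sketch based only on the shape visible in the output above; `content_to_text` is a name introduced here, and other block types or result shapes may exist that it does not handle.

```python
from typing import Any


def content_to_text(result: Any) -> str:
    """Flatten a tool result into plain text.

    Accepts either a plain string or a list of content blocks shaped like
    {"type": "text", "text": "..."}; non-text blocks are skipped.
    """
    if isinstance(result, str):
        return result
    if isinstance(result, list):
        parts = [
            block.get("text", "")
            for block in result
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        return "\n".join(parts)
    # Fallback for shapes this sketch does not know about.
    return str(result)


sample = [{"type": "text", "text": "Indexed Documents:\n- a.pdf\n- b.pdf"}]
print(content_to_text(sample))
```

With this in place, `content_to_text(result)[:500]` truncates the readable text instead of slicing the list of blocks.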


In [7]:
import os
import getpass
from langchain_google_genai import ChatGoogleGenerativeAI


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("GEMINI_API_KEY")
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash-lite",
    google_api_key=os.environ["GEMINI_API_KEY"],
    temperature=0,
    convert_system_message_to_human=True,
)
_set_env("LANGSMITH_API_KEY")
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "ai-dev-agent"

## 6. Official LangGraph Agentic RAG implementation

This cell restructures the run as a standard LangGraph `StateGraph`. A strict `SystemMessage` discourages runaway tool-calling loops (the five-search limit is stated in the prompt, not enforced by the graph), and the final answer is extracted from the message content so that it renders as Markdown in Jupyter.
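
Because a prompt-level limit depends on the model following instructions, a hard cap in code is a useful complement. The sketch below shows one way a routing function could enforce a tool-call budget. It is plain Python with a hypothetical `Msg` class and `route` function, not LangGraph's `tools_condition`; wiring something like it into the graph as the conditional edge would be an adaptation left to the reader. Separately, LangGraph's run config accepts a `recursion_limit`, which caps the number of graph steps and can serve as a second safety net.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Msg:
    """Hypothetical minimal message: a role plus any requested tool calls."""

    role: str
    tool_calls: List[dict] = field(default_factory=list)


def route(messages: List[Msg], max_tool_calls: int = 5) -> str:
    """Return "tools" while the budget allows it, otherwise "end"."""
    last = messages[-1]
    if not last.tool_calls:
        return "end"  # the model produced a final answer
    # Count tool calls already requested by earlier AI messages.
    used = sum(len(m.tool_calls) for m in messages[:-1] if m.role == "ai")
    if used + len(last.tool_calls) > max_tool_calls:
        return "end"  # budget exhausted: stop instead of calling tools
    return "tools"


history = [Msg("human"), Msg("ai", [{"name": "search_library"}])]
print(route(history))
```

A real implementation would probably also append a message telling the model that the budget is exhausted, so it can still synthesize an answer from what it has retrieved.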

In [8]:
from typing import Annotated, Any, Dict, Sequence
from typing_extensions import TypedDict

from IPython.display import display, Markdown
from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
from langchain_mcp_adapters.client import MultiServerMCPClient

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition

# Note: `llm` is the ChatGoogleGenerativeAI instance created in the previous cell.

SERVER_URL = "http://127.0.0.1:8000/sse"


# --- Schema Patching for Gemini ---
def _remove_additional_properties(schema: Dict[str, Any]) -> None:
    if not isinstance(schema, dict):
        return
    if "additionalProperties" in schema:
        del schema["additionalProperties"]
    for key, value in list(schema.items()):
        if isinstance(value, dict):
            _remove_additional_properties(value)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    _remove_additional_properties(item)


def patch_tool_schema(tool):
    if hasattr(tool, "args_schema") and tool.args_schema:
        if isinstance(tool.args_schema, dict):
            _remove_additional_properties(tool.args_schema)
        else:
            original_schema_method = tool.args_schema.schema

            def custom_schema(*args, **kwargs):
                s = original_schema_method(*args, **kwargs)
                _remove_additional_properties(s)
                return s

            tool.args_schema.schema = custom_schema
    return tool


# --- LangGraph State Definition ---
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]


# --- Execute Graph ---
async def run_langgraph_rag(query: str):
    print("Connecting to RAG MCP Server...")
    client = MultiServerMCPClient(
        connections={"rag_server": {"url": SERVER_URL, "transport": "sse"}}
    )
    raw_tools = await client.get_tools()
    tools = [patch_tool_schema(t) for t in raw_tools]
    print(f"Retrieved {len(tools)} tools: {[t.name for t in tools]}\n")

    llm_with_tools = llm.bind_tools(tools)

    # We define the LLM decision node
    def agent_node(state: AgentState):
        # Add a strict system message to prevent runaway tool loops
        sys_msg = SystemMessage(
            content="""
            You are an expert AI assistant performing Agentic RAG.
            CRITICAL RULES:
            1. If you have retrieved sufficient information to answer the user's question, DO NOT CALL ANY MORE TOOLS. Synthesize the final answer immediately.
            2. ONLY query the library a maximum of 5 times. If you cannot find the answer after 5 searches, state that you cannot find it.
        """
        )
        messages = [sys_msg] + state["messages"]
        response = llm_with_tools.invoke(messages)
        return {"messages": [response]}

    # We define the explicit LangGraph workflow
    workflow = StateGraph(AgentState)

    # Add the reasoning node and the tool execution node
    workflow.add_node("agent", agent_node)
    workflow.add_node("tools", ToolNode(tools))

    # Build the edges (Reasoning -> Tool or END)
    workflow.add_edge(START, "agent")
    workflow.add_conditional_edges(
        "agent", tools_condition, {"tools": "tools", END: END}
    )
    # Once a tool executes, return to the agent reasoning node
    workflow.add_edge("tools", "agent")

    app = workflow.compile()

    # query = "How should agent skills be built? Describe with an example which aspects are important"
    print(f"Executing Native LangGraph with Query: {query}\n")

    try:
        async for output in app.astream(
            {"messages": [HumanMessage(content=query)]}, stream_mode="updates"
        ):
            for node_name, node_output in output.items():
                # Get the most recent message
                ai_msg = node_output["messages"][-1]

                if node_name == "agent":
                    if hasattr(ai_msg, "tool_calls") and ai_msg.tool_calls:
                        for call in ai_msg.tool_calls:
                            print(
                                f"⚒️ LangGraph Routing to Tool: {call['name']} ({call['args']})"
                            )
                    else:
                        print("\n--- LangGraph Final Output ---")
                        if isinstance(ai_msg.content, list):
                            final_text = "\n".join(
                                [
                                    chunk.get("text", "")
                                    for chunk in ai_msg.content
                                    if isinstance(chunk, dict)
                                    and chunk.get("type") == "text"
                                ]
                            )
                            display(Markdown(final_text))
                        else:
                            display(Markdown(str(ai_msg.content)))
                elif node_name == "tools":
                    print(
                        f"✅ Tool Node executed. Returning {len(str(ai_msg.content))} characters back to Agent.\n"
                    )

    except Exception as e:
        print(f"Error: {e}")


# query = "Wie entscheiden Agenten über ihre Aktionen?"
query = "How should skills for Claude be created? List the most important aspects and provide exmaples. Provide also the sources of your answers"
await run_langgraph_rag(query)

INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=90c075b2bd1942e1966ac0dea72593b9 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=90c075b2bd1942e1966ac0dea72593b9 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=90c075b2bd1942e1966ac0dea72593b9 "HTTP/1.1 202 Accepted"


Connecting to RAG MCP Server...
Retrieved 4 tools: ['search_library', 'get_ebook_toc', 'ingest_document', 'list_library_sources']

Executing Native LangGraph with Query: How should skills for Claude be created? List the most important aspects and provide exmaples. Provide also the sources of your answers



INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=e3f3864dc2394c359549606a67362d0c "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=e3f3864dc2394c359549606a67362d0c "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=e3f3864dc2394c359549606a67362d0c "HTTP/1.1 202 Accepted"


⚒️ LangGraph Routing to Tool: search_library ({'query': 'How to create skills for Claude'})


INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=e3f3864dc2394c359549606a67362d0c "HTTP/1.1 202 Accepted"


✅ Tool Node executed. Returning 3625 characters back to Agent.



INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3ec25a18cae845fbbd63b6ca1635fa69 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3ec25a18cae845fbbd63b6ca1635fa69 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3ec25a18cae845fbbd63b6ca1635fa69 "HTTP/1.1 202 Accepted"


⚒️ LangGraph Routing to Tool: search_library ({'query': 'best practices for creating Claude skills'})


INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3ec25a18cae845fbbd63b6ca1635fa69 "HTTP/1.1 202 Accepted"


✅ Tool Node executed. Returning 3625 characters back to Agent.



INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=9536db5547b24cffa47e50a3d7256a1a "HTTP/1.1 202 Accepted"


⚒️ LangGraph Routing to Tool: search_library ({'query': 'Claude skills structure and best practices'})


INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=9536db5547b24cffa47e50a3d7256a1a "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=9536db5547b24cffa47e50a3d7256a1a "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=9536db5547b24cffa47e50a3d7256a1a "HTTP/1.1 202 Accepted"


✅ Tool Node executed. Returning 4354 characters back to Agent.



INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://127.0.0.1:8000/sse "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3e07297966fe47cc853a3bb1e62ecf89 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3e07297966fe47cc853a3bb1e62ecf89 "HTTP/1.1 202 Accepted"
INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3e07297966fe47cc853a3bb1e62ecf89 "HTTP/1.1 202 Accepted"


⚒️ LangGraph Routing to Tool: search_library ({'query': 'technical requirements for Claude skills'})


INFO:httpx:HTTP Request: POST http://127.0.0.1:8000/messages/?session_id=3e07297966fe47cc853a3bb1e62ecf89 "HTTP/1.1 202 Accepted"


✅ Tool Node executed. Returning 4339 characters back to Agent.



INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent "HTTP/1.1 200 OK"



--- LangGraph Final Output ---


Skills for Claude are created as a set of instructions packaged in a simple folder, teaching Claude to handle specific tasks or workflows. This allows for customization and ensures Claude follows specific processes consistently without needing repeated explanations.

The most important aspects for creating skills include:

*   **Structure:** A skill is a folder containing:
    *   `SKILL.md` (required): Instructions in Markdown with YAML frontmatter.
    *   `scripts/` (optional): Executable code (Python, Bash, etc.).
    *   `references/` (optional): Documentation loaded as needed.
    *   `assets/` (optional): Templates, fonts, icons used in output.
*   **Core Design Principles:**
    *   **Progressive Disclosure:** A three-level system (YAML frontmatter, `SKILL.md` body, and linked files) that minimizes token usage while maintaining expertise.
    *   **Composability:** Skills should work well together, not assuming they are the only capability available.
    *   **Portability:** Skills should work identically across Claude.ai, Claude Code, and the API without modification, provided the environment supports any dependencies.

**Examples:**
Skills are powerful for repeatable workflows such as:
*   Generating frontend designs from specifications.
*   Conducting research with a consistent methodology.
*   Creating documents that adhere to a team's style guide.
*   Orchestrating multi-step processes.

**Sources:**
*   The-Complete-Guide-to-Building-Skill-for-Claude.pdf