This directory contains integration scripts to bridge our local HuggingFace multimodal models with the Model Context Protocol (MCP) ecosystem. It allows our models to both consume external tools (as an MCP Client) and provide their capabilities to external MCP-compatible agents (as an MCP Server).
The Model Context Protocol standardizes how AI models interact with data sources and tools. This directory includes two separate standalone implementations:
- `mcp_client.py`: A client that loads a local LLM/multimodal model and connects it to any standard MCP server. This enables the model to ask the server for available tools and execute them dynamically in an interactive loop.
- `mcp_server.py`: A host application that exposes our local multimodal model as an MCP server (using the `FastMCP` framework). This allows MCP hosts (like Claude Desktop) or other agents to call our model as a standard tool.
This script loads a HuggingFace causal LM, connects to an external MCP server over stdio, retrieves the available tools, and prompts the model to invoke them via structured JSON.
```bash
python mcp_client.py \
    --model-path "path/to/local/hf/model" \
    --server "npx" \
    --server-args "-y" "@modelcontextprotocol/server-filesystem" "/Users/path/to/expose" \
    --prompt "List the files in the exposed directory."
```

- Connects over standard I/O streams using `mcp.client.stdio`.
- Dynamically converts the MCP server's JSON tool schemas into prompt descriptions.
- Parses the model's output for structured JSON, automatically executes the requested tool, and feeds the result back to the model.
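The schema-to-prompt conversion and output-parsing steps above can be sketched as follows. `tools_to_prompt` and `extract_tool_call` are illustrative names for this sketch, not the actual functions in `mcp_client.py`, and the brace-counting parser is a simplification (it ignores braces inside JSON strings):

```python
import json

def tools_to_prompt(tools: list[dict]) -> str:
    """Render MCP tool definitions (name/description/inputSchema) as prompt text."""
    lines = []
    for t in tools:
        params = ", ".join((t.get("inputSchema") or {}).get("properties", {}))
        lines.append(f"- {t['name']}({params}): {t.get('description', '')}")
    return "Available tools:\n" + "\n".join(lines)

def extract_tool_call(text: str):
    """Find the first balanced {...} span in model output and parse it as JSON."""
    start = text.find("{")
    while start != -1:
        depth = 0
        for i, ch in enumerate(text[start:], start):
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start:i + 1])
                    except json.JSONDecodeError:
                        break  # not valid JSON; try the next '{'
        start = text.find("{", start + 1)
    return None  # no tool call found in the model output
```

A client loop would append `tools_to_prompt(...)` to the system prompt, then call `extract_tool_call(...)` on each model response to decide whether to execute a tool or return the text to the user.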
This script wraps a HuggingFace multimodal model behind the MCP boundary using the FastMCP framework. It serves as an execution endpoint for multimodal inferences.
- `analyze_image(image_path, prompt)`: Opens a local image, runs it through the vision processor, and returns the model's inference.
- `generate_text(prompt)`: Standard text-only inference.
```bash
python mcp_server.py --model-path "path/to/local/multimodal/model"
```

Once started, the server runs an event loop over stdio. You can connect to it with any compliant MCP client and invoke the `analyze_image` or `generate_text` tools.
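For example, a stdio MCP host such as Claude Desktop can launch the server from its `claude_desktop_config.json`; the `local-multimodal` key and the paths below are placeholders to adapt to your setup:

```json
{
  "mcpServers": {
    "local-multimodal": {
      "command": "python",
      "args": ["mcp_server.py", "--model-path", "path/to/local/multimodal/model"]
    }
  }
}
```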
You need the `mcp` SDK installed to run these integrations:

```bash
pip install mcp transformers torch pillow
```