This project demonstrates two approaches for enabling an LLM to explore and reason
over a shared docs/ folder using the Model Context Protocol (MCP):
- Local MCP Tool-Calling — tools executed within the Python process
- Remote MCP Server — a standalone HTTP MCP server exposed to an LLM
See the README files within the relevant local or remote folders for additional information and instructions
-
Navigate to the Groq Console:
https://console.groq.com/home - Create a new project in the dashboard.
-
Navigate to the API Keys section:
https://console.groq.com/keys - Generate a new API key.
- Copy the key (it will not be shown again).
In the root of this repository, create a file named:
.envAdd the following template:
GROQ_API_KEY=gsk_your_generated_key_here
# Only required for remote MCP usage:
DOCS_MCP_URL=https://your-public-endpoint/mcp
Notes:
GROQ_API_KEYis required for both local and remote modes.DOCS_MCP_URLis only required when using the remote MCP server.- Ensure the file is named exactly
.envand placed in the project root.
python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
-
Local MCP:
python call-local-docs-mcp.py -
Remote MCP:
Follow the detailed setup instructions inside:
remote_mcp/README.md
The local approach provides an “agentic loop” directly within Python, using Groq’s
chat.completions API with structured tool-calling. The model is exposed to two tools:
list_docs()— returns all filenames indocs/read_doc(path)— returns file contents by name
A conversational loop is constructed manually. The sequence runs as follows:
-
The script creates an initial forged chat history:
- system – instructs the model to act as a documentation assistant and to explain why it is using each tool and what it expects to get back.
- user – contains the actual query.
-
On each iteration, the full chat history plus the tool schemas is sent to the model.
The model responds with either:
- one or more structured tool calls, or
- a natural-language answer with no tool calls.
-
When tool calls are present:
- An assistant message containing the tool call(s) and the model’s commentary is added to the conversation.
- Each tool is executed locally in Python.
- A corresponding tool message containing the raw result is added.
- The loop continues with the updated history.
- When the model returns a message with no tool calls, this is treated as the final answer.
- The loop hard-stops after a small max iteration count to prevent infinite tool usage.
At each iteration the script:
- Prints a short “thoughts” trace to the console showing:
- the iteration number,
- the model’s commentary about why it is using tools, and
- the tool name(s) and raw argument JSON.
- Logs the full conversation, raw responses, tool calls, and tool results to a timestamped file under
./logs/.
The local agent can be executed directly from the project root using:
python call-local-docs-mcp.py
I have a design document outlining a project I wish to work on.
Additionally, I have employee profile documents.
Please find the most suitable employee to work on the project,
and give a quick summary of what the project is and why this employee is suitable.
[Iteration 1] Model requested tool(s):
Commentary: I will first list all documents in the shared docs folder to see what is available.
→ Tool: list_docs Args: {}
[Iteration 2] Model requested tool(s):
Commentary: I will read the design specification to understand the project requirements.
→ Tool: read_doc Args: {"path": "design-spec.md"}
[Iteration 3] Model requested tool(s):
Commentary: I will now inspect the employee profiles to find the best match for this project.
→ Tool: read_doc Args: {"path": "employee-profile-JT.txt"}
→ Tool: read_doc Args: {"path": "employee-profile-VB.txt"}
The project outlined in the design document is for a Retrieval-Augmented Generation (RAG)
data pipeline to support an MCP-based agent. The pipeline will ingest documents from a
designated project folder, transform them into searchable vector representations, and
expose retrieval capabilities as MCP tools that can be invoked by the Groq-hosted model.
Based on the employee profiles, Jeffrey Brian Thompson (Data Engineer and Scientist) is
the most suitable employee to work on this project. His primary expertise includes Data
Engineering, SQL, Python, and AWS S3 Buckets, which are relevant to the project's
requirements. Additionally, his experience in data warehousing, analytics engineering,
and technical documentation will be valuable in designing and implementing the RAG
data pipeline.
This local mode does not require networking, hosting, tunneling, or external MCP connectors.
Only a Groq API key (for the LLM) and access to the local docs/ folder are required.
The remote approach runs a standalone MCP HTTP server exposing the same tools
(list_docs and read_doc). The LLM communicates with the tools through
the Model Context Protocol rather than through direct, locally orchestrated tool-calling.
- The server is an independent FastMCP HTTP process.
- The LLM discovers tools through an MCP connector.
- The server must be reachable at a public URL (e.g. via ngrok).
- The Groq Responses API handles orchestration.
For full remote setup instructions, see:
remote_mcp/README.md
├── docs/
├── local_mcp/
├── remote_mcp/
├── call-local-docs-mcp.py
├── call-remote-docs-mcp.py
├── requirements.txt
└── .env