# CAMEL Cookbook - Object Detection with ACI.dev MCP Tools

**Description:** Learn how to build an object detection agent using CAMEL AI and ACI.dev's MCP protocol for seamless ML tasks. 

⭐ Star us on [GitHub](https://github.com/camel-ai/camel), join our [Discord](https://discord.gg/EXAMPLE), or follow us on [X](https://x.com/camelaiorg)

This cookbook shows how to build an object detection agent using CAMEL AI connected to ACI.dev's MCP tools. We'll create an agent that analyzes images, detects objects like cars or trees, and explains the results in natural language, all without writing complex ML code.

**Key Learnings:**
- Why agents need tools to be truly useful.
- How MCP enables dynamic, aware tool usage for tasks like object detection.
- Setting up CAMEL with ACI.dev for real-time image analysis.
- Building and running your own object detection agent.
- Handling outputs with summaries, tables, and visualized results.

This setup uses CAMEL's `MCPToolkit` to connect to ACI.dev's MCP servers, powering object detection via Replicate's ML models.

### 📦 Installation

In [None]:
%pip install camel-ai aci-mcp aci-sdk python-dotenv

Store your API keys (`ACI_API_KEY`, `LINKED_ACCOUNT_OWNER_ID`, `GOOGLE_API_KEY`, `REPLICATE_API_TOKEN`) in a `.env` file so that `load_dotenv()` can pick them up.

In [12]:
import os
from dotenv import load_dotenv
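
Before going further, it can help to confirm that every key listed above is actually set. The following is a minimal sketch rather than part of the original notebook: `REQUIRED_KEYS` and `missing_keys` are hypothetical names introduced here, and the check only tests for presence, not validity.

```python
import os

# The four keys named in the setup step above.
REQUIRED_KEYS = (
    "ACI_API_KEY",
    "LINKED_ACCOUNT_OWNER_ID",
    "GOOGLE_API_KEY",
    "REPLICATE_API_TOKEN",
)


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys()` after `load_dotenv()` returns an empty list when all keys are present; any names it returns are the ones still to add to `.env`.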

### Define CAMEL agents

In [13]:
load_dotenv()

agent_name = "ObjectDetectionAgent"
system_prompt = """
You are a specialized Object Detection Agent. Your primary function is to use the `REPLICATE.run` tool for object detection and present the findings in a user-friendly format.
The user will provide a text prompt containing an image URL and a query. You must extract the `image` URL and the `query` object(s).
Immediately call the `REPLICATE.run` tool. The `input` must be a dictionary with two keys: `image` (the URL) and `query` (a string of the object(s)).
Do not ask for clarification; make a reasonable inference if the query is ambiguous.
After receiving the tool's output, format your response as follows:
- **Natural Language Summary:** Start with a detailed, friendly, insightful analysis of the detection results in plain English.
- **Markdown Table:** Create a markdown table with columns: 'Object', 'Confidence Score', and 'Bounding Box Coordinates'.
- **Result Image:** If the tool provides a URL for an image with bounding boxes, display it using markdown: `![Detected Objects](URL_HERE)`.
Whenever I give you a link, trigger the tool call, extract its outputs and links, and present them to me in proper markdown format with a detailed natural-language analysis of the tool call.
"""
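
The prompt asks the model to pull an image URL out of the user's text and pass a two-key `input` dictionary to the tool. In the notebook the language model does this extraction itself; the sketch below only illustrates the payload shape the prompt describes. `build_replicate_input` and the regular expression are assumptions made for illustration, not part of CAMEL or ACI.dev.

```python
import re

# Naive URL matcher, good enough for illustration only.
URL_PATTERN = re.compile(r"https?://\S+")


def build_replicate_input(prompt, query):
    """Build the {'image', 'query'} dictionary described in the system prompt."""
    match = URL_PATTERN.search(prompt)
    if match is None:
        raise ValueError("no image URL found in prompt")
    # Drop punctuation that often trails a URL at the end of a sentence.
    url = match.group(0).rstrip(".,;)]")
    return {"image": url, "query": query}
```

For example, a prompt ending in an image link together with the query string `"car, person"` yields a dictionary with that link under `image` and the string under `query`.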

### MCP servers configuration using ACI.dev

In [None]:
from camel.toolkits import MCPToolkit
mcp_config = {
    "mcpServers": {
        "aci_apps": {
            "command": "aci-mcp",
            "args": [
                "apps-server",
                "--apps=REPLICATE",
                "--linked-account-owner-id",
                os.getenv("LINKED_ACCOUNT_OWNER_ID"),  # read from .env instead of hard-coding
            ],
            "env": {"ACI_API_KEY": os.getenv("ACI_API_KEY")},
        }
    }
}
mcp_toolkit = MCPToolkit(config_dict=mcp_config)
# Start the MCP server and open the session before requesting tools.
await mcp_toolkit.connect()


<camel.toolkits.mcp_toolkit.MCPToolkit at 0x79b6b05d2bd0>
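
The configuration dictionary above can also be assembled by a small function, which makes it easier to fail early when a key is missing. This is a sketch under stated assumptions: `build_mcp_config` is a hypothetical helper, and joining several app names with commas in `--apps` is an assumption about the `aci-mcp` CLI that should be checked against its documentation.

```python
def build_mcp_config(api_key, owner_id, apps=("REPLICATE",)):
    """Return an MCP config dict shaped like the one used in this cookbook."""
    if not api_key or not owner_id:
        raise ValueError("ACI_API_KEY and LINKED_ACCOUNT_OWNER_ID must both be set")
    return {
        "mcpServers": {
            "aci_apps": {
                "command": "aci-mcp",
                "args": [
                    "apps-server",
                    # Assumption: multiple apps are passed comma-separated.
                    f"--apps={','.join(apps)}",
                    "--linked-account-owner-id",
                    owner_id,
                ],
                "env": {"ACI_API_KEY": api_key},
            }
        }
    }
```

The result can be passed to `MCPToolkit(config_dict=...)` in the same way as the literal dictionary.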

Next, define the CAMEL agent with the Gemini 2.5 Flash model and the Replicate MCP tools.

In [None]:
from camel.agents import ChatAgent
from camel.messages import BaseMessage
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType

tools = mcp_toolkit.get_tools()

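# Initialize the Gemini model. The model name is passed as a string, and the
# key is read from the GOOGLE_API_KEY entry stored in .env.
model = ModelFactory.create(
    model_platform=ModelPlatformType.GEMINI,
    model_type="gemini-2.5-flash",
    api_key=os.getenv("GOOGLE_API_KEY"),
)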

# Create system message
sys_msg = BaseMessage.make_assistant_message(
    role_name=agent_name,
    content=system_prompt,
)

agent = ChatAgent(model=model, system_message=sys_msg, tools=tools, memory=None)

With the agent created, start the interactive chat loop and use Replicate to analyze any image URL you provide. A few sample prompts are included in the comments below.

In [17]:
# Sample prompts for reference:
# 1. "Analyze the vegetable stall and identify all produce, including tomato, onion, cabbage, cucumber, zucchini, carrot, and beet, in this image: https://images.pexels.com/photos/2255935/pexels-photo-2255935.jpeg"
# 2. "Analyze the busy street scene and identify all vehicles, such as car, bus, and truck, as well as people, in this image: https://www.livemint.com/rf/Image-621x414/LiveMint/Period1/2012/10/01/Photos/Road621.jpg"
# 3. "Analyze the warehouse scene and identify persons, cardboard boxes, and conveyor belts in this image: https://media.business-humanrights.org/media/images/16278498935_dac4d8f223_o.2e16d0ba.fill-1000x1000-c50.jpg"

print("Welcome to the Object Detection Chat! Enter 'quit' to exit.")
print("Please provide an image URL and what objects you'd like to detect.")
print("Example format: Analyze this image for cars and people: [IMAGE_URL]")

while True:
    try:
        user_input = input("\nEnter your prompt: ").strip()
        
        if user_input.lower() == 'quit':
            print("Thank you for using the Object Detection Chat!")
            break
            
        if not user_input:
            print("Please enter a valid prompt with an image URL.")
            continue
            
        response = await agent.astep(user_input)
        print("\nAnalysis Results:")
        print(response.msg.content)
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        print("Please try again with a different image URL or prompt.")

Welcome to the Object Detection Chat! Enter 'quit' to exit.
Please provide an image URL and what objects you'd like to detect.
Example format: Analyze this image for cars and people: [IMAGE_URL]

Analysis Results:
Based on the object detection analysis of the vegetable stall image, several types of produce were successfully identified. The model detected multiple instances of onions, with varying confidence levels. A large cabbage was also identified with a good confidence score. A single carrot and a zucchini were detected, each with a reasonable confidence. Additionally, a significant cluster of tomatoes was identified. It appears that cucumber and beet were not detected in this image based on the provided query.

| Object | Confidence Score | Bounding Box Coordinates |
|---|---|---|
| carrot | 0.465 | [294, 916, 1001, 1438] |
| cabbage | 0.497 | [110, 1982, 1436, 3150] |
| onion | 0.412 | [742, 3535, 1378, 4186] |
| onion | 0.423 | [778, 4032, 1507, 4837] |
| onion | 0.373 | [10, 37
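
The markdown table in the agent's reply can be turned back into structured data for filtering or further processing. The sketch below assumes the three-column layout requested in the system prompt; `parse_detection_table` is a hypothetical helper, and rows that are incomplete (for example, cut off mid-line) are skipped rather than guessed.

```python
import json


def parse_detection_table(markdown):
    """Parse 'Object | Confidence Score | Bounding Box' rows into dictionaries."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # not a complete table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue
        label, score, box = cells
        try:
            rows.append(
                {"object": label, "confidence": float(score), "box": json.loads(box)}
            )
        except ValueError:
            continue  # header or separator row
    return rows
```

A list comprehension such as `[r for r in rows if r["confidence"] >= 0.45]` then keeps only the detections above a chosen threshold.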

Finally, we disconnect the MCP toolkit. 

In [12]:
await mcp_toolkit.disconnect()
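
If the chat loop raises an unexpected error, the disconnect cell may never run. One way to guard against that is to wrap the work in `try`/`finally`. This is a minimal sketch: `run_with_toolkit` and `job` are hypothetical names, and the only assumption about the toolkit is that it exposes the awaitable `connect()` and `disconnect()` methods used in this notebook.

```python
async def run_with_toolkit(toolkit, job):
    """Connect, await `job(toolkit)`, and always disconnect afterwards."""
    await toolkit.connect()
    try:
        return await job(toolkit)
    finally:
        # Runs whether `job` returned normally or raised.
        await toolkit.disconnect()
```

In a notebook this would be called as `await run_with_toolkit(mcp_toolkit, job)`, where `job` is a coroutine function that builds the agent and runs the chat loop.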