Add an MCP to TMLL#15

Open
MatthewKhouzam wants to merge 4 commits into eclipse-tmll:main from MatthewKhouzam:mcp

@MatthewKhouzam

What it does

This PR adds Model Context Protocol (MCP) server integration to TMLL, enabling the trace analysis library to be used as an MCP tool in AI assistants and other MCP-compatible applications.

Key additions:

  • tmll_cli.py: A command-line interface that exposes all TMLL functionality (experiment management, anomaly detection, memory leak detection, change point analysis, correlation analysis, idle resource detection, capacity planning, and clustering).
  • mcp_server_cli.py: An MCP server wrapper that exposes all CLI commands as MCP tools, allowing AI assistants to interact with TMLL programmatically.
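
As a rough illustration of the wrapper idea, the sketch below turns an MCP-style tool call into a `tmll_cli.py` command line. The `anomaly` and `list` subcommands and the `-k`/`-m` flags come from the test instructions in this PR; the tool names, the argument keys, the way keywords are passed after a single `-k`, and the helper `build_cli_args` are all assumptions for illustration and may differ from what `mcp_server_cli.py` actually does.

```python
import sys
from typing import Optional

# Hypothetical mapping from MCP tool names to tmll_cli.py subcommands.
TOOL_TO_SUBCOMMAND = {"detect_anomalies": "anomaly", "list_experiments": "list"}


def build_cli_args(cli_path: str, tool: str, arguments: Optional[dict]) -> list[str]:
    """Translate a tool call into an argv list for the CLI (assumed layout)."""
    arguments = arguments or {}
    argv = [sys.executable, cli_path, TOOL_TO_SUBCOMMAND[tool]]
    if "experiment_id" in arguments:
        argv.append(arguments["experiment_id"])
    keywords = arguments.get("keywords", [])
    if keywords:
        # Assumption: the CLI accepts one or more keywords after a single -k.
        argv += ["-k", *keywords]
    if "method" in arguments:
        argv += ["-m", arguments["method"]]
    return argv
```

The resulting list could then be handed to `subprocess.run(argv, capture_output=True, text=True)` and the captured output returned to the client as text.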

How to test

  1. Test the CLI directly:

    # Start a trace server instance
    ./tracecompass-server -data /path/to/workspace -vmargs -Dtraceserver.port=8080

    # Create an experiment
    python3 tmll_cli.py create /path/to/trace -n test_experiment

    # List experiments
    python3 tmll_cli.py list

    # Run anomaly detection
    python3 tmll_cli.py anomaly <experiment_uuid> -k "cpu usage" -m iforest

  2. Test the MCP server:

    # Run the MCP server
    python3 mcp_server_cli.py tmll_cli.py

    # Configure in an MCP-compatible client (e.g., Kiro CLI, Claude Desktop)

    # Test tool invocations through the client

  3. Verify all ML modules work through CLI:

    • Test each subcommand: anomaly, memory-leak, changepoint, correlation, idle-resources, capacity, cluster
    • Verify output formatting and error handling

Follow-ups

Review checklist

  • As an author, I have thoroughly tested my changes and carefully followed the instructions in this template

Implement tmll_cli.py with commands for experiment management and ML analysis:
- Experiment operations: create, list, delete, list-outputs, fetch-data
- Anomaly detection: anomaly, memory-leak
- Performance analysis: changepoint, correlation
- Resource optimization: idle-resources, capacity planning
- Clustering analysis

Note: this code was created with the assistance of claude sonnet 4.5

Signed-off-by: Matthew Khouzam <matthew.khouzam@ericsson.com>
Add the following MCP

mcp.json
{
  "mcpServers": {
    "tmll": {
      "command": "/usr/bin/python3",
      "args": ["path-to/tmll/mcp_server_cli.py"],
      "env": {
        "PYTHONPATH": "path-to/tmll"
      }
    }
  }
}
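
Since MCP clients read this file as strict JSON, a quick parse check can catch bracket or type mistakes before the client does. The following is a minimal sketch, not part of the PR: `check_mcp_config` is a hypothetical helper, and its checks only reflect the `mcpServers` layout shown here rather than any formal schema.

```python
import json

CONFIG_TEXT = """
{
  "mcpServers": {
    "tmll": {
      "command": "/usr/bin/python3",
      "args": ["path-to/tmll/mcp_server_cli.py"],
      "env": {"PYTHONPATH": "path-to/tmll"}
    }
  }
}
"""


def check_mcp_config(text: str) -> dict:
    """Parse the config and check that each server has a command and list-valued args."""
    config = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    for name, server in config["mcpServers"].items():
        if not isinstance(server.get("command"), str):
            raise ValueError(f"{name}: 'command' must be a string")
        if not isinstance(server.get("args", []), list):
            raise ValueError(f"{name}: 'args' must be a list")
    return config
```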

This code creation was assisted by claude-sonnet-4.5

Signed-off-by: Matthew Khouzam <matthew.khouzam@ericsson.com>

@kavehshahedi kavehshahedi left a comment

Thanks a lot Matthew for the MCP! I tested it with Claude Code (with some minor changes I put comments for), and it was working very well!

Vibe coders are now ready to become tracing masters, right?

tmll_cli.py Outdated
client = TMLLClient(args.host, args.port, verbose=args.verbose)
traces = [{"path": os.path.expanduser(path)} for path in args.traces]
experiment = client.create_experiment(traces=traces, experiment_name=args.name)
print(f"Created experiment: {experiment.name} (UUID: {experiment.UUID})")

The experiment.UUID should be experiment.uuid

tmll_cli.py Outdated
Comment on lines +109 to +115
outputs = experiment.find_outputs(keyword=args.keywords, type=['xy'])

if not outputs:
    print("No outputs found")
    return

mld = MemoryLeakDetection(client, experiment, outputs)

The memory leak detection module doesn't accept outputs. So, you should just use:

mld = MemoryLeakDetection(client, experiment)

tmll_cli.py Outdated
return

mld = MemoryLeakDetection(client, experiment, outputs)
result = mld.detect_memory_leak()

It should be mld.analyze_memory_leaks()

tmll_cli.py Outdated
return

ca = CorrelationAnalysis(client, experiment, outputs)
correlations = ca.analyze_correlation(method=args.method)

This should be: ca.analyze_correlations(method=args.method)

tmll_cli.py Outdated
Comment on lines +67 to +69
if args.output:
    for key, df in data.items():
        df.to_csv(f"{args.output}_{key}.csv", index=False)

This might fail when the data is not a dictionary of <key, df>. In some cases the output may be dict<key, dict<key, df>>, so you should probably handle those cases as well.
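
One possible way to handle both shapes is to flatten the dictionary recursively before writing. The sketch below is only an illustration of that idea: `iter_frames` and `save_outputs` are hypothetical helpers, not functions from TMLL, and the underscore-joined file naming is an assumption that extends the existing `{output}_{key}.csv` pattern.

```python
from typing import Any, Iterator


def iter_frames(data: dict, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (flattened_key, leaf) pairs from an arbitrarily nested dict."""
    for key, value in data.items():
        name = f"{prefix}_{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the joined key along.
            yield from iter_frames(value, name)
        else:
            yield name, value


def save_outputs(data: dict, output_prefix: str) -> list[str]:
    """Write every leaf that has a to_csv method; return the paths written."""
    written = []
    for name, frame in iter_frames(data):
        if hasattr(frame, "to_csv"):
            path = f"{output_prefix}_{name}.csv"
            frame.to_csv(path, index=False)
            written.append(path)
    return written
```

Keys that contain spaces or path separators would still need sanitizing before being used in a file name; that step is left out here.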

tmll_cli.py Outdated

# anomaly command
anomaly_parser = subparsers.add_parser("anomaly", help="Detect anomalies")
anomaly_parser.add_argument("experiment", help="Experiment UUID or name")

I guess here we should only pass the UUID, and not the name. Same comment for the other commands below.



@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:

As I checked, I guess some commands may send None for arguments. So, it might be better to make it Optional[dict] = None, and then, check for it:

arguments = arguments if isinstance(arguments, dict) else {}

so we don't get an exception when trying to .get() from it.
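
A standalone sketch of that guard, kept independent of the MCP SDK so it can be checked in isolation. The helper names `normalize_arguments` and `get_argument` are hypothetical; in the server the same expression would simply sit at the top of `call_tool`.

```python
from typing import Any, Optional


def normalize_arguments(arguments: Optional[dict]) -> dict:
    """Return a dict even when the client sends None (or another non-dict value)."""
    return arguments if isinstance(arguments, dict) else {}


def get_argument(arguments: Optional[dict], key: str, default: Any = None) -> Any:
    """Look up a tool argument without raising when arguments is None."""
    return normalize_arguments(arguments).get(key, default)
```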


server = Server("tmll-cli-mcp-server")

CLI_PATH = sys.argv[1] if len(sys.argv) > 1 else "tmll_cli.py"

If the path is not given, "tmll_cli.py" is resolved relative to the current working directory, so it is probably better to anchor it to the script's own location:

CLI_PATH = sys.argv[1] if len(sys.argv) > 1 else Path(__file__).resolve().parent / "tmll_cli.py"
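
The one-liner above yields a `str` in one branch and a `Path` in the other. A small variation, sketched here with a hypothetical helper `resolve_cli_path`, returns a `Path` in both cases so callers do not need to care which branch was taken; expanding `~` in the user-supplied path is an extra assumption, not something the review asked for.

```python
from pathlib import Path


def resolve_cli_path(argv: list[str], script_file: str) -> Path:
    """Pick the CLI script: explicit argument first, else a sibling of the server script."""
    if len(argv) > 1:
        return Path(argv[1]).expanduser().resolve()
    # Default: tmll_cli.py next to the server script, independent of the cwd.
    return Path(script_file).resolve().parent / "tmll_cli.py"
```

In the server this would be called as `CLI_PATH = resolve_cli_path(sys.argv, __file__)`, and `str(CLI_PATH)` passed to the subprocess call.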

Comment on lines +180 to +194
    Tool(
        name="cluster_data",
        description="Perform clustering analysis on trace data (kmeans, dbscan, hierarchical)",
        inputSchema={
            "type": "object",
            "properties": {
                "experiment_id": {"type": "string"},
                "keywords": {"type": "array", "items": {"type": "string"}, "default": ["cpu usage"]},
                "n_clusters": {"type": "integer", "default": 3},
                "method": {"type": "string", "default": "kmeans", "enum": ["kmeans", "dbscan", "hierarchical"]},
            },
            "required": ["experiment_id"],
        },
    ),
]

Please check out my comment on the clustering module.

@kavehshahedi

Also, I think you should add the mcp package to the "requirements.txt" in order to run the MCP script:

mcp==1.27.0

And, could you please place the scripts in a proper package within the tmll src files? Maybe tmll/mcp?

- Fix experiment.UUID -> experiment.uuid
- Add None check for experiment in create_experiment
- Fix MemoryLeakDetection: remove outputs param, use analyze_memory_leaks()
- Fix ChangePointAnalysis: method -> methods (list of analysis modes)
- Fix CorrelationAnalysis: analyze_correlation -> analyze_correlations,
  plot_correlation -> plot_correlation_matrix
- Fix IdleResourceDetection: single threshold -> per-resource thresholds
- Fix CapacityPlanning: plan_capacity -> forecast_capacity(forecast_steps=)
- Remove clustering command (module not meaningful)
- Fix help text: 'UUID or name' -> 'UUID'
- Handle nested dict data in fetch_data_cmd
- Move scripts to tmll/mcp package
- Fix MCP server CLI_PATH to use Path(__file__) for reliable resolution
- Make MCP call_tool arguments Optional[dict] with None guard
- Add mcp==1.27.0 to requirements.txt
When the trace server returns a non-200 status for both datatree and
timegraph tree endpoints, response.model is None. Accessing .model on
None caused an AttributeError, failing CI tests.
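
The kind of guard this commit describes can be sketched as follows. `FakeResponse`, `Payload`, and `extract_model` are stand-ins invented for illustration; the real client classes and the actual fix in TMLL may be shaped differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Payload:
    """Hypothetical wrapper whose .model holds the actual tree."""
    model: Any


@dataclass
class FakeResponse:
    """Stand-in for a trace server client response (assumed shape)."""
    status_code: int
    model: Optional[Payload] = None


def extract_model(response: Optional[FakeResponse]) -> Optional[Any]:
    """Return the inner model, or None when the request failed or carried no payload."""
    if response is None or response.status_code != 200 or response.model is None:
        return None
    return response.model.model
```

A caller can then try the datatree response first and fall back to the timegraph tree response, treating `None` from both as "no data" rather than letting an `AttributeError` propagate.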

@kavehshahedi kavehshahedi left a comment

LGTM! Thanks for the fixes! Feel free to merge whenever you want!
