Trino MCP server

This folder contains a small Model Context Protocol (MCP) server that connects an LLM agent to Trino over the official Python client. The server is implemented in trino_mcp.py using FastMCP and exposes tools the model can call to discover catalogs, schemas, tables, column metadata, and to run SQL with bounded row limits.

What this gives an LLM agent

Trino is a query engine: it federates many data sources behind one SQL dialect and one JDBC/HTTP API. The MCP layer does not replace Trino; it wraps Trino so the agent gets a stable, tool-based interface (list_catalogs, describe_table, query, …) instead of raw connection strings or ad-hoc scripts.

Typical agent loop:

Discover what exists (list_catalogs → list_schemas → list_tables → describe_table).
Plan a read-only SQL query using names and types from DESCRIBE.
Execute with query and interpret the returned rows/columns as context for reasoning or summarization.

Row payloads are always returned as tabular JSON-friendly structures (columns + rows), which maps cleanly to how models consume tool results.

How Trino represents structured, semi-structured, and unstructured data

Trino’s strength is that one SQL surface can address very different physical formats. From the agent’s perspective, almost everything still arrives as named columns and rows (sometimes with nested or string-typed cells).

Kind of data	What it usually means in Trino	How the agent sees it via this MCP
Structured	Relational tables (Iceberg/Hive/Delta, RDBMS connectors, etc.) with scalar columns	`describe_table` returns a clear schema; `query` returns typed rows. Best case for reliable joins and aggregations.
Semi-structured	`ROW`, `MAP`, `ARRAY`, `JSON` types; nested columns in Iceberg; key/value or document-like fields	Still columns in the result set. Complex values may serialize as nested structures or strings depending on the client and types; the agent should read `DESCRIBE` output and sample with `SELECT` limits.
Unstructured or file-like	Blobs, file paths, object storage references, log lines in `VARCHAR`, or specialized table functions (e.g. reading files through a connector)	Often exposed as metadata + text/path columns or parsed fragments (e.g. a line, a JSON envelope, OCR text). The model reasons over those representations; Trino does not stream raw binary to the MCP—use SQL to extract the slice the agent needs (`substr`, JSON functions, regexp, etc.).

In short: Trino normalizes access; MCP transports the query outcome to the model as structured tool output. For semi-structured and unstructured sources, good SQL (projection, filtering, sampling) keeps responses within the row cap and improves answer quality.

Configuration

Create a .env next to trino_mcp.py (the server calls load_dotenv() on startup). Required and optional variables:

Variable	Required	Default	Purpose
`TRINO_HOST`	Yes	—	Coordinator hostname
`TRINO_USER`	Yes	—	Username sent to Trino
`TRINO_PASSWORD`	No	empty	If set, enables `BasicAuthentication`
`TRINO_PORT`	No	`8080`	Coordinator port
`TRINO_HTTP_SCHEME`	No	`https` if password set, else `http`	HTTP scheme
`TRINO_VERIFY_SSL`	No	`true`	TLS verification (`1`/`true`/`yes`/`on` to verify)
`TRINO_CATALOG`	No	empty	When `list_tables` is called with a schema that has no dot, the server prefixes `TRINO_CATALOG.`

Connection setup is in _get_connection() in trino_mcp.py: it uses trino.dbapi.connect with optional basic auth and configurable TLS verification.

Running the server

From this directory, with dependencies installed (see pyproject.toml in this folder for the broader course stack, including trino and mcp[cli]):

python trino_mcp.py

The process writes Starting Trino MCP... to stderr and runs the MCP transport (stdio) via FastMCP’s mcp.run().

Wiring into Cursor (example)

Add a server entry that launches this script with the same Python environment where mcp and trino are installed, and ensure TRINO_* variables are available (inline env or a loaded .env—depending on how you start the process).

Conceptually:

{
  "mcpServers": {
    "trino": {
      "command": "python",
      "args": ["C:/Users/you/project/mcp/trino_mcp.py"],
      "env": {
        "TRINO_HOST": "trino.example.com",
        "TRINO_USER": "agent_reader",
        "TRINO_PORT": "443",
        "TRINO_HTTP_SCHEME": "https",
        "TRINO_PASSWORD": "use-secret-manager-in-practice"
      }
    }
  }
}

Adjust paths and secrets to your environment; prefer read-only Trino users and secret storage over plaintext passwords in config files.

Tools exposed to the agent

All tools that run SQL share _fetch_query, which returns:

columns: list of column names
rows: up to max_rows rows (hard cap 1000 at the API level)
row_count, max_rows

max_rows must be between 1 and 1000 for query; other listing tools use up to 1000 rows internally.

Tool	Role
`query(sql, max_rows=100)`	Arbitrary Trino SQL read path; primary interface for structured/semi-structured extraction
`list_catalogs()`	`SHOW CATALOGS`
`list_schemas(catalog)`	`SHOW SCHEMAS FROM "<catalog>"` with safe identifier quoting
`list_tables(schema)`	`SHOW TABLES FROM "<catalog.schema>"` (optional `TRINO_CATALOG` prefix when schema has no dot)
`describe_table(table)`	`DESCRIBE "<catalog.schema.table>"` for column names/types

Identifiers are passed through _quote_identifier_path, which splits on . and emits double-quoted segments—reducing accidental injection when composing SHOW/DESCRIBE statements.

Operational and security notes

Prefer read-only credentials at Trino (system, catalog, or session-level) so query cannot mutate data unless you explicitly allow it.
Row caps limit accidental huge responses; agents should use LIMIT and selective projections.
PII and compliance: the model receives whatever columns your SQL returns. Narrow selects and masking views are usually better than wide SELECT *.
Performance: heavy scans still hit Trino workers; the MCP server only bounds returned rows, not cluster work—use partitions, predicates, and appropriate connectors.

Related files

trino_mcp.py — MCP server and Trino integration
pyproject.toml — Python dependencies for this course/module layout
mcp_client.py — example MCP client code (currently pointed at a different reference server path); adapt the pattern if you want automated tests against trino_mcp.py

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
README.md		README.md
file.txt		file.txt
mcp_client.py		mcp_client.py
pyproject.toml		pyproject.toml
tool_calling.ipynb		tool_calling.ipynb
trino_mcp.py		trino_mcp.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Trino MCP server

What this gives an LLM agent

How Trino represents structured, semi-structured, and unstructured data

Configuration

Running the server

Wiring into Cursor (example)

Tools exposed to the agent

Operational and security notes

Related files

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Trino MCP server

What this gives an LLM agent

How Trino represents structured, semi-structured, and unstructured data

Configuration

Running the server

Wiring into Cursor (example)

Tools exposed to the agent

Operational and security notes

Related files

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages