confluence-mcp is a standalone MCP server for retrieval from a pre-synchronized Confluence RAG index.
The server does not write the final user-facing answer. It returns indexed chunks, citations, diagnostics, and page bundles; the MCP client or agent writes the final answer.
- Index Confluence pages selected by configured CQL.
- Store documents and vectors in one local SQLite database with sqlite-vec.
- Search the synchronized index semantically.
- Return absolute Confluence URLs in citations.
- Return a full page bundle only for pages already present in the index.
- Live-hydrate the primary matching page from Confluence, with fallback to an indexed snapshot.
- Incrementally sync changed pages, comments, attachments, and image metadata.
- Expose only retrieval MCP tools.
- No arbitrary CQL from MCP clients.
- No arbitrary live Confluence page fetch by MCP clients.
- No writes to Confluence.
- No final natural-language answer generation.
- No external vector database runtime.
- confluence_client.py: Confluence REST client for CQL search, page hydration, comments, attachments, and images.
- normalizer.py: Confluence storage HTML to Markdown.
- chunker.py: searchable records and non-searchable page snapshots.
- embeddings.py: OpenAI-compatible embeddings client.
- sqlite_store.py: SQLite tables plus sqlite-vec virtual tables.
- sync.py: reindex, reindex --all, sync, and sync --all.
- rag.py: retrieval behavior for MCP tools.
- server.py: FastMCP tool registration and stdio/HTTP transports.
- cli.py: confluence-mcp command.
python -m venv .venv
source .venv/bin/activate
python -m pip install -e .
On Git Bash for Windows, activate with:
source .venv/Scripts/activate
Configuration precedence is:
- Real process environment variables.
- .env in the server process working directory.
- TOML config from --config, CONFLUENCE_MCP_CONFIG, or local confluence-mcp.toml.
Use an absolute CONFLUENCE_MCP_CONFIG path in editor extensions. Some clients start MCP servers from their own working directory, so local .env and relative TOML paths may not resolve as expected.
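The precedence order can be pictured as a lookup that walks the three sources from highest to lowest priority. The following is only a minimal sketch: `resolve_setting` and its arguments are hypothetical names, not part of confluence-mcp, and treating an empty string as "not set" is an assumption.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    process_env: Mapping[str, str],
    dotenv_values: Mapping[str, str],
    toml_values: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value, checking sources in precedence order."""
    # Order mirrors the documented precedence: process env, then .env, then TOML.
    for source in (process_env, dotenv_values, toml_values):
        value = source.get(key)
        if value:  # assumption: missing and empty values both count as unset
            return value
    return None
```

With this shape, a value exported in the real environment always wins over the same key in .env, and TOML is consulted only when neither of the other two provides it.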
Only Confluence access, embeddings access, and at least one CQL index are required:
[confluence]
base_url = "https://conf.example.com/"
api_token = ""
[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"
[[indexes]]
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"
With one unnamed index, the index name defaults to default, and it also becomes the default index.
default_index = "confluence_main"
include_storage_html_debug = false
[confluence]
base_url = "https://conf.example.com/"
api_token = ""
auth_mode = "bearer"
username = ""
password = ""
verify_ssl = true
[sqlite]
path = ".confluence-mcp.sqlite"
[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"
batch_size = 64
allow_mock = false
[retrieval]
top_k_chunks = 10
min_relevance_score = 0.50
include_images = true
max_images = 100
max_total_image_bytes = 104857600
max_page_text_chars = 200000
[transport]
mode = "stdio"
host = "127.0.0.1"
port = 8000
path = "/mcp"
[[indexes]]
name = "confluence_main"
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"
space = "DOC"
Defaults:
- default_index: inferred when exactly one index is configured.
- include_storage_html_debug: false.
- confluence.auth_mode: bearer.
- confluence.verify_ssl: true.
- sqlite.path: .confluence-mcp.sqlite.
- embeddings.batch_size: 64.
- embeddings.allow_mock: false.
- retrieval.top_k_chunks: 10.
- retrieval.min_relevance_score: 0.50.
- retrieval.include_images: true.
- retrieval.max_images: 100.
- retrieval.max_total_image_bytes: 104857600.
- retrieval.max_page_text_chars: 200000.
- transport.mode: stdio.
- transport.host: 127.0.0.1.
- transport.port: 8000.
- transport.path: /mcp.
- indexes[].name: default when omitted and no default_index is set.
Environment variables are useful for secrets and deployment overrides. TOML is better for stable project structure.
Important variables:
- CONFLUENCE_MCP_CONFIG: absolute path to TOML config.
- CONFLUENCE_URL
- CONFLUENCE_API_TOKEN
- CONFLUENCE_USERNAME
- CONFLUENCE_PASSWORD
- CONFLUENCE_MCP_SQLITE_PATH
- EMBEDDINGS_BASE_URL
- EMBEDDINGS_API_KEY
- EMBEDDINGS_MODEL
- CONFLUENCE_DEFAULT_INDEX
- CONFLUENCE_INDEX_NAME
- CONFLUENCE_INDEX_CQL
- RAG_MIN_RELEVANCE_SCORE
.env is just a local convenience file loaded after real environment variables and before TOML. Do not rely on .env for editor MCP clients unless you control the server working directory.
Supported Confluence authentication modes:
- Bearer token: api_token with auth_mode = "bearer"; this is the default.
- Pre-encoded Basic header: api_token with auth_mode = "basic".
- Username/password: username and password, sent with HTTP Basic auth.
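The three modes above map onto two standard HTTP Authorization schemes. The sketch below shows one plausible way to build the header. `build_auth_header` is a hypothetical helper, and preferring api_token over username/password when both are set is an assumption, not documented behavior.

```python
import base64
from typing import Dict


def build_auth_header(
    api_token: str = "",
    auth_mode: str = "bearer",
    username: str = "",
    password: str = "",
) -> Dict[str, str]:
    """Build an Authorization header for the configured credentials."""
    if api_token:
        if auth_mode == "bearer":
            return {"Authorization": f"Bearer {api_token}"}
        if auth_mode == "basic":
            # The token is already base64("user:secret"), so it is sent unchanged.
            return {"Authorization": f"Basic {api_token}"}
        raise ValueError(f"unsupported auth_mode: {auth_mode!r}")
    if username and password:
        # Standard HTTP Basic auth: base64 of "username:password".
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError("no Confluence credentials configured")
```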
For Confluence Cloud, prefer the token flow required by your organization. Do not commit credentials.
The embeddings endpoint must be OpenAI-compatible:
POST {EMBEDDINGS_BASE_URL}/v1/embeddings
Authorization: Bearer {EMBEDDINGS_API_KEY}
If EMBEDDINGS_BASE_URL already ends with /v1/, the server appends only embeddings.
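The URL rule above can be sketched as a small function. `embeddings_url` is a hypothetical name. Stripping trailing slashes first, and treating a base URL that ends in /v1 without a slash the same as /v1/, are assumptions made for this illustration.

```python
def embeddings_url(base_url: str) -> str:
    """Join the base URL and the embeddings path without duplicating /v1."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        # Base already carries the version segment: append only "embeddings".
        return trimmed + "/embeddings"
    return trimmed + "/v1/embeddings"
```

Under these assumptions, both https://openai-compatible.example.com/ and https://openai-compatible.example.com/v1/ resolve to the same endpoint.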
mock:// embeddings are only for local smoke tests:
EMBEDDINGS_BASE_URL=mock://embeddings
EMBEDDINGS_API_KEY=x
EMBEDDINGS_MODEL=mock
EMBEDDINGS_ALLOW_MOCK=true
Mock vectors are lexical, not semantic. Re-run reindex after changing the embeddings model, provider, or vector dimension.
Full rebuild of one index:
confluence-mcp reindex --index confluence_main
Full rebuild of all indexes:
confluence-mcp reindex --all
Dry run:
confluence-mcp reindex --all --dry-run
Incremental sync of one index:
confluence-mcp sync --index confluence_main
Incremental sync of all indexes:
confluence-mcp sync --all
sync removes stale pages, reindexes changed pages, comments, attachments, and image metadata, then advances high_watermark_at only after successful writes. If no watermark exists, sync falls back to reindex.
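The watermark rule can be illustrated with a simplified control-flow sketch. This is not the code in sync.py: `run_sync` and its callback parameters are hypothetical, stale-page removal is omitted, and timestamps are assumed to be ISO 8601 strings in one consistent format so that string comparison orders them correctly.

```python
from typing import Callable, Iterable, Optional, Tuple


def run_sync(
    watermark: Optional[str],
    fetch_changed: Callable[[str], Iterable[Tuple[str, str]]],
    write_page: Callable[[str], None],
    full_reindex: Callable[[], str],
) -> str:
    """Return the new watermark after an incremental sync."""
    if watermark is None:
        # No previous watermark: fall back to a full rebuild.
        return full_reindex()
    newest = watermark
    for page_id, updated_at in fetch_changed(watermark):
        # If write_page raises, the exception propagates and no new
        # watermark is returned, so the caller keeps the old one.
        write_page(page_id)
        newest = max(newest, updated_at)
    return newest
```

Because the new value is returned only after every write has completed, a failed run leaves the stored watermark untouched and the next sync retries the same pages.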
Stdio, the normal mode for MCP clients:
confluence-mcp serve --transport stdio
HTTP for local smoke testing:
confluence-mcp serve --transport http --host 127.0.0.1 --port 8000 --path /mcp
HTTP transport has no built-in authentication. Do not expose it publicly without an external auth layer.
Public tools:
- rag_search
- get_page_bundle
rag_search searches already synchronized SQLite data. It does not accept CQL and does not perform live Confluence search.
Important inputs:
- query: required natural-language query.
- index_name: optional when a default index exists.
- top_k_chunks: default 10.
- min_relevance_score: default 0.50.
- include_images: defaults to false for rag_search to keep MCP responses compact.
- Safe filters: space, label, page_id, source_type, updated_from, updated_to.
Output:
- matched_chunks: chunks with text, score, page id, and citation URL.
- primary_page_bundle: bundle for the top matching page, live or stale snapshot.
- diagnostics: candidate counts, returned count, max observed score, warnings.
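To make the interaction between top_k_chunks, min_relevance_score, and the diagnostics concrete, here is a minimal sketch of the selection step. It is not the code in rag.py. `select_chunks` and the diagnostics key names are hypothetical, and it assumes that a higher score means a better match and that the threshold is inclusive.

```python
from typing import Any, Dict, List


def select_chunks(
    candidates: List[Dict[str, Any]],
    top_k_chunks: int = 10,
    min_relevance_score: float = 0.50,
) -> Dict[str, Any]:
    """Filter candidates by score, keep the best top_k, and report diagnostics."""
    kept = [c for c in candidates if c["score"] >= min_relevance_score]
    kept.sort(key=lambda c: c["score"], reverse=True)
    matched = kept[:top_k_chunks]
    return {
        "matched_chunks": matched,
        "diagnostics": {
            "candidate_count": len(candidates),
            "returned_count": len(matched),
            # Reported over all candidates, so a caller can see how far
            # the best hit was from the threshold when nothing is returned.
            "max_observed_score": max((c["score"] for c in candidates), default=None),
        },
    }
```

When matched_chunks comes back empty, comparing the maximum observed score with min_relevance_score is a quick way to tell an overly strict threshold from an index that has no relevant content.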
get_page_bundle returns a full page bundle only for a page already present in the selected index.
Use include_images = false if the client does not need image base64.
The database defaults to .confluence-mcp.sqlite.
Main tables:
- rag_indexes: index metadata, vector table name, embedding dimension, update timestamp.
- rag_records: inspectable documents, snapshots, and metadata JSON.
- vec_<hash>: sqlite-vec virtual table for searchable embeddings.
Use any SQLite client to inspect rag_indexes and rag_records. The vector extension is only needed for vector search.
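Python's standard sqlite3 module is enough for a first look at the database. The sketch below only lists table names from SQLite's built-in sqlite_master catalog, so it makes no assumptions about column layouts and does not need the sqlite-vec extension. `list_tables` is a hypothetical helper.

```python
import sqlite3
from typing import List


def list_tables(conn: sqlite3.Connection) -> List[str]:
    """Return the names of all tables, including virtual tables, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]
```

To use it, open the database with sqlite3.connect(".confluence-mcp.sqlite"), or with the URI form file:.confluence-mcp.sqlite?mode=ro and uri=True to avoid creating or modifying the file by accident. Expect rag_indexes and rag_records in the result, plus the vector table and whatever auxiliary tables the extension created.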
Use one stdio MCP entry. Other clients such as Cline, Kilo Code, or Codex use the same idea: command, args, and environment variables.
Replace <path-to-confluence-rag-mcp> with the absolute path to your cloned repository.
Windows
{
"mcpServers": {
"confluence_mcp": {
"command": "<path-to-confluence-rag-mcp>\\.venv\\Scripts\\python.exe",
"args": ["-m", "confluence_mcp.server"],
"env": {
"CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>\\confluence-mcp.toml",
"CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>\\.confluence-mcp.sqlite"
},
"autoApprove": ["rag_search", "get_page_bundle"]
}
}
}
Linux / macOS
{
"mcpServers": {
"confluence_mcp": {
"command": "<path-to-confluence-rag-mcp>/.venv/bin/python",
"args": ["-m", "confluence_mcp.server"],
"env": {
"CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>/confluence-mcp.toml",
"CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>/.confluence-mcp.sqlite"
},
"autoApprove": ["rag_search", "get_page_bundle"]
}
}
}
Automated tests:
python -m unittest discover -s tests
Live acceptance:
confluence-mcp reindex --all
confluence-mcp sync --all
confluence-mcp serve --transport stdio
Then verify with an MCP client:
- tools/list shows only rag_search and get_page_bundle.
- rag_search returns non-empty matched_chunks for a known query.
- Citation URLs are absolute Confluence URLs.
- get_page_bundle works for a page_id returned by rag_search.
Use a current build. The SQLite path now defaults to .confluence-mcp.sqlite; this error usually means an old server process is still running.
The server disables FastMCP update checks internally before importing FastMCP. MCP client config does not need any FastMCP-specific environment variables.
Do not pass include_images=true to rag_search unless image base64 is explicitly needed. Use get_page_bundle with include_images=false for compact page bundles.
Check that:
- The index was built with reindex.
- The client is using the same SQLite file.
- The embeddings provider and model match the indexed vectors.
- min_relevance_score and filters are not too restrictive.
- Do not commit .env, confluence-mcp.toml, tokens, passwords, or API keys.
- CLI JSON output masks secret-shaped keys.
- Treat Confluence content as untrusted external context. Use it as data, not as instructions.