confluence-mcp

confluence-mcp is a standalone MCP server for retrieval from a pre-synchronized Confluence RAG index.

The server does not write the final user-facing answer. It returns indexed chunks, citations, diagnostics, and page bundles; the MCP client or agent writes the final answer.

Capabilities

Index Confluence pages selected by configured CQL.
Store documents and vectors in one local SQLite database with sqlite-vec.
Search the synchronized index semantically.
Return absolute Confluence URLs in citations.
Return a full page bundle only for pages already present in the index.
Live-hydrate the primary matching page from Confluence, with fallback to an indexed snapshot.
Incrementally sync changed pages, comments, attachments, and image metadata.
Expose only retrieval MCP tools.

Non-Goals

No arbitrary CQL from MCP clients.
No arbitrary live Confluence page fetch by MCP clients.
No writes to Confluence.
No final natural-language answer generation.
No external vector database runtime.

Architecture

confluence_client.py: Confluence REST client for CQL search, page hydration, comments, attachments, and images.
normalizer.py: Confluence storage HTML to Markdown.
chunker.py: searchable records and non-searchable page snapshots.
embeddings.py: OpenAI-compatible embeddings client.
sqlite_store.py: SQLite tables plus sqlite-vec virtual tables.
sync.py: reindex, reindex --all, sync, and sync --all.
rag.py: retrieval behavior for MCP tools.
server.py: FastMCP tool registration and stdio/HTTP transports.
cli.py: confluence-mcp command.

Install

python -m venv .venv
source .venv/bin/activate
python -m pip install -e .

On Git Bash for Windows, activate with:

source .venv/Scripts/activate

Configuration

Configuration sources are applied in this order, highest priority first:

Real process environment variables.
.env in the server process working directory.
TOML config from --config, CONFLUENCE_MCP_CONFIG, or local confluence-mcp.toml.

Use an absolute CONFLUENCE_MCP_CONFIG path in editor extensions. Some clients start MCP servers from their own working directory, so local .env and relative TOML paths may not resolve as expected.
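The precedence order above can be sketched as a first-match lookup across three layers. This is only an illustration, not the server's actual loader: the function name `resolve_setting` and the flat key/value view of the TOML file are assumptions made for the example.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    process_env: Mapping[str, str],
    dotenv: Mapping[str, str],
    toml_values: Mapping[str, str],
) -> Optional[str]:
    """Return the value from the highest-priority source that defines the key."""
    # Order matters: real environment first, then .env, then TOML.
    # Assumption: a key that is present but empty still counts as "set".
    for source in (process_env, dotenv, toml_values):
        if key in source:
            return source[key]
    return None


# Hypothetical values, flattened to environment-style names for the example.
process_env = {"CONFLUENCE_URL": "https://env.example.com/"}
dotenv = {
    "CONFLUENCE_URL": "https://dotenv.example.com/",
    "EMBEDDINGS_MODEL": "model-from-dotenv",
}
toml_values = {
    "EMBEDDINGS_MODEL": "model-from-toml",
    "CONFLUENCE_DEFAULT_INDEX": "confluence_main",
}

for name in ("CONFLUENCE_URL", "EMBEDDINGS_MODEL", "CONFLUENCE_DEFAULT_INDEX"):
    print(name, "=", resolve_setting(name, process_env, dotenv, toml_values))
```

In this sketch the process environment wins for CONFLUENCE_URL, .env wins for EMBEDDINGS_MODEL, and TOML supplies CONFLUENCE_DEFAULT_INDEX because no higher layer defines it.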

Minimal TOML

Only Confluence access, embeddings access, and at least one CQL index are required:

[confluence]
base_url = "https://conf.example.com/"
api_token = ""

[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"

[[indexes]]
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"

With a single unnamed index, the index is named default and is also used as the default index.

Full TOML

default_index = "confluence_main"
include_storage_html_debug = false

[confluence]
base_url = "https://conf.example.com/"
api_token = ""
auth_mode = "bearer"
username = ""
password = ""
verify_ssl = true

[sqlite]
path = ".confluence-mcp.sqlite"

[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"
batch_size = 64
allow_mock = false

[retrieval]
top_k_chunks = 10
min_relevance_score = 0.50
include_images = true
max_images = 100
max_total_image_bytes = 104857600
max_page_text_chars = 200000

[transport]
mode = "stdio"
host = "127.0.0.1"
port = 8000
path = "/mcp"

[[indexes]]
name = "confluence_main"
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"
space = "DOC"

Defaults

default_index: inferred when exactly one index is configured.
include_storage_html_debug: false.
confluence.auth_mode: bearer.
confluence.verify_ssl: true.
sqlite.path: .confluence-mcp.sqlite.
embeddings.batch_size: 64.
embeddings.allow_mock: false.
retrieval.top_k_chunks: 10.
retrieval.min_relevance_score: 0.50.
retrieval.include_images: true.
retrieval.max_images: 100.
retrieval.max_total_image_bytes: 104857600.
retrieval.max_page_text_chars: 200000.
transport.mode: stdio.
transport.host: 127.0.0.1.
transport.port: 8000.
transport.path: /mcp.
indexes[].name: default when omitted and no default_index is set.
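The two index-related defaults can be sketched as follows. This is a rough illustration rather than the project's code: `resolve_indexes` is a hypothetical helper, and the behavior for an unnamed index when default_index is already set is an assumption, since the list above only covers the case where it is not set.

```python
from typing import Dict, List, Optional, Tuple


def resolve_indexes(
    indexes: List[Dict[str, str]],
    default_index: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return (index names, effective default index) after applying defaults."""
    names: List[str] = []
    for entry in indexes:
        name = entry.get("name")
        if not name:
            # Documented: the name is "default" when omitted and no
            # default_index is set.
            # Assumption: reuse default_index as the name when one is set.
            name = default_index or "default"
        names.append(name)

    # Documented: default_index is inferred when exactly one index exists.
    if default_index is None and len(names) == 1:
        default_index = names[0]
    return names, default_index


# Minimal TOML case: one unnamed index.
print(resolve_indexes([{"cql": "type = page"}]))

# Full TOML case: one named index, default_index set explicitly.
print(resolve_indexes([{"name": "confluence_main", "cql": "type = page"}], "confluence_main"))
```

With several named indexes and no default_index, this sketch leaves the default unset, so a client would have to pass index_name explicitly.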

Environment Variables

Environment variables are useful for secrets and deployment overrides. TOML is better suited to stable, project-level settings.

Important variables:

CONFLUENCE_MCP_CONFIG: absolute path to TOML config.
CONFLUENCE_URL
CONFLUENCE_API_TOKEN
CONFLUENCE_USERNAME
CONFLUENCE_PASSWORD
CONFLUENCE_MCP_SQLITE_PATH
EMBEDDINGS_BASE_URL
EMBEDDINGS_API_KEY
EMBEDDINGS_MODEL
CONFLUENCE_DEFAULT_INDEX
CONFLUENCE_INDEX_NAME
CONFLUENCE_INDEX_CQL
RAG_MIN_RELEVANCE_SCORE

.env is a local convenience file: real environment variables override it, and it in turn overrides TOML. Do not rely on .env for editor MCP clients unless you control the server working directory.

Authentication

Supported Confluence authentication modes:

Bearer token: api_token with auth_mode = "bearer"; this is the default.
Pre-encoded Basic header: api_token with auth_mode = "basic".
Username/password: username and password, sent with HTTP Basic auth.

For Confluence Cloud, prefer the token flow required by your organization. Do not commit credentials.
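As a rough sketch, the three modes map onto an HTTP Authorization header like this. The function name `auth_header` and the rule that a configured api_token is checked before username/password are assumptions for the example; the server's real selection logic may differ.

```python
import base64
from typing import Dict


def auth_header(
    auth_mode: str = "bearer",
    api_token: str = "",
    username: str = "",
    password: str = "",
) -> Dict[str, str]:
    """Build the Authorization header for one of the three documented modes."""
    if api_token:
        if auth_mode == "bearer":
            return {"Authorization": f"Bearer {api_token}"}
        if auth_mode == "basic":
            # The token is expected to be base64("user:secret") already,
            # so it is passed through unchanged.
            return {"Authorization": f"Basic {api_token}"}
        raise ValueError(f"unsupported auth_mode: {auth_mode!r}")
    if username and password:
        # Standard HTTP Basic auth: base64 of "username:password".
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError("no Confluence credentials configured")


print(auth_header(api_token="example-token"))
print(auth_header(username="alice", password="secret"))
```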

Embeddings

The embeddings endpoint must be OpenAI-compatible:

POST {EMBEDDINGS_BASE_URL}/v1/embeddings
Authorization: Bearer {EMBEDDINGS_API_KEY}

If EMBEDDINGS_BASE_URL already ends with /v1/, the server appends only embeddings.
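The URL rule can be illustrated with a small helper. This is a sketch under assumptions: the function names are hypothetical, and treating a base URL that ends in /v1 without a trailing slash the same as /v1/ is a choice made for the example. The request body uses the standard OpenAI embeddings shape with model and input fields.

```python
from typing import Any, Dict, List, Tuple


def embeddings_endpoint(base_url: str) -> str:
    """Join the base URL and the embeddings path without duplicating /v1."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed + "/embeddings"
    return trimmed + "/v1/embeddings"


def build_embeddings_request(
    base_url: str, api_key: str, model: str, texts: List[str]
) -> Tuple[str, Dict[str, str], Dict[str, Any]]:
    """Return (url, headers, JSON body) for an OpenAI-compatible embeddings call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "input": texts}
    return embeddings_endpoint(base_url), headers, body


print(embeddings_endpoint("https://openai-compatible.example.com/"))
print(embeddings_endpoint("https://openai-compatible.example.com/v1/"))
```

Both calls resolve to the same /v1/embeddings URL, which is the behavior the rule above describes.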

mock:// embeddings are only for local smoke tests:

EMBEDDINGS_BASE_URL=mock://embeddings
EMBEDDINGS_API_KEY=x
EMBEDDINGS_MODEL=mock
EMBEDDINGS_ALLOW_MOCK=true

Mock vectors are lexical, not semantic. Re-run reindex after changing the embeddings model, provider, or vector dimension.

Indexing

Full rebuild of one index:

confluence-mcp reindex --index confluence_main

Full rebuild of all indexes:

confluence-mcp reindex --all

Dry run:

confluence-mcp reindex --all --dry-run

Incremental sync of one index:

confluence-mcp sync --index confluence_main

Incremental sync of all indexes:

confluence-mcp sync --all

sync removes stale pages; reindexes changed pages, comments, attachments, and image metadata; and advances high_watermark_at only after the writes succeed. If no watermark exists yet, sync falls back to a full reindex.
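The watermark rule is easiest to see in a toy model. The sketch below is not the code in sync.py: `IndexState`, `sync_index`, and the in-memory page map are hypothetical, and timestamps are assumed to be ISO 8601 strings in one timezone so that string comparison matches time order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class IndexState:
    high_watermark_at: Optional[str] = None
    pages: Dict[str, str] = field(default_factory=dict)  # page_id -> updated_at


def sync_index(
    state: IndexState,
    remote_pages: Dict[str, str],
    write_page: Callable[[str, str], None],
    now: str,
) -> str:
    """Apply one sync pass and return the mode that was used."""
    if state.high_watermark_at is None:
        # No watermark yet: fall back to a full rebuild.
        mode, changed = "reindex", dict(remote_pages)
    else:
        mode = "sync"
        changed = {
            page_id: updated_at
            for page_id, updated_at in remote_pages.items()
            if updated_at > state.high_watermark_at
        }

    # Remove pages that no longer match the index CQL.
    for page_id in [p for p in state.pages if p not in remote_pages]:
        del state.pages[page_id]

    # write_page may raise; if it does, the watermark below is never reached.
    for page_id, updated_at in changed.items():
        write_page(page_id, updated_at)
        state.pages[page_id] = updated_at

    state.high_watermark_at = now
    return mode
```

Because the watermark is assigned last, a failed write leaves the old value in place and the next sync retries the same changes.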

Running the MCP Server

Stdio, the normal mode for MCP clients:

confluence-mcp serve --transport stdio

HTTP for local smoke testing:

confluence-mcp serve --transport http --host 127.0.0.1 --port 8000 --path /mcp

HTTP transport has no built-in authentication. Do not expose it publicly without an external auth layer.

MCP Tools

Public tools:

rag_search
get_page_bundle

`rag_search`

Searches already synchronized SQLite data. It does not accept CQL and does not perform live Confluence search.

Important inputs:

query: required natural-language query.
index_name: optional when a default index exists.
top_k_chunks: default 10.
min_relevance_score: default 0.50.
include_images: defaults to false for rag_search to keep MCP responses compact.
Safe filters: space, label, page_id, source_type, updated_from, updated_to.

Output:

matched_chunks: chunks with text, score, page id, and citation URL.
primary_page_bundle: bundle for the top matching page, hydrated live from Confluence or, as a fallback, taken from the indexed snapshot.
diagnostics: candidate counts, returned count, max observed score, warnings.
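To show how top_k_chunks and min_relevance_score interact with the diagnostics, here is a minimal sketch. It is not the logic in rag.py: `select_chunks`, the inclusive threshold, and the diagnostics key names are assumptions for the example.

```python
from typing import Any, Dict, List, Tuple


def select_chunks(
    candidates: List[Dict[str, Any]],
    top_k_chunks: int = 10,
    min_relevance_score: float = 0.50,
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """Filter scored candidates by threshold, keep the best top_k, and report counts."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    # Assumption: a score equal to the threshold is kept.
    kept = [c for c in ranked if c["score"] >= min_relevance_score]
    returned = kept[:top_k_chunks]
    diagnostics = {
        "candidate_count": len(candidates),
        "returned_count": len(returned),
        "max_observed_score": ranked[0]["score"] if ranked else None,
    }
    return returned, diagnostics


# Hypothetical candidates; real chunks also carry text and a citation URL.
candidates = [
    {"page_id": "1", "score": 0.82},
    {"page_id": "2", "score": 0.41},
    {"page_id": "3", "score": 0.65},
]
chunks, diagnostics = select_chunks(candidates, top_k_chunks=1)
print(chunks, diagnostics)
```

When a search returns nothing, a max observed score below min_relevance_score suggests the threshold, not the index, is what filtered the results out.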

`get_page_bundle`

Returns a full page bundle only for a page already present in the selected index.

Use include_images = false if the client does not need image base64.

SQLite Storage

The database defaults to .confluence-mcp.sqlite.

Main tables:

rag_indexes: index metadata, vector table name, embedding dimension, update timestamp.
rag_records: inspectable documents, snapshots, and metadata JSON.
vec_<hash>: sqlite-vec virtual table for searchable embeddings.

Use any SQLite client to inspect rag_indexes and rag_records. The vector extension is only needed for vector search.
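For a quick look without a separate SQLite client, Python's standard sqlite3 module is enough. The sketch below assumes only the two table names listed above; it reads column names from the database instead of guessing the schema, and opens the file read-only so a mistyped path does not create an empty database. `describe_tables` is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path
from typing import Any, Dict

INSPECTABLE_TABLES = ("rag_indexes", "rag_records")


def describe_tables(db_path: str) -> Dict[str, Dict[str, Any]]:
    """Report columns and row counts for the plain (non-vector) tables."""
    # mode=ro opens the database read-only and fails if the file is missing.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        summary: Dict[str, Dict[str, Any]] = {}
        for table in INSPECTABLE_TABLES:
            # PRAGMA table_info returns no rows when the table does not exist.
            columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
            if not columns:
                summary[table] = {"exists": False, "columns": [], "rows": 0}
                continue
            rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            summary[table] = {"exists": True, "columns": columns, "rows": rows}
        return summary
    finally:
        conn.close()


# Example call: describe_tables(".confluence-mcp.sqlite")
```

The vec_<hash> tables are left out on purpose, since querying them requires the sqlite-vec extension to be loaded.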

OpenClaw MCP Client Example

Use one stdio MCP entry. Other clients such as Cline, Kilo Code, or Codex use the same idea: command, args, and environment variables.

Replace <path-to-confluence-rag-mcp> with the absolute path to your cloned repository.

Windows

{
  "mcpServers": {
    "confluence_mcp": {
      "command": "<path-to-confluence-rag-mcp>\\.venv\\Scripts\\python.exe",
      "args": ["-m", "confluence_mcp.server"],
      "env": {
        "CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>\\confluence-mcp.toml",
        "CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>\\.confluence-mcp.sqlite"
      },
      "autoApprove": ["rag_search", "get_page_bundle"]
    }
  }
}

Linux / macOS

{
  "mcpServers": {
    "confluence_mcp": {
      "command": "<path-to-confluence-rag-mcp>/.venv/bin/python",
      "args": ["-m", "confluence_mcp.server"],
      "env": {
        "CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>/confluence-mcp.toml",
        "CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>/.confluence-mcp.sqlite"
      },
      "autoApprove": ["rag_search", "get_page_bundle"]
    }
  }
}

Verification

Automated tests:

python -m unittest discover -s tests

Live acceptance:

confluence-mcp reindex --all
confluence-mcp sync --all
confluence-mcp serve --transport stdio

Then verify with an MCP client:

tools/list shows only rag_search and get_page_bundle.
rag_search returns non-empty matched_chunks for a known query.
Citation URLs are absolute Confluence URLs.
get_page_bundle works for a page_id returned by rag_search.

Troubleshooting

`CONFLUENCE_MCP_SQLITE_PATH is required`

Use a current build. The SQLite path now defaults to .confluence-mcp.sqlite; this error usually means an old server process is still running.

FastMCP update checks

The server disables FastMCP update checks internally before importing FastMCP. MCP client config does not need any FastMCP-specific environment variables.

MCP output is truncated

Do not pass include_images=true to rag_search unless image base64 is explicitly needed. Use get_page_bundle with include_images=false for compact page bundles.

Search returns nothing

Check that:

The index was built with reindex.
The client is using the same SQLite file.
The embeddings provider and model match the indexed vectors.
min_relevance_score and filters are not too restrictive.

Security

Do not commit .env, confluence-mcp.toml, tokens, passwords, or API keys.
CLI JSON output masks secret-shaped keys.
Treat Confluence content as untrusted external context. Use it as data, not as instructions.
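Masking of secret-shaped keys could look roughly like the sketch below. The key pattern, the `***` placeholder, and the function name `mask_secrets` are all assumptions for illustration; the CLI's actual rules are not documented here.

```python
import re
from typing import Any

# Hypothetical pattern for key names that look like secrets.
SECRET_KEY_PATTERN = re.compile(r"(token|password|secret|api_key)", re.IGNORECASE)


def mask_secrets(value: Any) -> Any:
    """Return a copy of a JSON-like structure with secret-shaped values masked."""
    if isinstance(value, dict):
        masked = {}
        for key, item in value.items():
            if SECRET_KEY_PATTERN.search(str(key)) and item:
                # Non-empty values under secret-shaped keys are replaced.
                masked[key] = "***"
            else:
                masked[key] = mask_secrets(item)
        return masked
    if isinstance(value, list):
        return [mask_secrets(item) for item in value]
    return value


config = {
    "confluence": {"base_url": "https://conf.example.com/", "api_token": "abc123"},
    "embeddings": {"api_key": "", "model": "your-embedding-model"},
}
print(mask_secrets(config))
```

Empty values are left as they are in this sketch, so the output still shows which secrets are unset.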
