A lot of "Chunk exceeds max size" in logs #314

@Barafu

I get a lot of lines like this in the log:

[docs-mcp-server] | 📚 Adding processed content: Chain in anyhow - Rust
[docs-mcp-server] | ✂  Storing 27 pre-split chunks
[docs-mcp-server] | ⚠  Chunk 4/27 exceeds max size: 5142 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 5/27 exceeds max size: 5237 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 6/27 exceeds max size: 5169 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 7/27 exceeds max size: 5174 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 10/27 exceeds max size: 5066 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 14/27 exceeds max size: 5328 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | ⚠  Chunk 26/27 exceeds max size: 5155 > 5000 chars (URL: https://docs.rs/anyhow/latest/anyhow/struct.Chain.html)
[docs-mcp-server] | 🌐 Scraping page 11/21 (depth 2/3): https://docs.rs/anyhow/latest/anyhow/index.html?search=u32+-%3E+bool
[docs-mcp-server] | 📚 Adding processed content: "u32 -> bool" Search - Rust
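
To get a sense of how widespread the warnings are, they can be tallied from a saved copy of the log. This is a minimal sketch that assumes the warning format is exactly as it appears in the lines above; the names `OversizedChunk`, `parse_oversized`, and `summarize` are hypothetical helpers and not part of docs-mcp-server.

```python
import re
from typing import Iterable, NamedTuple, Optional


class OversizedChunk(NamedTuple):
    index: int   # position of the chunk within the page (the "4" in "4/27")
    total: int   # number of chunks for the page (the "27" in "4/27")
    size: int    # reported chunk length in characters
    limit: int   # reported maximum length in characters
    url: str     # page the chunk came from


# Assumption: this mirrors the warning line format seen in the log excerpt.
_WARNING = re.compile(
    r"Chunk (\d+)/(\d+) exceeds max size: (\d+) > (\d+) chars \(URL: (.+)\)\s*$"
)


def parse_oversized(lines: Iterable[str]) -> list[OversizedChunk]:
    """Return one record per 'exceeds max size' warning found in the lines."""
    found = []
    for line in lines:
        match = _WARNING.search(line)
        if match:
            index, total, size, limit, url = match.groups()
            found.append(
                OversizedChunk(int(index), int(total), int(size), int(limit), url)
            )
    return found


def summarize(chunks: list[OversizedChunk]) -> Optional[dict]:
    """Count the warnings and report the largest overshoot, or None if empty."""
    if not chunks:
        return None
    worst = max(chunks, key=lambda c: c.size - c.limit)
    return {
        "count": len(chunks),
        "max_overshoot": worst.size - worst.limit,
        "worst_index": worst.index,
    }
```

Calling `summarize(parse_oversized(open("server.log", encoding="utf-8")))` on a saved log (the file name is only an example) gives the number of warnings and the largest overshoot in characters.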

Is this something a user should care about?

I was adding this URL with default settings: https://docs.rs/anyhow/latest/anyhow/index.html
I use ollama for embedding, with the embeddinggemma:300m model.
The job reports "Job completed" at the end.

I also get a lot of entries like this in the ollama log: time=2026-01-27T20:01:50.541Z level=INFO source=server.go:1748 msg="llm embedding error: the input length exceeds the context length"

I have read this
I am using the Docker image with the latest tag, so this is either a different problem or the fix has not reached the Docker image yet.
