@clydin (Member) commented on Oct 20, 2025

The `search_documentation` MCP tool previously used a regular expression and string searching to extract and clean documentation content from fetched HTML. This approach was not robust and could produce incorrect results. It also buffered the entire HTML response in memory before processing.

This commit refactors the implementation to use `parse5-html-rewriting-stream`, which is already a dependency in the workspace. The new implementation streams the `fetch` response directly into a single-pass parser that simultaneously extracts the `<main>` element's content and strips all HTML tags.

This change makes the parsing more reliable, efficient, and memory-friendly.
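
To illustrate the single-pass idea described above, here is a minimal, dependency-free sketch. It is not the code from this PR and it does not use `parse5-html-rewriting-stream`: a deliberately simplified character-level state machine stands in for the real parser so the example can run on its own. The names `MainTextExtractor` and `extractMainText` are hypothetical, and the sketch ignores cases a real HTML parser handles (comments, `<script>` content, `>` inside attribute values, character entities).

```typescript
// Hypothetical illustration only: a simplified stand-in for a streaming HTML
// parser. It consumes chunks one at a time, keeps text found inside <main>,
// and drops every tag, without ever holding the full document in memory.
class MainTextExtractor {
  private inTag = false; // true while between '<' and '>'
  private tagBuffer = ''; // raw tag contents collected so far
  private mainDepth = 0; // greater than 0 while inside a <main> element
  private text = ''; // extracted text content

  // Process one chunk; parser state carries over between calls, so a tag
  // split across two chunks is still recognized.
  write(chunk: string): void {
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      if (this.inTag) {
        if (ch === '>') {
          this.inTag = false;
          this.handleTag(this.tagBuffer);
          this.tagBuffer = '';
        } else {
          this.tagBuffer += ch;
        }
      } else if (ch === '<') {
        this.inTag = true;
      } else if (this.mainDepth > 0) {
        // Text outside <main> is discarded as it streams past.
        this.text += ch;
      }
    }
  }

  // Track entry into and exit from <main>; all other tags are simply dropped.
  private handleTag(raw: string): void {
    const match = /^\s*(\/?)\s*([a-zA-Z][a-zA-Z0-9-]*)/.exec(raw);
    const name = (match?.[2] ?? '').toLowerCase();
    if (name !== 'main') {
      return;
    }
    if (match?.[1] === '/') {
      this.mainDepth = Math.max(0, this.mainDepth - 1);
    } else {
      this.mainDepth++;
    }
  }

  // Collapse whitespace runs so the result reads as plain text.
  result(): string {
    return this.text.replace(/\s+/g, ' ').trim();
  }
}

// Convenience wrapper for chunks that are already available as strings.
function extractMainText(chunks: string[]): string {
  const extractor = new MainTextExtractor();
  for (const chunk of chunks) {
    extractor.write(chunk);
  }
  return extractor.result();
}

// Usage: note that the <main> start tag is split across two chunks.
const sample = [
  '<html><body><nav>Menu</nav><ma',
  'in>\n<h1>Title</h1>\n<p>Hello <b>wor',
  'ld</b></p>\n</main><footer>Footer</footer></body></html>',
];
console.log(extractMainText(sample)); // prints "Title Hello world"
```

Because extraction and tag stripping happen in the same pass over each chunk, memory use is bounded by the extracted text rather than by the size of the fetched page. A real parser is still preferable for production use, since this sketch shares some of the fragility the PR set out to remove.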

…tion tool

@clydin added the `target: major` label (This PR is targeted for the next major release) on Oct 20, 2025
@clydin added the `action: review` label (The PR is still awaiting reviews from at least one requested reviewer) on Oct 20, 2025
@clydin requested a review from alan-agius4 on October 21, 2025 00:00
@alan-agius4 added the `action: merge` label (The PR is ready for merge by the caretaker) and removed the `action: review` label (The PR is still awaiting reviews from at least one requested reviewer) on Oct 21, 2025
@clydin merged commit 434daef into angular:main on Oct 21, 2025
35 checks passed
@clydin deleted the mcp/doc-search-main-extraction branch on October 21, 2025 10:49
Labels

`action: merge` (The PR is ready for merge by the caretaker), `area: @angular/cli`, `target: major` (This PR is targeted for the next major release)

2 participants