
Feat: Implement SVG-to-Mermaid conversion and image auto-download (#2012) #2048

Open
Andrii-K-17 wants to merge 2 commits into microsoft:main from Andrii-K-17:main

Conversation

@Andrii-K-17

Description

This PR fully addresses #2012 by implementing both the SVG-to-Mermaid conversion layer and the automated image downloading/renaming pipeline.

  • SVG to Mermaid Converter: Added SvgConverter(DocumentConverter) to handle standalone .svg files. It utilizes an LLM client to parse and rewrite diagrams into Mermaid code blocks within the generated Markdown.
  • Inline SVG Handling: Implemented the convert_svg method to detect, extract, and convert inline <svg> elements embedded in HTML pages, so that diagrams survive web-to-Markdown conversion.
  • Fallback Mechanism: Included a fallback that catches exceptions and preserves the original SVG source as a fenced xml code block if LLM extraction fails, is unconfigured, or returns SKIP.
  • Image Auto-download: Added automated downloading for referenced images (jpg, png, webp, etc.). Images are saved alongside the Markdown file using sequential naming (e.g., figure-001.png) and are correctly re-referenced in the final document.
  • Testing: Created test_svg_converter.py covering extension acceptance, happy paths, stream state restoration, and all fallback scenarios for standalone SVGs. Extended test_module_misc.py with inline HTML extraction and fallback tests using mocked LLM responses.
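
For reviewers, the sequential naming described under Image Auto-download can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `plan_image_names`, `IMAGE_REF`, and `KNOWN_EXTS` are hypothetical names, the extension set is an assumption, and the sketch only plans filenames and rewrites references without performing any download.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical helper for illustration only; not the PR's actual code.
# Matches Markdown image references that point at http(s) URLs.
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")
KNOWN_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}  # assumed set


def plan_image_names(markdown: str) -> tuple[str, dict[str, str]]:
    """Return the rewritten Markdown plus a {url: local_name} download plan."""
    plan: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        if url not in plan:
            # Take the extension from the URL path, ignoring any query string.
            ext = PurePosixPath(urlparse(url).path).suffix.lower()
            if ext not in KNOWN_EXTS:
                return match.group(0)  # leave unrecognized types untouched
            plan[url] = f"figure-{len(plan) + 1:03d}{ext}"
        return f"![{alt}]({plan[url]})"

    return IMAGE_REF.sub(replace, markdown), plan
```

A repeated URL reuses the name it was first assigned, so each image would only need to be fetched once; the returned plan maps every remote URL to the local filename that a separate download step would write next to the Markdown file.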

Related Issue

Closes #2012

Notes

This is my very first pull request! I wanted to tackle the entire issue, so the changes are split into two logical commits.

…TML (part of microsoft#2012)

- Add SvgConverter(DocumentConverter) for standalone .svg files using LLM client;
  falls back to fenced XML block on missing LLM, errors, or SKIP response
- Implement convert_svg() in _CustomMarkdownify to detect and convert inline <svg> elements in HTML into Mermaid code blocks
- Add _llm_svg() helper to send SVG to OpenAI-compatible client and return parsed response
- Add test_svg_converter.py covering accepts(), Mermaid conversion, and all fallback cases
- Add inline SVG tests in test_module_misc.py for Mermaid conversion and fallback scenarios
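
The fallback rules listed above (missing LLM, errors, or a SKIP response) can be sketched roughly like this. It is an illustration under stated assumptions rather than the code in this commit: `svg_to_markdown` and the `ask_llm` callable are hypothetical stand-ins for the converter and the `_llm_svg()` helper, and the real signatures may differ.

```python
from typing import Callable, Optional

FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def svg_to_markdown(
    svg_source: str,
    ask_llm: Optional[Callable[[str], str]] = None,  # hypothetical LLM hook
) -> str:
    """Return a Mermaid block on success, otherwise the SVG as fenced XML."""
    fallback = f"{FENCE}xml\n{svg_source.strip()}\n{FENCE}"

    if ask_llm is None:  # LLM not configured
        return fallback
    try:
        answer = ask_llm(svg_source).strip()
    except Exception:  # any client or parsing error preserves the source
        return fallback
    if not answer or answer.upper() == "SKIP":  # model declined to convert
        return fallback
    return f"{FENCE}mermaid\n{answer}\n{FENCE}"
```

Keeping every failure path on the same `fallback` value means the original SVG source is never lost, whichever of the three conditions triggers it.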
@Andrii-K-17
Author

@microsoft-github-policy-service agree



Development

Successfully merging this pull request may close these issues.

Feature: SVG to Mermaid & Auto Image Download for Faithful Markdown Conversion
