Summary
Please add two options to improve web → Markdown fidelity, especially for teaching-material datasets and personal knowledge management (PKM) use cases:
- SVG/Flowchart to Mermaid support
  - Parse SVG diagrams and flowcharts and rewrite them as Mermaid code blocks in the generated Markdown (when possible)
  - Tag the code block with the "mermaid" language identifier
  - If the layout or logic can't be extracted, optionally keep the SVG source as a fallback
- Image auto-download and renaming
  - Download referenced images (jpg, png, webp, etc.)
  - Place downloaded images alongside the Markdown file
  - Name image files by their position/index in the document (e.g., figure-001.png), not by their original hashed or alphanumeric filenames
  - In the Markdown, reference that local filename
  - This helps manual inspection, review, and dataset management
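
The renaming step could be sketched roughly as follows. This is only an illustration of the requested behavior, not markitdown's API: the function names `renumber_images` and `download_all`, the `figure` prefix, and the `.bin` extension used when a URL has no extension are all assumptions made for this example.

```python
import os
import posixpath
import re
import urllib.request
from urllib.parse import urlparse

# Matches Markdown images: ![alt](url "optional title")
IMAGE_RE = re.compile(r'!\[([^\]]*)\]\((\S+?)(\s+"[^"]*")?\)')


def renumber_images(markdown, prefix="figure", default_ext=".bin"):
    """Rewrite image references to sequential local names.

    Returns the rewritten Markdown and a {url: local_filename} mapping.
    Hypothetical helper for illustration; not part of markitdown.
    """
    mapping = {}

    def replace(match):
        alt, url, title = match.group(1), match.group(2), match.group(3) or ""
        if url not in mapping:
            # Take the extension from the URL path, ignoring any query string.
            ext = posixpath.splitext(urlparse(url).path)[1].lower() or default_ext
            mapping[url] = f"{prefix}-{len(mapping) + 1:03d}{ext}"
        return f"![{alt}]({mapping[url]}{title})"

    return IMAGE_RE.sub(replace, markdown), mapping


def download_all(mapping, out_dir):
    """Fetch every mapped URL into out_dir, next to the Markdown file."""
    os.makedirs(out_dir, exist_ok=True)
    for url, name in mapping.items():
        urllib.request.urlretrieve(url, os.path.join(out_dir, name))
```

An image that is referenced more than once keeps a single local filename, so the numbering follows first appearance in the document.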
Rationale
- Manual PDF conversion is currently needed only because SVGs/diagrams lose their meaning during conversion, and images referenced by URL are less accessible than local files
- With both options supported, nearly all content (including image-rich or visually structured sources) becomes token-efficient, reproducible, and dataset-friendly
- Using Mermaid allows the output Markdown to serve as a knowledge database or structured source for LLMs, with zero vision tokens required for flows/tables
- Consistent local image naming makes it easy for users to review, curate, or process Markdown+Images sets as datasets
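
For the Mermaid option, the output side with its SVG fallback might look like the sketch below. The SVG-to-Mermaid extraction itself is out of scope here; `render_diagram` and its `keep_svg_fallback` flag are hypothetical names, and emitting the raw SVG markup inline is just one possible fallback form.

```python
# Built at runtime so this example nests safely inside a Markdown document.
FENCE = "`" * 3


def render_diagram(mermaid, svg, keep_svg_fallback=True):
    """Return the Markdown for one diagram.

    mermaid: Mermaid source if extraction succeeded, otherwise None.
    svg:     the original SVG markup.
    Hypothetical helper for illustration; not part of markitdown.
    """
    if mermaid:
        # Fenced block tagged "mermaid" so renderers and LLMs can read it as text.
        return f"{FENCE}mermaid\n{mermaid.strip()}\n{FENCE}"
    if keep_svg_fallback:
        # Assumption: fall back to the raw SVG, which Markdown allows as inline HTML.
        return svg.strip()
    return ""
```

Keeping the fallback optional lets dataset builders drop unparseable diagrams entirely when they only want text-readable output.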
This would be a major upgrade for education, PKM, and open data efforts using markitdown.
Thank you!