Export text as text in page.getSVGimage() #580

@cipri-tom

Is your feature request related to a problem? Please describe.
Sometimes I want to get the SVG of a page, as it lets me analyse the contents more easily than the PDF does. I currently have a way of doing this, but I noticed that every glyph of a font is emitted as a <path>, which is then referenced multiple times. This makes it difficult to distinguish paths that correspond to text from those that do not.
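
To illustrate the ambiguity, here is a minimal sketch of the kind of post-processing this forces on the caller. The sample SVG is hand-written and only assumed to resemble the structure described above (a glyph outline defined once, then referenced several times); it is not actual getSVGimage() output. The names `SAMPLE_SVG`, `classify_paths` and the id `glyph_a` are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hand-written sample, NOT actual getSVGimage() output: one glyph outline
# defined once and referenced twice, plus one unrelated vector path.
SAMPLE_SVG = f"""<svg xmlns="{SVG_NS}" xmlns:xlink="{XLINK_NS}">
  <defs>
    <symbol id="glyph_a"><path d="M0 0L1 1"/></symbol>
  </defs>
  <use xlink:href="#glyph_a" x="10" y="20"/>
  <use xlink:href="#glyph_a" x="18" y="20"/>
  <path d="M0 0H100V100Z"/>
</svg>"""


def classify_paths(svg_text):
    """Return (glyph_refs, other_paths) for an SVG document string."""
    root = ET.fromstring(svg_text)
    symbols = list(root.iter(f"{{{SVG_NS}}}symbol"))

    # Ids of definitions that wrap a <path>; assumed here to be glyph outlines.
    glyph_ids = {
        sym.get("id")
        for sym in symbols
        if sym.find(f"{{{SVG_NS}}}path") is not None
    }

    # <use> elements that point at one of those definitions.
    glyph_refs = [
        use
        for use in root.iter(f"{{{SVG_NS}}}use")
        if use.get(f"{{{XLINK_NS}}}href", "").lstrip("#") in glyph_ids
    ]

    # Paths drawn directly on the page, i.e. not nested inside a <symbol>.
    in_symbols = {
        id(p) for sym in symbols for p in sym.iter(f"{{{SVG_NS}}}path")
    }
    other_paths = [
        p for p in root.iter(f"{{{SVG_NS}}}path") if id(p) not in in_symbols
    ]
    return glyph_refs, other_paths


glyph_refs, other_paths = classify_paths(SAMPLE_SVG)
print(len(glyph_refs), "glyph references,", len(other_paths), "other paths")
```

Even in this simplified case the classification rests on a guess (that every referenced definition is a glyph), which is why having the text emitted as text would be more reliable.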

Describe the solution you'd like
I noticed that it is possible to export to SVG while keeping the text as text by passing a flag to fz_new_svg_device(). That flag is currently hardcoded:

PyMuPDF/fitz/fitz.i

Lines 3198 to 3201 in 12d0201

dev = fz_new_svg_device(gctx, out,
                        tbounds.x1-tbounds.x0, // width
                        tbounds.y1-tbounds.y0, // height
                        FZ_SVG_TEXT_AS_PATH, 1);

Describe alternatives you've considered
Compile PyMuPDF from source and make the modification on my side. However, I'm not sure which file to modify in PyMuPDF, as the source seems to be in a huge fitz.i file, which looks like the output of a compiler?
