Export text as text in page.getSVGimage() #580
Comments
Thanks for bringing this up! If you want to generate a modified PyMuPDF source yourself, you need SWIG. Invoke it like so:
It will output
Thanks for bringing up SWIG. I'll look into it!
... and of course an installed MuPDF!
Don't know yet. Should be easy to check that out.
Change already done - easy peasy. Indeed: using the text-as-text option creates a much smaller file, but correct display in a browser depends on the presence of the fonts named in the SVG.
Text: file size 72 KB.
Path: file size 652 KB.
Conclusion: The text-as-path option represents the current situation. Changing this would require people to change their scripts if they need the previous behaviour.
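To make the text-versus-path difference measurable, here is a minimal sketch that reports the byte size of an SVG string and counts its text, path and use elements. It uses only the Python standard library; the function name summarize_svg and the two tiny sample documents are hypothetical and are not actual PyMuPDF output.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_svg(svg_source: str) -> dict:
    """Return the byte size and the number of <text>, <path> and <use> elements."""
    data = svg_source.encode("utf-8")
    root = ET.fromstring(data)
    # Tags look like "{namespace}name"; keep only the local name.
    counts = Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())
    return {
        "bytes": len(data),
        "text": counts["text"],
        "path": counts["path"],
        "use": counts["use"],
    }


# Hypothetical samples: one keeps the text as text, the other draws
# a glyph once as a path and references it twice.
TEXT_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text font-family="Courier">Hi</text></svg>'
)
PATH_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<defs><symbol id="g0"><path d="M0 0L1 1"/></symbol></defs>'
    '<use xlink:href="#g0"/><use xlink:href="#g0"/></svg>'
)

if __name__ == "__main__":
    print(summarize_svg(TEXT_SVG))
    print(summarize_svg(PATH_SVG))
```

Presumably the same function could be fed the string returned by page.getSVGimage() to compare both variants of a real page.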
Oh, wow! That "fixed width" font is anything but "fixed width" 😂.
Indeed, I didn't realise. A quick search on GitHub shows that it isn't much used, but there are still a couple of cases, so it's not a good idea to introduce such a breaking change. It's OK to default to True then. Thanks a lot for getting it done so quickly!
By installed MuPDF, do you mean from source? Or is there a I'd really like to experiment with this new text without waiting for the new release :">
That's because the browser doesn't understand that the word "Courier" within the font's name should lead it to choose a fixed-width font. If you take the SVG source and replace that font name with just "Courier", everything immediately looks better. That is what I have been referring to.
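The manual fix described above could be scripted roughly like this. It is only a sketch and rests on the assumption that the font name appears in a font-family="..." attribute of the SVG source; the helper name simplify_font_names and the sample font name are hypothetical.

```python
import re

# Assumption: the font name is stored in a double-quoted font-family attribute.
FONT_FAMILY = re.compile(r'font-family="([^"]*)"')


def simplify_font_names(svg_source: str, generic: str = "Courier") -> str:
    """Replace any font name containing `generic` with just `generic`."""

    def fix(match: re.Match) -> str:
        name = match.group(1)
        if generic.lower() in name.lower():
            return f'font-family="{generic}"'
        # Leave unrelated fonts untouched.
        return match.group(0)

    return FONT_FAMILY.sub(fix, svg_source)


if __name__ == "__main__":
    sample = '<text font-family="ABCDEF+CourierNewPSMT">x</text>'
    print(simplify_font_names(sample))
```

Whether the browser then picks a fixed-width font still depends on a font called "Courier" being available on the viewing machine.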
Don't know if there is such a thing. It must be v1.17.0 at any rate. But go ahead and try installing by the recipe given in the repo folder "installation". It is an easy thing to do.
Ah, that would save me a bit of trouble. Thanks a lot! It's
good luck - hope I haven't introduced some f**up
Works beautifully so far 😁. Thanks a lot!
Is your feature request related to a problem? Please describe.
Sometimes I want to get the SVG of a page, as it allows me to analyse the contents a bit better than the PDF. I currently have a method of doing it, but I noticed that all glyphs of a font are done as a <path>, which is then referenced multiple times. This makes it a bit difficult to disambiguate between paths that correspond to text and those that do not.
Describe the solution you'd like
I noticed that there is a possibility to export to SVG while keeping the text as text by passing a flag to fz_new_svg_device(), which is currently hardcoded:
PyMuPDF/fitz/fitz.i
Lines 3198 to 3201 in 12d0201
Describe alternatives you've considered
Compile PyMuPDF from source and do the modification on my side. But I'm not sure which file to modify in PyMuPDF, as the source seems to be in a huge fitz.i file, which seems to be the output of the compiler?