Provide HTML5 writer option to embed SVG instead of converting it. #8948

Closed
nxn-4-wdf opened this issue Jul 10, 2023 · 12 comments

Comments

@nxn-4-wdf

Describe your proposed improvement and the problem it solves.

I convert from Markdown to HTML5 with options --embed-resources --standalone.
When the document contains SVG images, they are converted by librsvg, and then embedded with a data URI.

Problems: this causes a loss of resolution and a big increase in the output size. Moreover, librsvg may not be available on all systems.

Because the HTML 5 syntax accepts inline SVG, Pandoc could simply include these files with an additional option like --inline-svg.

A few changes in the included SVG are necessary. But I think that it does not need a full-featured SVG parser.
For example, a typical SVG file contains:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="781px" height="401px" viewBox="-0.5 -0.5 781 401">
...
</svg>

The <?xml> and <!DOCTYPE> must be dropped.
In the <svg> tag, you can safely remove both xmlns attributes and the version attribute too.
The resulting HTML code would contain:

<svg width="781px" height="401px" viewBox="-0.5 -0.5 781 401">
...
</svg>
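
A minimal sketch of that clean-up, written in Python rather than pandoc's Haskell and assuming regular expressions are good enough for the root tag. The function name `prepare_inline_svg` is hypothetical, and the sketch does not handle a DOCTYPE with an internal subset or comments placed before the root element.

```python
import re


def prepare_inline_svg(svg_text: str) -> str:
    """Strip the parts of a standalone SVG file that are redundant inside HTML5."""
    # Drop the XML declaration and the DOCTYPE.
    text = re.sub(r"<\?xml[^>]*\?>", "", svg_text)
    text = re.sub(r"<!DOCTYPE[^>]*>", "", text, flags=re.IGNORECASE)

    def clean_root(match: re.Match) -> str:
        # Remove xmlns, xmlns:prefix and version attributes from the root tag only.
        return re.sub(
            r"\s+(?:xmlns(?::[\w.-]+)?|version)\s*=\s*(?:\"[^\"]*\"|'[^']*')",
            "",
            match.group(0),
        )

    # count=1 restricts the attribute clean-up to the first <svg ...> tag.
    text = re.sub(r"<svg\b[^>]*>", clean_root, text, count=1)
    return text.strip()
```

Nested elements are left untouched, so only the root tag loses its namespace declarations.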

Some SVG files do not include width and height attributes. The default values from the HTML stylesheet (300px and 150px) may not be appropriate, therefore I propose the following additional logic:

  1. When <svg> contains only the viewBox attribute, compute and add the width and height from these values.
    For example, viewBox="1 2 300 400" would mean additional attributes: width="300" height="400" (the px unit is not necessary).
  2. If viewBox is missing but width and height are present, similarly create a viewBox.
    For example, width="100px" height="200px" would create a viewBox="0 0 100 200"
  3. If both are missing, then do nothing.
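
The three rules above can be sketched as follows, assuming the root attributes are already available as a plain dictionary. The names `fill_in_size` and `_plain_length` are hypothetical. The sketch relies on the standard viewBox order (min-x, min-y, width, height), so the size comes from the third and fourth values, and it only derives a viewBox from unitless or px lengths.

```python
import re


def _plain_length(value: str):
    """Return the numeric part of a unitless or px length, else None."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(?:px)?\s*", value)
    return match.group(1) if match else None


def fill_in_size(attrs: dict) -> dict:
    """Return a copy of the root <svg> attributes with missing size info derived."""
    out = dict(attrs)
    has_view_box = "viewBox" in out
    has_both = "width" in out and "height" in out
    has_neither = "width" not in out and "height" not in out

    if has_view_box and has_neither:
        # Rule 1: viewBox is "min-x min-y width height", space or comma separated.
        parts = re.split(r"[\s,]+", out["viewBox"].strip())
        if len(parts) == 4:
            out["width"], out["height"] = parts[2], parts[3]
    elif not has_view_box and has_both:
        # Rule 2: build a viewBox anchored at the origin.
        width = _plain_length(out["width"])
        height = _plain_length(out["height"])
        if width is not None and height is not None:
            out["viewBox"] = f"0 0 {width} {height}"
    # Rule 3: nothing usable is present, so leave the attributes alone.
    return out
```

Lengths such as `100%` or `10em` are deliberately left alone, because they cannot be turned into user units without more context.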

Other improvements are possible, but they require more logic:

  1. Detect when an SVG file is used multiple times: in that case, include it only once and generate an id attribute for it. Each later reference would then be just a <svg width=...><use href="#the-previously-included-svg-id"/></svg>
  2. To include SVG files with fragment identifiers (pointing to a <view>, to an element styled with :target, or to a <symbol>):
    The idea would be to create a <section style="display: none;"> in the HTML and include those external <svg> elements inside it.
    Then reference them with <svg width=...><use href="... as previously. The width, height and viewBox would be computed from the target element's attributes if possible, else from its parent SVG element.

Describe alternatives you've considered.

Maybe an additional SVG library could do the job, but it would add an external dependency, like with librsvg.
Unless there is a library you can embed with the cabal build system?

@jgm
Owner

jgm commented Jul 10, 2023

This should be doable. All we would need is an XML parser & renderer, which we depend on already.

@jgm
Owner

jgm commented Jul 10, 2023

I don't think we should have an additional option. I think this should just be default behavior for HTML5. We could keep the other behavior for HTML4.

@nxn-4-wdf
Author

OK, thanks.

Another potential issue I see: duplicate id attributes coming from multiple included files.
But Pandoc already detects potential duplicate IDs and has a system to prefix them, right? For example, it could prefix an ID with something derived from the file name.

Then you would just have to ensure that the href attributes are set correctly.

As for whether it should be the default, that is your choice. I tend to be conservative: sometimes people expect "bug-to-bug compatibility". Maybe someone relies on the SVGs being converted to PNGs and included in data-URIs...

@jgm
Owner

jgm commented Jul 10, 2023

> Maybe someone relies on the SVGs being converted to PNGs and included in data-URIs...

The only reason I can think of for wanting to do this would be if inline SVG support isn't universal for HTML5 renderers. Is that an issue?

In general, I have to balance between satisfying everyone's expectations and adding additional complexity (more options).

@jgm
Owner

jgm commented Jul 10, 2023

As for the ids, I think I'd just generate ids based on the sha1 hash of the contents. Keep track of generated ids that have been used, and if we get a duplicate, use the href form to refer back to the already included one.
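
A rough sketch of that bookkeeping, in Python for brevity. The function name `inline_svg`, the `svg-` prefix, and the truncation of the hash to 12 characters are assumptions made for illustration; they are not taken from pandoc. The sketch also assumes the root tag carries no id of its own.

```python
import hashlib


def inline_svg(svg_markup: str, seen_ids: set) -> str:
    """Inline an SVG once; later occurrences refer back to it with <use>."""
    # Derive a stable id from the content. The prefix keeps the id from
    # starting with a digit.
    digest = hashlib.sha1(svg_markup.encode("utf-8")).hexdigest()
    svg_id = "svg-" + digest[:12]

    if svg_id in seen_ids:
        # Same content was already included: point at the first copy.
        return f'<svg><use href="#{svg_id}"/></svg>'

    seen_ids.add(svg_id)
    # First occurrence: add the id to the root tag (first "<svg" only).
    return svg_markup.replace("<svg", f'<svg id="{svg_id}"', 1)
```

The caller keeps one `seen_ids` set per output document, so identical files referenced several times are embedded only once. Sizing of the referring element is left out here.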

@nxn-4-wdf
Author

> The only reason I can think of for wanting to do this would be if inline SVG support isn't universal for HTML5 renderers. Is that an issue?

Not for me.
Inline SVG is well supported: 98.39% according to https://caniuse.com/?search=inline%20svg
Many renderers are based on Chrome, which should reduce the impact even further.

@jgm
Owner

jgm commented Jul 10, 2023

One thing seems wrong in your request. I wonder if you're using a really old version of pandoc? We don't convert to png; we just include a data URI that encodes the SVG itself.

So: there is no loss in resolution, and only a modest increase in data size. (e.g. a 9K SVG expands to around 15K in the data URI; though if there are many references to the same SVG, you'll get duplicates, and that still might motivate this change).
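
For illustration, here is one common way to build such a data URI, assuming base64 encoding. The thread does not say which encoding pandoc uses, and the helper name `svg_data_uri` is hypothetical.

```python
import base64


def svg_data_uri(svg_bytes: bytes) -> str:
    """Encode raw SVG bytes as a base64 data URI (an assumed encoding)."""
    payload = base64.b64encode(svg_bytes).decode("ascii")
    return "data:image/svg+xml;base64," + payload
```

Base64 turns every 3 bytes into 4 characters, so under this assumption a 9000-byte file becomes a 12000-character payload plus the short prefix. The SVG itself is carried unchanged, which is why no resolution is lost.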

@jgm
Owner

jgm commented Jul 10, 2023

My immediate question is how to handle other attributes on the img tag. If we just have src, fine. But an img tag may also have alt, class, style, and others. These will be lost if we just insert an svg element. They can be retained with a data URI.

@nxn-4-wdf
Author

nxn-4-wdf commented Jul 10, 2023

You can put all these attributes into the new svg tag.
If they collide with the attributes from the SVG file, then preferably use the attributes from img.

@nxn-4-wdf
Author

To be more precise: if an image has attributes (alt, class, style, ...), this means that the user specified them on purpose.
For example, with:

An inline ![image](foo.svg){#myid .myclass width=300 height=200px}

In this case, the user-specified attributes should override the attributes imported or generated from the svg file.
With the example above, this would result in:

<svg id="myid" class="myclass" width="300" height="200px" viewBox="-0.5 -0.5 781 401">
...
</svg>
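
The override rule can be sketched as a simple dictionary merge, assuming both attribute sets have already been parsed into name/value pairs. The helper names `merge_svg_attributes` and `render_open_tag` are hypothetical.

```python
from html import escape


def merge_svg_attributes(svg_attrs: dict, img_attrs: dict) -> dict:
    """Combine attributes; values given on the image win over the file's own."""
    merged = dict(svg_attrs)
    merged.update(img_attrs)
    return merged


def render_open_tag(attrs: dict) -> str:
    """Render the opening <svg> tag with escaped attribute values."""
    rendered = "".join(
        f' {name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    return f"<svg{rendered}>"
```

With the example above, the file's width and height are replaced by `300` and `200px`, while the viewBox from the file is kept because the image does not specify one. Attribute order differs from the hand-written result, which does not matter in HTML.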

@nxn-4-wdf
Author

> One thing seems wrong in your request. I wonder if you're using a really old version of pandoc? We don't convert to png; we just include a data URI that encodes the SVG itself.

I had some issues with SVG a few years ago on a Linux system (SUSE 12, if I recall correctly), where it complained that librsvg was missing, and I could not install it for whatever reason. Since librsvg converts SVG to PNG, I assumed that it was the missing piece.

@jgm jgm closed this as completed in 94832af Jul 11, 2023
@nxn-4-wdf
Author

@jgm Thanks a lot for the quick implementation! 👏

I've left a comment on commit 94832af on GitHub; I hope it helps.

jgm added a commit that referenced this issue Jul 13, 2023
jgm added a commit that referenced this issue Nov 25, 2023
- Ensure unique ids for elements by prefixing SVG id.
- Ensure SVG `id` attribute except when `use` element is used.
- Remove `width`, `height` attributes from svg element when `use`
  element is used. Instead, add `width` and `height` 100% to the
  `use` element. This seems to get the sizing right.

Closes #9206.

Ref: #8948.