Consistent text rendering failures- "Text layouting failed" #614

Closed
randomairborne opened this issue Apr 19, 2023 · 19 comments
@randomairborne

randomairborne commented Apr 19, 2023

I am using resvg to render rank cards for a Discord service. Sometimes, some usernames will fail to render, despite using very similar settings, including fonts, to the other text on the page, which has no problems rendering. The code that does this is available at randomairborne/experienced/xpd-rank-card. What has me confused here is that some text with some settings works, but other text doesn't, and a different name will have different results.

Examples:

broken svg

working svg

broken render

working render

@RazrFalcon
Owner

Interesting. Where can I find this font?

@randomairborne
Author

@RazrFalcon
Owner

Wow! That was a weird bug. You have managed to hit a resvg text layout limitation that I haven't had time to fix.

The problem is that your SVG text:

<text x="135" y="60">
    <tspan class="name font">ItsMeAlfie0</tspan>
    <tspan class="discriminator font">#0001</tspan>
</text>

would be parsed as:

<text x="135" y="60">
    <tspan font-family="Mojang, sans-serif" font-size="25px" fill="#FFFFFF">ItsMeAlfie0</tspan>
    <tspan font-family="Roboto" font-size="12px" fill="#FFFFFF"> </tspan> <!-- yes, the space between tspans is significant -->
    <tspan font-family="Mojang, sans-serif" font-size="10px" fill="#CCCCCC">#0001</tspan>
</text>

And since you haven't set the font for whitespace, it falls back to the default one, which is Roboto in your case.

Now, resvg would try to shape your string using two fonts: Mojang and Roboto. And Roboto supports ligature substitution, so fi (U+0066 and U+0069) would become ﬁ (U+FB01), which is a single glyph and not two. resvg, as of now, doesn't support this and will fail.
Welcome to the world of modern text layout...
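To make the glyph-count mismatch concrete, here is a minimal sketch in plain Rust. It is a toy model only: `toy_shape` is a hypothetical stand-in for a real shaper and is not part of resvg or rustybuzz. It simply collapses "fi" into one glyph when the font is assumed to have that ligature.

```rust
/// Toy stand-in for a shaper (hypothetical, not resvg's code).
/// Emits one glyph label per character, except that "fi" becomes a
/// single ligature glyph when `has_fi_ligature` is true.
fn toy_shape(text: &str, has_fi_ligature: bool) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut glyphs = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        if has_fi_ligature && chars[i] == 'f' && chars.get(i + 1) == Some(&'i') {
            // Two characters are consumed, but only one glyph is produced.
            glyphs.push("\u{FB01}".to_string());
            i += 2;
        } else {
            glyphs.push(chars[i].to_string());
            i += 1;
        }
    }
    glyphs
}
```

Shaping "ItsMeAlfie0 #0001" this way gives 17 glyphs for a font without the ligature and 16 for a font with it, so any logic that assumes both results line up position by position cannot work.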

I will try to fix it, but in the meantime you can easily work around this on your side by setting class="font" on the text element itself, so that all whitespace is set to Mojang as well.
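For a generator written in Rust, the workaround might look roughly like the sketch below. The function names `name_text` and `escape_xml` are hypothetical; the class names and coordinates are copied from the snippet above, and it is assumed that the `font` class is what sets the font family in the stylesheet.

```rust
/// Minimal XML escaping for text content (illustrative helper).
fn escape_xml(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Builds the username markup with class="font" on the <text> element,
/// so the whitespace between the tspans inherits the same font.
fn name_text(name: &str, discriminator: &str) -> String {
    let name = escape_xml(name);
    let discriminator = escape_xml(discriminator);
    format!(
        "<text class=\"font\" x=\"135\" y=\"60\">\n    \
         <tspan class=\"name font\">{name}</tspan>\n    \
         <tspan class=\"discriminator font\">#{discriminator}</tspan>\n\
         </text>"
    )
}
```

Calling `name_text("ItsMeAlfie0", "0001")` yields the same two tspans as before, only wrapped in a `<text>` element that carries the font class.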

@RazrFalcon RazrFalcon added the bug label Apr 21, 2023
@randomairborne
Author

Wow, that's kind of insane! Thanks for taking the time to help out with this! I'll add the workaround to the library asap.

@geocybrid

I'm hitting this problem in a slightly different scenario and trying to investigate, but here is what I know so far:

  • Layout fails for the penultimate line of the attached svg, the one with font-family override.
  • The "default" font is "Rubik..." and it has this magic glyph merging feature between "f" and "i". The Comic Neue does not.
  • Most tspans are rendered in two pieces for some reason not yet understood by me. The first span is [0,N-1] and the second span is [N-1, N]. In case of the offending line, the first span uses the overridden font (cursive aka Comic), while the tail for some reason uses Rubik (i.e. css font).
  • The tail generates 1 less glyph due to that merging thing and fails the layout. If I disable the check, I get all the letters except the last one rendered correctly. And of course, if I put space between f and i - the problem goes away.
  • If I dump the chunk text going into the loop of outline_chunk function, it appears to have trailing spaces, despite the SVG tspan not having any. Looking into this now.
  • I'm calling usvg directly, not resvg. I'm manually feeding the fonts, so that the cursive is Comic Neue and Rubik Scribble is being loaded despite lack of @ rule support. All three fonts render OK and are all from Google fonts.

P.S. thanks for the awesome lib! Having to do any processing on SVG from scratch would be a nightmare :)

google_font_ttf

@RazrFalcon
Owner

Yes, I would have to find some time to figure it out.

> Most tspans are rendered in two pieces for some reason not yet understood by me.

What do you mean by that? Do you have a minimal example?

> If I dump the chunk text going into the loop of outline_chunk function, it appears to have trailing spaces, despite the SVG tspan not having any.

Do you have a minimal example of this? Whitespaces handling in SVG is very complex and unintuitive.

@LaurenzV
Contributor

Also, it might help if you create a directory that contains all of the fonts that are needed for reproduction, run resvg with --skip-system-fonts and --use-fonts-dir, and then tell us which fonts you used. Otherwise it's hard to reproduce, since every system has a different set of fonts installed...
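As a rough sketch, a reproduction command with those two flags could be assembled from Rust like this. The helper name `resvg_repro_command` and the example paths are hypothetical, and the positional input/output arguments assume the usual `resvg [OPTIONS] <in-svg> <out-png>` invocation; check `resvg --help` for your version.

```rust
use std::process::Command;

/// Builds (but does not run) a resvg invocation that ignores system fonts
/// and loads fonts only from `fonts_dir`.
fn resvg_repro_command(fonts_dir: &str, input_svg: &str, output_png: &str) -> Command {
    let mut cmd = Command::new("resvg");
    cmd.arg("--skip-system-fonts")
        .arg("--use-fonts-dir")
        .arg(fonts_dir)
        .arg(input_svg)
        .arg(output_png);
    cmd
}
```

Running the returned command with `.status()` should then use only the fonts in that directory, which makes the report independent of whatever is installed locally.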

@RazrFalcon
Owner

@LaurenzV Roboto is known to have this issue.

@geocybrid

Thanks @RazrFalcon for quick response.

First the repro - added zip file with both the command I use as well as the SVG and the Google TTFs folder.
usvg_repro.zip

Now about the two piece tspan tag rendering thing. I've added logging to outline_chunk() and in the file I attached most tspan tags are represented by two TextSpans each. The first one is for the actual text and the next one is the whitespace between tspan tags.

The way I understand it, the whitespace between the consecutive tspan tags is appended to the last open TextChunk as a new TextSpan (in collect_text_chunks_impl, here). This whitespace TextSpan uses CSS font (as opposed to inline override), which causes the merged glyph problem for this particular image.

I've tried two things to fix this issue, but having zero idea about the intricacies of the SVG text handling, pretty sure my approaches are way too naive to work... Also, this is the first time I have ever looked at any rust source, so pardon my axe swinging ;)

What I tried so far:

  1. I tried to basically kill the branch that appends the TextSpan to existing TextChunk by adding || is_new_span to the condition here. This seems to solve the problem with this particular image, but probably breaks tons of other things that I have no idea about. Are the whitespaces supposed to have their own object, or are they actually supposed to be part of the previous text?
  2. I tried to change outline_chunk to append TextSpans to glyphs vector instead of running the entire text every time and overwriting glyphs in place. This also prevented the crash as you can imagine, but introduced a lot of wasted vertical space. Probably caused by shape_text treating its parameter as a complete "line", i.e. whitespace TextSpans get their own lines.

So this is where I am at this point. Once again - thanks for taking time to look into this, really appreciate your work.

@RazrFalcon
Owner

RazrFalcon commented Mar 31, 2024

:nth-child(2n)

This is not supported. resvg supports only basic CSS 2 selectors.

As for whitespaces - I'm not sure what the problem is here. resvg renders this SVG just fine (excluding the fonts bug). Why do you care about whitespaces to begin with? Because you are trying to render it yourself? Then you have to mimic how usvg converts text into paths. Which is super complicated.

If you plan to parse SVG and render text yourself - just give up. It would not work. You have to flatten text and render paths.
SVG text layout is bizarrely complex.

There are no parsing bugs here. usvg parses this SVG correctly.

@geocybrid

Sorry - I didn't give context for why I care about this detail and I am not claiming there is any bug at all - just a feature to handle glyph merging that is not yet implemented.

Basically I'm trying to use usvg to pre-process user-supplied SVG files into something that can be sent to the CNC software, i.e. flatten text and complex shapes primarily.

I'm doing torture-tests to understand the level of compatibility and what to ask customers to bring. While doing this, I bumped into the glyph merging issue. It is relatively easy to explain to the customer that e.g. only TTF fonts or Google fonts are supported, but explaining to them that in some rare cases fonts with glyph substitution will fail is not so easy.

So I investigated why this specific failure is triggered when I don't seem to have mixed fonts in a single line. Here, I found that the whitespace is implicitly added with a different font and causes this issue. My first instinct was to "fix" it so that mixed fonts do not trigger this situation in general, or at least it doesn't blow up if there are no intentional mixed fonts in the same line. This way, it would be easier to specify what the customers could bring and what is not supported. This seems to be a rabbithole though.

Hopefully this explains my thinking a bit.

So, if the double rendering of tspan is how it is supposed to work, perhaps another option (for me) would be to pre-pre-process the SVG file and just kill those unintended whitespaces, maybe even check for fonts with glyph substitution and reject the file (I'm already collecting fonts ahead of time from javascript).
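A very naive version of that pre-pre-processing step could look like the sketch below. The function name `strip_inter_tspan_whitespace` is hypothetical, and it works on raw text instead of a real XML parser, so it ignores comments, CDATA and xml:space. It also removes whitespace that may have been intended as a visible gap between spans.

```rust
/// Removes whitespace-only gaps between a closing </tspan> and the next
/// opening <tspan. Purely textual; not a substitute for an XML parser.
fn strip_inter_tspan_whitespace(svg: &str) -> String {
    const CLOSE: &str = "</tspan>";
    const OPEN: &str = "<tspan";
    let mut out = String::with_capacity(svg.len());
    let mut rest = svg;
    while let Some(pos) = rest.find(CLOSE) {
        let after_close = pos + CLOSE.len();
        // Copy everything up to and including the closing tag.
        out.push_str(&rest[..after_close]);
        rest = &rest[after_close..];
        // If only whitespace separates us from the next <tspan, drop it.
        let trimmed = rest.trim_start();
        if trimmed.starts_with(OPEN) {
            rest = trimmed;
        }
    }
    out.push_str(rest);
    out
}
```

Whitespace after the last `</tspan>` is left alone, since it is not followed by another tspan.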

Or ideally, we could try to implement this whole glyph merging support properly, if you give me some pointers on what to look out for.

@RazrFalcon
Owner

Whitespaces handling is totally unrelated to this issue. It just happens to trigger it. There are different ways to do it.
The only solution here is to fix glyphs fallback.

> So, if the double rendering of tspan is how it is supposed to work

There is no double rendering. There are no whitespaces during rendering to begin with. Only paths.

> only TTF fonts or Google fonts are supported

Are there any other types of fonts than TrueType? 😄

@geocybrid

geocybrid commented Mar 31, 2024

Again, sorry for the wording :) Since I'm using usvg and not full resvg, for me - the rendering means generation of paths. This happens in outline_chunk and the loop runs twice for each tspan+whitespace. It of course does nothing wrong - the path is also generated only once, it just happens to trigger this problem in a somewhat common scenario. That's what I meant :)

The offer still stands - if you have an idea of how you want this problem fixed, I'm ready to give it a shot.

Somewhat unrelated to that, but while we're at it... Would you be against exposing usvg as a library to allow usage from e.g. wasm?

Edited: my lack of rust knowledge again :) apparently this crate is a library and all the necessary symbols are already exported. Sorry for the noise.

@RazrFalcon
Owner

I'm trying to fix the problem myself, but it's hard. I'm not really sure how it should be handled.

@geocybrid

Couple of random thoughts, looking from a distance :)

  • The outline_chunk (now process_chunk) running the same string multiple times with different fonts seems to be causing this issue, so perhaps we could teach shape_text to process substrings without breaking the line?
  • rustybuzz::shape() seems to return the indices of the clusters in the source string in the GlyphInfo record, which are then packed into byte_idx of the Glyph struct. With this information, it should be possible to correctly iterate over the glyphs array and replace only the right glyphs. Of course, there are some corner cases here, like what happens if only half of a glyph needs to be replaced. But this is a side effect of this approach and would probably be fine to handle with a warning.
  • Finally, we could choose to never join text chunks with different fonts into a single string. This would probably also mean that some secondary changes will be needed to not start a new line, like in the first item here.
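The second idea, selecting glyphs by their source byte index instead of by position, can be sketched with a toy model. Everything here is hypothetical and independent of resvg's real types: `Glyph`, `toy_shape`, `glyphs_for_span` and `layout` are illustrative names, the boolean stands in for "this font has an fi ligature", and only left-to-right text is considered.

```rust
use std::ops::Range;

#[derive(Clone, Debug, PartialEq)]
struct Glyph {
    byte_idx: usize, // start of the glyph's cluster in the source string
    label: String,   // stand-in for a glyph id
}

/// Toy shaper: one glyph per char, with an optional "fi" ligature.
fn toy_shape(text: &str, has_fi_ligature: bool) -> Vec<Glyph> {
    let chars: Vec<(usize, char)> = text.char_indices().collect();
    let mut glyphs = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let (byte_idx, c) = chars[i];
        let next_is_i = chars.get(i + 1).map_or(false, |&(_, n)| n == 'i');
        if has_fi_ligature && c == 'f' && next_is_i {
            glyphs.push(Glyph { byte_idx, label: "\u{FB01}".to_string() });
            i += 2;
        } else {
            glyphs.push(Glyph { byte_idx, label: c.to_string() });
            i += 1;
        }
    }
    glyphs
}

/// Keeps only the glyphs whose cluster starts inside `span`.
fn glyphs_for_span(shaped: &[Glyph], span: Range<usize>) -> Vec<Glyph> {
    shaped.iter().filter(|g| span.contains(&g.byte_idx)).cloned().collect()
}

/// Shapes the whole text once per span font, then picks glyphs by byte range.
fn layout(text: &str, spans: &[(Range<usize>, bool)]) -> Vec<Glyph> {
    let mut out = Vec::new();
    for (range, has_fi_ligature) in spans {
        let shaped = toy_shape(text, *has_fi_ligature);
        out.extend(glyphs_for_span(&shaped, range.clone()));
    }
    out
}
```

With the spans from the original report (bytes 0..11 and 12..17 in a font without the ligature, byte 11 in a font with it), the ligature formed outside the whitespace span no longer shifts anything. The corner case mentioned above, a cluster that straddles a span boundary, is deliberately not handled in this sketch.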

@RazrFalcon
Owner

Took almost a year, but should be fixed now. The solution is meh, but I don't have any better ideas at the moment.
I've also tried explaining the cause in the function's comment, for history. Not sure it would help anyone though. It's a very tricky thing.

@RazrFalcon
Owner

RazrFalcon commented Mar 31, 2024

> so perhaps we could teach shape_text to process substrings without breaking the line?

It's very hard to do for mixed order text. There are plans to avoid shaping the whole string for each font (#486), but I have no idea if it is even possible. Way out of my expertise.

> Of course, there are some corner cases here, like what happens if only the half of the glyph needs to be replaced.

That's basically what the new code does. And I'm still not very sure how correct it is.

> Finally, we could choose to never join text chunks with different fonts into a single string.

This is not how SVG works. Also, there are no lines in SVG 1. SVG 2 does have lines now, but I'm not sure anyone actually supports them.

Overall, this is not a hard problem if we have only left-to-right text, but we don't.

If you want to understand what kind of monstrosity SVG text is - check out this example. Here, one tspan is rendered as two. How is this possible?! Welcome to SVG text.
How well is it supported? About as well as you would expect:
bidi-reordering

@geocybrid

Amazing work! Thanks Yevhenii!

@randomairborne
Author

Thank you, RazrFalcon!

RazrFalcon added a commit that referenced this issue Apr 3, 2024
RazrFalcon added a commit that referenced this issue Apr 3, 2024