Skip to content

Commit

Permalink
[layout] Limit how far we skip when looking back
Browse files Browse the repository at this point in the history
See comments.
  • Loading branch information
behdad committed Feb 2, 2023
1 parent d18fd3f commit 85be877
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions src/hb-ot-layout-gsubgpos.hh
Original file line number Diff line number Diff line change
Expand Up @@ -578,6 +578,13 @@ struct hb_ot_apply_context_t :
unsigned stop = num_items - 1;
if (c->buffer->flags & HB_BUFFER_FLAG_PRODUCE_UNSAFE_TO_CONCAT)
stop = 1 - 1;

This comment has been minimized.

Copy link
@jfkthame

jfkthame Feb 2, 2023

Collaborator

It's pre-existing, but happened to catch my eye here: stop = 1 - 1 seems an odd way to write stop = 0 ... is that really what was intended?

This comment has been minimized.

Copy link
@behdad

behdad Feb 2, 2023

Author Member

Yes, the condition before it has num_items - 1; this one just replaces num_items with 1. I kept it for parallel.


/* When looking back, limit how far we search; this function is mostly
* used for looking back for base glyphs when attaching marks. If we
* don't limit, we can get O(n^2) behavior where n is the number of
* consecutive marks. */
stop = (unsigned) hb_max ((int) stop, (int) idx - HB_MAX_CONTEXT_LENGTH);

This comment has been minimized.

Copy link
@hselasky

hselasky Feb 2, 2023

Please check unsigned vs size_t .


while (idx > stop)
{
idx--;
Expand Down

18 comments on commit 85be877

@hselasky
Copy link

@hselasky hselasky commented on 85be877 Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?

What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?

Depending on how the font is designed, they will either stack on top of each other, or overdrawn.

What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?

Again, depending on how the font is designed, they might appear to the left of the cluster as a new cluster, or overdrawn on the cluster, or any other wrong rendering.

@hselasky
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?

Depending on how the font is designed, they will either stack on top of each other, or overdrawn.

May this affect pixel-by-pixel tests?

What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?

Again, depending on how the font is designed, they might appear to the left of the cluster as a new cluster, or overdrawn on the cluster, or any other wrong rendering.

Please read my question again, if you have N of these characters and M of other similar characters, then the limit you impose doesn't print all of these characters? What do you think about that?

I think you need to halt this change until a full analysis of the effects and side-effects this change has been performed! Any comments? There are simply too many cases here, that a simple min() will solve it!

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to halt this change until a full analysis of the effects and side-effects this change has been performed! Any comments? There are simply too many cases here, that a simple min() will solve it!

This change fixes the algorithmic misbehavior of O(n^2), and only that. Anything else (overprinting), etc, are separate issues, and this change does not try to address those. I've asked @jfkthame to review this change and we believe from HarfBuzz's point of view this is the adequate fix.

What do you do to prevent the same base glyph to be drawn more than once?

These kinds of suggestions belong in new issues, but I'm going to say we're very unlikely to do such parental-control on the input text. Specially since there's precedent that Unicode text sometimes uses multiple diacritic marks to mean different things.

@hselasky
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it fixes one problem and introduces another. 1 in 1 out, what should I say. I guess you have a plan here to move on!

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 in 1 out, what should I say

I think I don't understand. What new problem does this introduce?

@khaledhosny
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the limit you impose doesn't print all of these characters?

That is not what is happening here (and HarfBuzz does not "paint" any thing). The limit here will prevent the marks from being attached to the base glyph, but they will still be left at their default (not attached) position. HB_MAX_CONTEXT_LENGTH is currently 64, and it is very unlikely that a base glyph will have more than 64 marks attached to it (may be excessive zalgo text, but we can safely dismiss this as irrelevant).

@dscorbett
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:
𛰣,
and with this commit:
𛰣,
I’m not sure what I’m going to do about it yet.

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:

Can you elaborate what you do with that?

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:

Can you use mkmk, which wouldn't induce the O(n^2) behavior?

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if I can solve the O(n^2) issue without putting a limit at all. Let me try.

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if I can solve the O(n^2) issue without putting a limit at all. Let me try.

This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if I can solve the O(n^2) issue without putting a limit at all. Let me try.

This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.

Okay I did this. But now the unsafe_to_break code is exhibiting O(n^2) behavior. I don't think there's a way to win this battle...

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if I can solve the O(n^2) issue without putting a limit at all. Let me try.

This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.

Okay I did this. But now the unsafe_to_break code is exhibiting O(n^2) behavior. I don't think there's a way to win this battle...

At least now I can set a limit to 256 perhaps.

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I did this. But now the unsafe_to_break code is exhibiting O(n^2) behavior. I don't think there's a way to win this battle...

Okay I fixed that. This should work for now for this test.

I'm sure one can construct tests to hit the merge_clusters codepath that would still exhibit O(n^2) behavior, and I don't know how to fix that one.

@behdad
Copy link
Member Author

@behdad behdad commented on 85be877 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I audited merge_clusters and put some safety-nets in place there as well. Fingers crossed.

@SoumyaWind
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
I see the above commit(85be877) is reverted by commit (661050b). Is there any patch to fix this - [https://nvd.nist.gov/vuln/detail/CVE-2023-25193]? Thanks.

@khaledhosny
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I see the above commit(85be877) is reverted by commit (661050b). Is there any patch to fix this - [https://nvd.nist.gov/vuln/detail/CVE-2023-25193]? Thanks.

8708b9e

Please sign in to comment.