-
Notifications
You must be signed in to change notification settings - Fork 599
Commit
See comments.
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -578,6 +578,13 @@ struct hb_ot_apply_context_t : | |
unsigned stop = num_items - 1; | ||
if (c->buffer->flags & HB_BUFFER_FLAG_PRODUCE_UNSAFE_TO_CONCAT) | ||
stop = 1 - 1; | ||
This comment has been minimized.
Sorry, something went wrong.
This comment has been minimized.
Sorry, something went wrong.
behdad
Author
Member
|
||
|
||
/* When looking back, limit how far we search; this function is mostly | ||
* used for looking back for base glyphs when attaching marks. If we | ||
* don't limit, we can get O(n^2) behavior where n is the number of | ||
* consecutive marks. */ | ||
stop = (unsigned) hb_max ((int) stop, (int) idx - HB_MAX_CONTEXT_LENGTH); | ||
This comment has been minimized.
Sorry, something went wrong. |
||
|
||
while (idx > stop) | ||
{ | ||
idx--; | ||
|
18 comments
on commit 85be877
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?
What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?
Depending on how the font is designed, they will either stack on top of each other, or overdrawn.
What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?
Again, depending on how the font is designed, they might appear to the left of the cluster as a new cluster, or overdrawn on the cluster, or any other wrong rendering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you do to prevent the same base glyph to be drawn more than once? Like if antialiasing is enabled, it will show differently?
Depending on how the font is designed, they will either stack on top of each other, or overdrawn.
May this affect pixel-by-pixel tests?
What if a different base glyph is after the limit, what will be drawn then? Will it not show those base glyphs then? What are the implications of that?
Again, depending on how the font is designed, they might appear to the left of the cluster as a new cluster, or overdrawn on the cluster, or any other wrong rendering.
Please read my question again, if you have N of these characters and M of other similar characters, then the limit you impose doesn't print all of these characters? What do you think about that?
I think you need to halt this change until a full analysis of the effects and side-effects this change has been performed! Any comments? There are simply too many cases here, that a simple min() will solve it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to halt this change until a full analysis of the effects and side-effects this change has been performed! Any comments? There are simply too many cases here, that a simple min() will solve it!
This change fixes the algorithmic misbehavior of O(n^2), and only that. Anything else (overprinting), etc, are separate issues, and this change does not try to address those. I've asked @jfkthame to review this change and we believe from HarfBuzz's point of view this is the adequate fix.
What do you do to prevent the same base glyph to be drawn more than once?
These kinds of suggestions belong in new issues, but I'm going to say we're very unlikely to do such parental-control on the input text. Specially since there's precedent that Unicode text sometimes uses multiple diacritic marks to mean different things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it fixes one problem and introduces another. 1 in 1 out, what should I say. I guess you have a plan here to move on!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 in 1 out, what should I say
I think I don't understand. What new problem does this introduce?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then the limit you impose doesn't print all of these characters?
That is not what is happening here (and HarfBuzz does not "paint" any thing). The limit here will prevent the marks from being attached to the base glyph, but they will still be left at their default (not attached) position. HB_MAX_CONTEXT_LENGTH
is currently 64, and it is very unlikely that a base glyph will have more than 64 marks attached to it (may be excessive zalgo text, but we can safely dismiss this as irrelevant).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:
and with this commit:
I’m not sure what I’m going to do about it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:
Can you elaborate what you do with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s an extreme case, but Noto Sans Duployan 3.001 does use more than 64 marks per base. For example, here is <U+1BC23, U+002C> in HarfBuzz 6.0.0:
Can you use mkmk
, which wouldn't induce the O(n^2)
behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if I can solve the O(n^2)
issue without putting a limit at all. Let me try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if I can solve the
O(n^2)
issue without putting a limit at all. Let me try.
This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if I can solve the
O(n^2)
issue without putting a limit at all. Let me try.This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.
Okay I did this. But now the unsafe_to_break
code is exhibiting O(n^2)
behavior. I don't think there's a way to win this battle...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if I can solve the
O(n^2)
issue without putting a limit at all. Let me try.This turns out to be non-trivial to do properly given that such lookup can be recursed-to from (Chain)Context lookups :(.
Okay I did this. But now the
unsafe_to_break
code is exhibitingO(n^2)
behavior. I don't think there's a way to win this battle...
At least now I can set a limit to 256 perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I did this. But now the
unsafe_to_break
code is exhibitingO(n^2)
behavior. I don't think there's a way to win this battle...
Okay I fixed that. This should work for now for this test.
I'm sure one can construct tests to hit the merge_clusters
codepath that would still exhibit O(n^2)
behavior, and I don't know how to fix that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I audited merge_clusters
and put some safety-nets in place there as well. Fingers crossed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I see the above commit(85be877) is reverted by commit (661050b). Is there any patch to fix this - [https://nvd.nist.gov/vuln/detail/CVE-2023-25193]? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I see the above commit(85be877) is reverted by commit (661050b). Is there any patch to fix this - [https://nvd.nist.gov/vuln/detail/CVE-2023-25193]? Thanks.
It's pre-existing, but happened to catch my eye here:
stop = 1 - 1
seems an odd way to writestop = 0
... is that really what was intended?