-
Notifications
You must be signed in to change notification settings - Fork 208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
complex shaper: avoid font warning for non-printable codepoints #508
Conversation
Repushed with trivial changes:
|
Note, there's a case which I'm not sure how it behaves - though I think it's not a new case: If all the symbols at the run don't try to resolve a font (e.g. if all are below 0x20, or newly also if I don't know if this ends up valid later on, or even if this tries to use invalid or out-of-bounds index. Though again - not a new issue as far as I can tell. |
So there are two things which might need addressing:
Or we can push it now, and unify the scans later - they do use the same code, but it does run twice now for the simple shaper. Let me know what you think. EDIT - also, now that I think about it, I don't know that the second commit doesn't introduce regressions. I think it has the potential to end up with shorter runs (because control symbols use font 0, but other symbols might use some other font), and I don't know what the implication of a run is. For instance, if only complete runs are passed to harfbuzz, then these control symbols might be on "runs" of their own, and I don't know whether they'd be taken them into account by the shaper while shaping surrounding runs. I'd need some help with assessing that. It's the first time I'm looking at libass code. |
Re-pushed with these changes:
I'm copying the WIP questions asked at the commit message to allow easier discussion:
Once we answer those, I'll remove the WIP text from the second commit. And, the last commit message suggests that it might be possible to reuse the existing
However, even if |
Yes, that sounds like a regression, unfortunately. Good catch! Or for some fonts, more like… lack of fix than regression. I think this is the very issue that triggered #71 (comment). For e. g. bidi control overrides it shouldn’t matter, but e. g. U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER must make it to HarfBuzz as part of their surrounding shaping runs, because they affect shaping. That means the control character must inherit For what it’s worth, I’ve verified that VSFilter/Uniscribe seems to handle ZWJ correctly (unlike current libass) in a situation where it joins Arabic letters and the main font (Helvetica) has a ZWJ glyph but not Arabic glyphs.
If it does, and we don’t handle it specially, it will presumably be handled like missing glyphs are handled at the moment. See also my next answer.
I think that’s what you do in this PR, so is the question rather “is there anything wrong with this approach”? No, I think it’s fine. But we may want to revise the list (again) to better match HarfBuzz. I’m not fully sure how HarfBuzz decides which code points to request glyphs for and which not. That (plus the potential for changes over time) is why ideally I’d like to somehow make HarfBuzz pick these characters out for us, rather than us try to guess and hope the lists match. I think it may be based on the default-ignorable Unicode property. I see HarfBuzz uses an
Uh, well, is it really any different from when some of the symbols don’t resolve any font?
Indeed, it is. You’re right when you say the complex shaper resets So you can just move the |
Right. Good catch, bad patch :)
Indeed. I also presumed this could be the case, though as you mentioned, it can happen anyway before this PR if the glyph is missing - which as we know can definitely happen for control codepoints.
Correct. I thought this was obvious in this context, but I could have phrased it better. Thanks.
Completely agreed. That was one of the things I thought might be happening here #507 (comment) It's definitely better to not guesswork/duplicate it, unless it's specified explicitly how harfbuzz does this.
I think it is different. Because if no font was resolved, and later you're required to provide a font, then it might end up accessing index 0 of the array, and maybe the array size is 0 to begin with because no font was ever found. I didn't try to follow the code enough to understand whether it's a real issue or not, but regardless, it's not a new issue because theoretically (but I don't know if practically) it can already happen before this PR if all the symbols are smaller than x20.
Can I? I wouldn't have thought so myself, because then So, considering the potential (additional) regression with even more cases of too-short runs than before, here's what I suggest we do:
|
Er, right. Just drop the part that changes symbol to 0. (It’s important for the complex shaper, too.) I don’t think it adds any meaning for later processing stages anyway (when |
On the other hand, if the font does contain a glyph for a control character and we’re using the “simple shaper”, I think we end up using the glyph’s metrics for layout despite This PR might actually fix that. |
Right. That was indeed one thing I also considered, but didn't know if both
Yes, definitely better to not change the run ID, but it can become tricky to a degree I felt uncomfortable to make this change without deeper knowledge of the codebase. So, I still stand by this:
Now also because different approaches which work better than this PR (or just suppression) do also require deeper knowledge of the codebase, and I'm simply not comfortable taking them without knowing for a fact that I don't introduce regressions. If you think that we can do better right now than simply addressing the warning on its own (i.e. my suggestion), then I'd prefer that you take over and handle it however you think is best. What say you? |
I agree that's the best thing to do, and I decided to have a go at this too. I think on top of the existing 3 commits would be OK.
That's implementation details, which I don't immediately know how to translate to code. Also, not sure about I've had a different(?) idea at the implementation:
Few things I'd need help with:
|
Assuming no one cares if control codepoints inside a run have a different font than surrounding symbols at the run, and assuming diff --git a/libass/ass_shaper.c b/libass/ass_shaper.c
index 396185f..50390c6 100644
--- a/libass/ass_shaper.c
+++ b/libass/ass_shaper.c
@@ -835,8 +835,17 @@ void ass_shaper_find_runs(ASS_Shaper *shaper, ASS_Renderer *render_priv,
ass_font_get_index(render_priv->fontselect, info->font,
symbol, &info->face_index, &info->glyph_index);
}
- if (i > 0) {
- GlyphInfo *last = glyphs + i - 1;
+
+ // test if a new run starts, completely ignoring control codepoints
+ int last_font = info->is_control ? -1 : i - 1;
+ while (last_font >= 0 && glyphs[last_font].is_control)
+ last_font--;
+ // last_font is -1 if NOT considering a new run (first symbol, or
+ // control codepoint, or all prior symbols are control), else nearest
+ // earlier index to compare font (guaranteed not a control codepoint).
+
+ if (last_font >= 0) {
+ GlyphInfo *last = glyphs + last_font;
if ((last->font != info->font ||
last->face_index != info->face_index ||
last->script != info->script || This fixes the new cases of too-short runs introduced by the second commit - control codepoint breaks the run even if a glyph can be found. And it also fixes earlier cases of too-short runs - control codepoint breaks the run because no glyph was found. EDIT: or even a better patch which fixes the same two issues (less code, O(1) even in pathological edge cases): diff --git a/libass/ass_shaper.c b/libass/ass_shaper.c
index 396185f..ea6bd5e 100644
--- a/libass/ass_shaper.c
+++ b/libass/ass_shaper.c
@@ -818,6 +818,7 @@ void ass_shaper_find_runs(ASS_Shaper *shaper, ASS_Renderer *render_priv,
{
int i;
int shape_run = 0;
+ int last_font = -1; // latest earlier index which is not a control codepoint
ass_shaper_determine_script(shaper, glyphs, len);
@@ -835,8 +836,15 @@ void ass_shaper_find_runs(ASS_Shaper *shaper, ASS_Renderer *render_priv,
ass_font_get_index(render_priv->fontselect, info->font,
symbol, &info->face_index, &info->glyph_index);
}
- if (i > 0) {
- GlyphInfo *last = glyphs + i - 1;
+ if (!info->is_control && last_font >= 0) {
+ // Consider a new run: current symbol is not a control point,
+ // and we also have a prior symbol which is not a control point.
+ // TODO: if we have last_font followed by several control symbols,
+ // and the current symbol font differs from the last_font
+ // (i.e. a new run begins), should the intermediate control
+ // symbols belong to the previous run or to the new run?
+ // (currently they remain with the previous run)
+ GlyphInfo *last = glyphs + last_font;
if ((last->font != info->font ||
last->face_index != info->face_index ||
last->script != info->script ||
@@ -845,6 +853,9 @@ void ass_shaper_find_runs(ASS_Shaper *shaper, ASS_Renderer *render_priv,
shape_run++;
}
info->shape_run_id = shape_run;
+
+ if (!info->is_control)
+ last_font = i;
}
} |
Actually, I'd probably not make such assumption, unless we know 100% for a fact that that is the case. The reason being that I can imagine harfbuz wanting to use some font metrics while appying these control codepoints, and with a "dummy" font 0 - it could end up using wrong metrics from the font even if the glyph itself is not used. Luckily, making sure that all the control codepoints font info in a run is copied from a codepoint for which a font was actually resolved is relatively simple to add to the patch above. Probably 2-3 more LOC. I'm not updating the patch above, but unless you prefer otherwise (i.e. keep the patch above as is in this regard), then I'll add it to the commit, assuming you'd want this functionality added. |
Forced-pushed with the following changes:
I'm fairly confident that this patchset:
The fourth commit (don't break runs on control codepoints) adds three TODO comments: // TODO: currently only control codes face_index is not-comparable,
// but we should consider the same also if a face was not
// resolved by ass_font_get_index for other reasons This is not a new question, and the code doesn't change existing behavior. However, someone might want to revisit the behavior in the future. // TODO: else face_comparable remains true. is this correct? (when // TODO: if there are both comparable and uncomparable faces at
// the previous run, then their face_index can differ.
// if this matters to the shaper, it can be adjusted here. This is a new question, and is a result of the change that I don't know the answer to that (i.e. whether it matters to the complex shaper), but if the answer is "yes, it does matter", then the fifth WIP commit addresses this concern. If it doesn't matter then keeping the fifth commit doesn't hurt, but also doesn't add anything useful, so maybe it can be removed. |
Yes, this isn't exposed in public API (the check is tailored for font shaping in HarfBuzz, it doesn't match the Unicode property 100%). |
FWIW, I found this definition here https://en.wikipedia.org/wiki/Module:Unicode_data/derived_core_properties : data.default_ignorable = {
singles = {
[0x000AD] = true,
[0x0034F] = true,
[0x0061C] = true,
[0x03164] = true,
[0x0FEFF] = true,
[0x0FFA0] = true,
},
ranges = {
{ 0x0115F, 0x01160 },
{ 0x017B4, 0x017B5 },
{ 0x0180B, 0x0180E },
{ 0x0200B, 0x0200F },
{ 0x0202A, 0x0202E },
{ 0x02060, 0x0206F },
{ 0x0FE00, 0x0FE0F },
{ 0x0FFF0, 0x0FFF8 },
{ 0x1BCA0, 0x1BCA3 },
{ 0x1D173, 0x1D17A },
{ 0xE0000, 0xE0FFF },
},
} the libass set which this patchset puts inside the In the other way around - the definition above includes quite a bit more than the definition of However, I don't know how credible this definition is, and not sure what it refers to. though google did place this page near the top of search results, so I guess people do link to it. |
@khaledhosny for codepoints in this (or some other) default_ignorable list:
My guess would be probably "no" for glyph info, as glyphs for these codepoints are probably not very relevant. But for face data for these codepoitns - I can imagine that there could be value even without the glyph. So, for instance, if libass tries to resolve faces for runs, and there are default_ignorable codepoints embedded within these runs which libass doesn't try to resolve a glyph (and similarly neither a face), should their face be adjusted to match the surrounding face at the run? |
I don’t quite understand what you mean. But HarfBuzz will behave shaping wise whether the font have glyphs for these code points or not. Under normal circumstances it will replace them with space glyph and zero its widths (the replacement glyph can be controlled with the API), it can also retain the glyph in the output under some special circumstances (if requested with the API, or if the fonts has some substitution rules for these glyphs). |
OK, I based my question on this (#507 (comment)):
This PR changes how "our previously found glyphs" works. I.e. it takes place at the stage of resolving the glyphs, before calling harfbuzz, and tries to avoid glyph resolution for default_ignorable codepoints, and deal with any consequences it might have. So I'm trying to understand for a fact what happens without assuming: whether or not harfbuzz does try to "request the font glyph via callbacks" for codepoints at some default_ignorable list. And the reason I'm asking is to understand whether or not libass should try to resolve a face which has a glyph for it, or whether it can simply say "it's ignorable, I'm not going to bother finding a (fallback) face which has this glyph". The second question is basically: does harfbuzz care at all what the face is for codepoints in a default_ignorable list - under normal circumstances (without some "show hidden" mode etc). I.e. I think it's not unimaginable that harfbuzz knows that the glyph itself is not relevant under normal circumstances, but that the face might still be relevant to extract some metrics in order to apply the codepoint correctly during layout. This is a guess - but I hoped for more knowledgable info by you. The reason I'm asking is that due to implementation details, if libass doesn't try to resolve a face for the glyph then the face ends up arbitrary. So this patchset can either leave it with some arbitraty value (if harfbuzz doesn't care what the face is) or it needs to do some work in order to adjust the face to match the resolved face of surrounding glyphs. So this is how I interpret your reply (please confirm):
So assuming a replacement glyph is not provided via the API, and under "normal" circumstances, do I understand correctly that harfbuzz does not need to access neither the glyph nor the font which is associated with this codepoint? |
Hmm, so there’s no way to know which code points require font data and which don’t other than going the Chrome way (with extra complications due to synthesized glyphs) or duplicating internal HarfBuzz code? Speaking of the Chrome way, is it not possible that a font intentionally uses glyph #0 for a supported character (or substitution) such that it should not invoke font fallback? And the proper handling of default ignorable code points is… feed them to HarfBuzz first; if it returns .notdef, perform font fallback as usual; if out of fonts and they’re still .notdef, turn them invisible? It would be great if hb-unicode exposed the raw default-ignorable property so we didn’t have to maintain our own list likely to fall out of sync over time. (By the way, I don’t see Chrome do this—turn ignorables invisible if no font has them and HarfBuzz has let them through—although I’m merely foolishly looking at unfamiliar source code. It looks like Firefox does this. But then Firefox had this issue but I don’t understand how it could be an issue to begin with and hence whether I need to worry about it in libass.) |
@avih Sorry for my silence in the last few days. I very much appreciate your enthusiasm and effort, but I think you’re overcomplicating this :-) Just copy the last glyph’s font face in By the way, currently, libass will try to fetch the glyph later anyway, regardless of HarfBuzz. |
I don't think this would work, but I'd be to glad be proven wrong (and my suggestion for you to just take over is still in place, I really don't mind who writes the code). The point is that a run could have several ignorable codepoints in a row at the beginning the run. I.e. you have nothing to copy from. And once the run ends - only then you can look back at the entire run - and if it has both resolved and unresolved glyhs - copy the resolved ones to the unresolved ones. But sure, it's also entirely possible I'm overly complicating :) |
libass/ass_shaper.c
Outdated
// NOTE: if face_index changes then it can split the run further, | ||
// but cannot accidentally combine adjacent runs (e.g. if two | ||
// symbols had different face_index but they have the same | ||
// face_index after ass_font_get_index). that's because split |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: This is the very place that sets face_index
. It is zero until here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: This is the very place that sets
face_index
. It is zero until here.
Right, that's the only thing I wasn't 100% sure of, so I added this comment to make it obvious if I'm assuming wrong. Thanks.
The reason I assumed that is that split_style_runs
splits on bold and italic, and assumed face_index
was derived from the style somewhere before ass_shaper_find_runs
, and that then ass_font_get_index
starts with the desired face/style, and then falls back to others. That's the very first thing which ass_font_get_index
does - tries the glyph at the input face_index - and if it's found, then nothing happens further later:
// try with the requested face
if (*face_index < font->n_faces) {
face = font->faces[*face_index];
index = FT_Get_Char_Index(face, ass_font_index_magic(face, symbol));
}
And the comment // try with the requested face
makes it look even more like the input face_index
was already set explicitly to some requested value.
This is correct. I was about to say: But:
Oh welp, so we do want the right
This is impossible, so you need not worry. libass makes sure each font always starts with one face. (See:
It checks whether a “glyph” is actually an inline ASS vector drawing. ASS can contain vector drawings instead of (in addition to) text. Drawings need no text shaping and delimit shaper runs.
Every drawing, and every glyph immediately following a drawing, has |
Good. Thanks :)
Right. But I didn't make any assumptions about prior states, except at my comment which you say is wrong :) (about face_index always being 0 before calling TBH, by looking at
I don't think I really understand the statement or its implications. See below.
Not 100% sure I follow. The right font as the surrounding glyphs (i.e. artificial adjustment)? or the right font as one which has a glyph for this codepoint (i.e. we actually need If it's the latter and it does have a resolved face which (can) differs from its surroundings, do we want to break the run due to the changed face (assuming nothing else except the face changed)? or do we want to keep it in one run which will now have more than one face? And also if it's the latter, what happens if we can't find a face which has this glyph? Do we still do the warning?
Right. I didn't assume anything about it, and the code doesn't need modification following this new (to me) information, but it does answer one of the TODO items:
So the answer is that it's correct because it doesn't matter - it's force-delimited by |
It depends on whether HarfBuzz wants to show this glyph or not, and like I said earlier there are few situations where it would.
This should work, as long as it is kept in the same run (runs, by definition, must use the same font) with its surroundings.
HarfBuzz does not have any special meaning of 0 glyph index, the callback function returns false when it does not find a glyph for the specified code point and that what tells HarfBuzz the font does not support it (Pango, for example, returns a very high glyph index that also encodes the code point to use it later for drawing missing glyph hex box).
You are unlikely to ever end up with |
This is true for most of them, but what about the Default_Ignorables that are excluded from HarfBuzz’s internal list, such as hangul fillers and shorthand format controls? My understanding is that it does not treat them as ignorables in any manner at all. If the font supports them, the font itself will ligate/substitute them into blank glyphs. But if not, we should not show fallback .notdef, because that’s the definition of Default_Ignorable. |
Aha, I see it now. Nice, thanks! If the |
They would be returned as
I think so. |
This removes the function ass_shaper_skip_characters, and sets .skip at ass_shaper_find_runs to the same value as before. However, unlike before - it doesn't also set .symbol=0. This affects only the simple shaper, because the complex shaper resets and re-calculates .skip later at shape_harfbuzz. According to @astiob this is equivalent and without regressions, though I didn't try to prove its correctness - avih. The type uint_least32_t is different than other places which hold a codepoint (`uint32_t' or `unsigned' are used) because that's the only type for which C99 guarantees support, and is at least 32 bits (uint32_t is optional, and unsigned can theoretically be 16 bits).
default-ignorable (DI) codepoints glyphs are not present at all fonts, are not used by the simple shaper for rendering, and are mostly not used by the complex shaper for rendering. This avoids missing-glyph warnings for DI codepoints. Because now we don't try to resolve DI glyphs - their face_index could differ from their surrounding, which would break the run. Similarly, before this commit DI could also break the run due to missing glyphs or if resolved-with-different-face. So now we also don't allow DI face_index to break the run. However, DI face_index can still differ from the surrounding, so we adjust DI face_index to match the rest of the run if possible so that the complex shaper later gets the correct face_index if needed (technically, only the first face_index of a run is used later). There's still a potential issue if the complex shaper needs DI glyphs for the rendering but they're not available at the run's face_index. After discussion with @astiob, it was concluded that perfect solution can't exist without major refactoring of the dataflow, and that the the approach at this commit (don't resolve DI glyphs, don't break runs on DI, adjust DI face_index to match the run) is both better than before and also the best we can do within the current code form.
Force-pushed two commits which replace the previous ones, as requested by @astiob on IRC. |
For what it's worth, I've been using the last patch-set (two-commits) in my own build of libass with mpv for the past 2 weeks. It seems to solve the original issue (no more warnings for missing glyphs of default-ignorable), and I didn't notice a negative impact, though I didn't try to measure anything. |
I’ll get back to this now. Preliminary tests suggested that Uniscribe in VSFilter doesn’t attach leading ZWJ to a subsequent fallback shape run, which should let us simplify our code further. But first, I want to do another test with something like U+1D179 MUSICAL SYMBOL BEGIN PHRASE that is logically a leading character, to make sure it doesn’t depend on the exact character. |
I don't think I understand how is this related to the issue which this PR tries to solve - avoid warning for missing glyphs which are not rendered anyway. |
Fonts don't need to contain any glyphs for these code points. Their special effect on layout (if any) will be handled by FriBidi and/or HarfBuzz, and they will be replaced by zero-width glyphs regardless of whether any font glyph was provided. By trying hard to find a glyph as we currently do, we end up mistakenly splitting shape runs on control characters that lack font glyphs and therefore ruining shaping, as well as logging unnecessary warnings about the missing glyphs. There are some quirks and exceptions. In particular, font glyphs for normally-invisible characters may be retained by recent HarfBuzz if OpenType substitutions are involved (which sometimes matches and sometimes doesn't match Uniscribe). This commit happens to retain this behavior for visible glyphs contained in the main font (as opposed to any arbitrarily-chosen fallback fonts) by virtue of defaulting run-leading invisibles to face_index 0, where HarfBuzz will find the glyphs on its own. Meanwhile, the case of _fallback_ fonts inserting visible glyphs doesn't even seem to be desirable to begin with. Preliminary testing shows this commit's logic matches Uniscribe in some common cases, but tests with musical symbols show strange results. More research is needed to match Uniscribe and hence VSFilter even better. On the other hand, this logic is likely not sophisticated enough for "optimal" typesetting without regard to Uniscribe compatibility. For example, Blink does not look up glyphs and split font-fallback runs prior to running HarfBuzz at all, and instead runs HarfBuzz on the whole higher-level run at once and then reruns it with the next fallback font on just those substrings that HarfBuzz failed to shape. In any case, some Default_Ignorable glyphs are more appropriately viewed as logical prefixes than suffixes and should be grouped with the following, not preceding, character in case of doubt. Furthermore, when a fallback font is used for a sequence of text, it may be desirable to use the same font for intervening or surrounding whitespace or punctuation even if the main font (or another fallback font) also contains glyphs for them; and whitespace characters can be synthesized in lieu of import from another font, and indeed HarfBuzz does synthesize them, but this is not always desirable. Additionally, the glyph HarfBuzz uses for zero-width substitutes may actually be visible as it defaults to the font's basic space glyph: if we were to forget about Uniscribe, we should define a custom, fake glyph index and translate it into `skip`. Finally, our higher-level shape run splitting is not perfect either: bidi splitting after script splitting can create awkward short runs, and in general, script splitting seems problematic in itself. Fixes libass#507 (log messages). Supersedes and closes libass#508. Fixes misshaping such as in the sample in libass#71 (comment), libass#71 (comment). However, this particular sample was actually a combined result of unnecessary font fallback (which this commit fixes) with another bug that was fixed in commit 80ce637: it already renders correctly since that commit, but the current commit alone would also have fixed it.
Fonts don't need to contain any glyphs for these code points. Their special effect on layout (if any) will be handled by FriBidi and/or HarfBuzz, and they will be replaced by zero-width glyphs regardless of whether any font glyph was provided. By trying hard to find a glyph as we currently do, we end up mistakenly splitting shape runs on control characters that lack font glyphs and therefore ruining shaping, as well as logging unnecessary warnings about the missing glyphs. There are some quirks and exceptions. In particular, font glyphs for normally-invisible characters may be retained by recent HarfBuzz if OpenType substitutions are involved (which sometimes matches and sometimes doesn't match Uniscribe). This commit happens to retain this behavior for visible glyphs contained in the main font (as opposed to any arbitrarily-chosen fallback fonts) by virtue of defaulting run-leading invisibles to face_index 0, where HarfBuzz will find the glyphs on its own. Meanwhile, the case of _fallback_ fonts inserting visible glyphs doesn't even seem to be desirable to begin with. Preliminary testing shows this commit's logic matches Uniscribe in some common cases, but tests with musical symbols show strange results. More research is needed to match Uniscribe and hence VSFilter even better. On the other hand, this logic is likely not sophisticated enough for "optimal" typesetting without regard to Uniscribe compatibility. For example, Blink does not look up glyphs and split font-fallback runs prior to running HarfBuzz at all, and instead runs HarfBuzz on the whole higher-level run at once and then reruns it with the next fallback font on just those substrings that HarfBuzz failed to shape. In any case, some Default_Ignorable glyphs are more appropriately viewed as logical prefixes than suffixes and should be grouped with the following, not preceding, character in case of doubt. Furthermore, when a fallback font is used for a sequence of text, it may be desirable to use the same font for intervening or surrounding whitespace or punctuation even if the main font (or another fallback font) also contains glyphs for them; and whitespace characters can be synthesized in lieu of import from another font, and indeed HarfBuzz does synthesize them, but this is not always desirable. Additionally, the glyph HarfBuzz uses for zero-width substitutes may actually be visible as it defaults to the font's basic space glyph: if we were to forget about Uniscribe, we should define a custom, fake glyph index and translate it into `skip`. Finally, our higher-level shape run splitting is not perfect either: bidi splitting after script splitting can create awkward short runs, and in general, script splitting seems problematic in itself. Fixes libass#507 (log messages). Supersedes and closes libass#508. Fixes misshaping such as in the sample in libass#71 (comment), libass#71 (comment). However, this particular sample was actually a combined result of unnecessary font fallback (which this commit fixes) with another bug that was fixed in commit 80ce637: it already renders correctly since that commit, but the current commit alone would also have fixed it.
This was performed in two steps (commits):