sub: add sub-detect-rtl option #12985
a3bf379 to f1a459e
Updated to add a @ShlomoCode can you try this version once it's finished rebuilding?
That sounds like the least flexible solution, needs a better fix IMO.
That's just the auto detection mode, the I asked in #libass and this is what I was recommended to do, I suppose libass could provide an API to do this on their side instead
I'm afraid it's not possible to display this correctly then without changes from libass side. I still think there's some merit to having the
libass assumes all text to be left-to-right for compatibility with VSFilter. We can force auto detection by setting Encoding=-1, but it may break older subtitles so don't enable it by default.
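To make the `Encoding=-1` idea concrete, here is a minimal sketch that applies the same setting to an ASS script from the outside. It is not how the PR does it inside mpv; the function name `force_auto_base_direction` is hypothetical, and the sketch assumes the script has a `[V4+ Styles]` (or `[V4 Styles]`) section whose `Format:` line names an `Encoding` column.

```python
def force_auto_base_direction(ass_text: str) -> str:
    """Return ass_text with the Encoding field of every style set to -1.

    Hypothetical helper for illustration: it only rewrites "Style:" lines
    that follow a "Format:" line inside a styles section.
    """
    out = []
    fields = None       # column names from the styles "Format:" line
    in_styles = False   # True while inside a styles section
    for line in ass_text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if stripped.startswith("["):
            in_styles = lowered in ("[v4+ styles]", "[v4 styles]")
            fields = None
        elif in_styles and lowered.startswith("format:"):
            fields = [f.strip().lower() for f in stripped.split(":", 1)[1].split(",")]
        elif in_styles and fields and lowered.startswith("style:") and "encoding" in fields:
            values = stripped.split(":", 1)[1].split(",")
            if len(values) == len(fields):
                # -1 asks the renderer to auto-detect the base direction
                values[fields.index("encoding")] = "-1"
                line = "Style:" + ",".join(values)
        out.append(line)
    return "\n".join(out)
```

Lines outside the styles section, including `Dialogue:` events, pass through unchanged, so only the base-direction hint is touched.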
f1a459e to a9ada0b
@@ -2426,6 +2426,12 @@ Subtitles
``--sub-speed=25/23.976`` plays frame based subtitles which have been
loaded assuming a framerate of 23.976 at 25 FPS.

``--sub-detect-rtl=<yes|no>``
Default: no. Set the ``Encoding`` flag to ``-1`` to enable Fribidi's base
considering it only applies to converted subs (and thus VSFilter is not relevant), in which situation would you possibly want to set this to no?
I do not know how well Fribidi's auto detection algorithm works, so I'm scared of introducing a change in default that might possibly regress some LTR subtitles. If you think it should be enabled by default then I could change that. also cc @avih
I should update the description, however.
I was thinking that either:
- it works correctly, then it can be enabled and we don't need an option
- it doesn't work correctly, then the entire PR isn't really that useful
That it regresses any LTR content is very unlikely, Unicode has clear definitions which glyphs/scripts are RTL. You'd have to mess up really bad.
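As a rough illustration of why detection rarely misfires, here is a minimal sketch of first-strong-character base direction detection using the bidi classes from Python's `unicodedata` module. It is a simplification of the Unicode bidi paragraph-level rules (it ignores isolates and embeddings) and is not FriBidi's implementation; the function name `base_direction` is hypothetical.

```python
import unicodedata


def base_direction(text: str, default: str = "ltr") -> str:
    """Guess the base direction from the first strongly directional character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":            # strong left-to-right (Latin, Cyrillic, ...)
            return "ltr"
        if cls in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic, ...)
            return "rtl"
    # Only digits, punctuation, or whitespace: nothing strong to go on
    return default
```

Digits, punctuation, and spaces are skipped because they carry no strong direction, so a line such as `- مرحبا` still resolves to `rtl`, while any line that starts with Latin text resolves to `ltr`.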
> it works correctly, then it can be enabled and we don't need an option
> That it regresses any LTR content is very unlikely
I think I mostly agree with this.
Auto RTL detection (based on the first word language apparently) is highly unlikely to get it wrong IMO.
However, it is still a guess for text which is missing RTL marks. So I think it would be useful to allow disabling it too, i.e. keep it an option.
I also think it would be useful to keep the force mode (i.e. force RTL instead of autodetect) because we know for a fact that in some cases detection is not good enough, though the method, while working (of inserting RTL marks), is a bit iffy, because libass doesn't currently have other methods to control direction.
But a user script can probably add RTL EMBEDDING marks for all sub lines? so just let user scripts allow forcing RTL?
Or maybe we could add an SDH filter which inserts RTL embedding?
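For reference, the "insert RTL EMBEDDING marks" idea could look roughly like the sketch below. It wraps each non-empty subtitle line in U+202B (RIGHT-TO-LEFT EMBEDDING) and U+202C (POP DIRECTIONAL FORMATTING). The function name `force_rtl` is hypothetical, and how a given renderer treats these marks is not verified here.

```python
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def force_rtl(lines):
    """Wrap every non-empty line in an RTL embedding; leave empty lines alone."""
    return [f"{RLE}{line}{PDF}" if line else line for line in lines]
```

Each line gets its own embedding and terminator, so a mark never leaks into the next subtitle line.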
Converting this to a draft for now as it has questionable usefulness in its current state. It seems libass will provide us some way to force RTL, so I can add the Currently, this option can be emulated by setting
Isn't there already an existing ASS_FEATURE_WHOLE_TEXT_LAYOUT API for implementing bidirectional text processing in libass? Isn't it better to add an option to choose whether to enable it?
Encoding=-1 already implies whole-text layout, and as seen, it's not enough for webvtt files obtained from Youtube. This is because Youtube sub tracks don't contain bidi information themselves, but are provided to the browser through css.
To clarify further and recap additional details from the discussion on IRC a while ago:
Standard webvtt files always use auto base direction, so we should always enable incompatible extensions. This currently covers bidi base direction, as well as bidi bracket matching, and unicode soft line wrapping. This _fixes_ right-to-left text rendering for webvtt files which correctly mark rtl/ltr. Webvtt files obtained from sources which sideload the RTL information through css also see an improvement due to the auto detection. Additionally, we could also consider enabling this for other sub formats in the future on a case-by-case basis. See also: mpv-player#12985 (comment)
Enable ASS_FEATURE_INCOMPATIBLE_EXTENSIONS and auto base detection by default, and add an option to disable this if needed. This currently means auto base direction, changing how bidi runs are split, bidi bracket matching, and unicode soft line wrapping. This is strictly an improvement for Webvtt files as they always use auto base detection. This _fixes_ right-to-left text rendering for webvtt files which correctly mark rtl/ltr. Webvtt files obtained from sources which sideload the RTL information through css also see an improvement due to the auto detection. Generally SRT files also want this, but some are also written to work around VSFilter quirks. See also: mpv-player#12985 (comment)
Enable ASS_FEATURE_{WHOLE_TEXT_LAYOUT, BIDI_BRACKETS} and auto base detection by default, and add an option to disable this if needed. This is strictly an improvement for webvtt files as they always use auto base detection. This _fixes_ right-to-left text rendering for webvtt files which correctly mark rtl/ltr. Webvtt files obtained from sources which sideload the RTL information through css also see an improvement due to the auto detection. Generally SRT files also want this, but some are also written to work around VSFilter quirks. See also: mpv-player#12985 (comment)
update: ffmpeg git master now preserves bidi marks
Enable ASS_FEATURE_{WHOLE_TEXT_LAYOUT, BIDI_BRACKETS} and auto base detection by default, and add an option to disable this if needed. This is strictly an improvement for webvtt files as they always use auto base detection. This _fixes_ right-to-left text rendering for webvtt files which correctly mark rtl/ltr. Webvtt files obtained from sources which sideload the RTL information through css also see an improvement due to the auto detection. Generally SRT files also want this, but some are also written to work around VSFilter quirks. See also: #12985 (comment)
libass assumes all text to be left-to-right for compatibility with VSFilter. We can force auto detection by setting Encoding=-1, but it may break older subtitles so don't enable it by default.
Fixes #12978