Extract all subtitle streams simultaneously #10884
Extracting a subtitle stream is bottlenecked by disk I/O: ffmpeg has to read through the whole file, but there is usually nothing CPU-intensive to do. If a file has multiple subtitle streams and we want to extract several of them, extracting them one by one means reading the whole file again and again.

However, ffmpeg can extract multiple streams at once. We can optimize this by extracting all subtitle streams when only one of them is requested, so that the rest are already cached for later use.

This is useful for people who switch subtitles during playback. It is even more useful for people who extract all subtitle streams in advance, for example with the "Subtitle Extract" plugin. In that case the extraction time drops significantly, depending on the number of subtitle streams in the file, which is 5-10 in many cases.

Signed-off-by: Attila Szakacs <szakacs.attila96@gmail.com>
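As a rough illustration of the single-pass idea, here is a minimal sketch of building one ffmpeg argument string with one output per subtitle stream. `SubtitleOutput` and `FfmpegArgsSketch` are hypothetical names, not types from the Jellyfin code base. The argument layout follows the `-map 0:{0} -an -vn -c:s {1} "{2}"` pattern that appears in this PR's diff, and quoting the input path is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical stand-in for the per-stream data the real code takes from MediaStream.
public sealed record SubtitleOutput(int StreamIndex, string Codec, string OutputPath);

public static class FfmpegArgsSketch
{
    // Builds a single ffmpeg command line that reads the input once and
    // writes every requested subtitle stream to its own output file.
    public static string BuildArgs(string inputPath, IReadOnlyList<SubtitleOutput> outputs)
    {
        var args = new StringBuilder();
        // Assumption: the input path is quoted here; the real code may quote it elsewhere.
        args.AppendFormat(CultureInfo.InvariantCulture, "-i \"{0}\" -copyts", inputPath);

        foreach (var output in outputs)
        {
            // Each "-map ... <file>" group adds one more output to the same ffmpeg run.
            args.AppendFormat(
                CultureInfo.InvariantCulture,
                " -map 0:{0} -an -vn -c:s {1} \"{2}\"",
                output.StreamIndex,
                output.Codec,
                output.OutputPath);
        }

        return args.ToString();
    }
}
```

With N subtitle streams this produces one ffmpeg invocation instead of N, so the container is read from disk once.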
Jellyfin has a tendency to choke the storage when there are tons of attachments. This usually happens with anime shows. I opened an old issue about this that can be used as a reference. If this fixes the above issue, a lot of people with slower storage and low-memory hardware will be happy.
Some minor formatting notes, cc @nyanmisaka for the substance of the changes
Force-pushed from cd1f290 to 005c215.
Thank you for the review! I have fixed all your comments, here's the diff:

diff --git a/MediaBrowser.MediaEncoding/Subtitles/SubtitleEncoder.cs b/MediaBrowser.MediaEncoding/Subtitles/SubtitleEncoder.cs
index 0e66565ed..3230927a6 100644
--- a/MediaBrowser.MediaEncoding/Subtitles/SubtitleEncoder.cs
+++ b/MediaBrowser.MediaEncoding/Subtitles/SubtitleEncoder.cs
@@ -472,8 +472,8 @@ namespace MediaBrowser.MediaEncoding.Subtitles
/// <returns>Task.</returns>
private async Task ExtractAllTextSubtitles(MediaSourceInfo mediaSource, CancellationToken cancellationToken)
{
- var semaphores = new List<SemaphoreSlim> { };
- var extractableStreams = new List<MediaStream> { };
+ var semaphores = new List<SemaphoreSlim>();
+ var extractableStreams = new List<MediaStream>();
try
{
@@ -498,9 +498,9 @@ namespace MediaBrowser.MediaEncoding.Subtitles
}
if (extractableStreams.Count > 0)
- {
- await ExtractAllTextSubtitlesInternal(mediaSource, extractableStreams, cancellationToken).ConfigureAwait(false);
- }
+ {
+ await ExtractAllTextSubtitlesInternal(mediaSource, extractableStreams, cancellationToken).ConfigureAwait(false);
+ }
}
catch (Exception ex)
{
@@ -521,7 +521,7 @@ namespace MediaBrowser.MediaEncoding.Subtitles
CancellationToken cancellationToken)
{
var inputPath = mediaSource.Path;
- var outputPaths = new List<string> { };
+ var outputPaths = new List<string>();
var args = string.Format(
CultureInfo.InvariantCulture,
"-i {0} -copyts",
@@ -531,6 +531,13 @@ namespace MediaBrowser.MediaEncoding.Subtitles
{
var outputPath = GetSubtitleCachePath(mediaSource, subtitleStream.Index, "." + GetTextSubtitleFormat(subtitleStream));
var outputCodec = IsCodecCopyable(subtitleStream.Codec) ? "copy" : "srt";
+ var streamIndex = EncodingHelper.FindIndex(mediaSource.MediaStreams, subtitleStream);
+
+ if (streamIndex == -1)
+ {
+ _logger.LogError("Cannot find subtitle stream index for {InputPath} ({Index}), skipping this stream", inputPath, subtitleStream.Index);
+ continue;
+ }
Directory.CreateDirectory(Path.GetDirectoryName(outputPath) ?? throw new FileNotFoundException($"Calculated path ({outputPath}) is not valid."));
@@ -538,7 +545,7 @@ namespace MediaBrowser.MediaEncoding.Subtitles
args += string.Format(
CultureInfo.InvariantCulture,
" -map 0:{0} -an -vn -c:s {1} \"{2}\"",
- subtitleStream.Index,
+ streamIndex,
outputCodec,
outputPath);
}
@@ -614,18 +621,15 @@ namespace MediaBrowser.MediaEncoding.Subtitles
{
_logger.LogError("ffmpeg subtitle extraction failed for {InputPath} to {OutputPath}", inputPath, outputPath);
failed = true;
+ continue;
}
- else
+ if (outputPath.EndsWith("ass", StringComparison.OrdinalIgnoreCase))
{
- if (outputPath.EndsWith("ass", StringComparison.OrdinalIgnoreCase))
- {
- await SetAssFont(outputPath, cancellationToken).ConfigureAwait(false);
- }
-
- _logger.LogInformation("ffmpeg subtitle extraction completed for {InputPath} to {OutputPath}", inputPath, outputPath);
+ await SetAssFont(outputPath, cancellationToken).ConfigureAwait(false);
+ }
+ _logger.LogInformation("ffmpeg subtitle extraction completed for {InputPath} to {OutputPath}", inputPath, outputPath);
}
}
- }
if (failed)
{
Force-pushed from 005c215 to ce81e2a.
🤦 Sorry, thanks!
Similar to jellyfin#10884

Extracting a media attachment is bottlenecked by disk I/O: ffmpeg has to read through the whole file, but there is usually nothing CPU-intensive to do. If a file has multiple media attachments and we want to extract several of them, extracting them one by one means reading the whole file again and again.

However, ffmpeg can extract multiple streams at once. We can optimize this by extracting all media attachments when only one of them is requested, so that the rest are already cached for later use.

Jellyfin clients need fonts for subtitles, and each font is a separate attachment, which causes a lot of re-reads of the file. Certain content, anime in many cases, contains 10-50 different attachments; reading the same 2-3 GB file 50 times from an HDD takes a significant amount of time. This change helps a lot in this scenario.

Signed-off-by: Attila Szakacs <szakacs.attila96@gmail.com>
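The "extract everything on the first request, then serve the rest from cache" pattern described above could be sketched roughly as follows. This is not the Jellyfin implementation: `ExtractOnceCache` and the `extractAll` delegate are hypothetical, the cache is kept in memory rather than on disk, and a per-file `SemaphoreSlim` is assumed as the way to keep concurrent requests for the same file from each starting their own extraction.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: the first request for any item of a media file extracts
// all items in one pass; later requests are answered from the cache.
public sealed class ExtractOnceCache
{
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();
    private readonly ConcurrentDictionary<string, IReadOnlyDictionary<int, string>> _cache = new();
    private readonly Func<string, CancellationToken, Task<IReadOnlyDictionary<int, string>>> _extractAll;

    // extractAll stands in for a single ffmpeg run that returns index -> extracted path.
    public ExtractOnceCache(Func<string, CancellationToken, Task<IReadOnlyDictionary<int, string>>> extractAll)
    {
        _extractAll = extractAll;
    }

    public async Task<string> GetAsync(string mediaPath, int index, CancellationToken cancellationToken = default)
    {
        // One gate per media file, so concurrent requests for the same file wait
        // for the first extraction instead of re-reading the file themselves.
        var gate = _locks.GetOrAdd(mediaPath, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            if (!_cache.TryGetValue(mediaPath, out var all))
            {
                all = await _extractAll(mediaPath, cancellationToken).ConfigureAwait(false);
                _cache[mediaPath] = all;
            }

            return all[index];
        }
        finally
        {
            gate.Release();
        }
    }
}
```

In this sketch a client asking for 50 fonts from the same file triggers one extraction pass; the other 49 requests only wait on the gate and then read from the cache.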
Similar to jellyfin#10884

Jellyfin clients need fonts for subtitles, and each font is a separate attachment, which causes a lot of re-reads of the file. Certain content, anime in many cases, contains 50-80 different attachments. Spawning 80 ffmpeg processes at the same time on the same file might cause swapping on slower HDDs and can bring the disk subsystem to a crawl. (For more info, see https://github.com/jellyfin/jellyfin/3215) This change helps a lot in this scenario.

Signed-off-by: Attila Szakacs <szakacs.attila96@gmail.com>
I was wondering whether there is a way to stream subtitles directly from the video file without extraction, or perhaps to extract a subtitle without reading the whole file. Do video players like VLC read the whole file to play subtitles?
For plain-text subtitles other than SSA/ASS, it is possible to pass the subtitles to the HLS player via HLS/m3u8. This requires additional effort on both the server and the clients.