TwistedUmbrellaX edited this page Dec 9, 2024 · 32 revisions

Analyzes the audio of television episodes to detect and skip over intros.

System requirements

  • The latest Jellyfin 10.x release
  • Jellyfin's fork of ffmpeg
    • Docker jellyfin/jellyfin / linuxserver/jellyfin: preinstalled
    • Windows / Linux native installs: jellyfin-ffmpeg package
      • 10.10 => jellyfin-ffmpeg7
      • 10.9 => jellyfin-ffmpeg6
      • 10.8 => jellyfin-ffmpeg5
    • macOS native installs: packaged for 10.9+ on macOS 12+
    • Gentoo Linux native installs: enable xarblu-overlay and install media-video/jellyfin-ffmpeg

Limitations

  • SyncPlay is not (yet) compatible with any method of skipping because of how the clients are kept in sync.

Detection parameters

Show introductions will be detected if they are:

  • Located within the first 25% of an episode or the first 10 minutes, whichever is smaller
  • Between 15 seconds and 2 minutes long

Ending credits will be detected if they are between 15 seconds and 5 minutes long.

Movie credits (Jellyfin 10.9+) will be detected if they are less than 15 minutes long.

These parameters can be configured in the plugin settings under Modify Segment Parameters.
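
As a rough illustration of the default introduction rules above, the following Python sketch checks a candidate segment against them. The function names are hypothetical, and the reading that the whole introduction must fall inside the search window is an assumption; this is not the plugin's actual code.

```python
def search_window(episode_seconds: float,
                  max_fraction: float = 0.25,
                  max_seconds: float = 600.0) -> float:
    """Length of the region searched for an intro.

    Defaults mirror the rules above: the first 25% of the episode or
    the first 10 minutes, whichever is smaller.
    """
    return min(episode_seconds * max_fraction, max_seconds)


def is_plausible_intro(start: float, end: float, episode_seconds: float,
                       min_length: float = 15.0,
                       max_length: float = 120.0) -> bool:
    """Check a candidate intro (in seconds) against the default limits."""
    length = end - start
    if start < 0 or not (min_length <= length <= max_length):
        return False
    # Assumption: the entire intro must sit inside the search window.
    return end <= search_window(episode_seconds)
```

For a 24-minute episode (1440 seconds) the window is 360 seconds; for an hour-long episode the 10-minute cap applies instead.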

Detection types

  • Chapter

  • Detection uses chapter markers placed in the file by the encoder. In many cases these chapters are named, so their predefined start and end points can be used instead of processing the video. This is the fastest method, but it relies on the files containing meaningful chapter names.

  • Chromaprint

  • Chromaprint is an audio fingerprinting library, used here through FFmpeg, that allows comparing audio between files to find matching segments. By comparing multiple files, recurring audio clips can be detected and processed to find likely matches for intros, credits, and other repeated audio.

  • Black frame

  • The blackframe filter in FFmpeg is a video filter that detects black (or nearly black) frames within a video stream. The filter is independent of how the video is decoded, whether in hardware or software; it only analyzes the decoded frames for a specific characteristic (in this case, blackness). This analysis is always done on the CPU.

  • Silence

  • Silence detection is exactly what it sounds like: it finds an extended period of complete (or nearly complete) silence, meaning there is no dialog or music.
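
To make the chapter, black frame, and silence methods more concrete, here is a minimal Python sketch of how each could be driven with standard ffprobe and ffmpeg options. The title pattern, the function names, and the silence threshold values are assumptions chosen for illustration (the blackframe values shown are FFmpeg's documented defaults); the plugin's own invocations and matching rules may differ.

```python
import re

# Hypothetical title pattern; the plugin's real chapter-name rules may differ.
INTRO_TITLE = re.compile(r"\b(intro|introduction|opening|op)\b", re.IGNORECASE)


def chapter_probe_args(path):
    """ffprobe arguments that print a file's chapter list as JSON."""
    return ["ffprobe", "-v", "error", "-show_chapters", "-of", "json", path]


def find_intro_chapter(probe_json):
    """Return (start, end) in seconds of the first intro-like chapter, or None."""
    for chapter in probe_json.get("chapters", []):
        title = chapter.get("tags", {}).get("title", "")
        if INTRO_TITLE.search(title):
            return float(chapter["start_time"]), float(chapter["end_time"])
    return None


def black_frame_args(path, amount=98, threshold=32):
    """ffmpeg arguments that log frames where `amount`% of pixels are near black."""
    return ["ffmpeg", "-hide_banner", "-i", path, "-an",
            "-vf", f"blackframe=amount={amount}:threshold={threshold}",
            "-f", "null", "-"]


def silence_args(path, noise_db=-50, min_seconds=0.5):
    """ffmpeg arguments that log stretches of audio quieter than `noise_db`."""
    return ["ffmpeg", "-hide_banner", "-i", path, "-vn",
            "-af", f"silencedetect=noise={noise_db}dB:duration={min_seconds}",
            "-f", "null", "-"]


def parse_silences(log):
    """Pair up silence_start / silence_end values from silencedetect log output."""
    starts = [float(v) for v in re.findall(r"silence_start:\s*(-?[\d.]+)", log)]
    ends = [float(v) for v in re.findall(r"silence_end:\s*(-?[\d.]+)", log)]
    return list(zip(starts, ends))
```

The argument lists can be passed to `subprocess.run`; both filters report their findings in ffmpeg's log output (stderr), which is why the output is sent to the `null` muxer.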

Analysis

The methods explained above are the building blocks of analysis, but the analysis itself is a bit more complex. A frequent question is why the plugin seems to finish too quickly when analysis is run manually. The reason is that some of the information it needs has already been collected.

Chromaprint is the slowest analysis method, but it also has the most room for future optimization. The process creates markers, called fingerprints, from the audio of each episode. These fingerprints are then compared to find a span of audio that is repeated across multiple files. The start and end points of that span are recorded as the timestamps.
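
The comparison step can be pictured with a small Python sketch. It assumes each fingerprint is a list of 32-bit integers (the shape of Chromaprint's raw output), treats two points as matching when they differ in only a few bits, and slides one fingerprint across the other to find the longest run of matching points. The function name, the bit-error threshold, and the brute-force search are illustrative assumptions; the plugin's real comparison is more elaborate.

```python
def longest_shared_run(fp_a, fp_b, max_bit_errors=6):
    """Find the longest run of near-identical points shared by two fingerprints.

    Returns (length, start_in_a, start_in_b), measured in fingerprint points.
    """
    best = (0, 0, 0)
    # shift = (index in fp_a) - (index in fp_b) for the current alignment.
    for shift in range(-(len(fp_b) - 1), len(fp_a)):
        run = 0
        first = max(0, shift)
        last = min(len(fp_a), len(fp_b) + shift)
        for i in range(first, last):
            j = i - shift
            # Hamming distance: number of bits that differ between the points.
            differing_bits = bin(fp_a[i] ^ fp_b[j]).count("1")
            if differing_bits <= max_bit_errors:
                run += 1
                if run > best[0]:
                    best = (run, i - run + 1, j - run + 1)
            else:
                run = 0
    return best


# Toy data: the same four-point "intro" sits at different offsets in each
# episode, with a one-bit difference in one point of the second copy.
episode_a = [0xFFFFFFFF, 0x00000000,
             0x0F0F0F0F, 0xF0F0F0F0, 0x00FF00FF, 0xFF00FF00]
episode_b = [0x3333CCCC, 0xCCCC3333, 0x55555555,
             0x0F0F0F0F, 0xF0F0F0F1, 0x00FF00FF, 0xFF00FF00]

print(longest_shared_run(episode_a, episode_b))  # (4, 2, 3)
```

Each fingerprint point covers a fixed slice of audio, so the returned indices convert to start and end timestamps by multiplying by that slice duration.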

When you change settings, what changes is the timestamps and how they are determined. As long as the media file itself remains the same, fingerprints from past analysis are still valid. This means that even if you change how a timestamp is identified, the Chromaprint fingerprinting step is skipped, saving all of that time.
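
The effect of reusing fingerprints can be sketched in Python as a cache in front of the expensive step. Everything here is hypothetical: the class, the `media_key` (imagined as something that changes only when the file changes, such as path plus size and modification time), and the stand-in analysis are illustrations of the idea, not the plugin's storage format.

```python
class FingerprintCache:
    """Remembers fingerprints so the slow step runs once per unchanged file."""

    def __init__(self, fingerprint_fn):
        self._fingerprint_fn = fingerprint_fn  # the expensive Chromaprint step
        self._store = {}
        self.computed = 0  # how many times the expensive step actually ran

    def get(self, media_key):
        if media_key not in self._store:
            self._store[media_key] = self._fingerprint_fn(media_key)
            self.computed += 1
        return self._store[media_key]


def analyze(cache, media_keys, min_points):
    """Re-derive results from cached fingerprints under the given settings.

    Stand-in logic: a real analysis would compare fingerprints between
    episodes; here a file simply "passes" if its fingerprint is long enough.
    """
    return {key: len(cache.get(key)) >= min_points for key in media_keys}


# Fake fingerprinting function for demonstration purposes only.
cache = FingerprintCache(lambda key: [0] * 100)
episodes = ["s01e01", "s01e02"]

first = analyze(cache, episodes, min_points=50)    # fingerprints computed here
second = analyze(cache, episodes, min_points=150)  # new settings, cache reused
```

After both runs `cache.computed` is still 2, one per episode: changing the setting altered the results but did not trigger new fingerprinting.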

Analysis flows into Media Segments in one direction only. Since timestamps are the result of analysis, editing timestamps simply replaces the existing results with your own values. These values will not improve analysis, but they will not be replaced unless the cache is cleared for the edited episodes (or no cache exists, as with recaps and previews). They are also the values used to generate Media Segments.

Media Segments are essentially identical to timestamps, but stored in a different database. Media Segments are overwritten by timestamps, so it is better to edit the timestamps to avoid losing data. Converting a timestamp into a segment takes only as long as the server needs to retrieve the data from the plugin and record it locally.
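
The one-way relationship can be shown with a toy Python model in which both stores are plain dictionaries mapping an episode to a (start, end) pair. The function name and data layout are hypothetical, and leaving episodes without a timestamp untouched is an assumption of this sketch.

```python
def regenerate_segments(timestamps, segments):
    """Overwrite stored segments with the plugin's timestamps (one-way)."""
    updated = dict(segments)
    # Assumption: episodes that have no timestamp are left as they were.
    updated.update(timestamps)
    return updated


timestamps = {"s01e01": (30.0, 90.0)}

# Editing the segment directly: the change is lost on the next regeneration.
segments = {"s01e01": (32.0, 92.0)}
segments = regenerate_segments(timestamps, segments)

# Editing the timestamp instead: the change carries through to the segment.
timestamps["s01e01"] = (32.0, 92.0)
segments_after_edit = regenerate_segments(timestamps, segments)
```

In the first case the hand-edited segment reverts to (30.0, 90.0); in the second the edited timestamp becomes the segment.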