[Bug]: Video Ends Early While Audio Continues Repeating in Decrypted Files #215

oijm17 · 2024-04-16T06:18:56Z

What happened?

Description
The script downloads the audio and video files for each lesson and decrypts them correctly. However, when multiplexing, the issue arises because the script concatenates the audio four times within the same track in the video container, instead of just once. This behavior differs from the video track, which is multiplexed correctly with just one iteration.

The result is a single decrypted video file with two tracks: the first is the video track (which works correctly), and the second is the audio track, containing the original decrypted audio but concatenated four times. This means that when playing the file, the video stops after the expected duration, but the audio continues playing three more times, even though the video has stopped.

Example to Illustrate the Issue:
If an encrypted lesson has a total duration of 5 minutes, the script, after decrypting and multiplexing, creates a video file with a duration of 20 minutes. After the first 5 minutes, the video stops because it's complete, but the audio starts again. The audio restarts at the 5, 10, and 15 minute marks and ends at 20 minutes, because the audio track is concatenated four times.

This problem only occurs with encrypted lessons and seems to affect all DRM-based courses. I have tested this with three different courses and encountered the same result.

Desktop:
Python: v3.9.1

Expected Result

The script should download, decrypt, and multiplex the audio and video files for each lesson correctly. The resulting video file should contain one track for the video and one track for the audio, each with the expected duration and without repetition or errors.

When playing the final video file, both the video and audio should end at the same time, without any repeated or redundant audio tracks.

Branch

master/main

What operating systems are you seeing the problem on?

Alma Linux 8.9, Windows

Relevant log output

No response

Other information

No response

FrancoStino · 2024-06-03T11:27:08Z

Confirmed.

thebetauser · 2024-06-08T19:07:52Z

Confirmed as well. I have tested it on 2 encrypted files under 5 minutes and both have the same issue.

Edit:
As a temporary hotfix, I have added the -shortest flag to the mux_process function so it cuts at the shortest stream (usually the video). It works, but I will wait for an official fix to be pushed.
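
For readers who want to try the same workaround, here is a minimal sketch of what passing -shortest to ffmpeg during muxing could look like. The helper names build_mux_command and mux, and the exact argument list, are assumptions for illustration; the real mux_process in main.py may be structured differently. Only the -shortest flag itself is taken from the comment above.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that muxes one video file and one audio file.

    Hypothetical helper: the project's own mux_process may use other options.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", audio_path,
        "-c", "copy",   # copy both streams without re-encoding
        "-shortest",    # stop writing once the shortest stream ends
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run the mux command and return ffmpeg's exit code."""
    command = build_mux_command(video_path, audio_path, output_path)
    return subprocess.run(command).returncode
```

Note that this only trims the symptom: the repeated audio is still downloaded and decrypted, it is just cut off when the video ends.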

thebetauser · 2024-06-15T01:53:17Z

@Puyodead1 Will there be an official fix for this or should I create a PR with my changes?

Puyodead1 · 2024-06-15T09:43:37Z

Go ahead and make a PR.


auoie · 2024-06-16T13:45:04Z

It seems like yt-dlp isn't able to distinguish between the different audio segments in the .mpd files. In contrast, ffmpeg can differentiate between them. I went into the ./temp folder and ran:

yt-dlp --allow-unplayable-formats --enable-file-urls -F file://$(pwd)/index_${ID}.mpd

It displays the following tracks:

ID EXT RESOLUTION │  TBR PROTO │ VCODEC       VBR ACODEC     ABR ASR MORE INFO
────────────────────────────────────────────────────────────────────────────────────────────────────
7  m4a audio only │  64k dash  │ audio only       mp4a.40.5  64k 44k [eng] DRM, DASH audio, m4a_dash
1  mp4 640x360    │  85k dash  │ avc1.4D401E  85k video only         DRM, DASH video, mp4_dash
2  mp4 640x360    │ 124k dash  │ avc1.4D401E 124k video only         DRM, DASH video, mp4_dash
3  mp4 768x432    │ 191k dash  │ avc1.4D401E 191k video only         DRM, DASH video, mp4_dash
4  mp4 1024x576   │ 283k dash  │ avc1.4D401F 283k video only         DRM, DASH video, mp4_dash
5  mp4 1280x720   │ 408k dash  │ avc1.4D401F 408k video only         DRM, DASH video, mp4_dash
6  mp4 1920x1080  │ 788k dash  │ avc1.4D4028 788k video only         DRM, DASH video, mp4_dash

I also ran:

ffmpeg -i index_${ID}.mpd

It displays the following tracks:

Input #0, dash, from 'index_9807122.mpd':
  Duration: 00:02:12.00, start: 0.000000, bitrate: 2 kb/s
  Program 0
  Stream #0:0: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 81 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 84954
      id              : 1
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:1: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 120 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 124412
      id              : 2
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:2: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 768x432 [SAR 1:1 DAR 16:9], 186 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 191129
      id              : 3
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:3: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1024x576 [SAR 1:1 DAR 16:9], 275 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 282741
      id              : 4
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:4: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 398 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 408096
      id              : 5
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:5: Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 771 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      variant_bitrate : 788192
      id              : 6
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:6(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 7
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:7(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 8
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:8(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 9
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:9(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 10
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:10(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 11
    Side data:
      unknown side data type 24 (1085 bytes)
  Stream #0:11(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 47 channels, fltp, 62 kb/s (default)
    Metadata:
      variant_bitrate : 64139
      id              : 12
    Side data:
      unknown side data type 24 (1085 bytes)

yt-dlp identifies 6 video tracks and 1 audio track. ffmpeg identifies 6 video tracks and 6 audio tracks.
These 6 audio tracks are identical. I used yt-dlp and ffmpeg to download the audio:

ffmpeg \
  -loglevel verbose \
  -i index_${ID}.mpd \
  -map 0:p:0:6 \
  -c copy \
  ffmpeg.m4a # 1.05MB
yt-dlp \
  -f 7 \
  --allow-unplayable-formats \
  --enable-file-urls \
  file://$(pwd)/index_${ID}.mpd \
  -o yt-dlp.m4a # 6.78MB

Basically, yt-dlp downloads all 6 audio tracks and combines them into one. It seems to be unable to download just a single audio track. ffmpeg can download a single track, but it's slow. So one possible fix is to parse the .mpd file with an XML parser, delete all of the audio <Representation/> elements except for one, and then use yt-dlp for downloading. This should also make downloading faster, since the same audio is no longer downloaded multiple times.
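
A rough sketch of that manifest-trimming idea, using Python's standard xml.etree.ElementTree. It assumes the manifest uses the standard DASH namespace (urn:mpeg:dash:schema:mpd:2011), that audio is marked through a mimeType or contentType attribute, and that all audio representations in a period are duplicates, as observed above. The function names are made up for illustration and this has not been tested against a real Udemy manifest.

```python
import xml.etree.ElementTree as ET

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"


def _is_audio(adaptation_set, representation):
    """Return True if either element is marked as audio."""
    for element in (representation, adaptation_set):
        mime = element.get("mimeType", "")
        content_type = element.get("contentType", "")
        if mime.startswith("audio/") or content_type == "audio":
            return True
    return False


def keep_first_audio_representation(mpd_text):
    """Drop every audio Representation after the first one in each Period."""
    ET.register_namespace("", MPD_NS)  # keep the default namespace on output
    root = ET.fromstring(mpd_text)
    for period in root.iter(f"{{{MPD_NS}}}Period"):
        seen_audio = False
        for adaptation_set in list(period.findall(f"{{{MPD_NS}}}AdaptationSet")):
            for rep in list(adaptation_set.findall(f"{{{MPD_NS}}}Representation")):
                if not _is_audio(adaptation_set, rep):
                    continue
                if seen_audio:
                    adaptation_set.remove(rep)
                seen_audio = True
            # Remove adaptation sets that have no Representation left.
            if adaptation_set.find(f"{{{MPD_NS}}}Representation") is None:
                period.remove(adaptation_set)
    return ET.tostring(root, encoding="unicode")
```

The trimmed text could then be written back to the index_${ID}.mpd file before handing it to yt-dlp. Elements from other namespaces (for example the DRM content-protection tags) are kept, but ElementTree may rename their prefixes on output. If a course ships audio in several languages, the filter would need to keep one representation per language instead.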

Puyodead1 · 2024-06-16T14:20:05Z

What's strange is that I don't have this issue while some others do. Also, maybe yt-dlp is filtering out the other audio tracks because they are duplicates? I don't know. If that's not the case, this should be reported to the yt-dlp devs.

oijm17 added the bug Something isn't working label Apr 16, 2024

oijm17 assigned Puyodead1 Apr 16, 2024

oijm17 changed the title ~~[Bug]: Audio stream is joined 4 times to the resulting file~~ [Bug]: Video Ends Early While Audio Continues Repeating in Decrypted Files Apr 23, 2024

thebetauser added a commit to thebetauser/udemy-downloader that referenced this issue Jun 16, 2024

Update main.py

1c0bca9

Added -shortest flag to force the shortest stream during multiplexing. This resolves issue Puyodead1#215 on linux.

This was referenced Jun 16, 2024

Update main.py thebetauser/udemy-downloader#1

Merged

fix for audio overrun during multiplexing #226

Merged

auoie mentioned this issue Jun 17, 2024

_merge_mpd_periods merges fragments from formats in the same period if the formats are functionally identical yt-dlp/yt-dlp#10200

Closed
