Subtitles of ARD/Das Erste are off by 10 hours #22331

Open
m4thu opened this issue Sep 7, 2019 · 3 comments

@m4thu m4thu commented Sep 7, 2019

Checklist

  • I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2019.09.01
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Verbose log

 youtube-dl -f best -v --all-subs --skip-download https://www.ardmediathek.de/ard/player/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC8zYmEwN2VkOC0xMzNjLTQ2MWQtYjZkOS1jZmNmYWZkNTExOGE/falscher-hase 
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-f', 'best', '-v', '--all-subs', '--skip-download', 'https://www.ardmediathek.de/ard/player/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC8zYmEwN2VkOC0xMzNjLTQ2MWQtYjZkOS1jZmNmYWZkNTExOGE/falscher-hase']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.09.01
[debug] Python version 3.7.4 (CPython) - Linux-5.2.11-arch1-1-ARCH-x86_64-with-arch
[debug] exe versions: ffmpeg 4.2.1, ffprobe 4.2.1, rtmpdump 2.4
[debug] Proxy map: {}
[ARDBetaMediathek] falscher-hase: Downloading webpage
[ARDBetaMediathek] Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC8zYmEwN2VkOC0xMzNjLTQ2MWQtYjZkOS1jZmNmYWZkNTExOGE: Downloading m3u8 information
[info] Writing video subtitles to: Falscher Hase-Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC8zYmEwN2VkOC0xMzNjLTQ2MWQtYjZkOS1jZmNmYWZkNTExOGE.de.ttml

Description

The extraction of the subtitles is not working as intended. The subtitles start at 10:00:00.000, while they should start at 00:00:00.000.

I quote from the .ttml file of Tatort: Falscher Hase:

<tt:p xml:id="C1" region="R1" style="S2" begin="10:00:00.000" end="10:00:02.200">
<tt:span style="S3">UNTERTITEL: Hessischer Rundfunk</tt:span>
</tt:p>
<tt:p xml:id="C2" region="R1" style="S2" begin="10:00:02.200" end="10:00:04.520">
<tt:span style="S3">UNTERTITEL: Hessischer Rundfunk</tt:span>
</tt:p>
<tt:p xml:id="C3" region="R1" style="S2" begin="10:00:06.520" end="10:00:10.000">
<tt:span style="S4">* Tatort-Titelmelodie </tt:span>
</tt:p>
<tt:p xml:id="C4" region="R1" style="S2" begin="10:00:35.400" end="10:00:38.080">
<tt:span style="S4">
melancholische klassische Musik *</tt:span>

After one hour of the movie has passed, the timestamps count up to 11:00 instead of 01:00:

<tt:p xml:id="C800" region="R1" style="S2" begin="10:59:56.560" end="10:59:58.000">
<tt:span style="S7">Falscher Hase.</tt:span>
</tt:p>
<tt:p xml:id="C801" region="R1" style="S2" begin="11:00:08.160" end="11:00:10.160">
<tt:span style="S7">Der ist ja ganz saftig.</tt:span>
</tt:p>
<tt:p xml:id="C802" region="R1" style="S2" begin="11:00:13.120" end="11:00:16.680">
<tt:span style="S7">Brix, willst du mal probieren?</tt:span>
<tt:br/>
<tt:span style="S7">Der ist sensationell saftig.</tt:span>

The subtitles can be loaded in a player such as VLC, but they are never actually shown, because the movie ends before ten hours have passed.


@adrianheine adrianheine commented Sep 7, 2019


@m4thu m4thu commented Sep 8, 2019

Yes, but where does the time information for the text come from? It is all off by ten hours.


@basicmaster basicmaster commented Oct 28, 2019

This is actually an issue on the broadcaster's side: the EBU-TT-D(-Basic-DE) profile in use, with a timebase of media, requires all timecodes to align with the video timeline, i.e. to start at zero.

The ten hours are a "magic value": on the production side, timecodes commonly start at 10:00:00 (instead of 00:00:00). This has historical reasons; it leaves room for timecodes before the actual content. For distribution, however, this should not be the case, as described above. The ARD Mediathek player seems to handle this case, i.e. it automatically subtracts ten hours when it occurs.

I will contact the relevant colleagues about this next week...
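
Until this is fixed at the source, a downloaded .ttml file could be corrected locally in the same way the player apparently does it. The following is only a minimal sketch, not part of youtube-dl: the function name `shift_ttml_timecodes` is hypothetical, and it assumes that every `begin`/`end` attribute uses the `HH:MM:SS.mmm` form seen in the quoted file and that the offset is exactly ten hours.

```python
import re

# Assumption: the offset is exactly the ten-hour production start timecode.
OFFSET_MS = 10 * 60 * 60 * 1000

# Matches begin="HH:MM:SS.mmm" and end="HH:MM:SS.mmm" as in the quoted file.
_TIME_ATTR = re.compile(r'\b(begin|end)="(\d{2,}):(\d{2}):(\d{2})\.(\d{3})"')


def _shift(match):
    attr, h, m, s, ms = match.groups()
    total = ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)
    total -= OFFSET_MS
    if total < 0:
        # Timecode is already below ten hours: leave it untouched.
        return match.group(0)
    h, rem = divmod(total, 3600000)
    m, rem = divmod(rem, 60000)
    s, ms = divmod(rem, 1000)
    return '%s="%02d:%02d:%02d.%03d"' % (attr, h, m, s, ms)


def shift_ttml_timecodes(ttml_text):
    """Return ttml_text with ten hours subtracted from begin/end timecodes."""
    return _TIME_ATTR.sub(_shift, ttml_text)
```

For example, `shift_ttml_timecodes(open(path, encoding='utf-8').read())` would return the file contents with cue C1 starting at 00:00:00.000. Working on the raw text with a regular expression, rather than re-serializing the XML, keeps the rest of the file byte-for-byte unchanged.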
