Wrong conversion from ttml to srt #14187

Closed
ghost opened this issue Sep 12, 2017 · 2 comments

@ghost ghost commented Sep 12, 2017

  • I've verified and I assure that I'm running youtube-dl 2017.09.11
  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

Log:

C:\Users\user>youtube-dl -u Username -p Password -v --write-sub --convert-subs srt --skip-download -o "~/Videos/%(title)s.%(ext)s" https://www.safaribooksonline.com/library/view/introduction-to-python/9781491904794/video171204.html
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', '-v', '--write-sub', '--convert-subs', 'srt', '--skip-download', '-o', '~/Videos/%(title)s.%(ext)s', 'https://www.safaribooksonline.com/library/view/introduction-to-python/9781491904794/video171204.html']
[debug] Encodings: locale cp1255, fs mbcs, out cp862, pref cp1255
[debug] youtube-dl version 2017.09.11
[debug] Python version 3.4.4 - Windows-10-10.0.15063
[debug] exe versions: ffmpeg N-87043-gf0f4888, ffprobe N-87043-gf0f4888
[debug] Proxy map: {}
[safari] Downloading login form
[safari] Logging in as Username 
[safari] 9781491904794/video171204: Downloading webpage
[safari] 9781491904794/video171204: Downloading kaltura session JSON
[Kaltura] 9781491904794-video171204: Downloading webpage
[Kaltura] 0_ox5ujlby: Downloading video info JSON
[Kaltura] 0_ox5ujlby: Downloading m3u8 information
[debug] Default format spec: bestvideo+bestaudio/best
[info] Writing video subtitles to: C:\Users\user\Videos\What Can You Do With Python.en.ttml
...
<end of log>

Description of your issue, suggested solution and other information

The converted srt is malformed, which causes problems in both VLC and PSPlayer.
Provided are the original ttml, the malformed srt produced by youtube-dl with ffmpeg, and a correctly formatted srt produced by the online converter at https://gotranscript.com/subtitle-converter

Here are the subtitle files:

Subtitles.zip

Here are the first few lines of the subs if there's a problem opening the attached zip above:

Original ttml:

<tt xmlns="http://www.w3.org/2006/04/ttaf1" xmlns:tts="http://www.w3.org/2006/04/ttaf1#styling" xml:lang="en">
  <head>
    <styling>
      <style id="1" tts:fontSize="14" tts:textAlign="center" tts:wrapOption="wrap"/>
    </styling>
  </head>
  <body>
    <div xml:lang="en">
      <p begin="00:00:00.00" end="00:00:02.76" style="1">
        
      </p>
      <p begin="00:00:02.76" end="00:00:05.08" style="1">
        I can say, unabashedly,<br/>
        that Python
      </p>
      <p begin="00:00:05.08" end="00:00:06.83" style="1">
        is my favorite<br/>
        programming language.
      </p>
      <p begin="00:00:06.83" end="00:00:08.56" style="1">
        I use it every day at work.
      </p>
      <p begin="00:00:08.56" end="00:00:10.10" style="1">
        It's also the primary<br/>
        language that I
      </p>
      <p begin="00:00:10.10" end="00:00:12.81" style="1">
        use for most of my side<br/>
        projects, my open source
      </p>
      <p begin="00:00:12.81" end="00:00:13.70" style="1">
        projects.
      </p>

    </div>
  </body>
</tt>
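
For reference, a minimal sketch of reading cues from a TTML file like the one above with Python's standard library. This is not youtube-dl's implementation; `cue_text` and `read_cues` are hypothetical helper names, and the sketch only handles `<br/>` children inside `<p>`, which is all this sample contains.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sample file above (older TTAF1 draft namespace).
NS = "{http://www.w3.org/2006/04/ttaf1}"

SAMPLE = """<tt xmlns="http://www.w3.org/2006/04/ttaf1" xml:lang="en">
  <body>
    <div xml:lang="en">
      <p begin="00:00:02.76" end="00:00:05.08" style="1">
        I can say, unabashedly,<br/>
        that Python
      </p>
    </div>
  </body>
</tt>"""


def cue_text(p):
    """Join the text of a <p>, turning each <br/> into a single line break."""
    parts = [p.text or ""]
    for child in p:
        if child.tag == NS + "br":
            parts.append("\n")
        # Text following the child element; other child types are ignored here.
        parts.append(child.tail or "")
    # Drop the source file's indentation and any empty lines.
    lines = (line.strip() for line in "".join(parts).splitlines())
    return "\n".join(line for line in lines if line)


def read_cues(ttml):
    """Return a list of (begin, end, text) tuples, one per <p> element."""
    root = ET.fromstring(ttml)
    return [(p.get("begin"), p.get("end"), cue_text(p)) for p in root.iter(NS + "p")]


print(read_cues(SAMPLE))
```

Stripping each line before joining is what keeps the XML indentation and the newline after `<br/>` out of the cue text.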

Wrong format srt:

1
00:00:00,000 --> 00:00:02,759
<font size="14">
        
      </font>

2
00:00:02,759 --> 00:00:05,080
<font size="14">
        I can say, unabashedly,

        that Python
      </font>

3
00:00:05,080 --> 00:00:06,830
<font size="14">
        is my favorite

        programming language.
      </font>

4
00:00:06,830 --> 00:00:08,560
<font size="14">
        I use it every day at work.
      </font>

5
00:00:08,560 --> 00:00:10,099
<font size="14">
        It's also the primary

        language that I
      </font>

6
00:00:10,099 --> 00:00:12,810
<font size="14">
        use for most of my side

        projects, my open source
      </font>

7
00:00:12,810 --> 00:00:13,699
<font size="14">
        projects.
      </font>
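
Besides the markup, some timestamps above are one millisecond off from the TTML values (for example, end="00:00:02.76" becomes 00:00:02,759). Below is a minimal sketch of a conversion that avoids this, assuming the drift comes from floating-point handling of the fractional seconds (not confirmed here). `ttml_to_srt_time` is a hypothetical helper, not part of youtube-dl, and it handles only the `HH:MM:SS.fraction` clock form seen in this file.

```python
def ttml_to_srt_time(ts):
    """Convert 'HH:MM:SS.ff' to SRT's 'HH:MM:SS,mmm' using only integers."""
    hms, _, frac = ts.partition(".")
    hours, minutes, seconds = (int(x) for x in hms.split(":"))
    # Pad or truncate the fraction to exactly three digits (milliseconds).
    millis = int((frac + "000")[:3])
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)


print(ttml_to_srt_time("00:00:02.76"))  # 00:00:02,760
print(ttml_to_srt_time("00:00:10.10"))  # 00:00:10,100
```

Working on the digit string directly means the value never passes through a float, so there is nothing to truncate.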

Correct format srt:

1
00:00:00,000 --> 00:00:02,760


2
00:00:02,760 --> 00:00:05,080
I can say, unabashedly,
that Python

3
00:00:05,080 --> 00:00:06,830
is my favorite
programming language.

4
00:00:06,830 --> 00:00:08,560
I use it every day at work.

5
00:00:08,560 --> 00:00:10,100
It's also the primary
language that I

6
00:00:10,100 --> 00:00:12,810
use for most of my side
projects, my open source

7
00:00:12,810 --> 00:00:13,700
projects.
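
As a stopgap, the malformed cues can be cleaned up after conversion. The sketch below is one possible workaround, not a fix in youtube-dl or ffmpeg; `clean_cue_text` is a hypothetical helper that takes the text of a single cue, removes the `<font>` tags, and drops the leftover indentation and empty lines.

```python
import re

# Matches opening and closing <font> tags, with or without attributes.
FONT_TAG = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)


def clean_cue_text(raw):
    """Strip <font> tags, indentation, and empty lines from one cue's text."""
    without_tags = FONT_TAG.sub("", raw)
    lines = (line.strip() for line in without_tags.splitlines())
    return "\n".join(line for line in lines if line)


raw = (
    '<font size="14">\n'
    "        I can say, unabashedly,\n"
    "\n"
    "        that Python\n"
    "      </font>"
)
print(clean_cue_text(raw))
```

Removing the empty lines matters because SRT uses a blank line as the separator between cues, so a blank line inside a cue can cut it short in some players.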


@dstftw dstftw closed this Sep 12, 2017
@dstftw dstftw added the duplicate label Sep 12, 2017
@dstftw dstftw (Collaborator) commented Sep 12, 2017

Duplicate of #7908.

@ghost ghost (Author) commented Sep 13, 2017
