Auto generated subtitles on YouTube #24257

Open
Wyndhamcrow opened this issue Mar 5, 2020 · 3 comments

@Wyndhamcrow Wyndhamcrow commented Mar 5, 2020

Checklist

  • [] I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2020.03.06
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Verbose log

```
$ youtube-dl --write-auto-sub -v https://youtu.be/XIoB939gWmk
[debug] System config: []
[debug] User config: ['-o', '/data/data/com.termux/files/home/storage/downloads/%(extractor_key)s/%(uploader)s/%(title)s-%(id)s.%(ext)s']
[debug] Custom config: []
[debug] Command-line args: ['--write-auto-sub', '-v', 'https://youtu.be/XIoB939gWmk']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2020.03.06
[debug] Python version 3.8.1 (CPython) - Linux-4.4.153-perf+-aarch64-with-libc
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2
[debug] Proxy map: {}
[youtube] XIoB939gWmk: Downloading webpage
[youtube] XIoB939gWmk: Downloading embed webpage
[youtube] XIoB939gWmk: Refetching age-gated info webpage
[youtube] XIoB939gWmk: Looking for automatic captions
WARNING: Couldn't find automatic captions for XIoB939gWmk
[youtube] XIoB939gWmk: Downloading MPD manifest
[debug] Default format spec: bestvideo+bestaudio/best
WARNING: Requested formats are incompatible for merge and will be merged into mkv.
[download] /data/data/com.termux/files/home/storage/downloads/Youtube/80's 90's movies/Party Camp-XIoB939gWmk.mkv has already been downloaded and merged
```

## Description

Hi, on Android using Termux, auto-generated subtitles no longer download after the update. They worked fine before the update. Subs are available on the video.

@IEWbgfnYDwHRoRRSKtkdyMDUzgdwuBYgDKtDJWd

I have been trying to find a workaround for this for days now. Saw you posted this days ago and was hoping it would have a fix by now.

The issue for me is with YouTube livestream videos. youtube-dl will not recognize the auto subs until the video has finished streaming AND YouTube has converted it to a non-HLS video container, which usually takes hours for longer videos on YouTube's end.

Have you found any useful way to grab them in the interim?

@l0ophole l0ophole commented Jun 13, 2020

I found a way to download them in XML format by using the Developer Tools in Google Chrome. Open the YouTube video in Google Chrome, pause the video, and click the CC button to enable the subtitles. Press F12 and click the Network tab, then right-click the item that starts with "timedtext" and click Copy followed by Copy link address.

Now paste the copied link into any download manager to save it. I just used "wget" and it worked, but if you're using Windows you may have to either use a download manager or download the Windows version of wget. It's a command-line program, and to download the link that you copied you just open a command prompt and type: wget "the link you copied previously"

Make sure you put the link in quotes, because it contains special characters.
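
If wget or a download manager isn't at hand, the same download can be sketched with Python's standard library. This is only an illustration, not something from the thread: `save_url` is a made-up helper name, and the URL you pass in is whatever link you copied from the Network tab.

```python
import shutil
import urllib.request


def save_url(url, dest):
    """Download `url` and write the raw response body to the file `dest`.

    `save_url` is a hypothetical helper for illustration only.
    """
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling something like `save_url("the link you copied previously", "subtitles.xml")` would then save the captions next to the script. Because the link is an ordinary Python string here, the shell-quoting concern does not apply.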

@l0ophole l0ophole commented Jun 13, 2020

I used this series of piped commands to convert the XML to plaintext. It works in Linux, but YMMV in Windows.

grep utf8 subtitles.xml | cut -c15- | tr -d ",\"'" | sed 's/\n//' | xargs

Replace subtitles.xml with the name of your subtitles file.

If you want to save the plaintext version to a file you can do this:

grep utf8 subtitles.xml | cut -c15- | tr -d ",\"'" | sed 's/\n//' | xargs >/path/to/new/file.txt
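
For anyone without grep and friends, here is a rough Python sketch of the same conversion. It rests on assumptions: the grep for utf8 suggests the saved file is really JSON with the caption text stored under "utf8" keys, so the sketch collects those values, and it falls back to reading plain XML text nodes if the file is not JSON. The function names are made up for illustration. Unlike the pipeline, it keeps commas and quotes.

```python
import json
import xml.etree.ElementTree as ET


def collect_utf8(node):
    """Recursively yield every string stored under a "utf8" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "utf8" and isinstance(value, str):
                yield value
            else:
                yield from collect_utf8(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_utf8(item)


def captions_to_text(raw):
    """Return the caption text in `raw` as one whitespace-normalised line."""
    try:
        # Assumed case: JSON with the text under "utf8" keys.
        pieces = list(collect_utf8(json.loads(raw)))
    except ValueError:
        # Not JSON: treat the content as XML and take all text nodes.
        pieces = list(ET.fromstring(raw).itertext())
    # Join the pieces and collapse newlines and repeated spaces.
    return " ".join(" ".join(pieces).split())
```

To use it, read the downloaded file with `open("subtitles.xml", encoding="utf-8").read()`, pass the result to `captions_to_text`, and print or save what comes back.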
