YouTube sub-titles and captions. #1875

Closed
ProgramErgoSum opened this issue Dec 2, 2013 · 6 comments

Comments

@ProgramErgoSum ProgramErgoSum commented Dec 2, 2013

How does youtube-dl differentiate between sub-titles and captions? The output of the command that lists sub-titles seems to treat them as different things.

nsubrahm@nsubrahm:~$ youtube-dl --list-subs 1V9wVmO0Tfg
[youtube] Setting language
[youtube] 1V9wVmO0Tfg: Downloading video webpage
[youtube] 1V9wVmO0Tfg: Downloading video info webpage
[youtube] 1V9wVmO0Tfg: Extracting video information
WARNING: video doesn't have subtitles
[youtube] 1V9wVmO0Tfg: Looking for automatic captions
[youtube] 1V9wVmO0Tfg: Downloading webpage
[youtube] 1V9wVmO0Tfg: Available subtitles for video: 
[youtube] 1V9wVmO0Tfg: Available automatic captions for video: vi,el,eo,en,af,zh-Hans,sw,ca,gu,iw,zh-Hant,cs,cy,ar,mk,ga,eu,et,az,id,es,ru,gl,nl,pt,la,lo,jv,sv,lv,lt,th,tr,it,ro,is,fil,ta,yi,be,fr,bg,ceb,sl,hr,bn,de,ht,da,fa,hmn,hi,bs,fi,hu,ja,uk,ka,te,sr,sq,no,ko,kn,km,ur,sk,mt,pl,ms,mr

The video does have sub-titles, as I can download them using the Google2SRT tool (http://google2srt.sourceforge.net/). However, Google2SRT seems to download them from http://video.google.com.

So, can the sub-titles, captions, or timed text for a given YouTube video be downloaded using youtube-dl?
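
For readers who want to tell the two lists in the `--list-subs` output apart in a script, here is a minimal sketch. It assumes the line format shown in the output above, which may differ in other youtube-dl versions; `parse_list_subs` is a hypothetical helper, not part of youtube-dl, and the sample text is an abbreviated copy of the listing.

```python
import re

# Matches the two "Available ... for video:" lines printed by --list-subs.
# Assumption: the format is exactly as in the output quoted in this issue.
LISTING = re.compile(
    r"^\[youtube\] (?P<id>[\w-]+): Available "
    r"(?P<kind>subtitles|automatic captions) for video:\s*(?P<langs>.*)$"
)


def parse_list_subs(output):
    """Split --list-subs output into manual subtitles and automatic captions."""
    result = {"subtitles": [], "automatic captions": []}
    for line in output.splitlines():
        match = LISTING.match(line.strip())
        if match:
            langs = match.group("langs").strip()
            # An empty list means nothing of that kind is available.
            result[match.group("kind")] = langs.split(",") if langs else []
    return result


# Abbreviated sample based on the output above (language list shortened).
sample = """\
[youtube] 1V9wVmO0Tfg: Looking for automatic captions
[youtube] 1V9wVmO0Tfg: Available subtitles for video:
[youtube] 1V9wVmO0Tfg: Available automatic captions for video: vi,el,eo,en
"""

parsed = parse_list_subs(sample)
print(parsed["subtitles"])
print(parsed["automatic captions"])
```

With this sample, the subtitles list is empty and the automatic captions list holds the four language codes, which mirrors what the listing above reports for this video.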

Contributor

@phihag phihag commented Dec 2, 2013

As you can see in the following screenshot, the video has only automatic captions. Since those are of inferior quality and come from a different technical source, youtube-dl currently distinguishes automatic captions from manually generated subtitles (we may change that to avoid user confusion). Google2SRT does not make that distinction and downloads the automatically generated subtitles as well. Note the incorrect transcription of "moving away from Earth" as "moving away from our", which appears in the web player, in the file downloaded by Google2SRT, and in the automatic captions downloaded by youtube-dl.

[Screenshot: captions]

Pass --write-auto-sub to write out automatically generated subtitle files as well.
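
As a rough illustration of the flag in use, the sketch below assembles a youtube-dl command line for the video from this issue. Only `--write-auto-sub` comes from the comment above; combining it with `--sub-lang` and `--skip-download` is an assumption about what a reader might want (one language, captions only), and `build_auto_sub_command` is a hypothetical helper name.

```python
import shlex


def build_auto_sub_command(video_id, lang="en"):
    """Return an argv list that asks youtube-dl for automatic captions only."""
    return [
        "youtube-dl",
        "--write-auto-sub",   # flag named in the comment above
        "--sub-lang", lang,   # assumption: limit output to one language
        "--skip-download",    # assumption: do not download the video itself
        video_id,
    ]


command = build_auto_sub_command("1V9wVmO0Tfg")
# Print the command in copy-pasteable shell form.
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` on a machine where youtube-dl is installed and network access is available.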

@phihag phihag closed this Dec 2, 2013
Author

@ProgramErgoSum ProgramErgoSum commented Dec 3, 2013

Thank you for the explanation.

Just curious: how does youtube-dl distinguish automatic captions from manually generated sub-titles? Also, I think you should keep this distinction. Perhaps show a message saying whether the item being downloaded is an automatic caption or a manually generated sub-title; in my opinion, the distinction should be made.

Contributor

@phihag phihag commented Dec 3, 2013

Technically, the URLs for automatic captions and manually generated subtitles are different, so youtube-dl doesn't have to do anything special to distinguish the two.

Author

@ProgramErgoSum ProgramErgoSum commented Dec 4, 2013

Thanks. One last question.

By "automatically generated subtitles files" (written by --write-auto-sub), are you referring to the result of automatic translation? I guess what I am asking is: is there a way YouTube indicates that a sub-title was provided by a real human and not by an API (or service)?

Contributor

@phihag phihag commented Dec 4, 2013

YouTube uses (automatic) voice recognition to recognize text in the original language, and then uses automatic translations to generate text in all other languages. Apart from the programmers, there are no humans involved (that's kind of Google's thing). It's up to the uploader how the non-auto subtitles are generated, but typically these are created by humans.

Author

@ProgramErgoSum ProgramErgoSum commented Dec 4, 2013

Thank you, Philipp!
