Subtitles no longer downloaded from www.pbs.org #18796

Open
himadri0327 opened this issue Jan 10, 2019 · 4 comments · May be fixed by #24430

Comments

@himadri0327 himadri0327 commented Jan 10, 2019

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2019.01.02. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • [x] I've verified and I assure that I'm running youtube-dl 2019.01.02

Before submitting an issue make sure you have:

  • [x] At least skimmed through the README, most notably the FAQ and BUGS sections
  • [x] Searched the bugtracker for similar issues including closed ones
  • [x] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • [x] Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

The following sections cover specific issue types; you can erase any section (the contents between triple ---) that does not apply to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add the -v flag to your command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

C:\temp> youtube-dl -v --list-subs https://www.pbs.org/video/sinking-cities-miami-bcdxzj/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '--list-subs', 'https://www.pbs.org/video/sinking-cities-miami-bcdxzj/']
[debug] Encodings: locale cp1251, fs mbcs, out cp65001, pref cp1251
[debug] youtube-dl version 2019.01.02
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.14393
[debug] exe versions: ffmpeg N-91260-g3a56ade1f3, ffprobe N-91260-g3a56ade1f3
[debug] Proxy map: {}
[debug] Using fake IP 3.157.152.16 (US) as X-Forwarded-For.
[pbs] Downloading JSON metadata
[pbs] sinking-cities-miami-bcdxzj: Downloading webpage
[pbs] sinking-cities-miami-bcdxzj: Downloading widget/partnerplayer page
[pbs] sinking-cities-miami-bcdxzj: Downloading portalplayer page
[pbs] sinking-cities-miami-bcdxzj: Downloading 0 video url info
[pbs] sinking-cities-miami-bcdxzj: Downloading m3u8 information
[pbs] sinking-cities-miami-bcdxzj: Downloading 1 video url info
[pbs] sinking-cities-miami-bcdxzj: Checking http-2000k video URL
[pbs] sinking-cities-miami-bcdxzj: http-2000k video URL is invalid, skipping
[pbs] sinking-cities-miami-bcdxzj: Checking http-6500k video URL
[pbs] sinking-cities-miami-bcdxzj: http-6500k video URL is invalid, skipping
[pbs] sinking-cities-miami-bcdxzj: Checking http-4500k video URL
[pbs] sinking-cities-miami-bcdxzj: http-4500k video URL is invalid, skipping
[pbs] sinking-cities-miami-bcdxzj: Checking http-3000k video URL
[pbs] sinking-cities-miami-bcdxzj: Checking http-1100k video URL
[pbs] sinking-cities-miami-bcdxzj: http-1100k video URL is invalid, skipping
[pbs] sinking-cities-miami-bcdxzj: Checking http-730k video URL
[pbs] sinking-cities-miami-bcdxzj: http-730k video URL is invalid, skipping
3018705799 has no subtitles
<end of log>

@appleton-tom appleton-tom commented Jan 19, 2019

A workaround:
I hope this gets fixed, because the method is a bit tedious.
It looks as if the subtitles are broken up into segments. If you save the JSON file (--write-info-json) and locate the correct m3u8 (try one with 1080p in the name), you can load the URL into VLC player via "Open Network Stream". The video should play and the subtitles should appear. If you only get video, try another m3u8. If you start a "URLSnooper" before opening VLC, you will find the address of the first segment, called "http://ga.video.cdn.pbs.org/videos/masterpiece/....captions0.vtt". The next is "...captions1.vtt", and so on. Save them all and join them together. Don't forget to edit out the extra WEBVTT header lines; each segment has one. Two episodes of about 1 hr 20 min each had about 30 caption segments.
I can't watch the PBS videos directly because I'm outside the USA, and the proxy sites I tried didn't work. It's possible that a single subtitle file would show up in URLSnooper.
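
The "join them together and edit out the extra WEBVTT header lines" step could be scripted roughly as follows. This is only a sketch: the function name merge_vtt_segments is made up for illustration, and it assumes the cue timestamps in each segment are already relative to the start of the whole video, so no re-timing is attempted.

```python
def merge_vtt_segments(segments):
    """Join WebVTT segment texts, keeping only the first header block.

    Assumption: cue timestamps are already absolute for the whole video,
    so the cues are concatenated without any re-timing.
    """
    merged = []
    for index, text in enumerate(segments):
        # Normalise line endings and drop a leading byte-order mark.
        text = text.replace("\r\n", "\n").lstrip("\ufeff").strip("\n")
        if index > 0 and text.startswith("WEBVTT"):
            # The header block runs up to the first blank line; skip it.
            _, _, text = text.partition("\n\n")
            text = text.strip("\n")
        if text:
            merged.append(text)
    return "\n\n".join(merged) + "\n"
```

Pass it the contents of captions0.vtt, captions1.vtt and so on, in order, and write the returned string to a single .vtt file.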

@himadri0327 himadri0327 commented Apr 28, 2019

The workaround from appleton-tom is correct.

Using the --write-info-json option and looking into the JSON file, every video-format section has two m3u8 URLs (url and manifest_url), and I see that:
(a) the manifest_url m3u8 is the same for every video format,
(b) the downloaded manifest_url m3u8 lists the m3u8 of every format as well as the captions m3u8.

URLs found in JSON

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-234p-145k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-270p-365k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-360p-730k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-432p-1100k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-540p-2000k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-720p-3000k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-mp4-720p-3000k.mp4
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-720p-4500k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-1080p-6500k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8

url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p-1080p-6500k.m3u8
manifest_url:
https://ga.video.cdn.pbs.org/videos/sinking-cities/b012d498-514e-4e58-b6fd-8a383478f2bb/2000070658/hd-16x9-mezzanine-1080p/uvqsl1fe_sink0104-16x9-1080p.m3u8
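
Observation (a) can be checked without reading the JSON by hand. A minimal sketch, assuming the file written by --write-info-json has a top-level "formats" list whose entries carry the url and manifest_url keys shown above; the helper names load_info and unique_manifest_urls are made up for illustration.

```python
import json


def load_info(path):
    """Read a file produced by --write-info-json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def unique_manifest_urls(info):
    """Return the distinct manifest_url values, in first-seen order."""
    seen = []
    for fmt in info.get("formats", []):
        manifest = fmt.get("manifest_url")
        if manifest and manifest not in seen:
            seen.append(manifest)
    return seen
```

For the listing above, unique_manifest_urls(load_info(path)) should come back with a single entry, since every format points at the same manifest.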

The manifest_url m3u8 "uvqsl1fe_sink0104-16x9-1080p.m3u8" contains the captions m3u8 and, again, the m3u8 of every video format:

#EXTM3U
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-VERSION:3
#EXT-X-MEDIA:URI="uvqsl1fe_sink0104-captions.m3u8",TYPE=SUBTITLES,GROUP-ID="subs",LANGUAGE="en",NAME="English",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,CHARACTERISTICS="public.accessibility.describes-music-and-sound,public.accessibility.transcribes-spoken-dialog"
#EXT-X-STREAM-INF:BANDWIDTH=2419521,AVERAGE-BANDWIDTH=2180501,RESOLUTION=960x540,CODECS="avc1.64001f,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-540p-2000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=7571585,AVERAGE-BANDWIDTH=6778267,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-1080p-6500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5320844,AVERAGE-BANDWIDTH=4734878,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-720p-4500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3503073,AVERAGE-BANDWIDTH=3202210,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-720p-3000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1380544,AVERAGE-BANDWIDTH=1260911,RESOLUTION=768x432,CODECS="avc1.64001e,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-432p-1100k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=927291,AVERAGE-BANDWIDTH=848084,RESOLUTION=640x360,CODECS="avc1.64001e,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-360p-730k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=507593,AVERAGE-BANDWIDTH=461249,RESOLUTION=480x270,CODECS="avc1.640015,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-270p-365k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=248412,AVERAGE-BANDWIDTH=233534,RESOLUTION=416x234,CODECS="avc1.64000c,mp4a.40.2",SUBTITLES="subs"
uvqsl1fe_sink0104-16x9-1080p-234p-145k.m3u8
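
The captions playlist is the #EXT-X-MEDIA entry with TYPE=SUBTITLES, and its URI is relative to the manifest URL. A rough sketch of pulling it out follows; this is not how the youtube-dl extractor does it, the function name find_subtitle_playlists is hypothetical, and the attribute parsing is a simple regex that only covers attribute lists shaped like the one above.

```python
import re
from urllib.parse import urljoin

# KEY=value pairs; a quoted value may itself contain commas.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
MEDIA_TAG = "#EXT-X-MEDIA:"


def find_subtitle_playlists(master_text, master_url):
    """Return (language, absolute URL) for each SUBTITLES rendition."""
    results = []
    for line in master_text.splitlines():
        if not line.startswith(MEDIA_TAG):
            continue
        attrs = {
            key: value.strip('"')
            for key, value in ATTR_RE.findall(line[len(MEDIA_TAG):])
        }
        if attrs.get("TYPE") == "SUBTITLES" and "URI" in attrs:
            # Resolve the relative URI against the manifest's own URL.
            url = urljoin(master_url, attrs["URI"])
            results.append((attrs.get("LANGUAGE"), url))
    return results
```

Call it with the text of the manifest and the manifest_url it was downloaded from; the result is a list because a manifest may declare more than one subtitle language.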

@chkuendig chkuendig commented Jan 22, 2020

It's actually pretty simple to automatically download and merge these subtitles, e.g. with ffmpeg:

ffmpeg -i https://ga.video.cdn.pbs.org/videos/frontline/389b0dcc-8df4-4901-8a81-8c01097373e2/2000151468/hd-16x9-mezzanine-1080p/00003808-captions.m3u8 -c:s srt test.srt

(The URL is from the manifest m3u8, as described above.)

@Veazer Veazer commented Feb 8, 2020

@chkuendig
I tried your method on the video I'm trying to grab, and it seems to fetch only the first .vtt segment:
ffmpeg -i https://ga.video.cdn.pbs.org/videos/frontline/8f769cee-b6d6-437d-80b5-74252ed7642d/2000150564/hd-16x9-mezzanine-1080p/00003717fes-captions.m3u8 -c:s srt test.srt

Full output:
srt-merge-test.txt
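
When ffmpeg stops after the first segment, one fallback is to read the captions m3u8 directly and fetch every segment it lists, then join them by hand as described earlier in the thread. The following is an untested sketch: segment_urls and fetch_segments are hypothetical helper names, it assumes every non-comment line of the playlist is a segment URI, and it does not handle geo-restriction or retries.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def segment_urls(playlist_text, playlist_url):
    """List absolute segment URLs from a media playlist's text."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments, not segments.
        if line and not line.startswith("#"):
            urls.append(urljoin(playlist_url, line))
    return urls


def fetch_segments(playlist_url):
    """Download the playlist, then each segment, as decoded text."""
    with urlopen(playlist_url) as response:
        playlist_text = response.read().decode("utf-8")
    texts = []
    for url in segment_urls(playlist_text, playlist_url):
        with urlopen(url) as response:
            texts.append(response.read().decode("utf-8"))
    return texts
```

fetch_segments returns the segment texts in playlist order, so the duplicate WEBVTT headers still have to be removed before the pieces are written to one file.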

@gesa gesa linked a pull request that will close this issue Mar 22, 2020
5 of 8 tasks complete