[gamespot] Add support for video embeds in article pages(was: RegexNotFoundError: Unable to extract data video) #14652

Closed
Hrxn opened this issue Nov 1, 2017 · 16 comments

@Hrxn Hrxn commented Nov 1, 2017

  • I've verified and I assure that I'm running youtube-dl 2017.10.29

  • At least skimmed through the README, most notably the FAQ and BUGS sections

  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

GameSpot.com extraction is broken, presumably due to a site change.

Example URL:
https://www.gamespot.com/articles/the-last-of-us-2-receives-new-ps4-trailer/1100-6454469/


PS D:\> youtube-dl --verbose --ignore-config "https://www.gamespot.com/articles/the-last-of-us-2-receives-new-ps4-trailer/1100-6454469/"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', '--ignore-config', 'https://www.gamespot.com/articles/the-last-of-us-2-receives-new-ps4-trailer/1100-6454469/']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2017.10.29
[debug] Python version 3.4.4 - Windows-10-10.0.16299
[debug] exe versions: ffmpeg 3.4, ffprobe 3.4
[debug] Proxy map: {}
[GameSpot] 6454469: Downloading webpage
ERROR: Unable to extract data video; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\YoutubeDL.py", line 784, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\extractor\common.py", line 434, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\extractor\gamespot.py", line 44, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\extractor\common.py", line 797, in _search_regex
youtube_dl.utils.RegexNotFoundError: Unable to extract data video; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

PS D:\>
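The traceback above ends in a required regex search that found nothing. A minimal sketch of that failure mode follows, assuming the player markup changed so that the attribute the extractor looked for is gone. The function `search_required`, the exception class, the pattern, and both HTML snippets are hypothetical stand-ins, not the extractor's actual code.

```python
import re


class RegexNotFound(Exception):
    """Raised when a required pattern does not match (illustrative stand-in)."""


def search_required(pattern, text, name):
    # Return the first capture group, or fail loudly with the field name,
    # mirroring the shape of the "Unable to extract <name>" message above.
    match = re.search(pattern, text)
    if match is None:
        raise RegexNotFound('Unable to extract %s' % name)
    return match.group(1)


# Hypothetical markup: the old page exposes data-video, the new one does not.
old_page = '<div class="player" data-video="{&quot;id&quot;: 1}"></div>'
new_page = '<div class="player" data-embed="/videos/embed/1/"></div>'

pattern = r'data-video="([^"]+)"'
print(search_required(pattern, old_page, 'data video'))
try:
    search_required(pattern, new_page, 'data video')
except RegexNotFound as exc:
    print('ERROR: %s' % exc)
```

A site change of this kind breaks extraction without any network error, which would be consistent with the page downloading fine and the failure appearing only afterwards.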

@Hrxn Hrxn commented Nov 1, 2017

Addendum

More luck with this URL
https://www.gamespot.com/videos/the-last-of-us-part-ii-pgw-2017-trailer/2300-6441610/

This is the Share link from within the player controls.

But:
It's only 720p, and has some kind of advertising at the beginning, fuck yeah.


The Embed link does not work:

PS E:\Test> youtube-dl --verbose --ignore-config "https://www.gamespot.com/videos/embed/6441610/"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', '--ignore-config', 'https://www.gamespot.com/videos/embed/6441610/']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2017.10.29
[debug] Python version 3.4.4 - Windows-10-10.0.16299
[debug] exe versions: ffmpeg 3.4, ffprobe 3.4
[debug] Proxy map: {}
[generic] 6441610: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 6441610: Downloading webpage
[generic] 6441610: Extracting information
ERROR: Unsupported URL: https://www.gamespot.com/videos/embed/6441610/
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\YoutubeDL.py", line 784, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\extractor\common.py", line 434, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmpk63cqkyt\build\youtube_dl\extractor\generic.py", line 3059, in _real_extract
youtube_dl.utils.UnsupportedError: Unsupported URL: https://www.gamespot.com/videos/embed/6441610/

PS E:\Test>

But:
The embed page apparently has a 1080p version.

PS E:\Test>
>> curl.exe -s "https://www.gamespot.com/videos/embed/6441610/" |
>> pup -p "div.js-video-player-new.av-video-player.av-desktop-player.av-video-on-demand.is-vid-noseek.is-vid-show-controls.is-vid-uvpjs-player attr{data-video}" |
>> jq '.videoStreams .adaptive_hd'
>>
"https://gamespot-vh.akamaihd.net/i/d5/2017/10/30/Trailer_LastofUs_PGW_20171030_,4000,.mp4.csmil/master.m3u8"
PS E:\Test>

Edit:

Single line:

curl.exe -s "https://www.gamespot.com/videos/embed/6441610/" | pup -p "div.js-video-player-new.av-video-player.av-desktop-player.av-video-on-demand.is-vid-noseek.is-vid-show-controls.is-vid-uvpjs-player attr{data-video}" | jq '.videoStreams .adaptive_hd'
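The same lookup can be sketched in Python with only the standard library. The attribute name `data-video` and the keys `videoStreams` and `adaptive_hd` come from the command above; the sample HTML and its URL are made up for illustration, and the real page may be structured differently.

```python
import json
from html.parser import HTMLParser


class DataVideoParser(HTMLParser):
    """Collects the first data-video attribute found on any tag."""

    def __init__(self):
        super().__init__()
        self.data_video = None

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive with entities such as &quot; already unescaped.
        if self.data_video is None:
            for key, value in attrs:
                if key == 'data-video' and value:
                    self.data_video = value


def adaptive_hd_url(html):
    # Returns the adaptive_hd stream URL, or None if it is not present.
    parser = DataVideoParser()
    parser.feed(html)
    if parser.data_video is None:
        return None
    data = json.loads(parser.data_video)
    return data.get('videoStreams', {}).get('adaptive_hd')


# Made-up sample markup; only the attribute and key names match the thread.
sample = (
    '<div class="js-video-player-new" '
    'data-video="{&quot;videoStreams&quot;: '
    '{&quot;adaptive_hd&quot;: &quot;https://example.invalid/master.m3u8&quot;}}">'
    '</div>'
)
print(adaptive_hd_url(sample))
```

Matching on the attribute alone, rather than on the long chain of CSS classes, keeps the lookup from depending on class names that may change.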

@Hrxn Hrxn commented Nov 11, 2017

@remitamine I think this issue has been mentioned in the changelog/release notes, so what's the status on this? Are there still fixes in the pipeline?


@remitamine remitamine commented Nov 11, 2017

python __main__.py https://www.gamespot.com/videos/embed/6441610/
[GameSpot] 6441610: Downloading webpage
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Checking http-2000 video format URL
[GameSpot] 6441610: Checking http-4400 video format URL
[GameSpot] 6441610: Checking http-3000 video format URL
[GameSpot] 6441610: Checking http-1200 video format URL
[GameSpot] 6441610: Checking http-764 video format URL
[GameSpot] 6441610: Checking http-512 video format URL
[GameSpot] 6441610: Checking http-264 video format URL
[download] Destination: The Last Of Us Part II - PGW 2017 Trailer-gs-2300-6441610.mp4
[download] 100% of 172.31MiB in 06:14
python __main__.py https://www.gamespot.com/videos/the-last-of-us-part-ii-pgw-2017-trailer/2300-6441610/
[GameSpot] 6441610: Downloading webpage
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Checking http-2000 video format URL
[GameSpot] 6441610: Checking http-4400 video format URL
[GameSpot] 6441610: Checking http-3000 video format URL
[GameSpot] 6441610: Checking http-1200 video format URL
[GameSpot] 6441610: Checking http-764 video format URL
[GameSpot] 6441610: Checking http-512 video format URL
[GameSpot] 6441610: Checking http-264 video format URL
[download] The Last Of Us Part II - PGW 2017 Trailer-gs-2300-6441610.mp4 has already been downloaded
[download] 100% of 172.31MiB

Support for video and embed pages has been fixed. Video embeds in article pages are not supported yet; it's easy to add support for them, but I'm not working on it for now.

@remitamine remitamine added the easy label Nov 11, 2017
@remitamine remitamine changed the title from "[gamespot] RegexNotFoundError: Unable to extract data video" to "[gamespot] Add support for video embeds in article pages(was: RegexNotFoundError: Unable to extract data video)" Nov 11, 2017

@Hrxn Hrxn commented Nov 11, 2017

Isn't that the video with an additional ~15 s of advertising at the beginning?

Edit:

Yes.

PS E:\Test> youtube-dl --ignore-config --verbose "https://www.gamespot.com/videos/embed/6441610/"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ignore-config', '--verbose', 'https://www.gamespot.com/videos/embed/6441610/']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2017.11.06
[debug] Python version 3.4.4 - Windows-10-10.0.16299
[debug] exe versions: ffmpeg 3.4, ffprobe 3.4
[debug] Proxy map: {}
[GameSpot] 6441610: Downloading webpage
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Checking http-2000 video format URL
[GameSpot] 6441610: Checking http-4400 video format URL
[GameSpot] 6441610: Checking http-3000 video format URL
[GameSpot] 6441610: Checking http-1200 video format URL
[GameSpot] 6441610: Checking http-764 video format URL
[GameSpot] 6441610: Checking http-512 video format URL
[GameSpot] 6441610: Checking http-264 video format URL
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'http://once.unicornmedia.com/now/media/progressive/f35627fb-e0cc-43bb-ac38-859bab1493ff/0df57330-950f-4947-9946-833815cf6612/468fb310-a585-11e4-bfdb-005056837bc7/6441610/content.mp4'
[download] The Last Of Us Part II - PGW 2017 Trailer-gs-2300-6441610.mp4 has already been downloaded
[download] 100% of 164.29MiB
PS E:\Test> MediaInfo '.\The Last Of Us Part II - PGW 2017 Trailer-gs-2300-6441610.mp4' | sls "Duration|Width|Height"

Duration                                 : 5 min 13 s
Duration                                 : 5 min 12 s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Duration                                 : 5 min 13 s
Duration                                 : 5 min 12 s
Duration                                 : 5 min 13 s


PS E:\Test>

Using the URL extracted from the embed page above:

PS E:\Test> youtube-dl --ignore-config --verbose --hls-prefer-native "https://gamespot-vh.akamaihd.net/i/d5/2017/10/30/Trailer_LastofUs_PGW_20171030_,4000,.mp4.csmil/master.m3u8"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ignore-config', '--verbose', '--hls-prefer-native', 'https://gamespot-vh.akamaihd.net/i/d5/2017/10/30/Trailer_LastofUs_PGW_20171030_,4000,.mp4.csmil/master.m3u8']
[debug] Encodings: locale cp1252, fs mbcs, out cp65001, pref cp1252
[debug] youtube-dl version 2017.11.06
[debug] Python version 3.4.4 - Windows-10-10.0.16299
[debug] exe versions: ffmpeg 3.4, ffprobe 3.4
[debug] Proxy map: {}
[generic] master: Requesting header
WARNING: Could not send HEAD request to https://gamespot-vh.akamaihd.net/i/d5/2017/10/30/Trailer_LastofUs_PGW_20171030_,4000,.mp4.csmil/master.m3u8: HTTP Error 405: Method Not Allowed
[generic] master: Downloading webpage
[generic] master: Downloading m3u8 information
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://gamespot-vh.akamaihd.net/i/d5/2017/10/30/Trailer_LastofUs_PGW_20171030_,4000,.mp4.csmil/index_0_av.m3u8'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 30
[download] Destination: master-master.mp4
[download] 100% of 141.07MiB in 00:35
[debug] ffmpeg command line: ffprobe -show_streams "file:master-master.mp4"
[ffmpeg] Fixing malformed AAC bitstream in "master-master.mp4"
[debug] ffmpeg command line: ffmpeg -y -i "file:master-master.mp4" -c copy -f mp4 "-bsf:a" aac_adtstoasc "file:master-master.temp.mp4"
PS E:\Test> MediaInfo .\master-master.mp4 | sls "duration|width|height"

Duration                                 : 4 min 57 s
Duration                                 : 4 min 57 s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Duration                                 : 4 min 57 s


PS E:\Test>
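For context, a master playlist lists one or more variant streams, and the downloader picks one of them. The sketch below parses such a playlist and picks the variant with the greatest height. The sample playlist, its attribute values, and the URIs `low.m3u8` and `high.m3u8` are invented for illustration and do not reproduce the playlist fetched above.

```python
import re

# Matches KEY=value pairs, keeping quoted values (which may contain commas) whole.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_master_playlist(text):
    """Return a list of (attributes, uri) pairs from an HLS master playlist."""
    variants = []
    pending = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('#EXT-X-STREAM-INF:'):
            attrs = line.split(':', 1)[1]
            pending = {k: v.strip('"') for k, v in ATTR_RE.findall(attrs)}
        elif line and not line.startswith('#') and pending is not None:
            # The first non-comment line after the tag is the variant's URI.
            variants.append((pending, line))
            pending = None
    return variants


def height(attrs):
    # RESOLUTION is WIDTHxHEIGHT; treat a missing value as 0.
    _, _, h = attrs.get('RESOLUTION', '0x0').partition('x')
    return int(h)


# Invented sample playlist.
sample = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,CODECS="avc1.4d001f,mp4a.40.2"
low.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3839000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
high.m3u8
"""
variants = parse_master_playlist(sample)
best_attrs, best_uri = max(variants, key=lambda v: height(v[0]))
print(best_uri, best_attrs['RESOLUTION'])
```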

@remitamine remitamine commented Nov 11, 2017

No, I just watched the downloaded video and it doesn't contain any ads. You can try it yourself with version 2017.11.06, which contains the fix for video pages.
At the beginning there are 30 seconds unrelated to the trailer, but they are part of the video and can't be removed automatically.

I think it can be removed.


@Hrxn Hrxn commented Nov 11, 2017

Call of Duty: WWII ad at the beginning 😄

And 720p vs. 1080p, obviously.


@remitamine remitamine commented Nov 11, 2017

python __main__.py -F https://www.gamespot.com/videos/the-last-of-us-part-ii-pgw-2017-trailer/2300-6441610/
[GameSpot] 6441610: Downloading webpage
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Downloading m3u8 information
[GameSpot] 6441610: Checking http-2000 video format URL
[GameSpot] 6441610: Checking http-4400 video format URL
[GameSpot] 6441610: Checking http-3000 video format URL
[GameSpot] 6441610: Checking http-1200 video format URL
[GameSpot] 6441610: Checking http-764 video format URL
[GameSpot] 6441610: Checking http-512 video format URL
[GameSpot] 6441610: Checking http-264 video format URL
[info] Available formats for gs-2300-6441610:
format code  extension  resolution note
hls-264      mp4        256x144     264k , avc1.42001e, mp4a.40.5
http-264     mp4        256x144     264k , avc1.42001e, mp4a.40.5
hls-512      mp4        384x216     512k , avc1.42001e, mp4a.40.5
http-512     mp4        384x216     512k , avc1.42001e, mp4a.40.5
hls-764      mp4        480x270     764k , avc1.42001e, mp4a.40.2
http-764     mp4        480x270     764k , avc1.42001e, mp4a.40.2
hls-778      mp4        640x360     778k 
hls-1055     mp4        640x360    1055k 
hls-1200     mp4        640x360    1200k , avc1.42001f, mp4a.40.2
http-1200    mp4        640x360    1200k , avc1.42001f, mp4a.40.2
hls-1796     mp4        960x540    1796k 
hls-2000     mp4        960x540    2000k , avc1.4d001f, mp4a.40.2
http-2000    mp4        960x540    2000k , avc1.4d001f, mp4a.40.2
hls-2444     mp4        1280x720   2444k 
hls-3000     mp4        1280x720   3000k , avc1.4d001f, mp4a.40.2
http-3000    mp4        1280x720   3000k , avc1.4d001f, mp4a.40.2
hls-3097     mp4        1280x720   3097k 
hls-3839     mp4        1920x1080  3839k 
hls-4400     mp4        1280x720   4400k , avc1.640028, mp4a.40.2
http-4400    mp4        1280x720   4400k , avc1.640028, mp4a.40.2 (best)

The 1080p format is not selected because the 720p format has a higher bitrate. The ads are only included in the http formats.
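The effect can be illustrated with a simplified model of the selection. This is a sketch, not youtube-dl's actual sorting logic: it assumes the highest bitrate wins and that later entries win ties, which reproduces the `(best)` marker in the table. The rows are a subset copied from the table above.

```python
# Subset of the table above: (format code, width, height, total bitrate in kbit/s)
FORMATS = [
    ('hls-3000', 1280, 720, 3000),
    ('http-3000', 1280, 720, 3000),
    ('hls-3839', 1920, 1080, 3839),
    ('hls-4400', 1280, 720, 4400),
    ('http-4400', 1280, 720, 4400),
]


def best_by_bitrate(formats):
    # Simplified model (assumption): highest bitrate wins, later entries win ties.
    best = formats[0]
    for fmt in formats[1:]:
        if fmt[3] >= best[3]:
            best = fmt
    return best


def best_by_height_without_http(formats):
    # Drop the http-* formats (the ones reported to carry ads),
    # then prefer resolution and use bitrate as the tie-breaker.
    candidates = [f for f in formats if not f[0].startswith('http-')]
    return max(candidates, key=lambda f: (f[2], f[3]))


print(best_by_bitrate(FORMATS)[0])
print(best_by_height_without_http(FORMATS)[0])
```

On the command line, a specific row can be requested by its format code, for example `-f hls-3839` for the 1080p entry listed above.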


@remitamine remitamine commented Nov 11, 2017

In the next release, article pages will be supported and formats that have ads will be skipped.


@Hrxn Hrxn commented Nov 12, 2017

Awesome, thanks a lot!

remitamine added a commit that referenced this issue Nov 13, 2017

@07416 07416 commented Dec 2, 2018

Were reviews ever supported?

youtube-dl https://www.gamespot.com/reviews/gears-of-war-review/1900-6161188/

[generic] 1900-6161188: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 1900-6161188: Downloading webpage
[generic] 1900-6161188: Extracting information
ERROR: Unsupported URL: https://www.gamespot.com/reviews/gears-of-war-review/1900-6161188/

@Hrxn Hrxn commented Dec 2, 2018

Nope, looks like that URL format is not supported..

Can you access the same vid with another URL? Like, something along the lines of https://www.gamespot.com/videos/...........?


@remitamine remitamine commented Dec 2, 2018

You can download this video using the share URL (https://www.gamespot.com/videos/gears-of-war-video-review/2300-6161200/ or https://www.gamespot.com/videos/embed/6161200/). Downloading from review URLs will be supported in the next version.
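The URL forms that have come up in this thread can be told apart with a small classifier. The patterns below are hypothetical, derived only from the URLs quoted here; they are not the extractor's actual URL-matching expression.

```python
import re

# Hypothetical patterns based only on the URLs quoted in this thread.
EMBED_RE = re.compile(
    r'^https?://www\.gamespot\.com/videos/embed/(?P<id>\d+)/?$')
PAGE_RE = re.compile(
    r'^https?://www\.gamespot\.com/(?P<kind>articles|videos|reviews)/'
    r'[^/]+/(?P<prefix>\d+)-(?P<id>\d+)/?$')


def classify(url):
    """Return (form, numeric id) for a recognised URL, else None."""
    match = EMBED_RE.match(url)
    if match:
        return 'embed', match.group('id')
    match = PAGE_RE.match(url)
    if match:
        return match.group('kind'), match.group('id')
    return None


for url in (
    'https://www.gamespot.com/videos/gears-of-war-video-review/2300-6161200/',
    'https://www.gamespot.com/videos/embed/6161200/',
    'https://www.gamespot.com/reviews/gears-of-war-review/1900-6161188/',
    'https://www.gamespot.com/gears-of-war/videos/',
):
    print(classify(url), url)
```

Note that the video page and the embed page share the same numeric ID (6161200), while the review page carries a different one (6161188), so a review URL cannot simply be rewritten into a video URL.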


@07416 07416 commented Dec 2, 2018

> Can you access the same vid with another URL? Like, something along the lines of https://www.gamespot.com/videos/...........?

Yes, I found the URL eventually via the share menu.
https://www.gamespot.com/videos/gears-of-war-video-review/2300-6182341/

Initially I used Google to search under https://www.gamespot.com/gears-of-war/videos/ and found the URL https://www.gamespot.com/videos/gears-of-war-video-review/2300-6182341/, which fails and redirects to the PC review.


@07416 07416 commented Dec 2, 2018

@remitamine Yes, I successfully downloaded the video.

youtube-dl https://www.gamespot.com/videos/gears-of-war-video-review/2300-6161200/


@07416 07416 commented Dec 2, 2018

This page fails (from the PC version review): youtube-dl https://www.gamespot.com/videos/gears-of-war-video-review/2300-6182341/:

[GameSpot] 6182341: Downloading webpage
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[debug] System config: []
[debug] User config: ['-o', 'C:/Users/user/Downloads/youtube-dl/%(title)s.%(ext)s']
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.gamespot.com/videos/gears-of-war-video-review/2300-6182341/']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2018.11.23
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.17134
[debug] exe versions: ffmpeg 4.1, ffprobe 4.1
[debug] Proxy map: {}
[GameSpot] 6182341: Downloading webpage
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bh3thhm\build\youtube_dl\YoutubeDL.py", line 792, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bh3thhm\build\youtube_dl\extractor\common.py", line 508, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bh3thhm\build\youtube_dl\extractor\gamespot.py", line 127, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bh3thhm\build\youtube_dl\extractor\common.py", line 1292, in _sort_formats
youtube_dl.utils.ExtractorError: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Perhaps the age-gated videos aren't supported?


@remitamine remitamine commented Dec 2, 2018

> Perhaps the age-gated videos aren't supported?

No, that's not it; it just requires more changes to the extractor. You can download the video using https://gamespot-pdl.akamaized.net/d3/gsc/2007/11/169_gearsofwar_vr_01_pc_110507_hr.mp4.
