Make sure you are using the latest version: run `youtube-dl --version` and ensure your version is 2018.03.26.1. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.
Before submitting an issue make sure you have:
What is the purpose of your issue?
The Problem
The way the `@context` property of a JSON-LD string is checked is too strict. Currently `@context` has to be exactly the string `http://schema.org` in order to be parsed by the `_json_ld` function in `common.py`. However, some websites have `http` changed to `https`. I also saw websites adding a `/` to the end, like so: `http://schema.org/`.
As a result, the JSON-LD strings of those websites do not get parsed, which causes video extraction errors, since `_json_ld` then returns an empty dictionary. (See below for an example.)
Suggested Fix
In order to solve the problem, I suggest making the check for the `@context` property more resilient. Instead of checking against a static string...
https://github.com/rg3/youtube-dl/blob/5d60b9971784289acd4325a8ed7b5afd7bea05ca/youtube_dl/extractor/common.py#L1028
we should use a regex that allows the above-mentioned modifications of the string:
This small change already fixes the currently broken `gamestar.py` extractor. Additionally, no existing extractor can be affected in a negative way, i.e., no extractor will be broken by this change.
Please let me know if I should open a pull request myself.
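A relaxed check along these lines could be sketched as follows. The pattern and the helper name `is_schema_org_context` are assumptions for illustration, not the exact change to `common.py`. The sketch uses plain Python 3 `str` and accepts `http` or `https`, with or without a trailing slash.

```python
import re

# Assumed pattern: optional "s" after "http", optional trailing slash.
SCHEMA_ORG_CONTEXT_RE = re.compile(r'^https?://schema\.org/?$')


def is_schema_org_context(context):
    """Return True if context is a string naming the schema.org vocabulary.

    Hypothetical helper, not part of youtube-dl.
    """
    return (isinstance(context, str)
            and SCHEMA_ORG_CONTEXT_RE.match(context) is not None)


# The three variants described in this issue, plus an unrelated URL.
for candidate in ('http://schema.org', 'https://schema.org',
                  'http://schema.org/', 'https://example.com'):
    print(candidate, is_schema_org_context(candidate))
```

Anchoring the pattern at both ends keeps the check strict about everything except the scheme and the trailing slash, so unrelated contexts are still rejected.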
Example
An example of an extractor that fails because of unparsed JSON-LD is `gamestar.py`. Please note that the `KeyError` occurs only because the info_dict is empty, since the `self._json_ld` function didn't parse anything. Here, `https` is used instead of `http` in the `@context` property.
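The failure mode can be reproduced in isolation with a small sketch. Everything here is hypothetical: `parse_json_ld_strict` is a simplified stand-in for a parser with an exact-match context check, and the JSON-LD payload and the `title` key are made up for illustration rather than taken from the GameStar page.

```python
import json


def parse_json_ld_strict(json_ld_text):
    """Hypothetical stand-in for a parser that only accepts the exact
    context string 'http://schema.org'."""
    data = json.loads(json_ld_text)
    if data.get('@context') != 'http://schema.org':
        # Context not recognised: nothing is parsed.
        return {}
    return {'title': data.get('name')}


# Made-up payload using "https" in @context, as in the failing case.
page_json_ld = ('{"@context": "https://schema.org", '
                '"@type": "VideoObject", "name": "Some video"}')

info = parse_json_ld_strict(page_json_ld)

try:
    title = info['title']
except KeyError as err:
    # The lookup fails only because the dictionary is empty.
    failure = 'KeyError: %s' % err
    print(failure)
```

The error surfaces at the dictionary lookup, not at the context check, which is why the traceback points away from the actual cause.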