Unable to extract OpenGraph title, certain playlist on webofstories.com #8417
Comments
i finally had some time to look into this myself: there's an issue extracting the title from the meta tags/OpenGraph data in the page. a page where this fails is http://www.webofstories.com/play/oliver.sacks/152

checking the html, i found that it's simply invalid (the attribute value contains unescaped quotes):

<meta property="og:title" content=""A Leg to Stand On"" />

my conclusion is that it's not a bug in youtube-dl and can't be fixed properly. one can't consistently parse non-html as html; anything to account for that would be a hack. whoever paid for the webofstories website should demand their money back. the developer doesn't even know html (there's other nastiness too, like class names with spaces, i.e. class="duration text").

the only hack i wouldn't be embarrassed to post here is:
in def _real_extract(self, url):

i'll try reporting this to the web of stories admins so they fix their html ;) and will do a hack locally just to get all the videos with meta information.
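for anyone who needs a local workaround in the meantime, here is a minimal sketch of reading og:title from markup with doubled quotes. it is not youtube-dl code and not the fix that later landed: the helper name extract_og_title and the pattern OG_TITLE_RE are hypothetical, and it assumes the property attribute comes before content, as it does on the page above.

```python
import re

# Hypothetical helper, not part of youtube-dl. Assumes the tag is written as
# <meta property="og:title" content="..." /> with property before content.
# The lazy group stops at the last quote before the tag closes, so stray
# unescaped quotes inside the value end up in the capture.
OG_TITLE_RE = re.compile(
    r'<meta\s+property="og:title"\s+content="(?P<title>.*?)"\s*/?>',
    re.IGNORECASE | re.DOTALL,
)


def extract_og_title(html):
    """Return the og:title value, or None if no matching tag is found."""
    match = OG_TITLE_RE.search(html)
    if match is None:
        return None
    # Drop the stray quotes the broken markup leaves around the value.
    return match.group('title').strip('"').strip()


if __name__ == '__main__':
    broken = '<meta property="og:title" content=""A Leg to Stand On"" />'
    print(extract_og_title(broken))  # A Leg to Stand On
```

this also strips quotes from a title that legitimately starts or ends with one, which is why it's a local hack rather than something to upstream.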
i think this should be closed.
This issue has been fixed, and the fix will be included in the next version of youtube-dl.
using 2016.02.01
here's the full output and the link it doesn't work on.