DiscoveryGo while using cookies #11219

Closed
battlecattle opened this issue Nov 17, 2016 · 20 comments

@battlecattle battlecattle commented Nov 17, 2016

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like that [x])
  • Use the Preview tab to see how your issue will actually look

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2016.11.18. If it's not, read this FAQ entry and update. Issues with outdated versions will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2016.11.18

Before submitting an issue make sure you have:

  • At least skimmed through README and most notably FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add -v flag to your command line you run youtube-dl with, copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

C:\>youtube-dl.exe --restrict-filenames --continue --no-check-certificate --verbose --cookies cookies.txt https://www.discoverygo.com/gold-rush/misery-on-the-mountain/

[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--restrict-filenames', '--continue', '--no-check-certificate', '--verbose', '--cookies', 'cookies.txt', 'https://www.discoverygo.com/gold-rush/misery-on-the-mountain/']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2016.11.18
[debug] Python version 3.4.4 - Windows-10-10.0.14393
[debug] exe versions: ffmpeg N-79209-gb3eda69, ffprobe N-79209-gb3eda69
[debug] Proxy map: {}
[DiscoveryGo] misery-on-the-mountain: Downloading webpage
ERROR: misery-on-the-mountain: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 68762 (char 68761)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 68762 (char 68761)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 68762 (char 68761)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\YoutubeDL.py", line 694, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 357, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\discoverygo.py", line 53, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 563, in _parse_json
youtube_dl.utils.ExtractorError: misery-on-the-mountain: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 68762 (char 68761)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

If the purpose of this issue is a site support request please provide all kinds of example URLs support for which should be included (replace following example URLs by yours):


Description of your issue, suggested solution and other information

I had been using this method without issue, but it recently stopped working. I am in the US and have no trouble watching the video at the link provided.

@dstftw dstftw (Collaborator) commented Nov 17, 2016

Re-export cookies and try again.

@battlecattle battlecattle (Author) commented Nov 17, 2016

I have re-exported multiple times, and just tried again and received the same errors.

@dstftw dstftw mentioned this issue Nov 18, 2016
2 of 6 tasks complete

@StevenDTX StevenDTX commented Nov 20, 2016

I can confirm this issue as well. I exported my cookies twice.

E:\>youtube-dl.exe --restrict-filenames --continue --no-check-certificate --verbose --cookies e:\cookies\cookies-DISCGO.txt -o Gold.Rush.S07E06.mp4 "https://www.discoverygo.com/gold-rush/no-crane-no-gain/"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--restrict-filenames', '--continue', '--no-check-certificate', '--verbose', '--cookies', 'e:\\cookies\\cookies-DISCGO.txt', '-o', 'Gold.Rush.S07E06.mp4', 'https://www.discoverygo.com/gold-rush/no-crane-no-gain/']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2016.11.18
[debug] Python version 3.4.4 - Windows-10-10.0.10586
[debug] exe versions: ffmpeg N-72383-g7206b94, ffprobe N-72383-g7206b94, rtmpdump 2.4
[debug] Proxy map: {}
[DiscoveryGo] no-crane-no-gain: Downloading webpage
ERROR: no-crane-no-gain: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 27222 (char 27221)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 27222 (char 27221)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 27222 (char 27221)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\YoutubeDL.py", line 694, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 357, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\discoverygo.py", line 53, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmptx6jnsy6\build\youtube_dl\extractor\common.py", line 563, in _parse_json
youtube_dl.utils.ExtractorError: no-crane-no-gain: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 27222 (char 27221)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@StevenDTX StevenDTX commented Dec 2, 2016

Just FYI... there are some "unlocked" episodes on DiscoveryGo where the m3u8 can be extracted without a logon or cookies.

https://www.discoverygo.com/killing-fields/a-body-in-the-bayou/
https://www.discoverygo.com/gold-rush/frankenstein-machinery/

E:\>youtube-dl --verbose https://www.discoverygo.com/killing-fields/a-body-in-the-bayou/
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--verbose', 'https://www.discoverygo.com/killing-fields/a-body-in-the-bayou/']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2016.12.01
[debug] Python version 3.4.4 - Windows-10-10.0.10586
[debug] exe versions: ffmpeg N-81721-g7447ec9, ffprobe N-72383-g7206b94, rtmpdump 2.4
[debug] Proxy map: {}
[DiscoveryGo] a-body-in-the-bayou: Downloading webpage
ERROR: a-body-in-the-bayou: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 23968 (char 23967)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 23968 (char 23967)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\extractor\common.py", line 559, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 359, in raw_decode
ValueError: Expecting ',' delimiter: line 1 column 23968 (char 23967)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\YoutubeDL.py", line 694, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\extractor\common.py", line 357, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\extractor\discoverygo.py", line 53, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6j6yd3lp\build\youtube_dl\extractor\common.py", line 563, in _parse_json
youtube_dl.utils.ExtractorError: a-body-in-the-bayou: Failed to parse JSON  (caused by ValueError("Expecting ',' delimiter: line 1 column 23968 (char 23967)",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@KyleBS KyleBS commented Dec 16, 2016

Inside of the data-video JSON, there is unescaped XML under "adParameters" keys causing the JSON parser to barf. The XML root appears consistent, "VAST", and if you peel it out of the JSON string, things suddenly work (for me).

The hack I have:

git diff b63005f5afb164f8660c23ab62962287eb1e1c16
diff --git a/youtube_dl/extractor/discoverygo.py b/youtube_dl/extractor/discoverygo.py
index c4e83b2..23faee7 100644
--- a/youtube_dl/extractor/discoverygo.py
+++ b/youtube_dl/extractor/discoverygo.py
@@ -38,6 +38,23 @@ class DiscoveryGoIE(InfoExtractor):
         },
     }
 
+    # DiscoveryGo pages include unescaped XML that breaks JSON parsing. Removing these tags allows
+    # downloading to proceed as desired.
+    def removeUnescapedXML(self, s):
+        if s is None:
+            return None
+        assert type(s) == compat_str
+
+        xml_tags = [ 'VAST' ]
+        for tag in xml_tags:
+            xml_tag_open = '"&lt;?xml version="1.0" encoding="UTF-8"?> <' + tag
+            xml_tag_close = tag + '> "'
+            for i in range(s.count(xml_tag_open)):
+                xml_start_index = s.find(xml_tag_open)
+                xml_end_index = s.find(xml_tag_close) + len(xml_tag_close)
+                s = s[:xml_start_index] + s[xml_end_index:]
+        return s
+
     def _real_extract(self, url):
         display_id = self._match_id(url)
 
@@ -49,7 +66,7 @@ class DiscoveryGoIE(InfoExtractor):
                 webpage, 'video container'))
 
         video = self._parse_json(
-            unescapeHTML(container.get('data-video') or container.get('data-json')),
+            self.removeUnescapedXML(unescapeHTML(container.get('data-video') or container.get('data-json'))),
             display_id)
 
         title = video['name']
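
To illustrate the failure described above outside of youtube-dl, here is a minimal standalone sketch. The `broken` string is a made-up, shortened stand-in for the decoded `data-video` value (the real payload is tens of kilobytes), and `strip_embedded_vast` is a hypothetical helper, not part of youtube-dl. Unlike the patch above, it uses a regular expression and leaves the surrounding JSON quotes in place.

```python
import json
import re

# Hypothetical stand-in for the decoded attribute value: the XML under
# "adParameters" carries bare double quotes, which end the JSON string early.
broken = (
    '{"name": "Example", '
    '"adParameters": "<?xml version="1.0"?> <VAST version="3.0"></VAST> ", '
    '"duration": 42}'
)


def strip_embedded_vast(s):
    """Remove every run from an XML declaration through the closing VAST tag."""
    return re.sub(r'<\?xml.*?</VAST>', '', s, flags=re.DOTALL)


# Parsing the raw string fails at the first bare quote inside the XML.
try:
    json.loads(broken)
    parse_error = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    parse_error = str(exc)

# With the XML removed, the remaining text is valid JSON again.
video = json.loads(strip_embedded_vast(broken))
print(parse_error)
print(video["name"], video["duration"])
```

This only shows why stripping the XML makes the parser happy; it says nothing about where the bare quotes come from in the first place.
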

@StevenDTX StevenDTX commented Dec 28, 2016

Thanks, @KyleBS.

Would you mind posting a copy of your discoverygo.py file? Unfortunately, I'm on Windows and fairly new to this whole "git" thing, so I haven't been able to integrate your changes successfully.

@KyleBS KyleBS commented Dec 28, 2016

Posted as .txt to appease GitHub; be sure to change the extension back to .py!
discoverygo.py.txt

@yan12125 yan12125 (Collaborator) commented Dec 29, 2016

Could anyone paste a dump of the breaking XML string?

@StevenDTX StevenDTX commented Dec 29, 2016

Thanks @KyleBS. It still isn't working for me on Windows. Oh well, I will just wait until it's fixed. Until then, it's easy enough to get the m3u8 from the page and download it.

@StevenDTX StevenDTX commented Dec 30, 2016

@yan12125

Is that something I can help you with? Is there a command line I should run?

@yan12125 yan12125 (Collaborator) commented Dec 30, 2016

Add print(container.get('data-video') or container.get('data-json')) before video = self._parse_json and run
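
Once the raw string has been dumped, the character offset in the error message can be used to find the offending text. The following is a rough sketch of that idea, not youtube-dl code: `json_error_context` is a hypothetical helper and `sample` is invented data. It assumes Python 3.5 or newer, where `json.JSONDecodeError` exposes a `pos` attribute; the Python 3.4 builds shown in the logs above raise a plain `ValueError` instead.

```python
import json


def json_error_context(text, width=30):
    """Return the text around the first JSON syntax error, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset reported in the error message.
        start = max(exc.pos - width, 0)
        return text[start:exc.pos + width]
    return None


# Invented sample with a bare double quote inside a string value.
sample = '{"name": "Example", "adParameters": "<VAST version="3.0">"}'
snippet = json_error_context(sample, width=12)
print(snippet)
```

For a real dump, pass the saved string instead of `sample` and widen `width` until the surrounding structure is recognizable.
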

@StevenDTX StevenDTX commented Dec 30, 2016

Hopefully this gives you what you asked for.

discoverygo-xml.txt

@yan12125 yan12125 (Collaborator) commented Dec 30, 2016

Do things work with this change?

diff --git a/youtube_dl/extractor/discoverygo.py b/youtube_dl/extractor/discoverygo.py
index c4e83b2c3..b4d686a09 100644
--- a/youtube_dl/extractor/discoverygo.py
+++ b/youtube_dl/extractor/discoverygo.py
@@ -49,7 +49,7 @@ class DiscoveryGoIE(InfoExtractor):
                 webpage, 'video container'))
 
         video = self._parse_json(
-            unescapeHTML(container.get('data-video') or container.get('data-json')),
+            container.get('data-video') or container.get('data-json'),
             display_id)
 
         title = video['name']

@StevenDTX StevenDTX commented Dec 31, 2016

@yan12125

Yes...that works!!! Thank you!

@yan12125 yan12125 (Collaborator) commented Jan 1, 2017

Thanks @StevenDTX. Which video(s) were you testing against?

Also, I'd like to hear more feedback before committing it. @KyleBS Does that work for you?

@StevenDTX StevenDTX commented Jan 1, 2017

I downloaded a couple of episodes of Gold Rush and several of Alaska the Last Frontier. It also tested fine on ScienceChannelgo.com and on AnimalPlanetgo.com.

@KyleBS KyleBS commented Jan 2, 2017

@yan12125 Can confirm, this solution worked for me as well, nice catch!

yan12125 added a commit that referenced this issue Jan 5, 2017
HTMLParser, which is used by extract_attributes, already unescapes
attribute values with HTMLParser.unescape. They shouldn't be unescaped
again, or there may be parsing errors.

Ref: #11219, #11522
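
The effect described in the commit message can be reproduced with the standard library alone. This is a minimal sketch under stated assumptions: the page fragment and payload are invented, `FirstTagAttrs` is a hypothetical helper rather than youtube-dl's `extract_attributes`, and it assumes the ad XML is stored entity-escaped inside the JSON string. It uses `html.unescape` (Python 3.4+) for the second pass.

```python
import html
import json
from html.parser import HTMLParser


class FirstTagAttrs(HTMLParser):
    """Records the attributes of the first start tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            self.attrs = dict(attrs)


# Assumed payload: the XML is entity-escaped inside the JSON string value.
json_text = json.dumps({
    "name": "Example",
    "adParameters": html.escape('<VAST version="3.0"></VAST>'),
})

# Embedding the JSON in an attribute escapes it once more, as HTML requires.
page = '<div data-video="%s"></div>' % html.escape(json_text, quote=True)

parser = FirstTagAttrs()
parser.feed(page)
parser.close()

once = parser.attrs["data-video"]  # HTMLParser has already unescaped this
twice = html.unescape(once)        # the redundant second pass

video = json.loads(once)           # valid JSON after a single unescape

try:
    json.loads(twice)
    double_unescape_error = None
except ValueError as exc:
    double_unescape_error = str(exc)

print(video["name"])
print(double_unescape_error)
```

After the second pass the `&quot;` entities inside the `adParameters` value become bare quotes, which yields the same kind of "Expecting ',' delimiter" error seen in the logs above.
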

@yan12125 yan12125 (Collaborator) commented Jan 5, 2017

I remember there being some errors with the "unlocked" videos mentioned in #11219 (comment). However, they are now all subscribers-only, so I can't reproduce the error. Closing this for now; feel free to open a new issue if there's any problem.

@yan12125 yan12125 closed this Jan 5, 2017

@StevenDTX StevenDTX commented Jan 5, 2017

Thanks, @yan12125
