Decoding Error, appears sometimes, doesn't autofix. #26920
Comments
|
Really hope there's a fix for this; I'm having the same issue. I'm using youtube-dl in a Discord bot to play music, and every search seems to throw this error at random. It happens with both searches and direct links. Sometimes a song works, then the same song fails two minutes later, and recently it has almost stopped working entirely. |
|
Same issue. Is it linked to the new YouTube sign-in policy?
ERROR: query "heads will rock a-trak remix yeah yeah yeahs": Failed to parse JSON (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
youtube-dl --version |
|
That’s possible... I’ll try using the sign in options and see if that helps |
|
For me, the search feature works when and only when I don't use cookies. |
|
I haven't been, and I assume it doesn't create its own cookies when doing multiple searches, which is why there's no apparent correlation between what works and what doesn't for a couple of minutes |
I didn't quite understand what you meant. Can you elaborate? |
|
|
@pukkandan Sure, what I mean is that I assume youtube-dl doesn't automatically create cookies that would change your search results, and instead just tries to find the video without cookies. |
|
This is the json_string when it fails:
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-18 19:23:49.465637087 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-18 19:20:47.937090816 +0200
@@ -901,6 +901,7 @@
if transform_source:
json_string = transform_source(json_string)
try:
+ print('DEBUG: ' + json_string)
return json.loads(json_string)
except ValueError as ve:
errmsg = '%s: Failed to parse JSON ' % video_id
|
|
So the problem is that, since a couple of days (weeks?) ago, YouTube randomly returns the HTML instead of the JSON (about 1 in 3 tries). A workaround could be to retry 2-3 times if you don't get JSON: content without {"title": or content containing <html |
|
That's a possible fix, I think, but it seems to fail multiple times in a row. |
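The retry idea discussed above can be sketched as follows. This is a minimal illustration, not youtube-dl's actual code; the `fetch` callable and the retry budget are assumptions standing in for whatever performs the HTTP request:

```python
import json
import time

def parse_json_with_retry(fetch, max_tries=3, delay=5):
    """Call fetch() until the body parses as JSON.

    `fetch` is any callable returning the raw response body as str.
    Retries when YouTube returns an HTML page instead of JSON.
    """
    last_error = None
    for _attempt in range(max_tries):
        body = fetch()
        # Heuristic from this thread: JSON responses start with '{',
        # the bad responses are HTML pages containing '<html'.
        if body.lstrip().startswith('{'):
            try:
                return json.loads(body)
            except ValueError as e:
                last_error = e
        time.sleep(delay)
    raise RuntimeError('still got HTML/invalid JSON after %d tries: %r'
                       % (max_tries, last_error))
```

With a 1-in-3 failure rate this would usually succeed, but as noted above it cannot help when the server fails many times in a row.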
|
FYI: the same random issue occurs with a Chrome User-Agent, as if blocked by a YouTube proxy. Cookies matter: a good value of VISITOR_INFO1_LIVE will make it work 100% of the time.
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-18 19:23:49.465637087 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-23 14:56:18.538975849 +0200
@@ -617,6 +617,29 @@
if 'X-Forwarded-For' not in headers:
headers['X-Forwarded-For'] = self._x_forwarded_for_ip
+ headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
+ headers['Accept-Language'] = 'en-US,en;q=0.5'
+ headers['Content-Type'] = 'application/json'
+ headers['DNT'] = '1'
+ headers['Sec-GPC'] = '1'
+ headers['Sec-Fetch-Dest'] = 'document'
+ headers['Sec-Fetch-Mode'] = 'navigate'
+ headers['Sec-Fetch-Site'] = 'same-origin'
+ headers['Sec-Fetch-User'] = '?1'
+ headers['TE'] = 'Trailers'
+ headers['Upgrade-Insecure-Requests'] = '1'
+ headers['User-Agent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36'
+ headers['Access-Control-Request-Headers'] = 'x-goog-visitor-id'
+ headers['Access-Control-Request-Method'] = 'GET'
+ headers['Origin'] = 'https://www.youtube.com'
+ headers['Referer'] = url_or_request
+ headers['Pragma'] = 'no-cache'
+ headers['Cache-Control'] = 'no-cache'
+
+ print('DEBUG: url: %s' % url_or_request)
+ print('DEBUG: headers: %s' % json.dumps(headers))
+ print('DEBUG: query: %s' % json.dumps(query))
+
if isinstance(url_or_request, compat_urllib_request.Request):
url_or_request = update_Request(
url_or_request, data=data, headers=headers, query=query) |
|
This error happens to me every time I use a string query. URLs work, but queries don't. I hope YTDL fixes this fast, as I love using YTDL. |
|
I get the same error, although more often than not at this time. As @Xenophloxic says, direct links do work, but the default-search option of 'auto' or 'ytsearch' results in this error 99 times out of 100. |
|
I have tested with both a string query and a direct link, and I get this error 99% of the time; not sure of any fixes though. |
|
From my understanding, if you get VISITOR_INFO1_LIVE cookie right in "youtube-dl --cookies cookies", it will work.
Working
Also working if the same VISITOR_INFO1_LIVE is used; it will auto-add GPS, PREF, YSC, s_gl on the next run
Not Working
|
Is there a way to do this in Python through ydl_opts? |
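In the Python API, cookies can be passed via the options dict (it is a dict, not a list). A hedged sketch: the 'cookiefile' option is youtube-dl's standard equivalent of --cookies, but the file path and the other options here are illustrative assumptions:

```python
# Equivalent of `youtube-dl --cookies cookies.txt` in the Python API.
ydl_opts = {
    'format': 'bestaudio/best',        # illustrative format choice
    'cookiefile': 'cookies.txt',       # assumed path; Netscape-format file,
                                       # the same file --cookies reads/writes
    'quiet': True,
}

# Usage (with youtube-dl installed):
# import youtube_dl
# with youtube_dl.YoutubeDL(ydl_opts) as ydl:
#     info = ydl.extract_info('ytsearch1:some query', download=False)
```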
|
If it works once with youtube-dl --cookies cookiefile, edit /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py (sudo vi). Now I need to understand where to get VISITOR_INFO1_LIVE right. |
|
I've tried over 300 times for a cookie that works. Any tips to help find one? |
You need youtube-dl to work once with --cookies cookiefile, then copy VISITOR_INFO1_LIVE.
N.B. Before enforcing a VISITOR_INFO1_LIVE with a good value, I had random success: 1/3, 2/10, 0/7, ... You can also try to modify CONSENT in cookiefile to match your browser.
When you have a single completed youtube-dl download, you can remove --cookies cookiefile from the youtube-dl command.
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py 2020-10-06 20:50:09.554406404 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py 2020-10-23 03:49:17.357860513 +0200
@@ -3196,6 +3196,20 @@
url_query.update(self._EXTRA_QUERY_ARGS)
result_url = 'https://www.youtube.com/results?' + compat_urllib_parse_urlencode(url_query)
+ visitor_info1_live = self._get_cookies(result_url).get('VISITOR_INFO1_LIVE')
+ print('DEBUG: visitor_info1_live1: %s' % visitor_info1_live)
+
+ visitor_info1_live = self._get_cookies('https://www.youtube.com/').get('VISITOR_INFO1_LIVE')
+ print('DEBUG: visitor_info1_live2: %s' % visitor_info1_live)
+
+ visitor_info1_live = self._get_cookies('https://www.youtube.com').get('VISITOR_INFO1_LIVE')
+ print('DEBUG: visitor_info1_live3: %s' % visitor_info1_live)
+
+ #self._downloader.cookiejar.clear('.youtube.com')
+
+ self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', 'ToGMI3_Nn_I')
+ #self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', 'xxxxxxxxxxx')
+
for pagenum in itertools.count(1):
data = self._download_json(
result_url, video_id='query "%s"' % query,
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-18 19:23:49.465637087 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/common.py 2020-10-23 03:44:56.548444691 +0200
@@ -617,6 +617,10 @@
if 'X-Forwarded-For' not in headers:
headers['X-Forwarded-For'] = self._x_forwarded_for_ip
+ print('DEBUG: url: %s' % url_or_request)
+ print('DEBUG: headers: %s' % json.dumps(headers))
+ print('DEBUG: query: %s' % json.dumps(query))
+
if isinstance(url_or_request, compat_urllib_request.Request):
url_or_request = update_Request(
url_or_request, data=data, headers=headers, query=query)
P.S. I did not manage to get self._get_cookies(url).get('VISITOR_INFO1_LIVE')
Using a valid VISITOR_INFO1_LIVE is the only thing that makes it work all the time.
What did work
What did not work
@@ -3192,6 +3232,7 @@
url_query = {
'search_query': query.encode('utf-8'),
+ 'gl': 'US'.encode('utf-8'),
}
url_query.update(self._EXTRA_QUERY_ARGS)
result_url = 'https://www.youtube.com/results?' + compat_urllib_parse_urlencode(url_query)
@@ -3201,7 +3242,27 @@
result_url, video_id='query "%s"' % query,
note='Downloading page %s' % pagenum,
errnote='Unable to download API page',
- query={'spf': 'navigate'})
+ query={'spf': 'navigate'},
+ headers={
+ 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
+ 'Accept-Language': 'en-US,en;q=0.5',
+ 'Content-Type': 'application/json',
+ 'DNT': '1',
+ 'Sec-GPC': '1',
+ 'TE': 'Trailers',
+ 'Upgrade-Insecure-Requests': '1',
+ 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
+ 'Access-Control-Request-Headers': 'x-goog-visitor-id',
+ 'Access-Control-Request-Method': 'GET',
+ 'Origin': 'https://www.youtube.com',
+ 'Referer': result_url,
+ 'Pragma': 'no-cache',
+ 'Sec-Fetch-Dest': 'document',
+ 'Sec-Fetch-Mode': 'navigate',
+ 'Sec-Fetch-Site': 'same-origin',
+ 'Sec-Fetch-User': '?1',
+ },
+ )
html_content = data[1]['body']['content']
if 'class="search-message' in html_content:
for _ in range(5):
try:
print('DEBUG: Tries ' + str(_))
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query,
expected_status=expected_status)
if res is False:
return res
json_string, urlh = res
return self._parse_json(
json_string, video_id, transform_source=transform_source,
fatal=fatal), urlh
except ExtractorError as e:
if '<html' in json_string:
self._sleep(5, video_id)
continue
raise
for _ in range(5):
try:
print('DEBUG: Tries ' + str(_))
res = self._download_json_handle(
url_or_request, video_id, note=note, errnote=errnote,
transform_source=transform_source, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query,
expected_status=expected_status)
return res if res is False else res[0]
except ExtractorError as e:
"""if json_string.__contains__('<html') == True:"""
self._sleep(5, video_id)
continue
raise
for _ in range(5):
try:
print('DEBUG: Tries ' + str(_))
data = self._download_json(
result_url, video_id='query "%s"' % query,
note='Downloading page %s' % pagenum,
errnote='Unable to download API page',
query={'spf': 'navigate'})
break
except ExtractorError as e:
# if '<html' in json_string:
self._sleep(5, 'query "%s"' % query)
continue
raise
try:
for _ in range(7):
try:
print('DEBUG: Tries ' + str(_))
self.initialize()
ie_result = self._real_extract(url)
if self._x_forwarded_for_ip:
ie_result['__x_forwarded_for_ip'] = self._x_forwarded_for_ip
return ie_result
except GeoRestrictedError as e:
if self.__maybe_fake_ip_and_retry(e.countries):
continue
raise
except ExtractorError as e:
time.sleep(random.randint(1,10))
continue
raise |
|
Wow, thanks! This fixed ytdl locally; now it's time to do this on my server. |
|
Maybe someone can help me? I am able to get VISITOR_INFO1_LIVE from https://www.youtube.com, but it's not working. New ideas:
This is my new patch:
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py 2020-10-06 20:50:09.554406404 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py 2020-10-23 13:39:58.273988869 +0200
@@ -8,6 +8,7 @@
import os.path
import random
import re
+import sys
import time
import traceback
@@ -75,12 +76,6 @@
'x-youtube-client-version': '1.20200609.04.02',
}
- def _set_language(self):
- self._set_cookie(
- '.youtube.com', 'PREF', 'f1=50000000&f6=8&hl=en',
- # YouTube sets the expire time to about two months
- expire_time=time.time() + 2 * 30 * 24 * 3600)
-
def _ids_to_results(self, ids):
return [
self.url_result(vid_id, 'Youtube', video_id=vid_id)
@@ -276,10 +271,59 @@
return super(YoutubeBaseInfoExtractor, self)._download_webpage_handle(
*args, **compat_kwargs(kwargs))
+ def _get_cookie_handle(self, url_handle, cookie):
+ """
+ Get cookie from Set-Cookie header
+ """
+ for header, cookies in url_handle.headers.items():
+ if header.lower() != 'set-cookie':
+ continue
+ if sys.version_info[0] >= 3:
+ cookies = cookies.encode('iso-8859-1')
+ cookies = cookies.decode('utf-8')
+ cookie_value = re.search(
+ r'%s=(.+?);.*?\b[Dd]omain=(.+?)(?:[,;]|$)' % cookie, cookies)
+ if cookie_value:
+ value, domain = cookie_value.groups()
+ return value
+ #self._set_cookie(domain, cookie, value)
+ #break
+
+ def _set_youtube_cookies(self, url='https://www.youtube.com/'):
+ youtube_page, url_handle = self._download_webpage_handle(url, None, 'Get cookies', fatal=False)
+ if youtube_page is False:
+ return False
+
+    ## This gets a dynamic VISITOR_INFO1_LIVE that is most of the time not valid (the one from --cookies cookiefile after a working download is still valid)
+ #visitor_info1_live = self._get_cookie_handle(url_handle, 'VISITOR_INFO1_LIVE')
+ #if visitor_info1_live:
+ # print('DEBUG: visitor_info1_live: ' + visitor_info1_live)
+ # self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', visitor_info1_live, expire_time=time.time() + 180 * 24 * 3600)
+
+ #for cookie in ('VISITOR_INFO1_LIVE', 'YSC', 'GPS'):
+ for cookie in ('VISITOR_INFO1_LIVE', 'YSC'):
+ value = self._get_cookie_handle(url_handle, cookie)
+ if value:
+ print('DEBUG: Cookie: %s=%s' % (cookie, value))
+ self._set_cookie('.youtube.com', cookie, value, expire_time=time.time() + 180 * 24 * 3600)
+
+ # replace f4=xxxxxxx&gl=xx with PREF from your web browser
+ # self._set_cookie('.youtube.com', 'PREF', 'f4=xxxxxxx&gl=xx')
+ self._set_cookie('.youtube.com', 'PREF', 'f4=4000000&gl=US', expire_time=time.time() + 180 * 24 * 3600)
+
+ # replace YES+xxxxx+V13 with CONSENT from your web browser
+ # self._set_cookie('.youtube.com', 'CONSENT', 'YES+xxxxx+V13')
+ self._set_cookie('.youtube.com', 'CONSENT', 'YES+CH.de+V13')
+
+ # replace xxxxxxxxxx with a working VISITOR_INFO1_LIVE taken from youtube-dl --cookie cookiefile (a completed download)
+ # self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', 'xxxxxxxxxx')
+ # N.B. The following line is a workaround and should be commented in the future when you will get a valid VISITOR_INFO1_LIVE from the above code
+ self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', 'ToGMI3_Nn_I', expire_time=time.time() + 180 * 24 * 3600)
+
def _real_initialize(self):
if self._downloader is None:
return
- self._set_language()
+ self._set_youtube_cookies()
if not self._login():
return
You can use this to test it. If you want to try to get a valid VISITOR_INFO1_LIVE, wait a couple of minutes, then try several times:
touch cookiefile; \
sed -ri '/VISITOR_INFO1_LIVE/d' cookiefile; \
cat cookiefile; \
youtube-dl --cookies cookiefile -g -v -x -f m4a -o '~/Music/%(title)s.%(ext)s' "ytsearch1:heads will rock a-trak remix yeah yeah yeahs"; \
awk '/VISITOR_INFO1_LIVE/ { printf "%s=%s\n", $6, $7 }' cookiefile |
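The awk one-liner above can also be done in Python; a small sketch (the helper name `read_cookie` is mine, not part of youtube-dl) that pulls a cookie value out of a Netscape-format cookie file:

```python
def read_cookie(cookiefile, name='VISITOR_INFO1_LIVE'):
    """Return the value of cookie `name` from a Netscape-format cookie
    file (the format --cookies uses), or None if absent.

    Each non-comment line has tab-separated fields:
    domain, subdomain-flag, path, secure, expiry, name, value.
    """
    with open(cookiefile) as f:
        for line in f:
            if line.startswith('#') or not line.strip():
                continue  # skip the header and blank lines
            fields = line.rstrip('\n').split('\t')
            if len(fields) >= 7 and fields[5] == name:
                return fields[6]
    return None
```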
|
You might have gotten lucky with that cookie. I tried it at least 30 times with waiting in between, but no luck yet. |
|
Fun: https://twitter.com/search?q=VISITOR_INFO1_LIVE&src=typed_query&f=live https://twitter.com/LeoWattenberg/status/1306864495722139649 Maybe Google bans youtube-dl via this special cookie.
The expiry is about 180 days (180.5 days), but a VISITOR_INFO1_LIVE from lynx does not work in youtube-dl |
FYI: I just checked on my Mac: wiped all cookies, went to youtube.com, and did not have the PREF cookie at all; only after signing into my account did it appear, with content "al=de". Tried changing the hardcoded PREF in youtube.py, but unfortunately it does not change anything for me. |
|
Another tip for trying to get a valid VISITOR_INFO1_LIVE: youtube-dl --cookies cookiefile. When you have it, apply the following code and use a valid VISITOR_INFO1_LIVE value instead of ToGMI3_Nn_I.
--- /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py-orig 2020-10-06 20:50:09.554406404 +0200
+++ /usr/local/lib/python3.7/dist-packages/youtube_dl/extractor/youtube.py-new 2020-10-23 13:48:22.204708619 +0200
@@ -75,12 +75,6 @@
'x-youtube-client-version': '1.20200609.04.02',
}
- def _set_language(self):
- self._set_cookie(
- '.youtube.com', 'PREF', 'f1=50000000&f6=8&hl=en',
- # YouTube sets the expire time to about two months
- expire_time=time.time() + 2 * 30 * 24 * 3600)
-
def _ids_to_results(self, ids):
return [
self.url_result(vid_id, 'Youtube', video_id=vid_id)
@@ -276,10 +270,14 @@
return super(YoutubeBaseInfoExtractor, self)._download_webpage_handle(
*args, **compat_kwargs(kwargs))
+ def _set_youtube_cookies(self, url='https://www.youtube.com/'):
+ self._set_cookie('.youtube.com', 'PREF', 'f1=50000000&f6=8&hl=en', expire_time=time.time() + 180 * 24 * 3600)
+ self._set_cookie('.youtube.com', 'VISITOR_INFO1_LIVE', 'ToGMI3_Nn_I', expire_time=time.time() + 180 * 24 * 3600)
+
def _real_initialize(self):
if self._downloader is None:
return
- self._set_language()
+ self._set_youtube_cookies()
if not self._login():
return
I have close to 12 hours of it working 100% of the time when I enable this line, and 0% or random success when not. |
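Outside of patching youtube-dl itself, the same trick of pinning VISITOR_INFO1_LIVE (and optionally CONSENT) can be tried with a plain urllib cookie jar. A sketch under the thread's assumption that these two cookies are what matters; the values are placeholders to replace with ones from a working browser session:

```python
import http.cookiejar
import urllib.request

def make_youtube_opener(visitor_info, consent='YES+CH.de+V13'):
    """Return (opener, jar) where the jar pins the two cookies this
    thread identifies as decisive. Values are placeholders."""
    jar = http.cookiejar.CookieJar()
    for name, value in (('VISITOR_INFO1_LIVE', visitor_info),
                        ('CONSENT', consent)):
        jar.set_cookie(http.cookiejar.Cookie(
            version=0, name=name, value=value,
            port=None, port_specified=False,
            domain='.youtube.com', domain_specified=True,
            domain_initial_dot=True,
            path='/', path_specified=True,
            secure=True, expires=None, discard=True,
            comment=None, comment_url=None, rest={}))
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

# Usage: opener, _ = make_youtube_opener('your_visitor_info_value')
#        opener.open('https://www.youtube.com/results?search_query=...')
```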
|
I will do a break as I have a VISITOR_INFO1_LIVE working for me all the time. Nothing else matters [than a valid VISITOR_INFO1_LIVE] Magic VISITOR_INFO1_LIVE, voodoo VISITOR_INFO1_LIVE Magic VISITOR_INFO1_LIVE, magic VISITOR_INFO1_LIVE, voodoo VISITOR_INFO1_LIVE, magic VISITOR_INFO1_LIVE ;-) |

Checklist
Verbose log
Description
I posted a question about this yesterday and it got flagged as a duplicate; the original answer is here. This issue doesn't automatically fix itself like it did for that user. Instead, it seems to come and go with no pattern, so I don't have a good way to reproduce it. Some songs will work at one point in time, then won't a couple of minutes later.
I get the issue when attempting to extract the data from the video. Here is a snippet of the code I am using: