[youtube] Fix throttling by decrypting n-sig #1437
Amazing. Will this be the final and explicit solution to the throttling issue, rather than a mere workaround like the android-client trick?
Yep! 😉
Tested it works on the other js-based clients so far (web_embedded, mweb, web_music, web_creator)
I think for the time being we should keep the android client, but as a secondary client by default (i.e. player_client=web,android). But we'll need to rework the existing code that skips formats which already exist to account for that.
yt_dlp/extractor/youtube.py
fmt.get('qualityLabel') or quality.replace('audio_quality_', '')))),
fmt.get('qualityLabel') or quality.replace('audio_quality_', ''),
' (throttled)' if not n_sig else None))),
'source_preference': -10 if not n_sig else -1,
Maybe we could look into overwriting an existing format with same itag if the source preference is higher (or something similar)?
e.g. we use player_client=web,android. If we fail to decrypt nsig for web, the throttled web formats have lower preference than the equivalent formats on android.
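The idea above can be sketched roughly as follows. This is only an illustration, not code from the PR: the helper name `merge_formats_by_itag`, the `client` key, and the preference value used for the android formats are assumptions; only the `-10`/`-1` split for throttled versus decrypted formats comes from the diff.

```python
def merge_formats_by_itag(formats):
    """Keep one format per itag, preferring the higher source_preference.

    Hypothetical helper for illustration; not part of yt-dlp.
    """
    best = {}
    for fmt in formats:
        itag = fmt['format_id']
        current = best.get(itag)
        # Overwrite only when the new format is strictly preferred
        if current is None or fmt['source_preference'] > current['source_preference']:
            best[itag] = fmt
    return list(best.values())


# Assumed sample data: the web format failed nsig decryption (-10),
# while the android formats are treated as unthrottled (-1).
web = [{'format_id': '137', 'client': 'web', 'source_preference': -10}]
android = [
    {'format_id': '137', 'client': 'android', 'source_preference': -1},
    {'format_id': '140', 'client': 'android', 'source_preference': -1},
]
merged = merge_formats_by_itag(web + android)
```

With this ordering, the throttled web copy of itag 137 is replaced by the android one, and itag 140 is kept as-is.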
It shouldn't be too hard to add (actually, it is a bit tricky), but if we are keeping the android fallback, why not keep android,web itself?
Hmm, yeah, if it's going to be more trouble than it's worth, it's probably best to just keep android,web.
    self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n')

    return lambda s: jsi.extract_function_from_code(*func_code)([s])

def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
@coletdjnz Do you know why this isn't cached to disk like the signature?
Like how the nsig is caching the code to the disk, whereas the sig is caching some characters?
I'm not actually sure. Unless I'm reading this wrong (it should really have been documented), it seems like they cache some sort of character mapping/offset to the disk?
yt-dlp/yt_dlp/extractor/youtube.py
Lines 1743 to 1755 in b7b186e
if cache_spec is not None:
    return lambda s: ''.join(s[i] for i in cache_spec)
if self._load_player(video_id, player_url):
    code = self._code_cache[player_id]
    res = self._parse_sig_js(code)
    test_string = ''.join(map(compat_chr, range(len(example_sig))))
    cache_res = res(test_string)
    cache_spec = [ord(c) for c in cache_res]
    self._downloader.cache.store('youtube-sigfuncs', func_id, cache_spec)
    return res
I don't understand it either, which is why for nsig I went with storing the actual js function itself. For _extract_signature_timestamp, I assume we can directly cache the sts itself.
_extract_signature_timestamp I assume we can directly cache the sts itself.
I never actually thought about caching that, but yeah, we could do that as well. Though I think it would currently only help in the case where the webpage is not downloaded but the iframe fallback is (which usually happens when we get a 429).
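A minimal sketch of what caching the sts directly could look like, assuming it is keyed by player id. Everything here is hypothetical: `SimpleDiskCache` is a stand-in written for this example rather than yt-dlp's own cache class, and the `'youtube-sts'` section name, `get_sts`, and the sample sts value are made up for illustration.

```python
import json
import os
import tempfile


class SimpleDiskCache:
    """Hypothetical JSON-on-disk cache; not yt-dlp's actual cache class."""

    def __init__(self, root):
        self.root = root

    def _path(self, section, key):
        return os.path.join(self.root, section, f'{key}.json')

    def store(self, section, key, data):
        path = self._path(section, key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(data, f)

    def load(self, section, key, default=None):
        try:
            with open(self._path(section, key), encoding='utf-8') as f:
                return json.load(f)
        except (OSError, ValueError):
            # Missing or corrupt entry: treat as a cache miss
            return default


def get_sts(cache, player_id, extract_sts):
    """Return the cached sts for player_id, extracting it only on a miss."""
    sts = cache.load('youtube-sts', player_id)
    if sts is None:
        sts = extract_sts(player_id)
        if sts is not None:
            cache.store('youtube-sts', player_id, sts)
    return sts


calls = []


def fake_extract(player_id):
    # Stand-in for downloading the player JS and parsing the timestamp
    calls.append(player_id)
    return 18900


cache = SimpleDiskCache(tempfile.mkdtemp())
first = get_sts(cache, 'abc123', fake_extract)
second = get_sts(cache, 'abc123', fake_extract)
```

The second call is served from disk, so the (assumed expensive) extraction runs only once per player id.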
I had another look at it:
For the normal signature function, it seems to be reversing the encrypted signature, but 'shuffling' some of the characters with replacement (so using same characters and same length). Hence being able to cache a mapping of the character positions?
But with the nsig it's different - the decrypted nsig seems to have different characters and length than the encrypted nsig.
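To make the position-mapping idea concrete, here is a small sketch of how such a cache spec can be built and replayed, mirroring the quoted snippet from youtube.py. `toy_sig` is a made-up stand-in for the real signature function, and `build_cache_spec` / `apply_cache_spec` are names chosen for this example.

```python
def build_cache_spec(sig_func, sig_length):
    """Record which input position ends up at each output position."""
    # Probe string whose i-th character has code point i
    probe = ''.join(map(chr, range(sig_length)))
    return [ord(c) for c in sig_func(probe)]


def apply_cache_spec(cache_spec, s):
    """Replay the recorded mapping without running the JS function."""
    return ''.join(s[i] for i in cache_spec)


def toy_sig(s):
    # Hypothetical transform: reverse, swap positions 0 and 3, drop two chars
    chars = list(s[::-1])
    chars[0], chars[3] = chars[3], chars[0]
    return ''.join(chars[2:])


spec = build_cache_spec(toy_sig, 8)
replayed = apply_cache_spec(spec, 'ABCDEFGH')
```

This only works when every output character is copied from some input position. If, as observed for nsig, the output contains characters that are not in the input, no index list can reproduce it, which is consistent with caching the function code instead.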
* master: (28 commits)
[ffmpeg] Detect libavformat version for `aac_adtstoasc` and print available features in verbose head
Based on ytdl-org/youtube-dl#29581
[ffmpeg] Accurately detect presence of setts
[ExtractAudio] Use `libfdk_aac` if available
Closes yt-dlp#1502
Authored by: CrypticSignal
[ffmpeg] Framework for feature detection
Related: yt-dlp#1502, yt-dlp#1237, ytdl-org/youtube-dl#29581
[ExtractAudio] Rescale --audio-quality correctly
Authored by: CrypticSignal, pukkandan
[fragment] Fix progress display in fragmented downloads
Closes yt-dlp#1517
[youtube] Remove unnecessary no-playlist warning
[utils] Parse `vp09` as vp9
[jsinterp] Handle default in switch better
[Instagram] Fix incorrect resolution (yt-dlp#1494)
[vk] Fix login (yt-dlp#1495)
[docs,cleanup] Improve docs and minor cleanup
Closes yt-dlp#1387, yt-dlp#1404, yt-dlp#1408, yt-dlp#1485, yt-dlp#1415, yt-dlp#1450, yt-dlp#1492
[youtube] refactor itag processing
[linkedin] Don't login multiple times
[vk] Add subtitles (yt-dlp#1480)
[Olympics] Fix extractor (yt-dlp#1483)
[PlanetMarathi] Add extractor (yt-dlp#1484)
[Instagram] Add login to playlist (yt-dlp#1488)
[ceskatelevize] Fix extractor (yt-dlp#1489)
[youtube] Fix throttling by decrypting n-sig (yt-dlp#1437)
...
Overhauls jsinterp to handle youtube nsig decryption
Fixes: ytdl-org/youtube-dl#29326
TODO: