[youtube] Fix throttling by decrypting n-sig #1437

Merged
pukkandan merged 6 commits into yt-dlp:master from nsig_decrypt on Oct 31, 2021

Conversation

pukkandan
Member

@pukkandan pukkandan commented Oct 27, 2021

Overhauls jsinterp to handle youtube nsig decryption
Fixes: ytdl-org/youtube-dl#29326

TODO:

  • Check fallbacks and ensure all failures are non-fatal
  • Add tests
  • Refactor/cleanup

@shoxie007

Amazing. Will this be the final and explicit solution to the throttling issue, rather than a mere workaround like the android-client trick?

@coletdjnz
Member

> Amazing. Will this be the final and explicit solution to the throttling issue, rather than a mere workaround like the android-client trick?

Yep! 😉

Member

@coletdjnz coletdjnz left a comment

Tested: so far it also works on the other JS-based clients (web_embedded, mweb, web_music, web_creator).

I think for the time being we should keep the android client, but as a secondary client by default (i.e. player_client=web,android). But we'll need to rework the code that skips formats that already exist to account for that.

yt_dlp/extractor/youtube.py (outdated)
fmt.get('qualityLabel') or quality.replace('audio_quality_', '')))),
fmt.get('qualityLabel') or quality.replace('audio_quality_', ''),
' (throttled)' if not n_sig else None))),
'source_preference': -10 if not n_sig else -1,
Member

Maybe we could look into overwriting an existing format with same itag if the source preference is higher (or something similar)?

e.g. we use player_client=web,android. If we fail to decrypt nsig for web, the throttled web formats have lower preference than the equivalent formats on android.
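One way to read this suggestion, as a rough sketch rather than what the PR implements: collect the formats from every client, key them by itag, and keep the entry with the highest `source_preference`. The `merge_formats` helper, the `client` key, and the sample dicts are hypothetical. Only `source_preference` and the -10 / -1 values come from the diff above, and the android entries are assumed to carry the non-throttled value.

```python
def merge_formats(formats):
    """Keep one format per itag, preferring the higher source_preference.

    Hypothetical helper for illustration; not part of yt-dlp.
    """
    best = {}
    for fmt in formats:
        itag = fmt['format_id']
        current = best.get(itag)
        # Replace only when the new entry is strictly preferred, so the
        # first client listed wins ties.
        if current is None or fmt['source_preference'] > current['source_preference']:
            best[itag] = fmt
    return list(best.values())


# Assumed scenario: nsig decryption failed for web (throttled, -10),
# while the android formats keep the non-throttled preference (-1).
formats = [
    {'format_id': '137', 'client': 'web', 'source_preference': -10},
    {'format_id': '137', 'client': 'android', 'source_preference': -1},
    {'format_id': '251', 'client': 'web', 'source_preference': -10},
]
merged = merge_formats(formats)
```

With this input, itag 137 resolves to the android entry, while itag 251 keeps the throttled web entry because no alternative exists.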

Member Author

@pukkandan pukkandan Oct 27, 2021

It shouldn't be too hard to add (actually, it is a bit tricky), but if we are keeping the android fallback, why not keep android,web itself?

Member

Hmm, yeah, if it's going to be more trouble than it's worth, it's probably best to just keep android,web.

yt_dlp/extractor/youtube.py (outdated)
@coletdjnz coletdjnz linked an issue Oct 27, 2021 that may be closed by this pull request
        self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n')

    return lambda s: jsi.extract_function_from_code(*func_code)([s])

def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
Member Author

@pukkandan pukkandan Oct 27, 2021

@coletdjnz Do you know why this isn't cached to disk like the signature?

Member

@coletdjnz coletdjnz Oct 27, 2021

Do you mean how the nsig caches the code to disk, whereas the sig caches some characters?

I'm not actually sure. Unless I'm reading this wrong (it should really have been documented), it seems like they cache some sort of character mapping/offset to the disk?

if cache_spec is not None:
    return lambda s: ''.join(s[i] for i in cache_spec)
if self._load_player(video_id, player_url):
    code = self._code_cache[player_id]
    res = self._parse_sig_js(code)
    test_string = ''.join(map(compat_chr, range(len(example_sig))))
    cache_res = res(test_string)
    cache_spec = [ord(c) for c in cache_res]
    self._downloader.cache.store('youtube-sigfuncs', func_id, cache_spec)
    return res

Member Author

I don't understand it either. That is why, for nsig, I went with storing the actual JS function itself. For _extract_signature_timestamp, I assume we can directly cache the sts itself.
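Persisting the extracted function's code, rather than a derived mapping, might look roughly like the following. This is a sketch under assumptions: `CodeCache`, `get_nsig_code`, the `nsig-sketch` section name, and `fake_extract` are all hypothetical stand-ins, and the stored value is assumed to be an `[argnames, body]` pair, loosely mirroring how `func_code` is unpacked in the diff above.

```python
import json
import os
import tempfile


class CodeCache:
    """Tiny JSON-file cache; a hypothetical stand-in for yt-dlp's cache."""

    def __init__(self, root):
        self.root = root

    def _path(self, section, key):
        return os.path.join(self.root, section, f'{key}.json')

    def load(self, section, key):
        try:
            with open(self._path(section, key), encoding='utf-8') as f:
                return json.load(f)
        except (OSError, ValueError):
            return None  # a missing or corrupt entry is treated as a miss

    def store(self, section, key, value):
        path = self._path(section, key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(value, f)


def get_nsig_code(cache, player_id, extract):
    """Return [argnames, body] for a player, extracting only on a cache miss."""
    code = cache.load('nsig-sketch', player_id)
    if code is None:
        code = extract(player_id)
        cache.store('nsig-sketch', player_id, code)
    return code


calls = []


def fake_extract(player_id):
    # Placeholder for the expensive step of downloading and parsing player JS.
    calls.append(player_id)
    return [['a'], 'return a.split("").reverse().join("")']


with tempfile.TemporaryDirectory() as root:
    cache = CodeCache(root)
    first = get_nsig_code(cache, 'abc123', fake_extract)
    second = get_nsig_code(cache, 'abc123', fake_extract)  # served from disk
```

Because the code is keyed by player ID, a new player version naturally produces a new cache entry instead of reusing a stale function.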

Member

> _extract_signature_timestamp I assume we can directly cache the sts itself.

I never actually thought about caching that, but yeah, we could do that as well. Though I think it would currently only help in the case where the webpage is not downloaded but the iframe fallback is (which usually happens when we get a 429).

Member

@coletdjnz coletdjnz Oct 29, 2021

I had another look at it:
For the normal signature function, it seems to reverse the encrypted signature while 'shuffling' some of the characters with replacement (so the result uses the same characters and has the same length). That would explain why a mapping of the character positions can be cached.

But with the nsig it's different - the decrypted nsig seems to have different characters and length than the encrypted nsig.
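The difference can be illustrated with a small sketch of the probing trick from the snippet quoted earlier: feed the function a string whose i-th character has code point i, and the output reveals which input position each output character came from. `toy_sig` and `toy_nsig` are made-up stand-ins, not the real YouTube functions, and the helper names are hypothetical.

```python
def build_cache_spec(sig_func, length):
    """Record the input index that each output character was taken from."""
    probe = ''.join(map(chr, range(length)))
    return [ord(c) for c in sig_func(probe)]


def apply_cache_spec(spec, s):
    """Replay the recorded index mapping without running the function."""
    return ''.join(s[i] for i in spec)


def spec_reproduces(func, spec, sample):
    """Check whether replaying the spec matches the function on a sample."""
    try:
        return apply_cache_spec(spec, sample) == func(sample)
    except IndexError:
        return False


def toy_sig(s):
    # Position-only transform: reverse, then swap indices 0 and 3.
    chars = list(s[::-1])
    chars[0], chars[3] = chars[3], chars[0]
    return ''.join(chars)


def toy_nsig(s):
    # Value-dependent transform: changes the characters and the length,
    # so no list of input positions can describe it.
    return ''.join(chr((ord(c) + 1) % 128) for c in s) + 'x'


spec = build_cache_spec(toy_sig, 8)
sig_ok = spec_reproduces(toy_sig, spec, 'ABCDEFGH')

nsig_spec = build_cache_spec(toy_nsig, 8)
nsig_ok = spec_reproduces(toy_nsig, nsig_spec, 'ABCDEFGH')
```

In this toy setup the index list fully captures `toy_sig` for inputs of the probed length, but it cannot capture `toy_nsig`, which is consistent with caching the function code for nsig instead.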

@pukkandan pukkandan marked this pull request as ready for review October 29, 2021 16:55
@pukkandan pukkandan merged commit 404f611 into yt-dlp:master Oct 31, 2021
@megapro17

This comment has been minimized.

gaming-hacker added a commit to gaming-hacker/yt-dlp that referenced this pull request Nov 3, 2021
* master: (28 commits)
  [ffmpeg] Detect libavformat version for `aac_adtstoasc` and print available features in verbose head Based on ytdl-org/youtube-dl#29581
  [ffmpeg] Accurately detect presence of setts
  [ExtractAudio] Use `libfdk_aac` if available Closes yt-dlp#1502 Authored by: CrypticSignal
  [ffmpeg] Framework for feature detection Related: yt-dlp#1502, yt-dlp#1237, ytdl-org/youtube-dl#29581
  [ExtractAudio] Rescale --audio-quality correctly Authored by: CrypticSignal, pukkandan
  [fragment] Fix progress display in fragmented downloads Closes yt-dlp#1517
  [youtube] Remove unnecessary no-playlist warning
  [utils] Parse `vp09` as vp9
  [jsinterp] Handle default in switch better
  [Instagram] Fix incorrect resolution (yt-dlp#1494)
  [vk] Fix login (yt-dlp#1495)
  [docs,cleanup] Improve docs and minor cleanup Closes yt-dlp#1387, yt-dlp#1404, yt-dlp#1408, yt-dlp#1485, yt-dlp#1415, yt-dlp#1450, yt-dlp#1492
  [youtube] refactor itag processing
  [linkedin] Don't login multiple times
  [vk] Add subtitles (yt-dlp#1480)
  [Olympics] Fix extractor (yt-dlp#1483)
  [PlanetMarathi] Add extractor (yt-dlp#1484)
  [Instagram] Add login to playlist (yt-dlp#1488)
  [ceskatelevize] Fix extractor (yt-dlp#1489)
  [youtube] Fix throttling by decrypting n-sig (yt-dlp#1437)
  ...
@pukkandan pukkandan deleted the nsig_decrypt branch January 29, 2022 20:16
Development

Successfully merging this pull request may close these issues.

"This video is not available" on some youtube videos
[YouTube] Randomly slow youtube download speed
5 participants