[extractor/twitter] Migrate to GraphQL, handle auth errors #6957

Merged
bashonly merged 7 commits into yt-dlp:master from the fix/twitter branch on May 1, 2023

bashonly
Member

Sometime in early April, Twitter changed something that broke our age-gate bypass for sensitive media. We had been using a "deprecated" bearer token to get around the age restriction, and that token is now effectively useless. The legacy API (which we were using by default for tweet extraction) only returns a 404 without any further info for age-restricted tweets, so tweet extraction now has to default to the GraphQL API instead: GraphQL returns error info that can be used to detect when to raise_login_required.

The change in default behavior makes the force_graphql extractor-arg pointless, so this PR replaces it with the legacy_api extractor-arg, which has the opposite effect, and could be used as a fallback in case of potential future GraphQL breakage. Also, the legacy API is still used by other subclasses, so legacy API code cannot be removed/replaced entirely.
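
For readers skimming the description, here is a rough standalone sketch of the behavior described above. It is not the code from this PR: `choose_api`, `check_graphql_result`, `LoginRequired`, and the `'age-restricted'` marker string are all hypothetical stand-ins, and the real wording of Twitter's error messages is an assumption.

```python
class LoginRequired(Exception):
    """Hypothetical stand-in for the extractor's login-required error."""


def choose_api(extractor_args):
    # GraphQL is the default; the legacy API is used only when the
    # legacy_api extractor-arg is present (the argument shape is assumed).
    return 'legacy' if 'legacy_api' in extractor_args else 'graphql'


def check_graphql_result(result):
    """Return the result unchanged, or raise if it reports an age restriction."""
    messages = [
        error['message'] for error in result.get('errors') or []
        if isinstance(error.get('message'), str)
    ]
    # Assumed marker text: the PR does not quote the actual error message.
    if any('age-restricted' in message.lower() for message in messages):
        raise LoginRequired('; '.join(messages))
    return result
```

The point of the sketch is only that a GraphQL response carries error messages that can be inspected, whereas a bare 404 from the legacy API gives nothing to branch on.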

Closes #6763

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

bashonly added the site-bug (Issue with a specific website) label on Apr 30, 2023

if result.get('errors'):
    errors = traverse_obj(result, ('errors', ..., 'message', {str}))
    if first_attempt and any('bad guest token' in error.lower() for error in errors):
Member

Suggested change
- if first_attempt and any('bad guest token' in error.lower() for error in errors):
+ if first_attempt and not self.is_logged_in and any('bad guest token' in error.lower() for error in errors):

correct? Otherwise it may unnecessarily try twice when logged in.

Member Author

@bashonly bashonly May 1, 2023

It wouldn't hurt to add, but I don't think it's necessary, since we never pass a guest token when logged in.

EDIT: ended up refactoring the error handling and adding it
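
To make the point of this thread concrete, here is a minimal standalone sketch of the retry guard being discussed. It is not the refactored code that was merged: `call_with_guest_retry`, `fetch`, and `new_guest_token` are hypothetical stand-ins, and only the `'bad guest token'` check and the `first_attempt` / logged-in condition come from the snippet under review.

```python
def call_with_guest_retry(fetch, is_logged_in, new_guest_token):
    """Call fetch(token), retrying once with a fresh guest token if needed.

    fetch and new_guest_token are hypothetical callables used for illustration.
    """
    # No guest token is passed at all when logged in.
    token = None if is_logged_in else new_guest_token()
    for first_attempt in (True, False):
        result = fetch(token)
        errors = [
            error['message'] for error in result.get('errors') or []
            if isinstance(error.get('message'), str)
        ]
        # Retry only on the first attempt, only for guests, and only when
        # the API complains about the guest token.
        retry = (first_attempt and not is_logged_in
                 and any('bad guest token' in error.lower() for error in errors))
        if not retry:
            return result
        token = new_guest_token()  # refresh the token and try once more
```

With the `not is_logged_in` guard, a logged-in session makes exactly one request even if the response happens to mention a bad guest token, which is the unnecessary second attempt the suggestion was meant to avoid.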

bashonly merged commit 147e62f into yt-dlp:master on May 1, 2023
11 checks passed
bashonly deleted the fix/twitter branch on May 2, 2023 11:50
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Successfully merging this pull request may close these issues.

[twitter] Unable to download JSON metadata (only age-restricted links)