[extractor/twitter] Migrate to GraphQL, handle auth errors #6957

bashonly · 2023-04-30T11:53:26Z

Sometime in early April, Twitter changed something on its end and broke our age-gate bypass for sensitive media. We had been using a "deprecated" bearer token to get around the age restriction, and that token is now effectively useless. The legacy API, which we were using by default for tweet extraction, returns only a 404 with no further info for age-restricted tweets. Tweet extraction therefore has to default to the GraphQL API instead, since GraphQL returns error info that can be used to detect when to raise_login_required.
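
To illustrate why the error payload matters, here is a minimal sketch, not the extractor's actual code. It assumes a GraphQL-style response shaped like `{'errors': [{'message': ...}]}`. The `LoginRequired` exception and the `'age-restricted'` marker string are hypothetical stand-ins; the real error text returned by Twitter may differ.

```python
class LoginRequired(Exception):
    """Hypothetical stand-in for the extractor's raise_login_required()."""


def check_graphql_result(result):
    """Return the result unchanged, or raise if its errors point to an auth wall.

    Assumes errors look like {'errors': [{'message': '...'}, ...]}.
    """
    # Collect only well-formed string messages and ignore anything else.
    messages = [
        err['message'] for err in result.get('errors') or []
        if isinstance(err, dict) and isinstance(err.get('message'), str)
    ]
    # Hypothetical marker; the real error text may be worded differently.
    if any('age-restricted' in msg.lower() for msg in messages):
        raise LoginRequired('; '.join(messages))
    return result
```

A bare 404 from the legacy API offers nothing comparable to inspect, which is why it cannot drive this kind of check.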

The change in default behavior makes the force_graphql extractor-arg pointless, so this PR replaces it with the legacy_api extractor-arg, which has the opposite effect and can be used as a fallback if GraphQL breaks in the future. The legacy API is also still used by other subclasses, so the legacy API code cannot be removed or replaced entirely.
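
A rough sketch of the new default, assuming extractor-args arrive as a dict mapping each name to a list of values (so a bare `legacy_api` flag appears as a key, even with an empty value). The `select_tweet_api` helper is hypothetical and only illustrates the inverted default:

```python
def select_tweet_api(extractor_args):
    """Pick the API for tweet extraction from parsed extractor-args.

    Hypothetical helper: GraphQL is the default, and the mere presence
    of the `legacy_api` key opts back into the legacy API.
    """
    if 'legacy_api' in extractor_args:
        return 'legacy'
    return 'graphql'
```

On the command line this would presumably be requested with `--extractor-args "twitter:legacy_api"`, assuming the usual extractor-args syntax.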

Closes #6763

Template

Before submitting a pull request make sure you have:

At least skimmed through contributing guidelines including yt-dlp coding conventions
Searched the bugtracker for similar pull requests
Checked the code with flake8 and ran relevant tests

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Fix or improvement to an extractor (Make sure to add/update tests)
New extractor (Piracy websites will not be accepted)
Core bug fix/improvement
New feature (It is strongly recommended to open an issue first)

pukkandan · 2023-04-30T21:09:23Z

yt_dlp/extractor/twitter.py

+
+            if result.get('errors'):
+                errors = traverse_obj(result, ('errors', ..., 'message', {str}))
+                if first_attempt and any('bad guest token' in error.lower() for error in errors):


Suggested change

-                if first_attempt and any('bad guest token' in error.lower() for error in errors):
+                if first_attempt and not self.is_logged_in and any('bad guest token' in error.lower() for error in errors):

Correct? Otherwise it may unnecessarily try twice when logged in.

bashonly · 2023-05-01 (edited)

It wouldn't hurt to add, but I don't think it's necessary, since we already never pass a guest token when logged in.

EDIT: ended up refactoring the error handling and adding it
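
The retry condition discussed in this thread can be sketched in isolation as follows. This is an illustration under assumptions, not the merged code: `fetch` and `get_guest_token` are hypothetical callables standing in for the extractor's request and token helpers, and a plain list comprehension replaces `traverse_obj`.

```python
def call_api(fetch, get_guest_token, is_logged_in=False):
    """Call `fetch(token)`, retrying once with a fresh guest token if needed.

    `fetch` and `get_guest_token` are hypothetical stand-ins for the
    extractor's request and guest-token helpers.
    """
    # No guest token is passed at all when logged in.
    token = None if is_logged_in else get_guest_token()
    for first_attempt in (True, False):
        result = fetch(token)
        messages = [
            err['message'] for err in result.get('errors') or []
            if isinstance(err, dict) and isinstance(err.get('message'), str)
        ]
        stale_token = any('bad guest token' in msg.lower() for msg in messages)
        if first_attempt and not is_logged_in and stale_token:
            token = get_guest_token()  # refresh and try exactly once more
            continue
        # Second attempt, logged-in call, or unrelated outcome: return as-is.
        return result
```

With the `is_logged_in` check in place, a logged-in call is never repeated, even if the response happens to mention a bad guest token.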


Closes yt-dlp#6763 Authored by: bashonly

[extractor/twitter] Migrate to GraphQL, handle auth errors

e11392e

bashonly added the site-bug (Issue with a specific website) label Apr 30, 2023

bashonly added 2 commits April 30, 2023 10:50

Note which API is being called

51fba44

Allow metadata-only tweet extraction

1103ea3

pukkandan approved these changes Apr 30, 2023


pukkandan assigned bashonly Apr 30, 2023

bashonly added 4 commits May 1, 2023 16:42

Rename _token to _guest_token

3ff126a

Refactor guest token fetching per code review

d81e3e8

Improve _call_api error handling

2203d10

Update tests

66329d2

bashonly merged commit 147e62f into yt-dlp:master May 1, 2023
11 checks passed

bashonly deleted the fix/twitter branch May 2, 2023 11:50

aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024

[extractor/twitter] Default to GraphQL, handle auth errors (yt-dlp#6957)

099ed61

Closes yt-dlp#6763 Authored by: bashonly
