[ie/twitter] Fix retweet extraction & remove broken fallback #8016
Merged
bashonly added the site-bug (Issue with a specific website) and pending-review (PR needs a review) labels on Sep 2, 2023
Grub4K approved these changes on Sep 2, 2023
The fallback actually does work; see the message in Discord.
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request on Apr 21, 2024
Authored by: bashonly
When the input URL is for a tweet of a retweeted video, the actual tweet that's being reposted should be extracted instead of the content-less retweet itself. This was being done correctly when passing the `legacy_api` extractor-arg, but the necessary transformations to the GraphQL response were not being made otherwise.

Since I was already rewriting `TwitterIE._extract_status()`, I cleaned it up by removing the syndication fallback that is now broken. This will make debugging much simpler.

Finally, in the process of running/updating the tests, I realized the `TwitterBroadcast` extractor was raising an unexpected `AttributeError` when a broadcast no longer exists, so I added 2 lines to catch that.

Addresses #7850 (comment)
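The retweet handling described above can be sketched roughly as follows. This is a minimal illustration on plain dicts, not the code from this PR: the function names `graphql_to_legacy` and `extract_status` are hypothetical stand-ins, and the response shape (a `legacy` dict holding a `retweeted_status_result` whose `result` is the reposted tweet) is an assumption about the GraphQL payload.

```python
def graphql_to_legacy(result):
    """Flatten a GraphQL-style tweet result into a legacy-style status dict.

    Assumed shape (not confirmed by this PR text):
    {'legacy': {..., 'retweeted_status_result': {'result': {<tweet result>}}}}
    """
    # Copy so the caller's response is not mutated
    status = dict(result.get('legacy') or {})
    # Move the nested reposted tweet, if any, to a legacy-style key
    retweeted = (status.pop('retweeted_status_result', None) or {}).get('result')
    if retweeted:
        status['retweeted_status'] = graphql_to_legacy(retweeted)
    return status


def extract_status(result):
    """Return the reposted tweet when present, else the tweet itself."""
    status = graphql_to_legacy(result)
    # A retweet carries no media of its own, so prefer the original tweet
    return status.get('retweeted_status') or status
```

With this shape, a retweet resolves to the status of the tweet it reposts, while an ordinary tweet is returned unchanged.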
Copilot Summary
🤖 Generated by Copilot at e8b6fdb
Summary
🐦🎥🔄
Improve Twitter extraction and fix bugs. Add support for retweets, handle missing or private broadcasts, and use legacy API as fallback. Update and add tests for `yt_dlp/extractor/twitter.py`.

Walkthrough
- `functools`
- `TwitterIE` class, testing different scenarios of tweets, retweets, and broadcasts
- `_GRAPHQL_ENDPOINT` property to `TwitterIE` class, returning the appropriate GraphQL endpoint depending on login status
- `_graphql_to_legacy` method of `TwitterIE` class, adding support for extracting retweeted status and user from GraphQL response
- `_extract_status` method of `TwitterIE` class, changing the order and logic of API calls, preferring legacy API over GraphQL API if not logged in and `legacy_api` argument is set, and returning retweeted status if present
- `_real_extract` method of `TwitterIE` class, removing redundant extraction of media id from video URL
- `_real_extract` method of `TwitterBroadcastIE` class, adding a check for the existence of broadcast in API response, and raising extractor error if missing
- `_real_extract` method of `TwitterBroadcastIE` class, explaining the purpose of broadcast id extraction
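
Two of the walkthrough items, the login-dependent endpoint property and the missing-broadcast check, might look something like the sketch below. Everything here is illustrative: `TwitterSketch`, `get_broadcast`, the endpoint strings, and the `broadcasts` response key are hypothetical, `ExtractorError` is a local stand-in rather than the real yt-dlp class, and using `functools.cached_property` is only a guess at why `functools` is imported.

```python
import functools


class ExtractorError(Exception):
    """Local stand-in for the extractor's user-facing error type."""


class TwitterSketch:
    def __init__(self, logged_in):
        self.is_logged_in = logged_in

    @functools.cached_property
    def graphql_endpoint(self):
        # Placeholder names; the real endpoint paths are not given in this PR text
        return 'endpoint-for-accounts' if self.is_logged_in else 'endpoint-for-guests'


def get_broadcast(api_response, broadcast_id):
    """Look up a broadcast and fail clearly when it is gone."""
    # Assumed response shape: {'broadcasts': {<id>: <info dict or None>}}
    broadcast = (api_response.get('broadcasts') or {}).get(broadcast_id)
    if not broadcast:
        # Raise a descriptive error instead of letting an AttributeError
        # surface later when fields are read from a missing broadcast
        raise ExtractorError('Broadcast no longer exists')
    return broadcast
```

Checking for the missing broadcast up front keeps the failure message meaningful to the user, which is the behaviour the PR description says the two added lines are meant to achieve.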