[ie/crunchyroll] Remove initial state scrape #7632

Merged
Grub4K merged 5 commits into yt-dlp:master from fix/crunchy-initial-state on Jul 20, 2023

Conversation

Grub4K
Member

@Grub4K Grub4K commented Jul 18, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

This PR removes the initial state scraping, saving one request. The initial state payload was removed from the webpage, and so we hardcode these values instead.

It also tries to add a more helpful message for the 403 workaround (#7442).

Fixes #7624

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Copilot Summary

🤖 Generated by Copilot at 06d280f

Summary

🛠️🌐✅

Improve Crunchyroll extraction by using better API parameters, handling errors, and fixing a bug.

Crunchyroll extracts
With client IDs, locale
Cloudflare cut off

Walkthrough

  • Add and use class attributes for client IDs and locale codes
  • Refactor and improve base extractor methods
  • Fix minor bug in beta extractor URL regex
  • Add test case for beta extractor with German language URL
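As a rough illustration of the first walkthrough item, here is a minimal sketch of keeping client IDs and locale codes as class attributes instead of scraping them from the page's initial state. The class name, attribute names, and every value below are hypothetical placeholders, not the ones used in the actual extractor.

```python
class HardcodedApiConfig:
    """Hypothetical stand-in for an extractor base class (not yt-dlp's real class)."""

    # Placeholder client IDs: index 0 for anonymous use, index 1 for logged-in use.
    CLIENT_IDS = ('example_anon_client', 'example_user_client')

    # Placeholder mapping from the URL language prefix to an API locale code.
    # The empty string stands for a URL without a language prefix.
    LOCALE_LOOKUP = {
        '': 'en-US',
        'de': 'de-DE',
        'fr': 'fr-FR',
    }

    @classmethod
    def client_id_for(cls, logged_in):
        # Choose the client ID without any extra webpage request.
        return cls.CLIENT_IDS[1 if logged_in else 0]

    @classmethod
    def locale_for(cls, lang):
        # Fall back to the default locale for a missing or unknown prefix.
        return cls.LOCALE_LOOKUP.get(lang or '', cls.LOCALE_LOOKUP[''])
```

With values fixed like this, resolving the locale for a German-language URL is a plain dictionary lookup (`HardcodedApiConfig.locale_for('de')`), and no request to the webpage is needed.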

@ProDev2

ProDev2 commented Jul 18, 2023

Works perfectly for me! Thank you. Also I checked the code and it shouldn't cause any merge conflicts with my PR (#7009) after you merge it to master.

@Grub4K Grub4K added the site-enhancement, site-bug, and needs-testing labels Jul 18, 2023
@Grub4K
Member Author

Grub4K commented Jul 18, 2023

This PR and build aim to fix both the 403 and the initial state error.
If working correctly, the 403 should no longer be present.
If it still is, please let me know the specific region and OS that failed in a comment with a verbose (-v) log.

If it fails, there is still a way to do the workaround as discussed here:

  • Extract your current User-Agent from the browser
    • This can be done using google/duckduckgo/... by searching for "my user-agent"
  • Open a crunchyroll.com page in the browser
  • Load cookies from the browser into yt-dlp
    • Use either --cookies-from-browser or --cookies
  • Pass the previously extracted User-Agent to yt-dlp
    • Use --user-agent <your_user_agent>
  • Additionally, pass the flag to disable the automatic workaround
    • --extractor-arg crunchyrollbeta:ua_workaround
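The workaround steps above could be combined into a single invocation. Below is a minimal sketch that only assembles the command line from the flags listed in this comment; the helper name, the example URL, the browser name, and the User-Agent string are assumptions for illustration.

```python
import shlex


def build_workaround_command(url, user_agent, browser=None, cookies_file=None):
    """Assemble a yt-dlp command for the manual 403 workaround (hypothetical helper)."""
    # Cookies come from exactly one source: a browser or a cookies file.
    if (browser is None) == (cookies_file is None):
        raise ValueError('pass exactly one of browser or cookies_file')

    cmd = ['yt-dlp', '-v']  # -v gives the verbose log requested for bug reports
    if browser is not None:
        cmd += ['--cookies-from-browser', browser]
    else:
        cmd += ['--cookies', cookies_file]
    # The User-Agent must be the one from the browser that produced the cookies.
    cmd += ['--user-agent', user_agent]
    # Flag quoted from the steps above; it disables the automatic workaround.
    cmd += ['--extractor-arg', 'crunchyrollbeta:ua_workaround']
    cmd.append(url)
    return cmd


if __name__ == '__main__':
    # Placeholder values; substitute your own URL, browser, and User-Agent.
    example = build_workaround_command(
        'https://www.crunchyroll.com/watch/EXAMPLE',
        'Mozilla/5.0 (placeholder user agent)',
        browser='firefox',
    )
    print(shlex.join(example))
```

Printing the command with `shlex.join` quotes the User-Agent correctly for a POSIX shell, so the output can be pasted into a terminal as-is.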

To test, first get the build in one of the following ways:

  • update directly from a regular build: yt-dlp --update-to Grub4K/yt-dlp@2023.07.18.215323
  • download it from the pr-branch pre-release
  • run from source: git clone git@github.com:Grub4K/yt-dlp.git,
    then cd yt-dlp and git switch fix/crunchy-initial-state
  • install using pip: python3 -m pip install -U pip setuptools wheel, then python3 -m pip install --force-reinstall https://github.com/Grub4K/yt-dlp/archive/fix/crunchy-initial-state.tar.gz

Afterwards, try and run through the above steps, letting me know of any further errors.

@Grub4K Grub4K merged commit 9b16762 into yt-dlp:master Jul 20, 2023
13 checks passed
@Grub4K Grub4K removed the needs-testing Patch needs testing label Jul 20, 2023
@Grub4K Grub4K deleted the fix/crunchy-initial-state branch July 20, 2023 20:18
@bashonly
Member

@hajimekun Doesn't sound like it would be related. You can open a new issue if you think it should be looked into

aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Labels
site-bug Issue with a specific website site-enhancement Feature request for some website
Development

Successfully merging this pull request may close these issues.

Crunchyroll ERROR: Unable to extract initial state
3 participants