[extractor/iprima] Fix extractor (relax nuxt function regex, add js_to_json hack) #7216
…dd js_to_json hack) flake8 fix
…egex, add js_to_json hack) replace all occurences of Array constructor, not just a single one
see #7229 (comment)
Adjusted for latest changes to iPrima
…json) make the code more declarative as suggested in review
This comment was marked as resolved.
pre-release build available for testing:
Hello @std-move, this is my log with the test movie. I have paid access to prima+:
[debug] Command-line config: ['-vU', '-u', 'PRIVATE', '-p', 'PRIVATE', 'https://www.iprima.cz/serialy/zoo/zoo/126-epizoda-126']
Thank you for your support.
It looks like you have a paid account. Another user reported a similar error in issue #6524; I would advise you to add your report there. With my free user account, the download works fine, so as a temporary workaround you can create a new free account and use it to download the show, albeit in non-HD quality. Making the extractor work with paid accounts would probably require some changes (unless the content is DRM-protected, in which case they wouldn't help, as DRM circumvention is against yt-dlp's policies).
simplify Array constructor replacement by using backreferences
Co-authored-by: Simon Sawicki <accounts@grub4k.xyz>
Thanks @Grub4K for the suggestion, backreferences totally escaped me; they really simplify the code. I also added simple tests for the map/array constructor conversion, as suggested. Please review the changes again.
Looks good like that
use greedy search
Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
Pls merge as 2 commits @bashonly
Closes yt-dlp#7229 Authored by: std-move
IMPORTANT: PRs without the template will be CLOSED
Description of your pull request and other information
Fixes the extractor issue mentioned in the latest comment in #6524 (comment), but not the originally reported issue (unable to verify/reproduce that one as I don't have a paid account). Details:
// Fixes #7229
the iPrima extractor has been broken due to failure to extract nuxt_js data. This happens due to additional code being present before the return object statement. Shortened example:
This additional code, which modifies a passed parameter, is not needed to extract the data we need. I have relaxed the regex to allow this additional code to be present.
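To illustrate the relaxation described above, here is a minimal sketch. It is not the pattern used in the extractor: both regexes, the sample string, and the helper name `find_nuxt_function` are assumptions made up for this example. The strict pattern requires `return` directly after the function's opening brace, while the relaxed one tolerates arbitrary statements before it.

```python
import re

# Both patterns are illustrative assumptions, not the ones used in yt-dlp.
# Strict: `return` must come immediately after the opening brace.
STRICT_RE = (r'\(function\((?P<arg_keys>[^)]*)\)\{'
             r'return\s+(?P<js>\{.*?\})\s*;?\s*\}\((?P<arg_vals>.*?)\)')
# Relaxed: a lazy `.*?` skips any statements that precede `return`.
RELAXED_RE = (r'\(function\((?P<arg_keys>[^)]*)\)\{'
              r'.*?\breturn\s+(?P<js>\{.*?\})\s*;?\s*\}\((?P<arg_vals>.*?)\)')

# Hypothetical page fragment: a parameter is modified before the return.
SAMPLE = 'window.__NUXT__=(function(a,b){b.x=a;return {title:a,meta:b}}("Zoo",{}));'


def find_nuxt_function(webpage, pattern):
    """Return the named groups of the first match, or None if nothing matches."""
    mobj = re.search(pattern, webpage, re.DOTALL)
    return mobj.groupdict() if mobj else None


print(find_nuxt_function(SAMPLE, STRICT_RE))
print(find_nuxt_function(SAMPLE, RELAXED_RE))
```

On this sample the strict pattern finds nothing, while the relaxed one captures the argument names, the returned object literal, and the call arguments. The lazy quantifiers keep each capture as short as possible, which works here only because the returned object contains no nested braces.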
The second issue was that js_to_json failed due to an Array() parameter being present. Shortened example of the input:
Without the changes I've made to the function, quotes get added around Array, resulting in the following parsing exception:
Matching the whole line in the regex is required as the Array parameter is only a part of the original array. This code is not very nice, but then again js_to_json is kinda hacky anyway; it is not a proper parser.
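The backreference approach mentioned in the review comments can be sketched as follows. This is a simplified, standalone illustration rather than the code in js_to_json: the helper name `replace_array_constructors` and the sample input are hypothetical, and the input is assumed to be valid JSON apart from the constructor calls.

```python
import json
import re


def replace_array_constructors(code):
    """Rewrite `Array(...)` / `new Array(...)` calls as `[...]` literals.

    The captured argument list is reinserted via the \\g<1> backreference.
    Assumption: the arguments contain no nested parentheses, because the
    lazy `.*?` stops at the first closing parenthesis.
    """
    return re.sub(r'(?:new\s+)?Array\((.*?)\)', r'[\g<1>]', code)


# Hypothetical input: JSON-like text that still contains constructor calls.
SAMPLE = '{"items": [1, Array(), new Array(2, 3)]}'

converted = replace_array_constructors(SAMPLE)
print(converted)
print(json.loads(converted))
```

Rewriting the constructor call before any quoting step means the bare word Array never reaches the JSON parser. The sketch does not cover JavaScript's special case where a single numeric argument sets the array length, and it would also rewrite any identifier that merely ends in Array.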
This should make the extractor work again and hopefully not break other things. Suggestions to improve the code are very much welcome.
Copilot Summary
🤖 Generated by Copilot at 5782255
Summary
🐛🔧🚀
Fix Nuxt.js metadata extraction for some websites by improving regexes for common.py and adding Array constructor handling in _utils.py.
Walkthrough