Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extractor/nebula] Remove broken cookie support #5979

Merged
merged 6 commits into from
Feb 17, 2023

Conversation

hheimbuerger
Copy link
Contributor

@hheimbuerger hheimbuerger commented Jan 6, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

NOTE: This is based on the unmerged branch fix-nebula-extractor (apologies for the unspecific name) from PR #4002. It should either be merged after or into that branch.

Removes broken cookie authentication code from the Nebula extractor.

Reason: Nebula has moved their authentication token from the cookies into local storage. Therefore, authenticating via --cookies-from-browser is no longer feasible. Accessing the browsers' local storage cache is currently not supported by yt-dlp. (Discord conversation)

The extractor now uses raise_login_required(method='password') to notify users trying to authenticate from cookies.

Note that this removes some additional (arguably accidental) functionality, where you would allow yt-dlp once to authorize to Nebula using a supported authentication method, and then yt-dlp would store the authentication token as a cookie and you could reuse that during subsequent runs without providing a supported authentication method. But I find that too magic, and difficult to explain and reason about, so I didn't mind it going away.

The extractor-internal test cases are currently broken. I'm pretty sure it's not due to any extractor issues, but rather new requirements since introduced by the test suite. I will look into that next week, but didn't want to hold back on opening the PR in the meantime.
EDIT 2023-01-09: All unit tests now pass, and manual tests (including channel downloading) have been successful, too.

Fixes the bug report in this Discord conversation.

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

hheimbuerger and others added 5 commits June 7, 2022 11:20
Nebula has moved their authentication token from the cookies into local
storage. Therefore, authenticating via `--cookies-from-browser` is no
longer feasible. Accessing the browsers' local storage cache is
currently not supported by yt-dlp.
'url': zype_video_url,
'formats': fmts,
'subtitles': subs,
'url_result': f'https://nebula.app/{episode["slug"]}' if not fetch_formats_and_subs else '',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'url_result': f'https://nebula.app/{episode["slug"]}' if not fetch_formats_and_subs else '',
'url_result': f'https://nebula.app/{episode["slug"]}' if not fetch_formats_and_subs else '',

did u mean webpage_url?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I do.

I've always been using url_result here, I just looked up webpage_url for the first time. I guess url_result is a remnant of when this extractor was still handing off videos to the Zype extractor for downloading.

Changing it now.

@hheimbuerger
Copy link
Contributor Author

All unit tests now pass again.
I've also manually tested downloading individual videos and downloading channels.

This should also fix the broken channel downloading support. (Admittedly, that fix probably should have gone to the earlier PR, if you want to be really precise. It's here purely out of laziness. Sorry for the confusion.)

@pukkandan
Copy link
Member

Admittedly, that fix probably should have gone to the earlier PR, if you want to be really precise. It's here purely out of laziness. Sorry for the confusion.

No, it's fine. I was planning to merge them together anyway

@hheimbuerger
Copy link
Contributor Author

hheimbuerger commented Jan 31, 2023

@pukkandan With no intention to put additional pressure on you — is there anything I can do to support you in advancing this? Anything I could check, improve or deliver (e.g. test results) to make life easier for you?

I could combine the two Nebula PRs, or close both and instead reopen a cleaner, joined one?

@pukkandan
Copy link
Member

I could combine the two Nebula PRs, or close both and instead reopen a cleaner, joined one?

This PR contains the other one already, no?


I've just been too busy irl this month and a lot of PRs are pending since last release. I expect to have more time in a couple of days

@hheimbuerger
Copy link
Contributor Author

I could combine the two Nebula PRs, or close both and instead reopen a cleaner, joined one?

This PR contains the other one already, no?

It kinda does, yeah. It's two different branches (on top of each other), and there's two PRs with somewhat outdated titles and descriptions, so... it was just the only thing I could think of to make things easier for you.

I've just been too busy irl this month and a lot of PRs are pending since last release. I expect to have more time in a couple of days

Ok, no worries. Thanks for your all your hard, highly appreciated work!

@pukkandan pukkandan merged commit d50ea3c into yt-dlp:master Feb 17, 2023
@pukkandan
Copy link
Member

You've re-introduced the --flat-playlist problem with your last commit. I merged this for now since I'm gonna be making a release. I also noticed that a lot of the code could use some refactoring.

Have a look at master...pukkandan:yt-dlp-dev:features/nebula-flat. Note that I don't have an account and so haven't tested any of the changes. Expect some bugs

@hheimbuerger
Copy link
Contributor Author

You've re-introduced the --flat-playlist problem with your last commit. I merged this for now since I'm gonna be making a release. I also noticed that a lot of the code could use some refactoring.

Have a look at master...pukkandan:yt-dlp-dev:features/nebula-flat. Note that I don't have an account and so haven't tested any of the changes. Expect some bugs

👍 I'll look into and will test your branch right now!

@hheimbuerger
Copy link
Contributor Author

hheimbuerger commented Feb 17, 2023

@pukkandan I took a look and ran some tests, but on the spot, I'm struggling to understand the reasoning behind your changes. As you know, we don't really see eye to eye on code style. 😉 As all the explaining code comments have been removed, and your commit messages are also not really telling much of a story, I don't have much to go by.
Given that I only look at this every other month or so, even my own code I cannot remember the reasoning for, without the full documentation. So I'd have to reverse engineer the code, but for that I'd need to take a day, which I don't have right now.

Therefore, here are some notes from my tests, but I often don't yet know why this has changed or why it doesn't work anymore and behaves the way it does not.

  • In the smuggle_url call, you meant to include a videos/ path component, to make sure the videos are passed off to NebulaIE. Right now, they're passed to NebulaChannelIE, which is of course not what's intended.
  • NebulaSubscriptionsIE._generate_playlist_entries() still contains a call to _build_video_info(), which you removed. I think you meant to make the same change you made to NebulaChannelIE._generate_playlist_entries(), just looping over channel['results'] this time. That seems to work in my limited experiments.
  • Unit tests fail, because the timestamp and upload_date fields are now non-existant. I guess the published_at extraction doesn't work? But I haven't looked into why or how.
  • In some cases, the Nebula API delivers nebula.app URLs again, instead of nebula.tv, failing the tests checking for the latter. I assume that's some weirdness on their end, but haven't further investigated.
  • The first two video tests fail with Expected test_Nebula_5c271b40b13fd613090034fd.mp4 to be at least 9.77KiB, but it's only 756.00B. I'm not sure why that is, as they seem to kick off the right thing, here's a log excerpt: Invoking hlsnative downloader on "https://media-production.nebula.app/[CONFIRMED VALID MEDIA URL]" / Total fragments: 369 / [download] 100% of 756.00B in 00:00:00 at 5.02KiB/s. The third video test passes! It also shows in the log that it's downloading 4.60MiB. Not sure what's different about this one. In normal execution, all video URLs produce large, valid video files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants