Fetching youtube history doesn't respect playlist-end parameter #16238

Closed
mymikemiller opened this issue Apr 21, 2018 · 3 comments

@mymikemiller mymikemiller commented Apr 21, 2018

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.04.16. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.04.16

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add the -v flag to your command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-u', u'PRIVATE', u'-p', u'PRIVATE', u':ythistory', u'--flat-playlist', u'--playlist-end', u'10', u'-j', u'-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.04.16
[debug] Python version 2.7.10 (CPython) - Darwin-17.5.0-x86_64-i386-64bit
[debug] exe versions: none
[debug] Proxy map: {}

Description of your issue, suggested solution and other information

I'm trying to fetch my YouTube watch history using youtube-dl, as it's no longer provided by the YouTube API v3. This works great using either of the following commands:

youtube-dl -u myemail -p mypassword https://www.youtube.com/feed/history --flat-playlist --playlist-end 10 -j

or

youtube-dl -u myemail -p mypassword :ythistory --flat-playlist --playlist-end 10 -j

As expected, this returns the latest 10 videos watched, in json format.

Unfortunately, it seems to have to download every page of my history (as you can see by removing the -j flag: 72 pages total, 8427 videos) before returning the first 10 videos, which are all on the first page.

[youtube:history] Youtube History: Downloading page #1
[youtube:history] Youtube History: Downloading page #2
...
[youtube:history] Youtube History: Downloading page #71
[youtube:history] Youtube History: Downloading page #72

It should have only had to request 1 page, but it keeps requesting pages until it has them all. As it takes a couple seconds to fetch each page, shouldn't it stop once it's found 10 videos, as I've specified in the --playlist-end parameter?

I understand why it would need to look at all the pages until it gets to the video I specified in --playlist-start, but once it gets to the --playlist-end'th video, I expect it to stop fetching pages and return what it's found now that it has all the information I requested (it's seen everything between --playlist-start and --playlist-end).

Author

@mymikemiller mymikemiller commented Apr 21, 2018

I dug into the code a little, and I found a potential solution, though I don't know how much it would break things.

In the _real_extract() function in youtube.py, after ids.extend(new_ids), you could check the length of 'ids' and break out of the loop if that length is greater than what was specified for the --playlist-end parameter, because only the items found before this will be returned anyway.

Does this seem like a reasonable solution? I'd make a pull request, but I don't know how to get access to the --playlist-end parameter inside the _real_extract function.
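
To make the idea concrete, here is a minimal standalone sketch of that early exit. It is not youtube-dl's actual `_real_extract()` code: `fake_fetch_page`, `collect_ids`, and the `playlist_end` argument are hypothetical names standing in for the page request, the extraction loop, and however the `--playlist-end` value would reach the extractor.

```python
def fake_fetch_page(page_num, page_size=3, total=8):
    """Hypothetical stand-in for one history-page request; returns a list of ids."""
    start = (page_num - 1) * page_size
    return ['id%d' % i for i in range(start, min(start + page_size, total))]


def collect_ids(fetch_page, playlist_end=None):
    """Accumulate ids page by page, stopping once playlist_end ids are known.

    Returns the (possibly truncated) id list and the number of pages requested.
    """
    ids = []
    pages_requested = 0
    page_num = 1
    while True:
        new_ids = fetch_page(page_num)
        pages_requested += 1
        if not new_ids:
            break  # ran out of pages
        ids.extend(new_ids)
        # The proposed check: enough ids collected, so skip the remaining pages.
        if playlist_end is not None and len(ids) >= playlist_end:
            break
        page_num += 1
    # ids[:None] keeps everything when no limit was given.
    return ids[:playlist_end], pages_requested
```

With a limit of 2 and 3 ids per page, `collect_ids(fake_fetch_page, playlist_end=2)` requests a single page; without a limit it keeps going until an empty page comes back.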

Collaborator

@dstftw dstftw commented Apr 21, 2018

No, it's not. Correct solution is to use generator based entries or PagedList playlist.
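
As a rough illustration of what generator-based entries buy you, here is a standalone sketch that does not use youtube-dl's own classes. `iter_entries` and `make_counting_fetcher` are hypothetical helpers: the first yields ids lazily, one page at a time, and the second simulates page requests while recording which pages were asked for. The consumer, not the extractor, decides where to stop, so the extractor never needs to know about `--playlist-end`.

```python
import itertools


def iter_entries(fetch_page):
    """Lazily yield ids; a page is requested only when the consumer needs it."""
    page_num = 1
    while True:
        page = fetch_page(page_num)
        if not page:
            return  # no more pages
        for video_id in page:
            yield video_id
        page_num += 1


def make_counting_fetcher(page_size=3, total=8):
    """Hypothetical page source that records every page number requested."""
    calls = []

    def fetch_page(page_num):
        calls.append(page_num)
        start = (page_num - 1) * page_size
        return ['id%d' % i for i in range(start, min(start + page_size, total))]

    return fetch_page, calls


# The consumer applies the limit; only the pages needed to satisfy it are fetched.
fetch_page, calls = make_counting_fetcher()
first_two = list(itertools.islice(iter_entries(fetch_page), 2))
```

Here `first_two` holds the first two ids and `calls` shows that only page 1 was requested. A start offset works the same way: `itertools.islice(entries, start, end)` still has to walk the pages before `start`, but stops as soon as `end` is reached.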

Author

@mymikemiller mymikemiller commented Apr 21, 2018

Searching for "PagedList" (I have no idea what that is), I found this bug from two years ago. Seems like they want the same thing I want: #10184

A change like this is a bit over my head, without help.

In any case, thank you and the other collaborators for this awesome project!

@dstftw dstftw closed this Apr 21, 2018
@dstftw dstftw added the duplicate label Apr 21, 2018