
History feed download is not downloading properly #1845

Closed
yuvadm opened this issue Nov 27, 2013 · 12 comments
Labels: bug
@yuvadm commented Nov 27, 2013

Continuing the new viewing history feature in #1821 -

When running the following command:

$ youtube-dl ':ythistory' -u USER -p PWD --write-pages

The result is:

[youtube:history] Logging in
[youtube:history] History: Downloading webpage
[youtube:history] Saving request to History_https_-_www.youtube.com_feed_history.dump
[youtube:history] history feed: Downloading page 0
[youtube:history] Saving request to history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=0.dump
[youtube:history] history feed: Downloading page 1
[youtube:history] Saving request to history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=1384803931993001.dump
[youtube:history] history feed: Downloading page 2
[youtube:history] Saving request to history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=2769607863986002.dump
[youtube:history] history feed: Downloading page 3
[youtube:history] Saving request to history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=4154411795979003.dump
[youtube:history] history feed: Downloading page 4

with the following files created:

-rw-r--r--   1 user  staff  355768 Nov 27 22:58 History_https_-_www.youtube.com_feed_history.dump
-rw-r--r--   1 user  staff  353934 Nov 27 22:58 history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=0.dump
-rw-r--r--   1 user  staff  384896 Nov 27 22:58 history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=1384803931993001.dump
-rw-r--r--   1 user  staff  353934 Nov 27 22:58 history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=2769607863986002.dump
-rw-r--r--   1 user  staff  353934 Nov 27 22:58 history_feed_http_-_www.youtube.com_feed_ajaxaction_load_personal_feed=1_feed_name=history_paging=4154411795979003.dump

As you can see (and a hash check confirms), all files besides the second dump are exactly the same. I let this run for a long time and it got to a few thousand pages, so this is likely a bug.

@jaimeMF (Collaborator) commented Nov 27, 2013

How many videos does your history contain? On my account (which I used to implement it) I only have 151, so I didn't test it with many videos.
Could you open http://www.youtube.com/feed/history and copy the http://www.youtube.com/feed_ajax?action_load_personal_feed=1&feed_name=history&paging={some_number} URLs that are loaded when scrolling to the bottom?

@yuvadm (Author) commented Nov 27, 2013

This is likely a very large viewing history.

What do you need from the AJAX calls? Just the URLs, or any specific HTTP headers as well?

@ghost ghost assigned jaimeMF Nov 27, 2013
@jaimeMF jaimeMF closed this in 0e44d83 Nov 27, 2013
@jaimeMF (Collaborator) commented Nov 27, 2013

Fixed; I managed to reproduce it. We now obtain the paging value from the JSON info, so it won't use wrong values.
Thanks for the report!
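For reference, the fix described above amounts to reading the next paging token out of each JSON response instead of computing it locally, so pagination can't drift. A minimal sketch (the paging field name and response shapes here are assumptions for illustration, not the actual YouTube response format):

```python
import json

def next_paging(response_text):
    """Return the paging token reported by the feed's JSON response,
    or None when the response advertises no further pages."""
    data = json.loads(response_text)
    return data.get('paging')

# Hypothetical response bodies
print(next_paging('{"paging": "1384803931993001", "content_html": "..."}'))
print(next_paging('{"content_html": "..."}'))  # last page: no token
```

Pagination then loops while next_paging() returns a token, rather than incrementing a counter.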

@jaimeMF (Collaborator) commented Nov 27, 2013

I forgot to mention: if you try it (using the latest commit in the repo), be aware that there seems to be a bug in the YouTube interface: it reports more videos in your history than the actual number (for example, I had only 4 videos but it said there were 10).

@phihag (Contributor) commented Nov 28, 2013

This fix has been added into youtube-dl 2013.11.28. Type sudo youtube-dl -U to update.

@yuvadm (Author) commented Nov 28, 2013

It works! Is there an easy way to nicely export the data instead of having youtube-dl download thousands of videos?

@5shekel commented Nov 28, 2013

@yuvadm you can output the video info to the console using:

-g, --get-url       simulate, quiet but print URL
-j, --dump-json     simulate, quiet but print JSON information

I have 10K videos in my history (I think that's the limit), so I can't check if it works; I guess it will not dump the URLs until it's finished with all pages.
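Each line emitted by -j is a standalone JSON object, so the dump is easy to post-process. A minimal sketch (the two sample lines below are made up for illustration; real output has many more fields):

```python
import json

# Hypothetical sample of two lines as emitted by `youtube-dl -j`
lines = [
    '{"id": "abc123", "title": "First video"}',
    '{"id": "def456", "title": "Second video"}',
]

# Parse each line independently and pull out the titles
titles = [json.loads(line)['title'] for line in lines]
print(titles)
```

The same pattern works on a saved file: iterate over its lines and json.loads() each one.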

@5shekel commented Nov 28, 2013

scratch that, it does work

@jaimeMF (Collaborator) commented Nov 28, 2013

@yuvadm if you want the whole info for all the videos, the options given by @yoshco will work.
If you just want the ids of the videos, you can use a Python script:

#!/usr/bin/env python
import youtube_dl


params = {
    'quiet': True,
    'usenetrc': True,
    # or, instead of the netrc file, use:
    # 'username': 'foo',
    # 'password': 'bar',
}
with youtube_dl.YoutubeDL(params) as ydl:
    history_ie = youtube_dl.extractor.YoutubeHistoryIE(ydl)
    result = history_ie.extract(':ythistory')
    for entry in result['entries']:
        # The 'url' field of each entry holds just the video id
        print(entry['url'])

The script needs to import youtube_dl; you can install it with pip install -U youtube-dl

@phihag (Contributor) commented Nov 28, 2013

@jaimeMF I don't think you need to write a script, you can just pass in --get-id if you want the video IDs.

@jaimeMF (Collaborator) commented Nov 28, 2013

@phihag You're right, if you want a built-in solution, you can use --get-id. But if I'm not wrong, it would download the whole info for each video (which may take a while if the list is really long). The script only fetches the history feed pages (there are around 70 videos on each page), so it will be faster.

@Lutraphobia commented May 19, 2016

How can I use the script from jaimeMF to get the TITLE and the URL? When I change print(entry['url']) to print(entry['title']), it tells me invalid argument or something like that.
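A likely explanation (an assumption based on how youtube-dl represents playlist entries, not confirmed in this thread): the history extractor returns flat "url"-type entries that carry only a few keys such as the video id, so entry['title'] raises a KeyError before full extraction has run. A defensive sketch with a hypothetical entry:

```python
# Hypothetical flat entry, as a playlist extractor might return it
entry = {'_type': 'url', 'url': 'dQw4w9WgXcQ', 'ie_key': 'Youtube'}

# entry['title'] would raise KeyError here; use .get() with a fallback
title = entry.get('title', '(no title without full extraction)')
print(title, entry['url'])
```

To get real titles you would have to extract each video's full info, which is exactly the slower path discussed above.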
