[archive.org] Add support for collections #7154

Open
parkerlreed opened this issue Oct 12, 2015 · 6 comments

@parkerlreed parkerlreed commented Oct 12, 2015

css80@allied80 /cygdrive/c/Users/css80/Desktop
$ ./youtube-dl.exe --verbose https://archive.org/details/attentionkmartshoppers
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'--verbose', u'https://archive.org/details/attentionkmartshoppers']
[debug] Encodings: locale cp1252, fs mbcs, out None, pref cp1252
[debug] youtube-dl version 2015.10.12
[debug] Python version 2.7.8 - Windows-8-6.2.9200
[debug] exe versions: ffmpeg N-75841-g5911eeb, ffprobe N-75841-g5911eeb
[debug] Proxy map: {}
[archive.org] attentionkmartshoppers: Downloading JSON metadata
ERROR: attentionkmartshoppers: Failed to parse JSON  (caused by ValueError('No JSON object could be decoded',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "youtube_dl\extractor\common.pyo", line 483, in _parse_json
  File "json\__init__.pyo", line 338, in loads
  File "json\decoder.pyo", line 366, in decode
  File "json\decoder.pyo", line 384, in raw_decode
ValueError: No JSON object could be decoded
Traceback (most recent call last):
  File "youtube_dl\YoutubeDL.pyo", line 660, in extract_info
  File "youtube_dl\extractor\common.pyo", line 290, in extract
  File "youtube_dl\extractor\archiveorg.pyo", line 37, in _real_extract
  File "youtube_dl\extractor\common.pyo", line 477, in _download_json
  File "youtube_dl\extractor\common.pyo", line 487, in _parse_json
ExtractorError: attentionkmartshoppers: Failed to parse JSON  (caused by ValueError('No JSON object could be decoded',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@jaimeMF jaimeMF changed the title from "archive.org not downloading" to "[archive.org] Add support for collections" Oct 12, 2015
@jaimeMF jaimeMF added the request label Oct 12, 2015

@naturallymitchell naturallymitchell commented Apr 20, 2016

archive.org has lots of great media. Hopefully someone can add this feature.


@naturallymitchell naturallymitchell commented Dec 13, 2017

bump


@pljones pljones commented Jun 27, 2018

OK, three years near enough and no progress on this? Maybe I'll have to take a look. The parsing doesn't look to be too tricky.


@rudedogg rudedogg commented Aug 24, 2018

As a workaround, if you click the RSS icon on a collection page and pass that URL to youtube-dl, it'll use the generic extractor.
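
For anyone wondering what the RSS route actually exposes, here is a minimal Python sketch that lists one URL per feed entry. It assumes a plain RSS 2.0 layout (`channel/item` with `link` and optional `enclosure`); the sample feed and the helper name `rss_entry_urls` are invented for illustration and are not taken from youtube-dl or from archive.org's real output.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real collection feed may differ in detail.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example collection</title>
    <item>
      <title>First item</title>
      <link>https://archive.org/details/example-item-1</link>
      <enclosure url="https://archive.org/download/example-item-1/a.mp3"
                 type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>Second item</title>
      <link>https://archive.org/details/example-item-2</link>
    </item>
  </channel>
</rss>
"""


def rss_entry_urls(rss_text):
    """Return one URL per <item>: the enclosure URL if present, else <link>."""
    root = ET.fromstring(rss_text)
    urls = []
    for item in root.findall("./channel/item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            urls.append(enclosure.get("url"))
            continue
        link = item.findtext("link")
        if link:
            urls.append(link.strip())
    return urls


if __name__ == "__main__":
    for url in rss_entry_urls(SAMPLE_RSS):
        print(url)
```

Only the entries present in the feed are visible this way, so anything the feed leaves out is never seen.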


@pljones pljones commented Aug 24, 2018

I didn't get anything going in youtube-dl but, for those with bash, shell utils, wget and metaflac, the attached two scripts may prove of interest.

download.sh.gz
tagger.sh.gz

gunzip them...

  • download.sh - edit it, change the collection name to the page you want to slurp. Note the nasty hack of writing to collection.html. Could do with tidying up... The script then downloads all the "of interest" contents for each page pointed to by the collection.
  • tagger.sh - parses (in bash...) the metadata files slurped by download.sh and applies my tagging scheme to flac files. Feel free to switch to a more general tagger and apply different rules :)
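
The "of interest" selection that download.sh is described as doing could be sketched in Python roughly as follows. This is not a port of the attached script: the extension filter and the helper name `wanted_file_urls` are hypothetical, and it assumes an item's metadata is a JSON object with a `files` list of `{"name": ...}` entries (as served under `https://archive.org/metadata/<identifier>`) and that files are fetched from `https://archive.org/download/<identifier>/<name>`.

```python
from urllib.parse import quote

# Hypothetical filter; adjust to whatever counts as "of interest" for you.
WANTED_EXTENSIONS = (".flac", ".txt", ".xml")


def wanted_file_urls(identifier, metadata, extensions=WANTED_EXTENSIONS):
    """Build download URLs for the files in one item whose names match."""
    urls = []
    for entry in metadata.get("files", []):
        name = entry.get("name", "")
        if name.lower().endswith(tuple(extensions)):
            # quote() keeps "/" so files in subdirectories still resolve.
            urls.append(
                "https://archive.org/download/{}/{}".format(
                    quote(identifier), quote(name)
                )
            )
    return urls


if __name__ == "__main__":
    # Invented metadata in the assumed shape.
    sample = {"files": [{"name": "set 1/track01.flac"}, {"name": "cover.jpg"}]}
    print(wanted_file_urls("example-item", sample))
```

The resulting URLs can be handed to wget or any other downloader.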

@rudedogg -- unfortunately the RSS link doesn't necessarily get you the full collection. For example, it only goes back to 2014 on the OresundSpaceCollective archive, whereas the full collection goes back to 2005. I don't know (not tried it) whether youtube-dl has a way around that but I'd guess it just assumes it's been given all it can get. But thanks for reminding me about this ticket!
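
One possible way around the truncated feed is to page through a search for the collection until a short page comes back. The sketch below assumes archive.org's `advancedsearch.php` endpoint with `output=json` and a response shaped like `{"response": {"docs": [{"identifier": ...}]}}`; the function names are made up for illustration, and very large collections may need a different endpoint or extra handling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://archive.org/advancedsearch.php"


def search_page_url(collection, page, rows=100):
    """Build the search URL for one page of a collection listing."""
    params = [
        ("q", "collection:{}".format(collection)),
        ("fl[]", "identifier"),
        ("rows", rows),
        ("page", page),
        ("output", "json"),
    ]
    return SEARCH_URL + "?" + urlencode(params)


def fetch_json(url):
    """Default fetcher: GET the URL and decode the body as JSON."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collection_identifiers(collection, fetch=fetch_json, rows=100):
    """Collect item identifiers page by page until a short page is returned."""
    identifiers = []
    page = 1
    while True:
        data = fetch(search_page_url(collection, page, rows))
        docs = data.get("response", {}).get("docs", [])
        identifiers.extend(d["identifier"] for d in docs if "identifier" in d)
        if len(docs) < rows:
            break
        page += 1
    return identifiers
```

Calling `collection_identifiers("OresundSpaceCollective")` would hit the network; passing a custom `fetch` makes it easy to try offline.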
