Releases: jjjake/internetarchive

Version 1.7.4

06 Nov 19:46

Features and Improvements

  • Increased timeout in search from 12 seconds to 24.
  • Added ability to set max_retries in internetarchive.search_items().
  • Made internetarchive.ArchiveSession.mount_http_adapter() a public method to support complex custom retry logic.
  • Added --timeout option to ia search for setting a custom timeout.

Bugfixes

  • The scraping API has reverted to using the items key rather than the docs key.
    v1.7.3 will still work, but this change keeps ia consistent with the API.

Version 1.7.2

11 Sep 22:21

Features and Improvements

  • Added support for adding custom headers to ia search.

Bugfixes

  • internetarchive.utils.get_s3_xml_text() is used to parse errors returned by S3 in XML.
    Sometimes there is no XML in the response.
    Most of the time this is due to 5xx errors.
    Either way, we want to always return the HTTPError, even if the XML parsing fails.
  • Fixed a regression where : was being stripped from filenames in upload.
  • Do not create a directory in download() when return_responses is True.
  • Fixed bug in upload where file-like objects were failing with a TypeError exception.
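
The XML fallback described for get_s3_xml_text() can be sketched roughly as follows. This is an illustration, not the library's code: extract_s3_error_message is a hypothetical helper, and the <Message> element name and the default text are assumptions about what an S3-style error body looks like.

```python
import xml.etree.ElementTree as ET

# Assumed placeholder text; the real library may report something different.
DEFAULT_MESSAGE = "(no error message in response)"


def extract_s3_error_message(body, default=DEFAULT_MESSAGE):
    """Return the text of a <Message> element, or `default` if none can be read.

    Hypothetical helper for illustration; not part of internetarchive.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Empty or non-XML body (common with 5xx responses): fall back
        # instead of letting the parse failure hide the original HTTP error.
        return default
    # iter() visits the root and every descendant, so nesting depth is irrelevant.
    for element in root.iter("Message"):
        if element.text:
            return element.text.strip()
    return default
```

A caller would then raise the original HTTPError with whatever message this returns, so a failed parse never replaces the real error.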

Version 1.7.1

25 Jul 19:32

Bugfixes

  • Fixed bug in ia upload where all commands would fail if multiple collections were specified (e.g. -m collection:foo -m collection:bar).
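
As a rough illustration of what repeated -m options have to produce, the sketch below groups key:value strings so that a repeated key becomes a list. collect_metadata_args is a hypothetical helper written for this example; it is not the parsing code used by ia.

```python
from collections import defaultdict


def collect_metadata_args(pairs):
    """Group 'key:value' strings; a key given more than once maps to a list.

    Hypothetical helper for illustration; not part of internetarchive.
    """
    grouped = defaultdict(list)
    for pair in pairs:
        # Split on the first colon only, so values such as URLs keep their colons.
        key, _, value = pair.partition(":")
        grouped[key].append(value)
    # Single values stay plain strings; repeated keys keep every value in order.
    return {key: values[0] if len(values) == 1 else values
            for key, values in grouped.items()}
```

With this sketch, ["collection:foo", "collection:bar"] yields {"collection": ["foo", "bar"]}.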

Version 1.7.0

25 Jul 18:17

Features and Improvements

  • Loosened up jsonpatch requirements, as the metadata API now supports more recent versions of the JSON Patch standard.
  • Added support for building "snap" packages (https://snapcraft.io/).

Bugfixes

  • Fixed bug in upload where users were unable to add their own timeout via request_kwargs.
  • Fixed bug where files with non-ascii filenames failed to upload on some platforms.
  • Fixed bug in upload where metadata keys with an index (e.g. subject[0]) would make the request fail if the key was the only indexed key provided.
  • Added a default timeout to ArchiveSession.s3_is_overloaded().
    If it times out now, it returns True (as in, yes, S3 is overloaded).
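
The "timed out means overloaded" behaviour can be sketched as below. This is a minimal illustration under stated assumptions: is_overloaded and the check callable are hypothetical, and the 12-second default is a placeholder, since the actual default timeout is not given in these notes.

```python
def is_overloaded(check, timeout=12):
    """Return True if `check` reports overload or cannot answer in time.

    `check` is a hypothetical callable that takes a timeout in seconds and
    returns a truthy value when the service is overloaded. It is expected to
    raise OSError (which includes TimeoutError) when the request fails.
    """
    try:
        return bool(check(timeout))
    except OSError:
        # No usable answer: assume the service is overloaded rather than
        # starting an upload that is likely to fail.
        return True
```

Treating a failed check as "overloaded" is the conservative choice: the caller backs off instead of sending more traffic to a service that is already struggling.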

Version 1.6.0

27 Jun 19:18

Features and Improvements

  • Added 60 second timeout to all upload requests.
  • Added support for uploading empty files.
  • Refactored Item.get_files() to be faster, especially for items with many files.
  • Updated search to use IA-S3 keys for auth instead of cookies.

Bugfixes

  • Fixed bug in upload where derives weren't being queued in some cases where checksum=True was set.
  • Fixed bug where ia tasks and other Catalog functions were always using HTTP even when it should have been HTTPS.
  • ia metadata was exiting with a non-zero status for "no changes to xml" errors.
    This now exits with 0, as nearly every time this happens it should not be considered an "error".
  • Added unicode support to ia upload --spreadsheet and ia metadata --spreadsheet using the backports.csv module.
  • Fixed bug in ia upload --spreadsheet where some metadata was accidentally being copied from previous rows
    (e.g. when multiple subjects were used).
  • Submitter wasn't being added to ia tasks --json output; it now is.
  • row_type in ia tasks --json was returning an integer for the row type rather than its name (e.g. 'red').

Version 1.4.0

26 Jan 23:57

Features and Improvements

  • Added ia copy and ia move for copying and moving files in archive.org items.
  • Added support for outputting JSON in ia tasks.
  • Added support to ia download to write to stdout instead of file.

Bugfixes

  • Fixed bug in upload where AttributeError was raised when trying to upload file-like objects without a name attribute.
  • Removed identifier validation from ia delete.
    If an identifier already exists, we don't need to validate it.
    This only makes things annoying if an identifier exists but fails internetarchive id validation.
  • Fixed bug where error message isn't returned in ia upload if the response body is not XML.
    Ideally IA-S3 would always return XML, but that's not the case as of now.
    Try to dump the HTML in the S3 response if unable to parse XML.
  • Fixed bug where ArchiveSession headers weren't being sent in prepared requests.
  • Fixed bug in ia upload --size-hint where value was an integer, but requests requires it to be a string.
  • Added support for downloading files to stdout in ia download and File.download.

Version 1.0.8

10 Aug 00:11

Features and Improvements

  • Increased maximum identifier length from 80 to 100 characters in ia upload.

Bugfixes

  • As of version 2.11.0 of the requests library, all header values must be strings (i.e. not integers).
    internetarchive now converts all header values to strings.
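
The conversion amounts to something like the sketch below. stringify_header_values is a hypothetical helper shown for illustration, not the function internetarchive uses, and the header names in the usage note are only examples.

```python
def stringify_header_values(headers):
    """Return a copy of `headers` in which every value is a str or bytes.

    Hypothetical helper for illustration; not part of internetarchive.
    """
    return {
        # Leave str/bytes untouched; convert everything else (int, float, bool, ...).
        name: value if isinstance(value, (str, bytes)) else str(value)
        for name, value in headers.items()
    }
```

For example, {"x-archive-size-hint": 1024} becomes {"x-archive-size-hint": "1024"}, which requests 2.11.0 and later will accept.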

Version 1.0.7

03 Aug 22:17

Features and Improvements

  • Added internetarchive.api.get_user_info().

Version 1.0.5

07 Jul 18:59

Features and Improvements

  • All metadata writes are now submitted at -5 priority by default. This is friendlier to the archive.org catalog, and should only be changed for one-off metadata writes.
  • Expanded scope of valid identifiers in utils.validate_ia_identifier (i.e. ia upload). Periods are now allowed. Periods, underscores, and dashes are not allowed as the first character.
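
The identifier rule above can be approximated with the sketch below. It is not the implementation of utils.validate_ia_identifier: looks_like_valid_identifier is a hypothetical name, the allowed character set (ASCII letters, digits, periods, underscores, dashes) is an assumption, and the 100-character default is taken from the 1.0.8 note above.

```python
import re

# First character: a letter or digit. Remaining characters may also be
# a period, underscore, or dash. The character set is an assumption.
_IDENTIFIER_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def looks_like_valid_identifier(identifier, max_length=100):
    """Rough check of the identifier rule; hypothetical, for illustration only."""
    if len(identifier) > max_length:
        return False
    # fullmatch() requires the whole string to fit the pattern,
    # so the empty string is rejected as well.
    return _IDENTIFIER_RE.fullmatch(identifier) is not None
```

Under these assumptions "my.item_2017-01" passes, while ".hidden", "_draft", and "-draft" are rejected because of their first character.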

Version 1.0.4

28 Jun 18:55

Features and Improvements

  • Search now uses the v1 scraping API endpoint.
  • Moved internetarchive.item.Item.upload.iter_directory() to internetarchive.utils.
  • Added support for downloading "on-the-fly" files (e.g. EPUB, MOBI, and DAISY) via ia download <id> --on-the-fly or item.download(on_the_fly=True).

Bugfixes

  • s3_is_overloaded() now returns True if the call is unsuccessful.
  • Fixed bug in upload where a derive task wasn't being queued when a directory is uploaded.