Releases: jjjake/internetarchive
Version 1.7.4
Features and Improvements
- Increased timeout in search from 12 seconds to 24.
- Added ability to set max_retries in internetarchive.search_items.
- Made internetarchive.ArchiveSession.mount_http_adapter a public method for supporting complex custom retry logic.
- Added --timeout option to ia search for setting a custom timeout.
Bugfixes
- The scraping API has reverted to using the items key rather than the docs key. v1.7.3 will still work, but this change keeps ia consistent with the API.
Version 1.7.2
Features and Improvements
- Added support for adding custom headers to ia search.
Bugfixes
- internetarchive.utils.get_s3_xml_text() is used to parse errors returned by S3 in XML. Sometimes there is no XML in the response. Most of the time this is due to 5xx errors. Either way, we want to always return the HTTPError, even if the XML parsing fails.
- Fixed a regression where ":" was being stripped from filenames in upload.
- Do not create a directory in download() when return_responses is True.
- Fixed bug in upload where file-like objects were failing with a TypeError exception.
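The fallback described for get_s3_xml_text() can be illustrated with a minimal sketch. This is not the library's implementation: extract_s3_error_text is a hypothetical helper, and the sample bodies are made up. It assumes an S3-style error body with a top-level Message element and returns the raw body whenever XML parsing fails.

```python
import xml.etree.ElementTree as ET


def extract_s3_error_text(body):
    """Return the <Message> text from an S3-style XML error body.

    Hypothetical helper: falls back to the raw body when it is empty,
    is not well-formed XML, or has no <Message> element.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not XML (e.g. an HTML 5xx page): keep whatever the server sent.
        return body
    message = root.findtext("Message")
    return message if message is not None else body


# Made-up sample responses for illustration only.
xml_body = (
    "<Error><Code>SlowDown</Code>"
    "<Message>Please reduce your request rate.</Message></Error>"
)
html_body = "<html><body><h1>503 Service Unavailable</body></html>"

print(extract_s3_error_text(xml_body))
print(extract_s3_error_text(html_body))
```

The point of the design is that the caller always gets some error text to attach to the HTTPError, whether or not the response was XML.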
Version 1.7.1
Bugfixes
- Fixed bug in ia upload where all commands would fail if multiple collections were specified (e.g. -m collection:foo -m collection:bar).
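To show what handling a repeated key involves, here is a rough sketch of collecting repeated key:value arguments into a single mapping. parse_metadata_args is a hypothetical helper written for this note, not the parser ia uses; it assumes each argument is split on its first colon.

```python
from collections import defaultdict


def parse_metadata_args(pairs):
    """Collect repeated "key:value" strings into a dict.

    Hypothetical helper: a key given once maps to a string, a key given
    several times maps to a list of its values in the order supplied.
    """
    collected = defaultdict(list)
    for pair in pairs:
        # Split on the first colon only, so values may contain colons.
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {pair!r}")
        collected[key].append(value)
    return {k: v[0] if len(v) == 1 else v for k, v in collected.items()}


metadata = parse_metadata_args(["collection:foo", "collection:bar", "title:Test"])
print(metadata)
```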
Version 1.7.0
Features and Improvements
- Loosened up jsonpatch requirements, as the metadata API now supports more recent versions of the JSON Patch standard.
- Added support for building "snap" packages (https://snapcraft.io/).
Bugfixes
- Fixed bug in upload where users were unable to add their own timeout via request_kwargs.
- Fixed bug where files with non-ascii filenames failed to upload on some platforms.
- Fixed bug in upload where metadata keys with an index (e.g. subject[0]) would make the request fail if the key was the only indexed key provided.
- Added a default timeout to ArchiveSession.s3_is_overloaded(). If it times out now, it returns True (as in, yes, S3 is overloaded).
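The "a timeout counts as overloaded" behaviour can be sketched as a fail-safe wrapper. This is an illustration under assumptions, not the ArchiveSession code: is_overloaded and the check callable are hypothetical, and the 10 second default is an arbitrary value chosen for the example.

```python
def is_overloaded(check, timeout=10):
    """Return True when the service reports overload or fails to answer.

    `check` is a hypothetical callable that takes a timeout in seconds
    and returns a bool. Timeouts and other OS-level errors are treated
    as "overloaded" so callers back off instead of crashing.
    """
    try:
        return bool(check(timeout))
    except OSError:
        # TimeoutError and socket.timeout are both OSError subclasses.
        return True


def healthy(timeout):
    return False


def stalled(timeout):
    raise TimeoutError("no response")


print(is_overloaded(healthy), is_overloaded(stalled))
```

Treating a failed check as True errs on the side of waiting, which is the safer default before starting a large upload.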
Version 1.6.0
Features and Improvements
- Added 60 second timeout to all upload requests.
- Added support for uploading empty files.
- Refactored Item.get_files() to be faster, especially for items with many files.
- Updated search to use IA-S3 keys for auth instead of cookies.
Bugfixes
- Fixed bug in upload where derives weren't being queued in some cases where checksum=True was set.
- Fixed bug where ia tasks and other Catalog functions were always using HTTP even when it should have been HTTPS.
- ia metadata was exiting with a non-zero status for "no changes to xml" errors. This now exits with 0, as nearly every time this happens it should not be considered an "error".
- Added unicode support to ia upload --spreadsheet and ia metadata --spreadsheet using the backports.csv module.
- Fixed bug in ia upload --spreadsheet where some metadata was accidentally being copied from previous rows (e.g. when multiple subjects were used).
- Submitter wasn't being added to ia tasks --json output, it now is.
- row_type in ia tasks --json was returning integer for row-type rather than name (e.g. 'red').
Version 1.4.0
Features and Improvements
- Added ia copy and ia move for copying and moving files in archive.org items.
- Added support for outputting JSON in ia tasks.
- Added support to ia download to write to stdout instead of file.
Bugfixes
- Fixed bug in upload where AttributeError was raised when trying to upload file-like objects without a name attribute.
- Removed identifier validation from ia delete. If an identifier already exists, we don't need to validate it. This only makes things annoying if an identifier exists but fails internetarchive id validation.
- Fixed bug where error message isn't returned in ia upload if the response body is not XML. Ideally IA-S3 would always return XML, but that's not the case as of now. Try to dump the HTML in the S3 response if unable to parse XML.
- Fixed bug where ArchiveSession headers weren't being sent in prepared requests.
- Fixed bug in ia upload --size-hint where value was an integer, but requests requires it to be a string.
- Added support for downloading files to stdout in ia download and File.download.
Version 1.0.8
Features and Improvements
- Increased maximum identifier length from 80 to 100 characters in ia upload.
Bugfixes
- As of version 2.11.0 of the requests library, all header values must be strings (i.e. not integers). internetarchive now converts all header values to strings.
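A minimal sketch of this kind of conversion follows. stringify_headers is a hypothetical helper rather than the library's own code, and the header names are used only as examples.

```python
def stringify_headers(headers):
    """Return a copy of `headers` with every non-string value converted.

    Hypothetical helper: str and bytes values pass through unchanged,
    anything else (int, float, bool, ...) is passed through str().
    """
    return {
        name: value if isinstance(value, (str, bytes)) else str(value)
        for name, value in headers.items()
    }


# Example header names are illustrative.
raw = {"x-archive-size-hint": 1024, "x-archive-meta-title": "Test"}
print(stringify_headers(raw))
```

Returning a new dict leaves the caller's mapping untouched, which matters if the same headers are reused across requests.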
Version 1.0.7
Features and Improvements
- Added internetarchive.api.get_user_info().
Version 1.0.5
Features and Improvements
- All metadata writes are now submitted at -5 priority by default. This is friendlier to the archive.org catalog, and should only be changed for one-off metadata writes.
- Expanded scope of valid identifiers in utils.validate_ia_identifier (i.e. ia upload). Periods are now allowed. Periods, underscores, and dashes are not allowed as the first character.
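The identifier rules stated here can be approximated with a short check. This is a sketch, not utils.validate_ia_identifier itself: is_valid_identifier is a hypothetical name, the restriction to ASCII letters and digits for the remaining characters is an assumption, and the default max_length of 100 is the limit these notes mention for ia upload in version 1.0.8.

```python
import re

# First character must be alphanumeric; later characters may also be
# ".", "_" or "-". The ASCII alphanumeric restriction is an assumption.
_ALLOWED = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def is_valid_identifier(identifier, max_length=100):
    """Return True if `identifier` satisfies the sketched rules."""
    if not 0 < len(identifier) <= max_length:
        return False
    return _ALLOWED.fullmatch(identifier) is not None


for candidate in ("my.item_01", ".hidden", "-draft", "_tmp"):
    print(candidate, is_valid_identifier(candidate))
```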
Version 1.0.4
Features and Improvements
- Search now uses the v1 scraping API endpoint.
- Moved internetarchive.item.Item.upload.iter_directory() to internetarchive.utils.
- Added support for downloading "on-the-fly" files (e.g. EPUB, MOBI, and DAISY) via ia download <id> --on-the-fly or item.download(on_the_fly=True).
Bugfixes
- s3_is_overloaded() now returns True if the call is unsuccessful.
- Fixed bug in upload where a derive task wasn't being queued when a directory is uploaded.