Bulk-download #16

Closed
ccremer opened this issue Jan 16, 2023 · 0 comments · Fixed by #17

ccremer (Owner) commented Jan 16, 2023

Summary

As a user
I want to download all documents
So that I can create a local offline copy of the documents for emergency access

Context

If Paperless is down, all documents become unavailable as well.
With a local offline copy, one can at least still access the documents.

For this, we need to download all documents and optionally unzip them.
Ideally, this step is automated with cron or a systemd timer and runs regularly.

Out of Scope

  • Backup of the Paperless instance

Further links

Acceptance criteria

Given a running Paperless instance with REST API access
When the CLI is invoked with subcommand `bulk-download LOCAL-FILENAME`
Then all documents are downloaded in `archive` format into `LOCAL-FILENAME.zip`

Given a running Paperless instance with REST API access
When the CLI is invoked with subcommand `bulk-download --unzip LOCAL-FILENAME`
Then all documents are downloaded and extracted into the `LOCAL-FILENAME` directory

Given a running Paperless instance with REST API access
And a file or directory `LOCAL-FILENAME` exists locally
When the CLI is invoked with subcommand `bulk-download LOCAL-FILENAME`
Then the download is aborted with an error message

Given a running Paperless instance with REST API access
And a file or directory `LOCAL-FILENAME` exists locally
When the CLI is invoked with subcommand `bulk-download --overwrite LOCAL-FILENAME`
Then any previously existing file or directory called `LOCAL-FILENAME[.zip]` is removed
And all documents are downloaded into `LOCAL-FILENAME[.zip]` (directory or zip file) without error
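
The existence and `--overwrite` checks from the criteria above could be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the function `prepare_target`, the exception `TargetExistsError`, and the rule that the target is `LOCAL-FILENAME.zip` without `--unzip` and the `LOCAL-FILENAME` directory with `--unzip` are assumptions made for this sketch.

```python
import shutil
from pathlib import Path


class TargetExistsError(Exception):
    """Hypothetical error: the target exists and --overwrite was not given."""


def prepare_target(local_filename: str, unzip: bool = False, overwrite: bool = False) -> Path:
    """Return the path to download into, removing an old copy if allowed.

    Assumption: without --unzip the target is LOCAL-FILENAME.zip,
    with --unzip it is the directory LOCAL-FILENAME.
    """
    base = Path(local_filename)
    target = base if unzip else base.with_name(base.name + ".zip")

    if target.exists() or target.is_symlink():
        if not overwrite:
            # Abort with an error message instead of clobbering the old copy.
            raise TargetExistsError(
                f"{target} already exists; pass --overwrite to replace it"
            )
        if target.is_dir() and not target.is_symlink():
            shutil.rmtree(target)  # previously extracted directory
        else:
            target.unlink()  # previous zip file (or a symlink)
    return target
```

Calling `prepare_target("paperless-docs")` would return `paperless-docs.zip`, and would raise `TargetExistsError` if that file is already present and `overwrite` is not set.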

Implementation Ideas

  • The CLI should look something like `paperless-cli bulk-download [--archive|--original] [--overwrite] [--unzip] LOCAL-FILENAME`
  • Start by bulk-downloading everything on every invocation
  • Later on, we might download only the diff. For example, we can list all documents and save that list locally. On the next invocation, we fetch the list again and compare it with the locally saved list, so that only the diff needs to be downloaded again. This might be worth its own feature request.
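
To make the diff idea from the last bullet a bit more concrete, here is a minimal sketch in Python. It assumes that each document can be identified by an integer ID and carries some "last modified" marker, and that the saved list is a small JSON file; the function names and the file layout are hypothetical and not part of any existing code.

```python
import json
from pathlib import Path


def load_saved_list(path: Path) -> dict[int, str]:
    """Load the locally saved {document id: modified marker} list, if any."""
    if not path.exists():
        return {}
    raw = json.loads(path.read_text(encoding="utf-8"))
    # JSON object keys are always strings, so convert them back to ints.
    return {int(doc_id): modified for doc_id, modified in raw.items()}


def save_list(path: Path, documents: dict[int, str]) -> None:
    """Persist the current list for comparison on the next invocation."""
    path.write_text(json.dumps(documents, indent=2, sort_keys=True), encoding="utf-8")


def diff_documents(remote: dict[int, str], saved: dict[int, str]) -> tuple[list[int], list[int]]:
    """Compare the freshly fetched list with the saved one.

    Returns (ids to download, ids to remove locally):
    new or changed documents are downloaded, vanished ones are removed.
    """
    to_download = sorted(i for i, modified in remote.items() if saved.get(i) != modified)
    to_remove = sorted(i for i in saved if i not in remote)
    return to_download, to_remove
```

With a fetched list of `{1: "a", 2: "b", 3: "c"}` and a saved list of `{1: "a", 2: "x", 4: "d"}`, documents 2 and 3 would be downloaded again and document 4 would be removed from the local copy.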