[PR] extend the backup mechanism during updates #1

Closed
qcasey opened this issue Feb 28, 2022 · 3 comments · Fixed by #42

Comments

@qcasey
Member

qcasey commented Feb 28, 2022

This is an unmerged PR involving Ansible. If this repo is maintained, it should be considered.

See jonaswinkler/paperless-ng#1058 and paperless-ngx/paperless-ngx#16

stumpylog pushed a commit that referenced this issue Aug 20, 2022
Changes:

* Working molecule pipeline with GitHub Actions
    * Builds on Ubuntu 20.04 and 22.04
* Added ansible-lint exceptions
* Added a paperless temp directory for git clone using the tempfile ansible module
* Added a task with the ansible find module to replace the command that runs already. WIP

Fixes:

* Corrected the path for paperless directory
* Added correct defaults for URL/Listen address to run paperless in docker/molecule
* Added a separate task to install jbig2enc from source due to issues finding apt repo that contained a working package
* Added jmespath package as part of the base install
* Added become_user/become back in to fix the issues with dir/file owner/groups on install
* Added a task to install the latest version of pikepdf even when installing a version earlier than v1.8.0

@stevenengland
Collaborator

stevenengland commented Mar 8, 2023

Hi contributors @qcasey @SiM22 @stumpylog,

this role was completely remade in the past month, with breaking changes introduced last week.
For more background information, have a look at:

Input for this issue:

I would like to reframe the issue a little bit. First of all, I would like to split it into two parts.

  1. Backup of existing paperless installations (scope of this issue)
  2. Recovery from existing paperless backups (not the scope of this issue but maybe of another new one if needed)

For both portions:

I am not yet sure how deeply we should integrate with the ops flow. I think it is a separation-of-concerns question. Backup and restore should always be part of rolling out newer application versions, but personally I think they are outside of this role's concern. My personal flow is described below. As you might expect, it differs a little from system to system, and I am not sure whether we should abstract that flow and offer a general backup/restore mechanism or leave this part up to the user operating paperless. Let's see what you think :)

For the backup portion:

My personal flow looks like this:

  • run the sanitizer module to check for inconsistencies
  • create a Proxmox LXC backup of the complete machine state
  • run the document_exporter module to have a backup of the documents and metadata
  • create a backup of the /media directory that is sitting on a NAS.
  • upgrade the app
  • run the sanitizer module to check for inconsistencies

I could imagine abstracting this flow, which leads to the following flow that we could add to the role.

Possible role flow
Since the remade role clearly distinguishes the directories in terms of

  • Where is the paperless app itself installed (default in /opt/)
  • Where is the paperless config data stored (default in /etc/ for paperless.conf and gunicorn in paperless.d)
  • Where is the paperless app runtime data stored (default in /var/lib, e.g. logs, ...)
  • Where is the user data stored (default in /var/lib, e.g. media, consumption, ...)

and does not allow putting the user data into a subdir of the paperless app dir (as this always leads to trouble sooner or later), I would recommend a backup workflow like this:

  1. introduce a new var for a backup location that points to somewhere/backup_dir
  2. Stop the services
  3. (optional) run the sanitizer
  4. run the document_exporter module and place the backup in /backup_dir/user_data/document_exporter/yyyymmdd_export.zip
  5. media dir: copy all files from the media dir as a zipped archive to /backup_dir/user_data/media/yyyymmdd_media.zip
  6. paperless app dir: copy all files as a zipped archive to /backup_dir/app/yyyymmdd_app.gzip
  7. paperless config: copy all files as a zipped archive to /backup_dir/config/yyyymmdd_config.gzip
  8. don't backup runtime data
  9. do the upgrade including restart of the services
  10. (optional) run the sanitizer
  11. (optional) clean files (backup objects) that are older than x days
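
To make the proposal more concrete, here is a minimal Python sketch of the archiving and clean-up steps only (steps 5 to 7 and 11). It is not part of the role and not an Ansible implementation. The function names `archive_dir` and `prune_old_backups` and all parameters are hypothetical. It also uses `.zip` for every archive, because `shutil.make_archive` picks the extension from the format, which deviates from the `.gzip` names proposed above. Stopping services, the sanitizer, and the document_exporter run are deliberately left out.

```python
import datetime as dt
import shutil
import time
from pathlib import Path
from typing import List, Optional


def archive_dir(source: Path, backup_root: Path, category: str, label: str,
                today: dt.date, fmt: str = "zip") -> Path:
    """Archive `source` to <backup_root>/<category>/<yyyymmdd>_<label>.<ext>.

    `category` and `label` are hypothetical names that mirror the proposed
    layout, e.g. category="user_data/media", label="media".
    """
    target_dir = backup_root / category
    target_dir.mkdir(parents=True, exist_ok=True)
    base = target_dir / f"{today:%Y%m%d}_{label}"
    # make_archive appends the extension itself and returns the final path.
    return Path(shutil.make_archive(str(base), fmt, root_dir=str(source)))


def prune_old_backups(backup_root: Path, max_age_days: int,
                      now: Optional[float] = None) -> List[Path]:
    """Delete backup files older than `max_age_days`; return the removed paths.

    Age is judged by file modification time, which is an assumption; parsing
    the yyyymmdd prefix of the file name would be an alternative.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in sorted(backup_root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A call such as `archive_dir(Path("/var/lib/paperless/media"), Path("/backup_dir"), "user_data/media", "media", dt.date.today())` would cover step 5, where the media path is only an example. `prune_old_backups(Path("/backup_dir"), 30)` would cover step 11 with x = 30.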

What do you think?

@muued

muued commented Mar 9, 2023

I created the original CR because the old behaviour seemed imperfect: the backup was non-optional, and in a default setup all media files lived within the directory that was backed up.

Personally, I don't think this role should create or restore a backup at all (for me, the main feature of the PR was the ability to turn it off).
There are already great solutions for backup/restore functionality out there. We should let the users decide which of them they want to use.

Performing a backup is also not Ansible-specific. The docker / docker-compose setup also leaves backups to the user.

@stevenengland
Collaborator

I totally agree with @muued. Therefore I opened #42 to remove the incomplete existing backup feature altogether. If no other voices with good reasons for a backup mechanism appear, I will merge it in the next few days ;)
