
Create and document backup & restore & vacuum process #41

Closed
6 tasks done
novoj opened this issue Mar 3, 2023 · 3 comments · Fixed by #626
Labels: breaking change (Backward incompatible data model change), enhancement (New feature or request)

novoj (Collaborator) commented Mar 3, 2023

We need to implement the vacuuming process and the backup & restore procedure. We want all of them to be executable while the system is running and writing to the original files, without interruption. This requires special integration tests and manual testing. We should also partially rewrite the current storage logic, where we use "nice" names for entity collections and catalogs. We (probably) need to avoid situations that require renaming existing files: instead, we should use monotonically increasing indexes for the files and switch the reference to the currently used file only in the storage data structures. Finally, when the system detects that there is no living pointer to an old file, the file is permanently deleted.
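A minimal sketch of the monotonic-index approach follows; the class and method names (DataFileRegistry, createNextFile, switchActiveFile) and the file suffix are hypothetical and are not the actual evitaDB storage classes:

```java
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of the "monotonic index instead of renaming" idea;
// names are made up for this sketch.
final class DataFileRegistry {
    private final Path directory;
    private final String collectionName;
    private final AtomicLong nextIndex = new AtomicLong();
    // readers and writers resolve the currently used file through this reference only
    private final AtomicReference<Path> activeFile = new AtomicReference<>();
    // "living pointers" per file - once the count drops to zero the file may be deleted
    private final ConcurrentHashMap<Path, AtomicInteger> livingPointers = new ConcurrentHashMap<>();

    DataFileRegistry(Path directory, String collectionName) {
        this.directory = directory;
        this.collectionName = collectionName;
        this.activeFile.set(createNextFile());
    }

    /** Never renames an existing file - always allocates a new one with an incremented index. */
    Path createNextFile() {
        return directory.resolve(collectionName + "_" + nextIndex.getAndIncrement() + ".dat");
    }

    /** Atomically switches the reference to the freshly written file and returns the old one. */
    Path switchActiveFile(Path newFile) {
        return activeFile.getAndSet(newFile);
    }

    /** An old file may be physically removed only when no living pointer references it. */
    boolean canBeDeleted(Path oldFile) {
        final AtomicInteger pointers = livingPointers.get(oldFile);
        return pointers == null || pointers.get() == 0;
    }
}
```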

Vacuuming processes (a code sketch follows this list):

  • small: when the trash ratio exceeds the configured limit ⇨ write the memory snapshot to a new file with an incremented index
  • big: regularly delete old files when:
    • there is no active transaction working with the file (otherwise log an error)
    • their last modification timestamp is older than the history that is required to be kept
    • the catalog header file is compacted to track only those headers that relate to existing files
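
A minimal sketch of both strategies, assuming a hypothetical Vacuumer class and a simple live-bytes/total-bytes trash metric (not the actual evitaDB implementation):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the two vacuuming strategies; names and thresholds are illustrative.
final class Vacuumer {
    private final double trashRatioLimit;  // e.g. 0.5 = half of the file consists of dead records
    private final Duration historyToKeep;  // how much history must remain restorable

    Vacuumer(double trashRatioLimit, Duration historyToKeep) {
        this.trashRatioLimit = trashRatioLimit;
        this.historyToKeep = historyToKeep;
    }

    /** "Small" vacuum trigger: when exceeded, compact the memory snapshot into a brand new file. */
    boolean smallVacuumNeeded(long liveBytes, long totalBytes) {
        final double trashRatio = 1.0 - (double) liveBytes / totalBytes;
        return trashRatio > trashRatioLimit;
    }

    /** "Big" vacuum: drop old files that fall out of the kept history and are no longer used. */
    void bigVacuum(List<Path> oldFiles, Predicate<Path> usedByActiveTransaction) throws IOException {
        final Instant threshold = Instant.now().minus(historyToKeep);
        for (Path file : oldFiles) {
            if (usedByActiveTransaction.test(file)) {
                System.err.println("Cannot vacuum " + file + ": still used by an active transaction");
                continue;
            }
            if (Files.getLastModifiedTime(file).toInstant().isBefore(threshold)) {
                Files.deleteIfExists(file);
            }
        }
    }
}
```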

Backup process (a ZIP-writing sketch follows below):

  • small: creates a ZIP file with:
    1. the last record of the catalog header file only
    2. the current catalog content file
    3. the current entity content files
  • big: creates a ZIP file with:
    1. the entire catalog header file
    2. all catalog content files
    3. all entity content files

The backup ensures that no vacuuming process is running and prevents it from being executed for the particular catalog while the backup is in progress.
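
A minimal sketch, assuming a hypothetical BackupWriter helper; it only illustrates the ZIP streaming that both variants share:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: both backup flavours boil down to streaming a selected set of
// catalog files into a ZIP archive - only the file selection (small vs. big) differs.
final class BackupWriter {

    /** Streams the given files into a ZIP archive, keeping their original file names. */
    static void writeBackup(List<Path> filesToBackup, OutputStream target) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            for (Path file : filesToBackup) {
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);  // copy the file content into the current ZIP entry
                zip.closeEntry();
            }
        }
    }
}
```

Which files end up in filesToBackup is what distinguishes the small backup (last catalog header record plus current content files) from the big one (entire header file plus all content files).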

Restoring process (a header-lookup sketch follows this list):

  • simple - just unzip the entire catalog contents and load the catalog from them
  • surgical (can be used only on "big" backups) - looks into the ZIP file and finds the catalog header that matches the requested timestamp; if the catalog history is newer than the timestamp, an error is printed with the earliest timestamp that can be used. If the particular header is found, its contents are restored as a fresh snapshot without any waste. It accepts two arguments:
    • backup file name (mandatory)
    • timestamp the catalog should be restored to (optional - if not set, the current date and time is used)
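
A minimal sketch of the surgical header selection, assuming the headers extracted from the backup are keyed by the moment they were written (the SurgicalRestore class and its generic header type are hypothetical):

```java
import java.time.OffsetDateTime;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch: pick the newest catalog header that is not younger than the
// requested timestamp, or fail and report the earliest timestamp that can be used.
final class SurgicalRestore {

    static <H> H selectHeader(NavigableMap<OffsetDateTime, H> headersByTimestamp, OffsetDateTime requested) {
        if (headersByTimestamp.isEmpty()) {
            throw new IllegalStateException("The backup contains no catalog headers.");
        }
        final Map.Entry<OffsetDateTime, H> candidate = headersByTimestamp.floorEntry(requested);
        if (candidate == null) {
            // the catalog history is newer than the requested timestamp
            throw new IllegalArgumentException(
                "Catalog history starts at " + headersByTimestamp.firstKey() +
                ", which is newer than the requested timestamp " + requested +
                "; the earliest timestamp that can be used is " + headersByTimestamp.firstKey() + "."
            );
        }
        return candidate.getValue();
    }
}
```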
novoj self-assigned this Mar 3, 2023
novoj added the enhancement (New feature or request) label Mar 3, 2023
novoj added this to the Alpha milestone Mar 3, 2023
novoj added a commit that referenced this issue Mar 27, 2024
novoj modified the milestones (Alpha → Beta) Apr 26, 2024
novoj added a commit that referenced this issue May 4, 2024
Backup & restore done and working. Client implementation and an integration test are still missing.
novoj added a commit that referenced this issue May 6, 2024
novoj added a commit that referenced this issue May 28, 2024
novoj (Collaborator, Author) commented May 31, 2024

The backup API should have the following form (an interface sketch follows this list):

  1. api call "backup start" -> backup_id
  2. api call "backup status <backup_id>" -> queued, still running, or finished on path: xxx
  3. backup retrieval
    1. api call "backup download <backup_id>" -> backup archive (developers)
    2. direct backup from a volume path (on the same host cluster) that can be derived from a known location
  4. api call "backup release <backup_id>" -> deleted (or auto-removal after a certain period - configurable)
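
A minimal interface sketch of this flow; the names and types are illustrative and are not the actual evitaDB client contract:

```java
import java.io.InputStream;
import java.util.UUID;

// Hypothetical interface sketch of the asynchronous backup API outlined above.
interface BackupApi {

    enum BackupState { QUEUED, RUNNING, FINISHED }

    record BackupStatus(BackupState state, String pathOnVolume) {}

    /** 1. starts the backup asynchronously and returns its identifier */
    UUID backupStart(String catalogName);

    /** 2. polls the state: queued, still running, or finished with the target path */
    BackupStatus backupStatus(UUID backupId);

    /** 3a. streams the finished archive to the caller (developer convenience) */
    InputStream backupDownload(UUID backupId);

    /** 4. releases the server-side archive (or it is auto-removed after a configurable period) */
    void backupRelease(UUID backupId);
}
```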

novoj added a commit that referenced this issue Jun 2, 2024
novoj added the breaking change (Backward incompatible data model change) label Jun 4, 2024
novoj (Collaborator, Author) commented Jun 4, 2024

During implementation, the naming convention for data files changed. A full reindex will be needed with this version.

novoj added a commit that referenced this issue Jun 5, 2024
This will open up the path to time-traveling catalog restoration.
novoj added a commit that referenced this issue Jun 14, 2024
We decided to make the backup and restore operations asynchronous.
novoj added a commit that referenced this issue Jun 19, 2024
novoj added a commit that referenced this issue Jun 19, 2024
Also removed duplicate max queue configuration for transactions.
novoj added commits that referenced this issue Jun 20, 2024
Also corrected the backup logic so that it takes heavy indexing into account.
novoj added commits that referenced this issue Jun 21–23, 2024
Automatic re-scheduling and transaction pipeline recreation.
novoj added a commit that referenced this issue Jul 9, 2024
novoj added a commit that referenced this issue Jul 9, 2024
novoj linked a pull request Jul 9, 2024 that will close this issue
novoj added a commit that referenced this issue Jul 9, 2024
…-restore--vacuum-process

feat(#41): create and document backup & restore & vacuum process
novoj (Collaborator, Author) commented Jul 9, 2024

Finally implemented. Documentation will be added when the functionality is available through the REST / GraphQL / Lab interfaces; this will be done in follow-up issue #627.

novoj closed this as completed Jul 9, 2024