
Create and document backup & restore & vacuum process #41

Closed
6 tasks done
novoj opened this issue Mar 3, 2023 · 3 comments · Fixed by #626
Labels: breaking change (Backward incompatible data model change), enhancement (New feature or request)

novoj (Collaborator) commented Mar 3, 2023

We need to implement the vacuuming process and the backup & restore procedure. We want all of them to be executable while the system is running and writing to the original files, without interruption. This requires special integration tests and manual testing. We should also partially rewrite the current storage logic, where we use "nice" names for entity collections and catalogs. We (probably) need to avoid situations that require renaming existing files: instead, we should use monotonically increasing indexes for the files and switch the reference to the currently used file only in the storage data structures. Finally, when the system detects that there is no living pointer to an old file, the file is permanently deleted.
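A minimal sketch of the monotonic-index approach follows; the class and method names (DataFileRegistry, createNextFile, switchActiveFile) and the file suffix are hypothetical and are not the actual evitaDB storage classes:

```java
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of the "monotonic index instead of renaming" idea;
// names are made up for this sketch.
final class DataFileRegistry {
    private final Path directory;
    private final String collectionName;
    private final AtomicLong nextIndex = new AtomicLong();
    // readers and writers resolve the currently used file through this reference only
    private final AtomicReference<Path> activeFile = new AtomicReference<>();
    // "living pointers" per file - once the count drops to zero the file may be deleted
    private final ConcurrentHashMap<Path, AtomicInteger> livingPointers = new ConcurrentHashMap<>();

    DataFileRegistry(Path directory, String collectionName) {
        this.directory = directory;
        this.collectionName = collectionName;
        this.activeFile.set(createNextFile());
    }

    /** Never renames an existing file - always allocates a new one with an incremented index. */
    Path createNextFile() {
        return directory.resolve(collectionName + "_" + nextIndex.getAndIncrement() + ".dat");
    }

    /** Atomically switches the reference to the freshly written file and returns the old one. */
    Path switchActiveFile(Path newFile) {
        return activeFile.getAndSet(newFile);
    }

    /** An old file may be physically removed only when no living pointer references it. */
    boolean canBeDeleted(Path oldFile) {
        final AtomicInteger pointers = livingPointers.get(oldFile);
        return pointers == null || pointers.get() == 0;
    }
}
```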

Vacuuming processes (a code sketch follows this list):

  • small: when the trash ratio exceeds the configured limit ⇨ write the memory snapshot to a new file with an incremented index
  • big: regularly delete old files when:
    • there is no active transaction working with the file (otherwise log an error)
    • their last modification timestamp is older than the history that is required to be kept
    • the catalog header file is compacted to track only those headers that relate to existing files
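
A minimal sketch of both strategies, assuming a hypothetical Vacuumer class and a simple live-bytes/total-bytes trash metric (not the actual evitaDB implementation):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the two vacuuming strategies; names and thresholds are illustrative.
final class Vacuumer {
    private final double trashRatioLimit;  // e.g. 0.5 = half of the file consists of dead records
    private final Duration historyToKeep;  // how much history must remain restorable

    Vacuumer(double trashRatioLimit, Duration historyToKeep) {
        this.trashRatioLimit = trashRatioLimit;
        this.historyToKeep = historyToKeep;
    }

    /** "Small" vacuum trigger: when exceeded, compact the memory snapshot into a brand new file. */
    boolean smallVacuumNeeded(long liveBytes, long totalBytes) {
        final double trashRatio = 1.0 - (double) liveBytes / totalBytes;
        return trashRatio > trashRatioLimit;
    }

    /** "Big" vacuum: drop old files that fall out of the kept history and are no longer used. */
    void bigVacuum(List<Path> oldFiles, Predicate<Path> usedByActiveTransaction) throws IOException {
        final Instant threshold = Instant.now().minus(historyToKeep);
        for (Path file : oldFiles) {
            if (usedByActiveTransaction.test(file)) {
                System.err.println("Cannot vacuum " + file + ": still used by an active transaction");
                continue;
            }
            if (Files.getLastModifiedTime(file).toInstant().isBefore(threshold)) {
                Files.deleteIfExists(file);
            }
        }
    }
}
```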

Backup process (a ZIP-writing sketch follows below):

  • small: creates a ZIP file with:
    1. the last record of the catalog header file only
    2. the current catalog content file
    3. the current entity content files
  • big: creates a ZIP file with:
    1. the entire catalog header file
    2. all catalog content files
    3. all entity content files

The backup ensures that no vacuuming process is running and prevents it from being executed for the particular catalog while the backup is in progress.
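
A minimal sketch, assuming a hypothetical BackupWriter helper; it only illustrates the ZIP streaming that both variants share:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: both backup flavours boil down to streaming a selected set of
// catalog files into a ZIP archive - only the file selection (small vs. big) differs.
final class BackupWriter {

    /** Streams the given files into a ZIP archive, keeping their original file names. */
    static void writeBackup(List<Path> filesToBackup, OutputStream target) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            for (Path file : filesToBackup) {
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);  // copy the file content into the current ZIP entry
                zip.closeEntry();
            }
        }
    }
}
```

Which files end up in filesToBackup is what distinguishes the small backup (last catalog header record plus current content files) from the big one (entire header file plus all content files).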

Restoring process (a header-lookup sketch follows this list):

  • simple - just unzip the entire catalog contents and load the catalog from them
  • surgical (can be used only on "big" backups) - looks into the ZIP file and finds the catalog header that matches the requested timestamp; if the catalog history is newer than the timestamp, an error is printed with the earliest timestamp that can be used. If the particular header is found, its contents are restored as a fresh snapshot without any waste. It accepts two arguments:
    • backup file name (mandatory)
    • timestamp the catalog should be restored to (optional - if not set, the current date and time is used)
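
A minimal sketch of the surgical header selection, assuming the headers extracted from the backup are keyed by the moment they were written (the SurgicalRestore class and its generic header type are hypothetical):

```java
import java.time.OffsetDateTime;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch: pick the newest catalog header that is not younger than the
// requested timestamp, or fail and report the earliest timestamp that can be used.
final class SurgicalRestore {

    static <H> H selectHeader(NavigableMap<OffsetDateTime, H> headersByTimestamp, OffsetDateTime requested) {
        if (headersByTimestamp.isEmpty()) {
            throw new IllegalStateException("The backup contains no catalog headers.");
        }
        final Map.Entry<OffsetDateTime, H> candidate = headersByTimestamp.floorEntry(requested);
        if (candidate == null) {
            // the catalog history is newer than the requested timestamp
            throw new IllegalArgumentException(
                "Catalog history starts at " + headersByTimestamp.firstKey() +
                ", which is newer than the requested timestamp " + requested +
                "; the earliest timestamp that can be used is " + headersByTimestamp.firstKey() + "."
            );
        }
        return candidate.getValue();
    }
}
```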
novoj self-assigned this Mar 3, 2023
novoj added the enhancement (New feature or request) label Mar 3, 2023
novoj added this to the Alpha milestone Mar 3, 2023
novoj added a commit that referenced this issue Mar 27, 2024
novoj modified the milestones (Alpha → Beta) Apr 26, 2024
novoj added a commit that referenced this issue May 4, 2024
Backup & restore done and working. Client implementation and an integration test are still missing.
novoj added a commit that referenced this issue May 6, 2024
novoj added a commit that referenced this issue May 28, 2024
novoj (Collaborator, Author) commented May 31, 2024

The backup API should have the following form (an interface sketch follows this list):

  1. api call "backup start" -> backup_id
  2. api call "backup status <backup_id>" -> queued, still running, or finished on path: xxx
  3. backup retrieval
    1. api call "backup download <backup_id>" -> backup archive (developers)
    2. direct backup from a volume path (on the same host cluster) that can be derived from a known location
  4. api call "backup release <backup_id>" -> deleted (or auto-removal after a certain period - configurable)
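
A minimal interface sketch of this flow; the names and types are illustrative and are not the actual evitaDB client contract:

```java
import java.io.InputStream;
import java.util.UUID;

// Hypothetical interface sketch of the asynchronous backup API outlined above.
interface BackupApi {

    enum BackupState { QUEUED, RUNNING, FINISHED }

    record BackupStatus(BackupState state, String pathOnVolume) {}

    /** 1. starts the backup asynchronously and returns its identifier */
    UUID backupStart(String catalogName);

    /** 2. polls the state: queued, still running, or finished with the target path */
    BackupStatus backupStatus(UUID backupId);

    /** 3a. streams the finished archive to the caller (developer convenience) */
    InputStream backupDownload(UUID backupId);

    /** 4. releases the server-side archive (or it is auto-removed after a configurable period) */
    void backupRelease(UUID backupId);
}
```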

novoj added a commit that referenced this issue Jun 2, 2024
novoj added the breaking change (Backward incompatible data model change) label Jun 4, 2024
novoj (Collaborator, Author) commented Jun 4, 2024

During implementation, the naming convention for data files changed. A full reindex will be needed with this version.

novoj added a commit that referenced this issue Jun 5, 2024
This will open up the path to time-traveling catalog restoration.
novoj added a commit that referenced this issue Jun 14, 2024
We decided to make the backup and restore operations asynchronous.
novoj added a commit that referenced this issue Jun 19, 2024
novoj added a commit that referenced this issue Jun 19, 2024
Also removed duplicate max queue configuration for transactions.
novoj added commits that referenced this issue Jun 20, 2024
Also corrected the backup logic so that it takes heavy indexing into account.
novoj added commits that referenced this issue Jun 21–23, 2024
Automatic re-scheduling and transaction pipeline recreation.
novoj added a commit that referenced this issue Jul 9, 2024
novoj added a commit that referenced this issue Jul 9, 2024
novoj linked a pull request Jul 9, 2024 that will close this issue
novoj added a commit that referenced this issue Jul 9, 2024
…-restore--vacuum-process

feat(#41): create and document backup & restore & vacuum process
novoj (Collaborator, Author) commented Jul 9, 2024

Finally implemented. Documentation will be added when the functionality is available through the REST / GraphQL / Lab interfaces; this will be done in follow-up issue #627.

novoj closed this as completed Jul 9, 2024