Backup diagrams (canonical#123)
* Add On S3 Credentials Changed Hook diagram

* Add On Restore Hook diagram

* Add On List Backups Hook diagram

* Add On Create Backup Hook diagram

* Add note about restore status

* Add note about TLS settings change
marceloneppel committed Mar 31, 2023
1 parent 3a28787 commit efec44d
Showing 1 changed file with 117 additions and 0 deletions.
117 changes: 117 additions & 0 deletions docs/reference/backups.md
@@ -0,0 +1,117 @@
# Backups.py Reference Documentation

This file contains functions related to backup management, including its major hooks. It can be found at [src/backups.py](../../../src/backups.py).

## Hook Handler Flowcharts

These flowcharts detail the control flow of the hooks in this program. Unless otherwise stated, **a hook deferral is always followed by a return**.
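
The defer-then-return convention can be illustrated with a minimal sketch. `FakeEvent` and `on_hook` are hypothetical names used only for this example; they stand in for a real hook event and handler and are not taken from `src/backups.py`.

```python
class FakeEvent:
    """Hypothetical stand-in for a hook event that can be deferred."""

    def __init__(self):
        self.deferred = False

    def defer(self):
        # Record that the event should be re-emitted later.
        self.deferred = True


def on_hook(event, cluster_initialised):
    """Return the steps performed; a deferral ends the handler immediately."""
    steps = []
    if not cluster_initialised:
        event.defer()
        steps.append("defer")
        return steps  # nothing after a deferral runs
    steps.append("handle")
    return steps
```

Returning right after `defer()` keeps the rest of the handler from running against a cluster that is not ready yet.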

### On S3 Credentials Changed Hook

```mermaid
flowchart TD
hook_fired([s3-credentials-changed Hook]) --> has_cluster_initialised{Has cluster\n initialised?}
has_cluster_initialised -- no --> defer>defer]
defer --> rtn([return])
has_cluster_initialised -- yes --> are_all_required_settings_provided{Are all required\nS3 settings provided?}
are_all_required_settings_provided -- no --> rtn
are_all_required_settings_provided -- yes --> render_pgbackrest_config[Update backup settings]
render_pgbackrest_config --> is_leader{Is current\nunit leader?}
is_leader -- no --> rtn
is_leader -- yes --> is_blocked{Is unit in\nblocked state?}
is_blocked -- yes --> rtn
is_blocked -- no --> could_initialise_stanza{Could initialise\npgBackRest Stanza?}
could_initialise_stanza -- no --> set_blocked[Set Blocked\nstatus]
set_blocked --> rtn
could_initialise_stanza -- yes --> is_wal_archiving_to_s3_working{Is WAL archiving\nto S3 working?}
is_wal_archiving_to_s3_working -- no --> set_blocked
is_wal_archiving_to_s3_working -- yes --> is_tls_disabled_or_single_unit_cluster{Is TLS disabled or\nsingle unit cluster?}
is_tls_disabled_or_single_unit_cluster -- yes --> stop_pgbackrest_tls_server[Stop pgBackRest\nTLS server]
is_tls_disabled_or_single_unit_cluster -- no --> is_replica_and_tls_server_not_running_on_primary{Is current\nunit a replica\nand TLS server isn't\nrunning on primary?}
is_replica_and_tls_server_not_running_on_primary -- yes --> rtn
stop_pgbackrest_tls_server --> rtn
is_replica_and_tls_server_not_running_on_primary -- no --> start_pgbackrest_tls_server[Start pgBackRest\nTLS server]
start_pgbackrest_tls_server --> rtn
```

When certificates are received from the TLS certificates operator through the `certificates` relation (or when that relation is
removed), the steps starting from `Is TLS disabled or single unit cluster` are also executed.
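
The branching in this chart can be read as a pure decision function. The sketch below is an illustration only: `UnitState`, its fields, and both function names are assumptions made for the example, not the charm's actual API. It returns the list of actions the chart would perform for a given state, and it factors the TLS tail into its own function because that part is also reached from the `certificates` relation events.

```python
from dataclasses import dataclass


@dataclass
class UnitState:
    """Hypothetical snapshot of the facts the flowchart branches on."""

    cluster_initialised: bool = True
    s3_settings_complete: bool = True
    is_leader: bool = True
    is_blocked: bool = False
    stanza_initialised_ok: bool = True
    wal_archiving_ok: bool = True
    tls_enabled: bool = True
    unit_count: int = 3
    is_replica: bool = False
    primary_tls_server_running: bool = True


def tls_server_step(s):
    """Tail of the chart, starting at the TLS / single-unit decision."""
    if not s.tls_enabled or s.unit_count == 1:
        return ["stop pgBackRest TLS server"]
    if s.is_replica and not s.primary_tls_server_running:
        return []
    return ["start pgBackRest TLS server"]


def on_s3_credentials_changed(s):
    """Return the actions taken for one pass through the chart."""
    if not s.cluster_initialised:
        return ["defer"]
    if not s.s3_settings_complete:
        return []
    actions = ["update backup settings"]
    if not s.is_leader or s.is_blocked:
        return actions
    if not s.stanza_initialised_ok or not s.wal_archiving_ok:
        return actions + ["set blocked status"]
    return actions + tls_server_step(s)
```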

### On Create Backup Hook

```mermaid
flowchart TD
hook_fired([create-backup Hook]) --> is_blocked{Is unit in\nblocked state?}
is_blocked -- yes --> fail_action([fail action])
is_blocked -- no --> is_primary_tls_enabled_multiple_unit_cluster{Is primary in\na TLS-enabled\nmulti-unit cluster?}
is_primary_tls_enabled_multiple_unit_cluster -- yes --> fail_action
is_primary_tls_enabled_multiple_unit_cluster -- no --> is_replica_tls_disabled{Is replica\nwith TLS disabled?}
is_replica_tls_disabled -- yes --> fail_action
is_replica_tls_disabled -- no --> has_stanza_been_initialised{Has stanza been initialised?}
has_stanza_been_initialised -- no --> fail_action
has_stanza_been_initialised -- yes --> is_s3_relation_established{Is S3 relation\nestablished?}
is_s3_relation_established -- no --> fail_action
is_s3_relation_established -- yes --> has_missing_s3_parameters{Has missing S3 parameters?}
has_missing_s3_parameters -- yes --> fail_action
has_missing_s3_parameters -- no --> was_possible_upload_metadata_file_test_connectivity_s3{Was it possible\nto upload metadata file\nto test connectivity to S3?}
was_possible_upload_metadata_file_test_connectivity_s3 -- no --> fail_action
was_possible_upload_metadata_file_test_connectivity_s3 -- yes --> is_replica{Is current\nunit a replica?}
is_replica -- no --> set_maintenance[Set Maintenance Status]
is_replica -- yes --> block_new_connections[Block new\nconnections to this\nunit's database]
block_new_connections --> set_maintenance
set_maintenance --> has_backup_creation_succeeded{Has backup creation succeeded?}
has_backup_creation_succeeded -- no --> upload_error_logs_s3[Upload error\nlogs to S3]
upload_error_logs_s3 --> fail_action2([fail action])
fail_action2 --> is_replica2{Is current\nunit a replica?}
is_replica2 -- no --> set_active[Set Active Status]
is_replica2 -- yes --> allow_new_connections[Allow new\nconnections to this\nunit's database]
has_backup_creation_succeeded -- yes --> were_backup_logs_uploaded_s3{Were backup logs\nuploaded to S3?}
were_backup_logs_uploaded_s3 -- no --> fail_action2
were_backup_logs_uploaded_s3 -- yes --> finish_action[backup created]
finish_action --> is_replica2
```
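
The pre-checks at the top of this chart run in a fixed order, and the first failing one fails the action. A minimal sketch of that ordering follows; the function name, the dictionary keys, and the failure messages are all hypothetical and chosen only for illustration.

```python
def can_create_backup(unit):
    """Return (ok, reason), applying the chart's pre-checks in order.

    `unit` is a hypothetical dict of facts about the current unit.
    """
    checks = [
        (unit["is_blocked"], "unit is in a blocked state"),
        (
            unit["is_primary"] and unit["tls_enabled"] and unit["unit_count"] > 1,
            "primary in a TLS-enabled multi-unit cluster",
        ),
        (
            not unit["is_primary"] and not unit["tls_enabled"],
            "replica with TLS disabled",
        ),
        (not unit["stanza_initialised"], "stanza has not been initialised"),
        (not unit["s3_relation_established"], "S3 relation is not established"),
        (bool(unit["missing_s3_parameters"]), "missing S3 parameters"),
        (not unit["s3_connectivity_ok"], "could not upload metadata file to S3"),
    ]
    for failed, reason in checks:
        if failed:
            return False, reason  # first failing check fails the action
    return True, None
```

Only when every check passes does the flow continue to the replica decision and the backup itself.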

### On List Backups Hook

```mermaid
flowchart TD
hook_fired([list-backups Hook]) --> is_s3_relation_established{Is S3 relation\nestablished?}
is_s3_relation_established -- no --> fail_action([fail action])
is_s3_relation_established -- yes --> has_missing_s3_parameters{Has missing S3 parameters?}
has_missing_s3_parameters -- yes --> fail_action
has_missing_s3_parameters -- no --> does_pgbackrest_returned_backups_list{Did pgBackRest\nreturn the\nbackups list?}
does_pgbackrest_returned_backups_list -- no --> fail_action
does_pgbackrest_returned_backups_list -- yes --> return_formatted_backup_list[Return formatted\nbackup list]
```
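
As a rough sketch of the chart above, the handler can be written as three guard clauses followed by formatting. Everything here is an assumption for illustration: the function name, the `fetch_backups` callable (standing in for whatever queries pgBackRest), the use of `RuntimeError` to signal a failed query, and the `id | status` output format.

```python
def list_backups(s3_relation_established, missing_s3_parameters, fetch_backups):
    """Return ("fail", message) or ("ok", formatted list).

    `fetch_backups` is a hypothetical callable returning (backup_id, status)
    pairs and raising RuntimeError when the list cannot be retrieved.
    """
    if not s3_relation_established:
        return "fail", "S3 relation is not established"
    if missing_s3_parameters:
        return "fail", "missing S3 parameters: " + ", ".join(missing_s3_parameters)
    try:
        backups = fetch_backups()
    except RuntimeError as error:
        return "fail", f"could not list backups: {error}"
    # Hypothetical formatting: one "id | status" line per backup.
    lines = [f"{backup_id} | {status}" for backup_id, status in backups]
    return "ok", "\n".join(lines)
```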

### On Restore Hook

```mermaid
flowchart TD
hook_fired([restore Hook]) --> has_user_provided_backup_id{Has user provided\na backup id?}
has_user_provided_backup_id -- no --> fail_action([fail action])
has_user_provided_backup_id -- yes --> is_workload_container_accessible{Is workload\ncontainer accessible?}
is_workload_container_accessible -- no --> fail_action
is_workload_container_accessible -- yes --> is_blocked{Is unit in\nblocked state?}
is_blocked -- yes --> fail_action
is_blocked -- no --> is_single_unit_cluster{Is single\nunit cluster?}
is_single_unit_cluster -- no --> fail_action
is_single_unit_cluster -- yes --> is_leader{Is current\nunit leader?}
is_leader -- no --> fail_action
is_leader -- yes --> is_backup_id_valid{Is backup id\nvalid?}
is_backup_id_valid -- no --> fail_action
is_backup_id_valid -- yes --> set_maintenance[Set Maintenance Status]
set_maintenance --> has_database_stopped{Has database\nstopped?}
has_database_stopped -- no --> fail_action
has_database_stopped -- yes --> was_previous_cluster_info_removed{Was previous cluster\ninfo removed?}
was_previous_cluster_info_removed -- no --> start_database_again[Start database again]
start_database_again --> fail_action
was_previous_cluster_info_removed -- yes --> was_data_directory_emptied{Was the data\ndirectory emptied?}
was_data_directory_emptied -- no --> start_database_again
was_data_directory_emptied -- yes --> configure_restore[Configure Patroni to restore the backup]
configure_restore --> start_database[Start the database]
start_database --> finish_action[restore started]
```

The unit status becomes `Active` after a successful restore or `Blocked` after a failed one;
the outcome is detected in the update status hook.
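
The restore chart has two phases: preconditions that fail the action without side effects, and fallible steps where a failure after the database has stopped triggers "start database again". The sketch below models that under stated assumptions; `PRECONDITIONS`, `run_restore`, and the key names are hypothetical, and step outcomes are passed in as booleans rather than performed.

```python
PRECONDITIONS = (
    "container_accessible",
    "unit_not_blocked",
    "single_unit_cluster",
    "is_leader",
    "backup_id_valid",
)


def run_restore(backup_id, facts, step_results):
    """Return (outcome, log) for one pass through the restore chart.

    `facts` maps each precondition name to a bool; `step_results` maps each
    fallible step to whether it succeeded. Both are hypothetical inputs.
    """
    log = []
    if not backup_id or not all(facts[name] for name in PRECONDITIONS):
        return "fail action", log
    log.append("set maintenance status")
    if not step_results["stop_database"]:
        return "fail action", log
    for step in ("remove_previous_cluster_info", "empty_data_directory"):
        if not step_results[step]:
            # The database was stopped, so bring it back before failing.
            log.append("start database again")
            return "fail action", log
    log += ["configure restore", "start database"]
    return "restore started", log
```

Note that the successful outcome is only "restore started": the final `Active` or `Blocked` status is set later, as described above.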
