Commit 4983600

work on database internals section
1 parent 22973fe commit 4983600

1 file changed

+19
-15
lines changed

modules/ROOT/pages/database-internals/checkpointing.adoc

Lines changed: 19 additions & 15 deletions
@@ -1,21 +1,24 @@
11
[[checkpointing-log-pruning]]
22
= Checkpointing and log pruning
33

4-
Checkpointing refers to the procedure of transferring all pending updates of pages from the page cache to the storage files.
5-
This action is crucial to limit the number of transactions that need to be replayed during the recovery process, particularly in order to minimize the time required for recovery after an improper shutdown.
4+
Checkpointing is the process of flushing all pending updates of pages from the page cache to the storage files.
5+
This action is crucial to limit the number of transactions that need to be replayed during the recovery process, particularly to minimize the time required for recovery after an improper shutdown.
66

77
Despite the presence of checkpoints, database operations remain secure, as any transactions that have not been confirmed to have their modifications persisted to storage will be replayed upon the next database startup.
8-
However, this assurance is contingent upon the availability of the collection of changes comprising these transactions, which is maintained in the transaction logs.
8+
However, this assurance is contingent upon the availability of the collection of changes comprising these transactions, which is maintained in the xref:database-internals/transaction-logs.adoc[transaction logs].
99

1010
Maintaining a long list of unapplied transactions (due to infrequent checkpoints) leads to the accumulation of transaction logs, as they are essential for recovery purposes.
11-
Checkpointing involves the inclusion of a special "Checkpointing" entry in the transaction log, marking the last transaction at which checkpointing occurred.
11+
Checkpointing involves the inclusion of a special _Checkpointing_ entry in the transaction log, marking the last transaction at which checkpointing occurred.
1212
This entry serves the purpose of identifying transaction logs that are no longer necessary, as all the transactions they contain have been securely stored in the storage files.
1313

14-
The process of eliminating transaction logs that are no longer required for recovery is known as pruning. From the aforementioned explanation, it becomes evident that pruning is reliant on checkpointing.
15-
In other words, checkpointing determines which logs can be pruned and determines the occurrence of pruning, as the absence of a checkpoint implies that the set of transaction log files available for pruning cannot have changed.
14+
The process of eliminating transaction logs that are no longer required for recovery is known as _pruning_.
15+
Pruning is reliant on checkpointing.
16+
Checkpointing determines which logs can be pruned and determines the occurrence of pruning, as the absence of a checkpoint implies that the set of transaction log files available for pruning cannot have changed.
1617
Consequently, pruning is triggered whenever checkpointing takes place, with or without a specific verification of their existence.
1718

18-
== Triggering of checkpointing (and pruning) events
19+
== Configure the checkpointing and pruning events
20+
21+
Checkpointing is done periodically and is used to recover the database in case of a crash. The checkpoint settings control the frequency of checkpoints and the amount of data that is written to disk in each checkpoint.
1922

2023
The checkpointing policy, which is the driving event for pruning, is configured by xref:configuration/configuration-settings.adoc#config_db.checkpoint[`db.checkpoint`] and can be triggered in a few different ways:
2124

@@ -27,21 +30,22 @@ Note that no checkpointing is being performed implying no pruning happens.
2730
This is the default behavior and the only one available in Community Edition.
2831

2932
* `CONTINUOUS` label:enterprise[Enterprise Edition]
30-
This policy constantly checks if a checkpoint is possible (i.e if any transactions committed since the last successful checkpoint) and if so, it performs it.
31-
* Pruning is triggered immediately after it completes, just like in the periodic policy.
33+
This policy constantly checks for transactions committed after the last successful checkpoint and when there are some, it performs the checkpointing.
34+
The log pruning is triggered immediately after the checkpointing completes, just like in the periodic policy.
35+
36+
* `VOLUME` label:enterprise[Enterprise Edition]
3237

3338
* `VOLUMETRIC` label:enterprise[Enterprise Edition]
34-
This checkpointing policy checks every 10 seconds if any logs are available for pruning and, if so, it triggers a checkpoint and subsequently, it prunes the logs.
39+
This policy checks every 10 seconds if there is enough volume of logs available for pruning and, if so, it triggers a checkpoint and subsequently, it prunes the logs.
40+
By default, the volume is set to 250MiB, but it can be configured using the setting xref:configuration/configuration-settings.adoc#config_db.checkpoint.tx_log.volume_threshold[`db.checkpoint.tx_log.volume_threshold`].
3541
This policy appears to invert the control between checkpointing and pruning, but in reality, it only changes the criteria for when checkpointing must happen.
36-
Instead of relying on a time trigger, as in the previous two, it relies on a pruning check.
37-
Pruning will still happen after checkpointing has occurred, as with the other two policies.
38-
Nevertheless, since the check depends on the existence of prunable transaction log files, this policy depends on pruning configuration.
42+
The pruning is still triggered by the checkpointing event.
3943

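The trigger conditions of the `CONTINUOUS` and `VOLUMETRIC` policies described above can be summarized in a small Python sketch. It is illustrative only and is not Neo4j code: `LogState`, `should_checkpoint`, and `run_check` are hypothetical names, and the `>=` comparison against the threshold is an assumption about what "enough volume" means. The 250MiB default is taken from the text.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class LogState:
    """Hypothetical snapshot of what a checkpoint check looks at."""
    committed_since_checkpoint: int  # transactions committed after the last checkpoint
    prunable_log_bytes: int          # volume of logs a checkpoint would make prunable


def should_checkpoint(policy: str, state: LogState,
                      volume_threshold: int = 250 * MIB) -> bool:
    """Return True when the given policy would trigger a checkpoint."""
    if policy == "CONTINUOUS":
        # Checkpoint whenever anything was committed since the last checkpoint.
        return state.committed_since_checkpoint > 0
    if policy == "VOLUMETRIC":
        # Checkpoint once enough log volume could be pruned (assumed: >= threshold).
        return state.prunable_log_bytes >= volume_threshold
    raise ValueError(f"policy not covered by this sketch: {policy}")


def run_check(policy, state, checkpoint, prune):
    """Checkpoint first, then prune: pruning always follows checkpointing."""
    if should_checkpoint(policy, state):
        checkpoint()
        prune()
        return True
    return False
```

In both branches the pruning step runs only after the checkpoint has completed, which mirrors the statement that pruning is still triggered by the checkpointing event.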
4044
[[transaction-logging-log-pruning]]
4145
== Configure log pruning
4246

4347
Transaction log pruning refers to the safe and automatic removal of old, unnecessary transaction log files.
44-
The transaction log can be pruned when o=ne or more files fall outside of the configured retention policy.
48+
The transaction log can be pruned when one or more files fall outside of the configured retention policy.
4549

4650
Two things are necessary for a file to be removed:
4751

@@ -70,7 +74,7 @@ The interval between checkpoints can be configured using:
7074

7175
== Controlling transaction log pruning
7276

73-
Transaction log pruning configuration primarily deals with specifing the number of transaction logs that should remain available. The primary reason for leaving more than the absolute minimum amount required for recovery comes from requirements of clustered deployments and online backup. Since database updates are communicated between cluster members and backup clients through the transaction logs, keeping more than the minimum amount necessary allows for transferring just the incremental changes (in the form of transactions) instead of the whole store files, which can lead to substantial savings in time and network bandwidth. This is true for HA deployments, backups and Read Replicas in Causal Clusters. However, in the case of Core members in Causal Clustering it is not the transaction logs that matter, but rather the Raft log contents. That scenario is covered in a separate KB article.
77+
Transaction log pruning configuration primarily deals with specifying the number of transaction logs that should remain available. The primary reason for leaving more than the absolute minimum amount required for recovery comes from the requirements of clustered deployments and online backup. Since database updates are communicated between cluster members and backup clients through the transaction logs, keeping more than the minimum amount necessary allows for transferring just the incremental changes (in the form of transactions) instead of the whole store files, which can lead to substantial savings in time and network bandwidth. This is true for HA deployments, backups and Read Replicas in Causal Clusters. However, in the case of Core members in Causal Clustering it is not the transaction logs that matter, but rather the Raft log contents. That scenario is covered in a separate KB article.
7478

7579
The amount of transaction logs left after a pruning operation is controlled by the setting `dbms.tx_log.rotation.retention_policy` and it can take a variety of values. They are of the form `<numerical value> <measurement>`.
7680

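To illustrate the `<numerical value> <measurement>` form, here is a minimal Python sketch that splits such a value into its parts. It is not Neo4j's own parser: `RetentionPolicy` and `parse_retention_policy` are hypothetical names, the optional `k`/`M`/`G` magnitude suffix is an assumption, and the sample values `2 days` and `1G size` are assumed examples rather than values taken from this page.

```python
import re
from typing import NamedTuple


class RetentionPolicy(NamedTuple):
    value: int        # the numerical value
    suffix: str       # optional magnitude suffix such as "k", "M" or "G" (assumed)
    measurement: str  # what the value measures, lower-cased


# A number, an optional one-letter suffix, whitespace, then a measurement word.
_PATTERN = re.compile(r"^\s*(\d+)([kKmMgG]?)\s+([A-Za-z_]+)\s*$")


def parse_retention_policy(text: str) -> RetentionPolicy:
    """Split '<numerical value> <measurement>' into its components."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not of the form '<numerical value> <measurement>': {text!r}")
    value, suffix, measurement = match.groups()
    return RetentionPolicy(int(value), suffix, measurement.lower())
```

The sketch deliberately does not check which measurement names are valid or how a suffix scales the value. Those details belong to the setting's reference documentation.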