
Collecting ideas on how to speed up recovery #943

@toydarian

Description


I have a fairly large database and currently replaying WALs takes over 24 hours.
I'm trying to find ways to speed this up and would appreciate some input.
So far, I have found two possible improvements that could be made to the barman-wal-restore script.

  • The first one is in try_deliver_from_spool, where the file is copied instead of moved. Assuming we are not on a copy-on-write file-system and the spool is on the same file-system as pg_wal, it would be faster to move or hard-link the file instead of copying it (see the first sketch after this list).
  • As far as I understand it, even when running with multiple parallel processes fetching files, the script fetches n files, PostgreSQL replays those n files and asks for the next one, at which point barman-wal-restore fetches the next batch of n files, and the cycle repeats. I wonder if it would be possible to fetch files continuously, so the database never has to wait for a file to be delivered (see the second sketch after this list).
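
For the first idea, here is a minimal sketch of what the delivery step could do instead of copying. deliver_from_spool is a hypothetical stand-in for the logic in try_deliver_from_spool, not the actual function; it assumes the spool and pg_wal may or may not share a file-system, renames when they do, and falls back to copy-and-unlink when they don't:

```python
import errno
import os
import shutil

def deliver_from_spool(spool_path: str, dest_path: str) -> None:
    """Move a WAL file from the spool into pg_wal.

    Hypothetical sketch: a rename is effectively free when both paths
    are on the same file-system; across devices it fails with EXDEV,
    so we fall back to a plain copy followed by an unlink.
    """
    try:
        os.rename(spool_path, dest_path)  # atomic, no data copied
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # not a cross-device move: re-raise
            raise
        shutil.copy2(spool_path, dest_path)
        os.unlink(spool_path)
```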
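
For the second idea, a rough sketch of a background prefetcher that keeps the spool topped up instead of fetching in lock-step batches. Everything here is an assumption for illustration: the spool path, the Barman host and WAL path in the rsync call, the lookahead depth, and the worker count; next_wal_segments assumes the default 16 MB segment size (256 segments per log file):

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

SPOOL_DIR = "/var/spool/barman-wal"  # assumed spool location
LOOKAHEAD = 16                       # segments to keep ready ahead

def next_wal_segments(segment: str, count: int, segs_per_log: int = 256):
    """Names of the `count` segments following `segment`.

    WAL file names are TTTTTTTTXXXXXXXXYYYYYYYY (hex): timeline,
    log number, segment number. With the default 16 MB segment
    size there are 256 segments per log file.
    """
    tli = segment[:8]
    log = int(segment[8:16], 16)
    seg = int(segment[16:24], 16)
    names = []
    for _ in range(count):
        seg += 1
        if seg >= segs_per_log:
            seg, log = 0, log + 1
        names.append(f"{tli}{log:08X}{seg:08X}")
    return names

def fetch_segment(segment: str) -> None:
    # Placeholder transfer: pull one segment from the Barman host
    # (host name and remote path are assumptions).
    subprocess.run(
        ["rsync", f"barman@backup:/var/lib/barman/wals/{segment}", SPOOL_DIR],
        check=True,
    )

def prefetch(last_requested: str) -> None:
    """Top the spool up to LOOKAHEAD segments past the last request."""
    missing = [
        s for s in next_wal_segments(last_requested, LOOKAHEAD)
        if not os.path.exists(os.path.join(SPOOL_DIR, s))
    ]
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(fetch_segment, missing))
```

Run from a daemon (or re-entered on every restore_command call), this keeps fetching ahead of the replay position, so PostgreSQL should almost always find the next segment already sitting in the spool.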

Just to be clear, I don't expect anybody to implement any of this; I'm collecting ideas that I plan to implement myself.
