Improve MySQL restore performance #1131
Conversation
pkg/storages/storage/mergewriter.go
		continue
	}
	rbytes := len(block)
	wbytes, err := sink.Write(block)
io.Writer may write the block partially without returning an error.
We should retry with the rest of the block until it's fully written or an error is returned.
According to the io.Writer documentation:
Write must return a non-nil error if it returns n < len(p).
2ea5738 to f323485
@@ -155,3 +196,19 @@ func (uploader *Uploader) UploadMultiple(objects []UploadObject) error {
	}
	return nil
}

func (uploader *Uploader) ChangeDirectory(relativePath string) {
I guess that SetFolder(storage.Folder) would be more universal.
SetFolder sounds like an absolute path. ChangeDirectory is like unix cd.
)

type BackupStreamMetadata struct {
	Type string `json:"type"`
If this metadata does not consume much space, maybe just add it to the sentinel?
It would increase coupling between the uploader metadata and the sentinel.
internal/stream_metadata.go
}

func UploadBackupStreamMetadata(uploader UploaderProvider, metadata interface{}, backupName string) error {
	sentinelName := MetadataNameFromBackup(backupName)
Well, this name interferes with the Postgres metadata.json file, but has a different purpose. Since this file contains only the stream-specific metadata, maybe name it something like stream_metadata.json?
}

uploader := &SplitStreamUploader{
	Uploader: &Uploader{
Just pass the Uploader to NewSplitStreamUploader to avoid constructing the internal uploader. Actually, can you use the UploaderProvider interface here?
* Improve download speed by splitting the backup stream into multiple blobs, each of which can be decoded & decompressed independently
* use a background readahead goroutine to separate the decoding and decompressing stages
* add concurrent range-based S3 reader (WIP)
* save the type of the backup in the sentinel
* use the sentinel in the backup-fetch command
* improve error handling
…allelism. However it is better to just increase `WALG_STREAM_SPLITTER_PARTITIONS`: in this case parallelism increased without adding unnecessary memory copy.
…se memory buffers
…hronously copy data with size greater than 2*blockSize.
* Hide all upload logic inside the new SplitStreamUploader
* use the new Uploader in the mongodb & mysql implementations
This patch improves wal-g-mysql backup-push/backup-fetch by splitting the single-stream backup into multi-stream backups.

To keep the approach database-agnostic, we make no assumptions about the backup stream wal-g gets from stdin. Instead, wal-g splits the backup stream into blocks of WALG_STREAM_SPLITTER_BLOCK_SIZE bytes (in storages/splitmerge/splitreader.go). Blocks are then sent to one of WALG_STREAM_SPLITTER_PARTITIONS output streams, and each stream is compressed, encrypted, and uploaded separately.

The restore process does the opposite: it fetches multiple streams (utilizing the network through multiple connections); each stream can then be decrypted and decompressed independently (utilizing more CPU cores). Finally, it merges (in storages/splitmerge/mergewriter.go) all streams back into one stream.

In my benchmarks of wal-g backup-fetch LATEST --turbo: