
[Feature] Compress/decompress snapshots before/after uploading/downloading to/from object store #255

Closed
amshuman-kr opened this issue Aug 5, 2020 · 18 comments · Fixed by #293
Assignees
Labels
area/cost (Cost related) · area/storage (Storage related) · kind/enhancement (Enhancement, improvement, extension) · status/accepted (Issue was accepted as something we need to work on)

Comments

@amshuman-kr
Collaborator

amshuman-kr commented Aug 5, 2020

Feature (What you would like to be added):
At present, (full and incremental) snapshots are stored in the object store uncompressed.

Can we optionally support storing compressed (full and incremental) snapshots in the object store? I.e., compress the snapshots before uploading to the object store and decompress them after downloading from the object store for restoration.

This could be controlled by a configuration/cli flag, e.g. compressSnapshots: true or --compress-snapshots=true.
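
For illustration, here is a minimal sketch of how such a flag could be registered, assuming plain Go flag-style registration; the package, type, field, and flag names are hypothetical, not the project's actual configuration API.

package config

import "flag"

// CompressionConfig is a hypothetical holder for the proposed setting.
type CompressionConfig struct {
	// CompressSnapshots enables compressing snapshots before upload and
	// decompressing them after download.
	CompressSnapshots bool
}

// AddFlags registers the proposed flag on the given flag set.
func (c *CompressionConfig) AddFlags(fs *flag.FlagSet) {
	fs.BoolVar(&c.CompressSnapshots, "compress-snapshots", false,
		"compress snapshots before uploading them to the object store and decompress them after downloading")
}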

Motivation (Why is this needed?):
While storing uncompressed snapshots in the object store may keep the implementation simple and save compression/decompression time, it also leads to higher network and storage usage.

Approach/Hint to implement the solution (optional):

  • Almost none of the object store providers seem to support on-the-go HTTP Transfer-Encoding.
  • Almost all the object store providers seem to support Content-Encoding: gzip header to mark the content in the object store as compressed while supporting the Content-Type header for the actual uncompressed content.
    • However, there is no uniformity (see links below) in supporting automatic compression while uploading or automatic decompression while downloading/serving.

So, it makes sense for etcd-backup-restore to

  • Compress the snapshots explicitly before uploading.
    • In the case of a multipart object, the whole object is to be compressed before splitting into chunks.
  • Mark the content with Content-Encoding: gzip in the object store to indicate that it is compressed, while maintaining the original Content-Type.
  • Decompress the snapshots after downloading for restoration.
  • Support backward compatibility for restoring older non-compressed snapshots for the same etcd.
  • Streaming or on-the-go compression/decompression would be preferable so that the footprint on the disk doesn't increase (a rough sketch follows the reference links below).
  1. https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
  2. https://cloud.google.com/storage/docs/transcoding
  3. https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob
  4. https://docs.microsoft.com/en-us/azure/cdn/cdn-improve-performance
  5. https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ServingCompressedFiles.html#CompressedS3
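
Purely to visualize the approach above, here is a rough sketch of streaming (on-the-go) gzip compression during upload, using Go's standard compress/gzip and the GCS client library; the bucket/object names, content type, and error handling are placeholders, not the eventual implementation.

package example

import (
	"compress/gzip"
	"context"
	"io"

	"cloud.google.com/go/storage"
)

// uploadCompressed streams src through gzip into the target object, marking it
// with Content-Encoding: gzip while keeping the original Content-Type.
func uploadCompressed(ctx context.Context, client *storage.Client, bucket, object string, src io.Reader) error {
	w := client.Bucket(bucket).Object(object).NewWriter(ctx)
	w.ContentType = "application/json" // content type of the uncompressed snapshot
	w.ContentEncoding = "gzip"         // marks the stored bytes as gzip-compressed

	gz := gzip.NewWriter(w) // no temporary file: bytes are compressed as they stream through
	if _, err := io.Copy(gz, src); err != nil {
		gz.Close()
		w.Close()
		return err
	}
	if err := gz.Close(); err != nil { // flush the gzip footer before finalizing the object
		w.Close()
		return err
	}
	return w.Close()
}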
@amshuman-kr amshuman-kr added the kind/enhancement Enhancement, improvement, extension label Aug 5, 2020
@vlerenc vlerenc added area/cost Cost related area/storage Storage related labels Aug 5, 2020
@majst01

majst01 commented Oct 6, 2020

I was looking into this, because we do compression in our backup-restore-sidecar which is for postgres and rethink databases. I would like to reuse this implementation and create a PR here.

But when inspecting the stored files (GCP Buckets in our case), there are always two instances of every backup:

  • one single file with the json content of the full- or increment
  • a directory with the same name with files inside which are actually chunks of the same file already stored.

What is the rationale for this approach?
It creates double the amount of storage and network usage.

From what I understood from reading the code, uploading the backup in chunks should speed up the upload process. But then the single file with the whole content could be skipped.

@amshuman-kr
Collaborator Author

I was looking into this, because we do compression in our backup-restore-sidecar which is for postgres and rethink databases. I would like to reuse this implementation and create a PR here.

@majst01 Thanks a lot for the offer of contribution! ❤️

But when inspecting the stored files (GCP Buckets in our case), there are always two instances of every backup:

  • one single file with the json content of the full- or increment
  • a directory with the same name with files inside which are actually chunks of the same file already stored.
    What is the rationale for this approach?

This is because of the approach of uploading in chunks. In S3, the individual chunks are automatically cleaned up. I was under the impression that the same is true in GCS and other storage providers. But I checked the documentation of GCS composite objects, and it looks like the source parts are not automatically cleaned up.

It creates double the amount of storage and network usage.

Apparently, not in S3. But yes, in GCS. Perhaps other providers too. Thanks for pointing this out. I will check the other providers' documentation and raise a separate issue for this.

@majst01

majst01 commented Oct 6, 2020

OK, interesting.
Can you give me some numbers on what you gain in upload performance by doing chunked uploads? Because when doing compression, doing so for every chunk is not optimal. It would be much better to compress only one file and upload it.

@amshuman-kr
Collaborator Author

amshuman-kr commented Oct 6, 2020

Can you give me some numbers on what you gain in upload performance by doing chunked uploads?

I'm afraid I do not have the numbers, as the chunking implementation was done quite some time ago. But I do remember that the main reason for implementing chunked upload was that we started facing upload timeouts on flaky networks, and since a retry would start the upload from the beginning, the upload would practically never succeed in such cases. Hence the chunked upload, to make uploading possible under flaky networks, and not just for large files.

Because when doing compression, doing so for every chunk is not optimal. It would be much better to compress only one file and upload it.

Yes. The idea of this issue was to compress the one file and then upload the chunks of the compressed file, if needed. On the restoration side, the compressed file would be downloaded and decompressed.

@amshuman-kr
Collaborator Author

Just opened #268

@shreyas-s-rao
Collaborator

@majst01 Regarding your question:

It creates double the amount of storage and network usage

Chunked upload of objects to the bucket basically chunks the object on the client side and first uploads the chunks to the bucket. To finalize this as a proper object in the bucket, the client then "composes" the chunks into a logical encapsulation, which you would see in the GCS bucket as the final object. See the GCS documentation on composite objects for more details about composing sub-objects.
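
To make the compose step concrete, here is a minimal sketch using the GCS Go client; the object names and error handling are illustrative only, and the actual snapstore implementation differs in detail.

package example

import (
	"context"

	"cloud.google.com/go/storage"
)

// composeChunks stitches already-uploaded chunk objects into one final object.
// Note that GCS allows at most 32 source objects per compose call, and the
// source chunk objects are not deleted automatically.
func composeChunks(ctx context.Context, bkt *storage.BucketHandle, final string, chunkNames []string) error {
	srcs := make([]*storage.ObjectHandle, 0, len(chunkNames))
	for _, name := range chunkNames {
		srcs = append(srcs, bkt.Object(name))
	}
	_, err := bkt.Object(final).ComposerFrom(srcs...).Run(ctx)
	return err
}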

And as @amshuman-kr mentioned, chunk upload was adopted to avoid failing uploads due to poor network connections.

when doing compression, doing so for every chunk is not optimal. It would be much better to compress only one file and upload it

Indeed, it would make more sense to compress the single file first and then let the GCS API chunk it and upload it.

I would like to reuse this implementation and create a PR here.

That would be great @majst01 . Thanks for that 😀

@vlerenc
Member

vlerenc commented Oct 6, 2020

And as @amshuman-kr mentioned, chunk upload was adopted to avoid failing uploads due to poor network connections.

+1

Quite some time ago, that was indeed extremely painful and we couldn't have unreliable backups, so we went for chunks (more effort, not as efficient, but it was magnitudes more important to have reliable backups and to avoid failed-backup alerts, which become more likely with larger etcds). So that is the history of that. :-)

I would like to reuse this implementation and create a PR [for compress/uncompress] here.

Awesome, thank you very much @majst01. 👍

@majst01

majst01 commented Oct 6, 2020

A first shot of my WIP can be seen here: https://github.com/gardener/etcd-backup-restore/compare/master...majst01:compression?expand=1

The most difficult part is the CLI flag handling :-)

I don't want to open a draft PR yet; I want to get first feedback on my approach. So if someone could spend a few minutes to have a look, it would be much appreciated.

@amshuman-kr
Collaborator Author

@majst01 Thanks. That was quick!

The changes generally look good and I liked the idea of supporting multiple compression methods.

I had a couple of points though.

1. Backward compatibility

Especially, while restoring from uncompressed previous snapshots. This is mentioned in the issue description above.

Support backward compatibility for restoring older non-compressed snapshots for the same etcd.

Since this change will be rolled out to all live etcds in gardener landscapes, we need to make sure that restoration from existing snapshots works. I was thinking of some kind of metadata. Rules of thumb based on the snapshot file extension might be OK too.

2. Streaming or on-the-go compression/decompression

Perhaps using the Compressor/Decompressor interfaces? Unfortunately, I haven't mentioned this in the issue description.

Typically, we allocate volumes of 25Gi for etcd and the sidecar. Both etcd and the sidecar use the same volume for storing both the etcd database and the temporary backups before upload (also temporary storage during restoration). Since we set the etcd database quota to 8Gi, 25Gi is basically double that plus some buffer for encryption (which is pending in #83, but we are planning to get to it soon considering the guidelines from SGS).

Additional storage during compression/decompression would eat up into the buffer. Do you think stream/on-the-go compression/decompression is possible?

@majst01

majst01 commented Oct 7, 2020

@majst01 Thanks. That was quick!

The changes generally look good and I liked the idea of supporting multiple compression methods.

I had a couple of points though.

1. Backward compatibility

Especially, while restoring from uncompressed previous snapshots. This is mentioned in the issue description above.

Support backward compatibility for restoring older non-compressed snapshots for the same etcd.

Since this change will be rolled out to all live etcds in gardener landscapes, we need to make sure that restoration from existing snapshots works. I was thinking of some kind of metadata. Rules of thumb based on the snapshot file extension might be OK too.

I pushed a small enhancement to support uncompressed snapshots as well. No metadata is needed because everything is decided based on the file extension.
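
To illustrate the extension-based rule, here is a sketch under the assumption that compressed snapshots carry a suffix such as .gz; the actual suffix and helper names in the branch may differ.

package example

import (
	"compress/gzip"
	"io"
	"strings"
)

// gzipSuffix is a hypothetical marker for compressed snapshots.
const gzipSuffix = ".gz"

// snapshotReadCloser returns a reader over the snapshot body, transparently
// decompressing it when the snapshot name indicates compression and passing
// older, uncompressed snapshots through unchanged. The caller still closes
// the underlying body.
func snapshotReadCloser(snapName string, body io.ReadCloser) (io.ReadCloser, error) {
	if !strings.HasSuffix(snapName, gzipSuffix) {
		return body, nil // old, uncompressed snapshot: restore as before
	}
	zr, err := gzip.NewReader(body)
	if err != nil {
		return nil, err
	}
	return zr, nil
}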

2. Streaming or on-the-go compression/decompression

Perhaps using the Compressor/Decompressor interfaces? Unfortunately, I haven't mentioned this in the issue description.

Typically, we allocate volumes of 25Gi for etcd and the sidecar. Both etcd and the sidecar use the same volume for storing both the etcd database and the temporary backups before upload (also temporary storage during restoration). Since we set the etcd database quota to 8Gi, 25Gi is basically double that plus some buffer for encryption (which is pending in #83, but we are planning to get to it soon considering the guidelines from SGS).

Additional storage during compression/decompression would eat up into the buffer. Do you think stream/on-the-go compression/decompression is possible?

mholt/archiver is able to compress/decompress an io.Reader, which will be the base for doing streamed compression.
But OTOH, compressing 8Gi of JSON will end up as ~2-3Gi of compressed output, so I would start with file-based compression support, see how that goes, and add streaming support in a later go. WDYT?

@amshuman-kr
Collaborator Author

I pushed a small enhancement to support uncompressed snapshots as well. No metadata is needed because everything is decided based on the file extension.

Thanks a lot! The change looks good.

mholt/archiver is able to compress/decompress an io.Reader, which will be the base for doing streamed compression.
But OTOH, compressing 8Gi of JSON will end up as ~2-3Gi of compressed output, so I would start with file-based compression support, see how that goes, and add streaming support in a later go. WDYT?

Sounds good to me 👍 We can pick up on-the-go compression/decompression later before picking encryption.

@majst01

majst01 commented Oct 7, 2020

One question arises for me: how do you guys actually test this sidecar? It's a bit hard because deployment is usually done via etcd-druid, which is managed by gardener. We don't have a lot of gardener seeds where we can modify image-vectors easily.

Any hint is welcome.

@amshuman-kr
Collaborator Author

@shreyas-s-rao should be able to help more with that. But there is the sample helm chart, which might be of help. It even uses the Local provider (uploads to a local folder in the volume), so you don't need to mess with buckets.

@majst01

majst01 commented Oct 8, 2020

OK, thanks. I was able to test snapshot creation locally with the local snap store, including compression. @shreyas-s-rao how do you test restoration in this setup?

@amshuman-kr
Collaborator Author

@majst01 I tested your branch for both backup and restoration. I had to make a small adjustment (please see the patch below).

diff --git a/pkg/compress/compress.go b/pkg/compress/compress.go
index 4ba7a0df..b7f95e0e 100644
--- a/pkg/compress/compress.go
+++ b/pkg/compress/compress.go
@@ -44,7 +44,7 @@ func (c *Compressor) Compress(snap *snapstore.Snapshot) error {
        if !c.enabled {
                return nil
        }
-       err := archiver.Archive([]string{snap.SnapDir}, path.Join(snap.SnapDir, snap.SnapName))
+       err := archiver.Archive([]string{snap.SnapDir}, path.Join(snap.SnapDir, snap.SnapName + c.extension))
        if err != nil {
                return err
        }

I could verify that compression/decompression was working well for full snapshots. But the delta snapshots are not compressed/decompressed. I checked the code for delta snapshots and found it not conducive to file compression. In fact, the implementation as well as the snapstore interface is more conducive to on-the-go compression, especially during backup upload. The temporary file creation is currently done in the individual snapstore implementations (only if needed).

IMHO, instead of changing the delta snapshot backup code to be more file-compression friendly, it is better to implement on-the-go compression/decompression, which can then be used in all cases (full and delta, backup and restoration).

In this context, I found Go's standard compress package suite more suitable for on-the-go compression, given that it has natural support for decorating io.ReadCloser and io.WriteCloser. Archiving is not very relevant anyway, because each snapshot (full and delta) is a single file. WDYT?
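
As a rough sketch of the decoration idea with the standard library (the type and constructor names below are made up for illustration, not the eventual Compressor/Decompressor interfaces):

package example

import (
	"compress/gzip"
	"io"
)

// gzipWriteCloser decorates an underlying WriteCloser (e.g. the object store
// upload stream) with on-the-go gzip compression, so no temporary file is
// needed on disk.
type gzipWriteCloser struct {
	gz         *gzip.Writer
	underlying io.WriteCloser
}

// NewGzipWriteCloser wraps w so that everything written is gzip-compressed
// before reaching w.
func NewGzipWriteCloser(w io.WriteCloser) io.WriteCloser {
	return &gzipWriteCloser{gz: gzip.NewWriter(w), underlying: w}
}

func (g *gzipWriteCloser) Write(p []byte) (int, error) { return g.gz.Write(p) }

// Close flushes and closes the gzip stream first, then the underlying stream.
func (g *gzipWriteCloser) Close() error {
	if err := g.gz.Close(); err != nil {
		g.underlying.Close()
		return err
	}
	return g.underlying.Close()
}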

PS: Last but not least, I am keenly aware that this might be asking a bit too much from you. So it is perfectly OK if you do not pursue this change. In that case, we will pick it up along the lines you have already shown.

@majst01

majst01 commented Oct 14, 2020

Hi @amshuman-kr
thanks for your deep insight and suggestions. I also tried things myself and found some differences in the handling of full/incremental backups as well. Therefore your suggestion to go for in-memory compression (instead of archiving) on the io.ReadCloser is totally valid.

I am a bit short on time to spend more on my current effort, so if you would like to pick up my ideas, go ahead. I am happy to help in any form you wish or ask for.

@amshuman-kr
Collaborator Author

@majst01 Thanks a lot for the help already. Much appreciated ❤️

@gardener-robot gardener-robot added the status/accepted Issue was accepted as something we need to work on label Oct 22, 2020
@ishan16696
Member

/assign
