
chunk objects when uploading via PutObject #30

Merged
merged 4 commits into vmware-tanzu:master on Mar 17, 2020

Conversation

@skriss (Member) commented Mar 4, 2020

closes #26

I've pushed a dev image containing this change to steveheptio/velero-plugin-for-microsoft-azure:chunk-objects.

@skriss (Member, Author) commented Mar 4, 2020

@dharmab I'm not sure if you have an environment where you're comfortable testing out a patch, but I've pushed a dev image containing this change (#30 (comment)). It seemed to work well in my testing, but it would be great to get additional eyes on it.

If you do have an environment where you can test this, you can either kubectl -n velero edit deploy/velero and change the image for the plugin init container, or you can do:

velero plugin remove velero/velero-plugin-for-microsoft-azure:v1.0.0 # change the version tag if you've already upgraded to v1.0.1
velero plugin add steveheptio/velero-plugin-for-microsoft-azure:chunk-objects

Signed-off-by: Steve Kriss <krisss@vmware.com>
@dharmab commented Mar 4, 2020

@skriss I'll be able to test this within 24 hours. Actually, I just realized that the impacted cluster is still running Velero 1.1.0; we'll have it upgraded to 1.2.0 soon, though.

@dharmab commented Mar 5, 2020

I've upgraded my impacted cluster to Velero 1.2 and will test this PR there.

@dharmab commented Mar 5, 2020

I deployed the plugin and ran a test backup, which failed without logging any errors:

[kube]$ velero backup get dharmab-test-2
NAME             STATUS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
dharmab-test-2   Failed   2020-03-05 14:03:48 -0700 MST   6d        default            <none>
[kube]$ velero backup logs dharmab-test-2 | grep -v "level=info"
[kube]$

The log content is somewhat sensitive so I cannot post it publicly, but it just contained the standard info logs about the items it was backing up.

@skriss (Member, Author) commented Mar 5, 2020

Hmm. Are there any errors in the server log (e.g. kubectl -n velero logs deploy/velero | grep error)?

@dharmab commented Mar 5, 2020

I checked the server logs we shipped to our log collector and I only see "level=info" logs from the velero server.

@skriss (Member, Author) commented Mar 5, 2020

@dharmab can you find the server log entries from around the time the backup finished running? Any info from that would be helpful for tracking this down.

I'll try some more tests and see if I can figure out what's happening, too.

@skriss (Member, Author) commented Mar 5, 2020

I wonder if you're running into pod resource issues: do you have CPU/memory requests/limits set for the velero pod?

It's possible the chunks need to be smaller, since this reads 100MB at a time, which could push past the pod's memory limits.

@dharmab commented Mar 6, 2020

Pod resource limits aren't the issue; we had to raise our limits very high due to vmware-tanzu/velero#2069, which was only fixed in 1.3.0 a few days ago.

I'll see if I can create a reproducible case with a script to create a few thousand configmaps or something.

@skriss (Member, Author) commented Mar 6, 2020

OK; I did further testing and was unable to reproduce, so any additional info you can provide from the logs, etc. will be very helpful for figuring this out.

@dharmab commented Mar 6, 2020

I was able to reproduce and test this successfully with Velero 1.3. I think the mentioned memory leak in Velero 1.2 tainted the result.

Steps to reproduce:

dd if=/dev/urandom bs=1024 count=256 of=256k-padding.txt
kubectl create namespace velero-azure-issue-26
for i in {0..2048}; do kubectl -n velero-azure-issue-26 create configmap padding-$i --from-file 256k-padding.txt; done

This creates ~512MB of configmaps on a cluster (it takes a while to run if the apiserver isn't on localhost). Note that the excess random data is needed to prevent Velero from shrinking the upload below 256MiB through compression.

With the above configmaps present, plugin 1.0.0 fails to create backups while the patch works. Unfortunately there aren't any reproducible error logs in Velero when the backup fails; I'm not sure why.

@skriss skriss changed the title [WIP] chunk objects when uploading via PutObject chunk objects when uploading via PutObject Mar 6, 2020
@skriss (Member, Author) commented Mar 6, 2020

Alright, this is ready for review from the other maintainers.

For anyone following, I pushed a new image, steveheptio/velero-plugin-for-microsoft-azure:chunk-objects-final, containing the current version of the code that allows the block/chunk size to be optionally configured via the BackupStorageLocation's config.blockSizeInBytes field. It still defaults to 100MB.
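For anyone wanting to try the new field, here is a sketch of what a BackupStorageLocation using it might look like. The bucket, resource group, and storage account names below are placeholders, not values from this PR; only the `config.blockSizeInBytes` field is what this change adds.

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: azure
  objectStorage:
    bucket: velero-backups          # placeholder container name
  config:
    resourceGroup: my-rg            # placeholder
    storageAccount: mystorageacct   # placeholder
    blockSizeInBytes: "104857600"   # optional; defaults to 104857600 (100MB)
```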

@skriss skriss marked this pull request as ready for review March 6, 2020 22:35
@skriss skriss requested a review from carlisia March 10, 2020 21:53
@skriss (Member, Author) commented Mar 13, 2020

@carlisia @nrb @ashish-amarnath gentle reminder that this needs some review 🙏

@ashish-amarnath (Contributor) left a review comment:

LGTM!

@carlisia (Contributor) left a review comment:

Just a question, otherwise lgtm.

# for more information on block blobs.
#
# Optional (defaults to 104857600, i.e. 100MB).
blockSizeInBytes: "10485760"
Contributor:

Should the value here be "104857600"? I think 10485760 corresponds to 10MB.

Member Author:

Yeah, I put a non-default value here since the field is not required if you just want the default. I could put something totally different (e.g. 1024) so it's not confusing.

Contributor:

Ahhhhhhh. Could it be left blank?

Member Author:

eh, I'll just put the default value in. I'd like to have a sample value.

Contributor:

Yes, that's good too.

Member Author:

Updated.

Signed-off-by: Steve Kriss <krisss@vmware.com>
@carlisia (Contributor) left a review:

👍

@carlisia carlisia merged commit f524207 into vmware-tanzu:master Mar 17, 2020
@skriss skriss deleted the chunk-objects branch May 7, 2020 17:34