Restore can complete but fail writing to etcd with request too large #108

Closed
dgoodwin opened this issue Sep 29, 2017 · 11 comments
@dgoodwin
Contributor

dgoodwin commented Sep 29, 2017

I created a backup for a specific namespace and restored it into a rebuilt cluster. My particular cluster may contain a larger-than-normal number of cluster objects (cluster role bindings, etc.). I saw quite a few request-throttling messages in the log.
ark-restore.log

The restore completed and my namespace was restored (though with failures from service account problems, probably due to the new cluster; I will dig into that next), but it fails to write its final status because etcd finds the request too large. My restore is then stuck in progress.

I0929 12:56:11.805636       1 restore_controller.go:255] restore heptio-ark/test1-11-20170929125444 completed
I0929 12:56:11.805661       1 restore_controller.go:258] updating restore heptio-ark/test1-11-20170929125444 final status
I0929 12:56:11.991224       1 restore_controller.go:260] error updating restore heptio-ark/test1-11-20170929125444 final status: etcdserver: request is too large
$ ark restore get
NAME                      BACKUP     STATUS       WARNINGS   ERRORS    CREATED                         SELECTOR
test1-11-20170929125444   test1-11   InProgress   0          0         2017-09-29 12:54:45 +0000 UTC   <none>

Environment was OpenShift 3.7 based on Kubernetes 1.7. Etcd 3.2.5.

ncdc added the Bug label Sep 29, 2017
@ncdc
Contributor

ncdc commented Sep 29, 2017

Can you grab the log for the restore itself (ark restore logs <restore name>)?

I'm thinking there are a couple of issues here:

  1. We need to make the QPS and burst configurable for the client the Ark server uses.
  2. We may need to rethink how/where we store warnings and errors for a restore. I'm guessing that you had so many warnings and errors that the overall Restore object exceeded the etcd raft message size limit (1.5MB).

I'm going to create a separate issue for 1, and we'll keep this one to track 2.
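
To illustrate the second point, here is a minimal sketch of how a status that embeds every warning and error can outgrow the limit. The `restoreStatus` type, the `fitsInEtcd` helper, and the use of JSON size as a proxy for the stored size are all assumptions made for illustration; they are not Ark's actual types or code. The 1.5MB figure from above is interpreted here as 1.5 MiB.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxRequestBytes is the 1.5MB figure quoted in the thread,
// interpreted here as 1.5 MiB (assumption).
const maxRequestBytes = 1536 * 1024

// restoreStatus is a hypothetical stand-in for the status stored on a
// Restore object; it is not the real Ark type.
type restoreStatus struct {
	Phase    string              `json:"phase"`
	Warnings map[string][]string `json:"warnings"`
	Errors   map[string][]string `json:"errors"`
}

// encodedSize returns the JSON-encoded size of the status in bytes,
// used as a rough proxy for what would be written to etcd.
func encodedSize(s restoreStatus) (int, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

// fitsInEtcd reports whether the encoded status stays under the limit.
func fitsInEtcd(s restoreStatus) (bool, error) {
	n, err := encodedSize(s)
	if err != nil {
		return false, err
	}
	return n < maxRequestBytes, nil
}

func main() {
	// Simulate a restore into a busy cluster that produced many warnings.
	s := restoreStatus{Phase: "Completed", Warnings: map[string][]string{}}
	for i := 0; i < 50000; i++ {
		msg := fmt.Sprintf("services \"svc-%d\" already exists", i)
		s.Warnings["default"] = append(s.Warnings["default"], msg)
	}

	n, err := encodedSize(s)
	if err != nil {
		panic(err)
	}
	ok, err := fitsInEtcd(s)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded size: %d bytes, fits under limit: %v\n", n, ok)
}
```

The point of the sketch is only that the object size grows linearly with the number of messages, so a restore into a cluster with many pre-existing objects can cross the limit even when the restore itself succeeds.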

@dgoodwin
Contributor Author

Adding restore log.
restore.log

There definitely looks to be some interplay here with reloading all that cluster data. In this case it was an entirely new cluster, and things are not entirely happy.

@timothysc
Member

@dgoodwin are these really large lists?
If so, we should double-check where Clayton's paging work stands.

@ncdc
Contributor

ncdc commented Sep 29, 2017

@timothysc the problem is that we store the "lists" as structured data within restore.status.

@ncdc
Contributor

ncdc commented Sep 29, 2017

I'm thinking of some ways around this:

  1. Store only # of warnings & # of errors in restore.status
  2. Users can download the restore log if they want to see warnings & errors (they'd have to sift through it), OR
  3. We could store the structured data for warnings & errors in object storage and make it easy to combine that information with what's in etcd; e.g., ark restore get --details could grab the warnings & errors and display them.

ncdc removed the Help wanted label Oct 26, 2017
ncdc self-assigned this Oct 26, 2017
ncdc added this to the v0.6.0 milestone Oct 26, 2017
@ncdc
Contributor

ncdc commented Nov 1, 2017

Here's what I have so far:

$ ark restore describe b2-20171101135335
Name:         b2-20171101135335
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Backup:  b2

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  true

Phase:  Completed

Validation errors:  <none>

Warnings:
  Ark:        <none>
  Cluster:    <none>
  Namespaces:
    default:      services "kubernetes" already exists
    heptio-ark:   services "minio" already exists
                  services "some-svc" already exists
    kube-system:  services "kube-dns" already exists

Errors:
  Ark:        <none>
  Cluster:    <none>
  Namespaces: <none>

WDYT?

@nrb
Contributor

nrb commented Nov 1, 2017

@ncdc The format here looks good to me. I assume you're using option 3 of your previous comment for storing the data? That one seems like the best UX, as opposed to just throwing a potentially huge log at the user.

@dgoodwin
Contributor Author

dgoodwin commented Nov 1, 2017

Really nice functionality and output. How would this work around the etcd request size issue?

@ncdc
Contributor

ncdc commented Nov 1, 2017 via email

@dgoodwin
Contributor Author

dgoodwin commented Nov 1, 2017

Love it. Seeing them broken out per ns/cluster/ark is fantastic. The describe subcommand threw me for a loop, but I forgot you can do whatever you want since it's not kubectl directly.

@ncdc
Contributor

ncdc commented Nov 1, 2017

@nrb yes this is option 3, although using a new describe subcommand.

jmontleon pushed a commit to jmontleon/velero that referenced this issue Sep 14, 2021
(cherry picked from commit ccb545f)

Update PR-BZ automation mapping (vmware-tanzu#84)

(cherry picked from commit aa2b019)

Update PR-BZ automation (vmware-tanzu#92)

Co-authored-by: Rayford Johnson <rjohnson@redhat.com>
(cherry picked from commit ecc563f)

Add publish workflow (vmware-tanzu#108)

(cherry picked from commit f87b779)
4 participants