Resources under the single app limit · Issue #41
Comments
that's interesting... how long does it hang until the error?
does it hang before showing the diff? (im guessing so, based on the error.) can you describe your cluster a bit more, specifically:
im also curious how long the following command runs against your cluster: https://gist.github.com/cppforlife/25890e4a9e732413bbf83c81e4a808b1 (122 resources to be created, ~3s to check cluster and calculate diff)
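A rough sketch of what such a timed run could look like, with placeholder names (manifests-from-gist.yml stands in for the gist contents; the exact command wasn't preserved above):

```bash
# sketch only: manifests-from-gist.yml is the YAML downloaded from the gist above;
# app name is an arbitrary placeholder
time kapp deploy -a timing-test -f manifests-from-gist.yml --yes
```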
im also creating a new release for kapp that includes a --debug flag so that we can get to the bottom of whats going on in your cluster.
Sorry for disappearing:

- Executed several times on my local machine (100mbit wifi)
- Correct, it hangs before the diff
- It's about 20 instances of the same application with different settings (2 deployments, 1 certificate CRD, 2 PDBs, 2 configmaps, 1 ingress, 2 services)
- https://gist.github.com/pavel-khritonenko/a4ffb3bec510a1d4d1a3b419cfd92993
- Cluster admin permissions (no limits)
- EKS (Amazon Web Services)
Currently I've added the deployment to the CI/CD process and run it from a GitLab runner close to the cluster (same subnet) - it never fails there.
19 seconds, success

`$ cue dump | grep kind`
oh, that's interesting. im using a default GKE cluster with 3 nodes, and would have expected a similar response time (~3s).
how beefy are the control plane machines? not sure if AWS tells you those details.
They don't tell us such things. If you want me to run any requests with timings, I'd be glad to test for you.
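For example, a simple client-side latency check against the API server (hypothetical commands, not from the thread):

```bash
# rough gauge of API server responsiveness from the machine that runs kapp
time kubectl get --raw /healthz
time kubectl get deployments --all-namespaces -o name > /dev/null
```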
@pavel-khritonenko would you mind trying out https://github.com/k14s/kapp/releases/tag/v0.14.0 with the --debug flag and posting the results?
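A minimal sketch of such a run, with placeholder app name and manifest path:

```bash
# v0.14.0 binary with debug output enabled; app name and path are placeholders
kapp deploy -a my-app -f manifests/ --debug
```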
I just faced the same issue deploying 76 Kubernetes deployments (nothing special, a single deployment with a single container each, different env variables). Initial creation was fast and flawless; updating the same definitions fails for the same reason.
@pavel-khritonenko ive been away on vacation, hence my slow response (coming back next week). meanwhile im intrigued that you mention creation is fast while update isn't. could you attach --debug output for creation as well, given what the above debug log showed?
@pavel-khritonenko would you mind building from develop and running kapp? ive made two changes: (1) cd6e6bb throttles a seemingly expensive operation on your cluster to 10 at a time, and (2) 78ab39c includes more info in --debug. to build (requires checking out into GOPATH):
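The build steps themselves weren't included above; a rough sketch, assuming the repo's hack/build.sh script and a standard GOPATH checkout, could look like this:

```bash
# sketch only: assumes hack/build.sh exists and a GOPATH-style layout
mkdir -p "$GOPATH/src/github.com/k14s"
cd "$GOPATH/src/github.com/k14s"
git clone https://github.com/k14s/kapp && cd kapp
git checkout develop
./hack/build.sh      # should leave a ./kapp binary in the repo root
./kapp version
```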
@pavel-khritonenko any update on this issue?
Sorry, on vacation atm, will check next week.
@pavel-khritonenko checking in, any updates on this?
Yes, sorry for disappearing. Yesterday I figured out a few things that caused the timeout. First - I don't specify the `revisionHistoryLimit` parameter anywhere in my deployments. Second - we use https://keel.sh to auto-update our deployments, so any commit to a branch triggers a deployment update and creates a new replica set. As a result, we get a lot of replica sets for each of ~60 deployments, and even after I specified the limit there was already a huge number of replica sets for kapp to fetch.
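A minimal sketch of capping it, assuming the field in question is revisionHistoryLimit as the replica-set buildup suggests (deployment name and limit value are placeholders):

```bash
# placeholder deployment name; caps retained ReplicaSets so they stop piling up
kubectl patch deployment example-app --type merge -p '{"spec":{"revisionHistoryLimit":3}}'

# quick look at how many ReplicaSets have already accumulated in the namespace
kubectl get rs --no-headers | wc -l
```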
nice finds.
i didnt quite follow this one. are you saying 2147483647 showed up in the diff? who was setting it?
I'm not sure where that value comes from, but I haven't set it myself. Cannot reproduce it with the latest version of kapp - my build agent is still using an older one.
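A hypothetical way to check what value the API server actually stored for that field (assumed above to be revisionHistoryLimit; deployment name is a placeholder):

```bash
# placeholder name; prints the server-side value of spec.revisionHistoryLimit
kubectl get deployment example-app -o jsonpath='{.spec.revisionHistoryLimit}{"\n"}'
```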
Managed to reproduce it with the older version: manually changed the value, and after applying it showed up again in the diff. Deleted that deployment manually afterwards.

Seems it's not related to kapp at all. Finally got it: it seems the API server itself sets that value.
yup, sounds like a server side behaviour. ill close the issue (ill probably file a different issue that throws in a warning when fetching resources takes a long time). thanks for digging in.
Original issue:
I use kapp to manage a set of deployments. Under a single application I deploy about ~230 resources (generated). At some point deployments started taking a long time, and after adding more resources it stopped working at all: it hangs for a couple of minutes and then I get an error when I run it locally. When I run it closer to the target kubernetes cluster (in the same AWS network) it works better (fails less often).
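For context, a rough sketch of the kind of invocation described, with placeholder app name and manifest directory (the exact command and error output weren't preserved above):

```bash
# single kapp app containing ~230 generated resources; name and path are placeholders
kapp deploy -a all-services -f ./generated/ --diff-changes --yes
```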