
Integration in error phase can't be scaled: why don't we just rebuild it? #2640

Closed · nicolaferraro opened this issue Sep 16, 2021 · 2 comments · Fixed by #2645

nicolaferraro (Member) commented Sep 16, 2021

It may seem pointless to scale an integration that is not working, but the error condition now also includes pod failures.

So if a container fails its liveness probe because it receives too much traffic, the integration can't be scaled up.

The number of replicas is not synced back into the deployment/ksvc when the phase is "Error", and we know it:

```go
// TODO check also if deployment matches (e.g. replicas)
```

What prevents us from just adding "replicas" to the set of fields considered for the digest, so that when the number of replicas changes the integration restarts from "Initialization"? All steps from "Initialization" to "Running" are idempotent, so why is there such a shortcut for the "replicas" field, just to keep the phase at "Running"? It often causes issues with the GC, such as #2639.
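For illustration, a minimal sketch of what folding replicas into the digest could look like; the types and names here are hypothetical stand-ins, not the actual Camel K digest code:

```go
package digest

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// IntegrationSpec is a stand-in for the real CRD spec; only the
// fields relevant to this example are shown.
type IntegrationSpec struct {
	Sources  []string
	Replicas *int32
}

// Compute returns a digest of the spec. Folding Replicas into the
// hash means any scale operation changes the digest, which would send
// the Integration back through "Initialization".
func Compute(spec IntegrationSpec) string {
	h := sha256.New()
	for _, s := range spec.Sources {
		h.Write([]byte(s))
	}
	if spec.Replicas != nil {
		fmt.Fprintf(h, "%d", *spec.Replicas)
	}
	return base64.RawURLEncoding.EncodeToString(h.Sum(nil))
}
```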

cc: @astefanutti, @squakez, @lburgazzoli

@nicolaferraro nicolaferraro added the kind/bug Something isn't working label Sep 16, 2021
@nicolaferraro nicolaferraro added this to the 1.7.0 milestone Sep 16, 2021
astefanutti (Member) commented Sep 16, 2021

The replicas field is already a bit special for Deployment, as it does not change the PodSpec hash and does not trigger a rollout when changed. We kind of do the same thing for Integration. I'd be inclined to think the core k8s developers had good reasons to do so, and I would find it a bit weird to re-initialise an Integration when it is scaled.
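As a minimal sketch of that Deployment behaviour (simplified stand-in types, not the actual Kubernetes code): the pod-template hash is derived from the pod template only, so changing replicas leaves the hash, and therefore the rollout, untouched:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// PodTemplate and DeploymentSpec are simplified stand-ins for the
// real Kubernetes types.
type PodTemplate struct {
	Image string
	Args  []string
}

type DeploymentSpec struct {
	Replicas int32
	Template PodTemplate
}

// podTemplateHash mirrors the idea behind the pod-template-hash label:
// only the template feeds the hash; Replicas is deliberately excluded.
func podTemplateHash(spec DeploymentSpec) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%v", spec.Template)
	return h.Sum32()
}

func main() {
	spec := DeploymentSpec{Replicas: 1, Template: PodTemplate{Image: "example:1.0"}}
	before := podTemplateHash(spec)
	spec.Replicas = 5 // scale up
	fmt.Println(before == podTemplateHash(spec)) // true: same hash, no new rollout
}
```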

Now for the relation to GC, there are multiple things. In the context of #2639, I still think that creating a Kit owned by the Integration in that case is a shortcut, see #2365. In the context of other GC issues, AFAIR, the problem was forgetting to add both the IntegrationPhaseDeploying and IntegrationPhaseRunning phases. It turns out both are always added, so the question arises: what is the point of having both?

I'm working on scaling right now, as I've found issues in the context of #2443, so I can piggy-back fixes for any new scaling-related issues.

astefanutti (Member) commented Sep 16, 2021

Let me assign this to myself. The Error phase is kind of messy, and I'll fix that in the context of #2443.
