Flaky scale integration test or scale intermittent issue #2250

squakez · 2021-04-30T07:36:37Z

--- FAIL: TestIntegrationScale/Scale_integration_with_Camel_K_client (60.05s)

I've run into this problem many times recently (last time here). This test sometimes fails for no apparent reason, so either the test is flaky or there is a real problem in the scaling. Either way, I think we should fix it.


astefanutti · 2021-04-30T08:25:37Z

It seems the scaling works as expected, as one of the pods is marked as deleted after the down-scaling:

DeletionTimestamp: {
  Time: 2021-04-30T07:33:16Z,
},

But the pod is not actually deleted. Either the timeout (60s) is simply too short, or there is an issue during the graceful shutdown of the Integration pod:

ContainerStatuses: [
    {
        Name: "integration",
        State: {
            Waiting: {
                Reason: "ContainerCreating",
                Message: "",
            },
            Running: nil,
            Terminated: nil,
        },
        LastTerminationState: {
            Waiting: nil,
            Running: nil,
            Terminated: {
                ExitCode: 137,
                Signal: 0,
                Reason: "ContainerStatusUnknown",
                Message: "The container could not be located when the pod was deleted.  The container used to be Running",
                StartedAt: {
                    Time: 0001-01-01T00:00:00Z,
                },
                FinishedAt: {
                    Time: 0001-01-01T00:00:00Z,
                },
                ContainerID: "",
            },
        },
        Ready: false,
        RestartCount: 0,
        Image: "kind-registry:5000/test-93060b55-d350-424a-940b-ac238a56a23e/camel-k-kit-c25r4v0p0t9ikun92a80@sha256:be19aebd01eb8c08805670ff92d44f1b7c74915d4d94846d1314a01a3223999d",
        ImageID: "",
        ContainerID: "",
        Started: false,
    },
],

astefanutti · 2021-04-30T08:28:45Z

It seems related to kubernetes/kubernetes#97288.

astefanutti · 2021-04-30T08:39:20Z

If that assumption is correct, the fix is kubernetes/kubernetes#97980, which has been cherry-picked to 1.20.x in kubernetes/kubernetes#97998.

Let me try bumping the Kubernetes version used in CI to 1.20.6 ...

astefanutti mentioned this issue Apr 30, 2021

fix(ci): Upgrade Kubernetes to version 1.21.1 #2251

Merged

astefanutti added the area/continuous integration label (Related to CI and automated testing) Apr 30, 2021

squakez mentioned this issue May 6, 2021

refactor(example): error handler using a real Kafka topic #2248

Merged

astefanutti closed this as completed in #2251 May 19, 2021
