[E2E] Reconciliation duration metric failure #2742

Closed
squakez opened this issue Nov 10, 2021 · 5 comments · Fixed by #2833 or #2836
Assignees
Labels
area/continuous integration Related to CI and automated testing kind/bug Something isn't working

Comments

@squakez
Contributor

squakez commented Nov 10, 2021

I've been noticing this for a while; I'm not sure whether we're already tracking it somewhere else. There seems to be an error in all the PRs (e.g., https://github.com/apache/camel-k/runs/4163440983?check_suite_focus=true) due to:

 --- FAIL: TestMetrics (100.50s)
    --- PASS: TestMetrics/Build_duration_metric (0.02s)
    --- PASS: TestMetrics/Build_recovery_attempts_metric (0.02s)
    --- FAIL: TestMetrics/reconciliation_duration_metric (0.04s)

Having a look at the code, it seems it complains here: https://github.com/apache/camel-k/blob/main/e2e/common/operator_metrics_test.go#L256

It fails because it cannot match the following:

TestMetrics/reconciliation_duration_metric
    operator_metrics_test.go:256: 
        Expected
            <string>: MetricFamily
        to match fields: {
        .Metric:
        	Expected
        	    <[]*io_prometheus_client.Metric | len:8, cap:8>: [
        	        {
        	            Label: [
        	                {
        	                    Name: "group",
        	                    Value: "camel.apache.org",
        	                    XXX_NoUnkeyedLiteral: {},
        	                    XXX_unrecognized: nil,
        	                    XXX_sizecache: 0,
        	                },
        	                {
        	                    Name: "kind",
        	                    Value: "Build",
        	                    XXX_NoUnkeyedLiteral: {},
        	                    XXX_unrecognized: nil,
        	                    XXX_sizecache: 0,
        	                },
        	                {
        	                    Name: "namespace",
        	                    Value: "test-d047b7ac-9a61-409a-b312-449c4ef86bfa",
        	                    XXX_NoUnkeyedLiteral: {},
        	                    XXX_unrecognized: nil,
        	                    XXX_sizecache: 0,
        	                    ],
        	                    missingElements: [
        	                        <*io_prometheus_client.LabelPair | 0xc00080e120>{
        	                            Name: "version",
        	                            Value: "v1",
        	                            XXX_NoUnkeyedLiteral: {},
        	                            XXX_unrecognized: nil,
        	                            XXX_sizecache: 0,
        	                        },
        	                        <*io_prometheus_client.LabelPair | 0xc00080e150>{
        	                            Name: "kind",
        	                            Value: "Integration",
        	                            XXX_NoUnkeyedLiteral: {},
        	                            XXX_unrecognized: nil,
        	                            XXX_sizecache: 0,
        	                        },
        	                    ],
        	                    extraElements: [
        	                        <*io_prometheus_client.LabelPair | 0xc00065f770>{
        	                            Name: "kind",
        	                            Value: "Kamelet",
        	                            XXX_NoUnkeyedLiteral: {},
        	                            XXX_unrecognized: nil,
        	                            XXX_sizecache: 0,
        	                        },
        	                        <*io_prometheus_client.LabelPair | 0xc00065f830>{
        	                            Name: "version",
        	                            Value: "v1alpha1",
        	                            XXX_NoUnkeyedLiteral: {},
        	                            XXX_unrecognized: nil,
        	                            XXX_sizecache: 0,
        	                        },
        	                    ],
        	                },
        	                "Histogram": <*gstruct.PointerMatcher | 0xc0007b9840>{
        	                    Matcher: <*gstruct.FieldsMatcher | 0xc00080e240>{
        	                        Fields: {
        	                       ...
        	
        	Gomega truncated this representation as it exceeds 'format.MaxLength'.
        	Consider having the object provide a custom 'GomegaStringer' representation
        	or adjust the parameters in Gomega's 'format' package.
        	
        	Learn more here: https://onsi.github.io/gomega/#adjusting-output
        	
        }

fyi @tadayosi @astefanutti @nicolaferraro

@squakez squakez added the area/continuous integration Related to CI and automated testing label Nov 10, 2021
@tadayosi
Member

FYI Antonin’s insight on the exact issue.
#2714 (comment)

@squakez
Contributor Author

squakez commented Nov 10, 2021

FYI Antonin’s insight on the exact issue. #2714 (comment)

I missed that, thanks. So, it seems that we need to keep this open and try to fix it.

@squakez squakez added the kind/bug Something isn't working label Nov 10, 2021
@tadayosi
Member

By the way, when we use Gomega's structural matchers, the diffs printed on test failure can be very long and are therefore truncated by default. In the case above, the real mismatch is hidden in the truncated part.

If everyone is OK with it, we should probably add:

format.MaxLength = 0

somewhere in test_support.go to disable the diff truncation.
https://onsi.github.io/gomega/#adjusting-output

@astefanutti
Member

@tadayosi I agree this would be useful 👍🏼.

@squakez
Contributor Author

squakez commented Dec 9, 2021

Now that we no longer truncate the log, I can see the exact failure:

"Histogram": <*gstruct.PointerMatcher | 0xc0006c9740>{
   Matcher: <*gstruct.FieldsMatcher | 0xc00066ed80>{
       Fields: {
        	  "SampleCount": <*gstruct.PointerMatcher | 0xc0006c9720>{
        	      Matcher: <*matchers.EqualMatcher | 0xc000c5a500>{Expected: <uint64>12},
        	      failure: "Expected\n    <uint64>: 272\nto equal\n    <uint64>: 12",
        	  },
       },

This is an excerpt from this failed build: https://github.com/apache/camel-k/runs/4446921484?check_suite_focus=true

Digging deeper:

Label: [
...
    {
        Name: "kind",
        Value: "Kamelet",
...
    {
        Name: "result",
        Value: "Reconciled",
...
    },
...
],
...
Histogram: {
    SampleCount: 272,

It seems that the reconciliation count for the Kamelet is used instead of the one for the Integration. We're probably counting wrong in the log. I'm trying to fix this.
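
To illustrate the kind of mismatch described above, here is a minimal, stdlib-only sketch of selecting a metric by its label set before comparing its sample count, rather than relying on whichever histogram happens to be picked up. It is not the change that eventually closed this issue. `LabelPair`, `Metric`, `hasLabels` and `findMetric` are hypothetical, simplified stand-ins for the `io_prometheus_client` types shown in the output, and the sample data is illustrative (272 and 12 are taken from the failure message; the remaining label values are assumed).

```go
package main

import "fmt"

// LabelPair and Metric are simplified, hypothetical stand-ins for the
// io_prometheus_client types that appear in the failure output.
type LabelPair struct {
	Name  string
	Value string
}

type Metric struct {
	Labels      []LabelPair
	SampleCount uint64
}

// hasLabels reports whether m carries every name/value pair in want.
func hasLabels(m Metric, want map[string]string) bool {
	got := make(map[string]string, len(m.Labels))
	for _, l := range m.Labels {
		got[l.Name] = l.Value
	}
	for name, value := range want {
		if v, ok := got[name]; !ok || v != value {
			return false
		}
	}
	return true
}

// findMetric returns the first metric whose labels include all of want.
func findMetric(metrics []Metric, want map[string]string) (Metric, bool) {
	for _, m := range metrics {
		if hasLabels(m, want) {
			return m, true
		}
	}
	return Metric{}, false
}

func main() {
	// Illustrative data only: one Kamelet series and one Integration series.
	metrics := []Metric{
		{Labels: []LabelPair{{"kind", "Kamelet"}, {"version", "v1alpha1"}}, SampleCount: 272},
		{Labels: []LabelPair{{"kind", "Integration"}, {"version", "v1"}}, SampleCount: 12},
	}

	// Select the Integration series explicitly before checking its count.
	m, ok := findMetric(metrics, map[string]string{"kind": "Integration", "version": "v1"})
	fmt.Println(ok, m.SampleCount)
}
```

With a lookup like this, an assertion on `SampleCount` is only ever made against the series whose `kind` and `version` labels match, so a Kamelet series with a much larger count cannot be compared by accident.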

3 participants