Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix metrics labeling with only directed source/destination context op… #27792

Merged

Conversation

marqc
Copy link
Contributor

@marqc marqc commented Aug 29, 2023

…tions set.

Fixes: #27717

Please ensure your pull request adheres to the following guidelines:

  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.

In case only directed Source/Destination Ingress/Egress context options were specified, metrics were incorrectly labeled. this change ensures that correct labels are included when only directed options are specified in metrics configuration.

Fixes: #27717

Fix hubble metric labeling when only directed Source/Destination Ingress/Egress options are specified.

@marqc marqc requested review from a team as code owners August 29, 2023 10:32
@marqc marqc requested review from chancez and gandro August 29, 2023 10:32
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 29, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Aug 29, 2023
@gandro gandro added release-note/bug This PR fixes an issue in a previous release of Cilium. area/metrics Impacts statistics / metrics gathering, eg via Prometheus. sig/hubble Impacts hubble server or relay needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Aug 29, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 29, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.2 Aug 29, 2023
@gandro
Copy link
Member

gandro commented Aug 30, 2023

/test

@marqc marqc force-pushed the issue-27717-fix-ingress-egress-context-options branch from 8e701d6 to b419e85 Compare September 1, 2023 13:43
Fixes: cilium#27717
Signed-off-by: Marek Chodor <mchodor@google.com>
@marqc marqc force-pushed the issue-27717-fix-ingress-egress-context-options branch from b419e85 to 37e81ce Compare September 4, 2023 08:26
@gandro
Copy link
Member

gandro commented Sep 4, 2023

/test

Edit: ci-ipsec-upgrade hit #27827, clearly unrelated to the PR since it doesn't affect the datapath or IPSec

@gandro
Copy link
Member

gandro commented Sep 4, 2023

Not sure why Travis was not started. I don't see any failure or pending run in the dashboard https://app.travis-ci.com/github/cilium/cilium/pull_requests

@gandro
Copy link
Member

gandro commented Sep 4, 2023

I'll close and re-open the PR to hopefully re-trigger Travis. This will invalidate the tests (which are all green at the momen ) unfortunately, but I don't think there is any other way.

@gandro gandro closed this Sep 4, 2023
@gandro gandro reopened this Sep 4, 2023
@gandro
Copy link
Member

gandro commented Sep 4, 2023

Travis is refusing to acknowledge the existence of this PR. Not sure what else we can do.

I'll mark it ready to merge, since all other CI is green and the affected tests here seem to have passed in IntegrationTests

@gandro gandro added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 4, 2023
@matmerr
Copy link

matmerr commented Sep 5, 2023

@gandro not sure if ready to merge yet, I built this PR and ran some manual validation from the scenario in #27717 with the flow enabled,

node 1: agnhost-a
node 2: agnhost-b

k exec -it agnhost-a -- curl <agnhost-b ip>

and it seems that ingress is working, both egress for source and destination don't report any metrics. In the case of sourceEgressContext, source="" is just empty string, and case of destinationEgressContext, destination="" is empty string as well

hubble-metrics: flow:destinationIngressContext
Node 0
hubble_flows_processed_total{destination="kube-system/agnhost-a",protocol="TCP",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 4

Node 1
hubble_flows_processed_total{destination="kube-system/agnhost-b",protocol="TCP",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 4

hubble-metrics: flow:destinationEgressContext=pod
Node 0
empty response

Node 1
empty response

hubble-metrics: flow:sourceEgressContext=pod
Node 0
empty response

Node 1
empty response

hubble-metrics: flow:sourceIngressContext=pod
Node 0
hubble_flows_processed_total{protocol="TCP",source="kube-system/agnhost-b",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 4

Node 1
hubble_flows_processed_total{protocol="TCP",source="kube-system/agnhost-a",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 4

@chancez chancez added dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Sep 5, 2023
@marqc
Copy link
Contributor Author

marqc commented Sep 6, 2023

@matmerr I verified your scenario with these code changes and everything looks fine.

test pods

nginx-77b4fdf86c-7wflz                1/1     Running     0          8s      10.80.7.153   node-A
nginx-77b4fdf86c-v2pfw                1/1     Running     0          8s      10.80.6.184   node-B

network call:

kubectl exec -it nginx-77b4fdf86c-7wflz -- curl 10.80.6.184

Case 1

cilium config:

hubble-metrics: flow:destinationEgressContext=pod;labelsContext=source_pod,destination_pod,traffic_direction

metrics on node-A

hubble_flows_processed_total{>>>>destination="default/nginx-77b4fdf86c-7wflz"<<<<,destination_pod="nginx-77b4fdf86c-7wflz",protocol="TCP",source_pod="nginx-77b4fdf86c-v2pfw",subtype="to-endpoint",traffic_direction="egress",type="Trace",verdict="FORWARDED"} 3
hubble_flows_processed_total{>>>>destination="default/nginx-77b4fdf86c-v2pfw"<<<<,destination_pod="nginx-77b4fdf86c-v2pfw",protocol="TCP",source_pod="nginx-77b4fdf86c-7wflz",subtype="to-stack",traffic_direction="egress",type="Trace",verdict="FORWARDED"} 5

metrics on the node-B

<none>

Case 2

cilium config:

hubble-metrics: flow:sourceIngressContext=pod;labelsContext=source_pod,destination_pod,traffic_direction

metrics on node-A

<none>

metrics on the node-B

hubble_flows_processed_total{destination_pod="nginx-77b4fdf86c-7wflz",protocol="TCP",>>>>source="default/nginx-77b4fdf86c-v2pfw"<<<<,source_pod="nginx-77b4fdf86c-v2pfw",subtype="to-stack",traffic_direction="ingress",type="Trace",verdict="FORWARDED"} 3
hubble_flows_processed_total{destination_pod="nginx-77b4fdf86c-v2pfw",protocol="TCP",>>>>source="default/nginx-77b4fdf86c-7wflz"<<<<,source_pod="nginx-77b4fdf86c-7wflz",subtype="to-endpoint",traffic_direction="ingress",type="Trace",verdict="FORWARDED"} 5

@anubhabMajumdar
Copy link
Contributor

sourceEgressContext=

Hi @marqc , can you check sourceEgressContext scenario? I think @matmerr faced some issue with that particular context.

@matmerr
Copy link

matmerr commented Sep 6, 2023

fwiw I was also validating with just hubble-metrics: flow:sourceEgressContext=pod, no additional labelsContext

@marqc
Copy link
Contributor Author

marqc commented Sep 7, 2023

@matmerr @anubhabMajumdar

Case 3

cilium config:

hubble-metrics: flow:sourceEgressContext=pod;labelsContext=source_pod,destination_pod,traffic_direction

metrics on node-A

hubble_flows_processed_total{
  destination_pod="nginx-77b4fdf86c-7wflz",
  protocol="TCP",
  source="default/nginx-77b4fdf86c-v2pfw",
  source_pod="nginx-77b4fdf86c-v2pfw",
  subtype="to-endpoint",
  traffic_direction="egress",
  type="Trace",
  verdict="FORWARDED"
} 5
hubble_flows_processed_total{
  destination_pod="nginx-77b4fdf86c-v2pfw",
  protocol="TCP",
  source="default/nginx-77b4fdf86c-7wflz",
  source_pod="nginx-77b4fdf86c-7wflz",
  subtype="to-stack",
  traffic_direction="egress",
  type="Trace",
  verdict="FORWARDED"
} 7

metrics on the node-B

<none>

Case 3.1

cilium config:

hubble-metrics: flow:sourceEgressContext=pod

metrics on node-A

hubble_flows_processed_total{
  protocol="TCP",
  source="default/nginx-77b4fdf86c-7wflz",
  subtype="to-stack",
  type="Trace",
  verdict="FORWARDED"
} 7
hubble_flows_processed_total{
  protocol="TCP",
  source="default/nginx-77b4fdf86c-v2pfw",
  subtype="to-endpoint",
  type="Trace",
  verdict="FORWARDED"
} 5

metrics on the node-B

<none>

@matmerr
Copy link

matmerr commented Sep 7, 2023

UT's seemed to be working for egress, and tested in fresh cluster which seems to be working. Not worth investigating the first one for the time being, lgtm.

@chancez chancez added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. labels Sep 8, 2023
@michi-covalent michi-covalent added this to Needs backport from main in 1.14.3 Sep 9, 2023
@michi-covalent michi-covalent removed this from Needs backport from main in 1.14.2 Sep 9, 2023
@youngnick youngnick merged commit e733b8a into cilium:main Sep 10, 2023
102 checks passed
@gandro gandro mentioned this pull request Sep 12, 2023
15 tasks
@gandro gandro added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Sep 12, 2023
@gandro gandro added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Sep 25, 2023
@jrajahalme jrajahalme moved this from Needs backport from main to Backport done to v1.14 in 1.14.3 Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics Impacts statistics / metrics gathering, eg via Prometheus. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/hubble Impacts hubble server or relay
Projects
No open projects
1.14.3
Backport done to v1.14
Development

Successfully merging this pull request may close these issues.

Hubble pod metrics missing when using (Source|Destination)(Ingress|Egress)Context
6 participants