NETOBSERV-1344: include inner traffic in metrics #452

Closed

jotak
Member

@jotak jotak commented Oct 9, 2023

Previously, "inner" traffic (i.e. traffic between pods running on the same node) was not included in the metrics. This change includes it.

Dependencies

Requires netobserv/flowlogs-pipeline#478

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Oct 9, 2023

@jotak: This pull request references NETOBSERV-1344 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.15.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci

openshift-ci bot commented Oct 9, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jotak. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jotak
Member Author

jotak commented Oct 9, 2023

(Note that this change will be made obsolete by #447, but since we want the fix in 1.4.1, it must still be merged and cherry-picked to 1.4.)

@jotak
Member Author

jotak commented Oct 9, 2023

Screenshot showing byte rates before and after applying this PR (plus the flp changes):

[Screenshot from 2023-10-09 10-18-32]

After 10:15, the byte rate is higher, reaching 31 kBps, which was the expected target.

Or at the workload level:
[Screenshot from 2023-10-09 10-19-31]

We see traffic from "player-locals" to "ball" and "ui" being doubled.
This is because one of the two player-locals pods was running on the same node as ui and ball, so that traffic was previously ignored:

$ oc get pods -owide -n mesh-arena
NAME                              READY   STATUS    RESTARTS   AGE    IP            NODE                                        NOMINATED NODE   READINESS GATES
ball-base-5f58bfdd79-jgdd2        1/1     Running   0          7m9s   10.131.0.20   ip-10-0-38-56.eu-west-3.compute.internal    <none>           <none>
player-locals-6cfb9db945-6jc2t    1/1     Running   0          7m9s   10.128.2.26   ip-10-0-18-148.eu-west-3.compute.internal   <none>           <none>
player-locals-6cfb9db945-jnzmk    1/1     Running   0          7m9s   10.131.0.19   ip-10-0-38-56.eu-west-3.compute.internal    <none>           <none>
player-visitors-769bf5458-j5lvk   1/1     Running   0          7m9s   10.128.2.27   ip-10-0-18-148.eu-west-3.compute.internal   <none>           <none>
player-visitors-769bf5458-z4dts   1/1     Running   0          7m9s   10.130.2.24   ip-10-0-94-168.eu-west-3.compute.internal   <none>           <none>
stadium-base-597db8c96c-tnf4p     1/1     Running   0          7m9s   10.128.2.28   ip-10-0-18-148.eu-west-3.compute.internal   <none>           <none>
ui-base-66c48fbdb-5rnpc           1/1     Running   0          7m9s   10.131.0.21   ip-10-0-38-56.eu-west-3.compute.internal    <none>           <none>
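
To make the same-node relationship in the listing above easier to spot, here is a minimal Python sketch that groups the pods by node. It is only an illustration and not part of this PR: the helper names `pods_by_node` and `shares_node` are hypothetical, the rows are copied from the output above with the last two columns trimmed, and it assumes every column value is free of whitespace (which holds for this listing, but not for every `oc get pods -owide` output).

```python
from collections import defaultdict

# Rows copied from the `oc get pods -owide -n mesh-arena` output above.
# The NOMINATED NODE and READINESS GATES columns are trimmed for brevity.
WIDE_OUTPUT = """\
NAME                              READY   STATUS    RESTARTS   AGE    IP            NODE
ball-base-5f58bfdd79-jgdd2        1/1     Running   0          7m9s   10.131.0.20   ip-10-0-38-56.eu-west-3.compute.internal
player-locals-6cfb9db945-6jc2t    1/1     Running   0          7m9s   10.128.2.26   ip-10-0-18-148.eu-west-3.compute.internal
player-locals-6cfb9db945-jnzmk    1/1     Running   0          7m9s   10.131.0.19   ip-10-0-38-56.eu-west-3.compute.internal
player-visitors-769bf5458-j5lvk   1/1     Running   0          7m9s   10.128.2.27   ip-10-0-18-148.eu-west-3.compute.internal
player-visitors-769bf5458-z4dts   1/1     Running   0          7m9s   10.130.2.24   ip-10-0-94-168.eu-west-3.compute.internal
stadium-base-597db8c96c-tnf4p     1/1     Running   0          7m9s   10.128.2.28   ip-10-0-18-148.eu-west-3.compute.internal
ui-base-66c48fbdb-5rnpc           1/1     Running   0          7m9s   10.131.0.21   ip-10-0-38-56.eu-west-3.compute.internal
"""


def pods_by_node(wide_output):
    """Group pod names by node name (hypothetical helper, whitespace-split parsing)."""
    lines = [line for line in wide_output.splitlines() if line.strip()]
    header = lines[0].split()
    # index() returns the first match, so "NODE" resolves to the NODE column
    # even if the "NOMINATED NODE" header is present further right.
    name_col, node_col = header.index("NAME"), header.index("NODE")
    groups = defaultdict(list)
    for line in lines[1:]:
        cols = line.split()
        groups[cols[node_col]].append(cols[name_col])
    return dict(groups)


def shares_node(groups, prefix_a, prefix_b):
    """True if a pod starting with prefix_a runs on the same node as one starting with prefix_b."""
    return any(
        any(p.startswith(prefix_a) for p in pods)
        and any(p.startswith(prefix_b) for p in pods)
        for pods in groups.values()
    )


groups = pods_by_node(WIDE_OUTPUT)
for node, pods in groups.items():
    print(node, pods)
```

With this listing, `shares_node(groups, "player-locals", "ui")` and `shares_node(groups, "player-locals", "ball")` are both true, which matches the workloads whose traffic was previously missing from the metrics.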

Contributor

@OlivierCazade OlivierCazade left a comment

LGTM, but this is also fixed by the allowList PR, right?

Maybe you want to merge this one first, so that we only need to backport this one if we release a 1.4.1?

@OlivierCazade
Contributor

OK, I should also have read the comment below...

@jotak
Member Author

jotak commented Oct 11, 2023

I'll close this one since #447 is already ready for merging.
I opened a backport PR for 1.4: #456

@jotak jotak closed this Oct 11, 2023
@jotak jotak deleted the include-inner branch November 1, 2023 08:36