Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-1109 remove reinterpret direction #402

Closed
wants to merge 1 commit into from

Conversation

jpinsonneau
Copy link
Contributor

@jpinsonneau jpinsonneau commented Aug 9, 2023

Following NETOBSERV-1099 EGRESS / INGRESS metrics are not consistent using reinterpretDirection function.

I suggest to simply remove it to stick with IPFIX definition. The metrics are consistent after that (see netobserv/network-observability-console-plugin#366 (comment)).
Keeping it double the numbers in the Overview section graphs

@openshift-ci
Copy link

openshift-ci bot commented Aug 9, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jpinsonneau. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jpinsonneau jpinsonneau marked this pull request as ready for review August 9, 2023 10:44
@jpinsonneau jpinsonneau changed the title WIP NETOBSERV-1109 remove reinterpret direction NETOBSERV-1109 remove reinterpret direction Aug 9, 2023
@jpinsonneau jpinsonneau added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Aug 9, 2023
@github-actions
Copy link

github-actions bot commented Aug 9, 2023

New images:

  • quay.io/netobserv/network-observability-operator:22cd8f4
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-22cd8f4
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-22cd8f4

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:22cd8f4 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-22cd8f4

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-22cd8f4
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@jpinsonneau jpinsonneau changed the title NETOBSERV-1109 remove reinterpret direction NETOBSERV-1109 & NETOBSERV-1235 remove reinterpret direction Aug 9, 2023
@msherif1234
Copy link
Contributor

I am fine removing reinterpret logic given the issues seen
/LGTM

@jpinsonneau
Copy link
Contributor Author

jpinsonneau commented Aug 10, 2023

I am fine removing reinterpret logic given the issues seen /LGTM

I think I will wait for @jotak anyways before merging this.
I'm still not convinced by the results I get using concrete download / upload testing (see netobserv/network-observability-console-plugin#366 (comment))

@msherif1234
Copy link
Contributor

I am fine removing reinterpret logic given the issues seen /LGTM

I think I will wait for @jotak anyways before merging this. I'm still not convinced by the results I get using concrete download / upload testing (see netobserv/network-observability-console-plugin#366 (comment))
as long as QE not blocked then that can wait

@jpinsonneau jpinsonneau requested a review from jotak August 21, 2023 11:27
@jotak
Copy link
Member

jotak commented Aug 22, 2023

Did you double-check that the original issue https://issues.redhat.com/browse/NETOBSERV-696 is not being reintroduced with this?
I am not sure why fixing the FlowDirection (ie use reinterpret-direction) would cause troubles, but, that said, I guess I get why it would work now: since reporters are now merged in queries, we're less affected than before by the "unsoundness" of FlowDirection as provided by the agent (ie. direction relative to the interfaces rather than the host).

@jpinsonneau
Copy link
Contributor Author

Did you double-check that the original issue https://issues.redhat.com/browse/NETOBSERV-696 is not being reintroduced with this?

That brings back the initial issue indeed:

  • Without reinterpret:
    image
    image

  • With reinterpret:
    image
    image

I am not sure why fixing the FlowDirection (ie use reinterpret-direction) would cause troubles

As written in the description, it messes up the upload / download scenarios. Amonts are incorrect. Here is the example for download:

  • Without reinterpret (correct numbers):
    image
    image

  • With reinterpret (wrong numbers):
    image
    image

@jotak
Copy link
Member

jotak commented Aug 25, 2023

Trying to reproduce, I do indeed see an issue with the bytes volume reported but I get the same issue with or without this PR, in other words, apparently not related to reinterpret-direction

To try locate the bug, I'm looking at the raw flows stored in loki rather than the computed metrics: the raw flows themselves are wrong, ie. they always show a value smaller to what I'd expect after downloading large volumes.

Example: capture in Loki after downloading a 500MB image (sampling is 1) :
Capture d’écran du 2023-08-25 15-17-51

It shows only ~100MB while I would expect 500MB. I see the same issue keeping or removing reinterpret-direction.
My gut feeling would be something wrong in the ebpf agent aggregation logic

@jotak jotak added ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. and removed ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. labels Aug 25, 2023
@github-actions
Copy link

New images:

  • quay.io/netobserv/network-observability-operator:22cd8f4
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-22cd8f4
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-22cd8f4

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:22cd8f4 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-22cd8f4

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-22cd8f4
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@jpinsonneau jpinsonneau changed the title NETOBSERV-1109 & NETOBSERV-1235 remove reinterpret direction NETOBSERV-1109 remove reinterpret direction Aug 30, 2023
@jpinsonneau
Copy link
Contributor Author

Closing this in favor of "INNER" alternative

@jpinsonneau jpinsonneau closed this Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants