NETOBSERV-1597: skip RecordKeyMissing error #660

Merged (2 commits) on May 21, 2024

Conversation

jpinsonneau
Collaborator

Description

Skip all RecordKeyMissing errors and treat missing keys as 0, since we removed empty fields from the eBPF agent.

https://github.com/netobserv/netobserv-ebpf-agent/blob/release-1.5/pkg/decode/decode_protobuf.go#L61-L67

If this needs to be configurable for specific cases, please let me know.

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@jotak changed the title from "NETOBSERV-1597 skip RecordKeyMissing error" to "NETOBSERV-1597: skip RecordKeyMissing error" on May 16, 2024
@openshift-ci-robot
Collaborator

openshift-ci-robot commented May 16, 2024

@jpinsonneau: This pull request references NETOBSERV-1597 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.

- m.errorsCounter.WithLabelValues("RecordKeyMissing", info.Name, info.ValueKey).Inc()
- return nil
+ // No value means 0 to keep storage lightweight
+ return 0
Member

We should still return nil, I think: for counters, returning 0 or nil will have the same effect (i.e. adding nothing to the counter), but for histograms, returning 0 would actually corrupt the stats computation.

Collaborator Author

It depends on the histograms, I guess... Do you have some concrete examples in mind?

For data consistency, we should probably add filters before making such assumptions. For example, DNS metrics must have the DNSId and DNSLatencyMs fields, etc. That's what we do on the plugin query side.

Member

All the histograms that we use are for latency, so I don't see a case where we would want to say: no latency == 0ms. If a flow is missing RTT or DNS latency, then we should just ignore it for histogram stats. Expecting an explicit filter would add extra processing and could easily be error-prone, IMO. I'd turn your question around: do you have an example in mind where we need to say nil == 0 for a histogram?

Collaborator Author

Bytes, packets, and their respective drop fields must be treated as 0 when the keys are not present.

That can happen when a flow is fully sent or fully dropped.

Member

Yes, but that's fine: these are counters, not histograms, so adding 0 to the counter is equivalent to ignoring it.
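
To make the counter vs. histogram point in this thread concrete, here is a minimal, self-contained Go sketch. It is not the flowlogs-pipeline code: the `flow`, `extract`, `counter`, and `histogram` names are hypothetical, the histogram is reduced to a sum and a count, and the `Bytes` / `DNSLatencyMs` keys are only borrowed from the discussion above for illustration.

```go
package main

import "fmt"

// flow is a hypothetical stand-in for a decoded flow record.
type flow map[string]interface{}

// extract returns the numeric value stored under key, or nil when the key
// is absent (for example because the agent omitted an empty field).
func extract(f flow, key string) *float64 {
	raw, ok := f[key]
	if !ok {
		return nil
	}
	switch v := raw.(type) {
	case float64:
		return &v
	case int:
		x := float64(v)
		return &x
	}
	return nil
}

// counter only accumulates a sum, so adding 0 and skipping are equivalent.
type counter struct{ total float64 }

func (c *counter) add(v *float64) {
	if v != nil {
		c.total += *v
	}
}

// histogram is simplified here to a sum and an observation count.
type histogram struct {
	sum   float64
	count int
}

func (h *histogram) observe(v *float64) {
	if v == nil {
		return // missing latency: ignore the flow for histogram stats
	}
	h.sum += *v
	h.count++
}

func (h *histogram) mean() float64 {
	if h.count == 0 {
		return 0
	}
	return h.sum / float64(h.count)
}

func main() {
	flows := []flow{
		{"Bytes": 100, "DNSLatencyMs": 20.0},
		{"Bytes": 50}, // no DNS latency recorded for this flow
	}
	var byteCount counter
	var skipNil, zeroFill histogram
	zero := 0.0
	for _, f := range flows {
		byteCount.add(extract(f, "Bytes"))
		lat := extract(f, "DNSLatencyMs")
		skipNil.observe(lat)
		if lat == nil {
			lat = &zero // what returning 0 instead of nil would amount to
		}
		zeroFill.observe(lat)
	}
	// prints: 150 20 10
	fmt.Println(byteCount.total, skipNil.mean(), zeroFill.mean())
}
```

In this sketch the byte counter ends at 150 either way, while the mean latency drops from 20 to 10 once the missing value is recorded as 0, which is the distortion the review comment is concerned about.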

@openshift-ci openshift-ci bot added the lgtm label May 16, 2024
@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 16, 2024

New image:
quay.io/netobserv/flowlogs-pipeline:db58132

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=db58132 make set-flp-image

@codecov-commenter

codecov-commenter commented May 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.49%. Comparing base (44438e7) to head (e806059).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #660      +/-   ##
==========================================
- Coverage   66.52%   66.49%   -0.04%     
==========================================
  Files         104      104              
  Lines        6659     6658       -1     
==========================================
- Hits         4430     4427       -3     
- Misses       1915     1916       +1     
- Partials      314      315       +1     
Flag Coverage Δ
unittests 66.49% <ø> (-0.04%) ⬇️

@memodi

memodi commented May 20, 2024

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label May 20, 2024
@openshift-ci-robot
Collaborator

openshift-ci-robot commented May 20, 2024

@jpinsonneau: This pull request references NETOBSERV-1597 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set.

@jotak
Member

jotak commented May 21, 2024

thanks @memodi
/approve


openshift-ci bot commented May 21, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

@openshift-merge-bot openshift-merge-bot bot merged commit fdb4eb1 into netobserv:main May 21, 2024
11 checks passed
Labels
approved jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. qe-approved QE has approved this pull request