Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-1390: Flow RTT values are n/a #245

Merged
merged 3 commits into from Jan 15, 2024

Conversation

msherif1234
Copy link
Contributor

@msherif1234 msherif1234 commented Jan 3, 2024

Description

using tcp_rcv fentry hook to reuse kernel calculated rtt directly from TCP socket instead of trying to calculate it

Testing

  • verified changes with 4.16 image (RHEL9)
  • verified changes with 4.12 image (RHEL8)
  • run on different arch than amd64
    using cluster-bot launch 4.16 aws,arm64
  • performance and scale run

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Jan 3, 2024

@msherif1234: This pull request references NETOBSERV-1390 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.

In response to this:

Description

using tcp_rcv kprobe to reuse kernel calculated rtt directly from TCP socket instead of trying to calculate it

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

codecov bot commented Jan 3, 2024

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (cd878fd) 33.62% compared to head (0095be4) 33.86%.

Files Patch % Lines
pkg/ebpf/tracer.go 0.00% 16 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #245      +/-   ##
==========================================
+ Coverage   33.62%   33.86%   +0.23%     
==========================================
  Files          39       39              
  Lines        3494     3479      -15     
==========================================
+ Hits         1175     1178       +3     
+ Misses       2251     2233      -18     
  Partials       68       68              
Flag Coverage Δ
unittests 33.86% <20.00%> (+0.23%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 3, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 3, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 3, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 3, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 3, 2024
@msherif1234
Copy link
Contributor Author

msherif1234 commented Jan 3, 2024

main vs this PR both with RTT enabled, there is cpu increase because of kprobe processing

image

but its looking much better from netobserv UI as long as we filter on TCP and Ingress direction because this Ingress only feature, also w/o the need to set sampling to 1
in fact with high sampling we might not give chance for the new hook to run as often and lose rtt updates

image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@msherif1234 msherif1234 changed the title WIP: NETOBSERV-1390: Flow RTT values are n/a NETOBSERV-1390: Flow RTT values are n/a Jan 4, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 4, 2024
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Jan 4, 2024

@msherif1234: This pull request references NETOBSERV-1390 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.

In response to this:

Description

using tcp_rcv kprobe and fentry hooks to reuse kernel calculated rtt directly from TCP socket instead of trying to calculate it

Testing

  • verified changes with 4.16 image (RHEL9)
  • verified changes with 4.12 image (RHEL8)
  • run on different arch than amd64
    using cluster-bot launch 4.16 aws,arm64
  • performance and scale run

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@netobserv netobserv deleted a comment from openshift-ci-robot Jan 4, 2024
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 12, 2024
Copy link

New image:
quay.io/netobserv/netobserv-ebpf-agent:fe0d62f

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=fe0d62f make set-agent-image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 12, 2024
@msherif1234
Copy link
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 12, 2024
Copy link

New image:
quay.io/netobserv/netobserv-ebpf-agent:bc220e5

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=bc220e5 make set-agent-image

@jotak
Copy link
Member

jotak commented Jan 12, 2024

/lgtm

@Amoghrd
Copy link

Amoghrd commented Jan 12, 2024

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label Jan 12, 2024
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Jan 12, 2024

@msherif1234: This pull request references NETOBSERV-1390 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target either version "4.16." or "openshift-4.16.", but it targets "netobserv-1.5" instead.

In response to this:

Description

using tcp_rcv kprobe and fentry hooks to reuse kernel calculated rtt directly from TCP socket instead of trying to calculate it

Testing

  • verified changes with 4.16 image (RHEL9)
  • verified changes with 4.12 image (RHEL8)
  • run on different arch than amd64
    using cluster-bot launch 4.16 aws,arm64
  • performance and scale run

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>
Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm label Jan 12, 2024
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 12, 2024
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Jan 12, 2024

@msherif1234: This pull request references NETOBSERV-1390 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target either version "4.16." or "openshift-4.16.", but it targets "netobserv-1.5" instead.

In response to this:

Description

using tcp_rcv fentry hook to reuse kernel calculated rtt directly from TCP socket instead of trying to calculate it

Testing

  • verified changes with 4.16 image (RHEL9)
  • verified changes with 4.12 image (RHEL8)
  • run on different arch than amd64
    using cluster-bot launch 4.16 aws,arm64
  • performance and scale run

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@msherif1234
Copy link
Contributor Author

msherif1234 commented Jan 12, 2024

removed kprobe and just stayed with fentry only which more advanced kprobe and light weight also no need for additional complexity for no real reason

@msherif1234
Copy link
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 12, 2024
Copy link

New image:
quay.io/netobserv/netobserv-ebpf-agent:2639b26

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=2639b26 make set-agent-image

@Amoghrd
Copy link

Amoghrd commented Jan 15, 2024

Works as expected with latest commits.

@jotak
Copy link
Member

jotak commented Jan 15, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jan 15, 2024
@msherif1234
Copy link
Contributor Author

/approve

Copy link

openshift-ci bot commented Jan 15, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 2864713 into netobserv:main Jan 15, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. qe-approved QE has approved this pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants