WIP: NETOBSERV-1637: OVS monitoring ebpf hook #286

Open
wants to merge 2 commits into
base: main

msherif1234 (Contributor) commented Mar 4, 2024

Description

OVS monitoring eBPF hook feature

eBPF agent configuration to enable OVS tracking from the operator:

      advanced:
        env:
          ENABLE_OVS_MONITORING: "true"
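
A minimal sketch of how an agent could read this flag, assuming it is exposed to the process as a plain environment variable and parsed as a boolean. The helper name `ovsMonitoringEnabled` is hypothetical and is not taken from this PR; the agent's actual config loader may work differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// ovsMonitoringEnabled reports whether ENABLE_OVS_MONITORING holds a true value.
// Hypothetical helper for illustration only. The lookup function is injected
// so the parsing can be exercised without touching the real environment.
func ovsMonitoringEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("ENABLE_OVS_MONITORING")
	if !ok {
		// Unset means the feature stays off.
		return false
	}
	enabled, err := strconv.ParseBool(raw)
	// Unparseable values are treated as "disabled" rather than as an error.
	return err == nil && enabled
}

func main() {
	fmt.Println("OVS monitoring enabled:", ovsMonitoringEnabled(os.LookupEnv))
}
```

Treating an unset or malformed value as disabled keeps the hook opt-in, which matches the quoted `"true"` string in the YAML above.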
- bpftool perf show -p
[{
        "pid": 2854,
        "fd": 8,
        "prog_id": 143,
        "fd_type": "kprobe",
        "func": "psample_sample_packet",
        "offset": 0
    }
]
- bpftool map dump id 258
"key": {
            "eth_protocol": 2048,
            "direction": 0,
            "src_mac": [2,220,231,139,148,213
            ],
            "dst_mac": [10,88,10,128,2,12
            ],
            "src_ip": [0,0,0,0,0,0,0,0,0,0,255,255,10,128,2,2
            ],
            "dst_ip": [0,0,0,0,0,0,0,0,0,0,255,255,10,128,2,12
            ],
            "src_port": 56546,
            "dst_port": 8080,
            "transport_protocol": 6,
            "icmp_type": 0,
            "icmp_code": 0,
            "if_index": 2
        },
        "values": [{
                "cpu": 0,
                "value": {
                    "packets": 1,
                    "bytes": 74,
                    "start_mono_time_ts": 6218828496667,
                    "end_mono_time_ts": 6218828496667,
                    "flags": 2,
                    "errno": 0,
                    "dscp": 0,
                    "pkt_drops": {
                        "packets": 0,
                        "bytes": 0,
                        "latest_flags": 0,
                        "latest_state": 0,
                        "latest_drop_cause": 0
                    },
                    "dns_record": {
                        "id": 0,
                        "flags": 0,
                        "latency": 0,
                        "errno": 0
                    },
                    "flow_rtt": 0,
                    "ovs_dp_keys": [[0,0,0,0,12,0,255,238
                        ],[0,0,0,0,0,0,0,0
                        ],[0,0,0,0,0,0,0,0
                        ],[0,0,0,0,0,0,0,0
                        ]
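
To make the dumped key easier to read, here is a small sketch that converts the raw byte arrays into printable addresses. It assumes the 16-byte `src_ip`/`dst_ip` fields hold IPv4-mapped IPv6 addresses (the `0,...,255,255,a,b,c,d` pattern above) and uses only Go's standard `net` package. The helper names `formatIP` and `formatMAC` are hypothetical and are not part of the agent.

```go
package main

import (
	"fmt"
	"net"
)

// formatIP renders a 16-byte address as stored in the dumped map key.
// IPv4-mapped addresses (::ffff:a.b.c.d) print in dotted-quad form.
func formatIP(raw [16]byte) string {
	return net.IP(raw[:]).String()
}

// formatMAC renders a 6-byte hardware address as colon-separated hex.
func formatMAC(raw [6]byte) string {
	return net.HardwareAddr(raw[:]).String()
}

func main() {
	// Values copied from the bpftool map dump above.
	srcIP := [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 10, 128, 2, 2}
	dstIP := [16]byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 10, 128, 2, 12}
	srcMAC := [6]byte{2, 220, 231, 139, 148, 213}
	dstMAC := [6]byte{10, 88, 10, 128, 2, 12}

	fmt.Printf("%s (%s):%d -> %s (%s):%d ethertype=0x%04x proto=%d\n",
		formatIP(srcIP), formatMAC(srcMAC), 56546,
		formatIP(dstIP), formatMAC(dstMAC), 8080,
		2048, 6)
}
```

Read this way, the entry is a single 74-byte TCP packet (`transport_protocol` 6, `eth_protocol` 0x0800 for IPv4) from 10.128.2.2:56546 to 10.128.2.12:8080.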

Dependencies

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).


openshift-ci bot commented Mar 4, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from msherif1234. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@msherif1234 msherif1234 marked this pull request as draft March 4, 2024 15:30

codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 0%, with 21 lines in your changes missing coverage. Please review.

Project coverage is 36.14%. Comparing base (1d85464) to head (02419f3).
Report is 11 commits behind head on main.

Files                        Patch %   Lines
pkg/ebpf/tracer.go           0.00%     11 Missing ⚠️
pkg/agent/agent.go           0.00%     9 Missing ⚠️
pkg/ebpf/bpf_x86_bpfel.go    0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
- Coverage   36.26%   36.14%   -0.13%     
==========================================
  Files          42       42              
  Lines        3794     3807      +13     
==========================================
  Hits         1376     1376              
- Misses       2340     2353      +13     
  Partials       78       78              
Flag Coverage Δ
unittests 36.14% <0.00%> (-0.13%) ⬇️


codecov-commenter commented May 7, 2024

Codecov Report

Attention: Patch coverage is 17.39130%, with 38 lines in your changes missing coverage. Please review.

Project coverage is 33.07%. Comparing base (0fe7a3d) to head (a9429fc).
Report is 1 commits behind head on main.

Current head a9429fc differs from pull request most recent head de1c222

Please upload reports for the commit de1c222 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
- Coverage   33.28%   33.07%   -0.21%     
==========================================
  Files          48       48              
  Lines        3491     3519      +28     
==========================================
+ Hits         1162     1164       +2     
- Misses       2232     2257      +25     
- Partials       97       98       +1     
Flag Coverage Δ
unittests 33.07% <17.39%> (-0.21%) ⬇️

Files                            Coverage Δ
pkg/agent/config.go              10.00% <ø> (ø)
pkg/exporter/grpc_proto.go       82.14% <100.00%> (+0.66%) ⬆️
pkg/exporter/kafka_proto.go      69.23% <100.00%> (ø)
pkg/flow/record.go               66.10% <100.00%> (+0.58%) ⬆️
pkg/flow/tracer_map.go           79.48% <100.00%> (ø)
pkg/ebpf/bpf_x86_bpfel.go        0.00% <0.00%> (ø)
pkg/decode/decode_protobuf.go    27.79% <0.00%> (-0.27%) ⬇️
pkg/agent/agent.go               35.16% <0.00%> (-0.82%) ⬇️
pkg/ebpf/tracer.go               0.00% <0.00%> (ø)

@msherif1234 msherif1234 force-pushed the ovs-monitoring branch 7 times, most recently from c805d52 to 490a098 Compare May 8, 2024 19:51
@msherif1234 msherif1234 changed the title WIP: initial implementation of OVS monitoring WIP: NETOBSERV-1634: initial implementation of OVS monitoring May 8, 2024
@openshift-ci-robot
Collaborator

openshift-ci-robot commented May 8, 2024

@msherif1234: This pull request references NETOBSERV-1634 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Collaborator

openshift-ci-robot commented May 8, 2024

@msherif1234: This pull request references NETOBSERV-1634 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.


@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 8, 2024

github-actions bot commented May 8, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:bf271d2

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=bf271d2 make set-agent-image

@openshift-ci-robot
Collaborator

openshift-ci-robot commented May 9, 2024

@msherif1234: This pull request references NETOBSERV-1634 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.


@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 9, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 9, 2024

github-actions bot commented May 9, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:ccad91e

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=ccad91e make set-agent-image

@msherif1234
Contributor Author

/retest

@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024

github-actions bot commented Jun 3, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:e4b105c

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=e4b105c make set-agent-image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024

github-actions bot commented Jun 3, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:76b2c13

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=76b2c13 make set-agent-image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024

github-actions bot commented Jun 3, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:bf6eedc

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=bf6eedc make set-agent-image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024

github-actions bot commented Jun 3, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:42693da

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=42693da make set-agent-image

Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024
@msherif1234
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jun 3, 2024

github-actions bot commented Jun 3, 2024

New image:
quay.io/netobserv/netobserv-ebpf-agent:fa51710

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=fa51710 make set-agent-image


openshift-ci bot commented Jun 3, 2024

@msherif1234: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name        Commit    Details   Required   Rerun command
ci/prow/images   de1c222   link      true       /test images

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Jun 3, 2024

@msherif1234: This pull request references NETOBSERV-1637 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.


@openshift-ci-robot
Collaborator

openshift-ci-robot commented Jun 4, 2024

@msherif1234: This pull request references NETOBSERV-1637 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.


Labels
do-not-merge/work-in-progress jira/valid-reference ok-to-test To set manually when a PR is safe to test. Triggers image build on PR.
5 participants