NETOBSERV-1426: detect external workloads / openshift subnets #559

Merged (1 commit) on Apr 3, 2024

jotak (Member) commented Feb 5, 2024

Description

New fields in the FlowCollector API to enable detection of OpenShift subnets (and hence to detect cluster-external IPs), and to allow custom labelling based on IPs / subnets.

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

openshift-ci-robot (Collaborator) commented Feb 5, 2024

@jotak: This pull request references NETOBSERV-1426 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.


openshift-ci bot commented Feb 5, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


codecov bot commented Feb 5, 2024

Codecov Report

Attention: Patch coverage is 30.88235%, with 188 lines in your changes missing coverage. Please review.

Project coverage is 65.68%. Comparing base (b61b9d5) to head (74783d0).
Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| controllers/flp/flp_controller.go | 32.53% | 53 Missing and 3 partials ⚠️ |
| ...s/flowcollector/v1beta1/zz_generated.conversion.go | 24.00% | 36 Missing and 2 partials ⚠️ |
| ...pis/flowcollector/v1beta1/zz_generated.deepcopy.go | 0.00% | 35 Missing ⚠️ |
| ...pis/flowcollector/v1beta2/zz_generated.deepcopy.go | 8.57% | 30 Missing and 2 partials ⚠️ |
| controllers/flp/flp_pipeline_builder.go | 47.82% | 22 Missing and 2 partials ⚠️ |
| controllers/consoleplugin/consoleplugin_objects.go | 0.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #559      +/-   ##
==========================================
- Coverage   66.95%   65.68%   -1.28%     
==========================================
  Files          65       65              
  Lines        8116     8363     +247     
==========================================
+ Hits         5434     5493      +59     
- Misses       2334     2512     +178     
- Partials      348      358      +10     
Flag Coverage Δ
unittests 65.68% <30.88%> (-1.28%) ⬇️


jotak (Member, Author) commented Feb 20, 2024

This is ready for review. Here's an example:

[Screenshot from 2024-02-20 12-23-49]

using this config:

# ...
  processor:
    subnetLabels:
      openShiftAutoDetect: true
      customLabels:
      - cidrs: ["172.217.20.164/32"]
        name: "Google"

... and then running curl from within a pod, to see "Google" appear as the subnet label.

To detect external traffic, we can now query with a filter on an empty label: Src/DstSubnetLabel=""

CIDRs: podCIDRs,
})
}
svcCIDRs := network.Spec.ServiceNetwork
Contributor commented:

You need to loop over the serviceNetwork CIDRs, similar to what you did for clusterNetwork.
Adding the CNO YAML for reference:

oc get networks.config.openshift.io cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  creationTimestamp: "2024-02-20T13:08:01Z"
  generation: 2
  name: cluster
  resourceVersion: "5182"
  uid: 336cbbd1-dfb4-4c0a-8bee-017777a48093
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  externalIP:
    policy: {}
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
status:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  clusterNetworkMTU: 1360
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16

jotak (Member, Author) commented:

Unlike clusterNetwork, serviceNetwork is already a []string, so I take it as is.
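
The difference discussed here can be sketched as follows. The types below are simplified, hypothetical stand-ins that mirror the YAML above; they are not the real `config.openshift.io/v1` Go types, and the helper names are made up for this example.

```go
package main

import "fmt"

// clusterNetworkEntry and networkSpec are simplified stand-ins for the
// Network resource shown in the YAML above (assumption: the real API types
// carry more fields and different Go names).
type clusterNetworkEntry struct {
	CIDR       string
	HostPrefix uint32
}

type networkSpec struct {
	ClusterNetwork []clusterNetworkEntry
	ServiceNetwork []string
}

// podCIDRs has to loop, because each clusterNetwork entry is a struct
// from which the CIDR string must be extracted.
func podCIDRs(spec networkSpec) []string {
	cidrs := make([]string, 0, len(spec.ClusterNetwork))
	for _, cn := range spec.ClusterNetwork {
		cidrs = append(cidrs, cn.CIDR)
	}
	return cidrs
}

// serviceCIDRs needs no loop: serviceNetwork is already a list of strings.
func serviceCIDRs(spec networkSpec) []string {
	return spec.ServiceNetwork
}

func main() {
	spec := networkSpec{
		ClusterNetwork: []clusterNetworkEntry{{CIDR: "10.128.0.0/14", HostPrefix: 23}},
		ServiceNetwork: []string{"172.30.0.0/16"},
	}
	fmt.Println("pods:", podCIDRs(spec))
	fmt.Println("services:", serviceCIDRs(spec))
}
```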

jotak (Member, Author) commented:

I haven't implemented it yet; I'm asking the SDN team for some details, because it's in a different API object and I'm not sure which one can be considered the source of truth.

// SubnetLabel allows to label subnets and IPs, such as to identify cluster-external workloads or web services.
type SubnetLabel struct {
// List of CIDRs, such as `["1.2.3.4/32"]`.
//+required
Contributor commented:

Do you need CIDR API validation here? This is an example:
https://github.com/openshift/api/blob/master/network/v1/types.go#L34

jotak (Member, Author) commented:

Thanks, yeah, I'll look into it.

jotak (Member, Author) commented:

Actually, this pattern only matches IPv4, and we allow IPv6 here. Adding a regex for IPv6 is considerably more complicated; I'd rather go with a validation webhook (there is a future task to implement that).
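
As a rough idea of what code-based validation could look like (for example inside a webhook), the sketch below checks CIDRs with Go's standard `net/netip` package, which parses both IPv4 and IPv6 prefixes. The `validateCIDRs` helper is hypothetical and is not taken from this PR.

```go
package main

import (
	"fmt"
	"net/netip"
)

// validateCIDRs returns an error for the first entry that is not a valid
// IPv4 or IPv6 CIDR. Hypothetical helper, shown only to illustrate that a
// parser handles both address families without a regex.
func validateCIDRs(cidrs []string) error {
	for _, c := range cidrs {
		if _, err := netip.ParsePrefix(c); err != nil {
			return fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateCIDRs([]string{"1.2.3.4/32", "2001:db8::/32"}))
	fmt.Println(validateCIDRs([]string{"1.2.3.4"}))
}
```

Note that `netip.ParsePrefix` accepts prefixes whose host bits are set (such as `10.0.0.1/8`); a stricter check would have to be added separately if that matters.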

Contributor commented:

This is the check for the resource you read to pull the CIDRs from in CNO. I remember there was an extension to verify IP CIDRs under development; I will see if it is already available, in which case it would be very lightweight compared to a validation webhook.

Contributor commented:

There is a new CEL check for CIDRs, please check this Slack thread:
https://redhat-internal.slack.com/archives/C3VS0LV41/p1708517668688849

jotak (Member, Author) commented:

As discussed on Slack, using the CIDR check would break compatibility with older k8s/OCP versions. We need to wait until our oldest supported version has it.

// `openShiftAutoDetect` allows, when set to `true`, to detect automatically the machines, pods and services subnets based on the
// OpenShift install configuration and the Cluster Network Operator configuration.
//+optional
OpenShiftAutoDetect *bool `json:"openShiftAutoDetect,omitempty"`
Contributor commented:

Is there a default? false?

jotak (Member, Author) commented:

Keeping nil as the default makes it more flexible. Imagine that in the future we want to enable it by default: we'll then be able to say "if it's nil => enabled", which will also work for folks upgrading from a previous version.
If we set a default of "false" and later change the default, people upgrading will still have their old default.

jpinsonneau (Contributor) left a comment

Should we add SrcSubnetLabel / DstSubnetLabel as Loki labels? 😸

jotak (Member, Author) commented Mar 25, 2024

@jpinsonneau I added the externalIP one like you suggested. According to the docs, it's generally only used in bare-metal clusters, but it still doesn't hurt to add.

jpinsonneau (Contributor) left a comment

LGTM, thanks @jotak

Do you think that would be useful in the CLI? It could be part of the scripts that configure FLP properly.

jotak (Member, Author) commented Mar 25, 2024

> Do you think that would be useful in the CLI? It could be part of the scripts that configure FLP properly.

I guess yes. For instance, an interesting use case for the CLI with this feature could be "show me all the external traffic", i.e. traffic with an empty Src/DstSubnetLabel. That implies also extracting the OCP Network CIDRs in the CLI and reinjecting them into the FLP config.

jpinsonneau (Contributor) commented

Created https://issues.redhat.com/browse/NETOBSERV-1578 for CLI

nathan-weinberg (Contributor) commented

/ok-to-test

openshift-ci bot added the ok-to-test label (To set manually when a PR is safe to test. Triggers image build on PR.) Mar 25, 2024

New images:

  • quay.io/netobserv/network-observability-operator:2f1a646
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-2f1a646
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-2f1a646

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:2f1a646 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-2f1a646

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-2f1a646
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

openshift-ci bot removed the lgtm label Mar 27, 2024

openshift-ci bot commented Mar 27, 2024

New changes are detected. LGTM label has been removed.

github-actions bot removed the ok-to-test label (To set manually when a PR is safe to test. Triggers image build on PR.) Mar 27, 2024

jotak (Member, Author) commented Mar 27, 2024

(rebased without change)

codecov-commenter commented Mar 27, 2024

Codecov Report

Attention: Patch coverage is 30.88235%, with 188 lines in your changes missing coverage. Please review.

Project coverage is 65.68%. Comparing base (1888d8a) to head (5e09506).

❗ Current head 5e09506 differs from pull request most recent head ebc74bc. Consider uploading reports for the commit ebc74bc to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| controllers/flp/flp_controller.go | 32.53% | 53 Missing and 3 partials ⚠️ |
| ...s/flowcollector/v1beta1/zz_generated.conversion.go | 24.00% | 36 Missing and 2 partials ⚠️ |
| ...pis/flowcollector/v1beta1/zz_generated.deepcopy.go | 0.00% | 35 Missing ⚠️ |
| ...pis/flowcollector/v1beta2/zz_generated.deepcopy.go | 8.57% | 30 Missing and 2 partials ⚠️ |
| controllers/flp/flp_pipeline_builder.go | 47.82% | 22 Missing and 2 partials ⚠️ |
| controllers/consoleplugin/consoleplugin_objects.go | 0.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #559      +/-   ##
==========================================
- Coverage   66.95%   65.68%   -1.28%     
==========================================
  Files          65       65              
  Lines        8262     8363     +101     
==========================================
- Hits         5532     5493      -39     
- Misses       2381     2512     +131     
- Partials      349      358       +9     
Flag Coverage Δ
unittests 65.68% <30.88%> (-1.28%) ⬇️


nathan-weinberg (Contributor) commented

/label qe-approved

openshift-ci bot added the qe-approved label (QE has approved this pull request) Mar 27, 2024

Configure columns & filters for subnet labels

Fix reading machine network

Document overlaps between customLabels and autoDetect

Rebased & address feedback

- rebased / bump FLP
- read external ips config
- read from config.Network rather than operator.Network, as it's
  considered the best source of truth

jotak (Member, Author) commented Apr 3, 2024

(rebased without change)

jotak (Member, Author) commented Apr 3, 2024

/approve


openshift-ci bot commented Apr 3, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

openshift-ci bot added the approved label Apr 3, 2024
jotak merged commit a2fe535 into netobserv:main Apr 3, 2024
7 of 9 checks passed
Labels: approved, jira/valid-reference, qe-approved (QE has approved this pull request)
7 participants