NETOBSERV-1508 add selector, affinity and priority to components #569
Conversation
Makefile
Outdated
@@ -361,7 +361,7 @@ deploy: kustomize ## Deploy controller to the K8s cluster specified in ~/.kube/c
 	$(SED) -i -r 's~ebpf-agent:.+~ebpf-agent:main~' ./config/manager/manager.yaml
 	$(SED) -i -r 's~flowlogs-pipeline:.+~flowlogs-pipeline:main~' ./config/manager/manager.yaml
 	$(SED) -i -r 's~console-plugin:.+~console-plugin:main~' ./config/manager/manager.yaml
-	$(KUSTOMIZE) build config/openshift | sed -r "s/openshift-netobserv-operator\.svc/${NAMESPACE}.svc/" | kubectl apply -f -
+	$(KUSTOMIZE) build config/openshift | sed -r "s/openshift-netobserv-operator\.svc/${NAMESPACE}.svc/" | kubectl create -f -
The kubectl folks recommend using apply over create; IIRC they want to find ways to phase out create.
We run into the following issue when keeping apply here:
metadata.annotations: Too long: must have at most 262144 bytes
This is weird. Which metadata is it referring to here? And which version of kubectl do you have?
I was on an old version. Just updated to latest and still have the same issue:
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.9+e36e183
USER=jpinsonn VERSION=1508 make deploy
cd config/manager && /home/julien/dev/me/network-observability-operator/bin/kustomize edit set image controller=quay.io/jpinsonn/network-observability-operator:1508-3
sed -i -r 's~ebpf-agent:.+~ebpf-agent:main~' ./config/manager/manager.yaml
sed -i -r 's~flowlogs-pipeline:.+~flowlogs-pipeline:main~' ./config/manager/manager.yaml
sed -i -r 's~console-plugin:.+~console-plugin:main~' ./config/manager/manager.yaml
/home/julien/dev/me/network-observability-operator/bin/kustomize build config/openshift | sed -r "s/openshift-netobserv-operator\.svc/netobserv.svc/" | kubectl apply -f -
namespace/netobserv created
customresourcedefinition.apiextensions.k8s.io/flowmetrics.flows.netobserv.io created
serviceaccount/netobserv-controller-manager created
role.rbac.authorization.k8s.io/netobserv-leader-election-role created
role.rbac.authorization.k8s.io/netobserv-openshift-netobserv-operator-prometheus created
clusterrole.rbac.authorization.k8s.io/netobserv-manager-role created
clusterrole.rbac.authorization.k8s.io/netobserv-proxy-role created
rolebinding.rbac.authorization.k8s.io/netobserv-leader-election-rolebinding created
rolebinding.rbac.authorization.k8s.io/netobserv-openshift-netobserv-operator-prometheus created
clusterrolebinding.rbac.authorization.k8s.io/netobserv-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/netobserv-proxy-rolebinding created
configmap/netobserv-manager-config created
service/netobserv-metrics-service created
service/netobserv-webhook-service created
deployment.apps/netobserv-controller-manager created
servicemonitor.monitoring.coreos.com/netobserv-metrics-monitor created
validatingwebhookconfiguration.admissionregistration.k8s.io/netobserv-validating-webhook-configuration created
The CustomResourceDefinition "flowcollectors.flows.netobserv.io" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
That must be the alm-examples annotation, I guess? Perhaps we should try to shrink that a bit.
The corev1.Affinity object is actually quite complex and adds 1071 rows to the manifest for each component.
We can remove this object to solve the issue here, but keep in mind that this may force customers to set custom labels on their nodes just for us.
Also, Pod Affinity / Anti-Affinity helps to place, or avoid placing, two pods on the same node; this is complementary to NodeSelector.
I feel we will fall into that situation anyway in the future. WDYT?
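As a rough illustration of how the two mechanisms complement each other: nodeSelector pins pods to labelled nodes, while pod anti-affinity spreads pods across nodes. The field names below are the standard corev1 PodSpec ones; the label keys and values are made up:

```yaml
# Illustrative corev1 PodSpec scheduling fragment (label names are hypothetical).
nodeSelector:
  netobserv: "true"                 # only schedule on nodes labelled netobserv=true
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-component       # hypothetical pod label
        topologyKey: kubernetes.io/hostname   # at most one such pod per node
```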
What if you rebase on #577?
Just tested it and got the same 😞
FWIW I did some searching around this problem, and it's something other people have brought up; apparently a common approach to mitigate it is to remove the description fields on embedded APIs (like Affinity or Autoscaler). There's a ticket for allowing this via kubebuilder, but it's still open: kubernetes-sigs/controller-tools#441 (since 2020).
I tried this simple workaround, which seems to work well (in our Makefile, under the manifests target):
##@ Code / files generation
manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
$(CONTROLLER_GEN) $(CRD_OPTIONS) rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
$(YQ) -i 'del(.spec.versions[].schema.openAPIV3Schema.properties.spec.properties.processor.properties.kafkaConsumerAutoscaler.properties.metrics.items | .. | select(has("description")) | .description)' config/crd/bases/flows.netobserv.io_flowcollectors.yaml
$(YQ) -i 'del(.spec.versions[].schema.openAPIV3Schema.properties.spec.properties.consolePlugin.properties.autoscaler.properties.metrics.items | .. | select(has("description")) | .description)' config/crd/bases/flows.netobserv.io_flowcollectors.yaml
$(YQ) -i 'del(.spec.versions[].schema.openAPIV3Schema.properties.spec.properties.agent.properties.ebpf.properties.advanced.properties.affinity.properties | .. | select(has("description")) | .description)' config/crd/bases/flows.netobserv.io_flowcollectors.yaml
$(YQ) -i 'del(.spec.versions[].schema.openAPIV3Schema.properties.spec.properties.processor.properties.advanced.properties.affinity.properties | .. | select(has("description")) | .description)' config/crd/bases/flows.netobserv.io_flowcollectors.yaml
$(YQ) -i 'del(.spec.versions[].schema.openAPIV3Schema.properties.spec.properties.consolePlugin.properties.advanced.properties.affinity.properties | .. | select(has("description")) | .description)' config/crd/bases/flows.netobserv.io_flowcollectors.yaml
It makes the CRD shrink from ~592kB to ~321kB, so not too bad... although this isn't a definitive solution.
(And for the record: the issue comes from kubectl apply injecting the last-applied-configuration annotation, which kubectl create does not. It's this annotation that is too big, since it contains the whole resource.)
New images:
They will expire after two weeks. To deploy this build:
# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:819d919 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-819d919
Or as a Catalog Source:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-819d919
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #569 +/- ##
==========================================
- Coverage 67.38% 66.98% -0.41%
==========================================
Files 65 65
Lines 7987 8108 +121
==========================================
+ Hits 5382 5431 +49
- Misses 2276 2331 +55
- Partials 329 346 +17
Flags with carried forward coverage won't be shown.
@@ -52,6 +52,32 @@ func (r *FlowCollector) ConvertTo(dstRaw conversion.Hub) error {
 	dst.Spec.Loki.Microservices = restored.Spec.Loki.Microservices
 	dst.Spec.Loki.Manual = restored.Spec.Loki.Manual

+	if restored.Spec.Agent.EBPF.Advanced != nil {
Rather than adding more complex conversion logic (sometimes error-prone), I find it generally easier to just backport a feature to previous versions. As long as we're not modifying or deleting existing parts of the API, I think it's just fine to backport it to v1beta1?
Are you suggesting to add the whole advanced sections to every component here?
Currently in v1beta1, only the eBPF and FLP components have debug sections containing a single Env map.
No, that's not what I was suggesting... IIRC advanced brought some breaking changes, so it had to go via a new API and could not be backported. I was more thinking about adding these fields at the component root, like we used to do in v1beta1; but if you think it won't simplify things, feel free to keep it as you did.
pkg/helper/comparators.go
Outdated
@@ -57,6 +92,28 @@ func annotationsChanged(old, new *corev1.PodTemplateSpec, report *ChangeReport)
 	return false
 }

+func assignationChanged(old, new *corev1.PodTemplateSpec, report *ChangeReport) bool {
+	if !equality.Semantic.DeepDerivative(old.Spec.NodeSelector, new.Spec.NodeSelector) {
DeepDerivative normally takes the "new" param as the first argument and the "old" as the second; it checks for stuff in "new" that wasn't in "old", but not the other way around. That's typically useful for annotations/labels, where other parts of the kube infra might add their own labels/annotations on pods and we don't want to interfere with that.
So here, I believe either we want to use DeepDerivative but change the argument order, or perhaps we actually want DeepEqual?
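To make the asymmetry concrete, here is a minimal stdlib-only sketch. derivativeEqual is a hypothetical helper that mimics the one-directional behaviour of equality.Semantic.DeepDerivative on string maps; it is not the real apimachinery implementation:

```go
package main

import "fmt"

// derivativeEqual is a simplified, hypothetical stand-in for
// equality.Semantic.DeepDerivative restricted to string maps: entries absent
// from the first argument are ignored, so the comparison only flags values
// that the first map sets and the second map doesn't match.
func derivativeEqual(desired, actual map[string]string) bool {
	for k, v := range desired {
		if actual[k] != v {
			return false
		}
	}
	return true
}

func main() {
	oldSel := map[string]string{"kubernetes.io/hostname": "node-a"}
	// Kube infra injected an extra entry behind our back:
	newSel := map[string]string{"kubernetes.io/hostname": "node-a", "injected": "true"}

	// old-as-derivative-of-new: the injected entry is ignored, no diff detected.
	fmt.Println(derivativeEqual(oldSel, newSel)) // true
	// new-as-derivative-of-old: the injected entry counts as a difference.
	fmt.Println(derivativeEqual(newSel, oldSel)) // false
}
```

This is why the argument order matters: with the arguments inverted, labels added by other controllers would be reported as changes and could trigger a reconcile loop.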
I gave it a try with DeepEqual here and got an infinite reconcile loop. It seemed to work fine with DeepDerivative, but I'll give another try inverting the two args to see if I get differences.
I have addressed your feedback and simplified the code in this commit: 4178913
Thanks!
Rebased without changes
@jpinsonneau @msherif1234 : on the
I guess we lose autocompletion in the UI with this approach, right?
Yes... or perhaps what we could do, instead of removing the descriptions, is to replace them with a link to these objects' doc, such as https://docs.openshift.com/container-platform/4.15/rest_api/autoscale_apis/horizontalpodautoscaler-autoscaling-v2.html
/lgtm
/ok-to-test
New images:
They will expire after two weeks. To deploy this build:
# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:5550579 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-5550579
Or as a Catalog Source:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-5550579
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m
Rebased without changes
New images:
They will expire after two weeks. To deploy this build:
# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:ca8234e make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-ca8234e
Or as a Catalog Source:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-ca8234e
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m
/label qe-approved
[APPROVALNOTIFIER] This PR is APPROVED. Approval requirements bypassed by manually added approval. This pull-request has been approved by:
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Description
This PR exposes NodeSelector, Affinity and PriorityClassName on each netobserv component, in their respective advanced sections.
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
Example on a 2-node cluster:
Set a custom label on a node:
Set nodeSelector on the eBPF agent config:
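The concrete commands and manifest for that example were not captured in this page. A hedged sketch of what it could look like (the node name, the netobserv label key, and the exact advanced.nodeSelector field path are assumptions inferred from this PR, not confirmed API):

```yaml
# First, label one of the two nodes (hypothetical node name and label):
#   kubectl label node worker-0 netobserv=true
# Then pin the eBPF agents to it via the new advanced section:
apiVersion: flows.netobserv.io/v1beta2
kind: FlowCollector
metadata:
  name: cluster
spec:
  agent:
    ebpf:
      advanced:
        nodeSelector:
          netobserv: "true"   # agents schedule only on nodes carrying this label
```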
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.