Force PAO installation on master nodes #351

Closed · wants to merge 2 commits
@@ -118,6 +118,13 @@ spec:
      labels:
        name: performance-operator
    spec:
      affinity:
Member:

it seems the OCP control plane is using

    nodeSelector:
      node-role.kubernetes.io/master: ""

but the end result is the same, and affinity rules are more powerful than nodeSelector, so it seems fine.
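To illustrate the "more powerful" point, here is a hypothetical sketch (not part of this PR) of a rule a plain nodeSelector cannot express — nodeSelectorTerms are ORed, so this schedules on either master or infra nodes:

```yaml
# Hypothetical sketch only; the infra role key is an assumption for
# illustration. A nodeSelector can only AND exact label matches, while
# nodeAffinity supports set-based operators and ORed terms.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-role.kubernetes.io/master
          operator: Exists
      - matchExpressions:
        - key: node-role.kubernetes.io/infra
          operator: Exists
```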

Contributor Author:

I preferred affinity since it is reflected in the official docs.

Member:

Is affinity reflected in the OpenShift official docs or the Kubernetes official docs? Or both?

Contributor Author:

Both.

        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
      containers:
      - command:
        - performance-operator
@@ -137,6 +144,9 @@ spec:
        name: performance-operator
        resources: {}
      serviceAccountName: performance-operator
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
  permissions:
  - rules:
    - apiGroups:
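For context, the toleration in this diff is needed in addition to the affinity rule: master nodes carry a NoSchedule taint, so affinity alone would steer the pod to a master but leave it unschedulable there. A sketch of the standard master taint (not part of this diff):

```yaml
# Sketch of the taint the toleration above matches; master nodes in
# OpenShift are tainted like this by default.
spec:
  taints:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
```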
10 changes: 10 additions & 0 deletions deploy/operator.yaml
@@ -13,6 +13,16 @@ spec:
      name: performance-operator
    spec:
      serviceAccountName: performance-operator
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: performance-operator
        # Replace this with the built image name
23 changes: 23 additions & 0 deletions functests/0_config/config.go
@@ -5,10 +5,12 @@ import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
	"time"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
	. "github.com/onsi/gomega/gstruct"
Member:

unused?

Contributor Author:

used by Reject()

Member:

Fair enough. And BTW this is exactly why I dislike the dot imports.


	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
@@ -27,6 +29,8 @@ import (
	testclient "github.com/openshift-kni/performance-addon-operators/functests/utils/client"
	"github.com/openshift-kni/performance-addon-operators/functests/utils/discovery"
	"github.com/openshift-kni/performance-addon-operators/functests/utils/mcps"
	"github.com/openshift-kni/performance-addon-operators/functests/utils/nodes"
	"github.com/openshift-kni/performance-addon-operators/functests/utils/pods"
	"github.com/openshift-kni/performance-addon-operators/functests/utils/profiles"
	"github.com/openshift-kni/performance-addon-operators/pkg/apis"
	performancev1 "github.com/openshift-kni/performance-addon-operators/pkg/apis/performance/v1"
@@ -37,6 +41,25 @@

var _ = Describe("[performance][config] Performance configuration", func() {

	It("Should run performance profile pod on a master node", func() {
		pod, err := pods.GetPerformanceOperatorPod()
		Expect(err).ToNot(HaveOccurred(), "Failed to find the Performance Addon Operator pod")

		Expect(strings.HasPrefix(pod.Name, "performance-operator")).To(BeTrue(),
			"Performance Addon Operator pod name should start with performance-operator prefix")

		masterNodes, err := nodes.GetByRole(testutils.RoleMaster)
		Expect(err).ToNot(HaveOccurred(), "Failed to query the master nodes")
		for _, node := range masterNodes {
			if node.Name == pod.Spec.NodeName {
				return
			}
		}

		// Fail
		Expect(true).To(Reject(), "Performance Addon Operator is not running on a master node")
	})

	It("Should successfully deploy the performance profile", func() {

		performanceProfile := testProfile()
2 changes: 2 additions & 0 deletions functests/utils/consts.go
@@ -67,6 +67,8 @@ func init() {
const (
	// RoleWorker contains the worker role
	RoleWorker = "worker"
	// RoleMaster contains the master role
	RoleMaster = "master"
)

const (
22 changes: 22 additions & 0 deletions functests/utils/pods/pods.go
@@ -11,9 +11,11 @@ import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"sigs.k8s.io/controller-runtime/pkg/client"

	testutils "github.com/openshift-kni/performance-addon-operators/functests/utils"
	testclient "github.com/openshift-kni/performance-addon-operators/functests/utils/client"
@@ -143,3 +145,23 @@ func GetContainerIDByName(pod *corev1.Pod, containerName string) (string, error) {
	}
	return "", fmt.Errorf("failed to find the container ID for the container %q under the pod %q", containerName, pod.Name)
}

// GetPerformanceOperatorPod returns the pod running the Performance Profile Operator
func GetPerformanceOperatorPod() (*corev1.Pod, error) {
	selector, err := labels.Parse(fmt.Sprintf("%s=%s", "name", "performance-operator"))
	if err != nil {
		return nil, err
	}

	pods := &corev1.PodList{}

	opts := &client.ListOptions{LabelSelector: selector, Namespace: testutils.PerformanceOperatorNamespace}
	if err := testclient.Client.List(context.TODO(), pods, opts); err != nil {
		return nil, err
	}
	if len(pods.Items) != 1 {
Member:

I wonder if we should just use len(pods.Items) < 1 (just in case we eventually add HA and leader election). Or maybe it would be premature to change the test now, so just asking.

Member:

Fair point, I prefer to keep the test as-is and see it break first when/if we add the HA/leader election.
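The lenient variant floated above can be sketched as a pure helper (hypothetical, not part of the PR — the merged code deliberately keeps the stricter != 1 check):

```go
package main

import "fmt"

// checkPodCount is a hypothetical relaxation of the test's guard: it only
// rejects an empty list, so N replicas under HA/leader election would pass.
func checkPodCount(n int) error {
	if n < 1 {
		return fmt.Errorf("incorrect performance operator pods count: %d", n)
	}
	return nil
}

func main() {
	fmt.Println(checkPodCount(0) != nil) // an empty list is still rejected
	fmt.Println(checkPodCount(3) == nil) // multiple replicas would be accepted
}
```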

		return nil, fmt.Errorf("incorrect performance operator pods count: %d", len(pods.Items))
	}

Member:

any way to check that the pod is indeed running performance-operator and not any random pod which happens to have the same label?

Contributor Author:

In the same namespace as PAO, "openshift-performance-addon"? I think it is a relatively safe bet...

Member:

It's still a bet, meaning it is still anecdotal evidence, and we need quite some of it to gain confidence. Checking the image name seems stronger (does it contain 'performance-operator'). Feel free to add more checks, or to look up better ones.
The best way would probably be to check a PAO endpoint and verify it is behaving correctly; that's sufficient proof the service is there. It's probably complex, maybe too complex to be done here, though.

Contributor Author:

We are also checking that we have only one pod in the namespace. The check is the first step in the func tests; I suppose the next steps will fail if for some reason the PAO is not in the namespace but another pod with the same label is.
While I agree that theoretically this is not enough, is this the place to add more checks?
I can add a check that the actual pod name starts with performance-operator-{something}, but this is a side effect of how k8s names pods. Or maybe it would add to confidence?

Contributor Author:

Added a check that the performance operator pod name (not label) starts with "performance-operator".

Contributor Author:

That should increase the confidence in the bet.

Member:

Still not completely happy with this solution, but good enough for now.
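The extra identity checks discussed in this thread can be sketched as a pure helper (hypothetical, stdlib-only; the real test works on a corev1.Pod, and the registry path below is made up for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikePAOPod is a hypothetical helper combining the checks discussed
// above: the generated pod name must start with the deployment prefix, and
// (per the reviewer's suggestion) the container image should mention
// performance-operator as well.
func looksLikePAOPod(podName, image string) bool {
	return strings.HasPrefix(podName, "performance-operator") &&
		strings.Contains(image, "performance-operator")
}

func main() {
	fmt.Println(looksLikePAOPod("performance-operator-6f9c7b", "registry.example/performance-operator:v1"))
	fmt.Println(looksLikePAOPod("random-pod-abc", "registry.example/other:v1"))
}
```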

	return &pods.Items[0], nil
}