
WIP: performance: e2e: configure adaptive cpu in profile #884

Closed
wants to merge 4 commits

Conversation

shajmakh
Contributor

We've been observing lately that some tests involving disabling load balancing are failing (like 32646), because the expected result does not contain the specific anticipated CPUs. After investigation, it turns out that one factor is the profile's CPU distribution configuration.

The PAO functional tests configure fixed CPU values in the PP. This is considered a misconfiguration, especially when the system has more than 4 CPUs; there is no guarantee that the performance profile controller will work adequately when not all CPUs are reflected in the CPU section of the PP.

To fix this, we check the actual CPU capacity on a worker node and divide it between reserved and isolated (the required CPU fields in the PP).
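The split described above can be sketched as follows. `splitCPUs` is an illustrative helper, not the PR's actual function; it mirrors the naive capacity-based arithmetic in the diff (reserve CPU 0 and CPU capacity/2, isolate the rest):

```go
package main

import "fmt"

// splitCPUs divides a node's CPU capacity between reserved and isolated
// sets, following the PR's approach: CPU 0 and CPU capacity/2 are reserved,
// everything else is isolated. At least 4 CPUs are required for a valid
// performance profile.
func splitCPUs(capacityCPU int64) (reserved, isolated string, err error) {
	if capacityCPU < 4 {
		return "", "", fmt.Errorf("insufficient cpus: found %d, expect at least 4", capacityCPU)
	}
	half := capacityCPU / 2
	reserved = fmt.Sprintf("0,%d", half)
	isolated = fmt.Sprintf("1-%d,%d-%d", half-1, half+1, capacityCPU-1)
	return reserved, isolated, nil
}

func main() {
	r, i, _ := splitCPUs(8)
	fmt.Println(r, i) // with 8 CPUs: reserved "0,4", isolated "1-3,5-7"
}
```

As the review below points out, this arithmetic ignores HT siblings and NUMA locality, which is the core objection to the approach.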

@shajmakh
Contributor Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 14, 2023
Contributor

openshift-ci bot commented Dec 14, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shajmakh
Once this PR has been reviewed and has the lgtm label, please assign dagrayvid for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -3,6 +3,7 @@ package __performance_config
import (
"context"
"fmt"
"github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/utils/nodes"
Contributor

Please move it down along with the other relevant imports

return nil, fmt.Errorf("insufficient cpus on a node to create a valid performanceprofile for the tests, found %d cpus, expects at least 4", capacityCPU)
}
reserved := performancev2.CPUSet(fmt.Sprintf("0,%d", capacityCPU/2))
isolated := performancev2.CPUSet(fmt.Sprintf("1-%d,%d-%d", capacityCPU/2-1, capacityCPU/2+1, capacityCPU-1))
Contributor

Are 2 reserved CPUs enough for a machine with a large number of CPUs?

Contributor

This will cause issues on BM with HT enabled if we don't pick CPUs from the same core.

Contributor

The same goes for isolated. Upstream, where there are only 4 CPUs and no NUMA, this works; but since we use this on multi-NUMA systems with HT enabled, it gets complicated. I strongly suggest using PPC. Otherwise you will have to duplicate here what PPC does.

Contributor

Using PPC can work here, even though it will make the lane setup significantly more complex. But I agree we're reimplementing PPC. The best option IMO would be to reuse part of the PPC code, but we would need more detailed HW information to enable PPC.

// Get the CPU capacity of a node; typically it is the same on all nodes
workerNodes, err := nodes.GetByLabels(testutils.NodeSelectorLabels)
Expect(err).ToNot(HaveOccurred())
capacityCPU, _ := workerNodes[0].Status.Capacity.Cpu().AsInt64()
Contributor

I would validate that all workers are homogeneous and fail otherwise.

Contributor

yes please; also, unlikely as it may be, let's handle the error here
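The homogeneity check the reviewers ask for could look like the sketch below. `nodeInfo` is a local stand-in type for illustration; the real code would read `node.Status.Capacity.Cpu().AsInt64()` on the `corev1.Node` objects returned by `nodes.GetByLabels`:

```go
package main

import "fmt"

// nodeInfo is a stand-in for the corev1.Node fields the suite reads.
type nodeInfo struct {
	Name        string
	CapacityCPU int64
}

// homogeneousCPUCapacity returns the CPU capacity shared by all workers,
// failing if the list is empty or if any node disagrees with the first one.
func homogeneousCPUCapacity(workers []nodeInfo) (int64, error) {
	if len(workers) == 0 {
		return 0, fmt.Errorf("no worker nodes found")
	}
	capacity := workers[0].CapacityCPU
	for _, node := range workers[1:] {
		if node.CapacityCPU != capacity {
			return 0, fmt.Errorf("node %q has %d cpus, expected %d as on node %q",
				node.Name, node.CapacityCPU, capacity, workers[0].Name)
		}
	}
	return capacity, nil
}

func main() {
	workers := []nodeInfo{{"worker-0", 8}, {"worker-1", 8}}
	capacity, err := homogeneousCPUCapacity(workers)
	fmt.Println(capacity, err)
}
```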

Contributor

@ffromani ffromani left a comment

I side with @mrniranjan here. The main problem is we're reimplementing PPC.
Best would be to either reuse (part of) the PPC code, or just use PPC as a prerequisite before the testsuite starts.

I don't have strong opinions.


func testProfile() (*performancev2.PerformanceProfile, error) {
// Get the CPU capacity of a node; typically it is the same on all nodes
workerNodes, err := nodes.GetByLabels(testutils.NodeSelectorLabels)
Expect(err).ToNot(HaveOccurred())
Contributor

@ffromani ffromani Dec 14, 2023

please either use Expect in the code (but please use ExpectWithOffset) OR return an error and let the caller handle it; let's not do both
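The error-returning variant of that suggestion might look like this. `getWorkerNodes` is a hypothetical stand-in for the suite's `nodes.GetByLabels(testutils.NodeSelectorLabels)` call; the point is that since `testProfile` already returns an error, lookup failures should be propagated to the caller rather than asserted inside the helper:

```go
package main

import "fmt"

// getWorkerNodes is a hypothetical stand-in for nodes.GetByLabels.
func getWorkerNodes() ([]string, error) {
	return []string{"worker-0"}, nil
}

// testProfileNodes propagates the lookup error to the caller instead of
// asserting inside the helper. (If assertions were kept instead, gomega's
// ExpectWithOffset would at least point failure reports at the caller.)
func testProfileNodes() ([]string, error) {
	workerNodes, err := getWorkerNodes()
	if err != nil {
		return nil, fmt.Errorf("failed to get worker nodes: %w", err)
	}
	return workerNodes, nil
}

func main() {
	nodes, err := testProfileNodes()
	fmt.Println(nodes, err)
}
```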


// should never be reached, but guard anyway
return nil, fmt.Errorf("insufficient cpus on a node to create a valid performanceprofile for the tests, found %d cpus, expects at least 4", capacityCPU)
}
reserved := performancev2.CPUSet(fmt.Sprintf("0,%d", capacityCPU/2))
Contributor

I strongly recommend using CPUSets and doing the logic with those instead of raw numbers
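A minimal sketch of that set-based style is below. In real code a proper type such as `k8s.io/utils/cpuset.CPUSet` would be used; the tiny `cpuSet` here is a self-contained stand-in, just enough to show reserved/isolated being derived by set difference instead of string arithmetic:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cpuSet is a minimal stand-in for a real CPU set type.
type cpuSet map[int]struct{}

func newCPUSet(cpus ...int) cpuSet {
	s := cpuSet{}
	for _, c := range cpus {
		s[c] = struct{}{}
	}
	return s
}

// difference returns the members of s not present in other.
func (s cpuSet) difference(other cpuSet) cpuSet {
	out := cpuSet{}
	for c := range s {
		if _, ok := other[c]; !ok {
			out[c] = struct{}{}
		}
	}
	return out
}

// String renders the set in canonical range form, e.g. "1-3,5-7".
func (s cpuSet) String() string {
	cpus := make([]int, 0, len(s))
	for c := range s {
		cpus = append(cpus, c)
	}
	sort.Ints(cpus)
	var parts []string
	for i := 0; i < len(cpus); {
		j := i
		for j+1 < len(cpus) && cpus[j+1] == cpus[j]+1 {
			j++
		}
		if i == j {
			parts = append(parts, fmt.Sprintf("%d", cpus[i]))
		} else {
			parts = append(parts, fmt.Sprintf("%d-%d", cpus[i], cpus[j]))
		}
		i = j + 1
	}
	return strings.Join(parts, ",")
}

func main() {
	capacityCPU := 8
	all := make([]int, capacityCPU)
	for i := range all {
		all[i] = i
	}
	// Reserve CPU 0 and its assumed HT sibling; isolated is simply the rest.
	reserved := newCPUSet(0, capacityCPU/2)
	isolated := newCPUSet(all...).difference(reserved)
	fmt.Println(reserved, isolated) // 0,4 1-3,5-7
}
```

With sets, the isolated list is correct by construction for any reserved choice, which avoids the fragile off-by-one range formatting in the diff above.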

@ffromani
Contributor

another option we could explore is somewhat of a middle ground:

1. get the CPU topology (same format as PPC) as a parameter of the config lane.
2. iff the CPU topology is given, THEN the code will assume all the workers have the same topology. It will be the responsibility of the lane maintainer to enforce this and to ship up-to-date topology data.
3. reuse (possibly refactoring) the PPC code to consume the given data to optimally compute (parts of) the PPC profile.

It is not fully automated (e.g. just run must-gather and let PPC do its magic), but it would be a step in the right direction, more automated than the current state, and should be deliverable quickly as a fix.


however, we will need to check how to actually ship the data and make it accessible to the 0_config suite

@shajmakh
Contributor Author

@Tal-or @mrniranjan @ffromani thank you for the valuable reviews and comments. You highlighted legitimate points about using PPC. However, this PR is intended to address the blocking CI failure on cnf-feature-deploy, considering that the d/s CI is in fact using PPC to generate the profile. I'm researching whether we can enhance this further to satisfy all needs, taking into account the urgency of this for u/s.

@shajmakh shajmakh changed the title performance: e2e: configure adaptive cpu in profile WIP: performance: e2e: configure adaptive cpu in profile Dec 15, 2023
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 15, 2023
@ffromani
Contributor

ffromani commented Dec 15, 2023

@Tal-or @mrniranjan @ffromani thank you for the valuable reviews and comments. You highlighted legitimate points about using PPC. However, this PR is intended to address the blocking CI failure on cnf-feature-deploy, considering that the d/s CI is in fact using PPC to generate the profile. I'm researching whether we can enhance this further to satisfy all needs, taking into account the urgency of this for u/s.

The urgency IS a factor indeed, but the main blocker for this approach is that we need to take HT and NUMA CPU affinity into account when deciding the reserved CPUs. Otherwise the perfprof will be incorrect for other reasons; perhaps lesser ones than in the current state, but not much better. The NUMA allocation is perhaps debatable, but HT is a requirement for which we have e2e tests.

I understand both camps and I don't have strong opinions yet.
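One way to account for HT when picking reserved CPUs is to read each CPU's sibling list from sysfs (`/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`) and reserve full cores at a time. The sketch below only shows parsing that well-known "0,4" / "0-1" format; actually reading sysfs on the node (e.g. via a debug pod) is left out, and this is an illustration rather than the suite's code:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSiblingsList parses the contents of thread_siblings_list, which uses
// a comma-separated list of CPU ids and ranges, e.g. "0,4" or "0-1".
func parseSiblingsList(s string) ([]int, error) {
	var cpus []int
	for _, part := range strings.Split(strings.TrimSpace(s), ",") {
		lo, hi, isRange := strings.Cut(part, "-")
		start, err := strconv.Atoi(lo)
		if err != nil {
			return nil, fmt.Errorf("bad cpu list %q: %w", s, err)
		}
		end := start
		if isRange {
			if end, err = strconv.Atoi(hi); err != nil {
				return nil, fmt.Errorf("bad cpu list %q: %w", s, err)
			}
		}
		for c := start; c <= end; c++ {
			cpus = append(cpus, c)
		}
	}
	return cpus, nil
}

func main() {
	// Reserving both siblings of a core avoids splitting a physical core
	// between reserved and isolated on HT-enabled bare metal.
	siblings, _ := parseSiblingsList("0,4")
	fmt.Println(siblings)
}
```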

@shajmakh
Contributor Author

shajmakh commented Jan 9, 2024

After reassessing the cost of the approach described here, I am closing this in favor of #909

@shajmakh shajmakh closed this Jan 9, 2024