Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide functionality to start infra containers on the specified set of CPUs #4459

Merged

Conversation

cynepco3hahue
Copy link

What type of PR is this?

/kind api-change
/kind bug
/kind ci
/kind cleanup
/kind dependency-change
/kind deprecation
/kind design
/kind documentation
/kind failing-test

/kind feature

/kind flake
/kind other

What this PR does / why we need it:

Infra containers can start on every online CPU under the system that can be a problem for the latency-sensitive workloads because every context switch can result in latency peaks.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Provide a new configuration flag to specify CPUs that will be used to run infra containers

@openshift-ci-robot openshift-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/feature Categorizes issue or PR as related to a new feature. labels Jan 6, 2021
@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 6, 2021
@openshift-ci-robot
Copy link

Hi @cynepco3hahue. Thanks for your PR.

I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

docs/crio.8.md Outdated
Comment on lines 214 to 215
**--infra-containers-cpus**="": CPU set to run infra containers, if not specified the CRI-O will use all online CPUs to run infra containers (default: "").
You can specify CPUs in the Linux CPU list format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I think --infra-cpus would work too
nit: we should describe that these should be the same as the guaranteed-cpus passed to the kubelet, if the user wants their workloads to truly have a unique cpuset

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, above we use infra-ctr so we should use that here --infra-ctrs-cpus

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@mrunalp mrunalp Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about infra-ctr-cpuset?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpus is too broad and could also refer to amount of cpu shares reserved or cpu quota if we want to add those in the future. So, I think it may be better to be explicit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, done

@@ -824,3 +839,15 @@ func StringSliceTrySplit(ctx *cli.Context, name string) []string {

return values
}

// cpuSetParse parses the the string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update this comment a bit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I forgot to do it, done

Comment on lines 450 to 453
func (r *Runtime) GetInfraCPUs() *cpuset.CPUSet {
return r.config.InfraContainersCPUs
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this really makes sense in the runtime object. Either in the server config (pkg/config/config.go) or in a special package makes more sense to me

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this function.

@haircommander Does it ok to have the infra_ctr_cpus parameter under the

[runtime]
...
infra_ctr_cpus=<something>
...

or should I move this parameter under the separate section(probably in the future we can have the same functionality for the memory pinning and more).

Comment on lines 351 to 353
if infraCPUs != nil {
specgen.SetLinuxResourcesCPUCpus(infraCPUs.String())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to set this here I believe. what this is doing is settting a new, non-infra container to use infra cpus, if it has cpus set. What we should do is set infra containers to be in those cpus (in server/sandbox_run_linux.go). Or, we can move the infra container to the cpus when the container is created (similar to how a container is in container_update_resources)

does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about this more, we should unconditionally set the infra container's cpuset if infra-ctr-cpus is set. we'll have to do that in server/sandbox_run_linux.go

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@codecov
Copy link

codecov bot commented Jan 6, 2021

Codecov Report

Merging #4459 (10c9e49) into master (6517447) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #4459      +/-   ##
==========================================
- Coverage   40.09%   40.03%   -0.06%     
==========================================
  Files         116      116              
  Lines        9378     9391      +13     
==========================================
  Hits         3760     3760              
- Misses       5212     5224      +12     
- Partials      406      407       +1     

@haircommander
Copy link
Member

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 6, 2021
@@ -225,6 +225,10 @@ the container runtime configuration.
**drop_infra_ctr**=false
Determines whether we drop the infra container when a pod does not have a private PID namespace, and does not use a kernel separating runtime (like kata).
Requies **manage_ns_lifecycle** to be true.

**infra-ctrs-cpus**=""
Determines the CPU set to run infra containers, if not specified the CRI-O will use all online CPUs to run infra containers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Determines the CPU set to run infra containers, if not specified the CRI-O will use all online CPUs to run infra containers.
Determines the CPU set to run infra containers. If not specified, the CRI-O will use all online CPUs to run infra containers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@cynepco3hahue cynepco3hahue force-pushed the provide_infra_cpu_flags branch 3 times, most recently from ce1fd99 to 428ce99 Compare January 7, 2021 16:20
@cynepco3hahue
Copy link
Author

/hold
I am still need to check it locally

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 7, 2021
@cynepco3hahue
Copy link
Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 7, 2021
@cynepco3hahue cynepco3hahue force-pushed the provide_infra_cpu_flags branch 2 times, most recently from 6417cff to fcf4242 Compare January 7, 2021 21:15
@haircommander
Copy link
Member

/retest

@haircommander
Copy link
Member

/retest

@openshift-ci-robot openshift-ci-robot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jan 12, 2021
@openshift-ci-robot openshift-ci-robot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Jan 12, 2021
@cynepco3hahue
Copy link
Author

/retest

1 similar comment
@cynepco3hahue
Copy link
Author

/retest

@@ -225,6 +225,11 @@ the container runtime configuration.
**drop_infra_ctr**=false
Determines whether we drop the infra container when a pod does not have a private PID namespace, and does not use a kernel separating runtime (like kata).
Requies **manage_ns_lifecycle** to be true.

**infra_ctr_cpuset**=""
Determines the CPU set to run infra containers. If not specified, the CRI-O will use all online CPUs to run infra containers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Determines the CPU set to run infra containers. If not specified, the CRI-O will use all online CPUs to run infra containers.
Determines the CPU set to run infra containers. If not specified, CRI-O will use all online CPUs to run infra containers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -794,6 +798,11 @@ func getCrioFlags(defConf *libconfig.Config) []cli.Flag {
EnvVars: []string{"CONTAINER_VERSION_FILE_PERSIST"},
TakesFile: true,
},
&cli.StringFlag{
Name: "infra-ctr-cpuset",
Usage: "CPU set to run infra containers, if not specified the CRI-O will use all online CPUs to run infra containers (default: '').",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Usage: "CPU set to run infra containers, if not specified the CRI-O will use all online CPUs to run infra containers (default: '').",
Usage: "CPU set to run infra containers, if not specified CRI-O will use all online CPUs to run infra containers (default: '').",

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Artyom Lukianov added 4 commits January 12, 2021 17:48
The new flag should provide a way to specify desired set
of CPUs used to run infra containers.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
@haircommander
Copy link
Member

/retest

1 similar comment
@haircommander
Copy link
Member

/retest

@haircommander
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 12, 2021
@openshift-merge-robot openshift-merge-robot merged commit 3a90a17 into cri-o:master Jan 12, 2021
@haircommander
Copy link
Member

/cherry-pick release-1.20

@openshift-cherrypick-robot

@haircommander: #4459 failed to apply on top of branch "release-1.20":

Applying: Add new infra-containers-cpus to the CLI and config file
Applying: Add shell completion for infra-containers-cpu flag
Using index info to reconstruct a base tree...
M	completions/bash/crio
M	completions/fish/crio.fish
M	completions/zsh/_crio
Falling back to patching base and 3-way merge...
Auto-merging completions/zsh/_crio
CONFLICT (content): Merge conflict in completions/zsh/_crio
Auto-merging completions/fish/crio.fish
Auto-merging completions/bash/crio
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 Add shell completion for infra-containers-cpu flag
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants