Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --cgroups parameter to jailer #2012

Merged
merged 2 commits into from Sep 7, 2020

Conversation

christian7007
Copy link
Contributor

@christian7007 christian7007 commented Jul 6, 2020

Reason for This PR

Increase cgroups integration flexibility for the Jailer.
Fixes #1937
Fixes #1617

Description of Changes

New --cgroups argument have been added to the Jailer. This argument takes care of setting the passed cgroups generically and check that the expected value is correctly written.

The Cgroup type have been modified in order to represent each of the cgroups passed by --cgroups command line argument.

  • This functionality can be added in rust-vmm.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.

PR Checklist

[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]

  • All commits in this PR are signed (git commit -s).
  • The reason for this PR is clearly provided (issue no. or explanation).
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • Any newly added unsafe code is properly documented.
  • Any API changes are reflected in firecracker/swagger.yaml.
  • Any user-facing changes are mentioned in CHANGELOG.md.

@lauralt
Copy link

lauralt commented Jul 10, 2020

Hi, @christian7007! Thanks for the PR, we will take a closer look at it soon.

@lauralt lauralt added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Jul 10, 2020
Copy link
Contributor

@ioanachirca ioanachirca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1st pass, mostly nits :D will review the functionality in depth a bit later

CHANGELOG.md Outdated Show resolved Hide resolved
docs/jailer.md Outdated Show resolved Hide resolved
docs/jailer.md Outdated Show resolved Hide resolved
docs/jailer.md Outdated Show resolved Hide resolved
docs/jailer.md Outdated Show resolved Hide resolved
})
}

// This writes the cgroup value into the cgroup property file and attach the pid to the cgroup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This writes the cgroup value into the cgroup property file and attach the pid to the cgroup
// This writes the cgroup value into the cgroup property file and attaches the pid to the cgroup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But does it attach the pid? It seems there is a different method that does this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I forgot to change that comment, I'll update that.

src/jailer/src/cgroup.rs Outdated Show resolved Hide resolved
src/jailer/src/cgroup.rs Outdated Show resolved Hide resolved
src/jailer/src/cgroup.rs Outdated Show resolved Hide resolved
src/jailer/src/env.rs Outdated Show resolved Hide resolved
@christian7007
Copy link
Contributor Author

Thanks for the feedback @ioanachirca. I've added all the changes.

@acatangiu
Copy link
Contributor

Hi @christian7007 we're not sure if we should downright remove the --node support or follow some deprecation mechanism.

I am currently reaching out to some of our users to get some customer feedback and we will come back with a decision.

@christian7007
Copy link
Contributor Author

Hello @acatangiu,

No problem, if you need to proceed in a different way with the --node argument deprecation I can upgrade the PR and make it fits with the new Cgroups interface until it can be removed.

Copy link

@iulianbarbu iulianbarbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good overall. I have some concerns tho. How can we be sure that controllers separator, at the moment being comma, can be safely used? I noticed that using comma for separating multiple controllers does not fit when we have comma separated values. Have you looked on existing controllers properties values to see what can be safely used to separate controllers, but not interfere with values separators? We need to make sure that what is of interest for controlling the Firecracker process is fully supported.

CHANGELOG.md Outdated
@@ -68,6 +70,9 @@
`403 BadRequest`.
- Segregated MMDS documentation in MMDS design documentation and MMDS user
guide documentation.
- Jailer `--node` attribute has been replaced by `--cgroups`. In order to achieve the same
functionality, `--cgroups` can be used like this:
`--cgroups cpuset.mems=<node>,cpuset.cpus=$(cat /sys/devices/system/node/node<node>/cpulist)`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the implementation, I see that controller.property=value are separated by comma. I think we want to support comma separated values as well, for a property of a controller. An example is the case for cpuset.cpus, where we can have multiple comma separated values. E.g: 0,2,8-15.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @iulianbarbu, you are totally right. I didn't take that case into account. Probably it's better to use a different separator like a semicolon. I'll take a closer look on the supported values and I'll update the PR changing the separator.

Thanks for the feedback!

@iulianbarbu iulianbarbu added Status: Author and removed Status: Awaiting review Indicates that a pull request is ready to be reviewed labels Jul 22, 2020
@acatangiu
Copy link
Contributor

Hello @acatangiu,

No problem, if you need to proceed in a different way with the --node argument deprecation I can upgrade the PR and make it fits with the new Cgroups interface until it can be removed.

We decided to go with a deprecation mechanism here. For the purpose of this PR it is enough to keep both --node and --cgroup parameters functional, and we will add a deprecation mechanism for --node on top of that.

@christian7007
Copy link
Contributor Author

Hello @iulianbarbu and @acatangiu, sorry for the late reply I've been on vacation last days. I'll take care of submitting commits addressing both changes. The support for --node attribute and the separator change.

@iulianbarbu
Copy link

Hello @iulianbarbu and @acatangiu, sorry for the late reply I've been on vacation last days. I'll take care of submitting commits addressing both changes. The support for --node attribute and the separator change.

Cool! Let us know if you need any help or feedback.

@christian7007
Copy link
Contributor Author

Hello @iulianbarbu, I've been thinking and making some test regarding changing the separator. I think that probably using ; as separator could be a bad idea too, as it might be interpreted by bash. I think that maybe we could use a different approach, something like:

jailer ... --cgroup cpuset.mems=0 --cgroup cpuset.cpus=1-3,4 ... 

This way we use different arguments as separator, it seems cleaner to me. I've prepare a commit with a prototype for supporting this into the arg parser: christian7007@979873f

What do you think about it? If you think this solution is good enough I can open a new PR with the parser changes and work over it.

@lauralt
Copy link

lauralt commented Aug 12, 2020

Hello @iulianbarbu, I've been thinking and making some test regarding changing the separator. I think that probably using ; as separator could be a bad idea too, as it might be interpreted by bash. I think that maybe we could use a different approach, something like:

jailer ... --cgroup cpuset.mems=0 --cgroup cpuset.cpus=1-3,4 ... 

This way we use different arguments as separator, it seems cleaner to me. I've prepare a commit with a prototype for supporting this into the arg parser: christian7007@979873f

What do you think about it? If you think this solution is good enough I can open a new PR with the parser changes and work over it.

Hi, @christian7007. I just took a look at this thread, but I might still be missing some context sorry :D. Anyway, regarding the parser changes, I do not quite like being able to set multiple values for an argument by repeating that argument (but it is indeed a pretty subjective opinion). It would become confusing to allow both arguments with just a value (we do not overwrite, but return a DuplicateArgument error if we repeat such argument, e.g. --secomp-level 1 --secomp-level 2) and arguments with more values, in this form: --cgroup pair1 --cgroup pair2.

Basically, you want to pass a list as value to the --cgroups parameter, I think it would look better like this --cgroups pair1 pair2or --cgroups pair1<some_delimiter>pair2 (I prefer the first option).
Also, since it seems the cgroups argument can take many key=value pairs, another idea that would simplify the parsing would be to store these pairs in a file (--cgroups-file some_file).

@iulianbarbu
Copy link

iulianbarbu commented Aug 12, 2020

Hello @iulianbarbu, I've been thinking and making some test regarding changing the separator. I think that probably using ; as separator could be a bad idea too, as it might be interpreted by bash.

Yes, sure. I think a workaround for this would be to require escaping: --cgroups "cpuset.mems=0;cpuset.cpus=1,2,8-10".

I think that maybe we could use a different approach, something like:

jailer ... --cgroup cpuset.mems=0 --cgroup cpuset.cpus=1-3,4 ... 

This way we use different arguments as separator, it seems cleaner to me. I've prepare a commit with a prototype for supporting this into the arg parser: christian7007@979873f

What do you think about it? If you think this solution is good enough I can open a new PR with the parser changes and work over it.

This might work. I do not have a strong preference for one or the other. As @lauralt pointed, we must make sure we handle properly cases which map to DuplicateArgumentError. Such cases can happen for both available alternatives. Other than that, I would gladly appreciate any input from the others reviewers. I am not sure if there is a best way to choose between both alternatives, but I think there are some guidelines which we can use, like:

  • comparing the final user experience for both alternatives in terms of clarity and simplicity.
  • arguments combination corner cases handled properly.
  • significant performance turnoffs, when we compare the two alternatives, since their on the hot path for starting a microVM.
  • overall implementation details.

I guess we want to give it some thought, and then align on it. Let me have it discussed inside the team. I'll come back with an update soon.

@christian7007
Copy link
Contributor Author

Hello @lauralt, I agree that the best solution would be to use a separator, the problem here is that common used separator might interfere with the functionality (e.g it can be used by cgroups for something else or be interpreted by bash). That's why I'm trying to find a different approach.

Yes, sure. I think a workaround for this would be to require escaping: --cgroups "cpuset.mems=0;cpuset.cpus=1,2,8-10".

You're right @iulianbarbu, but IMHO it doesn't look too clean, also these escapes could be a bit annoying when the command generation/execution is automated.

I agree with the guidelines and with giving a thought to make sure that the interface we are defining is good enough to avoid future breaking changes. So, just let me know when you and the team have an update and I'll keep working on this.

@lauralt
Copy link

lauralt commented Aug 12, 2020

Hello @lauralt, I agree that the best solution would be to use a separator, the problem here is that common used separator might interfere with the functionality (e.g it can be used by cgroups for something else or be interpreted by bash). That's why I'm trying to find a different approach.

Hmm, could a " " separator be used by cgroups for something else? I would expect to not have spaces inside a key, value pair.
Also, in case we accept that the value, for example, can contain spaces, wouldn't this be a problem in the --cgroup pair1 --cgroup pair2 solution that you proposed too, i.e. you would keep only the first part of the value?

@christian7007
Copy link
Contributor Author

I'm not sure if the " " is used for something else by cgroups (probably not, but I can try to find out). Anyway, in case we need to support that specific case we can try to find a workaround like:

--cgroups "cpuset.cpus=1 2<delimiter>cpuset.mems=0"

or like:

--cgroup "cpuset.cpus=1, 2" --cgroup cpuset.mems=0

And I guess that once we have read the entire string that shouldn't be a problem any more, as it will fit to the format <cgroup_controller>.<cgroup_property>=<value> being <value> every character after the = including white spaces.

@christian7007
Copy link
Contributor Author

christian7007 commented Aug 14, 2020

Hello @acatangiu, I've just pushed the commit restoring --node functionality (47ea8c1). It takes care of creating the corresponding cgroups to isolate the process in the specified node. I'd like to mention that previously, the jailer, apart from isolating the proces in the NUMA node, it also register the PID for some cgroups controllers (cpu, cpuset and pids) so people can modified the cgroups after the jailer was executed, From the jailer docs:

The jailer then writes the current pid to /sys/fs/cgroup/cpu/firecracker/551e7604-e35c-42b3-b825-416853441234/tasks, /sys/fs/cgroup/cpuset/firecracker/551e7604-e35c-42b3-b825-416853441234/tasks, and /sys/fs/cgroup/pids/firecracker/551e7604-e35c-42b3-b825-416853441234/tasks

With my current changes this is only done when a cgroup is set for a specific controller, so if only --node is used the PID will be only registered for cpuset controller (which is the one used for the NUMA node isolation).

I don't know if we should take care of it too, and follow some deprecation mode for this specific behavior too.

Regarding the CI tests, I'm not sure if the Internal CI error of aarch64 is related with my changes (it doesn't show any error related information). For the x86_64 error, it's related with using , as separator, so it should be fixed once we decide how to proceed on this issue.

@iulianbarbu iulianbarbu added Status: Awaiting review Indicates that a pull request is ready to be reviewed and removed Status: Author labels Aug 17, 2020
@iulianbarbu
Copy link

Hi, @christian7007 !
We've decided to move forward with the alternative where you have multiple --cgroup flags:
E.g --cgroup "cpuset.cpus=1, 2" --cgroup cpuset.mems=0,1 etc.

Let us know if we can help with anything.

@iulianbarbu iulianbarbu removed the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Aug 19, 2020
@christian7007
Copy link
Contributor Author

Hello @iulianbarbu, I've pushed a commit (5c5809c) with the requested changes. Please take a look at it when you have some time, if the changes looks good to you I'll squash all the commits in two.

Copy link

@iulianbarbu iulianbarbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits and a suggestion for changing a comment.

Feel free to create the final two commits. I'll review each of them once more and then approve it if there is no other issue. Good job! 👍

docs/jailer.md Outdated Show resolved Hide resolved
src/jailer/src/cgroup.rs Outdated Show resolved Hide resolved
src/utils/src/arg_parser.rs Outdated Show resolved Hide resolved
tests/framework/jailer.py Outdated Show resolved Hide resolved
tests/framework/jailer.py Outdated Show resolved Hide resolved
Copy link

@iulianbarbu iulianbarbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks. Feel free to compact the changes in those two commits.

@christian7007
Copy link
Contributor Author

Perfect! Thanks for the feedback. I'm running the tests locally after changing this: #2012 (comment) to make sure I haven't break anything (nice catch btw). I'll update the PR as soon as tests are OK.

Copy link

@iulianbarbu iulianbarbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explicitly sign each commit:
git commit -s --amend.

@christian7007
Copy link
Contributor Author

Please explicitly sign each commit:
git commit -s --amend.

Done!

iulianbarbu
iulianbarbu previously approved these changes Aug 28, 2020
gbionescu
gbionescu previously approved these changes Aug 31, 2020
@gbionescu gbionescu mentioned this pull request Sep 1, 2020
3 tasks
@acatangiu
Copy link
Contributor

@christian7007 looks like the PR is ready for merging. There's only a problem of solving a test_coverage.py conflict.
We could do that ourselves if that's fine with you.

@christian7007
Copy link
Contributor Author

@christian7007 looks like the PR is ready for merging. There's only a problem of solving a test_coverage.py conflict.
We could do that ourselves if that's fine with you.

Sure, feel free to fix it.

@gbionescu gbionescu dismissed stale reviews from iulianbarbu and themself via 048bc3b September 4, 2020 14:22
iulianbarbu
iulianbarbu previously approved these changes Sep 7, 2020
@gbionescu
Copy link

Reworded the commit since it mentioned ARM instead of AMD.

iulianbarbu
iulianbarbu previously approved these changes Sep 7, 2020
gbionescu
gbionescu previously approved these changes Sep 7, 2020
Christian González added 2 commits September 7, 2020 15:12
It adds support for arguments which can appear multiple times
(e.g jailer --feature f1 --feature f2). This is useful when
multiple features of the same type needs to be set via CLI
arguments.

Signed-off-by: Christian González <cgonzalez@opennebula.io>
New --cgroup argument has been added to the jailer. It adds more
flexibility on how cgroups are set by the jailer.

   - The Cgroup type has been redefined to wrap a cgroup to be set.
   - The test and documentation have been updated to reflect these
     changes

Signed-off-by: Christian González <cgonzalez@opennebula.io>
@iulianbarbu iulianbarbu merged commit 79d9c3d into firecracker-microvm:master Sep 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More flexible interaction with cgroups Increase NUMA selection options
6 participants