Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make group ID configurable #11924

Merged
merged 6 commits into from
May 23, 2024

Conversation

nico151999
Copy link
Contributor

Problem
Currently, the sidecar containers do not include a runAsGroup attribute. In some companies, there are security guidelines requiring the runAsGroup attribute to be set.

Solution
This PR causes the group ID to be set and adds the option to set it manually. It depends on this other PR.

Validation
I built the proxy init image with the code in my other pull request and loaded it into my k3d cluster in the dev container. I built the images and loaded them into the k3d cluster as well. Then, I installed the CRDs, Linkerd and Linkerd Viz. I deployed Emojivoto without the config.linkerd.io/proxy-gid annotation being set and then with it being set to check if the group ID changes accordingly in the respective containers' security context. Note that I had to tweak my setup a little bit in order to make it work in WSL2 behind an HTTP proxy, so my setup might differ from the reference setup. Still, I assume that my tweaks did not have an impact on the outcome of the tests.

Fixes #11773

Signed-off-by: Nico Feulner nico.feulner@gmail.com

@nico151999 nico151999 requested a review from a team as a code owner January 12, 2024 19:04
@nico151999 nico151999 marked this pull request as draft January 15, 2024 10:09
@nico151999
Copy link
Contributor Author

As stated in the PR description the PR depends on this. I have just marked the PR as draft so that it won't be merged accidently at some point before the updated image tag for proxy-init can be used. We should focus on the proxy-init PR first and get it released. Then, we can focus on this PR.

Copy link

stale bot commented Apr 15, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Apr 15, 2024
@nico151999 nico151999 force-pushed the nico151999/make-group-id-configurable branch from b6f5b36 to 60d6518 Compare April 17, 2024 12:33
@stale stale bot removed the wontfix label Apr 17, 2024
@nico151999 nico151999 force-pushed the nico151999/make-group-id-configurable branch from 60d6518 to 4000715 Compare April 17, 2024 12:45
@nico151999
Copy link
Contributor Author

Now as the changes to proxy-init are merged I would like to give this PR attention again. I have just updated it so that the comits that have been made in the meantime are included. Therefore, I would like to ask the maintainers to review the changes now. @mateiidavid @kflynn I don't know how much you can do here (i.e. if changes to the linkerd2 repos are part of your responsibility) but maybe you can take a look. Thanks!
I would also like to point out that I would recommend not to make a new release before a new version of proxy-init is released. Running the changes in this PR without the updates to proxy-init does not work since the group ID flag is expected to be available. Thanks in advance for having a look!

@nico151999 nico151999 marked this pull request as ready for review April 17, 2024 12:56
Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nico151999 I put up a PR (#12462) to bump the proxy-init deps in the linkerd2 repo.

Most of the changes look good to me. I mentioned in a comment that my preference would be to have this as something users explicitly opt-in to for their components.

@@ -182,6 +182,8 @@ proxy:
request: ""
# -- User id under which the proxy runs
uid: 2102
# -- Group id under which the proxy runs
gid: 2102
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an alternative solution here where we introduce a gid in the manifests explicitly? Your proxy-init change treats gids as optional. Being a bit more risk adverse, I'd prefer it if gids were not implicitly used here.

The only problem I can see with treating gids as optional is that it results in a bigger configuration surface area. Is there anything we can do to keep the config scope narrow and allow people to explicitly use a gid? For example, setting it as 0 would work, but from a UX standpoint seeing gid: 0 in the values.yaml file is not something I'd be a big fan of. I'm sure Helm supports optional args though in some capacity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a special value like -1 for this purpose and add a comment explaining it. Alternatively, we could do something like

gid:
  customise: true # tells if the gid should be set to the custom value provided
  value: 2102

When I write charts I personally prefer the latter style. So unless you intervene I would implement it this way. What are your thoughts @mateiidavid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to quickly implement it as I suggested. Just take a look and let me know what you think. Thanks!

@nico151999 nico151999 force-pushed the nico151999/make-group-id-configurable branch 2 times, most recently from 552ffd8 to 59fd163 Compare April 24, 2024 07:54
Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nico151999 I like the way you structured the config. There's a "but" coming :) I think we'd rather avoid having a separate customise: false (or enabled: true) option. I don't think our config is set-up well enough to support it at the moment; it's an unfortunate side effect for this PR. It's also hard to keep things consistent with other similar parts of our config because we want to introduce optionality here...

Here's an idea. Can we use KindIs to infer whether a value has been set for the respective field? We tried it out locally and it seemed to work. A more concrete example:

# values.yaml
proxy:
  gid:

# template
{{- if kindIs "int" .Values.proxy.gid }}
- --gid
- {{.Values.proxy.gid }}

cc @alpeb

Comment on lines 24 to 28
proxyGID:
# -- Tells if the customisation should be applied or not
customise: false
# -- Group id under which the proxy shall be ran
value: 2102
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we nest gid under each specific container/app?

proxy:
  gid:

controller:
  gid:

proxyInit:
  gid:

Comment on lines 438 to 444
if proxy.GID.Customise && proxy.GID.Value != baseProxy.GID.Value {
overrideAnnotations[k8s.ProxyGIDAnnotation] = strconv.FormatInt(proxy.GID.Value, 10)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have an explicit check against 0. Using group: 0 might be a bit more dangerous since packets can be generated by other processes that run under the same group, and they will end not being redirected. This will all be really hard to debug.

@@ -20,6 +20,12 @@ logLevel: info
portsToRedirect: ""
# -- User id under which the proxy shall be ran
proxyUID: 2102
# -- Optional customisation of the group id under which the proxy shall be ran
Copy link
Member

@mateiidavid mateiidavid Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to document this a bit more to prevent people from misusing the field.

  1. We should note that proxy's GID should be unique. No other container should run under the same group as the proxy (including the control plane's containers). Once the group ID is shared, packets can be skipped for processes that are not the proxy and it will result in hard to diagnose issues.
  2. We should consider using the fail function to fail-fast if the proxy's gid == controller gid.

Open to any other suggestions if you have any. Our biggest concern here would be that users unknowingly set the same gid across all containers.

@@ -29,6 +29,7 @@ type (
ControllerImage string `json:"controllerImage"`
ControllerReplicas uint `json:"controllerReplicas"`
ControllerUID int64 `json:"controllerUID"`
ControllerGID *Customisable[int64] `json:"controllerGID"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change gid to be nullable, we could probably just take a pointer to an int here instead of introducing a new type.

@nico151999 nico151999 force-pushed the nico151999/make-group-id-configurable branch from 5f0885c to 6adbccb Compare April 26, 2024 15:29
@mateiidavid
Copy link
Member

@nico151999 thanks for making the changes! I'm afraid I might've led you down the wrong path here. When I tested the change I realised the documentation will look weird... on second thought it might be better to just do gid: -1 and call it a day.

@nico151999 nico151999 force-pushed the nico151999/make-group-id-configurable branch 4 times, most recently from 400d62d to 26cf299 Compare May 14, 2024 17:30
@alpeb alpeb force-pushed the nico151999/make-group-id-configurable branch from b7732db to c765eb5 Compare May 22, 2024 20:15
As discussed with @mateiidavid I refactored the PR so that it now allows to omit the group ID with a negative value instead of null.

Signed-off-by: Nico Feulner <nico.feulner@gmail.com>
@alpeb alpeb force-pushed the nico151999/make-group-id-configurable branch from c765eb5 to 572a6d1 Compare May 22, 2024 22:57
@alpeb
Copy link
Member

alpeb commented May 23, 2024

I took the liberty of rebasing with main and pushing some changes/fixes:

  • Reworded the comment for tap.GID (and the pre-existing tap.UID one that was also wrong)
  • Undid some changes to test/integration/install/inject/inject_test.go for the test to pass, given the control plane is installed with the default proxy.guid=-1 in that context.

I'll do some manual testing tomorrow; looking good!

@nico151999
Copy link
Contributor Author

Thanks a lot @alpeb

@@ -125,6 +127,9 @@ metricsAPI:
# -- UID for the metrics-api resource
UID:

# -- GID for the metrics-api resource
gID:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick but we should go with gID or GID (below) for consistency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yea good catch, @nico151999 you wanna get that addressed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the change so we can include this in today's edge 😉

Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 thanks @nico151999 and @alpeb

Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

@nico151999
Copy link
Contributor Author

Sorry for the late response. I'm on a short hiking trip and can't test the changes but they look good to my bare eyes and the tests seem to pass. So I would say we are good to go. Thanks for the effort today @alpeb

@alpeb alpeb merged commit 3d67459 into linkerd:main May 23, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow configuring runAsGroup attribute of sidecar and init containers
3 participants