Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve how citadel decides which set of namespaces to operate on [Issue #14828] #15503

Merged
merged 1 commit into from Aug 1, 2019
Merged

Improve how citadel decides which set of namespaces to operate on [Issue #14828] #15503

merged 1 commit into from Aug 1, 2019

Conversation

Monkeyanator
Copy link
Contributor

@Monkeyanator Monkeyanator commented Jul 12, 2019


name: Improve how Istio citadel decides which set of namespaces to operate on [Issue #14828]
about: Removed explicit-opt-in flag and introduced new targeting mechanism, based on labeling namespaces with the namespace of the desired Citadel instance

Relevant issue
Relevant design document
PR and comments from personal fork
Recorded Security WG Meeting Discussion on this

This PR introduces two new labels, ca.istio.io/env, which allows users to designate which namespaces (if any) Citadel should operate on, and ca.istio.io/override, which allows users to force Citadel to target or ignore a namespace (supersedes other activation rules). This PR also introduces a new environment variable, enableNamespacesByDefault, which determines the default behavior when the designated label is not found (i.e. whether or not Citadel instance should create secret for added ServiceAccount).

Retroactive secret creation and deletion are also included in this PR (i.e. enabling a namespace will create secrets in the namespace for existing SAs).

Note: this is a hefty PR, and I'd be fine to chop it into multiple, smaller PRs if that seems more manageable/reviewable (separating retroactive secret creation into a different PR / discussion might make sense). Further note: the configuration options are operator facing and will have to be documented.

[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Policies and Telemetry
[X] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastrcture

@Monkeyanator
Copy link
Contributor Author

@incfly @elevran @costinm fyi

sc.deleteSecret(name, namespace)
return
}
} else if err != nil { // in the case we couldn't retrieve namespace, we should proceed with cert refresh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nilness: tautological condition: non-nil != nil (from govet)

ktesting.NewCreateAction(nsSchema, "", createNS("false-to-true", map[string]string{})),
ktesting.NewCreateAction(saSchema, "false-to-true", createServiceAccount("test-sa", "false-to-true")),
ktesting.NewListAction(saSchema, schema.GroupVersionKind{}, "false-to-true", metav1.ListOptions{}),
ktesting.NewCreateAction(secretSchema, "false-to-true", ca.BuildSecret("test-sa", "istio.test-sa", "false-to-true", nil, nil, nil, nil, nil, IstioSecretType)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line is 163 characters (from lll)

@@ -14,6 +14,9 @@ citadelHealthCheck: false
# 90*24hour = 2160h
workloadCertTtl: 2160h

# Citadel default behavior if the citadel.istio.io/managed namespace label not found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ca.istio.io?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -14,6 +14,9 @@ citadelHealthCheck: false
# 90*24hour = 2160h
workloadCertTtl: 2160h

# Citadel default behavior if the citadel.istio.io/managed namespace label not found
unlabeledNamespacesManaged: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we opt-in to Citadel or opt-out by default?
I find opt-in behavior easier to reason as I assume existing namespaces would not be labeled accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't opt-out match the current behavior (i.e. manageUnlabeled=true as the default)? As it is, Citadel will operate on all namespaces it gets updates about (i.e. those specified in labeled-namespaces), labeled or not.

@@ -160,11 +158,10 @@ func init() {
flags.StringVar(&opts.listenedNamespaces, "listened-namespaces", metav1.NamespaceAll,
"Select the namespaces for the Citadel to listen to, separated by comma. If unspecified, Citadel tries to use the ${"+
cmd.ListenedNamespaceKey+"} environment variable. If neither is set, Citadel listens to all namespaces.")
flags.BoolVar(&opts.unlabeledNamespacesManaged, "unlabeled-namespaces-managed", true, "Determines whether Citadel"+
"should generate secrets for service accounts created in namespaces without the citadel.istio.io/managed label")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ca.istio.io?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

kubeConfigFile string
readSigningCertOnly bool
listenedNamespaces string
unlabeledNamespacesManaged bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: active var name (e.g., manageUnlabeled) may be a little easier (try both options in an 'if' statement...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@@ -306,11 +303,15 @@ func runCA() {
stopCh := make(chan struct{})
if !opts.serverOnly {
log.Infof("Creating Kubernetes controller to write issued keys and certs into secret ...")

citadelNamespace := env.RegisterStringVar("CITADEL_NS", "",
Copy link
Contributor

@elevran elevran Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is using CITADEL_ prefix commonly used for other env-variables?
nit: also, since the definition and use are so close (in terms of LoC) a short variable name (ns) may be more idiomatic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on ns being more idiomatic, fixed.

Did some digging on what the best prefix for the envvar would be. From here, seems like POD_NAMESPACE would work fine.

unlabeledNamespacesManaged bool

// Used to coordinate with label and check if this instance of Citadel should create secret
citadelNamespace string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self namespace? They are all Citadels...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -188,6 +197,20 @@ func NewSecretController(ca ca.CertificateAuthority, requireOptIn bool, certTTL
UpdateFunc: c.scrtUpdated,
})

namespaceLW := listwatch.MultiNamespaceListerWatcher(namespaces, func(namespace string) cache.ListerWatcher {
return &cache.ListWatch{
ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is it possible to pass a label selector to the NS watcher to return only relevant namespaces (at least when not managing unlabeled namespaces)?

@@ -63,11 +63,17 @@ spec:
{{- if .Values.workloadCertTtl }}
- --workload-cert-ttl={{ .Values.workloadCertTtl }}
{{- end }}
- --manage-unlabeled={{ .Values.manageUnlabeled }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is set, so it reduces the upgrade risks ( version in the repo not having the flag )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, agreed. I was having a hard time figuring out how to accomplish this, since

          {{- if .Values.manageUnlabeled }}
            - --manage-unlabeled={{ .Values.manageUnlabeled }}
          {{- end }}

would actually leave the CLI flag out altogether in the case that the user set it to false (since Helm does not distinguish between the value not existing and the value existing as false in this context).

Any ideas?

@@ -14,6 +14,9 @@ citadelHealthCheck: false
# 90*24hour = 2160h
workloadCertTtl: 2160h

# Citadel default behavior if the ca.istio.io/managed namespace label not found
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we be consistent with what injector uses ?
Also I think we agreed that it won't be a boolean in the label.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this specifies behavior when no label is found, what citadel is supposed to do.

the actual label is not using boolean.

@@ -160,11 +158,10 @@ func init() {
flags.StringVar(&opts.listenedNamespaces, "listened-namespaces", metav1.NamespaceAll,
"Select the namespaces for the Citadel to listen to, separated by comma. If unspecified, Citadel tries to use the ${"+
cmd.ListenedNamespaceKey+"} environment variable. If neither is set, Citadel listens to all namespaces.")
flags.BoolVar(&opts.manageUnlabeled, "manage-unlabeled", true, "Determines whether Citadel"+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed not adding flags, but instead using a config file that can be updated without restart.
Adding values.yaml or the citadel subsection to a config map would be better than flag, and avoid the complexity of reading config, installer adding flag, restart, etc.

I believe Galley already does this with viper package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this would be ideal, but should it perhaps be done in a separate PR? If we decide we want Citadel to support hot reloading config, we would want to move other flags into the config map as well. Might be a task worthy of its own review

@@ -41,6 +41,10 @@ const (
// The Istio secret annotation type
IstioSecretType = "istio.io/key-and-cert"

// Label which determines whether Citadel should watch namespace
NamespaceManagedLabel = "ca.istio.io/secret-controller-namespace"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labels and annotations should be in pkg/ something - since they may need to be used from multiple packages.
Possibly in api.

Also - please don't use 'secret-controller-namespace' - doesn't make sense from multiple reasons.
SDS is not a secret controller, it may be remote (not in a namespace), etc.

Maybe ca.istio.io/address or ca.istio.io/provider ( this MUST be discussed cross WG, since the pattern should apply to any other component, we can't use a different way for each WG. ).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For secret-controller Can you suggest better name and explains how it fits well with SDS? If there's some doc settled done, will be even better.

As for choosing namespace, we talked with @howardjohn and learned that existing configuration for installer is using namespace, the full url sounds flexible but does not exist yet.

@@ -41,6 +41,10 @@ const (
// The Istio secret annotation type
IstioSecretType = "istio.io/key-and-cert"

// Label which determines whether Citadel should watch namespace
NamespaceManagedLabel = "ca.istio.io/secret-controller-namespace"
NamespaceIgnoreValue = "NONE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this and why ? Comments, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the NamespaceManagedLabel has the value of NamespaceIgnoreValue, all Citadels will ignore the namespace. Note that the Citadels will ignore the namespace if the label has any value that doesn't match their own namespaces, but it makes sense to have a designated one as well.

This extra magic-value for disabling a NS across all Citadels is admittedly a bit clunky, would be open to suggestions.

manageUnlabeled bool

// Used to coordinate with label and check if this instance of Citadel should create secret
selfNamespace string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we received feedback on injector that we need to support finer granularity - please use either serviceName or URL to identify the CA that manages a namespace, and match against it.

That would allow a canary CA in same namespace as the prod one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reiterating the comment from @incfly here wrt using a URI as the granularity of selection.

As for using serviceName, is there a good mechanism for a Pod to know what Service it's running under? Don't think it's as straightforward to expose this to the pod as it is to expose the namespace it's running under

@@ -188,6 +197,20 @@ func NewSecretController(ca ca.CertificateAuthority, requireOptIn bool, certTTL
UpdateFunc: c.scrtUpdated,
})

namespaceLW := listwatch.MultiNamespaceListerWatcher(namespaces, func(namespace string) cache.ListerWatcher {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this pattern of using namespace labels to manage or not is common - maybe the code should be in pkg/ (if it's not already there ?). Would avoid more fragmentation and confusion...
@geeknoid

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me, but might be done in a separate PR.

@@ -209,40 +233,27 @@ func GetSecretName(saName string) string {

// Determine if the object is "enabled" for Istio.
// Currently this looks at the list of watched namespaces and the object's namespace annotation
func (sc *SecretController) istioEnabledObject(obj metav1.Object) bool {
if _, watched := sc.namespaces[obj.GetNamespace()]; watched || !sc.explicitOptIn {
func (sc *SecretController) istioManagedObject(obj metav1.Object) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't make sense - an object can be managed by Istio, just not by this particular instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, renamed to citadelManagedObject.

Copy link
Contributor

@costinm costinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks ok - API is the primary concern, in particular consistency across WG.
Please make sure the actual label and semantics are discussed/approved in TOC or API WG, but it can't be one-off for citadel.

Also I strongly suggest moving the selection logic to common.

@costinm
Copy link
Contributor

costinm commented Jul 16, 2019 via email

@costinm
Copy link
Contributor

costinm commented Jul 16, 2019 via email

@costinm
Copy link
Contributor

costinm commented Jul 16, 2019 via email

@Monkeyanator
Copy link
Contributor Author

There definitely needs to be consistency between this configuration story and the sidecar injector's. Quoting from https://github.com/istio/installer#auto-injection, the documented behavior for the sidecar injector is:

If installed, namespaces can select the injector by setting the istio-env label on the namespace.

This corresponds to the ca.istio.io/secret-controller-namespace label, which we can rename to ca.istio.io/env or something more appropriate as discussed. As it stands, this points to a namespace, both in the sidecar injector from the installer and here, and when the time comes to use a better specifier (e.g. URI), we should migrate them together.

Only one auto-injector environment should have enableNamespacesByDefault=true, which will apply that environment to any namespace without an explicit istio-env label.

This corresponds to the manageUnlabeled CLI flag from this PR, which describes what to do with unlabeled namespaces. We could rename this to match enableNamespacesByDefault for consistency.

If istio-system has set enableNamespaceByDefault you must set istio-inject: disabled label to prevent istio-system from taking over. In this case, it is recommended to first install istio-control autoinject with the default disabled, test it, and move the default from istio-system to istio-control.

This aligns with the NamespaceIgnoreValue = "NONE" special value proposed in this PR. The alternative would be to do what the sidecar injector does and add another special override label, something like ca.istio.io:{enabled/disabled}.

I think this covers all the inconsistencies between the two configuration stories, let me know if I'm missing anything. (to the point of the complexity: the interactions between the labels and defaults is definitely tedious, but the truth table for the sidecar injector follows the same behavior)

@istio-testing istio-testing added needs-rebase Indicates a PR needs to be rebased before being merged size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR needs to be rebased before being merged labels Jul 17, 2019
@geeknoid geeknoid added cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. and removed cla: yes_old labels Jul 30, 2019
@Monkeyanator
Copy link
Contributor Author

Monkeyanator commented Jul 30, 2019

@costinm fixed the name of the option (back to enableNamespacesByDefault) 👍 as for the prefix change, I think that might take some additional thought/design- since this PR is already large enough, I say we make a separate PR for that work.

@costinm
Copy link
Contributor

costinm commented Jul 30, 2019 via email

@Monkeyanator
Copy link
Contributor Author

/test istio-unit-tests-master

1 similar comment
@Monkeyanator
Copy link
Contributor Author

/test istio-unit-tests-master

@Monkeyanator
Copy link
Contributor Author

/test istio-unit-tests-master

Copy link

@incfly incfly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

some nitpicks on comments, otherwise good to go.
also feel free to address as a separate PR though.

@@ -14,6 +14,16 @@ citadelHealthCheck: false
# 90*24hour = 2160h
workloadCertTtl: 2160h

# Determines Citadel default behavior if the ca.istio.io/env or ca.istio.io/override
# namespace are not found on a given namespace.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/namespace/label?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xxx and yyy label are not found a given namespac.e

// whether or not a given Citadel instance should operate on a namespace. The behavior is as follows:
// 1) If NamespaceOverrideLabel exists and is valid, follow what this label tells us
// 2) If not, check NamespaceManagedLabel. If the value matches the Citadel instance's NS, then it should be active
// 3) If NamespaceManagedLabel nonexistent or invalid, follow enableNamespacesByDefault environment variable
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow enableNamespaceByDefault, controlled by CITALDE_XXX environment variable.

the first is the field name, the second us user facing config, the actual env var name.

ns, err := sc.core.Namespaces().Get(obj.GetNamespace(), metav1.GetOptions{})
if err != nil || ns == nil { // @todo handle errors? Unit tests mocks don't create NS, only secrets
return enabled
if err != nil || ns == nil { // if we can't retrieve namespace details, fall back on default value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, before this PR we don't check namespace label details, thus always 100% to proceed with handling secrets/key generation.

Fallback to default behavior seems align with existing behavior better.

@Monkeyanator
Copy link
Contributor Author

@incfly since they're so minor I'll go ahead and fix the nits now

/hold

Introduce new designated labels ca.istio.io/env and ca.istio.io/override. Check for presence of these options before creating secrets for a
ServiceAccount, and ensure the Citadel instance is in the namespace
designated in the label value or has override label set. The enableNamespacesByDefault option determines the default behavior
in the case that these labels are not found.

Added logic for retroactive creation and deletion of secrets in
activated / deactivated namespaces.
@Monkeyanator
Copy link
Contributor Author

/hold cancel

@incfly
Copy link

incfly commented Jul 31, 2019

/lgtm
/approve

@istio-testing
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: costinm, incfly, Monkeyanator

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Monkeyanator
Copy link
Contributor Author

/test istio-pilot-e2e-envoyv2-v1alpha3-master

@istio-testing
Copy link
Collaborator

istio-testing commented Aug 1, 2019

@Monkeyanator: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/istio-lint.sh c25d9fc788830fa42e17ce5d8acfe0958b709490 link /test istio-lint-master
prow/e2e-bookInfoTests-v1alpha3.sh c25d9fc788830fa42e17ce5d8acfe0958b709490 link /test e2e-bookInfoTests-envoyv2-v1alpha3-master
prow/integ-mixer-k8s-presubmit-tests.sh c25d9fc788830fa42e17ce5d8acfe0958b709490 link /test integ-mixer-k8s-presubmit-tests-master
prow/integ-new-installer-k8s-presubmit-tests.sh c25d9fc788830fa42e17ce5d8acfe0958b709490 link /test integ-new-install-k8s-presubmit-tests-master
integ-new-install-k8s-presubmit-tests-master bc0b1dc link /test integ-new-install-k8s-presubmit-tests-master

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Monkeyanator
Copy link
Contributor Author

/test istio-racetest-master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants