-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for --identity-external-issuer=true #3552
Add support for --identity-external-issuer=true #3552
Conversation
5a500a5
to
d2c4a64
Compare
049b28c
to
d0576a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zaharidichev, I started looking and everything makes sense to me so far 👍
So far I've only spotted one possible improvement described below. I'll follow up later with more feedback.
cli/cmd/install.go
Outdated
@@ -359,6 +362,12 @@ control plane.`, | |||
os.Exit(1) | |||
} | |||
|
|||
shouldExist := options.identityOptions.identityExternalIssuer | |||
if err := errIfLinkerdIdentityissuerSecretMissing(shouldExist); err != nil && !options.ignoreCluster { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could wrap that condition with if !options.ignoreCluster && shouldExist
and then have errIfLinkerdIdentityissuerSecretMissing()
have no args and only check when shouldExist
is true, to avoid the network calls when shouldExist
is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a matter of fact, I'm unable to run a naked linkerd install
because it's forcing me to have the namespace already. This is probably also why the integration tests are failing.
cli/cmd/install.go
Outdated
@@ -458,6 +467,11 @@ func (options *installOptions) recordableFlagSet() *pflag.FlagSet { | |||
flags.StringVarP(&options.controlPlaneVersion, "control-plane-version", "", options.controlPlaneVersion, "(Development) Tag to be used for the control plane component images") | |||
flags.MarkHidden("control-plane-version") | |||
|
|||
flags.BoolVar( | |||
&options.identityOptions.identityExternalIssuer, "identity-external-issuer", options.identityOptions.identityExternalIssuer, | |||
"Whether to use an external identity issuer (false by default)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the standard followed by other commands is (default false)
cli/cmd/install.go
Outdated
|
||
secret, err := kubeAPI.CoreV1().Secrets(controlPlaneNamespace).Get(k8s.IdentityIssuerSecretName, metav1.GetOptions{}) | ||
if (secret == nil || kerrors.IsNotFound(err)) && shouldExist { | ||
return fmt.Errorf("'linkerd-identity-issuer' needs to exist if --identity-external-issuer=true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you can be a bit more explicit with the 'linkerd-identity-issuer' secret needs to exist if...
?
Ref #3531 |
cli/cmd/install.go
Outdated
creds, err := tls.ValidateAndCreateCreds(externalIssuerData.issuerCrt, externalIssuerData.issuerKey) | ||
|
||
if err != nil { | ||
log.Fatalf("Failed to read CA from %s: %s", consts.IdentityIssuerSecretName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, log.Fatalf
isn't producing any output in the CLI so when there's an error the command just fails with an empty output. I would replace this (and the one below) with something like return nil, fmt.Errorf("Failed to read CA from %s: %s", consts.IdentityIssuerSecretName, err)
controller/identity/creds_watcher.go
Outdated
return nil, fmt.Errorf("failed to verify issuer credentials for '%s' with trust anchors: %s", fscw.expectedName, err) | ||
} | ||
|
||
log.Infof("Loaded issuer cert:\nCert: %s\nKey: %s", creds.EncodeCertificatePEM(), creds.EncodePrivateKeyPEM()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we did this elsewhere, but better not log the private key IMO
@zaharidichev Might wanna branch off the linkerd/linkerd2 repo, instead of a fork. There are a few things that CI doesn't run on forks, including Docker build and integration test, even though they show up with ✔️. |
cli/cmd/install.go
Outdated
@@ -359,6 +362,12 @@ control plane.`, | |||
os.Exit(1) | |||
} | |||
|
|||
shouldExist := options.identityOptions.identityExternalIssuer | |||
if err := errIfLinkerdIdentityissuerSecretMissing(shouldExist); err != nil && !options.ignoreCluster { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to add this check to newCmdInstallConfig
which corresponds to the linkerd install config
subcommand. You can learn more about Linkerd's multi-stage installation here.
The challenge is that recordableFlagSet
can't be part of linkerd install config
. So you will have to add --identity-external-issuer
to allStageFlagSet
. Once moved, in order for this new flag to be persisted in the linkerd-config
config map, you will have to update the proto/config.proto
pb definition, by adding a new field to IdentityContext
, which is a nested field inside Global
.
You can update the protobuf dependencies and auto-generated code with:
bin/protoc-go.sh
cli/cmd/install.go
Outdated
return nil, fmt.Errorf("error fetching external issuer config: %s", err) | ||
} | ||
|
||
_, err = kubeAPI.CoreV1().Namespaces().Get(controlPlaneNamespace, metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not needed, since errIfLinkerdIdentityissuerSecretMissing()
already perform this check.
cli/cmd/install.go
Outdated
} | ||
|
||
secret, err := kubeAPI.CoreV1().Secrets(controlPlaneNamespace).Get(k8s.IdentityIssuerSecretName, metav1.GetOptions{}) | ||
if secret == nil || kerrors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just simplify this to if err != nil
to communicate the intent of returning err
if it isn't nil
, regardless of what the error is.
cli/cmd/install.go
Outdated
|
||
secret, err := kubeAPI.CoreV1().Secrets(controlPlaneNamespace).Get(k8s.IdentityIssuerSecretName, metav1.GetOptions{}) | ||
if secret == nil || kerrors.IsNotFound(err) { | ||
return nil, fmt.Errorf("'linkerd-identity-issuer' needs to exist if --identity-external-issuer=true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer that we don't bury the original err
, but to include it in the return value. Why not just:
return nil, err
IIRC if secret
doesn't exist, won't err
just be something like:
Error from server (NotFound): secrets "foo" not found
At least, that's the behaviour of kubectl
.
cli/cmd/upgrade.go
Outdated
@@ -246,7 +246,7 @@ func (options *upgradeOptions) validateAndBuild(stage string, k kubernetes.Inter | |||
} | |||
configs.GetGlobal().IdentityContext = toIdentityContext(identity) | |||
} else { | |||
identity, err = fetchIdentityValues(k, idctx) | |||
identity, err = fetchIdentityValues(k, idctx, options.identityOptions.identityExternalIssuer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading this correctly, options.identityOptions.identityExternalIssuer
is reading the value from the CLI. We need to retrieve the value from the linkerd-config
config map. You should be able to retrieve it from the configs.GetInstall().GetFlags()
slice.
(Note that this will change if we change --identity-external-issuer
to a global field in the protobuf definition as suggested above, in order to support linkerd install config
.)
@@ -369,7 +370,15 @@ func fetchIdentityValues(k kubernetes.Interface, idctx *pb.IdentityContext) (*ch | |||
}, nil | |||
} | |||
|
|||
func fetchIssuer(k kubernetes.Interface, trustPEM string) (string, string, time.Time, error) { | |||
func fetchIssuer(k kubernetes.Interface, trustPEM string, externalIssuer bool) (string, string, time.Time, error) { | |||
crtName := k8s.IdentityIssuerCrtName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too bad that we can't just change k8s.IdentityIssuerCrtName
and k8s.IdentityIssuerKeyName
to what cert-manager produces, so that we won't need the externalIssuer
boolean here and in the identity service. Changing these constants will require an update to the variables defined in values.yaml
, which will be a breaking change for some users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, whats even worse is that there is the possibility for this to change over time and if some people are running older/newer versions of cert-manager the names of their secret keys might be different. So I wonder whether we should make all of this configurable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping it the way you have it is fine. Making it configurable can be a double-edged sword because the last thing we want is to support cases where users can accidentally provide some unsupported secret keys in the k/v pairs. If cert-manager changed their API, then it's a conscious effort for us and our users to upgrade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than passing a boolean in, can we just check for both filenames?
controller/identity/creds_watcher.go
Outdated
} | ||
|
||
// StartWatching starts watching the filesystem for cert updates | ||
func (fscw *FsCredsWatcher) StartWatching() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we are trying to do too much in this method. How about instead of sending *tls.CA
to the channel, we just send it an "event", and then let identity.Service
do most of the TLS-related stuff?
I think there might be a way to make this and the changes to pkg/identity/services.go
simpler. I am experimenting with it in this incomplete gist. The initializing and switching of the issuer
object will be interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at the diff. I though about something similar in the beginning. We can go both ways but my initial idea was that this makes the service harder to test in case we want to add some extensive testing to it.
Since the service is has quite a bit of logic in there that has to do with issuing certs, we might want to exercise some of these scenarios by having it call into tls.CA
created from different certs. If that is the case, we will have to put the issuer certs on disk and have different ones for every test case while pinging the service some dummy event. Supplying the tls.CA
via the channel allows us to mock it in a much easier way I think.
Wrt to initializing and swapping the tls.CA
, I dont see much of a problem as long as the lock is held. Maybe I am missing smth here? In any way, we can go either direction. Do not really have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats more interesting to me however is all the error handling in case we have errors from the error channel or invalid issuer certs written to the file system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of moving the TLS code out of the CredsWatcher is that we can convert it into a generic file watcher package, say pkg/watcher/file
.
In terms of testing the identity.Service
, I think certain unexported functions (like loadCredentials
if we choose to move it out of the file watcher) can be mock by using function pointers. There are some examples in cli/cmd/install.go
. Testing Certify()
will be difficult even in its current state.
If that is the case, we will have to put the issuer certs on disk and have different ones for every test case while pinging the service some dummy event.
Sounds like a good integration test. We can create a separate issue for that, as I assume there will be quite a bit of new test code.
Whats more interesting to me however is all the error handling in case we have errors from the error channel or invalid issuer certs written to the file system
Good point. My first thought is first of all, the file watcher shouldn't worry about how to handle it; the identity service should. Then, the old cert shouldn't be rotated and the identity service shouldn't go down. As to how we notify the users in addition to generate logs, we can consider publishing k8s events. I don't know if we need to tackle this here, but controller/proxy-injector/webhook.go has some good examples on recording k8s events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all great points. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work so far! Some high-ish level questions:
- It would be really great if we did not have to persist this flag value. If think if we check for both file names, as in my comment above, we will no longer need to persist it.
- Why does the flag need to apply to
install config
? as far as I can tell, it affects control-plane concerns only, not cluster concerns. Can it just be ainstall control-plane
flag? - If I understand correctly, to use an external issuer, you MUST do a multistage install:
- linkerd install config (to create the linkerd namesapce)
- external issuer create secret in linkerd namespace
- linkerd install control-plane reads the issuer secret
Does that sound right?
Let me know if I've misunderstood anything.
@@ -4,7 +4,7 @@ | |||
### | |||
### Identity Controller Service | |||
### | |||
{{ if .Identity.Issuer -}} | |||
{{ if and (.Identity.Issuer) (eq .Identity.Issuer.External false) -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still learning about helm template functions... could this be not .Identity.Issuer.External
instead?
cli/cmd/install.go
Outdated
@@ -259,6 +262,13 @@ resources for the Linkerd control plane. This command should be followed by | |||
os.Exit(1) | |||
} | |||
|
|||
shouldExist := options.identityOptions.identityExternalIssuer | |||
if !options.ignoreCluster && shouldExist { | |||
if err := errIfLinkerdIdentityIssuerSecretMissing(); err != nil && !options.ignoreCluster { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think options.ignoreCluster
shouldn't be changed at this point, and it was just checked on line 266
cli/cmd/install_test.go
Outdated
@@ -187,7 +187,7 @@ func TestRender(t *testing.T) { | |||
} | |||
} | |||
|
|||
func testInstallOptions() (*installOptions, error) { | |||
func testInstallOptions(addCertDataOptions bool) (*installOptions, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about always including the certDataOptions but manually unsetting fields in the returned options when necessary?
@@ -369,7 +370,15 @@ func fetchIdentityValues(k kubernetes.Interface, idctx *pb.IdentityContext) (*ch | |||
}, nil | |||
} | |||
|
|||
func fetchIssuer(k kubernetes.Interface, trustPEM string) (string, string, time.Time, error) { | |||
func fetchIssuer(k kubernetes.Interface, trustPEM string, externalIssuer bool) (string, string, time.Time, error) { | |||
crtName := k8s.IdentityIssuerCrtName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than passing a boolean in, can we just check for both filenames?
Can you elaborate on this? To my reasoning: if it's specified on the commandline, it should be persisted. |
Not necessarily. Some users already create their own
Yes.
Yes, you are right. Looking closer at the Helm templates, the flag isn't used until the |
Ah, let me rephrase: simply doing a single step install with
As far as I can tell, the
These are both install behaviors and the The exception to this is that the names of the cert and key files that cert-manager creates are different from the ones that Linkerd creates. Rather than using the |
My thought around this is that, from a support perspective, there are times when I'll like to know how the control plane is installed, along with the install options used. With this flag persisted, I can tell for sure at one point in an older release, the user had decided to use cert-manager, which will be helpful with certs-related issues, as per #3578 (comment), it sounds like cert-manager has some sort of revert/reconciliation mechanism, if the user tries to override the managed cert. |
@ihcsim I see. I thinkg @adleong had some very good points around the lack of need for persisting that as we can deduce even during upgrade whether cert-manager was used by simply looking at the contents of the secret. That being said though, you comments also make sense. So should we persist it after all or not. I guess it does not hurt. Should I modify it to include it in the |
@adleong ok, i see your point. I'm not 100% sold that it's worth the optimization. I don't see any harm in making the persisted flags explicitly reflect the install mechanism (i.e. for diagnostics, etc). Is there a reason that persisting the flag is harmful? |
Persisting the flag is not harmful but I think the abstraction boundary is more clear if we don't. i.e. Linkerd simply acts on the issuer secret without knowing how it was created. And as @zaharidichev said, we can reverse engineer if cert-manager was used by looking at the file names. |
I... have no idea. Try pushing a new commit? |
@zaharidichev yup it's the conflicts. ci merges your PR into master for testing. |
@zaharidichev Let's not worry about persisting the flag for now. I am more keen on getting this tested, and merged so that users can try it out in the next edge. If we see the needs to add it in the future, we just need to change its flagset to
In addition to the conflicts, as a reminder, because your branch is on a fork, certain CI steps (like Docker build and integration tests) won't run even though GitHub Action still gives you the ✔️. |
@@ -259,6 +261,7 @@ func (options *upgradeOptions) validateAndBuild(stage string, k kubernetes.Inter | |||
return nil, nil, fmt.Errorf("could not build install configuration: %s", err) | |||
} | |||
values.Identity = identity | |||
values.InstallNamespace = !identity.Issuer.External |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? My understanding of our earlier conversation around omitting the persistence of the --identity-external-issuer
flag is that its value isn't needed during upgrade. If during upgrade, the code needs to make decisions based on whether an issuer is external, then I don't think what the isExternal()
function below is doing is reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ihcsim I see. I think @adleong had some very good points around the lack of need for persisting that as we can deduce even during upgrade whether cert-manager was used by simply looking at the contents of the secret.
Well if this is not set, then an attempt to include the namespace in the produced output will be made, right? Just like I noted here, this can be deduced without persisting. If thats unreliable, lets persist it then. The isExternal()
function is doing pretty much the same thing that the identity service is doing when looking at the contents of the secret.
I dont have the full context around the potential use and edge cases here, so its quite difficult for me to decide for sure. But why would it be unreliable? If the secret contains contents from cert manager and the flag is set to not use external issuer, it will not work at all after all. Again, up to you what we do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well if this is not set, then an attempt to include the namespace in the produced output will be made, right?
My thought is that kubectl apply
will leave it alone, unless there are some differences. But yes, if the users created their own linkerd
namespace, then linkerd upgrade
should leave it alone, like what you did here.
lets persist it then.
Sure. Thanks. Sorry for the back-and-forth.
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
a8b659b
to
9a7c2de
Compare
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
closing in favor of #3600 |
This PR adds support for using cert-manager as the source of issuer certs. When we specify
--identity-external-issuer=true
the cert, key and trust anchors are loaded from the k8s secret that should already be present. As cert manager rolls certificates, the identity service monitors the path where the secret is mounted and automatically refreshes them.Signed-off-by: zaharidichev zaharidichev@gmail.com