
server: Add optional auth token #736

Open

cgwalters wants to merge 1 commit into master from cgwalters:mcs-master-worker-separation

Conversation

7 participants
@cgwalters (Contributor) commented May 10, 2019

Since the bootstrap node serves Ignition to the masters, and today we
don't support scaling masters up/down, just disable access to
the master Ignition config in the in-cluster MCS by default.

@openshift-ci-robot openshift-ci-robot added size/L and removed size/M labels May 10, 2019

@cgwalters cgwalters changed the title server: Disable access to the master Ignition by default server: Disable access to the master Ignition by default, add optional auth token May 10, 2019

@cgwalters cgwalters force-pushed the cgwalters:mcs-master-worker-separation branch from 1af70b9 to 3bf7431 May 10, 2019

@runcom (Member) commented May 10, 2019

Code LGTM; I guess this can't hurt BYO either, since we only support workers there. Looks sane.

@cgwalters cgwalters force-pushed the cgwalters:mcs-master-worker-separation branch from 3bf7431 to cd24821 May 10, 2019

@cgwalters (Contributor, Author) commented May 10, 2019

Just for general info: I discovered that Ignition retries infinitely on HTTP status >= 500, but if you give it a 403 it gives up, so you need to de-provision the machine (e.g. `oc -n openshift-machine-api delete machine/$x`) if you were testing auth.

`aws ec2 --region=us-east-2 get-console-output --instance-id i-0bc13fd82d10aee7a` is handy for debugging things like this.
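
That distinction matters for how the MCS reports failures. A minimal sketch of the status-code choice (illustrative only, not the PR's actual handler):

```go
package server

import "net/http"

// writeAuthFailure picks a status code with Ignition's retry behavior
// in mind: Ignition retries on 5xx, so transient errors should use 503,
// while a 403 is terminal and the machine must be re-provisioned.
func writeAuthFailure(w http.ResponseWriter, transient bool) {
	if transient {
		// e.g. the API server was unreachable while fetching the secret
		http.Error(w, "temporarily unavailable, retry", http.StatusServiceUnavailable)
		return
	}
	http.Error(w, "invalid auth token", http.StatusForbidden)
}
```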

@cgwalters (Contributor, Author) commented May 10, 2019

Just writing up some more steps for testing/using this. After you do something like:

```
oc -n openshift-machine-config-operator create secret generic --from-literal=worker=$(pwgen 32 1) ignition-auth
```

your next step is `oc -n openshift-machine-api edit secret/worker-user-data`: base64-decode it to get the JSON, then ensure the Ignition request carries the same auth key you generated in the secret, e.g.:

```
{"ignition":{"config":{"append":[{"source":"https://api-int.mycluster.example.com:22623/config/worker?auth=pxzbz5n2zgfcd4kff8krv78z8wwdndvj","verification":{}}]}, ...
```
```go
@@ -31,10 +31,30 @@ type kubeconfigFunc func() (kubeconfigData []byte, rootCAData []byte, err error)
// appenderFunc appends Config.
type appenderFunc func(*ignv2_2types.Config) error

// Error is returned by the GetConfig API
type Error struct {
```

@runcom (Member) commented May 10, 2019

we might want to call it GetConfigError or something like that (it doesn't need to be uppercase/exported either, does it?)

@kikisdeliveryservice (Member) commented May 10, 2019

(@runcom I thought uppercase makes it an exported type)

@kikisdeliveryservice (Member) commented May 10, 2019

I like this naming suggestion, btw.

@runcom (Member) commented May 10, 2019

yeah, I realized only after commenting that I uppercased my suggestion 😄

@ashcrow (Member) commented May 13, 2019

Or possibly configError to be explicit.
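
For illustration, a minimal sketch of the unexported error type the reviewers are converging on (the field names here are assumptions, not the PR's actual code):

```go
package server

import "fmt"

// configError is returned by the GetConfig API when a MachineConfig
// cannot be served; it stays unexported since only this package
// constructs and inspects it.
type configError struct {
	code    int    // HTTP status to report to the client
	message string // human-readable detail
}

func (e *configError) Error() string {
	return fmt.Sprintf("%d: %s", e.code, e.message)
}
```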

@kikisdeliveryservice (Member) commented May 10, 2019

aws route53 issues

/retest

@cgwalters (Contributor, Author) commented May 10, 2019

Now a much stronger elaboration of this PR would be for the MCO to support one-time-use tokens.

Rather than having a single secret, an admin (or automation) would do e.g.:

```
oc -n openshift-machine-config-operator create secret generic --from-literal=auth=$(pwgen 32 1) ignition-auth-worker-us-east-2a-sks8v
```

The secret name is the concatenation of `ignition-auth-` with the machineAPI object name.

Then the HTTP request would have e.g. `?n=worker-us-east-2a-sks8v&auth=<secret>`.

The MCS would check for this secret's existence and, if it existed, accept the request and then delete the secret. This would also mean that access to the EC2 metadata API from a pod couldn't gain a useful token.

The machine API would need to learn how to generate per-instance userdata.
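
A hypothetical sketch of that one-time check in the MCS (the helper name is invented; it leans on the PR's `clusterServer` type and the 2019-era client-go calls without contexts, matching the diff above):

```go
import (
	"crypto/subtle"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// checkOneTimeToken validates and then burns the per-machine token,
// assuming a secret named "ignition-auth-<machine>" with an "auth" key.
func (cs *clusterServer) checkOneTimeToken(machine, token string) error {
	name := "ignition-auth-" + machine
	secret, err := cs.kubeClient.CoreV1().Secrets(componentNamespace).Get(name, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("no token registered for %s: %v", machine, err)
	}
	if subtle.ConstantTimeCompare(secret.Data["auth"], []byte(token)) != 1 {
		return fmt.Errorf("invalid token for %s", machine)
	}
	// One-time use: delete the secret so a pod that later reads the
	// cloud metadata endpoint cannot replay the token.
	return cs.kubeClient.CoreV1().Secrets(componentNamespace).Delete(name, &metav1.DeleteOptions{})
}
```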

@cgwalters (Contributor, Author) commented May 11, 2019

/retest

@cgwalters (Contributor, Author) commented May 12, 2019

OK, all tests passed on this one, which confirms what my manual testing before submitting the PR showed: nothing depends on access to the /config/master endpoint. And we now have the ability to gate access to /config/worker, though turning it on by default will require an installer PR.

I'd like to consider landing this, as we're pretty sure it's not going to break anything and it gives us the mechanism to gate access to /config/worker.

cgwalters added a commit to cgwalters/installer that referenced this pull request May 12, 2019

Generate an `ignition-auth` key and provide it to the MCS
This is an optional hardening for access to Ignition; the installer
generates a random key (separately for master/worker pool) and installs
it into the `openshift-machine-config-operator` namespace.  If the MCS
finds an `ignition-auth` secret with the `master/worker` keys, it will use it:
openshift/machine-config-operator#736

This PR just generates those secrets, so we can land it before the
MCO PR as well.
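
For reference, creating that secret with both pool keys by hand would look something like this (a sketch; the installer generates its own random values):

```
oc -n openshift-machine-config-operator create secret generic ignition-auth \
  --from-literal=master=$(pwgen 32 1) \
  --from-literal=worker=$(pwgen 32 1)
```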

@cgwalters (Contributor, Author) commented May 12, 2019

Installer PR: openshift/installer#1740

@runcom (Member) commented May 12, 2019

/approve

This LGTM. Can this land separately from the installer PR? Do we need a go-ahead from the Auth team?

@ashcrow (Member) left a comment

nits but those are nits 😄

@openshift-ci-robot commented May 13, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, cgwalters, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ashcrow,cgwalters,runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

```go
// Before:
func (cs *clusterServer) GetConfig(cr poolRequest) (*ignv2_2types.Config, error) {
	mp, err := cs.machineClient.MachineConfigPools().Get(cr.machineConfigPool, metav1.GetOptions{})

// After:
func (cs *clusterServer) GetConfig(cr poolRequest, auth string) (*ignv2_2types.Config, error) {
	authSecretObj, err := cs.kubeClient.CoreV1().Secrets(componentNamespace).Get(ignitionAuth, metav1.GetOptions{})
```

@squeed (Contributor) commented May 13, 2019

The ability to configure multiple secrets would make later rotation much easier.
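
A sketch of what a rotation-friendly check could look like, accepting any of several configured values (hypothetical; the PR as written checks a single per-pool key):

```go
package server

import (
	"crypto/subtle"
	"strings"
)

// validAuth reports whether token matches any configured secret value
// for the pool, so an admin can add a new key (e.g. "worker-2")
// before retiring the old "worker" one.
func validAuth(secretData map[string][]byte, pool, token string) bool {
	for key, value := range secretData {
		if strings.HasPrefix(key, pool) &&
			subtle.ConstantTimeCompare(value, []byte(token)) == 1 {
			return true
		}
	}
	return false
}
```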


```go
// If there's a secret, require that it was passed as a query parameter.
if authEnabled {
	authSecret := string(authSecretObj.Data[name])
```

@squeed (Contributor) commented May 13, 2019

If an administrator forgets to configure an authsecret for a new machine pool, this will fail open.

@cgwalters (Author, Contributor) commented May 13, 2019

That's intentional for backcompat reasons.
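
In other words, the check is roughly as follows (a simplified sketch of the fail-open behavior, not the PR's exact code; it reuses the `configError` type sketched earlier):

```go
package server

import (
	"crypto/subtle"
	"net/http"
)

// authorize enforces the token only when a key for the pool exists in
// the ignition-auth secret; otherwise it falls open, for backwards
// compatibility with clusters that never created the secret.
func authorize(secretData map[string][]byte, pool, auth string) *configError {
	expected, authEnabled := secretData[pool]
	if !authEnabled {
		return nil // no key configured for this pool: fail open
	}
	if subtle.ConstantTimeCompare(expected, []byte(auth)) != 1 {
		return &configError{code: http.StatusForbidden, message: "invalid auth token"}
	}
	return nil
}
```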

@crawford (Member) commented May 13, 2019

/hold

This doesn't buy us much in terms of security and it makes disaster recovery more difficult.

@cgwalters (Contributor, Author) commented May 13, 2019

> This doesn't buy us much in terms of security

If access to the EC2 metadata endpoint is shut off (which it needs to be anyways), or in cases where the bootstrap config isn't accessible at all, I think this is fairly strong security. It entirely shuts down the problem.

How is this not buying much?

> and it makes disaster recovery more difficult.

Slightly... and if desired we can switch this to just landing the code to enable auth tokens for both master/worker without actually enabling them by default. In other words, drop the change that disables the master pool.

Further, keep in mind that this code is a step towards further integration with machineAPI.

@cgwalters cgwalters force-pushed the cgwalters:mcs-master-worker-separation branch from cd24821 to e9db2c5 May 13, 2019

server: Add optional authorization to MCS
This is currently an optional hardening; an admin (or the installer)
can do e.g.:

```
oc -n openshift-machine-config-operator create secret generic --from-literal=worker=$(pwgen 32 1) ignition-auth
```

From there, PXE setups or cloud-init requests will need to provide
an `auth` query parameter when requesting `MachineConfig` objects from
the MCS.
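
To verify by hand, a request without the token should be rejected once auth is enabled, while one carrying it succeeds (a sketch; the hostname and token are the placeholders from the earlier example):

```
# 403 expected once the worker key exists in ignition-auth
curl -k "https://api-int.mycluster.example.com:22623/config/worker"
# 200 with the rendered Ignition config
curl -k "https://api-int.mycluster.example.com:22623/config/worker?auth=pxzbz5n2zgfcd4kff8krv78z8wwdndvj"
```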

@cgwalters cgwalters force-pushed the cgwalters:mcs-master-worker-separation branch from e9db2c5 to 5e6a17b May 13, 2019

@cgwalters (Contributor, Author) commented May 13, 2019

OK, dropped the master pool change; this PR is now purely optional auth handling: if the secret exists it is enforced, otherwise nothing changes, per this comment.

@kikisdeliveryservice (Member) commented May 13, 2019

uhoh, is this due to the branching that just happened?

```
could not resolve inputs: could not determine inputs for step [input:base]: could not resolve base image: imagestreamtags.image.openshift.io "4.2" not found
```

@kikisdeliveryservice (Member) commented May 13, 2019

/retest

@cgwalters cgwalters changed the title server: Disable access to the master Ignition by default, add optional auth token server: Add optional auth token May 15, 2019

@cgwalters (Contributor, Author) commented May 15, 2019

/retest

@openshift-ci-robot commented May 15, 2019

@cgwalters: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Rerun command |
| --- | --- | --- |
| ci/prow/e2e-aws | 5e6a17b | /test e2e-aws |
| ci/prow/e2e-aws-op | 5e6a17b | /test e2e-aws-op |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot commented May 15, 2019

@cgwalters: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request May 21, 2019

server: Deny serving Ignition to provisioned nodes
Ignition may contain secret data; pods running on the cluster
shouldn't have access.

A previous attempt at this was to have an [auth token](openshift#736);
but this fix doesn't require changing the installer and people's PXE setups.

The downside is that we enumerate all nodes every time someone hits the endpoint,
but today that only occurs when a new node is provisioned.

As the `TODO` says, down the line we can change the node controller to
write out this data in a form the MCS can easily read, but this
approach is quite simple/direct and solves the problem.
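
A rough sketch of that check (the helper name is invented; see the referenced PR for the real code): deny serving when the requester's IP belongs to an already-provisioned node.

```go
package server

import (
	"net"

	corev1 "k8s.io/api/core/v1"
)

// isProvisionedNode reports whether remoteAddr belongs to an existing
// node, in which case the MCS should refuse to serve Ignition to it.
func isProvisionedNode(nodes []corev1.Node, remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port in the address
	}
	for _, node := range nodes {
		for _, addr := range node.Status.Addresses {
			if addr.Address == host {
				return true
			}
		}
	}
	return false
}
```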

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request May 24, 2019

server: Deny serving Ignition to active nodes and from pods
Ignition may contain secret data; pods running on the cluster
shouldn't have access.

This PR closes off access from the IP addresses of existing nodes.
It also denies any traffic that exits from the SDN (e.g. traffic
targeting the node's own `tun0`).

A previous attempt at this was to have an [auth token](openshift#736);
but this fix doesn't require changing the installer and people's PXE setups.

The downside is that we check the SDN config and enumerate all nodes
every time someone hits the endpoint, but today that only occurs
when a new node is provisioned; and Ignition has retries anyway.

As the `TODO` says, down the line we can change the node controller to
write out this data in a form the MCS can easily read, but this
approach is quite simple/direct and solves the problem.

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request May 24, 2019

server: Deny serving Ignition to active nodes
Ignition may contain secret data; pods running on the cluster
shouldn't have access.

This PR closes off access to any IP that responds on port 22, as that
is a port that is:

 - Known to be active by default
 - Not firewalled

A previous attempt at this was to have an [auth token](openshift#736);
but this fix doesn't require changing the installer and people's PXE setups.

In the future we may reserve a port in the 9xxx range and have the
MCD respond on it, so that admins who disable or firewall SSH don't
end up with indirectly reduced security.
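
A minimal sketch of that probe (the helper name is assumed, and the one-second timeout is a guess):

```go
package server

import (
	"net"
	"time"
)

// isActiveNode treats an IP as an already-provisioned node if its
// SSH port accepts connections, since sshd is enabled and not
// firewalled by default on cluster nodes.
func isActiveNode(ip string) bool {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(ip, "22"), time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}
```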