Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

describe one-stop-shopping for exposing customized routes #577

Merged
merged 3 commits into from
Feb 24, 2021

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Jan 8, 2021

Summary

Add CustomizeableRoute{Spec,Status} to IngressSpec in a way that allows multiple operators to provide information about Routes (or Ingresses, or maybe even something else), that a cluster-admin wants to provide custom names and/or custom serving certificates for.

Motivation

Some customers do not allow wildcard serving certificates for the openshift ingress to use, so they have to provide individual
serving certificates to each component. Those customers need a way to see all the routes they need to customize and a way to configure them. A stock installation has the following routes which may need customization of both names and serving cert/keys

  1. openshift-authentication oauth-openshift
  2. openshift-console console
  3. openshift-console downloads
  4. openshift-monitoring alertmanager-main
  5. openshift-monitoring grafana
  6. openshift-monitoring prometheus-k8s
  7. openshift-monitoring thanos-querier

It is just as easy to create a generic solution that provides one stop shopping for cluster-admins as it is to produce a series of one-off solutions. This proposes a single point of configuration because we like cluster-admins and we dislike writing multiple pages of slightly different documentation.

@sttts @Miciah @ironcladlou @stlaz @mfojtik

Copy link
Contributor

@Miciah Miciah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable. Is allowing configuration using installer manifests a goal or non-goal?

4. openshift-monitoring alertmanager-main
5. openshift-monitoring grafana
6. openshift-monitoring prometheus-k8s
7. openshift-monitoring thanos-querier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the image registry (which already has an API to configure its routes).


### Non-Goals

1. provide a way to specify SNI serving certificates for operands.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the distinction between specifying SNI serving certificates for operands and specifying ServingCertKeyPairSecret?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the distinction between specifying SNI serving certificates for operands and specifying ServingCertKeyPairSecret?

Yes. The cert/key pair is a single serving certificate, but an SNI configuration requires mapping ServerName to cert/key pairs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that the ServerName is taken from the cert, but there is no custom configuration beyond that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that the ServerName is taken from the cert, but there is no custom configuration beyond that?

I wouldn't even take that. I would use this as the default certificate for serving and configure SNI for the certificates that an operator controls, like service serving cert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaik today you would have to create a separate Route to host a different SNI certificate pointing pointing to a different hostname.

The reality though is that (in case of monitoring) we are deploying just one Route object and thus allow only tweaking only one hostname.

So keeping a 1:1 relation between a Route and a CustomizeableRoutes makes sort of sense. If you want to add additional SNI hosts, you'd have to support multiple routes in the original operator.

// .spec.customizeableRoutes[index].{namespace,name}.
// To determine the set of possible keys, look at .status.customizeableRoutes where participating operators place
// current route status keyed the same way.
CustomizeableRoutes []CustomizeableRouteSpec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customizeable seems redundant. Maybe ComponentRoutes? (This naming suggestion is inspired by CVO's ComponentOverride. Naming things is hard.)


type CustomizeableRouteSpec struct{
// namespace is the namespace of the route to customize. It must be a real namespace.
Namespace string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Namespace needed? It seems unnecessary and confusing because (a) it cannot be customized, (b) the reader may naturally pair it with Name, and (c) the reader may naturally pair it with ServingCertKeyPairSecret, so the most correct interpretation (which is basically to ignore it) is the least probable (IMO).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to modify a route in a given namespace, no? Having namespace therefore seems natural to me. My 2c

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have two routes with the same logical CustomizeableRoute name but in different namespaces?

Copy link
Contributor Author

@deads2k deads2k Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have two routes with the same logical CustomizeableRoute name but in different namespaces?

api or apiserver seemed like natural ones that would conflict if you don't use namespaces, but when namespace qualified do not conflict.

Namespaces are a way to subdivide our operators and operands today to ensure no conflicts.

Also, the idea of being able to allow OLM installed operators to participate means (I think) that the same operator may be installed multiple times in different namespaces.

// this name.
Name string
// desiredHost is the name that a cluster-admin wants to specify
DesiredHost string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in spec, so isn't "desired" redundant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is a name? Be more precise. A name is two lines above already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a similar logic in the console-operator where, in case of a custom route a hostname and a secret reference has to be specified.
Check: https://github.com/openshift/api/blob/master/operator/v1/types_console.go#L51-L63

Would drop the desired and just use hostname as we use in https://github.com/openshift/api/blob/master/operator/v1/types_console.go#L53

// this name.
Name string
// defaultHost is the normal name of this route.
DefaultHost string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? We don't usually have status fields to say what value we would use if spec didn't specify a value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? We don't usually have status fields to say what value we would use if spec didn't specify a value

Recognizeability. One mapping is namespace/name. Another likely mapping is "this is the name of the thing I want to customize". Displaying both in status allows a user to find names based on either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

When console is accessed via console.foo.com, how does it know which auth and which api name it should use? I understand it would pick DefaultHost here and forward the user to the right auth URL or to use the right api URL in the kubeconfig the user can download.

@deads2k is this your understanding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also means that if you have multiple domains, e.g. console.public-internet.com and console.intranet, the console would have no way to know the matching auth and api URL. @deads2k is that correct?

@Anandnatraj do we have the requirement for such a "polymorphic" mapping of URLs, depending on which console URL the user accesses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When console is accessed via console.foo.com, how does it know which auth and which api name it should use? I understand it would pick DefaultHost here and forward the user to the right auth URL or to use the right api URL in the kubeconfig the user can download.

@deads2k is this your understanding?

I would probably choose the first .status.currentHosts value.

This also means that if you have multiple domains, e.g. console.public-internet.com and console.intranet, the console would have no way to know the matching auth and api URL. @deads2k is that correct?

I don't think this makes that worse one way or the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this makes that worse one way or the other.

the choice is important if not all URLs can be accessed from the internet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this makes that worse one way or the other.

the choice is important if not all URLs can be accessed from the internet.

The route API doesn't know which it is either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not talking about the route, but about console and oauth-server choosing the right URL.


// conditions are available, degraded, progressing. This allows consistent reporting back and feedback that is well
// structured. These particular conditions have worked very well in ClusterOperators.
Conditions []ConfigCondition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to re-use status condition types, but we need to define their semantics in this context. Does the owning operator (and only that operator) manage status conditions (e.g., console-operator exclusively manages status of the "console" and "console-downloads" entries)? Is the same operator expected to perform health probes and set Degraded=True if the route is unreachable or has misconfigured TLS?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it would be nice to have a section describing what available/degraded/progressing mean in this context

It would also be possible (though I'm less sure someone would use it), to provide a route injection option for serving cert/key
pairs.

Library-go is also an ideal spot to provide functionality to provide request route hosts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean "to provide request route hosts"?

// CustomizeableRoutes is a list of routes that a cluster-admin wants to customize. It is logically keyed by
// .spec.customizeableRoutes[index].{namespace,name}.
// To determine the set of possible keys, look at .status.customizeableRoutes where participating operators place
// current route status keyed the same way.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an entry in spec that does not have a corresponding entry in status dead config? Will an admission controller inhibit such entries in spec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an entry in spec that does not have a corresponding entry in status dead config? Will an admission controller inhibit such entries in spec?

I was thinking dead, not prohibited. If you prohibit, then you have an ordering problem. I will update the enhancement.


// CustomizeableRoutes is where participating operators place the current route status for routes which the cluster-admin
// can customize hostnames and serving certificates.
// How the operator uses that serving certificate is up to the individual operator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an operator responsible for deleting any entries it has added here when it is removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an operator responsible for deleting any entries it has added here when it is removed?

Yes, will clarify.

observer that reads the ingresses.spec, handles the specified names, sets up the specific list/watch, and takes care of
copying the secret so it can be mounted.

It would also be possible (though I'm less sure someone would use it), to provide a route injection option for serving cert/key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that the only option if your routes use reencrypt/edge termination?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that for passthrough routes, the operator (which has a serviceaccount in ConsumingUsers) reads the certificate and configures the operand to use it as the serving certificate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was rather reacting to the note though I'm less sure someone would use it, since I think the authentication operator is the only component using passthrough routes I would expect that actually everyone else is going to use route injection.

type IngressSpec struct {
// other fields

// CustomizeableRoutes is a list of routes that a cluster-admin wants to customize. It is logically keyed by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

language corner: doesn't English usually omit the "e" when the "-able" suffix gets used?

Copy link
Contributor Author

@deads2k deads2k Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

language corner: doesn't English usually omit the "e" when the "-able" suffix gets used?

heh, now I'm slightly embarrassed. ;)


type CustomizeableRouteSpec struct{
// namespace is the namespace of the route to customize. It must be a real namespace.
Namespace string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to modify a route in a given namespace, no? Having namespace therefore seems natural to me. My 2c


// conditions are available, degraded, progressing. This allows consistent reporting back and feedback that is well
// structured. These particular conditions have worked very well in ClusterOperators.
Conditions []ConfigCondition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it would be nice to have a section describing what available/degraded/progressing mean in this context

Conditions []ConfigCondition

// relatedObjects allows listing resources which are useful when debugging or inspecting how this is applied.
// They may be aggregated into an overal status RelatedObjects to be automatically by oc adm inspect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: to be automatically shown (?) by oc adm inspect


#### library-go usage
Because this will be leveraged by the authentication operator (and possibly the console operator), we can provide a config
observer that reads the ingresses.spec, handles the specified names, sets up the specific list/watch, and takes care of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about validation of the specific spec fields, would library-go serve as the model implementation even for operators that don't use it? Consider adding a section on validation of the specific fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, if you want consumers to do any validation, make sure it is documented, not just codified in library-go (which many core operators do not use).

1. Which components should migrate?
2. What timeline should that migration happen?
3. Which components can start from this unified API?
4. Do we allow OLM operators to participate?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have doubts about having both auto-RBAC assign and allowing OLM in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but technically nothing stops them, just permissions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this time, OLM does not support shipping routes in Operator Bundles, so the operator would need to implement support for this feature.

OLM could certainly add support mind you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean for security? If I can write Ingress status, I can let me give permissions to any openshift-config secret? @deads2k this needs a discussion why this is ok or how to mitigate the problem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That point is called out in the enhancement here but doesn't seem to have garnered much conversation.

This means that the power to mutate ingresses.config.openshift.io and ingresses.config.openshfit.io/status
will imply the power to read secrets in openshift-config. If we decide we do not wish to do this, then it will incumbent upon the cluster-admin to create the role and rolebinding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that you need to write both ingresses.config.openshift.io and ingresses.config.openshift.io/status.

- "@danmace"
- "@miciah"
- "@standa"
- "@spadgett"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sur @lilic as well

5. openshift-monitoring grafana
6. openshift-monitoring prometheus-k8s
7. openshift-monitoring thanos-querier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to also include the openshift-logging kibana route? we currently don't allow any configuration (both with route name and cert)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sttts How do we resolve the open question on that one? Is there any reason why OLM operators would be treated differently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For components that did not try to solve this in a one-off way, the upgrade is easy, the new feature becomes available.

For components that created a one-off solution, the upgrade will vary depending on the steps they used.
I seem to recall that monitoring allowed direct route manipulation and the console created its own fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMO doesn't support any kind of route customization. IIRC it was supported by Tectonic but never fully implemented in OpenShift. cc @s-urbaniak for confirmation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, CMO does not support route customization as of today.

// CustomizeableRoutes is where participating operators place the current route status for routes which the cluster-admin
// can customize hostnames and serving certificates.
// How the operator uses that serving certificate is up to the individual operator.
CustomizeableRoutes []CustomizeableRouteStatus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing details on how an operator would update the route's status. Would it be via a strategic merge patch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just via normal update. Read, change, Update.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. I wasn't sure initially about conflicting updates but IIUC it's implied that each operator deals with potential conflict resolution. Correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's the expectation for kube clients.

// That means it could be embedded into the route or used by the operand directly.
// Operands should take care to ensure that if they use passthrough termination, they properly use SNI to allow service
// DNS access to continue to function correctly.
ServingCertKeyPairSecret SecretNameReference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k will this replace the existing cert? I.e. if the installer creates console.asdfhzwdsf4322.openshift.com as external cluster URL, and on the second day the admin sets console.my-pretty-cluster.com by providing a matching cert, the old name is gone?

And if this cert is broken, he would have lost access to the cluster?

And if the admin wants both the pretty and the installer created URL to access the console, he would have to provide a multi-host certificate (via SANs)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k this is still open

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k will this replace the existing cert? I.e. if the installer creates console.asdfhzwdsf4322.openshift.com as external cluster URL, and on the second day the admin sets console.my-pretty-cluster.com by providing a matching cert, the old name is gone?

I don't think we expect operators to handle the creation of multiple routes. This cert would be used for serving content via the route and if the cluster-admin wants to adjust DNS to use a CNAME, they still can, right?

1. provide a way for cluster-admins to see all the customizeable routes in the cluster.
2. provide a single API that is consistent for every route they need to configure
3. provides a way to have an operator with scope limited permissions read the serving cert/key pairs
4. allow a cluster-admin to specify a different DNS name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. allow a cluster-admin to specify a different DNS name, but keep the installer provided one
  2. provide a fall-back in some way for the admin to access the cluster even if the cert is broken

@Anandnatraj what about 5 and 6?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #6 is up to the admin to decide whether to manually accept a certificate


### Goals

1. provide a way for cluster-admins to see all the customizeable routes in the cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does customize mean? Be precise. Is it only about giving a custom tls bundle? Or is it more? You wording sounds like the operator stops touching its routes objects and the user can do whatever he likes.

4. openshift-monitoring alertmanager-main
5. openshift-monitoring grafana
6. openshift-monitoring prometheus-k8s
7. openshift-monitoring thanos-querier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the api in this list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the api in this list?

Currently uses a different API. I'm open to collapsing, but we would have to solve how to specify SNI certificates to do so.

@pweil-
Copy link

pweil- commented Jan 18, 2021

@jhadvig FYI for console/downloads.

DefaultHost string
// ConsumingUsers is a slice of users that need to have read permission on the secrets in order to use them.
// This will usually be an operator service account.
ConsumingUsers []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in status? Can this be configured? How?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the ingress-operator will be setting those statuses ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section "Control loop to manage secret read permissions in openshift-config" references this field. As I understand it, the operator that manages the route and its associated secrets sets this field to specify that its or its operand's service account needs access to the secret so that the operator can read the certificate and inject it into the route or into the operand. For example, the console operator needs to read the secret in order to build the console route, and if the console route were passthrough, the console operator might also need inject the certificate into the console pod. So the console operator needs to specify the console operator's service account in ConsumingUsers for the console CustomizeableRoute.

// this name.
Name string
// desiredHost is the name that a cluster-admin wants to specify
DesiredHost string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a similar logic in the console-operator where, in case of a custom route a hostname and a secret reference has to be specified.
Check: https://github.com/openshift/api/blob/master/operator/v1/types_console.go#L51-L63

Would drop the desired and just use hostname as we use in https://github.com/openshift/api/blob/master/operator/v1/types_console.go#L53

### Goals

1. provide a way for cluster-admins to see all the customizeable routes in the cluster.
2. provide a single API that is consistent for every route they need to configure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that after this is implemented, we can get rid of ConsoleConfigRoute field that is configuring a custom route for the console ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that after this is implemented, we can get rid of ConsoleConfigRoute field that is configuring a custom route for the console ?

Yes, but at your leisure. I don't have a plan that requires immediate refactoring.

DefaultHost string
// ConsumingUsers is a slice of users that need to have read permission on the secrets in order to use them.
// This will usually be an operator service account.
ConsumingUsers []string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the ingress-operator will be setting those statuses ?

// name is the *logical* name of the route to customize. It does not have to be the actual name of a route resource.
// Keep in mind that this is your API for users to customize. You could later rename the route, but you cannot rename
// this name.
Name string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these names should be unique. Currently we have two routes in the console:

  • console (default route name)
  • console-custom (custom route name)

Should there be a list of these well-known routes for each concerned operator, so the given operator can match a behaviour based on the set CustomizeableRouteSpec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a list of these well-known routes for each concerned operator, so the given operator can match a behaviour based on the set CustomizeableRouteSpec?

We could write an e2e test that effectively has the list based on reading the state of the cluster at the end. I'm ok documenting a list as best we can, but I think the most effective solution is just to look at the .status.componentRoutes

@deads2k
Copy link
Contributor Author

deads2k commented Jan 18, 2021

Pushed updates to comments. Clarifications where requested and some renames.

@sttts appears concerned about SNI overall, anyone else share those concerns?

// CurrentCABundle []byte
}
```

Validation rules to be specified in the openshift/api PR, but basic restrictions on strings as you'd expect.
Copy link
Contributor

@stlaz stlaz Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also expect checks about:

  • does the referenced secret exists
  • is cert/key really a cert/key
  • does the cert match the key
  • does the cert allow serving the desired hostname (or, at least, is TLS not broken with the cert on the given hostname)
  • what have you

These all are not necessarily API validation of the object, but something that should still exist in the operator before it pushes whatever suspicious configuration it is supplied. Some of these checks might make more sense even after the configuration is pushed on.

I think we've seen operators carelessly pushing whatever to their workloads in the past and it took us some persuading to at least make them report that in a condition, so I'd like to prevent that by having a basic set of required checks in the enhancement already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These all are not necessarily API validation of the object, but something that should still exist in the operator before it pushes whatever suspicious configuration it is supplied. Some of these checks might make more sense even after the configuration is pushed on.

You cannot do these checks at API validation time, since references can change and the system is supposed to be eventually consistent with creation in any order. I'm willing to recommend, "don't apply stuff that's broken", but it must not be done at API validation time because other changes (like secret updates) can take the content in an unworkable direction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, my point is that we should add these recommendations as to how an operator should treat this new configuration before and after applying it, and how to report problems it finds.

5. openshift-monitoring grafana
6. openshift-monitoring prometheus-k8s
7. openshift-monitoring thanos-querier
8. image-registry (this isn't appearing in my installation, but @Miciah says it's there)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adambkaplan might know more?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmage @ricardomaraschini can comment on registry related items 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the registry doesn't have routes by default, but you can enable it though config.imageregistry

observer that reads the ingresses.spec, handles the specified names, sets up the specific list/watch, and takes care of
copying the secret so it can be mounted.

It would also be possible (though I'm less sure someone would use it), to provide a route injection option for serving cert/key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was rather reacting to the note though I'm less sure someone would use it, since I think the authentication operator is the only component using passthrough routes I would expect that actually everyone else is going to use route injection.

reviewers:
- "@danmace"
- "@miciah"
- "@standa"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: even all the way back when I started with GitHub, this handle was already taken, so I had to go with @stlaz

// How the operator uses that serving certificate is up to the individual operator.
// An operator that creates entries in this slice should clean them up during removal (if it can be removed).
// An operator must also handle the case of deleted status without churn.
ComponentRoutes []ComponentRouteStatus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So all operator that want to report about their routes will update status of the same resource? It can cause conflicts as the number of operators grow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So all operator that want to report about their routes will update status of the same resource? It can cause conflicts as the number of operators grow.

well, operators are controllers with conflict resolution, so I don't expect any issues.

// DNS access to continue to function correctly.
ServingCertKeyPairSecret SecretNameReference

// possible future, we could add a set of SNI mappings. I suspect most operators would not properly handle it today.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct me if I'm wrong: TLS termination is not done by the operators but still done by the router. The only capability operators have to offer is a) reconcile CustomizeableRoute objects and b) generate or modify existing Route resources they usually deploy as part of their manifests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the component. If you use terminating routes now, your operator has to wire the cert into the route.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense, thanks! We use reencrypt routes pointing to services so I guess the service CA is being used implicitly then as we don't set it explicitly in cluster-monitoring-operator.

// desiredHost is the name that a cluster-admin wants to specify
DesiredHost string
// Hostname is the host name that a cluster-admin wants to specify
Hostname string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still open question: what if there are multiple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still open question: what if there are multiple?

Routes can only specify a single name (see https://github.com/openshift/api/blob/master/route/v1/types.go#L80). I don't understand the question. Are you proposing an API that coerces operators into the creation of multiple routes?

Copy link
Contributor

@s-urbaniak s-urbaniak Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought about the same and (sorry for xrefing) as commented here #577 (comment) a 1:1 relationship between a Route and CustomizeableRoute seems reasonable.

If the user wants multiple hostnames for a given use case (i.e. multiple external hostnames for an externally visible Prometheus), the underlying operator a) would need to support creating multiple routes (one for each hostname) and b) support for allowing customizing each of them. This is equivalent to what the user (to my knowledge) needs to do, namely create seperate route per hostname.

Does that sound reasonable?

@s-urbaniak
Copy link
Contributor

Sorry for the late reply from my side, but could we add some more guidelines for operators how they are supposed to consume this API? From what I understand, based on cluster-monitoring-operator as an example:

a) Document all well-known logical CustomizeableRoute names in the Monitoring documentation, i.e. prometheus, thanos-querier, grafana, etc. This is the "API" part from the perspective of CMO.
a) Watch for those well-known CustomizeableRoute resources in the openshift-monitoring namespace.
b) Modify existing already deployed Route objects by leveraging the metadata in the logical CustomizeableRoute and more or less "copying" the relevant fields. Especially for this point it would be great to have a bit more guidance in this OEP.

Generally: the operator watching this API (in our case cluster-monitoring-operator) does not handle any low level things like TLS termination, or even reading out secrets, but just copying secret references only.

Please correct me if I'm saying anything silly here :-)

@deads2k
Copy link
Contributor Author

deads2k commented Jan 28, 2021

Sorry for the late reply from my side, but could we add some more guidelines for operators how they are supposed to consume this API? From what I understand, based on cluster-monitoring-operator as an example:

a) Document all well-known logical CustomizeableRoute names in the Monitoring documentation, i.e. prometheus, thanos-querier, grafana, etc. This is the "API" part from the perspective of CMO.
a) Watch for those well-known CustomizeableRoute resources in the openshift-monitoring namespace.
b) Modify existing already deployed Route objects by leveraging the metadata in the logical CustomizeableRoute and more or less "copying" the relevant fields. Especially for this point it would be great to have a bit more guidance in this OEP.

That is the flow proposed here. I didn't/don't have plans to make operators expose multiple routes to access their components. Though such a thing could be built, the requirement as I understand it is to create a way to avoid wildcard certs, allow vanity DNS names, and allow serving certificate customization. Supporting multiple routes appears to be outside the scope of the need.

Generally: the operator watching this API (in our case cluster-monitoring-operator) does not handle any low level things like TLS termination, or even reading out secrets, but just copying secret references only.

The route directly embeds the secret content, so the operator is responsible for doing reasonable validation on input, reflecting status back into this API, and injecting secrets where they are required.

@deads2k
Copy link
Contributor Author

deads2k commented Feb 2, 2021

/hold

waiting for information about whether we need to support multiple routes per logical component. Such an API would likely look different and significantly raise the bar for implementation by operators. At the very least, we're likely to key it differently and have a different multiplicity.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 2, 2021
@Anandnatraj
Copy link

/hold

waiting for information about whether we need to support multiple routes per logical component. Such an API would likely look different and significantly raise the bar for implementation by operators. At the very least, we're likely to key it differently and have a different multiplicity.

@deads2k @sttts based on customer requirements, we only need to only replace default route with a custom route, there is no need to support multiple routes.

@sttts
Copy link
Contributor

sttts commented Feb 9, 2021

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 9, 2021
@sttts
Copy link
Contributor

sttts commented Feb 9, 2021

@danmace @Miciah @standa @stlaz @sur @lilic can everybody please do a final pass so that we can finalize the enhancement?

Comment on lines +133 to +134
// ConsumingUsers is a slice of users that need to have read permission on the secrets in order to use them.
// This will usually be an operator service account.
Copy link
Contributor

@awgreene awgreene Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding the secret will be created in the openshift-config namespace. The consumer of this secret needs to be granted RBAC to read this secret. The only namespace information we get from the ComponentRouteStatus is the namespace of the Route. How will the namespace of the ServiceAccount we plan to grant read privs to be identified? Is there a requirement for the ServiceAccount to exist in the same namespace as the route?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a serviceaccount username is system:serviceaccount:<namespace>:<sa-name>

Is there a requirement for the ServiceAccount to exist in the same namespace as the route?

no.

@deads2k deads2k mentioned this pull request Feb 22, 2021
7 tasks
- [ ] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it needs some checks here (follow-up is fine)

// That means it could be embedded into the route or used by the operand directly.
// Operands should take care to ensure that if they use passthrough termination, they properly use SNI to allow service
// DNS access to continue to function correctly.
// SANs in the certificate are ignored, but SNI can be used to make operator managed certificates (like internal load balancers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "SANs in the cert are ignore" mean? Please clarify.

// ensures that no two components will conflict and the same component can be installed multiple times.
Namespace string
// name is the *logical* name of the route to customize. It does not have to be the actual name of a route resource.
// Keep in mind that this is your API for users to customize. You could later rename the route, but you cannot rename
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

misplaced in API docs. You don't talk to the dev here, but the admin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

misplaced in API docs. You don't talk to the dev here, but the admin.

I'm sure it will work itself out in the API PR. Tihs expresses the intent pretty clearly.

// This API does not include a mechanism to distribute trust, since the ability to write this resource would then
// allow interception. Instead, if we need such a mechanism, we can talk about creating a way to allow narrowly scoped
// updates to a configmap containing ca-bundle.crt for each ComponentRoute.
// CurrentCABundle []byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this? Scratch pad? Future enhancement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this? Scratch pad? Future enhancement?

Something to consider. Again, this will work itself out in the api PR.


#### Cluster-admin wants to customize a route
To use this, a cluster-admin would
1. Either read docs to find the namespace,name tuples or look at an existing cluster and read ingresses.config.openshift.io.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ,<space>

@sttts
Copy link
Contributor

sttts commented Feb 23, 2021

Some nits, and style issues left (first person voice is irritating in this document). Other than that SGTM.

@Miciah ready to merge?

@sttts
Copy link
Contributor

sttts commented Feb 23, 2021

@deads2k can you fix the first person voice? This is strange in a technical document. This document is not the voice of a person, but a technical document describing what teams are building and why.

@deads2k
Copy link
Contributor Author

deads2k commented Feb 24, 2021

@deads2k can you fix the first person voice? This is strange in a technical document. This document is not the voice of a person, but a technical document describing what teams are building and why.

updated for you.

@sttts
Copy link
Contributor

sttts commented Feb 24, 2021

/lgtm
/approve

Further changes as follow-up.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 24, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 24, 2021
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Feb 24, 2021
@sttts
Copy link
Contributor

sttts commented Feb 24, 2021

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet