Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Routes created with wrong dns suffix when using router shards #16797

Closed
cldmnky opened this issue Oct 11, 2017 · 13 comments
Closed

Routes created with wrong dns suffix when using router shards #16797

cldmnky opened this issue Oct 11, 2017 · 13 comments
Assignees
Labels
component/routing kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P2

Comments

@cldmnky
Copy link
Contributor

cldmnky commented Oct 11, 2017

We are using 3 router shards with route labels, serving different subdomains:
*.internal.example.com, *.trusted.example.com and *.public.example.com

  • The internal router shard is created with: ROUTE_LABELS="shard notin (public, trusted)" and ROUTER_SUBDOMAIN="${name}-${namespace}.internal.example.com"

  • The public router shard is created with: ROUTE_LABELS="shard=public" and ROUTER_SUBDOMAIN="${name}-${namespace}.public.example.com"

  • The trusted router shard is created with: ROUTE_LABELS="shard=trusted" and ROUTER_SUBDOMAIN="${name}-${namespace}.trusted.example.com"

The default RouterConfig subdomain in our master-config's is set to internal.example.com.

Setting ROUTER_OVERRIDE_HOSTNAME="true" on the routers makes everything work as expected:

  • A route created without any labels is claimed by the internal router shard and admitted as ${name}-${namespace}.internal.example.com

  • A route created with labels shard=public is claimed by the public router and admitted as ${name}-${namespace}.public.example.com.

But with the ROUTER_OVERRIDE_HOSTNAME we cannot create our own hostnames.

If we set ROUTER_OVERRIDE_HOSTNAME="false" on our routers all "empty" routes are admitted as
${name}-${namespace}.internal.example.com on our routers. So setting a route label to shard=public will create the route on the public router but with a hostname of ${name}-${namespace}.internal.example.com, thus making it impossible to direct traffic to the route.

We would expect that the route controller respects the ROUTER_SUBDOMAIN of the router when admitting empty routes claimed by a router shard when ROUTER_OVERRIDE_HOSTNAME="false" is set.

Version
kubernetes v1.6.1+5115d708d7
features: Basic-Auth

Server https://console.prod.xxxx.io:443
openshift v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
Steps To Reproduce

Create 2 or more router shards with different ROUTER_SUBDOMAIN's set.

@pweil- pweil- added component/routing kind/bug Categorizes issue or PR as related to a bug. priority/P2 labels Oct 11, 2017
@cldmnky
Copy link
Contributor Author

cldmnky commented Oct 17, 2017

@knobunc We did like this on the release-3.6 branch to workaround the issue:
We added a --ignore-domain flag to the openshift-router to mimic the behaviour for ROUTER_OVERRIDE_HOSTNAME="true" but only for a specific domain.

--- a/pkg/cmd/infra/router/router.go
+++ b/pkg/cmd/infra/router/router.go
@@ -30,6 +30,7 @@ type RouterSelection struct {
        ResyncInterval time.Duration

        HostnameTemplate string
+       IgnoreDomain     string
        OverrideHostname bool

        LabelSelector string
@@ -65,6 +66,7 @@ type RouterSelection struct {
 func (o *RouterSelection) Bind(flag *pflag.FlagSet) {
        flag.DurationVar(&o.ResyncInterval, "resync-interval", 10*time.Minute, "The interval at which the route list should be fully refreshed")
        flag.StringVar(&o.HostnameTemplate, "hostname-template", cmdutil.Env("ROUTER_SUBDOMAIN", ""), "If specified, a template that should be used to generate the hostname for a route without spec.host (e.g. '${name}-${namespace}.myapps.mycompany.com')")
+       flag.StringVar(&o.IgnoreDomain, "ignore-domain", cmdutil.Env("ROUTER_IGNORE_DOMAIN", ""), "If specified, a domain suffix that will make the router use the hostname-template value instead. Useful for changing routes created with the default subdomain")
        flag.BoolVar(&o.OverrideHostname, "override-hostname", cmdutil.Env("ROUTER_OVERRIDE_HOSTNAME", "") == "true", "Override the spec.host value for a route with --hostname-template")
        flag.StringVar(&o.LabelSelector, "labels", cmdutil.Env("ROUTE_LABELS", ""), "A label selector to apply to the routes to watch")
        flag.StringVar(&o.FieldSelector, "fields", cmdutil.Env("ROUTE_FIELDS", ""), "A field selector to apply to routes to watch")
@@ -85,7 +87,7 @@ func (o *RouterSelection) RouteSelectionFunc() controller.RouteHostFunc {
                return controller.HostForRoute
        }
        return func(route *routeapi.Route) string {
-               if !o.OverrideHostname && len(route.Spec.Host) > 0 {
+               if !o.OverrideHostname && len(route.Spec.Host) > 1 && !(len(o.IgnoreDomain) > 1 && strings.HasSuffix(route.Spec.Host, o.IgnoreDomain)) {
                        return route.Spec.Host
                }
                // GetNameForHost returns the ingress name for a generated route, and the route route
@@ -164,6 +166,9 @@ func (o *RouterSelection) Complete() error {
        if len(o.HostnameTemplate) == 0 && o.OverrideHostname {
                return fmt.Errorf("--override-hostname requires that --hostname-template be specified")
        }
+       if len(o.IgnoreDomain) > 0 && len(o.HostnameTemplate) == 0 {
+               return fmt.Errorf("--ingnore-domain requires that --hostname-template be specified")
+       }
        if len(o.LabelSelector) > 0 {
                s, err := labels.Parse(o.LabelSelector)
                if err != nil {

@openshift-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 23, 2018
@codrinbucur
Copy link

Can this be transformed in a pull request and bring the change into OCP 3.6 or 3.7?

@cldmnky
Copy link
Contributor Author

cldmnky commented Mar 13, 2018

@codrinbucur sure, will create a PR for this, not sure though if our hack is that generic, will try to enhance it.

@codrinbucur
Copy link

@cldmnky - Thanks!

@ramr
Copy link
Contributor

ramr commented Apr 12, 2018

@cldmnky there's two router options that control blacklisting and whitelisting route domains.
Documentation at: https://docs.openshift.org/latest/architecture/networking/routes.html#architecture-core-concepts-routes-deny-allow

So doing something like:

oc set env dc/public-router-dc ROUTER_DENIED_DOMAINS="internal.example.com"

should work for you on your public and/or trusted router.
Its doing the same check as ignore domains - just on a set of domains. In a similar vein, there's also domain whitelist (allowed domains) that you can use if you know the list of domains on routes that you want a specific router to serve.

Please let us know if that works. Thx

@cldmnky
Copy link
Contributor Author

cldmnky commented Apr 17, 2018

@ramr I just tried with the ROUTER_DENIED_DOMAINS, but now the route just gets rejected by the router's admission checks:

oc describe deploymentconfig staging-router -n default
Name:		staging-router
Namespace:	default
...
    Environment:
      ....
      ROUTER_OVERRIDE_HOSTNAME:			false
      ...
      ROUTER_SERVICE_NAME:			staging-router
      ROUTER_SERVICE_NAMESPACE:			default
      ROUTER_SUBDOMAIN:				${name}-${namespace}.staging.example.com
      ROUTE_LABELS:				tier=staging
      ...
      ROUTER_DENIED_DOMAINS:			internal.example.com

And creating the route with an empty hostname and labels: tier=staging:

test-xpr-bloggar-test.internal.example.com
 Rejected by router 'staging-router' 9 minutes ago

@cldmnky
Copy link
Contributor Author

cldmnky commented Apr 17, 2018

The default router domain must be set in the master-config.yaml, and when creating a route without a hostname the controller sets the hostname to the default router domain, thus never allowing us to use the ROUTER_SUBDOMAIN env template on the router...

@ramr
Copy link
Contributor

ramr commented Apr 18, 2018

@cldmnky aah my bad. I misread this initially as you wanted to reject the routes with the *.internal.example.com names in your prod router.
But from your comments above and re-reading this, it looks like you want to process routes which were created with no host names [and got defaulted] and set them specific to the router environment you are running under.

ramr added a commit to ramr/origin that referenced this issue Apr 18, 2018
…ced with

the subdomain/hostname-template the router environment is running under.
Fixes issue openshift#16797
@ramr
Copy link
Contributor

ramr commented Apr 18, 2018

@cldmnky I took a stab at making this more generic. I still need to add some tests to it but if you can take a look at PR: #19418 and see if that addresses your problem as well. Thx

ramr added a commit to ramr/origin that referenced this issue Apr 20, 2018
with the subdomain/hostname-template that the router environment is
running under.
Fixes issue openshift#16797
@cldmnky
Copy link
Contributor Author

cldmnky commented Apr 20, 2018

The PR #19418 works fine!

And while were at it, oc create route doesn't allow us to set route labels. That would be nice, I'll open a separate issue for that!

ramr added a commit to ramr/origin that referenced this issue Apr 24, 2018
with the subdomain/hostname-template that the router environment is
running under.
Fixes issue openshift#16797
ramr added a commit to ramr/origin that referenced this issue Apr 30, 2018
with the subdomain/hostname-template that the router environment is
running under. And fix typos as per @knobunc review
Fixes issue openshift#16797
ramr added a commit to ramr/origin that referenced this issue Apr 30, 2018
with the subdomain/hostname-template that the router environment is
running under. And fix typos as per @knobunc review
Fixes issue openshift#16797
ramr added a commit to ramr/origin that referenced this issue Apr 30, 2018
with the subdomain/hostname-template that the router environment is
running under. And fix typos as per @knobunc review
Fixes issue openshift#16797
@openshift-bot
Copy link
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 20, 2018
@openshift-bot
Copy link
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/routing kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P2
Projects
None yet
Development

No branches or pull requests

7 participants