Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1896977: Enhance API host name validation #167

Conversation

candita
Copy link
Contributor

@candita candita commented Dec 14, 2020

Change route validation to apply the stricter validation rules from IsFullyQualifiedDomainName on host name when OpenShiftAllowNonDNSCompliantHostAnnotation is false or missing: host name must have at least two labels, with each label less than 64 chars, no disallowed characters, total length less than 254 chars, and trailing dots are allowed. If OpenShiftAllowNonDNSCompliantHostAnnotation is true then use the former lenient validation of just IsDNS1123Subdomain: total length less than 254, and no disallowed characters.

Add new annotation: OpenShiftAllowNonDNSCompliantHostAnnotation = "route.openshift.io/allow-non-dns-compliant-host" to allow configuations that suppress the stricter validation. This may validate routes that cannot be admitted (same behavior as before).

Add/update unit tests.

@openshift-ci-robot openshift-ci-robot added bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Dec 14, 2020
@openshift-ci-robot
Copy link

@candita: This pull request references Bugzilla bug 1896977, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1896977: [WIP] Enhance API host name validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@candita
Copy link
Contributor Author

candita commented Dec 14, 2020

Companion PR in openshift/router: openshift/router#238

} else {
errs := kvalidation.IsFullyQualifiedDomainName(specPath.Child("host"), route.Spec.Host)
if len(errs) != 0 {
result = append(result, field.Invalid(specPath.Child("host"), route.Spec.Host, fmt.Sprintf("host must conform to DNS 1123 subdomain conventions: %v", errs)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the else branch's IsFullyQualifiedDomainName error gets associated with DNS 1123 subdomain conventions in the error message, while the previous if branch's IsDNS1123Subdomain error gets the generic DNS subdomain conventions conventions. Wouldn't IsDNS1123Subdomain be the one looking for DNS 1123 subdomain violations?

Copy link
Contributor Author

@candita candita Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsFullyQualifiedDomain calls IsDNS1123Subdomain, IsDNS1123Label, and performs several other checks, making it more closely conform to RFC1123.

expectedErrors: 1,
},
{
name: "Non-dns compliant host with override annotation",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: dns -> DNS here and in other human-oriented strings.

To: createRouteSpecTo("serviceName", "Service"),
},
},
expectedErrors: 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably out of scope for this PR, but I prefer tests for error generation to place some constraints on the error strings. For an example, see here, or, without stretchr/testify, here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat in this space, because it's a bit wiggly without actual error string comparison: do we want explicit checks for "really long subdomain, but overall hostname is short enough" showing we error on that (or not, with the override annotation)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expectedErrors: 1,
},
{
name: "Generated host name too long",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Spec.Host is explicitly set here, so "Generated host name too long" -> "Specified host name too long"? Spec.Host could have been set by default-host-generating logic, or could have been set by whoever created the Route in the first place, and this test-case doesn't need to distinguish between those two cases.

@candita candita force-pushed the BZ-1896977-APIHostnameValidation branch from 2c5d8f0 to d1eff36 Compare December 15, 2020 20:43
@candita
Copy link
Contributor Author

candita commented Dec 16, 2020

/test e2e-aws-upgrade

@candita
Copy link
Contributor Author

candita commented Dec 16, 2020

/hold for openshift/origin#25770

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 16, 2020
if len(route.Spec.Host) > 0 {
if len(kvalidation.IsDNS1123Subdomain(route.Spec.Host)) != 0 {
result = append(result, field.Invalid(specPath.Child("host"), route.Spec.Host, "host must conform to DNS 952 subdomain conventions"))
if lenient, ok := route.Annotations[oapi.OpenShiftAllowNonDNSCompliantHostAnnotation]; ok && lenient == "true" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you cannot tighten this for updates that are not changing the route.spec.host. That means you'll need to find a way to split up the update validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @deads2k , that sounds reasonable. I have made a change and updated some unit tests. PTAL when you get a chance.

@deads2k
Copy link
Contributor

deads2k commented Dec 17, 2020

as long as we don't break route REST updates and the only impact is to

  1. routes that don't work today
  2. routes that use custom routers today that may accept names our router rejects and the NE team has good reason to believe this is a very narrow use-case

I think this reasonable.

@candita
Copy link
Contributor Author

candita commented Dec 18, 2020

/unhold

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 18, 2020
@openshift-merge-robot
Copy link
Contributor

openshift-merge-robot commented Dec 18, 2020

@candita: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-cmd 11bf5e1 link /test e2e-cmd

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@candita
Copy link
Contributor Author

candita commented Dec 18, 2020

/test e2e-aws

@candita
Copy link
Contributor Author

candita commented Dec 18, 2020

/test e2e-cmd

1 similar comment
@candita
Copy link
Contributor Author

candita commented Dec 18, 2020

/test e2e-cmd

@candita
Copy link
Contributor Author

candita commented Dec 18, 2020

/retest

@candita
Copy link
Contributor Author

candita commented Dec 21, 2020

/test e2e-cmd

@candita
Copy link
Contributor Author

candita commented Dec 21, 2020

https://bugzilla.redhat.com/show_bug.cgi?id=1895107 describes the issue with the failing e2e-cmd test.

@candita
Copy link
Contributor Author

candita commented Dec 22, 2020

/test e2e-cmd

@candita
Copy link
Contributor Author

candita commented Dec 22, 2020

/hold for openshift/library-go#975

@candita
Copy link
Contributor Author

candita commented Jan 25, 2021

/assign @mfojtik

func ValidateRouteUpdate(route *routeapi.Route, older *routeapi.Route) field.ErrorList {
allErrs := validation.ValidateObjectMetaUpdate(&route.ObjectMeta, &older.ObjectMeta, field.NewPath("metadata"))
allErrs = append(allErrs, validation.ValidateImmutableField(route.Spec.WildcardPolicy, older.Spec.WildcardPolicy, field.NewPath("spec", "wildcardPolicy"))...)
if route.Spec.Host != older.Spec.Host {
allErrs = append(allErrs, ValidateHost(route)...)
}
allErrs = append(allErrs, ValidateRoute(route)...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks improper based on our previous conversation. The rules for updating are different than the rules for creating. This is a long-standing pattern in kube where we have three distinct methods with different rules

  1. validateObj - this validates objects for create semantics
  2. validateObjUpdate - this validates differently than validateObj because create calls can be tighter than update calls. Also, this should ignore status.
  3. validateObjectStatusUpdate - this validates differently than validateObjUpdate because status updates should never fail on valid specs.

Your validateRoute function should fully validate a route for the purposes of create. Because the update semantics don't match the create semantics, you cannot call one from the other. Making your validateRoute function not fully validate a route isn't a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed this, and Miciah's similar comment #167 (comment), in this PR instead of a followup PR. @deads2k can you PTAL?

@candita candita force-pushed the BZ-1896977-APIHostnameValidation branch from 132f77c to 4ddf12e Compare January 29, 2021 20:35
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2021
@candita candita force-pushed the BZ-1896977-APIHostnameValidation branch from 4ddf12e to 402c0fd Compare January 29, 2021 20:42
@candita
Copy link
Contributor Author

candita commented Jan 29, 2021

/test e2e-aws

2 similar comments
@candita
Copy link
Contributor Author

candita commented Jan 29, 2021

/test e2e-aws

@candita
Copy link
Contributor Author

candita commented Feb 1, 2021

/test e2e-aws

@candita
Copy link
Contributor Author

candita commented Feb 1, 2021

/test e2e-aws-upgrade

@candita
Copy link
Contributor Author

candita commented Feb 1, 2021

/test e2e-aws-serial

@candita
Copy link
Contributor Author

candita commented Feb 1, 2021

/test e2e-cmd

@deads2k
Copy link
Contributor

deads2k commented Feb 2, 2021

I don't think I'd personally express it this way, but the public API is acceptable.

/approve

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2021
@candita
Copy link
Contributor Author

candita commented Feb 2, 2021

/lgtm

@openshift-ci-robot
Copy link

@candita: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Miciah
Copy link
Contributor

Miciah commented Feb 2, 2021

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 2, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: candita, deads2k, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 45abde2 into openshift:master Feb 2, 2021
@openshift-ci-robot
Copy link

@candita: All pull requests linked via external trackers have merged:

Bugzilla bug 1896977 has been moved to the MODIFIED state.

In response to this:

Bug 1896977: Enhance API host name validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gabemontero
Copy link
Contributor

Hey @candita

So we recently started seeing errors in our openshift samples testing (stuff that has been around since 3.x of openshift) where creation of simple routes where the host is not specified can fail. Based on the timing and when this merged, and the nature of this change that I can gleam, I suspect it stems from these changes.

A recent run is at https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-samples-operator/355/pull-ci-openshift-cluster-samples-operator-master-e2e-aws-image-ecosystem/1358801474489946112

Here is the key snippet in the test logs:

error: Route.route.openshift.io "rails-postgresql-example" is invalid: spec.host: Invalid value: "rails-postgresql-example-e2e-test-rails-postgresql-repo-test-n22hx.apps.ci-op-x3rbtwji-fe9c0.origin-ci-int-aws.dev.rhcloud.com": host must conform to DNS 1123 naming conventions: [spec.host: Invalid value: "rails-postgresql-example-e2e-test-rails-postgresql-repo-test-n22hx": must be no more than 63 characters]

It definitely looks like we are getting the default hostname generation from https://github.com/openshift/openshift-apiserver/blob/master/pkg/route/apiserver/simplerouteallocation/plugin.go#L50-L55 where it takes the route name, namespace, and shard.DNSSuffix and concatenates them together.

In this test case, the name is rails-postgresql-example, the namespace is e2e-test-rails-postgresql-repo-test-n22hx, and the rest must be the shard.DNSSuffix from the plugin.... though that sucker alone is almost 63 characters.
But put together they definitely surpass that limit.

My question is (and I did try to see if this was captured in prior PR discussion, but my skimming did not find anything - apologies if you have to regurgitate old discussion) should we not expect our generation logic to be cognizant of the host field restrictions. ... i.e. start stripping characters from the name and/or namespace etc. to get things under 63?

Or is the recommendation that is going to come out from all this is that users are going to have to start explicitly configuring the host field or something of that ilk?

@adambkaplan @bparees FYI

@candita
Copy link
Contributor Author

candita commented Feb 8, 2021

...should we not expect our generation logic to be cognizant of the host field restrictions. ... i.e. start stripping characters from the name and/or namespace etc. to get things under 63?
Or is the recommendation that is going to come out from all this is that users are going to have to start explicitly configuring the host field or something of that ilk?

Yes, the generation logic will not do the name-shrinking for you. The workaround is to set a custom host name or use the new annotation which allows invalid host names (which make invalid routes).

The solution we decided on was to make the route creation fail faster, without introducing host names that departed from the known formula. If your test here is creating routes, they would have been invalid routes because the generated host name label is too long, and that would not have been discovered at the route-creation validation, but at the route-admission validation.

If you require a valid route, you will need to use a custom host name. If you do not require a valid route, you can use a new route annotation to override the strict validation: route.openshift.io/allow-non-dns-compliant-host, described in https://github.com/openshift/api/blob/670ac3fc997c2f1d19b8c29ef04f70d6e3d4a59e/route/v1/types.go#L287

@openshift-ci-robot
Copy link

@candita: Bugzilla bug 1896977 is in an unrecognized state (VERIFIED) and will not be moved to the MODIFIED state.

In response to this:

Bug 1896977: Enhance API host name validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gabemontero
Copy link
Contributor

gabemontero commented Feb 8, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants