Validate that SDN API object CIDRs are in canonical form #13508

Merged
merged 2 commits into from Apr 7, 2017

Contributor

@danwinship danwinship commented Mar 22, 2017

E.g., if you want ClusterNetwork to be "10.128.0.0/14", you have to say "10.128.0.0/14", not "10.128.0.1/14" or "10.128.32.99/14". (net.ParseCIDR() accepts the latter two because they are valid ways of referring to hosts within that network, but they aren't valid ways of referring to the network itself, which is what we want here.)
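
For illustration, the check described above can be sketched with net.ParseCIDR(), which returns both the address as written and the network with its host bits masked off. The helper name validateCIDR is an assumption made for this example and is not taken from the patch itself.

```go
package main

import (
	"fmt"
	"net"
)

// validateCIDR is a hypothetical helper (not the PR's actual code). It
// accepts a CIDR string only when the address part is the network's base
// address, i.e. no bits are set outside the mask.
func validateCIDR(cidr string) error {
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return err
	}
	// ip is the address as written; ipnet.IP is the same address with the
	// host bits cleared. They differ for non-canonical input.
	if !ip.Equal(ipnet.IP) {
		return fmt.Errorf("%q is not in canonical form (should be %q)", cidr, ipnet.String())
	}
	return nil
}

func main() {
	for _, cidr := range []string{"10.128.0.0/14", "10.128.0.1/14", "10.128.32.99/14"} {
		fmt.Printf("%-16s -> %v\n", cidr, validateCIDR(cidr))
	}
}
```

With this sketch the first value is accepted and the other two are rejected, even though net.ParseCIDR() parses all three without error.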

Tagging needs-api-review because, in particular, this could cause previously-considered-valid EgressNetworkPolicy objects to start being rejected (existing objects when they're updated, or new objects if the admin is creating them automatically from a buggy script/template or something). I'm claiming that this is a bug fix rather than an API break though, on the grounds that we are probably doing the wrong thing with policies like "allow traffic to 192.168.1.15/24" right now: we pass it to OVS as-is, and OVS interprets it as meaning "allow traffic to 192.168.1.0/24", but the user probably actually meant "allow traffic to 192.168.1.15/32". (Meaning, they wanted to allow traffic to a single host, and we're allowing traffic to the entire subnet instead.) (It's not clear that anyone is actually making this mistake in production, but if they did, this is what would happen.)
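
To make the difference between the two readings concrete, here is a small sketch that asks whether another host on the same subnet is matched by each CIDR. It uses Go's net package to stand in for the masking that the comment attributes to OVS. The helper covers and the neighbor address 192.168.1.200 are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"net"
)

// covers reports whether addr falls inside the network that cidr denotes
// once its host bits are masked off. This is a hypothetical helper that
// mimics the interpretation described above, not OVS code.
func covers(cidr, addr string) (bool, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return false, err
	}
	ip := net.ParseIP(addr)
	if ip == nil {
		return false, fmt.Errorf("invalid IP address %q", addr)
	}
	return ipnet.Contains(ip), nil
}

func main() {
	neighbor := "192.168.1.200" // hypothetical other host on the same subnet
	for _, cidr := range []string{"192.168.1.15/24", "192.168.1.15/32"} {
		ok, err := covers(cidr, neighbor)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s covers %s: %v\n", cidr, neighbor, ok)
	}
}
```

The /24 form matches the neighbor while the /32 form does not, which is the gap between what the policy does and what its author probably intended.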

(The same concern theoretically also applies to ClusterNetwork and HostSubnet objects, which are also affected by this patch, but ClusterNetwork would only be checked when restarting the master, and it should be obvious what needs to be fixed then; and the HostSubnet.Subnet field is always populated by OpenShift itself, with a correct value, so the extra validation shouldn't hurt anything there.)

@danwinship
Contributor Author

[test][testextended][extended:networking]

@danwinship
Contributor Author

@openshift/networking PTAL

@pravisankar pravisankar left a comment

LGTM

Contributor

@knobunc knobunc left a comment

LGTM. I think being strict here is good... but if the API review rejects this, is there a way to warn instead?

@danwinship
Contributor Author

> but if the API review rejects this, is there a way to warn instead?

We can glog.Warningf() and then not return an error, but that will just result in a warning in the master logs that they probably won't see. There's no way to return a warning directly to the person trying to create the bad policy.
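
The trade-off between the two options can be sketched as follows. The function checkCIDR and its strict flag are assumptions for this example, and the standard log package stands in for glog so that the snippet is self-contained.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// checkCIDR is a hypothetical helper. In strict mode a non-canonical CIDR
// is returned as an error, which is what the API client would see. In
// non-strict mode the problem is only logged on the server side (log stands
// in for glog.Warningf here) and the caller gets no indication of it.
func checkCIDR(cidr string, strict bool) error {
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return err
	}
	if ip.Equal(ipnet.IP) {
		return nil
	}
	problem := fmt.Errorf("%q is not in canonical form (should be %q)", cidr, ipnet.String())
	if strict {
		return problem
	}
	log.Printf("warning: %v", problem)
	return nil
}

func main() {
	fmt.Println("strict: ", checkCIDR("192.168.1.15/24", true))
	fmt.Println("lenient:", checkCIDR("192.168.1.15/24", false))
}
```

In the lenient case the caller receives nil, so the only trace of the problem is the log line on the server.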

Contributor

@dcbw dcbw left a comment

LGTM

@danwinship
Contributor Author

@openshift/api-review can you give a thumbs up/thumbs down on the breakage of previously-allowed-but-possibly-misinterpreted EgressNetworkPolicy objects as described in the initial comment?

@smarterclayton
Contributor

So if this is added, a previously working master could fail to start during an upgrade?

This is generally a good example of something we would not tighten validation on (impact to a cluster). If there was a prevalidation check in Ansible it would be less impactful (or at least, can prevent a serious break).

Since there are nominally security implications, and this is unlikely to surprise someone who understands networks, I think we could probably get away with it if we ensure we don't break prior to an upgrade (and document this as a breaking change).

@danwinship
Contributor Author

> So if this is added, a previously working master could fail to start during an upgrade?

Actually, it turns out, no; we were already silently fixing bad clusterNetworkCIDR and serviceNetworkCIDR values before trying to create a ClusterNetwork object from them.

The new commit in the latest push changes this to a warning rather than a silent fix, but still allows it. So, the PR will not cause any upgrade failures; it only causes problems if you try to modify "bad" objects post-upgrade.
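
A minimal sketch of that "fix it, but warn" behavior for the master config values might look like this. The helper normalizeCIDR is hypothetical and is not the code from the commit. It returns the canonical form plus a flag saying whether the input had to be corrected.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// normalizeCIDR is a hypothetical helper. It returns the canonical form of
// cidr and reports whether the input differed from it, so the caller can
// decide to warn while still accepting the value.
func normalizeCIDR(cidr string) (canonical string, changed bool, err error) {
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", false, err
	}
	return ipnet.String(), !ip.Equal(ipnet.IP), nil
}

func main() {
	configured := "10.128.32.99/14" // example value from the discussion
	canonical, changed, err := normalizeCIDR(configured)
	if err != nil {
		log.Fatalf("invalid clusterNetworkCIDR %q: %v", configured, err)
	}
	if changed {
		log.Printf("warning: clusterNetworkCIDR %q is not canonical; using %q", configured, canonical)
	}
	fmt.Println("ClusterNetwork:", canonical)
}
```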

@smarterclayton
Contributor

smarterclayton commented Apr 3, 2017 via email

@dcbw
Contributor

dcbw commented Apr 5, 2017

Updates LGTM

@knobunc
Contributor

knobunc commented Apr 5, 2017

[merge]

@pravisankar pravisankar left a comment

LGTM

@smarterclayton
Contributor

smarterclayton commented Apr 5, 2017 via email

Contributor

@pecameron pecameron left a comment

LGTM

Previously you could say things like

  clusterNetworkCIDR: "10.128.32.99/14"

and the "extra" bits in the address would just be ignored and that
would count as "10.128.0.0/14". Make it warn about this now (but still
accept it).

Eg, if you want ClusterNetwork to be "10.128.0.0/14", you have to say
"10.128.0.0/14", not "10.128.0.1/14" or "10.128.32.99/14".

All OpenShift-generated objects already did this correctly, but this
might cause previously-considered-valid EgressNetworkPolicy objects to
start failing to validate.

@openshift-bot
Contributor

Evaluated for origin test up to 3a10959

@openshift-bot
Contributor

Evaluated for origin testextended up to 3a10959

@stevekuznetsov
Contributor

Unit tests failed to compile:

# github.com/openshift/origin/pkg/sdn/api/validation
pkg/sdn/api/validation/validation_test.go:4: imported and not used: "strings"

@openshift-bot
Contributor

Evaluated for origin merge up to 3a10959

@openshift-bot
Contributor

openshift-bot commented Apr 7, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/275/) (Base Commit: 42c3bfa) (Image: devenv-rhel7_6121)

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/627/) (Base Commit: 42c3bfa)

@danwinship
Contributor Author

> Unit tests failed to compile:

yeah, I kept missing that in the output and thinking it was just some sort of infrastructure flake or something. fixed in the latest push so it should go through this time

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/128/) (Base Commit: 42c3bfa) (Extended Tests: networking)

@knobunc
Contributor

knobunc commented Apr 7, 2017

[merge]

@openshift-bot openshift-bot merged commit 8125090 into openshift:master Apr 7, 2017
@danwinship danwinship deleted the cidr-validation branch May 31, 2017 14:30