Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-1705: Trim ACL names according to RFC1123 #1281

Merged
merged 1 commit into from Sep 28, 2022

Conversation

tssurya
Copy link
Contributor

@tssurya tssurya commented Sep 27, 2022

We didn't consider this tiny hack that we do

name = fmt.Sprintf("%.63s", name)
there when we wrote #1259 and unit tests don't actually scream loudly for longer names.

Without this PR we break users who have namespace names longer than 45 characters.

Signed-off-by: Surya Seetharaman suryaseetharaman.9@gmail.com
Co-authored-by: Patryk Diak pdiak@redhat.com
(cherry picked from commit b10ed7b)

/cc @trozet and @jcaamano

We didn't consider this tiny hack that we do
https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60
there when we wrote openshift#1259
and unit tests don't actually scream loudly for longer names.

Without this PR we break users who have namespace names
longer than 45 characters.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Co-authored-by: Patryk Diak <pdiak@redhat.com>
(cherry picked from commit b10ed7b)
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 27, 2022

@tssurya: GitHub didn't allow me to request PR reviews from the following users: and.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

We didn't consider this tiny hack that we do

name = fmt.Sprintf("%.63s", name)
there when we wrote #1259 and unit tests don't actually scream loudly for longer names.

Without this PR we break users who have namespace names longer than 45 characters.

Signed-off-by: Surya Seetharaman suryaseetharaman.9@gmail.com
Co-authored-by: Patryk Diak pdiak@redhat.com
(cherry picked from commit b10ed7b)

/cc @trozet and @jcaamano

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tssurya tssurya changed the title OCBUGS-1705: Trim ACL names according to RFC1123 OCPBUGS-1705: Trim ACL names according to RFC1123 Sep 27, 2022
@openshift-ci-robot
Copy link
Contributor

@tssurya: This pull request references Jira Issue OCPBUGS-1705, which is invalid:

  • expected the bug to target the "4.12.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

We didn't consider this tiny hack that we do

name = fmt.Sprintf("%.63s", name)
there when we wrote #1259 and unit tests don't actually scream loudly for longer names.

Without this PR we break users who have namespace names longer than 45 characters.

Signed-off-by: Surya Seetharaman suryaseetharaman.9@gmail.com
Co-authored-by: Patryk Diak pdiak@redhat.com
(cherry picked from commit b10ed7b)

/cc @trozet and @jcaamano

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Sep 27, 2022
@tssurya
Copy link
Contributor Author

tssurya commented Sep 27, 2022

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Sep 27, 2022
@openshift-ci-robot
Copy link
Contributor

@tssurya: This pull request references Jira Issue OCPBUGS-1705, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.12.0) matches configured target version for branch (4.12.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2022
@trozet
Copy link
Contributor

trozet commented Sep 27, 2022

/approved

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 27, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: trozet, tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 27, 2022
@trozet
Copy link
Contributor

trozet commented Sep 27, 2022

/retest-required

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD e130625 and 2 for PR HEAD 54381b0 in total

@tssurya
Copy link
Contributor Author

tssurya commented Sep 28, 2022

/retest

2 similar comments
@tssurya
Copy link
Contributor Author

tssurya commented Sep 28, 2022

/retest

@tssurya
Copy link
Contributor Author

tssurya commented Sep 28, 2022

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 28, 2022

@tssurya: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-e2e-gcp-ovn 54381b0 link false /test okd-e2e-gcp-ovn
ci/prow/e2e-openstack-ovn 54381b0 link false /test e2e-openstack-ovn
ci/prow/e2e-hypershift 54381b0 link false /test e2e-hypershift

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 337a94c into openshift:master Sep 28, 2022
@openshift-ci-robot
Copy link
Contributor

@tssurya: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-1705 has been moved to the MODIFIED state.

In response to this:

We didn't consider this tiny hack that we do

name = fmt.Sprintf("%.63s", name)
there when we wrote #1259 and unit tests don't actually scream loudly for longer names.

Without this PR we break users who have namespace names longer than 45 characters.

Signed-off-by: Surya Seetharaman suryaseetharaman.9@gmail.com
Co-authored-by: Patryk Diak pdiak@redhat.com
(cherry picked from commit b10ed7b)

/cc @trozet and @jcaamano

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jcaamano
Copy link
Contributor

jcaamano commented Oct 3, 2022

@tssurya

In respect to #1282 (comment)

It looks like the problem we were having is actually saving us from bigger problems.

The whole logic of updateStaleDefaultDenyACLNames assumes that we have full un-cropped ACLs names:

	p := func(item *nbdb.ACL) bool {
		return item.ExternalIDs[defaultDenyPolicyTypeACLExtIdKey] == string(npType) && // default-deny-policy-type:Egress or default-deny-policy-type:Ingress
			strings.Contains(item.Match, gressSuffix) && // Match:inport ==	@ablah80448_egressDefaultDeny or Match:inport == @ablah80448_ingressDefaultDeny
			!strings.Contains(*item.Name, arpAllowPolicySuffix) && // != name: namespace_ARPallowPolicy
			!strings.Contains(*item.Name, gressSuffix) // filter out already converted ACLs (we still continue to pick the cases where names were >45 chars)
	}

We need to assume that the name might be cropped.
A more proper predicate would check if the name ends (portion after the _) with a substring of the policy name.

Additionally (not directly relevant to this PR), I think that we should short-term (and to be back-portable):

  • stop using the name (and external ids) in isEquivalentACL
  • stop using DeleteACLs (ACLs should always be removed via garbage collection after removing them from PGs and switches, and never directly)
    cc @npinaeva I don't know if your ACL work is scoped in 4.12 but we should get this done for 4.12 at least and probably backport it. Let me know your plans!

The units tests don't actually account for a long namespace name :/

@npinaeva
Copy link
Member

npinaeva commented Oct 4, 2022

Oh I didn't know we are testing long namespaces, there should be more that 1 bug for that case.
My plan was to get rid of isEquivalentACL which is a big change, I don't know yet if I need to backport it
I can try to make some immediate changes to prevent cropped name bugs, need to investigate some aspects
About deleting ACLs, I think in general it's better to be explicit that ACLs need to be deleted, and not just garbage-collected. I need to check if it will throw an error if ACL I am trying to delete still has a reference? If not, there is not much difference

@jcaamano
Copy link
Contributor

jcaamano commented Oct 4, 2022

Oh I didn't know we are testing long namespaces, there should be more that 1 bug for that case. My plan was to get rid of isEquivalentACL which is a big change, I don't know yet if I need to backport it I can try to make some immediate changes to prevent cropped name bugs, need to investigate some aspects About deleting ACLs, I think in general it's better to be explicit that ACLs need to be deleted, and not just garbage-collected. I need to check if it will throw an error if ACL I am trying to delete still has a reference? If not, there is not much difference

They will throw an error since they are strong references. And we are currently not able to verify if an ACL is referenced in more than one place. And an efficient implementation would probably reuse ACLs in different places if possible but keeping track of where would be complex from a client perspective, making this garbage collection a meaningful server feature to have and rely on.
Anyway, if we are able to verify ACLs are only referenced once, I would still not export any DeleteACLs method from libovsdbops and handle it internally, even if explicit.

@jcaamano
Copy link
Contributor

jcaamano commented Oct 4, 2022

My plan was to get rid of isEquivalentACL

For 4.12?

@npinaeva
Copy link
Member

npinaeva commented Oct 4, 2022

And an efficient implementation would probably reuse ACLs in different places if possible

ovn is taking care of that, it will create only 1 logical_flow for equivalent ACLs

I would still not export any DeleteACLs method from libovsdbops and handle it internally, even if explicit

What kind of interface do you mean here? I still would expect to have DeleteACL if I have CreateACL function?

For 4.12?

I don't have a timeline yet, it wasn't a high priority issue :( We may need to discuss if it is safe and makes sense to merge that to 4.12

@jcaamano
Copy link
Contributor

jcaamano commented Oct 4, 2022

ovn is taking care of that, it will create only 1 logical_flow for equivalent ACLs

This is not the same thing. Less NB DB space consumed and less northd processing time.

What kind of interface do you mean here? I still would expect to have DeleteACL if I have CreateACL function?

You can expect it, but you don't necessarily need to have it. Or we can have it and panic with the appropriate explanation so that no one is tempted to add it.

I don't have a timeline yet, it wasn't a high priority issue :( We may need to discuss if it is safe and makes sense to merge that to 4.12

No, in that case I think we should have a separate bug that specifically targets this issue and can be assigned.

@tssurya
Copy link
Contributor Author

tssurya commented Oct 4, 2022

I don't have a timeline yet, it wasn't a high priority issue :( We may need to discuss if it is safe and makes sense to merge that to 4.12

No, in that case I think we should have a separate bug that specifically targets this issue and can be assigned.

I have opened https://issues.redhat.com/browse/OCPBUGS-1958 -> for the bug fix we need to take the approach we decided which is either remove the second case match in isEquivalentACL or fix the externalID part to be unique.

The long term fix which is a bit different needs to be discussed in the team meeting.

@npinaeva
Copy link
Member

npinaeva commented Oct 5, 2022

I am working on a long term fix for at least a month already: https://issues.redhat.com/browse/SDN-3447
If we need a faster bug fix, I would prefer this bug to be a part of the same effort ^, so we don't develop 2 different ways to fix the same problem.

@tssurya
Copy link
Contributor Author

tssurya commented Oct 5, 2022

I am working on a long term fix for at least a month already: https://issues.redhat.com/browse/SDN-3447 If we need a faster bug fix, I would prefer this bug to be a part of the same effort ^, so we don't develop 2 different ways to fix the same problem.

We can't backport the externalIDs fix and this current bug needs to be backportable, the externalIDs effort around telling who is the CMS is different from this bug. Thanks for creating the JIRA card! The bug needs to be fixed nevertheless. Let's talk more in the meeting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants