Improve Policy discoverability #2128

youngnick · 2023-06-19T13:37:48Z

What type of PR is this?
/kind cleanup
/kind documentation
/kind gep

What this PR does / why we need it:
This PR updates GEP-713 - Policy Attachment - to clarify how implementations using the pattern should handle the discoverability problems it creates.

Some solutions need further GEPs - these are marked as EXPERIMENTAL DO NOT USE (to prevent similar problems from arising in the future).

Additionally, this PR removes most content from the existing "Policy Attachment" documentation page and adds a link to this GEP so that there's only one place to keep up to date.

Which issue(s) this PR fixes:

Updates #2014

Does this PR introduce a user-facing change?:

The Policy Attachment pattern has new requirements and recommendations to help with Policy discoverability.

youngnick · 2023-06-19T13:40:40Z

/hold for lots of review on this one.

youngnick · 2023-06-19T13:41:35Z

Most of the new content is best viewed in the deployment preview:

https://deploy-preview-2128--kubernetes-sigs-gateway-api.netlify.app/geps/gep-713/#status-and-the-discoverability-problem

k8s-triage-robot · 2023-06-19T16:42:56Z

Unknown CLA label state. Rechecking for CLA labels.

Send feedback to sig-contributor-experience at kubernetes/community.

/check-cla
/easycla

geps/gep-713.md

youngnick · 2023-06-20T16:20:54Z

I've just removed the change to the site page about Policy Attachment, I'll split that into a seperate PR.

frankbu · 2023-06-21T21:01:34Z

geps/gep-713.md

+
+**Ana**: _Thanks for the help. Wish we’d found better answers._ 😢
+
+### The Problem, restated


Seems to me the problem is that Ana deployed a buggy version of the service, which needs to be rolled back. The only role that the policy attachment(s) played was to somewhat mitigate the effect of the bad service. Maybe the problem would have been detected more quickly without the retry policy? Given that Ana didn't explicitly set a retry policy for her service and was therefore relying on the implementation default for her service (however that's determined) does she really need to know where exactly the default comes from? Would the easy "fix" have been to explicitly turn off the retries for her service (which is what she was expecting), which would then expose her bug more clearly?

Maybe there's a better parable that would show how a policy attachment can actually break a working service and the service owner can't disable it?

I'm just poking at the question of does it really matter where an inherited policy attachment is, that is causing problems, only that it's affecting the default value of some field?

The problem isn't that the Policy object mitigated the effects, it's that Ana had no way of knowing that's what was happening.

As I say below, there are really three things that any owner of an object that's affected by a Policy wants to know (and the later ones mostly imply that you know the earlier ones:

That some Policy is affecting a particular object

Which Policy is (or Policies are) affecting a particular object

What settings in the Policy are affecting the object.

As we go on to discuss later, there are issues with just dropping these details into the status of objects, and so this is trying to give us a range of options based on how bad the need is, and what resources are available.

frankbu · 2023-06-21T21:30:07Z

geps/gep-713.md

+the Policy Admin, to Gateway API’s persona list, and without a solution to the
+discoverability problem, their actions are largely invisible to the Application
+Developer. Not only that, but their concerns cut across the previously established
+levels.


I'm wondering if this problem would be less severe if there were certain restrictions on where and how policy attachments are used?

Seems like one thing that should always be possible is that the owner of a resource should be able to override any inherited policy attachments. Does it really make sense that a policy attachment can override an explicit setting in a resource? What is the use case for having explicit settings in the user's resource that get ignored and changed under the covers because of some policy attachment somewhere?

I'm not sure what properties of a resource the spec says can be controlled by inherited policy attachments, but I would say there "should be" only two things:

default values of fields in the resource (first class API fields)

fields that can be set with direct policy attachments (these are fields that presumably have been added to support things that are not directly supported in the resources API itself)

If that was the case, then the owner of a resource would always be able to explicitly override any inherited policy attachments that don't work, either in the resource itself or in a direct policy attachment sitting next to the resource.

Being able to override specific settings can be very useful in clusters with strict rules about things, although I agree that it's probably better in those cases to reject the create or change that would have an undesirable setting using a Validating Admission Controller - faster feedback is much better than silently changing a value.

But, this override need comes from direct experience @robscott and I have had with previous iterations of similar ideas, sysadmins (in Gateway API's personas, Cluster Admins or even Infra Admins) sometimes want to override settings so that users can't take certain actions, and this is a mechanism to allow that to happen.

frankbu · 2023-06-21T21:32:49Z

geps/gep-713.md

+
+(There’s also another use case to consider, in that Charlie should have been able
+to see that the Policy on the namespace was in use in many places before deleting
+it.)


Shouldn't that be a default assumption if there are other resources in the namespace?

Yes, but again, it's easy to miss in a stressful situation. Explicitly telling people this is better than leaving it implicit.

youngnick · 2023-06-26T08:53:27Z

Thanks for the comments @frankbu, I'm about to go on leave until July 18, so I will address them when I'm back.

I'd encourage others to take a look at this proposal too in the meantime.

youngnick · 2023-07-26T07:03:07Z

Updated this one, it's ready for more review.

david-martin

Great to see the latest updates here.

This is most definitely a separate GEP, but I'm wondering about the concept of a policy implementer/provider that is separate from the gateway provider.
Would that decoupling make sense? Perhaps this has been discussed before.

Asking in the context of seeing the GatewayClass extension types mentioned and how that might work.
For example, if I'm an expert in API caching, I'd like to write a controller for reconciling CachePolicy resources, but I don't want to/not able to provide a full gateway api impelmentation with my caching policy feature.
I'd like to register my controller in a standard way with a Gateway API provider, and then it lists the CachePolicy CRD in the GatewayClass status.
The gateway provider then has a standard mechanism for integrating with my controller (or vice versa).

youngnick · 2023-08-01T00:22:17Z

/check-cla

youngnick · 2023-08-02T06:06:49Z

/check-cla

shaneutt

The reality of this effort is that we must move forward because we are in a predicament for delivery. Given the reality and the need and the fact that this change is overall an improvement, I think we should accept this as the next iteration and keep iterating from here.

/approve

robscott

Thanks @youngnick! Lots of great updates here.

geps/gep-713.md

robscott · 2023-08-04T22:13:59Z

geps/gep-713.md

+#### On targeted resources
+
+Implementations that use Policy objects MUST ensure that a Condition exists in
+the `status.Conditions` of any objects affected by a Policy. If that Condition is
+already set to `status: true`, then the implementation must ensure that
+the `observedGeneration` field matches the `metadata.generation` field of the
+object.


I think this specific solution might be best represented by a follow-up GEP. There are a lot of edge cases here that are going to be hard to handle. For example it will be fairly straightforward for implementations to determine when they should set this condition, but near impossible for them to determine when they should unset the condition, assuming more than one implementation is active in the cluster.

I've updated the wording here and in the Solution Cookbook section, but I'll also update both with the "this needs a followup GEP" wording and change to Experimental.

geps/gep-713.md

Signed-off-by: Nick Young <nick@isovalent.com>

robscott

Thanks @youngnick! You've got some great ideas here, really good to seem them all articulated. I've got a few comments, mostly non-blocking. The overall theme is that I think this GEP has gotten too big and I think it would be good to split it out ASAP or at least add new content in new places. That's more of an organizational thing, but I think I'm aligned with all the major points in here.

/approve

geps/gep-713.md

robscott · 2023-08-10T03:20:53Z

geps/gep-713.md

@@ -1,4 +1,4 @@
-# GEP-713: Metaresources and Policy Attachment_metaresource_
+# GEP-713: Metaresources and Policy Attachment


A general note, this GEP is massive. I think the overall size of it makes an already complicated topic even more overwhelming to understand and/or contribute to. One potential way to mitigate this problem would be to have a separate page that collects and documents problems with policy attachment and links out to potential solutions and how we're trying to improve things here. For example, we could keep something like #2012 as an interactive forum for proposing solutions/improvements that would eventually turn into GEPs of their own, but the problem statements could be clearly stated somewhere in our docs...

This is not blocking feedback, I just think we're going to need find some ways to either split this GEP up and/or cut some of the content out so this can be a bit more manageable + approachable.

I've logged #2307, and assigned it to myself.

geps/gep-713.md

robscott · 2023-08-10T03:27:30Z

geps/gep-713.md

@@ -649,6 +1182,33 @@ const (
 )
 ```

+#### On targeted resources


Given the size of this GEP and the provisional nature of this idea, maybe we can just link out to somewhere here instead of duplicating the content.

I agree, but I'd prefer to get this merged, so that we have the text stored somewhere, and then move it around after. This PR does not finish the work to tidy up Policy Attachment, but it's become a bit unwieldy, so I'd like to concentrate on getting it merged and fixing afterwards.

geps/gep-713.md

k8s-ci-robot · 2023-08-10T03:35:48Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: robscott, shaneutt, youngnick

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~geps/OWNERS~~ [robscott,shaneutt,youngnick]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Nick Young <nick@isovalent.com>

robscott

Thanks @youngnick!

/lgtm

robscott · 2023-08-15T17:39:02Z

/hold cancel

k8s-ci-robot requested review from bowei and robscott June 19, 2023 13:37

k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 19, 2023

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2023

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 19, 2023

alexsnaps mentioned this pull request Jun 20, 2023

RateLimitPolicy conditions don't indicate that the limit is applied Kuadrant/kuadrant-operator#140

Closed

maleck13 reviewed Jun 20, 2023

View reviewed changes

geps/gep-713.md Outdated Show resolved Hide resolved

youngnick force-pushed the policy-discoverability branch from a591567 to 73e1f55 Compare June 20, 2023 16:20

k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 20, 2023

youngnick mentioned this pull request Jun 20, 2023

Remove Policy Attachment details from site docs, point to GEP #2134

Merged

frankbu reviewed Jun 21, 2023

View reviewed changes

kflynn mentioned this pull request Jul 11, 2023

GEP-2014: Declarative Policy #2015

Closed

youngnick force-pushed the policy-discoverability branch from 73e1f55 to 8d1535e Compare July 26, 2023 05:37

k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 26, 2023

david-martin reviewed Jul 26, 2023

View reviewed changes

youngnick mentioned this pull request Jul 27, 2023

Initial GAMMA docs take #2238

Merged

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 2, 2023

shaneutt approved these changes Aug 2, 2023

View reviewed changes

robscott reviewed Aug 4, 2023

View reviewed changes

youngnick added 6 commits August 7, 2023 11:19

Merge parable and add discoverability info

599bb47

Signed-off-by: Nick Young <nick@isovalent.com>

Finish solutions, update names to Charlie and Ana

706b6b9

Signed-off-by: Nick Young <nick@isovalent.com>

Add The Diagram with Policy Admin

f48ae87

Signed-off-by: Nick Young <nick@isovalent.com>

Update document title

88465db

Signed-off-by: Nick Young <nick@isovalent.com>

Update details about standard conditions to add to targeted resources

2ff981a

Signed-off-by: Nick Young <nick@isovalent.com>

Fix review comments

1c80569

Signed-off-by: Nick Young <nick@isovalent.com>

youngnick force-pushed the policy-discoverability branch from b1baa1a to 1c80569 Compare August 7, 2023 11:19

Add naming details

7e68ce9

Signed-off-by: Nick Young <nick@isovalent.com>

This was referenced Aug 9, 2023

GEP: Add Name to HTTPRouteRule and HTTPRouteMatch #995

Open

Update TargetREf in Policy GEP #2147

Closed

robscott reviewed Aug 10, 2023

View reviewed changes

robscott mentioned this pull request Aug 11, 2023

Adding SectionName to PolicyTargetReference #2283

Merged

youngnick mentioned this pull request Aug 15, 2023

Split up Policy Attachment GEP #2307

Open

Fix hopefully the last round of PR comments

d6ebba0

Signed-off-by: Nick Young <nick@isovalent.com>

robscott reviewed Aug 15, 2023

View reviewed changes

k8s-ci-robot assigned robscott Aug 15, 2023

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 15, 2023

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 15, 2023

k8s-ci-robot merged commit a975fd6 into kubernetes-sigs:main Aug 15, 2023
9 checks passed

guicassolato mentioned this pull request Aug 30, 2023

Reconcile resources Kuadrant/kuadrant-operator#240

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Policy discoverability #2128

Improve Policy discoverability #2128

youngnick commented Jun 19, 2023

youngnick commented Jun 19, 2023

youngnick commented Jun 19, 2023

k8s-triage-robot commented Jun 19, 2023

youngnick commented Jun 20, 2023

frankbu Jun 21, 2023

youngnick Jul 26, 2023

frankbu Jun 21, 2023

youngnick Jul 26, 2023

frankbu Jun 21, 2023

youngnick Jul 26, 2023

youngnick commented Jun 26, 2023

youngnick commented Jul 26, 2023

david-martin left a comment

youngnick commented Aug 1, 2023

youngnick commented Aug 2, 2023

shaneutt left a comment

robscott left a comment

robscott Aug 4, 2023

youngnick Aug 7, 2023

robscott left a comment

robscott Aug 10, 2023

youngnick Aug 15, 2023

robscott Aug 10, 2023

youngnick Aug 15, 2023

k8s-ci-robot commented Aug 10, 2023

robscott left a comment

robscott commented Aug 15, 2023


		Ana: _Thanks for the help. Wish we’d found better answers._ 😢

		### The Problem, restated

		@@ -1,4 +1,4 @@
		# GEP-713: Metaresources and Policy Attachment_metaresource_
		# GEP-713: Metaresources and Policy Attachment

Improve Policy discoverability #2128

Improve Policy discoverability #2128

Conversation

youngnick commented Jun 19, 2023

youngnick commented Jun 19, 2023

youngnick commented Jun 19, 2023

k8s-triage-robot commented Jun 19, 2023

youngnick commented Jun 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

youngnick commented Jun 26, 2023

youngnick commented Jul 26, 2023

david-martin left a comment

Choose a reason for hiding this comment

youngnick commented Aug 1, 2023

youngnick commented Aug 2, 2023

shaneutt left a comment

Choose a reason for hiding this comment

robscott left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robscott left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

k8s-ci-robot commented Aug 10, 2023

robscott left a comment

Choose a reason for hiding this comment

robscott commented Aug 15, 2023