-
-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: first API draft and generation #315
Conversation
Exposing nothing by default to keep our package API slim and later decide what to expose via `pkg`.
I think the first API designs are ready now, so that we can discuss/merge this PR and concentrate on other details, like |
@zepatrik are the API definitions okay so far? Do ask me anything thats unclear to you. I would also jump on a call. |
I'll review it in ~1h |
Take your time, no pressure 🍰 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I definitely need to learn a bit more about protobuf 😅
Makefile
Outdated
# Generate after linting succeeded | ||
# | ||
.PHONY: buf | ||
buf: buf-lint buf-gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please always add the newlines in the end of files 😉
api/google/api/client.proto
Outdated
@@ -0,0 +1,99 @@ | |||
// Copyright 2020 Google LLC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this file? I could not find any documentation in GAPIC that would explain the need for this. Also, the Copyright Google is probably a copy/paste error?
If this is required for gapic, I'd prefer adding it together with gapic client generation in one PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll remove and put it back for gapic PR then.
api/keto/README.md
Outdated
# Notes | ||
|
||
> ORY Keto is still a `sandbox` project. | ||
This makes the included api version `v1` subject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we maybe make this v0
instead? The first alpha and beta releases of keto will also be v0.x.x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I wasn't sure whether we release next gen keto with a stable v1 directly, but this is not how it goes, since users will be using the service very early I guess.
In protobuf we can use v1alpha1
, the first alpha of v1, then v1beta1
and then eventually v1
for stable definitions.
If we go that path we will always need to write version mappings of the messages versions on server side (map v1alpha1
-> v1beta1
-> v1
), deprecate old versions and eventually remove the support at some point.
Sure, we can do this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be possible to go with v0
that can have breaking changes and then move on to v1
once we release Keto v1.x.x? Otherwise, v1alpha
and v1beta
are good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove support for old versions immediately and guarantee compatibility only at a point where we are confident that not much will change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on how long we have the alpha/beta versions public and how many clients are using an older version before we should remove them. Generally the goal is to reach v1 as soon as possible.
This is how versioning is usually seen for ProtoBuf APIs:
v1alpha
- Highly subject to breaking changes, very early stagev1beta
- Pretty reliable, may change. Should have a long evaluation time before goingv1
v1
- Stable, API (RPCs, fields, messages) can only be added without breaking existing functionality- ...perhaps a new
v2...
cycle when needing to breakv1
, stillv1
has loooong time support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's what I think as well. So we just start with v1alpha
and move on when we think it is appropriate.
|
||
// The admin service for doing administrative tasks in the ACL system. | ||
service AdminService { | ||
option (google.api.default_host) = "keto.exampleapis.com"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would everyone have to adapt this to their setup? What is the benefit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is the default service host that GAPIC requires to be set for every service. GAPIC generated clients have this host as default for their respective service, but our users would have to pass in their correct API endpoint they deployed and pass it to client. Google uses it to "hard code" their services hostname directly to their clients.
This is not too problematic, I think.
The only error a user would see if he creates a new client and forgets to pass in his correct host would be something like a "hostname keto.exampleapis.com not found" / DNS lookup failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so we add this with GAPIC in a separate PR right? Also, I would prefer to then be something like
option (google.api.default_host) = "you.have.to.set.hostname.to.your.instance";
😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right!
option go_package = "github.com/ory/keto/api/keto/acl/admin/v1;admin"; | ||
option csharp_namespace = "Ory.Keto.Acl.Admin.V1"; | ||
option java_multiple_files = true; | ||
option java_outer_classname = "AdminServiceProto"; | ||
option java_package = "sh.ory.keto.acl.admin.v1"; | ||
option php_namespace = "Ory\\Keto\\Acl\\Admin\\V1"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these options following some best practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, everything does! :)
As a sample reference.
In case, do you want to use another java_package
domain for ORY? I guess not since ORY uses sh.ory.<project>...
as java path (e.g. as here).
The API path will perfectly go hand in hand in case we want to make a Keto java client where we will have the generated proto code under package sh.ory.keto.acl...
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you nailed it already 😉
https://github.com/ory/sdk/tree/master/clients/hydra/java#getting-started
api/keto/acl/v1/check_service.proto
Outdated
// The relation this check. | ||
string relation = 2; | ||
// The concrete subject id to check. | ||
string subject_id = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of a subject set, would this mean that it has to be encoded? Wouldn't it be better if this was of type Subject
? I am not sure what the use case would be but it is really just a definition thing, so we can allow it without any drawbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sure!
I'll put that in, so one can Check
on a SubjectSet
(without expantion I guess).
What I just thought is that it might confuse API users if they see they can do a check on a SubjectSet
.
- I mean one could understand that the semantic is that the server expands the
SubjectSet
to subject ids, therefor making aCheck
request on multiple subject ids. Like WHAT?! :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW do we also want to add an skip_expand
bool to the CheckRequest.
skip_expand
defaults tofalse
so expand checks are done by default (probably what the user expects?)- setting
skip_expand=true
skips SubjectSet expanding and any rewrite rules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it the same if you expand a SubjectSet
or not?
And isn't skip_expand
the same as using read to see whether this exact tuple is stored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mean the expand API (2.4.5 Expand
in paper), whether if we simply add a skip_expand
field to the CheckRequest
that the client can explicitly say whether SubjectSets and Rewrite Rules apply for a check request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would not be a breaking change so let's add it later. Tracked as #323
api/keto/acl/v1/check_service.proto
Outdated
// *It is recommended to perform checks using slightly stale | ||
// data (e.g. token older than 3-10 seconds) for minimum latency | ||
// and where the application is allowed to accept slightly off checks. | ||
bytes snaptoken = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be better to add them once we have them in the actual code? As they are not strictly required, we can first implement everything without snaptokens/zookies/... and add them in the next step? Do you think it should be in the API definition already anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I do think having them in the API right from the beginning is better.
Users need to learn and adopt the tokens beforehand so they don't need to rewrite their code logics too much.
For now they do not have any special effect since the server always uses the newest tuples.
We could make the server always return a coming_soon
snaptoken for now. Not an empty one!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good idea
api/keto/acl/v1/check_service.proto
Outdated
// result of this check. | ||
// | ||
// Leave this field blank if... | ||
// - your application strictly requires to act on up-to-date data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the read-api:
If the request doesn’t contain a zookie, Zanzibar will choose a reasonably recent snapshot, possibly offering a lower latency response than if a zookie were provided.
Do you think the same applies to check or is a check request without a zookie automatically a "content-change" check that returns the new versions zookie? From my understanding it would be possible to have a check without a zookie that then is evaluated at a "reasonably recent snapshot", but that returns no new zookie. Maybe my understanding is wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I'm about to redesign the CheckRequest a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
api/keto/acl/v1/watch_service.proto
Outdated
|
||
// The service to watch for changes in the system, | ||
// such as for Access Control Lists and namespace configs. | ||
service WatchService { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The watch service will require a lot of background setup first, lets add the API definition together with that functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove it.
api/keto/acl/v1/write_service.proto
Outdated
|
||
// Write-delta for a WriteRelationTuplesRequest. | ||
message RelationTupleWriteDelta { | ||
enum Action { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you add this action instead of having different rpcs? Is there a best practice regarding this defined somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are multiple reasons for this design:
- There would be quite a few and similar looking RPCs to cover each action.
- We can allow clients to do a transaction on multiple tuples in a single RPC call
- This is very convenient for users in case the user wants all actions to succeed or none!
- ...while also supporting behaviours like delete tuple A and only insert tuple B if it does not already exists, otherwise I want complete transaction to fail and give me an error
- would be pretty easy to implement with SQL (I've done that :))
Here are some similar refs:
- IAM googleapis
- GKE googleapis
- "Actions" can even be whole messages sometimes as in Spanner googleapis
We can always add RPC later if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or any other example where 2 or more modifications must succeed.
Imagine this (and I had this scenario learned the hard way):
- Some system inserts 3 tuples in response to an event (e.g. an order, user registration, ...)
- The system continuously fails at the 3rd insert rpc.
- Now the system would need to rollback somehow and delete the first 2 inserted tuples for clean up.
- ...this is just a mess, could be solved client side, but only with much overhead to get right ...You get what I mean :)
In these often cases, transactions should be offered server side.
And if the client does not need a transaction he will just make multiple RPCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 agree, would really be awesome to have transactional requests here
enum Action { | ||
// Unspecified. | ||
// The `WriteRelationTuples` rpc ignores this | ||
// RelationTupleWriteDelta if an action was unspecified. | ||
ACTION_UNSPECIFIED = 0; | ||
|
||
// Like INSERT with the exception that if the RelationTuple | ||
// already exists performs an UPDATE instead. | ||
UPSERT = 1; | ||
|
||
// Insertion of a new RelationTuple. | ||
// | ||
// The `WriteRelationTuples` rpc errors if the | ||
// specified RelationTuple already exists. | ||
INSERT = 2; | ||
|
||
// Update of the existing RelationTuple with | ||
// the intend to refresh its snapshot token. | ||
// | ||
// The `WriteRelationTuples` rpc errors if the | ||
// specified RelationTuple was not found. | ||
UPDATE = 3; | ||
|
||
// Deletion of the RelationTuple. | ||
// The `WriteRelationTuples` rpc returns NO error | ||
// if the specified RelationTuple was not found. | ||
DELETE = 4; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only makes sense if we can identify relation tuples somehow, but I don't think it is really possible. An update essentially means
- delete the old tuple
- insert the new one
We can add that as a separate atomic operation RPC later on, but for now clients can just do these steps themselves. As the parameters will therefore be different for the action types, we should rather separate this into individual RPCs. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, e.g. the UPDATE action makes little to no sense with the Tuple model.
I'd like to have a single RPC still for transaction support upfront and that we have less Tuple mutation RPCs as possible.
How about having these actions:
- INSERT - not errors tx if already exists
- DELETE - not errors tx if not found
Later we can add actions like:
- INSERT_NEW - errors tx if already exists
- DELETE_FOUND - errors tx if not found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a real update that allows deletion and insertion atomically in one call does make sense but is not a hard requirement, so we can add it later. The actions you proposed are perfect and can all be used in one RPC with this action enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked as #328
Should we maybe remove the etag
s and solve everything consistently? Would be a better approach IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
The snaptoken now is of type string to ease logging/debugging/console output. |
@zepatrik if I did not miss anything, we can squash and merge this baby. :) |
Nice 🎉 |
Related issue
#311
Please consider first merging #313.
Proposed changes
The designs of the APIs are in active discussion with @zepatrik, where we decided that I am the "assignee" who takes more role in maintenance of the APIs because of my former knowledge working with APIs using ProtoBuf / gRPC / Client generation.
@zepatrik and I settled down that the two first major parts are API definitions and database schema that affect development of the ACL system, which makes this PR to get merged ASAP in order to proceed.
After this PR got merged I'll develop on the gapic client generation which users will be using as well as ACL nodes internally for their intercommunications as well as replace the current protos so @zepatrik and I can proceed developing using the new protos and the gapic generated clients.
Access control for Keto client requests itself
As per the discussion with @aeneasr and @zepatrik from the meeting, API endpoints of Keto are only authenticated using mTLS and/or authorized through an identity & access proxy (like oathkeeper).
Therefore I decided to split each "access concern" by the following gRCP services, so that administrators can also configure access by using path matching with mesh proxies like in Istio/Linkerd/...
Services http paths:
/keto.acl.v1.WriteService/<method>
/keto.acl.v1.ReadService/<method>
/keto.acl.v1.CheckService/<method>
/keto.acl.v1.WatchService/<method>
/keto.acl.admin.v1.AdminService/<method>
Example network policies:
AdminService
service.ReadService
&CheckService
service.WatchService
&AdminService
service.This design does not only allow broad access control by service, but it is also possible to further restrict method access as normal:
/keto.acl.v1.ReadService/ListRelationTuples
service method./keto.acl.v1.WriteService/DeleteRelationTuples
service method./keto.acl.v1.WatchService/Watch
&/keto.acl.v1.WriteService/WriteRelationTuples
service methods.Broadly allowing to match the service by the path prefix (e.g.
/keto.acl.v1.WriteService/*
) is more scalable than possible if we would have a mono service (e.g.AclService
) that would have all read/write/watch/... RPCs in it.This is because a mono service does not separate concern and would not support administrators to broadly allow clients in a network to a single one access concern. They do also not need to know of every single RPC in advance nor update their network policies in case we add additional RPCs to a service concern.
Opting for a mono service with all RPCs would also open a high security vulnerability because administrators do not know which RPCs we might add in the future.
Design details / practices
The API takes several other well known API design practices as specified in AIP (API Improvement Proposals).
List...
RPC responsesList...
RPC responsesImplementation note
Note, that implementing mechanisms such as filtering, ordering or zookie do not need to get implemented right away, but should be the goal to be in in the Keto
v1.0.0
release, where thev1
APIs gets marked as fixed and stable!