feat: Add mapping table for encoding subject into UUIDs #809

hperl · 2022-01-09T10:49:24Z

This PR contains first work towards supporting automatic subject/object encodings into UUIDs, as discussed in #792.

Related issue(s)

#792

Checklist

I have read the contributing guidelines.
I have referenced an issue containing the design document if my change
introduces a new feature.
I am following the
contributing code guidelines.
I have read the security policy.
I confirm that this pull request does not address a security
vulnerability. If this pull request addresses a security. vulnerability, I
confirm that I got green light (please contact
security@ory.sh) from the maintainers to push
the changes.
I have added tests that prove my fix is effective or that my feature
works.
I have added or changed the documentation.

Future work

In order to complete the work for #792, the following items still need to be adressed (in this or future PRs):

RFC: Should the new encoding be enabled unconditionally, or opt-in as suggested in Automatic Subject/Object encoding #792 (comment)?
Use the new mapping methods in the API when reading or writing relation tuples.

hperl · 2022-01-09T11:08:23Z

@zepatrik the Cockroach migration test failed with std_err: Could not apply migrations: ERROR: relation "keto_uuid_mappings" already exists (SQLSTATE 42P07), although the table is only created once in the migration scripts. Did I misunderstand how you do migrations?

https://app.circleci.com/pipelines/github/ory/keto/1737/workflows/da82ec86-7471-4ff4-ab64-77d32d1a91d3/jobs/11054?invite=true#step-110-785

zepatrik

Thanks, this is a good starting point but I think some requirements were not clearly defined in the issue. Now that I see the first implementation iteration, I clarified some parts 😉

internal/driver/config/config.schema.json

internal/persistence/sql/uuid_mapping_test.go

internal/persistence/sql/uuid_mapping.go

zepatrik · 2022-01-10T13:08:06Z

...rsistence/sql/migrations/sql/20220107205800000000_create-uuid-mapping-table.cockroach.up.sql

@@ -0,0 +1,10 @@
+CREATE TABLE keto_uuid_mappings


We use https://github.com/gobuffalo/fizz as a DDL tool to create migrations. See other example create table definitions: https://github.com/ory/kratos/blob/c455fbf2f4792cb3fd2f846a5e699e837d8c98a4/persistence/sql/migrations/templates/20191100000001_identities.up.fizz
You can then run make migrations-render-replace to render the fizz template to SQL files.

Further, we will need to migrate the existing relation tuple table as well. In case the ID in there already is a UUID (maybe using a regex to find those?) nothing has to be done. In case it is not, we will have to add an entry to this mapping table. That migration process needs good tests to ensure everything works. Migration tests are here: https://github.com/ory/keto/blob/fabf1a06d90139f6678fc7bc1703ee92a5fc9d69/internal/persistence/sql/migrations/migratest/migration_test.go

internal/namespace/definitions.go

internal/persistence/sql/uuid_mapping.go

hperl · 2022-01-10T21:12:03Z

Thanks for the review. I addressed most of the points (I think), but I am still not sure where the best place is to actually do the migration as well as add the business logic that inserts a UUID in the mapping table if a non-UUID subject ID should be inserted.

aeneasr · 2022-01-14T15:53:37Z

@zepatrik friendly ping :)

zepatrik

@hperl thanks for addressing the feedback 👍 and sorry for the late reply 🙈

I have moved the mapping manager test to the right location, I think it is easier to show by commit than to explain 😉
This is now pretty much what I expected as the first draft.
For the migration, we would ideally implement that in SQL. If it gets too complicated there (especially for cross-DB compatibility), we can also do it in go. There is already one migration process, the SingleTableMigrator that runs in go. You can give it a try in SQL if you want 😉

In all the handlers/... we would then change the subject and object fields to be one of subject_uuid or subject_string. In the latter case, we would apply the mapping. But I would do that in a separate follow up PR if you don't want to invest all that work.

Oh and btw, for that we will also need a string to UUID function, and therefore also an (unique) index on the string column. It would be great if you can add that in this PR.

hperl · 2022-01-17T19:28:52Z

@zepatrik, thanks for moving stuff to the correct place. It feels a little strange to have the test function defined outside of a _test file, though.

As for the SQL migration, that would still go into 20220110200400_create-uuid-mapping-table.up.sql, correct?

(btw, I quickly added the index in the migration)

A new table was added that stores a mapping between an arbitraty string and a UUIDv5 of that string, namespaced by a namespace UUID that the user can set in the namespace configuration.

zepatrik · 2022-01-18T08:47:35Z

It feels a little strange to have the test function defined outside of a _test file, though.

We use this pattern for the persister tests only. Main reason being that we can avoid circular dependencies with that. Also, we don't usually spin up all databases for all tests, but only have this one persister test that uses ALL supported databases.

And yes, you can try to add the migration in that file 😉

hperl · 2022-01-19T19:20:08Z

Having thought about writing the migration in SQL, I think Go is the better choice. The migration would need to parse a UUID that conforms to the Go implementation (https://pkg.go.dev/github.com/gofrs/uuid?utm_source=gopls#UUID.UnmarshalText) as well as generate UUIDs (and at least MySQL generates v1 UUIDs only, which are not random, see https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid).

With both generation and parsing being hard to do consistently across SQL dialects, I'd go with Go. @zepatrik, WDYT?

rdurell · 2022-01-20T19:57:21Z

If mapping subject to UUID, why not increase the supported subject length as well? We're looking at using somewhat complex subject strings which would benefit by being larger then 64 characters.

hperl · 2022-01-20T20:11:31Z

@rdurell sure, I can increase it to 1024. Do you have more information on typical subject ID sizes?

zepatrik · 2022-01-21T10:25:56Z

With both generation and parsing being hard to do consistently across SQL dialects, I'd go with Go.

Sounds reasonable 👍

why not increase the supported subject length as well?

Definitely, maybe we can even make the strings unbound? I know that at least postgres and cockroach don't really care. If you happen to store big strings, you will face the performance implications of that. But we should not force you to do that.

zepatrik

Thank you, just some remarks on the migrator.

internal/persistence/sql/migrations/single_table_test.go

internal/persistence/sql/migrations/uuid_mapping.go

zepatrik · 2022-01-21T10:36:33Z

internal/persistence/sql/migrations/uuid_mapping_test.go

+					SubjectSetRelation:    dbsql.NullString{String: "rel", Valid: true},
+					CommitTime:            time.Now(),
+				},
+				expectMapping: false,


Suggested change

expectMapping: false,

expectMapping: true,

hperl · 2022-01-23T11:56:25Z

Definitely, maybe we can even make the strings unbound? I know that at least postgres and cockroach don't really care. If you happen to store big strings, you will face the performance implications of that. But we should not force you to do that.

We can either use a VARCHAR(x) with a large x, or even TEXT, but in both cases MySQL will limit us, because indices are limited in size (and we put an index on both the UUID and the string representation). So even for TEXT the index would then need to take a prefix length, see https://dev.mysql.com/doc/refman/8.0/en/create-index.html.

For storing arbitrary large strings, I would not want to put an index on them or even lookup by the string representation then, but either:

add another string column that holds a hash of the string representation, put an index on that and hash the strings before lookup
Generate the UUIDs with UUIDv5 directly from the string representation and a constant or empty namespace (<-- preferred)

Maybe we can quickly go over the last open points of the design in-band, I think we could quickly resolve them.

internal/persistence/sql/uuid_mapping.go

internal/uuidmapping/definitions.go

zepatrik

I am really sorry for the long review time. This looks very good now.
IMO we could merge it as is and open new PRs to finish the feature? This would block us however on creating releases...
Alternatively, we can create a feature branch that we then open more PRs against and in the end rebase master on that.

zepatrik · 2022-02-11T09:34:54Z

internal/persistence/sql/migrations/uuid_mapping_test.go

+						assert.Equal(t, tc.rt, newRt)
+					}
+				})
+			}


I would like to add a test case here that ensures the pagination works. There is a wrapper for the relation tuple manager where you can overwrite the default pagination:

keto/internal/relationtuple/definitions.go

Lines 644 to 660 in 1e15176

type ManagerWrapper struct {

Reg ManagerProvider

PageOpts []x.PaginationOptionSetter

RequestedPages []string

}

var (

_ Manager = (*ManagerWrapper)(nil)

_ ManagerProvider = (*ManagerWrapper)(nil)

)

func NewManagerWrapper(_ *testing.T, reg ManagerProvider, options ...x.PaginationOptionSetter) *ManagerWrapper {

return &ManagerWrapper{

Reg: reg,

PageOpts: options,

}

}

You can initialize and use it in the test similar to

keto/internal/check/engine_test.go

Lines 34 to 42 in 1e15176

reg := driver.NewSqliteTestRegistry(t, false)

require.NoError(t, reg.Config().Set(config.KeyNamespaces, namespaces))

mr := relationtuple.NewManagerWrapper(t, reg, pageOpts...)

return &deps{

ManagerWrapper: mr,

configProvider: reg,

loggerProvider: reg,

}

keto/internal/check/engine_test.go

Line 441 in 1e15176

t.Run("case=paginates", func(t *testing.T) {

Thanks, I added a test for pagination. I did not use the ManagerWrapper, because I was not retrieving the tuples via the manager anyways, but I used that code as inspiration nonetheless 😉.

zepatrik · 2022-02-11T09:45:50Z

internal/persistence/sql/migrations/templates/20220110200400_create-uuid-mapping-table.up.sql

+    string_representation    TEXT NOT NULL CHECK (string_representation <> ''),
+
+    PRIMARY KEY (id)
+);


I just realized that we of course have to change the column types in the relation tuple table to UUID as well. I am afraid that we have to create a new table and insert into that instead of the old one. Or do you have other ideas?
The migrator we use can theoretically be extended to add go migrations btw, I think that would definitely make sense. Then we also don't need the command 🤔
What do you think about that @aeneasr?

https://github.com/ory/x/blob/fdc336251a395d29c7a242b611dc3fe3740620f8/popx/migration_box.go

Creating a new table might not be necessary. Assuming that object and subject_id already hold UUIDs, we can, in Postgres, do:

ALTER TABLE keto_relation_tuples ALTER object TYPE UUID using object::uuid;

(and similar for subject_id). MySQL does not have a UUID type, so we will stick with VARCHAR(64). The same statement works in Cockroachdb when using SET enable_experimental_alter_column_type_general = true.

hperl · 2022-02-16T16:50:19Z

I am really sorry for the long review time. This looks very good now. IMO we could merge it as is and open new PRs to finish the feature? This would block us however on creating releases... Alternatively, we can create a feature branch that we then open more PRs against and in the end rebase master on that.

I'm not a big fan of long-living feature branches, but I don't know the release process well enough. If we could prevent the migration from running, nothing user-visible changed. Is that an option?

zepatrik · 2022-02-16T17:13:53Z

I'm not a big fan of long-living feature branches, but I don't know the release process well enough.

Agree, but we can only release master as is easily, so this would block us. You are right that currently it does not affect the previous code, but it will once we start integrating the mapping into handlers, ...
Just keeping this off of the main branch will make it a bit easier.

Co-authored-by: Patrik <zepatrik@users.noreply.github.com>

With this change Keto now maps strings to UUIDv5 on the storage layer. This change allows unlimited strings to be used while maintaining good performance. Further, it reduces the likeliness of database hot-spots. The migration that applies this mapping might take some time, so please confirm that your migration strategy works for you. BREAKING CHANGE: `keto namespace migrate ...` commands were removed. To migrate from v0.6.0-alpha.1, please first migrate the legacy namespaces using v0.8.0-alpha.2 BREAKING CHANGE: The protobuf API was bumped to `v1alpha2`. Please upgrade your client dependency to that version. `v1alpha1` is still supported for now, but might be dropped soon. BREAKING CHANGE: Some payload keys are now (not) required anymore. The generated SDKs will likely have breaking changes. Co-authored-by: Patrik <zepatrik@users.noreply.github.com> Co-authored-by: hperl <34397+hperl@users.noreply.github.com>

hperl requested a review from zepatrik as a code owner January 9, 2022 10:49

hperl changed the title ~~Add mapping table for encoding subject into UUIDs~~ feat: Add mapping table for encoding subject into UUIDs Jan 9, 2022

zepatrik requested changes Jan 10, 2022

View reviewed changes

hperl force-pushed the auto-subject-object-encoding branch from 8966431 to 17c8489 Compare January 10, 2022 21:07

hperl force-pushed the auto-subject-object-encoding branch 2 times, most recently from c3ad030 to f519a5c Compare January 12, 2022 20:05

zepatrik reviewed Jan 17, 2022

View reviewed changes

hperl and others added 2 commits January 17, 2022 20:36

feat: Add mapping for encoding subject into UUIDs

dc2fc54

A new table was added that stores a mapping between an arbitraty string and a UUIDv5 of that string, namespaced by a namespace UUID that the user can set in the namespace configuration.

test: move UUID mapping manager test to the right place

bd9b2e0

hperl force-pushed the auto-subject-object-encoding branch from e915f87 to bd9b2e0 Compare January 17, 2022 19:37

hperl force-pushed the auto-subject-object-encoding branch 5 times, most recently from 6d3254d to 78d153e Compare January 20, 2022 19:26

hperl force-pushed the auto-subject-object-encoding branch from c66f85b to 2af3d91 Compare January 20, 2022 20:30

feat: Add migration for UUID mapping

fb7af04

hperl force-pushed the auto-subject-object-encoding branch from 2af3d91 to fb7af04 Compare January 21, 2022 08:12

feat: Increase size of mapped UUID column

acbe5de

zepatrik requested changes Jan 21, 2022

View reviewed changes

feat: Add pagination to the migrator

20925a6

feat: Add mapping migration for objects

7438c28

hperl force-pushed the auto-subject-object-encoding branch from afb5c7d to 7438c28 Compare January 23, 2022 12:15

hperl requested a review from zepatrik January 23, 2022 12:32

feat: map subjects/objects using UUIDv5

cb8a996

zepatrik reviewed Feb 3, 2022

View reviewed changes

internal/persistence/sql/uuid_mapping.go Show resolved Hide resolved

internal/persistence/sql/uuid_mapping.go Show resolved Hide resolved

internal/uuidmapping/definitions.go Show resolved Hide resolved

hperl force-pushed the auto-subject-object-encoding branch 2 times, most recently from 4dcdfda to 7438c28 Compare February 3, 2022 16:34

hperl requested a review from zepatrik February 8, 2022 09:18

zepatrik reviewed Feb 11, 2022

View reviewed changes

zepatrik mentioned this pull request Feb 15, 2022

feat(popx): allow running custom go migrations ory/x#468

Merged

test: Add pagination test

c3858a2

zepatrik approved these changes Feb 17, 2022

View reviewed changes

zepatrik changed the base branch from master to feat/uuid-mapping February 17, 2022 09:26

zepatrik merged commit 79cfd6e into ory:feat/uuid-mapping Feb 17, 2022

hperl deleted the auto-subject-object-encoding branch February 17, 2022 13:31

hperl added a commit that referenced this pull request May 9, 2022

feat: add table for mapping strings to UUIDs (#809)

26428f6

Co-authored-by: Patrik <zepatrik@users.noreply.github.com>

hperl added a commit that referenced this pull request May 16, 2022

feat: add table for mapping strings to UUIDs (#809)

75fb3ae

Co-authored-by: Patrik <zepatrik@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add mapping table for encoding subject into UUIDs #809

feat: Add mapping table for encoding subject into UUIDs #809

hperl commented Jan 9, 2022

hperl commented Jan 9, 2022

zepatrik left a comment

zepatrik Jan 10, 2022

hperl commented Jan 10, 2022

aeneasr commented Jan 14, 2022

zepatrik left a comment •

edited

Loading

hperl commented Jan 17, 2022 •

edited

Loading

zepatrik commented Jan 18, 2022

hperl commented Jan 19, 2022

rdurell commented Jan 20, 2022

hperl commented Jan 20, 2022 •

edited

Loading

zepatrik commented Jan 21, 2022

zepatrik left a comment

zepatrik Jan 21, 2022

hperl commented Jan 23, 2022

zepatrik left a comment

zepatrik Feb 11, 2022

hperl Feb 16, 2022

zepatrik Feb 11, 2022

hperl Feb 16, 2022

hperl commented Feb 16, 2022

zepatrik commented Feb 16, 2022

	type ManagerWrapper struct {
	Reg ManagerProvider
	PageOpts []x.PaginationOptionSetter
	RequestedPages []string
	}

	var (
	_ Manager = (*ManagerWrapper)(nil)
	_ ManagerProvider = (*ManagerWrapper)(nil)
	)

	func NewManagerWrapper(_ testing.T, reg ManagerProvider, options ...x.PaginationOptionSetter) ManagerWrapper {
	return &ManagerWrapper{
	Reg: reg,
	PageOpts: options,
	}
	}

	reg := driver.NewSqliteTestRegistry(t, false)
	require.NoError(t, reg.Config().Set(config.KeyNamespaces, namespaces))
	mr := relationtuple.NewManagerWrapper(t, reg, pageOpts...)

	return &deps{
	ManagerWrapper: mr,
	configProvider: reg,
	loggerProvider: reg,
	}

feat: Add mapping table for encoding subject into UUIDs #809

feat: Add mapping table for encoding subject into UUIDs #809

Conversation

hperl commented Jan 9, 2022

Related issue(s)

Checklist

Future work

hperl commented Jan 9, 2022

zepatrik left a comment

Choose a reason for hiding this comment

zepatrik Jan 10, 2022

Choose a reason for hiding this comment

hperl commented Jan 10, 2022

aeneasr commented Jan 14, 2022

zepatrik left a comment • edited Loading

Choose a reason for hiding this comment

hperl commented Jan 17, 2022 • edited Loading

zepatrik commented Jan 18, 2022

hperl commented Jan 19, 2022

rdurell commented Jan 20, 2022

hperl commented Jan 20, 2022 • edited Loading

zepatrik commented Jan 21, 2022

zepatrik left a comment

Choose a reason for hiding this comment

zepatrik Jan 21, 2022

Choose a reason for hiding this comment

hperl commented Jan 23, 2022

zepatrik left a comment

Choose a reason for hiding this comment

zepatrik Feb 11, 2022

Choose a reason for hiding this comment

hperl Feb 16, 2022

Choose a reason for hiding this comment

zepatrik Feb 11, 2022

Choose a reason for hiding this comment

hperl Feb 16, 2022

Choose a reason for hiding this comment

hperl commented Feb 16, 2022

zepatrik commented Feb 16, 2022

zepatrik left a comment •

edited

Loading

hperl commented Jan 17, 2022 •

edited

Loading

hperl commented Jan 20, 2022 •

edited

Loading