doc: Add design doc for RBAC #17702

jkosh44 · 2023-02-16T19:52:20Z

Part of #11579

Motivation

Design doc.

Checklist

This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
This PR includes the following user-facing behavior changes:

Part of MaterializeInc#11579

maddyblue · 2023-02-16T20:31:58Z

doc/developer/design/20230216_role_based_access_control.md

+| `EXPLAIN`                            | usually `OBJECT(r)`                                                         |
+| `INSERT INTO ... VALUES`             | `OBJECT(a)`                                                                 |
+| `INSERT INTO ... SELECT`             | `CLUSTER(U)`,`OBJECT(a)` usually `OBJECT(r)`                                |
+| `{SELECT, SHOW, SUBSCRIBE}`          | `CLUSTER(U)`, usually `OBJECT(r)`                                           |


As Cara pointed out, this could prevent SHOW and other introspection queries which might not be desired. Do we want to special case those somehow? This one is tricky to get right.

I have one idea that has been shot down before, but might warrant reconsideration in this context. All SHOW commands are run in the mz_introspection cluster, and all roles are automatically given CLUSTER(U) for mz_introspection and OBJECT(r) for builtin objects.

It always seemed a little strange to make users hop clusters to run introspection queries, so that option is worth considering for ux alone.

Flagging that there are some mz_introspection use cases we missed in the initial spec, which I'm adding there too:

Support uses mz_introspection clusters to debug client environments

They need to be able to to SELECT across all mz_* tables and execute EXPLAIN and DESCRIBE, ideally across clusters

Current permissions on this cluster are hardcoded in environmentd?

What about select queries on objects that don't inherently exist on a cluster?
Would you need to be in a cluster that you have CLUSTER(U) privileges for that cluster when you run the SELECT?

Aka we're saying that anyone who has access to Materialize should have usage access to at least one cluster.

Would you need to be in a cluster that you have CLUSTER(U) privileges for that cluster when you run the SELECT?

For this design yes. The rationale is that your select will use some of the compute resources in the current cluster, and therefore you will need usage privileges on that cluster.

Yeah it makes sense.
@ggnall FYI - for our comms / as we think about our guidance

doc/developer/design/20230216_role_based_access_control.md

jkosh44 · 2023-02-21T17:39:28Z

I was thinking that we'd be able to leave pre-defined roles out of this initial project, but after a conversation on slack I think that we'll need to include them. I'll update the design to add a section for pre-defined roles. I won't change any of the existing sections, so it shouldn't affect any on-going reviews.
pre-defined roles: https://www.postgresql.org/docs/current/predefined-roles.html
slack conversation: https://materializeinc.slack.com/archives/C02FWJ94HME/p1676998963912989

jkosh44 · 2023-02-21T20:01:17Z

I was thinking that we'd be able to leave pre-defined roles out of this initial project, but after a conversation on slack I think that we'll need to include them. I'll update the design to add a section for pre-defined roles. I won't change any of the existing sections, so it shouldn't affect any on-going reviews. pre-defined roles: https://www.postgresql.org/docs/current/predefined-roles.html slack conversation: https://materializeinc.slack.com/archives/C02FWJ94HME/p1676998963912989

I made the updates, it did end up touching a couple of existing sections, but only very slightly. I kept it all in a single small commit that you can look at.

maddyblue

Role/public changes lgtm.

benesch · 2023-02-22T07:24:19Z

I'm planning to review this by the end of the week!

chaas

I came across this comment by @mjibson on another issue #13430 (comment) about preventing the creation of duplicate indexes

A concern should we implement RBAC: a user could be denied read access to a mv/index due to RBAC, but then this feature might also prevent them from creating an index/mv of their own. We should thus perhaps issue a NOTICE that there was a duplicate, pointing to the first one, and the user can then decide to immediately DROP what they created.

It's a good question about what we should do if someone tries to create an object that already exists but they don't have access to, e.g. a database of schema. Is it leaking information somehow by producing an error? Curious what postgres does.

jkosh44 · 2023-02-22T19:45:49Z

doc/developer/design/20230216_role_based_access_control.md

+When a new user logs in, we will create a new role for them with only the `LOGIN` and `INHERIT` attributes. If that user
+is a frontegg admin, then the role will also have the `SUPERUSER` attribute.


@ggnall Do have an opinion on the following related scenario: A user logs in AND they are a Frontegg admin AND they have an existing role in Materialize AND the existing role is not a super user. Should we update that Materialize role to be a super user or leave it alone?

(FYI @mjibson)

Does this scenario play out on every login? Or only on the initial one?

When we eventually decouple Account permissions (or build our own), it probably makes sense to have SQL/RBAC be the system of record on who can do what.

From that perspective, an existing explicitly GRANTed role should take precedence. This would enable a case where a user needs to manage users, billing, etc., but should have limited or no database access.

The simplest thing to explain is: "Account admins are always SUPERUSERS", but it seems reasonable to say "Account admins are always SUPERUSERS by default, unless they are explicitly granted another role".

There will be extra steps in the latter case to upgrade users:

OrgAdmin -> OrgMember: would retain explicitly granted roles

OrgMember -> OrgAdmin: would retain explicitly granted roles. Another SUPERUSER would have to remove existing roles or GRANT them SUPERUSER.

From another perspective, I'm not 100% sure how important this scenario is:

This would enable a case where a user needs to manage users, billing, etc., but should have limited or no database access.

If it's simpler for us to implement "Account admins are always SUPERUSERS" and it's reversible, we could start there.

@ggnall The above scenario is about a frontegg user who was initially a non-admin frontegg user, but was then later granted admin in frontegg. Should we bump up their perms in mz? Also applies the other way around: if a frontegg user later gets their admin role revoked, should we also revoke mz superuser?

@mjibson We may have crossed messages but for:

"Account admins are always SUPERUSERS"

...Yes, frontegg admins grants/revokes should probably be paired with mz superuser grants/revokes. It would make sense to keep these in sync, so that users that invite new members to their organization are also able to fully manage access.

You're never allowed to specify LOGIN or SUPERUSER when you create a role. The role in Materialize always exists with neither of those characteristics.
...
When you log in to a role with Frontegg and your JWT has the "Organization Admin" role, you implicitly get SUPERUSER privilege for that session.

One consequence of this is that we lose the mechanism to revoke SUPERUSER from someone while they're logged in. So as long as someone remains connected, they retain SUPERUSER after it's been revoked in Frontegg. Unless, we invent some new syntax for this and make sure to always update this field in the session before executing a command.

You're never allowed to specify LOGIN or SUPERUSER when you create a role. The role in Materialize always exists with neither of those characteristics.
...
When you log in to a role with Frontegg and your JWT has the "Organization Admin" role, you implicitly get SUPERUSER privilege for that session.

One consequence of this is that we lose the mechanism to revoke SUPERUSER from someone while they're logged in. So as long as someone remains connected, they retain SUPERUSER after it's been revoked in Frontegg. Unless, we invent some new syntax for this and make sure to always update this field in the session before executing a command.

Or we have some periodic Frontegg refresh that periodically re-authenticates through Frontegg and updates these attributes.

Also I've updated the doc with what we've discussed in this thread.

Or we have some periodic Frontegg refresh that periodically re-authenticates through Frontegg and updates these attributes.

We've already got a task that periodically refreshes access tokens for pgwire connections!

materialize/src/pgwire/src/protocol.rs

Line 178 in a757249

.and_then(|token| frontegg.continuously_validate_access_token(token, user.clone()))

Would be pretty easy to stitch in SUPERUSER updates to that task, I think. Would only apply to pgwire sessions and not HTTP sessions, but I think that's ok; pgwire sessions tend to be long lived while http sessions tend to be short lived.

Or we have some periodic Frontegg refresh that periodically re-authenticates through Frontegg and updates these attributes.

We've already got a task that periodically refreshes access tokens for pgwire connections!

materialize/src/pgwire/src/protocol.rs

Line 178 in a757249

.and_then(|token| frontegg.continuously_validate_access_token(token, user.clone()))

Would be pretty easy to stitch in SUPERUSER updates to that task, I think. Would only apply to pgwire sessions and not HTTP sessions, but I think that's ok; pgwire sessions tend to be long lived while http sessions tend to be short lived.

I've updated the doc to include this.

doc/developer/design/20230216_role_based_access_control.md

jkosh44 · 2023-02-24T15:36:36Z

It's a good question about what we should do if someone tries to create an object that already exists but they don't have access to, e.g. a database of schema. Is it leaking information somehow by producing an error? Curious what postgres does.

There are actually two scenarios here:

A user does not have any privileges on a schema. In this case the existence of a table is not leaked:

postgres=> CREATE TABLE bar.t1 ();
ERROR:  permission denied for schema bar
LINE 1: CREATE TABLE bar.t1 ();
                     ^
postgres=> SELECT * FROM bar.t1;
ERROR:  permission denied for schema bar
LINE 1: SELECT * FROM bar.t1;
                      ^

A user has create privileges but not usage privileges on a schema. In this case the existence of a table is leaked:

postgres=> CREATE TABLE bar.t1 ();
ERROR:  relation "t1" already exists
postgres=> SELECT * FROM bar.t1;
ERROR:  permission denied for schema bar
LINE 1: SELECT * FROM bar.t1;
                      ^
postgres=> CREATE TABLE bar.t2 ();
CREATE TABLE
postgres=> SELECT * FROM bar.t2;
ERROR:  permission denied for schema bar
LINE 1: SELECT * FROM bar.t2;
                      ^

Though this is such a weird scenario we should either not care or outright prevent it.

However, after re-reading @mjibson a couple of times I think it was actually about something else.

A concern should we implement RBAC: a user could be denied read access to a mv/index due to RBAC, but then this feature might also prevent them from creating an index/mv of their own. We should thus perhaps issue a NOTICE that there was a duplicate, pointing to the first one, and the user can then decide to immediately DROP what they created.

What I think he's saying is that we should allow duplicate indexes to exist, because certain users may only have access to a single index and not the other.

benesch

Finally made it through this! Thanks very much for writing this up, @jkosh44. Looks really good!

I felt like the material here was vast enough that I couldn't engage with it all, but also that there is quite a bit more to design (e.g., how are privileges actually represented in the catalog). I think it's exactly right to pause around here and start on the implementation, but I'd love to see this design document continually revised + updated during the implementation.

With that said, I think we should add the following two sections up front:

Rollout plan

How can we start landing RBAC PRs incrementally without committing to syntax or semantics? I assume we can put GRANT, REVOKE, etc. behind unsafe mode. But how do we start adding privilege checks when users don't have access to GRANT or REVOKE? Is it possibly as easy as continuing to give everyone SUPERUSER privilege at the moment? Then privilege checks just won't apply. When we're ready to flip the switch, we remove the unsafe mode toggles and also remove SUPERUSER privileges from "Organization Members".

Probably some holes in that braindump, so fleshing a plan out in the design doc would be great. Definitely don't need to flesh out the whole rollout—just enough to describe how we'll land the first batch of PRs.
Testing plan

I imagine the plan here is mostly "write lots of SLT and testdrive tests to test the SQL implementation, and write one or two custom Rust/mzcompose tests to test the Frontegg integration." Whatever it is, we should write that down, and make sure @philip-stoev and @def- are aware of this ASAP. I'm sure they'll have no shortage of tests to write.

benesch · 2023-02-24T07:56:21Z

doc/developer/design/20230216_role_based_access_control.md

+| role creation             | `CREATEROLE`    | Can create, alter, drop, grant membership to, and revoke membership from roles.                                                                       | Yes             |
+| inheritance of privileges | `INHERIT`       | Can inherit the privileges of roles that it is a member of. On by default. For this project we can keep this as a mandatory attribute.                | Yes             |
+| cluster creation          | `CREATECLUSTER` | Can create a cluster.                                                                                                                                 | No              |
+| persisted data creation   | `CREATEPERSIST` | Can write data to S3, either directly or indirectly (for example inserting into a table would be directly and creating a source would be indirectly). | No              |


This is odd to me! Is there something special about persist that means we want to promote this to a per role permission? I would lean for simply not having this permission.

If so, we should find a different name, since “persist” isn’t a user facing concept anywhere else!

The idea behind this was to differentiate between being able to create objects like views, which don't persist large amounts of data in S3, and being able to create objects like sources, which may persist large amounts of data in S3.

I personally would also lean towards removing this, so I'll probably just remove it, unless that explanation changed your mind.

@mjibson FYI, in case you want to weigh in.

If one of the problems users have is "limit who can spend money", and money is spent by creating replicas or writing things to s3, it made sense to me that both replica creation and s3 writing should be things that need granting. I don't care too much either way, and if no users need this for now then no objection to removing it.

benesch · 2023-02-27T04:50:04Z

doc/developer/design/20230216_role_based_access_control.md

+When a new user logs in, we will create a new role for them with only the `LOGIN` and `INHERIT` attributes. If that user
+is a frontegg admin, then the role will also have the `SUPERUSER` attribute.


Though, I think the next logical question is, if we don't allow specifying the LOGIN attribute via SQL, then should we be allowing any RBAC via SQL? We should at least prevent users from typing SUPERUSER because that also will be overwritten by Frontegg on login. Though perhaps we should completely remove CREATE ROLE, ALTER ROLE, DROP ROLE, GRANT, and REVOKE.

Frontegg's role and permission system is nowhere near powerful enough to be used directly for SQL RBAC, is the problem! You can't have roles that inherit from other roles, and you can't (easily) create privileges associated with objects in SQL. Like, if a user makes a table, something would have to add Frontegg privileges for INSERT and SELECT on that source in Frontegg. It would be hard and messy.

What makes sense to me is to tightly couple to Frontegg, but only for the minimum required to map the "Organization Admin" and "Organization Member" roles sensibly to SQL RBAC-land.

Reading through this discussion gives me the following idea:

You're never allowed to specify LOGIN or SUPERUSER when you create a role. The role in Materialize always exists with neither of those characteristics.

When you log in to a role with Frontegg, you implicitly get LOGIN privilege for that session.

When you log in to a role with Frontegg and your JWT has the "Organization Admin" role, you implicitly get SUPERUSER privilege for that session.

The mz_roles table makes no mention of these privileges because they are not inherent properties of the role, but rather properties of a given session.

We introduce an mz_has_superuser() function (or maybe there's some existing PostgreSQL function we can piggyback on) that reports whether the current session has SUPERUSER privilege.

In the future, if necessary, we can expose an mz_sessions table that shows all live sessions, their roeles, and whether they have SUPERUSER privileges.

This avoids a key pitfall with syncing LOGIN and SUPERUSER status to mz_roles when a user logs in: you can't actually be sure the user is logging in with an up-to-date JWT. Suppose a user is downgraded from "Organization Admin" to "Organization Member" while actively using Materialize. They might end up with two JWTs, one saying they're an admin and one saying they're a member. There's no guarantee they use these JWTs in the right order, and they might log in second with the older JWT, and now mz_roles will sync in stale information.

By instead associating SUPERUSER privilege on a session-by-session basis, we sidestep this problem entirely. The session privileges will exactly reflect the privileges in the JWT used to start the session.

doc/developer/design/20230216_role_based_access_control.md

benesch · 2023-02-27T05:03:31Z

doc/developer/design/20230216_role_based_access_control.md

+- Do we want different `SELECT` privileges based on if a new dataflow will be spun up or if we can use an existing one?
+    - Pros: We can differentiate between using existing compute resources vs creating new ones when reading.
+    - Cons: Users (and the database) are unable to determine if they're allowed to execute a read until after that read


My take is we should avoid this for now. It's a pretty darn specific privilege, and it's one that we can always introduce later (FAST_PATH_ONLY_USAGE or something). Would be good to suss out how much of a need for this there actually is.

I'm not sure that it will be common for e.g. one team to say "this is our cluster, feel free to issue fast queries but absolutely no slow queries"—especially since enough fast queries can probably meaningfully impact cluster performance. Would rather encourage folks to create materialized views and say "if you want to query our views, turn on your own cluster and build your own indexes."

For example, an alternative to explore here is a session parameter that prevents you from running non-fast-path peeks. I can imagine you might want to opt into this for some applications where you want to be really sure that you don't accidentally have a regression (either in your application code or in Materialize's optimizer) that causes queries to suddenly start taking the slow path and crash the cluster.

doc/developer/design/20230216_role_based_access_control.md

chaas · 2023-02-27T14:57:46Z

I assume we can put GRANT, REVOKE, etc. behind unsafe mode.

I think the plan is to deprecate unsafe mode and move toward LaunchDarkly flags. @ggnall can confirm LD is ready to go.

jkosh44 · 2023-02-27T16:05:18Z

how are privileges actually represented in the catalog

I'm hoping to piggyback off of PostgreSQL for this, I just haven't put in the legwork yet to see how they do it.

Rollout plan

How can we start landing RBAC PRs incrementally without committing to syntax or semantics? I assume we can put GRANT, REVOKE, etc. behind unsafe mode. But how do we start adding privilege checks when users don't have access to GRANT or REVOKE? Is it possibly as easy as continuing to give everyone SUPERUSER privilege at the moment? Then privilege checks just won't apply. When we're ready to flip the switch, we remove the unsafe mode toggles and also remove SUPERUSER privileges from "Organization Members".

I like this plan with the caveat added that users that don't authenticate through Frontegg are not given superuser. Otherwise, we don't have a good way of testing everything if even our test users are given superuser status.

Testing plan

I imagine the plan here is mostly "write lots of SLT and testdrive tests to test the SQL implementation, and write one or two custom Rust/mzcompose tests to test the Frontegg integration." Whatever it is, we should write that down, and make sure @philip-stoev and @def- are aware of this ASAP. I'm sure they'll have no shortage of tests to write.

Yeah, I think it will look a lot like this. I'm also hoping to borrow any tests that PostgreSQL has in their repo.

I'll add all of this to the doc.

jkosh44 · 2023-03-29T15:31:42Z

Now that I'm on the final (non-optional) phase, the implementation details for all phases are filled in, if anyone wants to take another look. Specifically I think the final phase's implementation is of particular interest: a448732.

One thing that may be challenging is that the PostgreSQL aclitem datatype does not store role names, it only stores role IDs:

https://github.com/postgres/postgres/blob/3aa961378b4e517908a4400cdc476ca299693de9/src/include/utils/acl.h#L48-L59

Yet when displaying aclitems, it shows the role names and not the role IDs:

postgres=# SELECT makeaclitem(16384, 10, 'CREATE', false);
  makeaclitem   
----------------
 joe=C/postgres
(1 row)

Looking the the makeaclitem implementation, it only uses role IDs and not role names:

https://github.com/postgres/postgres/blob/3aa961378b4e517908a4400cdc476ca299693de9/src/backend/utils/adt/acl.c#L1604-L1645

It isn't until the aclitem is converted to a string that the role names filled in:

https://github.com/postgres/postgres/blob/3aa961378b4e517908a4400cdc476ca299693de9/src/backend/utils/adt/acl.c#L614-L682

This means that the conversion to a string needs access to the catalog. If I'm remembering correctly, Materializes does not have catalog access when formatting a Datum to a string.

maddyblue · 2023-03-29T16:10:45Z

That sounds identical to the regtype and regproc types which are OIDs under the covers but implement a cast to string that knows how to print itself, which has access to the catalog.

jkosh44 · 2023-03-29T16:44:56Z

That sounds identical to the regtype and regproc types which are OIDs under the covers but implement a cast to string that knows how to print itself, which has access to the catalog.

Oh very cool, I did not know about these types. However, it looks like we don't follow the same semantics here as PostgreSQL:

postgres=# SELECT 16::regtype;
 regtype 
---------
 boolean
(1 row)

materialize=> SELECT 16::regtype;
 regtype 
---------
 16
(1 row)

maddyblue · 2023-03-29T16:54:05Z

That's due to some other postgres casting rule we don't yet follow. We are indeed able to convert those to strings, you just have to ask:

mjibson=> select 16::regtype;
 regtype 
---------
 16
(1 row)

Time: 59.627 ms
mjibson=> select 16::regtype::text;
 text 
------
 bool
(1 row)

But the impl should look something like

materialize/src/sql/src/plan/typeconv.rs

Line 318 in 3fbb6ed

(RegType, String) => Explicit: sql_impl_cast("(

jkosh44 · 2023-03-29T17:28:51Z

That's due to some other postgres casting rule we don't yet follow.

I'm not sure if that's 100% correct. I don't think that PostgreSQL implicitly casts regtype to a text, it just has custom display logic for regtype:

postgres=# SELECT 16::regtype;
 regtype 
---------
 boolean
(1 row)

postgres=# SELECT pg_typeof(16::regtype);
 pg_typeof 
-----------
 regtype
(1 row)

postgres=# SELECT 16::regtype::text;
  text   
---------
 boolean
(1 row)

postgres=# SELECT pg_typeof(16::regtype::text);
 pg_typeof 
-----------
 text
(1 row)

I'm not sure where the casting is implemented, but the formatting is implemented here: https://github.com/postgres/postgres/blob/8e5eef50c5b41fd39ad60365c9c1b46782f881ca/src/backend/utils/adt/regproc.c#L1223-L1269

The return type of that function is explicitly cstring:
https://github.com/postgres/postgres/blob/8e5eef50c5b41fd39ad60365c9c1b46782f881ca/src/include/catalog/pg_proc.dat#L7119-L7121

and not text like some other functions that are know to return text, like system_user:
https://github.com/postgres/postgres/blob/8e5eef50c5b41fd39ad60365c9c1b46782f881ca/src/include/catalog/pg_proc.dat#L1520-L1522

which makes me think that PostgreSQL is not implicitly casting regtype to text.

It makes sense to me that the casting works in Materialize because it is implemented on the main coord thread that has access to the catalog, but formatting happens on the pgwire client thread which doesn't have access to the catalog.

We probably want to keep the proposed column as a maclitem[] type and not text[], because maclitem carries more semantic meaning internally than text and is not denormalized. However, then we'd have to somehow force all queries against maclitem to be implicitly cast to text, which is a bit of a strange behavior.

jkosh44 · 2023-03-29T17:30:59Z

All this is just if we want maclitem to be formatted nicely like aclitem, which is probably required for certain meta-commands to work properly but not strictly needed for RBAC to work properly. Otherwise we can just format maclitem using raw role IDs and have it look ugly.

maddyblue · 2023-03-29T18:32:24Z

it just has custom display logic for regtype

This is a part of postgres I don't yet understand. The type of regtype is regtype (==oid), but it always returns a string datum. I don't get how it's correct for pg to announce it's a regtype, but then serialize it as a string always.

maddyblue · 2023-03-29T18:37:50Z

doc/developer/design/20230216_role_based_access_control.md

+- We will create a new Rust struct called `AclMode` that is a bit flag containing all privileges.
+    - See https://github.com/postgres/postgres/blob/3aa961378b4e517908a4400cdc476ca299693de9/src/include/nodes/parsenodes.h#L74-L101
+    for PostgreSQL's implementation.
+- We will create a new datatype called `maclitem` in the `mz_internal` schema to represent a single privilege.


If we're going to add a new type, it needs to be described more. Can you list the operators, functions, and casts it supports. Same for the array version of it. I see pg has some details at https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-ACLITEM-OP-TABLE.

Also how they will be serialized over pgwire in text and binary. (This is probably just: "exactly the same as postgres" but worthwhile to do the investigation for how postgres does it first to make sure we also can.)

I've added these details.

A simpler approach, that I'm starting to like better, is to add the following system table:

CREATE TABLE mz_privileges ( object_id: Text, -- ID of object. grantee: Text, -- ID of role that has a certain privilege. grantor: Text, -- ID of role that granted the privilege. privileges: Text, -- Letter codes of privileges concatenated together. );

maclitem is slightly more complicated than I was hoping for because our repr layer doesn't have access to RoleIds.

jkosh44 · 2023-03-29T18:45:32Z

it just has custom display logic for regtype

This is a part of postgres I don't yet understand. The type of regtype is regtype (==oid), but it always returns a string datum. I don't get how it's correct for pg to announce it's a regtype, but then serialize it as a string always.

I think this is the case for all non-trivial data types though, right? For example internally an interval is one 64-bit integer and two 32-bit integers:

https://github.com/postgres/postgres/blob/8e5eef50c5b41fd39ad60365c9c1b46782f881ca/src/include/datatype/timestamp.h#L44-L53

However, when interval's are displayed to the user they are first formatted as a string:

postgres=# SELECT make_interval(1, 2, 3, 4, 5, 6, 7);
         make_interval          
--------------------------------
 1 year 2 mons 25 days 05:06:07
(1 row)

postgres=# SELECT pg_typeof(make_interval(1, 2, 3, 4, 5, 6, 7));
 pg_typeof 
-----------
 interval
(1 row)

postgres=# SELECT make_interval(1, 2, 3, 4, 5, 6, 7)::text;
         make_interval          
--------------------------------
 1 year 2 mons 25 days 05:06:07
(1 row)

postgres=# SELECT pg_typeof(make_interval(1, 2, 3, 4, 5, 6, 7)::text);
 pg_typeof 
-----------
 text
(1 row)

Even for simple ones like dates and times, which are just a single integer, there is custom formatting when displaying to the user:

https://github.com/postgres/postgres/blob/8e5eef50c5b41fd39ad60365c9c1b46782f881ca/src/include/utils/date.h#L23-L25

postgres=# SELECT make_time(1, 2, 3);
 make_time 
-----------
 01:02:03
(1 row)

postgres=# SELECT pg_typeof(make_time(1, 2, 3));
       pg_typeof        
------------------------
 time without time zone
(1 row)

postgres=# SELECT make_time(1, 2, 3)::text;
 make_time 
-----------
 01:02:03
(1 row)

postgres=# SELECT pg_typeof(make_time(1, 2, 3)::text);
 pg_typeof 
-----------
 text
(1 row)

postgres=# 
postgres=# SELECT make_date(1, 2, 3);
 make_date  
------------
 0001-02-03
(1 row)

postgres=# SELECT pg_typeof(make_date(1, 2, 3));
 pg_typeof 
-----------
 date
(1 row)

postgres=# SELECT make_date(1, 2, 3)::text;
 make_date  
------------
 0001-02-03
(1 row)

postgres=# SELECT pg_typeof(make_date(1, 2, 3)::text);
 pg_typeof 
-----------
 text
(1 row)

maddyblue · 2023-03-29T18:48:54Z

when displaying to the user

What does this mean? Like, what's the difference between a regproc displayed to the user and not displayed to the user? How does it determine if something is displayed to the user? Is it anything in the final SELECT projection? It feels like there's some data layer mz lacks here.

jkosh44 · 2023-03-29T19:00:34Z

when displaying to the user

What does this mean? Like, what's the difference between a regproc displayed to the user and not displayed to the user? How does it determine if something is displayed to the user? Is it anything in the final SELECT projection? It feels like there's some data layer mz lacks here.

For each datatype in PostgreSQL, there's a X_out function that serializes the datatype into a c-string, for example the regtypeout function linked above. I'm not exactly sure at what point this is called in PostgreSQL, but it's my understanding that it's called on all the values in every row right before that row is sent over the wire to the client when sending a query response (there may be some nuances here that I don't understand between the text and binary protocol).

Our equivalent of this is when we encode values here:

materialize/src/pgrepr/src/value.rs

Lines 302 to 311 in 3fbb6ed

    
           /// Serializes this value to `buf` in the specified `format`. 
        
           pub fn encode(&self, ty: &Type, format: Format, buf: &mut BytesMut) -> Result<(), io::Error> { 
        
               match format { 
        
                   Format::Text => { 
        
                       self.encode_text(buf); 
        
                       Ok(()) 
        
                   } 
        
                   Format::Binary => self.encode_binary(ty, buf), 
        
               } 
        
           }

The big difference is that when we encode values, we don't have access to the catalog.

what's the difference between a regproc displayed to the user and not displayed to the user

I think fundamentally the break down is as follows:
Not displayed to the user would be how a regproc is stored on disk and how it's stored in memory when being manipulated by functions and operators.
Displayed to the user would be how a regproc is serialized before being sent over the wire to a client.

jkosh44 · 2023-03-29T19:01:23Z

Though the more I think about it, the more I think that it's probably fine if maclitem matches retype and encodes itself as text using raw IDs instead of role names.

jkosh44 · 2023-03-29T19:15:58Z

Sorry I don't think I fully understood what was going on at first which is why my language was inexact/vague. I understand what's happening now, so here's a summary that uses more exact language to avoid confusion.

Currently in Materialize, our text encoding of the regtype and regproc for the pgwire protocol does not match PostgreSQL. The text encoding that ProstgreSQL uses for regtype and regproc uses the type and function's human readable name instead of raw OIDs. The text encoding that Materialize uses for regtype and regproc uses the raw OIDs.

The reason this discrepancy exists is because PostgreSQL has access to the catalog when doing text encoding, while Materialize does not.

As a corollary, the text encoding of the proposed type, maclitem will use raw role IDs instead of human readable role names.

I'll add this to the design doc.

jkosh44 · 2023-04-19T20:03:37Z

I'm going to go ahead and merge this. If there's updates later then I'll open a new PR.

jkosh44 requested review from benesch, ggnall, maddyblue and a team February 16, 2023 19:52

doc: Add design doc for RBAC

a40dbb0

Part of MaterializeInc#11579

jkosh44 force-pushed the rbac-design branch from 9723157 to a40dbb0 Compare February 16, 2023 19:58

jkosh44 added 2 commits February 16, 2023 15:10

Fix clippy

6f2dd7c

Add Postgres source files

7309994

maddyblue reviewed Feb 16, 2023

View reviewed changes

jkosh44 commented Feb 16, 2023

View reviewed changes

doc/developer/design/20230216_role_based_access_control.md Outdated Show resolved Hide resolved

jkosh44 requested a review from chaas February 16, 2023 20:43

Fix typo

365cd1d

jkosh44 mentioned this pull request Feb 21, 2023

sql: Add attributes to role #17727

Merged

4 tasks

Add details for PUBLIC and frontegg admins.

e2ffd65

maddyblue approved these changes Feb 21, 2023

View reviewed changes

chaas reviewed Feb 22, 2023

View reviewed changes

jkosh44 commented Feb 22, 2023

View reviewed changes

sjwiesman reviewed Feb 23, 2023

View reviewed changes

doc/developer/design/20230216_role_based_access_control.md Outdated Show resolved Hide resolved

Add default privileges and ownerships for a fresh Materialize install

c58d42e

benesch reviewed Feb 27, 2023

View reviewed changes

jkosh44 added 3 commits February 27, 2023 09:40

Fix default owners and privileges

e385c3c

Update drop owned and reassigned owned

f1a9d10

Move drop owned and reassigned owned to optional phase

e7557b2

Move LOGIN and SUPERUSER to session attributes

fc2b252

jkosh44 added 3 commits March 20, 2023 11:24

Add owner implementation

76d5646

Split ownership and privileges into separate phases

8617461

Add implementation details for privileges

a448732

maddyblue reviewed Mar 29, 2023

View reviewed changes

jkosh44 added 3 commits March 29, 2023 16:09

Update maclitem details

6290671

Fix formatting

917034a

Fix formatting

b8e0a22

jkosh44 mentioned this pull request Apr 10, 2023

sql: Add more variables for controlling RBAC #18650

Merged

4 tasks

jkosh44 added 5 commits April 10, 2023 17:29

Update rollout plan

d93fa81

Update privilege defaults

c4e0c24

Update grant and revoke syntax

e4ff026

Fix clippy

f5f8bf1

Update default privileges

2831b3e

jkosh44 merged commit 6a36355 into MaterializeInc:main Apr 19, 2023

jkosh44 deleted the rbac-design branch April 19, 2023 20:03

materialize-bot mentioned this pull request Apr 21, 2023

release: v0.52.0 required reviews #18907

Closed

12 tasks

jkosh44 mentioned this pull request Apr 24, 2023

[Epic] Role Based Access Control #11579

Closed

1 task

		When a new user logs in, we will create a new role for them with only the `LOGIN` and `INHERIT` attributes. If that user
		is a frontegg admin, then the role will also have the `SUPERUSER` attribute.

doc: Add design doc for RBAC #17702

doc: Add design doc for RBAC #17702

Conversation

jkosh44 commented Feb 16, 2023

Motivation

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkosh44 commented Feb 21, 2023 • edited Loading

jkosh44 commented Feb 21, 2023

maddyblue left a comment

Choose a reason for hiding this comment

benesch commented Feb 22, 2023 • edited Loading

chaas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkosh44 commented Feb 24, 2023

benesch left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chaas commented Feb 27, 2023

jkosh44 commented Feb 27, 2023

jkosh44 commented Mar 29, 2023 • edited Loading

maddyblue commented Mar 29, 2023

jkosh44 commented Mar 29, 2023

maddyblue commented Mar 29, 2023

jkosh44 commented Mar 29, 2023

jkosh44 commented Mar 29, 2023

maddyblue commented Mar 29, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkosh44 commented Mar 29, 2023

maddyblue commented Mar 29, 2023

jkosh44 commented Mar 29, 2023 • edited Loading

jkosh44 commented Mar 29, 2023

jkosh44 commented Mar 29, 2023

jkosh44 commented Apr 19, 2023

jkosh44 commented Feb 21, 2023 •

edited

Loading

benesch commented Feb 22, 2023 •

edited

Loading

benesch left a comment •

edited

Loading

jkosh44 commented Mar 29, 2023 •

edited

Loading

jkosh44 commented Mar 29, 2023 •

edited

Loading