Support TypeScript types generation in /schema endpoint #18121

Open
rijkvanzanten opened this issue Apr 10, 2023 · 21 comments · May be fixed by #19867

Comments

@rijkvanzanten
Member

In addition to GraphQL and OAS, we should be able to generate TypeScript types in the /schema endpoint. #8531 was a great start. I think we've learned there that we should use the database-key version of the table names in the generated types (even though it's kinda against common TypeScript practice) to prevent naming conflicts, similar to how we're doing that in GraphQL.

@johnsusek

#5418 (comment)

@johnsusek

johnsusek commented Jan 2, 2024

Per my above comment, the OAS can be used to generate types, e.g.:

npx openapi-typescript http://directus.local/server/specs/oas -o ./schema.d.ts

Going this route lets the /schema endpoint remain (programming language) independent.

https://github.com/drwpow/openapi-typescript

I think if we added this to the docs somewhere it could take care of a lot of use cases. What's cool is some tools can even generate Redux/Vuex stores, or even entire UIs, from an OAS, which dovetails perfectly with directus.

And if the OAS doesn't contain enough metadata to generate any special types we need, it will encourage work on the spec to benefit all clients/languages.

For the SDK maybe openapi-typescript can be integrated into a 'typesync' command that just fetches and generates the .d.ts for the user

@johnsusek

One thing to keep in mind w/r/t types is that when using fields like:

?fields=*,sections.*,sections.buttons.*.*

the syntax to access an individual button is sections[0].buttons[0].item.title as opposed to sections[0].buttons[0].title

The generated types aren't aware of this item property required for the tertiary relationship. I'm not sure if/how this could be addressed in the oas.
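To make the shape concrete, here is a minimal sketch of what the nested response looks like when typed by hand. The collection names (Page, Section, Button) are hypothetical, and the junction fields `collection` and `item` are assumed to follow the default many-to-any layout; nothing here is generated by Directus or by openapi-typescript.

```typescript
// Hypothetical collections for illustration; none of these names come from a real schema.
interface Button {
  id: number;
  title: string;
}

// A many-to-any relation goes through a junction row. The related record
// sits under `item`, and `collection` says which collection it belongs to.
interface SectionButtonJunction {
  id: number;
  collection: "buttons";
  item: Button;
}

interface Section {
  id: number;
  buttons: SectionButtonJunction[];
}

interface Page {
  id: number;
  sections: Section[];
}

// Reads the title of the first button in the first section, if there is one.
function firstButtonTitle(page: Page): string | undefined {
  return page.sections[0]?.buttons[0]?.item.title;
}

const page: Page = {
  id: 1,
  sections: [
    {
      id: 10,
      buttons: [{ id: 100, collection: "buttons", item: { id: 7, title: "Buy now" } }],
    },
  ],
};

console.log(firstButtonTitle(page));
```

With types like these, writing sections[0].buttons[0].title is a compile error, which is the mismatch described above.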

@br41nslug
Member

br41nslug commented Jan 3, 2024

The difference I see is that we have "input types" and "output types". OAS and general TypeScript types are often geared to be "output types", aka the exact type you'll be getting from an API or function. But the SDK needs "input types" with all the possible/optional properties and some more metadata to correctly build filter and field suggestions, and it then generates the "output types" based on the specific queries performed.

Which is why I still think we'll need an extra SDK-oriented type generator over a generic one (which can be based on the existing OAS), or both really.

Edit: Perhaps OAS targets "input types" too, but it doesn't account for Directus-specific constructs like field types to apply functions to in filters/aggregation.

The generated types aren't aware of this item property required for the tertiary relationship. I'm not sure if/how this could be addressed in the oas.

The SDK checks for the item property to determine the specific m2a relation so that should be good
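A rough sketch of the input/output distinction, assuming a hypothetical Article collection. The `selectFields` helper and `QueryOutput` type are made up for illustration and are not how the SDK is implemented: the "input type" is the full collection shape, and the "output type" is computed from the fields a query asks for.

```typescript
// Hypothetical "input" type: the full shape of a collection.
interface Article {
  id: number;
  title: string;
  body: string | null;
  views: number;
}

// The "output" type is derived from the fields a query requests.
type QueryOutput<T, K extends keyof T> = Pick<T, K>;

// Hypothetical helper that keeps only the requested fields at runtime,
// mirroring what the computed type says.
function selectFields<T extends object, K extends keyof T>(
  item: T,
  fields: readonly K[],
): QueryOutput<T, K> {
  const out = {} as Pick<T, K>;
  for (const field of fields) {
    out[field] = item[field];
  }
  return out;
}

const article: Article = { id: 1, title: "Hello", body: null, views: 42 };

// `summary` is typed as { id: number; title: string },
// so reading summary.body would be a compile error.
const summary = selectFields(article, ["id", "title"]);
console.log(summary);
```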

@codeit-ninja
Contributor

Per my above comment, the OAS can be used to generate types, e.g.:

npx openapi-typescript http://directus.local/server/specs/oas -o ./schema.d.ts

Going this route lets the /schema endpoint remain (programming language) independent.

https://github.com/drwpow/openapi-typescript

I think if we added this to the docs somewhere it could take care of a lot of use cases. What's cool is some tools can even generate Redux/Vuex stores, or even entire UIs, from an OAS, which dovetails perfectly with directus.

And if the OAS doesn't contain enough metadata to generate any special types we need, it will encourage work on the spec to benefit all clients/languages.

For the SDK maybe openapi-typescript can be integrated into a 'typesync' command that just fetches and generates the .d.ts for the user

Don't take this approach. The types it generates are not accurate; for example, all properties are optional. I've tried this method, and it doesn't work properly.

@echocrow

echocrow commented Jan 11, 2024

Would love to see first-party support for type generation! If done right, this could be leveraged in several places, such as

  • Typing the Knex database instance (commonly available in extension contexts)
  • Typing ItemsService (also commonly available in services in extension contexts)
  • Typing extensions, particularly hooks, from hook names (<collection>.items.create, etc.) to their payloads and so on.

Probably already on the radar, but some key aspects that I think would be useful:

  • Indexing/formatting:
    As per OP, types should at least be indexed by the database-key version of the table names (in Directus that defaults to snake_case, but this is not enforced). Beyond avoiding conflicts, this increases compatibility with many other tools.
    • Further regarding type name formatting: An interface my_collection {} does feel counter to common TS convention, but I get that auto-formatting to PascalCase could result in conflicts. I'd still lean towards auto-formatting, assuming type generation is an opt-in process (i.e. this would only affect those who do drill into type generation and also use near-identical table names that map to the same PascalCase), or if perhaps formatting can be opted out of.
    • Alternatively: Maybe neither create nor export individual collection types, and just generate one all-containing Collections interface? Problem "solved" by avoiding. Individual collections could then still be referenced via e.g. Collections['my_collection'].
  • Generation Usage:
    Beyond providing an endpoint, would be great to have this as a CLI command or even an exported function for custom integrations.
  • Optional vs Required:
    As per above comment, types should indicate whether fields are required or optional fields.
  • Complete Schema:
    Beyond user-defined collections, it would be useful to also generate or provide types for directus_* tables. These can then be used in Knex select queries, and <system-collection>.* hooks such as files.insert or fields.update.
  • Relations:
    Maybe somehow capturing O2O/M2O/O2M relationships, aka references to other collections. However, I'm not sure which tool per se would benefit from this. I don't think Knex cares, and I don't think ItemsService has type support for this, but I'd love to be wrong.
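To illustrate the naming point from the list above, here is a small sketch with two hypothetical collection keys that collapse to the same PascalCase name. The `toPascalCase` function is an assumption about how an auto-formatter might work, not an existing Directus utility. Indexing a single Collections interface by the raw database key sidesteps the collision.

```typescript
// One all-containing interface, indexed by the raw database keys.
// Both keys are hypothetical and chosen to collide after formatting.
interface Collections {
  my_collection: { id: number; title: string };
  myCollection: { id: number; label: string };
}

// Individual collections are referenced by indexed access instead of by name.
type MyCollectionItem = Collections["my_collection"];

// A naive formatter: split on underscores and capitalize each part.
function toPascalCase(key: string): string {
  return key
    .split("_")
    .filter(Boolean)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

const item: MyCollectionItem = { id: 1, title: "First" };

// Both keys would produce the same interface name.
console.log(toPascalCase("my_collection"), toPascalCase("myCollection"), item.title);
```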

ATM we're checking most of the above by generating types via kysely-codegen. We've tried several other approaches, such as openapi-typescript, kanel, pg-to-ts, etc., but they all fell short in some ways for our wish list.
Like most others, Kysely's codegen derives all its types from the database directly (similar to @directus/schema, I believe). That means we do lose some extra meta information, such as string enums in Directus Interfaces, but it's fast and good-enough for us.
We paired that with some extra utility types, e.g. to resolve or mark Generated<...> types as optional, and to create separate select/insert/update types. The latter is useful for Knex's table type defs.
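For anyone curious what such utility types can look like, here is a self-contained sketch. `ColumnType` and `Generated` below are simplified stand-ins written for this example, not Kysely's actual definitions, and the Products shape is hypothetical.

```typescript
// Simplified stand-ins for the helper types used in kysely-codegen output.
// These are assumptions for this sketch, not Kysely's real definitions.
type ColumnType<S, I = S, U = S> = {
  readonly __select__: S;
  readonly __insert__: I;
  readonly __update__: U;
};
type Generated<T> = ColumnType<T, T | undefined, T>;

// Extract the operation-specific type of a single column.
type SelectOf<C> = C extends ColumnType<infer S, any, any> ? S : C;
type InsertOf<C> = C extends ColumnType<any, infer I, any> ? I : C;
type UpdateOf<C> = C extends ColumnType<any, any, infer U> ? U : C;

// Columns whose insert type allows `undefined` become optional keys.
type OptionalInsertKeys<T> = {
  [K in keyof T]: undefined extends InsertOf<T[K]> ? K : never;
}[keyof T];
type RequiredInsertKeys<T> = {
  [K in keyof T]: undefined extends InsertOf<T[K]> ? never : K;
}[keyof T];

type Selectable<T> = { [K in keyof T]: SelectOf<T[K]> };
type Insertable<T> = { [K in RequiredInsertKeys<T>]: InsertOf<T[K]> } & {
  [K in OptionalInsertKeys<T>]?: InsertOf<T[K]>;
};
type Updateable<T> = { [K in keyof T]?: UpdateOf<T[K]> };

// Hypothetical generated table type.
interface Products {
  id: Generated<number>;
  name: string;
  slug: Generated<string | null>;
  device: string | null;
}

// In-memory stand-in for an insert; fills in the generated columns.
let nextId = 1;
function insertProduct(input: Insertable<Products>): Selectable<Products> {
  return {
    id: input.id ?? nextId++,
    name: input.name,
    slug: input.slug ?? null,
    device: input.device,
  };
}

const created = insertProduct({ name: "Lamp", device: null }); // id and slug omitted
const patch: Updateable<Products> = { name: "Desk lamp" };     // every key optional
console.log(created, patch);
```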

@ryami333

Related: #21805

@louisprp

Supabase have developed a very nice Postgres extension called pg_meta which has an endpoint to retrieve the typescript types from the database. Their sdk implementation might differ a bit, but they also differentiate between “input” and “output” (and update) types. The implementation is quite solid and was the first thing that came to mind. Might be worth a shot.

@echocrow

they also differentiate between “input” and “output” (and update) types.

so does kysely-codegen, and it kinda makes sense. (e.g. often you'd omit primary keys on insert, and are guaranteed a primary key on select).
this actually also works well when typing the aforementioned Knex's table types.

for general types, and for Directus' ItemsService, we've defaulted to the generated "select" types. been using that for a few months now with great success to type

  • Knex's query builder
  • Directus' ItemsService
  • Directus' read/create/update/delete item hooks (${TCollection}.items.${TAction}) and their payloads
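A minimal sketch of how hook names and payloads can be tied together with template literal types. The Collections map is hypothetical and `onItemEvent` is a stand-in written for this example, not the Directus hook registration API.

```typescript
// Hypothetical collection map; in practice this would come from generated types.
interface Collections {
  products: { id: number; name: string };
  countries: { id: number; code: string };
}

type Action = "read" | "create" | "update" | "delete";

// All valid event names, e.g. "products.items.create".
type ItemEvent = `${keyof Collections & string}.items.${Action}`;

// Recover the collection key from an event name.
type CollectionOf<E extends string> = E extends `${infer C}.items.${string}` ? C : never;

// The payload is a partial record of the matching collection.
type PayloadOf<E extends ItemEvent> = Partial<Collections[CollectionOf<E> & keyof Collections]>;

const log: string[] = [];

// Stand-in for a hook call: the payload type follows from the event name.
function onItemEvent<E extends ItemEvent>(event: E, payload: PayloadOf<E>): void {
  log.push(`${event}: ${JSON.stringify(payload)}`);
}

onItemEvent("products.items.create", { name: "Lamp" });
// onItemEvent("products.items.create", { code: "DE" }); // compile error: `code` is not a product field
// onItemEvent("product.items.create", {});              // compile error: unknown collection

console.log(log);
```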

@rijkvanzanten
Member Author

@louisprp @echocrow Do they differentiate between create and update as well, or just input/output? In Directus you can have different field access for update permissions so just making sure that's covered in the input/output type difference.

@louisprp

louisprp commented Mar 29, 2024

The following snippet taken from their docs describes the generated types given this schema:

create table public.movies (
  id bigint generated always as identity primary key,
  name text not null,
  data jsonb null
);

Which results in:

export type Json = string | number | boolean | null | { [key: string]: Json | undefined } | Json[]

export interface Database {
  public: {
    Tables: {
      movies: {
        Row: {
          // the data expected from .select()
          id: number
          name: string
          data: Json | null
        }
        Insert: {
          // the data to be passed to .insert()
          id?: never // generated columns must not be supplied
          name: string // `not null` columns with no default must be supplied
          data?: Json | null // nullable columns can be omitted
        }
        Update: {
          // the data to be passed to .update()
          id?: never
          name?: string // `not null` columns are optional on .update()
          data?: Json | null
        }
      }
    }
  }
}

Here, all operations are supported with the correct type mappings (select, insert, update). As far as per-field permissions go, I don’t think this is supported. The schema will be generated for your whole database, no matter the role. In this case, I would agree with other comments from #21805, that it definitely makes more sense to have it generated this way, if that’s what you are referring to.

@rijkvanzanten
Member Author

As far as per-field permissions go, I don’t think this is supported. The schema will be generated for your whole database, no matter the role. In this case, I would agree with other comments from #21805, that it definitely makes more sense to have it generated this way

What do you mean with this way in that sentence? You mean the way where it's the same global schema for everybody, even though you may or may not have access, or the way where you have a schema for the actual access control that's available to your user?

@echocrow

@louisprp @echocrow Do they differentiate between create and update as well, or just input/output? In Directus you can have different field access for update permissions so just making sure that's covered in the input/output type difference.

@rijkvanzanten I don't recall this for all the different libs we tried in the past, but kysely-codegen does differentiate between select vs insert vs update.
Semi-noteworthy: Unlike pg_meta (based on @louisprp's post), it does this not by defining every collection (i.e. table) three times, but by using a helper type where applicable to differentiate between select/insert/update.
The helper type looks like this…

type ColumnType<
  SelectType,
  InsertType = SelectType,
  UpdateType = SelectType,
> = ...

and other helper types are available to extract an op-specific type when needed.

Excerpt of the generated schema:

interface Products {
  id: Generated<number>;
  name: string;
  type: Generated<string>;
  slug: Generated<string | null>;
  device: string | null;
  design: string | null;
  version: string | null;
  // ...
}

// ...

export interface DB {
  // ...
  countries: Countries;
  // ...
  directus_activity: DirectusActivity;
  // ...
  products: Products;
  product_types: ProductTypes;
  // ...
}

The generated types are not perfect from introspection alone (or at least not from how we're using kysely-codegen). Some noteworthy caveats we're seeing today:

  • Currently no relational references. E.g. our products have a type field that references a product_types record. The generated types currently do not provide any hints that this type is a reference to product_types.
  • No "optional" keys; values are either some type T or T | null when nullable. (We wrote our own little helper to mark nullable fields optional keys. This was good enough for us.)
  • Some ID keys are semi-correct; e.g. directus_files's id field is typed as string, but should probably be Generated<string> instead, as it's optional on insert.
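For reference, a helper that turns nullable fields into optional keys can be sketched roughly like this. It is an illustration written for this comment, not our exact helper, and the Product shape and `fillNulls` function are hypothetical.

```typescript
// Keys whose value type includes `null`, and the remaining keys.
type NullableKeys<T> = { [K in keyof T]: null extends T[K] ? K : never }[keyof T];
type NonNullableKeys<T> = { [K in keyof T]: null extends T[K] ? never : K }[keyof T];

// Nullable fields become optional keys; everything else stays required.
type NullableOptional<T> = { [K in NonNullableKeys<T>]: T[K] } & {
  [K in NullableKeys<T>]?: T[K];
};

// Hypothetical introspected type where nullable columns are `T | null`.
interface Product {
  id: number;
  name: string;
  device: string | null;
  design: string | null;
}

// `device` and `design` may now be left out entirely.
const draft: NullableOptional<Product> = { id: 1, name: "Lamp" };

// Restores the full shape by filling omitted nullable fields with null.
function fillNulls(input: NullableOptional<Product>): Product {
  return {
    id: input.id,
    name: input.name,
    device: input.device ?? null,
    design: input.design ?? null,
  };
}

console.log(fillNulls(draft));
```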

For our usecase though, with a little bit of duct taping it's good and fast enough. It's now part of our code gen script and executes in ~800ms. That's including starting and destroying the DB connection taken from Directus's env variables.

Unsure if the state of how we use it today overlaps much with how Directus may want to go about type generation. But if there's any interest, happy to share both our codegen script and how we use it to type Knex, Hooks, and the ItemsService. (Those were the APIs where type safety was most important to us).

@louisprp

louisprp commented Mar 29, 2024

What do you mean with this way in that sentence?

Yes, in this case I mean that it’s the same global schema for everyone. Maybe I misunderstood the discussion from glancing over it. I believe that this is the best approach, considering that other tools provide the same experience (e.g. the aforementioned supabase postgres extension and ORMs like Drizzle, as far as I know). Furthermore, I am not sure of the real benefit security wise of doing it differently (on a per user basis). Not sure if this is even a consideration, just throwing it out here.

@rijkvanzanten
Member Author

Yes, in this case I mean that it’s the same global schema for everyone. Maybe I misunderstood the discussion from glancing over it. I believe that this is the best approach, considering that other tools provide the same experience (e.g. the aforementioned supabase postgres extension and ORMs like Drizzle, as far as I know). Furthermore, I am not sure of the real benefit security wise of doing it differently (on a per user basis). Not sure if this is even a consideration, just throwing it out here.

This is what continues to confuse me! Your role/user's access control directly determines what columns are available in a table through the API. So if you generate a "global schema", you end up with a schema that has more fields in it than you can actually access with the user/role you're using with the SDK 🤔 In an ORM like Drizzle, you're in the server-side so it's always full-admin thus it makes sense to have a single global schema, but for Directus, where the output schema of the API is dependent on the role/user you're using, it means the global schema is most likely not accurate for your user. (It would also become a security vulnerability as you're now exposing the whole data model to users who might not have access to it, but that's another discussion)

Right now, Directus' schema introspection for GraphQL/OpenAPI will return it based on the role/user you specify when generating the schema to make sure that the schema you get is actually the schema you can reliably use. To me it feels like it should behave the same for TS for the aforementioned reasons.
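As a rough sketch of the difference, assuming a hypothetical global schema and a hypothetical permissions map for one role (none of these names exist in Directus), a role-scoped schema can be thought of as the global one narrowed to the readable collections and fields:

```typescript
// Hypothetical global schema, as an admin would see it.
interface GlobalSchema {
  articles: { id: number; title: string; internal_notes: string | null };
  invoices: { id: number; total: number };
}

// Hypothetical read permissions for one role: collection -> readable fields.
const editorPermissions = {
  articles: ["id", "title"],
} as const;

// Keep only permitted collections, and within them only permitted fields.
type ScopedSchema<S, P> = {
  [C in keyof P & keyof S]: P[C] extends readonly (infer F)[]
    ? Pick<S[C], F & keyof S[C]>
    : never;
};

// Resolves to { articles: { id: number; title: string } };
// `invoices` and `internal_notes` are not part of it.
type EditorSchema = ScopedSchema<GlobalSchema, typeof editorPermissions>;

const sample: EditorSchema["articles"] = { id: 1, title: "Hello" };

// Runtime counterpart: drop every field the role cannot read.
function stripFields(
  item: Record<string, unknown>,
  allowed: readonly string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of allowed) {
    if (key in item) out[key] = item[key];
  }
  return out;
}

const raw = { id: 1, title: "Hello", internal_notes: "draft" };
const visible = stripFields(raw, editorPermissions.articles);
console.log(sample, visible);
```

A global schema would type `internal_notes` and `invoices` as available even though requests made with the editor token could not read them.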

(hehe don't get me wrong just having it generate a single large schema for everybody all the time is a lot simpler to maintain, so it'd be nice if we can get away with that)

@louisprp

So if you generate a "global schema", you end up with a schema that has more fields in it than you can actually access with the user/role you're using with the SDK 🤔

I see your point in this and would agree that it makes more sense logically, although it is the first time that I see it implemented in this fashion. From memory I can only come up with schemas that are “global” by design (for example OpenAPI specs, GraphQL usually), even though some methods might not be accessible to all users. Furthermore, I would argue that the security benefit is more of a “security through obscurity” kind of thing that does not necessarily protect the data. In the end, I doubt that this would drastically increase the security, given that the people accessing the schema are developers familiar with the project anyway, and I think most of them just want to properly type the stuff that they implement. Also, the security should arguably not be provided by just not knowing about a specific table or column, but rather through the proper access controls, which should be the case with Directus.

In the end, it is hard to find a use case for user-specific schemas, since when you’re developing for multiple users I can’t see how it could be done, and if you’re only developing for one specific use case you might as well just have the whole schema, but maybe you have some other experiences to bring up.

@rijkvanzanten
Member Author

[...] schemas that are “global” by design [...], even though some methods might not be accessible to all users. [...] I think most of them just want to properly type the stuff that they implement.

These things to me are in direct contradiction with each other! If you have a schema with stuff you can't actually use or access, you don't have a properly typed setup 🫠

Also, the security should arguably not be provided by just not knowing about a specific table or column, but rather through the proper access controls, which should be the case with directus.

This is a long-running, often-debated topic. For instances where security is of the essence, giving away the data model beyond what the user can access already gives away too much private data. It's the same reason why Directus returns 403s instead of 404s; you can't currently extract the data model unless you're authenticated, and you can only see the data model for the parts you can read at that point.

@sangrepura

I am just starting to evaluate Directus, and reading through this conversation it's still not clear to me if Directus can be introspected with a tool like graphql-codegen? Do they not have first-class TypeScript support? Are the types presented by Directus actually inaccurate?

@echocrow

These things to me are in direct contradiction with each other! If you have a schema with stuff you can't actually use or access, you don't have a properly typed setup 🫠

As you pointed out, this would also only be applicable to SDK usage (or ItemsService with custom accountability) for TS developers performing an operation with a limited-access role, accessing specifically a collection/field outside their scope.

If I may lean on the 80/20 rule here: It sure would be nice to have a 100% perfectly accurate schema, which is precisely tuned to the given access roles. But if this increases the technical complexity significantly, I'd be in favor of having mostly accurate types (ones that may include extraneous collections/fields, but are otherwise accurate) sooner, rather than no type hinting for significantly longer.

RE security: I also find myself with one foot in either camp. On one hand, it does sound like a "security through obscurity" fallacy. At the same time, I also would not want all our internal company table and field names to be publicly listed—more for reasons of public perception than actual security.

In the end, this may boil down to the use case this is catering for. For us today, all our generated types live in private repos, where schema-wide types are fine, and are primarily used server-side. However, if we were a SaaS, and our goal was to provide types for clients and their developers using the Directus SDK, I can see myself needing RBAC-specific types. 🫠

Perhaps a community poll could help determine how important role-specific type generation is over more generic, role-agnostic types? Or perhaps this could be tackled as a multi-staged release?

@WoLfulus
Contributor

I've been using my own tool fttb: https://github.com/linefusion/indirectus/

Example:
npx indirectus sdk generate --url http://localhost:8055 --token <some-static-token-with-admin-privileges>

IMHO a more "generic" approach (like what I did in that tool by using templates) would open the door to different implementations and customizations. It's just a matter of exposing well-structured and complete data so that the template engine can generate something valid.

Also, exposing a /metadata[?role=id] that responds with that data would be really useful and enables client side generation too.

I did this because I needed TS types, but also C# code for a custom SDK I have, and a Rust backend (just by having additional templates in place).

@heyarne
Contributor

heyarne commented Apr 18, 2024

This is what continues to confuse me! Your role/user's access control directly determines what columns are available in a table through the API. So if you generate a "global schema", you end up with a schema that has more fields in it than you can actually access with the user/role you're using with the SDK.

For our use-case, this is very OK. We generate a typed directus client package (at the moment via openapi-typescript, with all the caveats mentioned in this thread) using an access token that can read the whole schema. This client is then distributed internally, among other packages in our monorepo. Some are bots, some are frontends and so on. These all use their own token to access the backend, so access control happens on the server, when the request is made, which is in line with what I would expect.

I understand that metadata (collection names and relations) could be leaked this way. But that's not something that happens in our case, because as I said, we write the tools that consume our hand-rolled typed SDK. We already have that information. For us the main benefit is ensuring that responses and requests are well-typed (and they are still well-typed if the client knows about more collections than the token has access to, given that we need to handle errors anyways; it's an additional error, but doesn't change anything about the soundness of the types).

If I may lean on the 80/20 rule here: It sure would be nice to have a 100% perfectly accurate schema, which is precisely tuned to the given access roles. But if this increases the technical complexity significantly, I'd be in favor of having mostly accurate types (ones that may include extraneous collections/fields, but are otherwise accurate) sooner, rather than no type hinting for significantly longer.

I agree!

Projects
Status: 🙅 Blocked