Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split off openapi into postgrest-contrib #1698

Open
wolfgangwalther opened this issue Dec 18, 2020 · 17 comments
Open

Split off openapi into postgrest-contrib #1698

wolfgangwalther opened this issue Dec 18, 2020 · 17 comments
Labels

Comments

@wolfgangwalther
Copy link
Member

Problem

OpenAPI support is not up to date: We support v2, but v3 is out there for some years now (#932). Currently we have 26 issues open that are tagged "OpenAPI".

I'm sure some of those issues are not that hard to solve in theory. I feel like having the OpenAPI code in haskell is quite a hurdle for community contributions, though. The majority of our users probably does not know haskell at all - but would be able to make contributions in SQL.

Test coverage seems to be quite low for that area, too. I recently played a bit with #1653 - but was surprised to see that I didn't get any failing tests after changing some things.

Suggestion

I suggest to split off all the openapi related code. Remove it from core postgrest, put it in a separate (still official) postgrest-contrib repo and turn all the code into SQL. We already have the db-root-spec config option, to be able to use it.

Then we ship PostgREST without openapi support at all. In the docs we are very clear about how to add openapi support from our "official extension". It's basically just running a SQL script to create the function(s) + setting the config option. Without this config option, the root endpoint does not return anything (or maybe a hint on how to add openapi support, if that's better).

I see the following advantages of this:

  • Less complex codebase for postgrest core
  • Fewer stuff to load with the schema cache, might make startup faster
  • Hopefully increased community contributions for the openapi parts, because of lower hurdles to do so
  • Ability to provide OpenAPI v3 support in addition to v2 without breaking changes: Users are able to choose which set of functions to install.
  • Ability to disable OpenAPI output by not providing a db-root-spec config at all
  • Ability to extend / change the output specifically for a single project (currently root-spec is available as well, but you would still have to reproduce the whole code in SQL first)
  • Ability to add root endpoint filtering (How to show the JSON schema of only one table #1421) by treating the db-root-spec mostly like a regular RPC (with some small tweaks, it should be able to add something powerful)

The postgrest-contrib repo could have other SQL-only stuff, too. E.g. #1690 and other SQL snippets we are suggesting in the docs right now. We can have several distribution formats for this repo, e.g. as pg extension or just as one big SQL file (for cloud hosted pg servers). Users can install just some parts of this or everything - and then just enable those functions in the postgrest configuration.

Having this in a separate repo should make it a bit easier to track on the issue tracker. Also, while some SQL parts might be postgrest-version dependent, most parts while likely be usable across more versions. Having a different repo (and multi-version compatibility in mind) should make it easier to perform major version upgrades of postgrest - without the need to update the SQL code at the same time.

The biggest challenge I see will be making that openapi output reflect what PostgREST really does. Parts of the output are only reflecting the database schema and are unrelated to PostgREST itself (think db comments as an example), but others (think embedding opportunities - if those could be part of the openapi output, that would be great) are heavily dependent on PostgREST itself. To sync this, we could pass (a part of) the schema cache via JSON parameter to the openapi function, to make it aware of what PostgREST knows about the database. We'd still have to work out the details, of course.

@steve-chavez
Copy link
Member

@wolfgangwalther Sounds great! I agree overall. Even if we solve the current issues more would keep coming considering that OpenAPI is still evolving. So It'd be great to do openapi through SQL.

Until now, the blocker for this was how to pass the schema cache info to db-root-spec. Before I thought we'd need to offer our schema cache in SQL for this. But now you're proposing we pass it as an argument to the db-root-spec function right? I think that could work.

One thing though, should we keep the "crippled" output for users that don't want a more complete OpenAPI? Perhaps we could do this for some versions until our SQL OpenAPI gains more acceptance.

I say this because we got some feedback on #1511 (comment), I think most users would prefer postgrest stay low on the db state it needs.

@steve-chavez
Copy link
Member

It seems that this would be done in part by #1697, allowing the DbStructure to be simple json. We'd then pass that json(on Haskell) to db-root-spec on each / request. One thing great about that is that db-root-spec would be always up-to-date with our schema cache.

@steve-chavez
Copy link
Member

I wonder if we could pass a structured type instead of a big json to db-root-spec. Perhaps we could cast the json to a custom type. That way it'd be easier to work the openapi output.

@wolfgangwalther
Copy link
Member Author

Until now, the blocker for this was how to pass the schema cache info to db-root-spec. Before I thought we'd need to offer our schema cache in SQL for this. But now you're proposing we pass it as an argument to the db-root-spec function right? I think that could work.

Yes, although I would like to limit the amount of data passed in as much as possible. I would still expect the function to make calls to the pg catalog to get most of the data. Just the key pieces that need to be in sync with postgrest, would come in via argument.

Otherwise we would create too much of a dependency of the openapi output on postgrest core again. Whereever possible, the openapi repo / code / development should not be blocked by changes needed in core.

One thing though, should we keep the "crippled" output for users that don't want a more complete OpenAPI? Perhaps we could do this for some versions until our SQL OpenAPI gains more acceptance.

I would not really keep it in deliberately... but maybe we can just do the following:

  1. Extend db-root-spec to get data passed in via argument. This enables the use of our custom proc.

  2. Develop the SQL-only openapi output.

  3. Make a release of postgrest.

  4. Rip out the openapi schema from core.

This way we'd have the benefits of simplicity for postgrest core immediately for the nightly. The latest released version will still have the old openapi output in it, but people can already use the new SQL func with it. So we could already update the docs to recommend that.

So basically.. I'd keep it in for max 1 release, but with smart timing.

I say this because we got some feedback on #1511 (comment), I think most users would prefer postgrest stay low on the db state it needs.

Yes, this kind of feedback is actually why I came up with the idea. I find this solution much less invasive as the original lib.sql / pg_postgrest extension idea. Conceptually the difference is now that we never require any of that SQL code for postgrest to work with basic functions - while the postgrest extension was required in previous ideas.

Still - if we're going to make the openapi output an optional extension, we can do it right away. I don't think we need to keep it in for several versions in different quality.

It seems that this would be done in part by #1697, allowing the DbStructure to be simple json. We'd then pass that json(on Haskell) to db-root-spec on each / request. One thing great about that is that db-root-spec would be always up-to-date with our schema cache.

Yes, that's really good. Looking at some of the numbers Remo posted in that PR, we might need to encode the DbStructure to JSON only once immediately when we update the schema cache. Otherwise we'd need to encode it on every / request, which might be costly...

I wonder if we could pass a structured type instead of a big json to db-root-spec. Perhaps we could cast the json to a custom type. That way it'd be easier to work the openapi output.

Hm. We could stay flexible here, I think. If we just say the first unnamed argument will get the dbStructure JSON via json_to_record, then this would automatically do that conversion to whatever composite type we have defined on the DB side, right? Isn't that the same way we call RPCs now?

By using an unnamed argument, we'd still have all named arguments unused for input from the query-string.

@jsommr
Copy link

jsommr commented Jan 4, 2021

Can't wait for OpenAPI to get removed. It was a great idea but with no one stepping up to maintain it (and few OpenAPI Haskell libraries) it just makes it seem like PostgREST has more issues than it actually does.

@steve-chavez
Copy link
Member

steve-chavez commented Aug 30, 2021

I find this solution much less invasive as the original lib.sql / pg_postgrest extension idea

In regards to invasiveness, I wonder if we could extend PostgREST through an embedded scripting language such as Dhall and be even less invasive to the database. Another Haskell project that supports an scripting language is Pandoc, which supports Lua(examples).

Some pluses of Dhall:

  • Has the nice property of importing an script from HTTP/S URLs(ref)
  • The above would let us move our OpenAPI out of the core repo and still be backwards compatible(we'd just include our OpenAPI Dhall expression by default).
  • OpenAPI is basically a transformation from DbStructure to JSON, which seems fit for a non-turing complete language
  • Seems well integrated with Haskell(ref)

Big con:

  • We'd be moving away from "all SQL" to customize PostgREST, which brings more complexity for the users.

Just an idea I wanted to get off my head.

robx added a commit to robx/postgrest that referenced this issue Jun 7, 2022
Whereas the test used to specify a body of

  [{"qiName":"referrals","qiSchema":"test"},"private"]

we see both

  [{"qiName":"authors_w_entities","qiSchema":"test"},"test"]

and

  [{"qiName":"organizations","qiSchema":"test"},"test"]

on the GHC 9.2 upgrade PR.

This changes the test to no longer expect a particular body.

Compare:
- PostgREST#2292 (comment)
- PostgREST#1698
robx added a commit that referenced this issue Jun 7, 2022
Whereas the test used to specify a body of

  [{"qiName":"referrals","qiSchema":"test"},"private"]

we see both

  [{"qiName":"authors_w_entities","qiSchema":"test"},"test"]

and

  [{"qiName":"organizations","qiSchema":"test"},"test"]

on the GHC 9.2 upgrade PR.

This changes the test to no longer expect a particular body.

Compare:
- #2292 (comment)
- #1698
@tonyxiao
Copy link

tonyxiao commented Mar 1, 2023

Totally onboard with this. Right now we are resorting to proxying postgrest and modifying the root response with our own cusotmizations via typescript. Would love to contribute to the oss but brushing up on haskell and getting that whole dev env setup does not feel appealing at all when I have a deadline to meet 😂

@tonyxiao
Copy link

tonyxiao commented Mar 1, 2023

Would also enable alternative formats such as https://www.buildwithfern.com/

@steve-chavez
Copy link
Member

Right now we are resorting to proxying postgrest and modifying the root response with our own cusotmizations via typescript.

@tonyxiao Just FYI, it's possible to override the openapi spec with a custom function using the db-root-spec config. See #2673 (reply in thread)

@tonyxiao
Copy link

tonyxiao commented Mar 1, 2023

ah good to know, thanks for the tip!

@wayland
Copy link

wayland commented Oct 20, 2023

Hi!

I've started some work on this, but I have some questions:

Assumptions:

  • I see we don't have a postgrest-contib repo yet -- how did you envision this working?
    • As a follow-up, where are we storing the doco for the openapi stuff? My instinct says a docs folder in postgrest-contrib, but happy to be guided.
  • Are we OK with adding other Postgres extensions? I'm looking at https://github.com/petere/pguri but am also curious whether plperl would be an acceptable option. If I did something like this, I'd expect to include a Dockerfile based on postgres that adds these things
  • Is there a way I can configure PostgREST to call one of my functions when the config is updated?

Thanks!

@steve-chavez
Copy link
Member

Hey @wayland!

Those are great news! @laurenceisla Also started work here on https://github.com/PostgREST/postgrest-openapi. Perhaps you'd like to collaborate? The extension is still at early stages.

Is there a way I can configure PostgREST to call one of my functions when the config is updated?

Yes, check https://postgrest.org/en/stable/references/api/openapi.html#override-openapi

Are we OK with adding other Postgres extensions? I'm looking at https://github.com/petere/pguri but am also curious whether plperl would be an acceptable option.

Ideally the extension would be pure SQL so it's compatible with cloud providers. plperl is available on most I think, so maybe it's fine.

As a follow-up, where are we storing the doco for the openapi stuff? My instinct says a docs folder in postgrest-contrib, but happy to be guided.

Yeah, docs for the extension would be fine on the repo README. Currently this is missing since it's still a prototype but you can check the tests for some examples.

@wayland
Copy link

wayland commented Oct 21, 2023

Oh, great! He's gotten further than I have; I'll see what I can do that helps, though I may not get onto it until next weekend now.

Regarding the docs, I was planning that we take the existing ones from postgres-docs, and go from there.

Thanks! :)

@wolfgangwalther
Copy link
Member Author

I see work has started in https://github.com/PostgREST/postgrest-openapi. Splitting this off into a separate repo was my original suggestion. However, now that we have merged postgrest-docs back into the main repo, I wonder whether a separate repo is the best approach?

We could also develop that SQL-based extension in the core repo. Similar to how PostgreSQL has their contrib/ folder in the main repo.

This would give us a much better base for nix-based tooling, for instance. Also, the extension would be much more visible, including the issues we already have in the issue tracker.

What would be the pros and cons for you here, @laurenceisla @steve-chavez?

@steve-chavez
Copy link
Member

I don't have any objections on that. I think a contrib folder would be good too.

The only thing is that postgrest-openapi is not usable yet AFAICT (perhaps @laurenceisla can clarify), but maybe having it here with more eyes would make it progress quicker.

@laurenceisla
Copy link
Member

The only thing is that postgrest-openapi is not usable yet AFAICT

Yes, it's not completely ported. The most important missing features are functions support, permissions and FK relationships (the last one maybe won't be included in the initial release).

This would give us a much better base for nix-based tooling, for instance. Also, the extension would be much more visible, including the issues we already have in the issue tracker.

I agree, it would be better to keep it all together, since the latest changes in core PostgREST can be tested with the library more promptly.

I don't have experience in maintaining contrib libraries alongside core. It would mean that if a change in PostgREST core postgrest-openapi, then would the releases stop and would we need to fix the library before doing so?

If the above is not a problem, then I think we could migrate it as-is to get the benefits that you mentioned.

@wolfgangwalther
Copy link
Member Author

I don't have experience in maintaining contrib libraries alongside core. It would mean that if a change in PostgREST core postgrest-openapi, then would the releases stop and would we need to fix the library before doing so?

I think that's a good thing. I don't think we should "break the openapi contrib and then stop releasing". Instead we should treat it just like the docs - if you make a change, you need to make sure that the openapi part still works - in the same commit / PR. Don't merge when breaking stuff in the first place.

After all the idea to split this off into a SQL-based contrib piece is mostly about making it better, by making it easier to maintain and easier to contribute to. This is not about making it "less official".

If the above is not a problem, then I think we could migrate it as-is to get the benefits that you mentioned.

I'd not migrate it as is. I'd treat the current repo as some kind of PoC, where we can (and do) try things out for now, until we have a working replacement - with the ultimate goal of merging a working solution to the core repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

6 participants