Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: allow user-defined function to handle application-specific Accept headers #1548

Closed
W1M0R opened this issue Jun 29, 2020 · 10 comments · Fixed by #2825
Closed

Comments

@W1M0R
Copy link

W1M0R commented Jun 29, 2020

Consider a request /accounts that returns a list of accounts and their details.

Request header Accept: application/json will return the information as JSON. However, some clients might want the data returned in some other application-specific representation, e.g. protobuf base64, flatbuffer base64, msgpack base64, etc.

Currently, one would have to implement a generic stored procedure (say /rpc/msgpack) that will parse the body to determine what content is requested (since we lose the /accounts query url info), then to call the stored procedure that implements /accounts, convert the result to json and then again to msgpack. However, with this approach we lose the benefit of having PostgREST handling filtering and resource selection with query parameters.

Ideally, the client should be able to query /accounts but use an application-defined Accept header, let's say Accept: application/msgpack. PostgREST can be started with a configuration option that indicates the name of a user-defined Postgres function (e.g. myschema.myhandler(accept text, result jsonb)) that will handle any unknown Accept headers. When PostgREST receives the /accounts request, it should determine that the provided Accept header is not part of the default PostgREST supported headers, and having confirmed that myschema.myhandler) exists, it will fulfill the /accounts query, but before sending the result to the user, it will forward the results to myschema.myhandler and return the result of that function instead.

The myschema.myhandler function will use its accept parameter to determine that the user requested application/msgpack data, and then convert the result parameter to msgpack and return that instead.

Concerning the return value of myschema.myhandler, it should probably also be a user-defined custom type.

@steve-chavez
Copy link
Member

steve-chavez commented Jun 30, 2020

@W1M0R I think the rpc part could be handled by #1546 (comment). With that you could Accept: application/x-msgpack from the client given that your function produces a Content-Type: application/x-msgpack. Inspecting the body wouldn't be needed.

That PR also includes some thought for regular GET /table Accept logic via pre-request, perhaps you can chime in there.

(The proposal of handling extra media types with a convention is interesting though, will have to think more about it).

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Jul 2, 2020

Not sure where to post this, as this topic seems to be spread across multiple issues and PR now. Posting it in here, because I think this is the best approach so far.

I would like to suggest a couple of additions to that as well:

Content Negotiation (RPC)

I don't really like the whole "raw-media-endpoints config option" approach - I think it would be a lot better, if we could somehow pass the information about which accept headers a function might respond to in the function definition itself. The best I could come up with in this regard would be to do it like this:

CREATE FUNCTION my_rpc_with_custom_accept_header (...) RETURNS bytea AS $$
  [...]
$$ LANGUAGE SQL SET app.accept = 'application/msgpack';

The SET app.accept = part would actually set a local variable available during function execution - but wouldn't be used for that purpose at all. However it would be available in pg_proc.proconfig as well and could be used when creating the schema cache. This way we could handle the content negotiation before making any query. We could implement some cleverness as well to allow multiple values (SET app.accept = 'application/msgpack, application/whatever') and wildcards (SET app.accept = 'image/*').

Handling Tables

As to handle the table case described in this issue, it would be cool if we didn't need to use just one generic handler for all possible tables, but have the ability to have multiple handlers, each specific to a table as well.

One way to do it would be similar to the "accept" part above: something like SET app.handler_for = 'accounts' on the function definition. That would require different function names for each table - overloading this function would not work.

I think overloading that handler would be cool - we could maybe achieve that by passing the table type as an argument to the function. So something like this:

CREATE FUNCTION msgpack_handler (
  handler_for accounts,
  accept text, -- not sure if we need this at all, with the set app.accept stuff.
               -- could be read out via request options as well, I think
  result jsonb
) RETURNS bytea -- return types should probably only be bytea and text,
                -- as those don't need further processing when sending to the client
AS $$
  [...]
$$ LANGUAGE SQL SET app.accept='application/msgpack';

CREATE FUNCTION msgpack_handler (handler_for messages, accept text, result jsonb) RETURNS bytea
AS $$
  [...]
$$ LANGUAGE SQL SET app.accept='application/msgpack';

Note the different type for the handler_for argument. Now the code to call that handler would just need to be something like msgpack_handler(NULL::{tablename}, ...) and that tablename would of course be available from the requested uri.

Now to determine which handler to call, the schema cache could just look up the argument type and app.accept configuration to find a match.

@steve-chavez
Copy link
Member

That's really clever @wolfgangwalther. I think you've nailed the RPC case, querying pg_proc.proconfig should be fast.
The only change I propose is to rename app.accept to response.mime(or maybe response.media) to be more consistent with our other settings.

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Jul 2, 2020

I agree that app.accept is bad. I tried just accept first while creating a test function, but postgres won't accept (haha) that. Didn't try any further after app.accept worked.

But, I think neither response.mime nor response.media really nails it. I think we have two general choices here:

  • approach it from the "content negotiation" standpoint, i.e. given a request with an accept header - which functions can satisfy this request. In this case we should prefix request., I guess. Since we are trying to match the accept header - request.accept would make a lot of sense to me. The actual accept header is in request.headers.accept, right? So that should not conflict.

  • approach it from the "what does this function return" standpoint. In this case we are making a statement about the response, so we should prefix it with that. The corresponding header would be Content-Type, right? So response.content_type would make sense. However that might not actually cut it all the way, because setting wildcards or multiple types, seems odd here.

I think request.accept would be the best choice. Also conveys the message about what it does best, when reading it.

@wolfgangwalther
Copy link
Member

As for the tables:

I think it's quite important to be able to somehow have different handler functions for different tables. Suppose one table should accept requests for image/* or such, because it has binary data / files saved. In this case accepting this header really only makes sense for this table and not for most others. If we only had a generic handler function, then at least a part of the "content negotiation" process has to be done inside this function by the postgrest-user and the proper error has to be returned manually when not on the right endpoint.
It would be a lot better, I think, if the whole content negotiation could be handled by postgrest.

Now as to my suggestion, I agree that the "NULL argument handler_for with table type" seems quite an odd use case for the argument. This would just add the benefit of overloading functions, which personally I would find nice - but might not work the effort required.

The easier solution would of course be to just use SET request.handler_for = 'accounts'. Could also add SET request.handler_for = '*' as a fallback for a generic function, or add multiple tables here like SET request.handler_for = 'accounts,messages'.

Actually... while the wildcard could be achieved with the argument solution as well somehow, the multiple tables solution would probably not work. I tend to favour the SET request.handler_for solution now. This should be a lot easier to implement and I guess goes along nicely with the SET request.accept - overloading-ability is not worth it to sacrifice this.

@wolfgangwalther
Copy link
Member

Applying the same thoughts as in my 2nd to last comment about response. or request. it probably makes more sense to call it SET response.handler_for = 'accounts' in the "content negotiation for tables case".

This would also keep request.handler_for free for maybe doing the same conversion the other way around for the request body. Imagine the client that wants to receive data in application/msgpack format. They might want to send POST or PATCH bodies in that format as well with a corresponding Content-Type header?

In this case we could use a function like this to parse the body before executing the query:

CREATE FUNCTION msgpack_request_handler (body BYTEA /* or TEXT */) RETURNS TABLE (...) AS $$
  [...]
$$ LANGUAGE SQL SET request.handler = 'accounts', request.content_type='application/msgpack';

@wolfgangwalther
Copy link
Member

Just noticed that my last point about handling the request body was actually asked for in #506. Also this could allow the user to implement a custom solution for the likes of multipart/form-data, although the COPY solution proposed in #922 is probably better in terms of performance.

@steve-chavez
Copy link
Member

I agree that app.accept is bad. I tried just accept first while creating a test function, but postgres won't accept (haha) that. Didn't try any further after app.accept worked.

@wolfgangwalther Yes, the dot(.) is for custom settings :D, check https://www.postgresql.org/docs/current/runtime-config-custom.html.

Regarding request.accept or response.accept, I'm thinking now we should go with pgrst.accept(it says modify the default PostgREST Accept for this func). The problem with the request/response prefixes is that they're used for transaction scoped settings and what we want is actually global. It could be a bit confusing for users to overload those prefixes.

Many great ideas on the table case here. I would have to experiment a bit to give more meaningful feedback(maybe our current json output could be a type of handler?).

However design-wise I think that the RPC case is ready(if we agree on the setting name of course) and we would already satisfy many use cases with it. We could add the custom RPC accept as a first step and also take the chance to deprecate the raw-media-types config.

A new release is close and it'd be awesome to include this feature :)

@wolfgangwalther
Copy link
Member

Absolutely right about request.accept / response.accept / pgrst.accept. I thought about the same lines recently, when writing about #1559 (comment).

@W1M0R
Copy link
Author

W1M0R commented Nov 16, 2023

Thanks for all your hard work on this feature @steve-chavez and @wolfgangwalther! Your teamwork, consideration, direction and dedication is inspiring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants