Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL handlers for custom media types #2825

Merged
merged 3 commits into from
Oct 26, 2023

Conversation

steve-chavez
Copy link
Member

@steve-chavez steve-chavez commented Jun 15, 2023

Summary

Custom media types are now possible on RPC and tables by using scalar functions and aggregates respectively.

Docs in progress: PostgREST/postgrest-docs#689

Examples

  • Create custom text/csv handler that includes a BOM (byte order mark) plus Content-Disposition header:
create domain "text/csv" as text;

-- override CSV with BOM plus attachment
create or replace function test.bom_csv_trans (state text, next test.lines)
returns text as $$
  select state || next.id::text || ',' || next.name || ',' || next.geom::text || E'\n';
$$ language sql;

create or replace function test.bom_csv_final (data text)
returns "text/csv" as $$
  select set_config('response.headers', '[{"Content-Disposition": "attachment; filename=\"lines.csv\""}]', true);
  -- EFBBBF is the BOM in UTF8 https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
  select convert_from (decode (E'EFBBBF', 'hex'),'UTF8') || (E'id,name,geom\n' || data));
$$ language sql;

drop aggregate if exists test.bom_csv_agg(test.lines);
create aggregate test.bom_csv_agg (test.lines) (
  initcond = ''
, stype = text
, sfunc = bom_csv_trans
, finalfunc = bom_csv_final
);
curl 'localhost:3000/lines?id=in.(1,2)' -H "Accept: text/csv"  -i

HTTP/1.1 200 OK
Content-Disposition: attachment; filename="lines.csv"

id,name,geom
1,line-1,0102000020E610000002000000000000000000F03F000000000000F03F00000000000014400000000000001440
2,line-2,0102000020E6100000020000000000000000000040000000000000004000000000000018400000000000001840

# works for RPC that returns a table type too
curl 'localhost:3000/rpc/get_lines?id=eq.1' -H "Accept: text/csv"  -i

HTTP/1.1 200 OK
Content-Disposition: attachment; filename="lines.csv"

id,name,geom
1,line-1,0102000020E610000002000000000000000000F03F000000000000F03F00000000000014400000000000001440
  • Create a custom application/geo+json generic handler (anyelement can work thanks to ST_AsGeoJSON being generic):
create domain "application/geo+json" as jsonb;

create or replace function test.lines_geojson_trans (state jsonb, next anyelement)
returns jsonb as $$
  select state || extensions.ST_AsGeoJSON(next)::jsonb;
$$ language sql;

create or replace function test.lines_geojson_final (data jsonb)
returns "application/geo+json" as $$
  select jsonb_build_object(
    'type', 'FeatureCollection',
    'crs',  json_build_object(
        'type',      'name',
        'properties', json_build_object(
            'name', 'EPSG:4326'
         )
     ),
    'features', data
  );
$$ language sql;

drop aggregate if exists test.lines_geojson_agg(anyelement);
create aggregate test.lines_geojson_agg (anyelement) (
  initcond = '[]'
, stype = jsonb
, sfunc = lines_geojson_trans
, finalfunc = lines_geojson_final
);
curl 'localhost:3000/lines?id=in.(1,2)' -H "Accept: application/geo+json" 

{"crs": {"type": "name", "properties": {"name": "EPSG:4326"}}, "type": "FeatureCollection", "features": [{"type": "Feature", "geometry": {"type": "LineString", "coordinates": [[1, 1], [5, 
5]]}, "properties": {"id": 1, "name": "line-1"}}, {"type": "Feature", "geometry": {"type": "LineString", "coordinates": [[2, 2], [6, 6]]}, "properties": {"id": 2, "name": "line-2"}}]}
  • Create a custom html handler response
create domain "text/html" as text;

create or replace function welcome() returns "text/html" as $_$
select $$
<html>
  <head>
    <title>Custom handler for HTML</title>
  </head>
  <body>
    <h1>Welcome!</h1>
  </body>
</html>
$$;
$_$ language sql;
curl 'localhost:3000/rpc/welcome' -H "Accept: text/html"

<html>
  <head>
    <title>Custom handler for HTML</title>
  </head>
  <body>
    <h1>Welcome!</h1>
  </body>
</html>

Breaks

Closes

Pending

@steve-chavez
Copy link
Member Author

steve-chavez commented Jun 21, 2023

There's a problem in solving #2699 with the current aggregate interface.

So having a replacement for our geojson is simple:

create domain "application/geo+json" as jsonb;

create or replace function test.geoson_trans (state jsonb, next anyelement)
returns jsonb as $$
  select state || extensions.ST_AsGeoJSON(next)::jsonb;
$$ language sql;

create or replace function test.geojson_final (data jsonb)
returns "application/geo+json" as $$
  select jsonb_build_object('type', 'FeatureCollection', 'features', data);
$$ language sql;

create or replace aggregate test.geojson_agg (anyelement) (
  initcond = '[]'
, stype = jsonb
, sfunc = geoson_trans
, finalfunc = geojson_final
);

curl "localhost:3000/shops" -H "Accept: application/geo+json"

{"type": "FeatureCollection", "features": [{"type": "Feature", "geometry": {"type": "Point", "coordinates": [-71.10044, 42.373695]}, "properties": {"id": 1, "address": "1369 Cambridge St"}}, {
"type": "Feature", "geometry": {"type": "Point", "coordinates": [-71.10543, 42.366432]}, "properties": {"id": 2, "address": "757 Massachusetts Ave"}}, {"type": "Feature", "geometry": {"type":
"Point", "coordinates": [-71.081924, 42.36437]}, "properties": {"id": 3, "address": "605 W Kendall St"}}]}

If we want to add a crs to the final geojson, we could do it like:

create or replace function test.geojson_final (data jsonb)
returns "application/geo+json" as $$
  select jsonb_build_object(
    'type', 'FeatureCollection', 
    'crs',  json_build_object(
        'type',      'name', 
        'properties', json_build_object(
            'name', 'EPSG:4326'  
         )
     ),
    'features', data
  );
$$ language sql;

That's not a generic way though, ideally we use Find_SRID as recommend on #2699.

For this, we could use FINALFUNC_EXTRA:

create or replace function test.geojson_final (data jsonb, typ anyelement)
returns "application/geo+json" as $$
  -- here, assume we enhance pg_typeof with what's required for find_SRID to work, i.e. getting the schema, name and column. Which is possible.
  select jsonb_build_object('type', 'FeatureCollection', 'type', pg_typeof(typ), 'features', data);
$$ language sql;

create or replace aggregate test.geojson_agg (anyelement) (
  initcond = '[]'
, stype = jsonb
, sfunc = geoson_trans
, finalfunc = geojson_final
, finalfunc_extra = true
);

curl "localhost:3000/shops" -H "Accept: application/geo+json"
{"type": "FeatureCollection", "type": "record", "features": [{"type": "Feature", "geometry": {"type": "Point", "coordinates": [-71.10044, 42.373695]}, "properties": {"id": 1, "address": "1369
 Cambridge St"}}, {"type": "Feature", "geometry": {"type": "Point", "coordinates": [-71.10543, 42.366432]}, "properties": {"id": 2, "address": "757 Massachusetts Ave"}}, {"type": "Feature", "g
eometry": {"type": "Point", "coordinates": [-71.081924, 42.36437]}, "properties": {"id": 3, "address": "605 W Kendall St"}}]}

Notice there's a type: record. This is bc of how we call the aggregation:

customAggF :: QualifiedIdentifier -> SqlFragment
customAggF qi = fromQi qi <> "(_postgrest_t)"

If we casted _postgrest_t::test.shops, then the correct type: shops would show.

The casting won't work if resource embedding is done.

curl "localhost:3000/shops?select=*,shop_bles(*)" -H "Accept: application/geo+json"
{"code":"42846","details":"Input has too many columns.","hint":null,"message":"cannot cast type record to shops"}

So maybe we can cast it only when embedding is not used? That would make the aggregate function unreliable though.

Or maybe we should disallow resource embedding on non application/json media types?


I believe this is something it should be solved in PostGIS itself. Maybe it's not worth it. The user can define different types of geojson aggs manually.

@steve-chavez
Copy link
Member Author

steve-chavez commented Jun 21, 2023

I was under the wrong impression that we could do everything with aggregates. For instance, creating an aggregate that also strips nulls #1601. For this we would only need to apply a function to the final function of the aggregate.

json_agg uses json_agg_finalfn and json_agg_transfn. So:

create or replace function test.appjson_agg_final (data json)
returns "application/json" as $$
  select json_strip_nulls(pg_catalog.json_agg_finalfn($1));
$$ language sql;

-- ERROR:  function pg_catalog.json_agg_finalfn(json) does not exist
-- LINE 3:   select json_strip_nulls(pg_catalog.json_agg_finalfn($1));
-- HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

create or replace aggregate test.appjson_agg (anyelement) (
  initcond = '[]'
, stype = json
, sfunc = pg_catalog.json_agg_transfn
, finalfunc = appjson_agg_final
);

The above won't work bc json_agg_finalfn argument is internal:

postgres=# \df+ json_agg_finalfn
List of functions
-[ RECORD 1 ]-------+------------------------------
Schema              | pg_catalog
Name                | json_agg_finalfn
Result data type    | json
Argument data types | internal

Wolfgang idea with CASTs would solve this case #1582 (comment). So looks we'll need CASTs anyway.

CREATE CAST ("application/json" AS json) WITH FUNCTION  json_strip_nulls;

In general, CASTs allows us to have function composition.

I'm thinking we should merge #2523 before going further on this feature.

@steve-chavez
Copy link
Member Author

I was under the wrong impression that we could do everything with aggregates. For instance, creating an aggregate that also strips nulls #1601

Alternatively, I think this could be solved later with the postgrest-contrib extension. We might be able to define custom aggregates there. Even some that could integrate with our tx settings (request.method, etc). With this we could avoid the need for CASTs too (though I won't preclude the possibility of implementing that in the future), which were deemed complicated by a user recently (ref) .

@steve-chavez
Copy link
Member Author

steve-chavez commented Oct 10, 2023

I've just found out that thanks the new GRANT SET ON PARAMETER on pg 15, we can implement the original proposal with the custom GUC #1582. It works like this:

create role manager login password 'secret';
grant all on schema public to manager;

set pgrst.bar to ''; -- this must be done otherwise the next statement fails with permission denied. I believe what happens is that the GUC is now registered as a placeholder.
grant set on parameter pgrst.bar to manager;
GRANT

begin;
set local role to manager;

create or replace function a_bar() returns text as $$
  select 'hello';
$$ language sql
set pgrst.bar='foo';
CREATE FUNCTION

This even works on RDS.

The original proposal had the advantage of being able to define a default handler for */* in a more simple way.

The only thing that worries me is that the above could be considered a bug somehow and then it could be patched but it seems unlikely since it kinda makes sense (check the comment above).

I'll proceed with the pgrst.handler implementation.

cc @wolfgangwalther

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Oct 10, 2023

I've just found out that thanks the new GRANT SET ON PARAMETER on pg 15, we can implement the original proposal with the custom GUC #1582. It works like this:

[...]

This even works on RDS.

While this solves your concern about the availability of this feature on cloud providers, it does not solve the inlining problem mentioned in #1582 (comment).

The idea of using SET pgrst.xxx on the function definition should be generally abandoned. The approach via CASTs and "representations" is a very sound concept, so I don't see a reason to not do that.

@steve-chavez
Copy link
Member Author

steve-chavez commented Oct 24, 2023

As a consequence of #2894, application/vnd.pgrst.object and application/vnd.pgrst.object are no longer overridable per our original design, since they take parameters (nulls=stripped) now.

Maybe it's not so bad to make vendored media types non-overridable. After all users cannot override vnd.pgrst.plan too, this one has a lot of special parameters.'


We could choose to drop nulls=stripped but as mentioned above the same effect can't be achieved right now with just one single handler function. We would need the CASTs thing.

Comment on lines +188 to +200
it "will match image/png according to q values" $ do
r1 <- request methodGet "/rpc/ret_image" (acceptHdrs "image/png, */*") ""
liftIO $ do
simpleBody r1 `shouldBe` readFixtureFile "A.png"
simpleHeaders r1 `shouldContain` [("Content-Type", "image/png")]

r2 <- request methodGet "/rpc/ret_image" (acceptHdrs "text/html,application/xhtml+xml,application/xml;q=0.9,image/png,*/*;q=0.8") ""
liftIO $ do
simpleBody r2 `shouldBe` readFixtureFile "A.png"
simpleHeaders r2 `shouldContain` [("Content-Type", "image/png")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clears the case in #1462, replace image/png for image/webp. This is more about correct content negotiation.

Comment on lines +215 to +224
-- https://github.com/PostgREST/postgrest/issues/1371#issuecomment-519248984
it "will match a custom text/csv with BOM" $ do
r <- request methodGet "/lines" (acceptHdrs "text/csv") ""
liftIO $ do
simpleBody r `shouldBe` readFixtureFile "lines.csv"
simpleHeaders r `shouldContain` [("Content-Type", "text/csv; charset=utf-8")]
simpleHeaders r `shouldContain` [("Content-Disposition", "attachment; filename=\"lines.csv\"")]
Copy link
Member Author

@steve-chavez steve-chavez Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clears #1371 (comment), includes a BOM CSV plus attachment header. Had to read from a file otherwise couldn't match the BOM on Haskell

BREAKING CHANGE

Can be done later with custom media types
BREAKING CHANGE

Can be done later with custom media types
@steve-chavez steve-chavez changed the title Custom media types for Accept SQL handlers for custom media types Oct 25, 2023
@steve-chavez steve-chavez changed the title SQL handlers for custom media types SQL handlers for custom media types on Accept Oct 25, 2023
@steve-chavez steve-chavez changed the title SQL handlers for custom media types on Accept SQL handlers for custom media types Oct 25, 2023
@steve-chavez steve-chavez marked this pull request as ready for review October 25, 2023 21:16
* test text/html and drop HtmlRawOutputSpec.hs
* all tests passing, removed all pendingWith
* make functions compatible with pg <= 12
* move custom media types tests to own spec
* anyelement aggregate
* apply aggregates without a final function
* overriding works
* overriding anyelement with particular agg
* cannot override vendored media types
* plan spec works with custom aggregate
* renamed media types to make clear which ones are overridable
* correct content negotiation with same weight
* text/tab-separated-values media type
* text/csv with BOM plus content-disposition header
@steve-chavez
Copy link
Member Author

steve-chavez commented Oct 25, 2023

Ok this should be ready to go. It implements the first stage idea discussed on #1582 (comment). This means that the CASTs idea can be built on top.

The code supports the idea of "handlers", not only aggregates, meaning a regular function can be the final one applied.

handlerF :: Maybe Routine -> QualifiedIdentifier -> MediaHandler -> SQL.Snippet
handlerF rout target = \case
  BuiltinAggArrayJsonStrip   -> asJsonF rout True
  BuiltinAggSingleJson strip -> asJsonSingleF rout strip
  -- these ones are overridable
  BuiltinOvAggJson           -> asJsonF rout False
  BuiltinOvAggGeoJson        -> asGeoJsonF
  BuiltinOvAggCsv            -> asCsvF
  CustomFunc funcQi          -> customFuncF rout funcQi target
  NoAgg                      -> "''::text"

customFuncF :: Maybe Routine -> QualifiedIdentifier -> QualifiedIdentifier -> SQL.Snippet
customFuncF rout funcQi target
  | (funcReturnsScalar <$> rout) == Just True = fromQi funcQi <> "(_postgrest_t.pgrst_scalar)"
  | otherwise                                 = fromQi funcQi <> "(_postgrest_t::" <> fromQi target <> ")"

There might be some rough edges, but it already solves a lot of issues. Merging it later will cost a lot (it already took a lot of rebases), so I'd rather merge it at this stage and improve it later.

@steve-chavez
Copy link
Member Author

Merging. Will continue with #2826 to solve the remaining issues.

@steve-chavez steve-chavez merged commit 82b3834 into PostgREST:main Oct 26, 2023
30 of 31 checks passed
@fjf2002 fjf2002 mentioned this pull request Oct 31, 2023
@steve-chavez
Copy link
Member Author

With raw-media-types gone, https://postgrest.org/en/stable/how-tos/providing-images-for-img.html is not possible now. We'd need a way to dynamically set the accepted media type inside the function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment