Skip to content

feat: [WIP] parameters tracked snippets, and application/json+sql response #4012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

fauh45
Copy link

@fauh45 fauh45 commented Apr 14, 2025

This should closes #3580

This PR is intended to add Raw SQL output when the user queries for certain Accept media type on their request. The changes are pretty big, mostly because the as far as my knowledge goes, there's isn't really any way to get the parameters value from Hasql snippets (somewhat related to #3934). In my mind there's only two option to get the parameters.

  1. Modify Hasql, and I don't think I'm ready to do that
  2. Or, add a wrapper to Hasql.Snippet where it could save the encoded value as well, I've done this through TrackedSnippets.

All TrackedSnippets do is handle the Snippets as usual, but also save the parameters as Maybe ByteString (from what I see all the parameters are ByteString or Maybe ByteString in the repository). Now because there's a lot of function depending on Hasql.Snippet there's a lot of changes just to change the parameter or return type to TrackedSnippet, which also means snippets of SQL especially in query building needs to be converted to TrackedSnippetto work. Thus a lot of the changes are basically just converting to TrackSnippet and the other way around.

There's also an interface to get both of the query and also the parameters with Accept: application/json+sql. This will return a response with { "sql": "<raw sql generated>", "params": ["params1", "params2"] }, to be noted that the params here is string representation of the params. So in case of JSON placeholder that uses jsonbLazyBytes I'm not sure how it will handle it (also I seems to be lost on how the JSON params could be a jsonb?).

TODO

Below is list that I'm sure there's some changes needed to make this work.

  • Is this the best way to achieve this goals?
  • Does the CHANGELOG need to be more descriptive?
  • Which part of the Docs you guys think it should be added?
  • Better testing for the application/json+sql
  • Make serialized parameters can work with the sql returned. Especially with JSON placeholder as it seems it is encoded as json or jsonb direcly by Hasql. While the serialized parameters is currently still serialized to string. Code snippet that handles it on SqlFragment.hs (jsonPlaceHolder = TrackedSnippet (SQL.encoderAndParam (HE.nullable $ if includeDefaults then HE.jsonbLazyBytes else HE.jsonLazyBytes) body) [LBS.toStrict <$> body])

@fauh45
Copy link
Author

fauh45 commented Apr 14, 2025

@wolfgangwalther this should be my current solution for the issue, the scope is a little bit more than #3580 as now it also returns the placeholder values

@steve-chavez
Copy link
Member

@fauh45 How about clearing all CI failures? Then we can further review.

@fauh45
Copy link
Author

fauh45 commented Apr 16, 2025

@steve-chavez will do! 🫡

@fauh45
Copy link
Author

fauh45 commented Apr 16, 2025

@steve-chavez fixed formatting, also added MTApplicationJSONSQL to dbMediaHandler so that the test case reflects the current media handler list. Tested locally with postgrest-check and it seems to passed.

@fauh45
Copy link
Author

fauh45 commented Apr 16, 2025

Oh nvm, forgot to push minor fix to remove unused functions. Sorry!

@@ -65,6 +66,7 @@ toContentType ct = (hContentType, toMime ct <> charset)
-- | Convert from MediaType to a ByteString representing the mime type
toMime :: MediaType -> ByteString
toMime MTApplicationJSON = "application/json"
toMime MTApplicationJSONSQL = "application/json+sql"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

application/json suffixes have to be registered on IANA to be valid (ref).

Unregistered suffixes should not be used

So we would need a vendored media type. I suggest application/vnd.pgrst.sql+json.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since +json is a registered suffix, would application/sql+json work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

application/sql+json

IIUC, we would have to register the +json to the SQL media type if we want to use it like that.

That being said, I like that media type too and it doesn't seem "potentially damaging", unlike adding a suffix to application/json.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the idea of vendoring the media type, though prefixing sql makes it more clear though IMO. Just to be safe, I'll probably go with the vendored media type for now

Comment on lines +18 to +22
it "responds raw sql to a GET request" $ do
r <- request methodGet "/items?id=eq.1" mtApplicationJSONSQLHdrs ""
liftIO $ do
simpleStatus r `shouldBe` status200
simpleHeaders r `shouldContain` [("Content-Type", "application/json; charset=utf-8")]
Copy link
Member

@steve-chavez steve-chavez Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need the response bodies in these tests.

For now I've checked it locally and it looks like:

curl "localhost:3000/projects?id=eq.1&name=eq.Joe" -H "Accept: application/json+sql"

{"params":["1","Joe"],"sql":"WITH pgrst_source AS ( SELECT \"test\".\"projects\".* FROM \"test\".\"projects\" WHERE  \"test\".\"projects\".\"id\" = $1 AND  \"test\".\"projects\".\"name\" =
 $2  )  SELECT null::bigint AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, ''::text AS body, nullif(current_setting('response.headers', true), '') AS response_headers, 
nullif(current_setting('response.status', true), '') AS response_status, '' AS response_inserted FROM ( SELECT * FROM pgrst_source ) _postgrest_t"}

It's an interesting functionality for sure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will got it added!

@@ -1088,6 +1088,7 @@ negotiateContent conf ApiRequest{iAction=act, iPreferences=Preferences{preferRep
-- TODO: despite no aggregate, these are responding with a Content-Type, which is not correct.
(ActDb (ActRelationRead _ True), Just (_, mt)) -> Right (NoAgg, mt)
(ActDb (ActRoutine _ (InvRead True)), Just (_, mt)) -> Right (NoAgg, mt)
(_, Just (_, MTApplicationJSONSQL)) -> Right (NoAgg, MTApplicationJSONSQL)
Copy link
Member

@steve-chavez steve-chavez Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This media type should be opt in (it reveals internal details that should only be available for debugging purposes), I suggest reusing the https://docs.postgrest.org/en/v12/references/configuration.html#db-plan-enabled (configDbPlanEnabled) for this. See how it's done on:

m@(MTVndPlan (MTVndSingularJSON strip) _ _) -> mtPlanToNothing $ Just (BuiltinAggSingleJson strip, m)
m@(MTVndPlan MTVndArrayJSONStrip _ _) -> mtPlanToNothing $ Just (BuiltinAggArrayJsonStrip, m)
-- TODO the plan should have its own MediaHandler instead of relying on MediaType
m@(MTVndPlan mType _ _) -> mtPlanToNothing $ (,) <$> (fst <$> lookupHandler mType) <*> pure m
-- all the other media types can be overridden
x -> lookupHandler x
mtPlanToNothing x = if configDbPlanEnabled conf then x else Nothing -- don't find anything if the plan media type is not allowed

It's a bit messy now unfortunately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so in a way on the Plan there should be a check to allow it if the configDbPlanEnabled? Honestly I totally understand that, enabling this does open up lots of internal details

Comment on lines +24 to +29
it "responds raw sql to a POST request for insert" $ do
r <- request methodPost "/projects"
mtApplicationJSONSQLHdrs [json|{"id":100, "name": "Project 100"}|]
liftIO $ do
simpleStatus r `shouldBe` status200
simpleHeaders r `shouldContain` [("Content-Type", "application/json; charset=utf-8")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the parameters are not complete when having both POST body and query string parameters:

http POST "localhost:3000/projects?select=*&id=eq.1&name=eq.Joe" "Accept: application/json+sql" "Prefer:return=representation" <<EOF
{"id":2}
EOF
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Tue, 22 Apr 2025 05:25:48 GMT
Server: postgrest/12.3 (pre-release)
Transfer-Encoding: chunked

{
    "params": [
        "{\"id\":2}\n"
    ],
    "sql": "WITH pgrst_source AS (INSERT INTO \"test\".\"projects\"(\"id\") SELECT \"pgrst_body\".\"id\" FROM (SELECT $1 AS json_data) pgrst_payload, LATERAL (SELECT \"id\" FROM json_to_re
cord(pgrst_payload.json_data) AS _(\"id\" integer) ) pgrst_body  RETURNING \"test\".\"projects\".*) SELECT '' AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, array[]::te
xt[] AS header, ''::text AS body, nullif(current_setting('response.headers', true), '') AS response_headers, nullif(current_setting('response.status', true), '') AS response_status, '' AS 
response_inserted FROM (SELECT * FROM pgrst_source) _postgrest_t"
}

id=1 and name=Joe are missing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMMIW here, isn't POST (insert operation) doesn't really take query string to make the query at all (INSERT cannot have WHERE)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as long as there are no placeholders in the query for them, they shouldn't show up. What about PUT, where this can happen - does it work?

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

Copy link
Member

@steve-chavez steve-chavez Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will open an issue about the above comments. Edit: done #4041. Wrt to this PR, there is a problem when considering that case.

The generated query does contain a $2 placeholder referring to the filter:

$ PGRST_LOG_LEVEL="info" PGRST_LOG_QUERY="main-query" postgrest-with-postgresql-16 -f test/spec/fixtures/load.sql postgrest-run
# request
http POST 'localhost:3000/projects?select=*,clients!inner(*)&clients.id=eq.2' "Prefer:return=representation"  <<EOF                              
[{"id":14, "name": "asdf", "client_id": 1}, {"id":15, "name": "qer", "client_id": 2}]
EOF

# logs
23/Apr/2025:13:35:17 -0500: WITH pgrst_source AS (INSERT INTO "test"."projects"("client_id", "id", "name") SELECT "pgrst_body"."client_id", "pgrst_body"."id", "pgrst_body"."name" FROM (SE
LECT $1 AS json_data) pgrst_payload, LATERAL (SELECT "client_id", "id", "name" FROM json_to_recordset(pgrst_payload.json_data) AS _("client_id" integer, "id" integer, "name" text) ) pgrst
_body  RETURNING "test"."projects".*) SELECT '' AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, array[]::text[] AS header, coalesce(json_agg(_postgrest_t), '[]') AS bod
y, nullif(current_setting('response.headers', true), '') AS response_headers, nullif(current_setting('response.status', true), '') AS response_status, '' AS response_inserted FROM (SELECT
 "projects".*, row_to_json("projects_clients_1".*)::jsonb AS "clients" FROM "pgrst_source" AS "projects" INNER JOIN LATERAL ( SELECT "clients_1".* FROM "test"."clients" AS "clients_1" WHE
RE  "clients_1"."id" = $2 AND "clients_1"."id" = "projects"."client_id"  ) AS "projects_clients_1" ON TRUE ) _postgrest_t
127.0.0.1 - postgrest_test_anonymous [23/Apr/2025:13:35:17 -0500] "POST /projects?select=*,clients!inner(*)&clients.id=eq.2 HTTP/1.1" 201 - "" "HTTPie/3.2.2"

But when using the new header, the $2 placeholder is not present:

## request
http POST 'localhost:3000/projects?select=*,clients!inner(*)&clients.id=eq.2' "Prefer:return=representation" "Accept: application/json+sql" <<EOF
[{"id":14, "name": "asdf", "client_id": 1}, {"id":15, "name": "qer", "client_id": 2}]
EOF

#logs
23/Apr/2025:13:36:17 -0500: WITH pgrst_source AS (INSERT INTO "test"."projects"("client_id", "id", "name") SELECT "pgrst_body"."client_id", "pgrst_body"."id", "pgrst_body"."name" FROM (SE
LECT $1 AS json_data) pgrst_payload, LATERAL (SELECT "client_id", "id", "name" FROM json_to_recordset(pgrst_payload.json_data) AS _("client_id" integer, "id" integer, "name" text) ) pgrst
_body  RETURNING "test"."projects".*) SELECT '' AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, array[]::text[] AS header, ''::text AS body, nullif(current_setting('res
ponse.headers', true), '') AS response_headers, nullif(current_setting('response.status', true), '') AS response_status, '' AS response_inserted FROM (SELECT * FROM pgrst_source) _postgre
st_t
127.0.0.1 - postgrest_test_anonymous [23/Apr/2025:13:36:17 -0500] "POST /projects?select=*,clients!inner(*)&clients.id=eq.2 HTTP/1.1" 200 - "" "HTTPie/3.2.2"

# same thing on the client

Maybe the same problem will be present on PATCH or PUT.

Copy link
Author

@fauh45 fauh45 Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay this is something I don't really understand that deeply about the codebase just yet haha.

That's kinda weird... Is it in any way related to the Plan somehow or is more part of the Query builder part?

Because technically it should still parse the query string, then build the read plan, lastly build the query based on the plan right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's kinda weird... Is it in any way related to the Plan somehow or is more part of the Query builder part?

The MutateReadPlan has a ReadPlanTree besides the MutatePlan:

| MutateReadPlan {
mrReadPlan :: ReadPlanTree
, mrMutatePlan :: MutatePlan
, pTxMode :: SQL.Mode
, mrHandler :: MediaHandler
, mrMedia :: MediaType
, mrMutation :: Mutation
, crudQi :: QualifiedIdentifier
}

Because technically it should still parse the query string, then build the read plan, lastly build the query based on the plan right?

Yes, this still holds. The above is still a single query.

@taimoorzaeem
Copy link
Collaborator

From #4131(comment):

that issue went stale, maybe the PR tried to do too much at once

I agree. This would be hard to maintain if merged as is. Instead, for better maintainability, I think we need to plan this in some refactors first to have some space for this feature and then implement this. It would then be easier to review and also more maintainable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Returning generated SQL
4 participants