Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add sqlc.embed to allow model re-use #1615

Merged
merged 5 commits into from Apr 7, 2023

Conversation

nickjackson
Copy link
Contributor

@nickjackson nickjackson commented May 11, 2022

Intro

This PR allows sqlc.embed to be used in a query. This is based on the discussion at #984.

You can now write a query like

-- name: WithSubquery :many
SELECT sqlc.embed(users), (SELECT count(*) FROM users) AS total_count FROM users;

and sqlc will produce a struct like

type WithSubqueryRow struct {
	User       User
	TotalCount int64
}

Why?

In our code base we have many List queries which require additional pagination fields, and I'm unhappy about how much extra mapping we need to do because of it. I could also see it being used to save mapping on some sql joins.

How it Works

There is now a sql/rewrite.Embeds() function which takes the ast at compile time and looks for the sqlc.embed(param) function, swapping it out for an ast.ColumnRef of param.*. This allows the compiler to re-use all of the existing ast.ColumnRef logic to build and expand the columns.

Columns now have a flag on the proto to indicate they are embedded, allowing the codgen to be adjusted accordingly. I can go into more detail on this side if you wish.

Finally

It's currently in Draft because I need to implement the Go stdlib templating and add tests for it. Do I need to do anything to disable this feature for non-go codegen?

On #984 there was discussion about maybe disabling this for Left joins, I'd be happy to do this but could do with some pointers about how best to do that.

I'm more than happy to receive feedback, especially around the topic of naming things and where to put the various functions. I'm also happy to add more testing if necessary.

Thanks!

@Emyrk
Copy link
Contributor

Emyrk commented May 12, 2022

Just voicing support for this ❤️

@kylecarbs
Copy link
Contributor

@kyleconroy this would help us out dramatically! Does this approach seem reasonable?

@bcsmith2846
Copy link

bcsmith2846 commented Jun 30, 2022

@kyleconroy this PR could add a lot of value to the way that we use sqlc in our project. We're thinking of forking the project to merge this PR into the fork, but there is some risk that we take on by forking when it comes to merging future changes into our fork if there aren't plans to merge this PR into the project at some point.

So, I'm wondering if there are plans to merge this into a future release? If there aren't, is there any specific issue with the PR that we can address so that it is more likely to be included in a future release?

@nickjackson
Copy link
Contributor Author

@kyleconroy is there an issue with this PR that I can address? The feature looks popular, and I'm willing to make changes.

@Emyrk
Copy link
Contributor

Emyrk commented Sep 14, 2022

I put this on my Christmas list, hoping Santa pulls through!

@sreya
Copy link

sreya commented Sep 30, 2022

@kyleconroy sqlc is great but this is something I've encountered repeatedly as well. Is there a design or philosophical reason that you do not want to support this feature?

@skabbes
Copy link
Contributor

skabbes commented Nov 27, 2022

@nickjackson I know I'm late to the party here, so apologies in advance.

First off, this is a very cool piece of functionality, and I am in particular interested if this could be implemented in a way that is more harmonious with SQL itself. (and ideally with less special handling) by using a generic tuple type from the database.

I am referring to specifically "row constructors": https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-ROW-CONSTRUCTORS

In your example:

SELECT sqlc.embed(users), (SELECT count(*) FROM users) AS total_count FROM users;

-- could be written in "raw" SQL as
SELECT row(users.*) as user, (SELECT count(*) FROM users) AS total_count FROM users;

-- and to stay consistent with sqlc's "star-expansion" it would expand to a static query like
SELECT row(users.id, users.name) as user, (SELECT count(*) FROM users) AS total_count FROM users;

I think this stays a lot closer to the raw SQL, and could possibly enable some crazier queries like:

-- name: UserWithPosts :many
select
  row(users.*) as user,
  count(posts.id) as post_count,
  array_agg(row(posts.*)) as posts
from users left join posts on posts.user_id = user.id
group by users.id

(I'm sorry if your functionality already allows for this, I couldn't tell for certain).

Which should generate a resulting struct like:

type UserWitihPostsRow struct {
  User User
  PostCount int
  Posts []Post
}

I believe this would require not even require an additional sqlc function at all. Of course as with all things, the devil's in the details.

Anyway, I was wondering if you considered this at all? [I am not a maintainer but...] this feels like a smaller change (no additional APIs needed, and therefore easier to merge?)

@nickjackson
Copy link
Contributor Author

Hi @skabbes thanks for message - I've done some research into it but I'd like to just preface that I'm no expert, I just took #984 and ran with it because it was something my team desired.

I do like the possibilities with what you are suggesting, getting data out like that would be super convenient. It could reduce the number of round trips and subsequent mappings. In my opinion, I think your suggestion should be an additional feature, not a replacement and I'll explain why.

I personally don't mind sqlc.embed() not being compatible with SQL because it's only function is to tweak sqlc behaviour. In this case, it changes how the template package produces a struct and how it scans a result into that struct. The underlying SQL query does not really change, and it produces standard sql that most users would be familiar with for the typical use cases.

I think the major issue you'll find with row() is that it requires specialised types/scanners that may vary depending on the database/driver. This is non-trivial and something that as far as I can tell sqlc is usually pretty hands-off about. Users can pick types on their main model fields, but I don't believe it's possible to pick a custom type for a field on an "anonymous" query result. If it were possible, the user could implement their own scanner implementations that handle the row() data type.

There may be a case for supporting row() with pgx's pgtype.CompositeFields directly within sqlc. I think the compiler would need to pass some additional metadata through, but the scanning implementation is pretty simple, at least for the simple row(users.*) implementation. In this case, I believe this would be a much simpler change than sqlc.embed() but I cannot speak for other driver/database implementations.

Another more selfish reason that I'd like to keep sqlc.embed() is that we use Cockroachdb and it does not support row(), but I appreciate that shouldn't hold a feature that works for postgres back.

I hope this answers your questions. 😄

@skabbes
Copy link
Contributor

skabbes commented Nov 30, 2022

I think your suggestion should be an additional feature, not a replacement and I'll explain why

With you explanation (and the fact that sqlc wants to be multi-database), I'm on board with that.

Users can pick types on their main model fields, but I don't believe it's possible to pick a custom type for a field on an "anonymous" query result

I think this feature is important for sqlc to implement, but still trying to find the sqlc-way for it to work. I had a suggestion for a sqlc.output function when we were originally doing the work for nullability.

If this feature existed, sqlc.embed would probably could been implemented with something like row_to_jsonb and a custom scanner (the same way sqlc.embed would have been implemented with row).

SELECT
  -- pseudo-code for "output column tagging"
  sqlc.output(row_to_jsonb(users), 'type', 'model.User'),
  (SELECT count(*) FROM users) AS total_count
FROM users;

Regardless you convinced me :) but also I think sending enough metadata through to code generators for my above example to have worked would also be aweseme.

Thanks for the contribution! looking forward to seeing it land in 1.17!

@richchurcher
Copy link

Not to pile on, but I'm playing with the branch and it works just as I'd expect it to. Haven't attempted anything particularly complex, but the workflow for simple joins like:

-- name: ListUsers :many
select sqlc.embed(u), sqlc.embed(t) from users u
  join things t on u.thing_id = t.id
  where u.deleted_at is null
    and u.email > $1
  order by u.email asc
  limit $2;

-- type ListUsersRow struct {
-- 	User   User
--	Thing Thing
-- }

is sensible and easy. Nice work.

@zakaria-chahboun
Copy link

Hey guys! any update for this feature?

@Emyrk
Copy link
Contributor

Emyrk commented Jan 26, 2023

@kyleconroy I see this in the milestone https://github.com/kyleconroy/sqlc/milestone/18.

Is Feb 6 an accurate timeline for this?

@richchurcher
Copy link

@nickjackson I don't know if this is helpful, but I rebased the upstream onto your branch in nickjackson#2 in case it saves you any grey hairs. I'm reasonably certain I got it right (tests are still passing, at least!)

@preslavrachev
Copy link

For the record, this did not make into the v1.17.0 release, did it?

@vmalyavin
Copy link

Great feature
Any updates?

@algorithmcardboard
Copy link

algorithmcardboard commented Mar 16, 2023

@kyleconroy This is going to be super useful feature for adopting sqlc in our org. Is there anyway I can help get this PR merged?

@kyleconroy kyleconroy merged commit c7c6a36 into sqlc-dev:main Apr 7, 2023
6 checks passed
@richchurcher
Copy link

@kyleconroy @nickjackson thanks so much for your patience in getting this across the line! I can finally stop using a fork 😆

@thesiti92
Copy link

🔥

@samfrm
Copy link

samfrm commented Apr 11, 2023

The best merge ever 😆

@zakaria-chahboun
Copy link

Finally!

@ewhauser
Copy link

@skabbes Did you open another issue for sqlc to support the row constructor? This functionality is useful even if embed does not support it.

@SauloLauers
Copy link

SauloLauers commented May 12, 2023

In a case I have a table cart as example, and a table cart_products that contains all the products of the cart. Is there any way I can retrieve the cart and a slice of cart produtcs?
The way Im making it, returns me an object with a cart and a product for each product I have.

This is just a little schema, so I can try to explain better

CREATE TABLE "cart" ( "id" BIGSERIAL PRIMARY KEY, "user_id" bigint NOT NULL, "active" boolean NOT NULL DEFAULT true, "created_at" timestamptz NOT NULL DEFAULT (now()), "updated_at" timestamptz NOT NULL );

CREATE TABLE "cart_products" ( "id" BIGSERIAL PRIMARY KEY, "id_cart" bigint NOT NULL, "id_product" bigint NOT NULL, "product_amount" bigint NOT NULL, "product_price" float NOT NULL, "created_at" timestamptz NOT NULL DEFAULT (now()), "updated_at" timestamptz NOT NULL );

ALTER TABLE "cart_products" ADD FOREIGN KEY ("id_cart") REFERENCES "cart" ("id");

At the query Ive wrote the following:

-- name: GetCartProducts :many SELECT sqlc.embed(c), sqlc.embed(p) FROM cart as c join cart_products p on (c.id = p.id_cart);

So lets say that at cart 1 I have 2 products. When I run that GetCartProducts I get the following:
{ "cart": [ { "cart": { "id": 1, "user_id": 1, "active": true, "created_at": "2023-05-12T09:38:55.696Z", "updated_at": "2023-05-12T09:38:53.065Z" }, "cartproduct": { "id": 1, "id_cart": 1, "id_product": 1, "product_amount": 5, "product_price": 100, "created_at": "2023-05-12T12:39:28.81667Z", "updated_at": "2023-05-12T12:39:28.81667Z" } }, { "cart": { "id": 1, "user_id": 1, "active": true, "created_at": "2023-05-12T09:38:55.696Z", "updated_at": "2023-05-12T09:38:53.065Z" }, "cartproduct": { "id": 2, "id_cart": 1, "id_product": 2, "product_amount": 5, "product_price": 50, "created_at": "2023-05-12T13:00:06.803834Z", "updated_at": "2023-05-12T13:00:06.803834Z" } } ], "status": "success" }

But I wanted to get:

{ "cart": [ { "cart": { "id": 2, "user_id": 2, "active": true, "created_at": "2023-05-12T09:38:55.696Z", "updated_at": "2023-05-12T09:38:53.065Z" }, "cartproduct": [ { "id": 2, "id_cart": 1, "id_product": 2, "product_amount": 5, "product_price": 50, "created_at": "2023-05-12T13:00:06.803834Z", "updated_at": "2023-05-12T13:00:06.803834Z" }, { "id": 2, "id_cart": 1, "id_product": 2, "product_amount": 5, "product_price": 50, "created_at": "2023-05-12T13:00:06.803834Z", "updated_at": "2023-05-12T13:00:06.803834Z" } ] } ] }

Yours,

Saulo Lauesr

@nickjackson
Copy link
Contributor Author

@SauloLauers sqlc.embed isn't able to do that for you, you have to do a little mapping still.

I put your example into a playground and can see the output structure is as I'd expect it to be. https://play.sqlc.dev/p/4b5a7f132ce1e0d13fffa509f666fe41ba255b568ba25ab27ca93f77052713ef

You need to create your own types for Cart and CartProduct (you may already have them in your domain/api layers) which have the structure you need, and then you map the result of GetCartProducts into that.

@SauloLauers
Copy link

@SauloLauers sqlc.embed isn't able to do that for you, you have to do a little mapping still.

I put your example into a playground and can see the output structure is as I'd expect it to be. https://play.sqlc.dev/p/4b5a7f132ce1e0d13fffa509f666fe41ba255b568ba25ab27ca93f77052713ef

You need to create your own types for Cart and CartProduct (you may already have them in your domain/api layers) which have the structure you need, and then you map the result of GetCartProducts into that.

Thank you for the quick answer. Yes, I have the structs. I just wanted to see it it was possible somehow. I tried even with a group by. But I did my work arround already mapping it at controller.

@azrshana
Copy link

@nickjackson Thanks for this PR, excited to try it out in 1.18.

I'm curious, does this work with left joins? If I have an Order struct which sometimes contain a Receipt for example. If I do a left join and use sqlc.embed, will the Order struct returned contain a nullable Receipt? Thanks!

@gnuletik
Copy link

gnuletik commented Jun 2, 2023

@azrshana If you use a LEFT JOIN, the embedded struct will not be nullable.
Instead, you may have a scan error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet