pointer & omitempty is cumbersome/impossible with a large schema #149

nathanstitt · 2021-10-28T20:20:45Z

First let me thank you for your great work on genqlient, it has a LOT of promise and fills a big need in the golang GraphQL space. I've been searching for something like it for quite awhile.

In my opinion, this behavior would be ideal:

all references to other relationships should default to a pointer
omitempty be allowed even if the arguments are required.

I've added an example repo that may be easier to follow than the below verbiage. It's at https://github.com/nathanstitt/genqlient-stress-test

My reasoning for this is:

I have a project that's using the Hasura Graphql Engine https://hasura.io which generates a schema for query/mutations on database tables. Hasura has a very comprehensive query language that supports nesting and/or type queries using GraphQL types.

I've been using github.com/machinebox/graphql with the project for some time but would love to have it be type safe and not require me to hand craft structs that match the existing GraphQL schema.

I attempted to migrate to genqlient but ran into several issues that you may find enlightening if I explain them.

My schema has nested types like so:

input client_bool_exp {
  _and: [client_bool_exp!]
  _not: client_bool_exp
  _or: [client_bool_exp!]
}

This requires _and, _not, and _or to be a pointer, otherwise the generated code won't compile due to:invalid recursive type Client_bool_exp errors.

I ended up with a list of 122 @genqlient(for: "<struct>._not", pointer: true) statements in order to get it to compile, and I only have 35 tables 😭

Once it was compiling, I then encountered issues with mutations inserting undesired zero values into the database because omitempty wasn't set.

Rather than attempt to repeat the above pointer exercise, I set all mutations to be prefixed with @genqlient( omitempty: true). That failed because the inputs were required, and if I attempted to use for, it then fails with: for is only applicable to operations and arguments. I made a tiny PR to allow that #148 but @csilvers suggested I first make an issue to discuss first.

FWIW: #15 also touches on this subject as well

I'm attaching my 25K line schema just in case you want to review it. If you really want to torture yourself, attempt to write this mutation against it:

mutation insertAccount(
  $tempUserId: bigint!,
  $account: account_insert_input!
) {
  insert_account_one(object: $account) {
	id
  }
  delete_user_temp(where: { id: {_eq: $tempUserId} }) {
	affected_rows
  }
}

schema.graphql.zip

The text was updated successfully, but these errors were encountered:

csilvers · 2021-10-28T23:15:21Z

@benjaminjkraft is probably the right person to weigh in on this, he knows a lot more about this code than anyone else.

It would be very helpful if you could include a mutation that has the appropriate genqlient annotations applied to it. That is, an example of one of your mutations (it doesn't have to be all of them) with the @genqlient(for: ..., pointer: true) directives that you needed to add to get it to compile. And also, if possible, all the @genqlient(for: ..., omitempty) directives you'd need to add to avoid the spurious zeros.

I know you're resistant to adding a bunch of omitempty directives, given you already had to add a bunch of pointer:true directives, but I think it will add value to at least see what that looks like!

nathanstitt · 2021-10-29T00:49:09Z

sure thing, it was actually the above insertAccount mutation that was ugly here's an example the pointers I needed to add:

# @genqlient(for: "account_user_role_bool_exp._not", pointer: true)
# @genqlient(for: "availability_bool_exp.user", pointer: true)
# @genqlient(for: "availability_bool_exp._not", pointer: true)
# @genqlient(for: "billing_info_bool_exp._not", pointer: true)
# @genqlient(for: "account_user_bool_exp._not", pointer: true)
# @genqlient(for: "account_on_conflict.where", pointer: true)
# @genqlient(for: "account_on_conflict.constraint", pointer: true)
# @genqlient(for: "account_bool_exp._not", pointer: true)
# @genqlient(for: "file_insert_input.parent", pointer: true)
# @genqlient(for: "client_insert_input.account", pointer: true)
# @genqlient(for: "client_insert_input.time_recorded", pointer: true)
# @genqlient(for: "project_insert_input.permissions", pointer: true)
# @genqlient(for: "task_insert_input.file_link", pointer: true)
# @genqlient(for: "project_insert_input.pcat", pointer: true)
# @genqlient(for: "account_insert_input.owner", pointer: true)
# @genqlient(for: "account_insert_input.client", pointer: true)
# @genqlient(for: "users_obj_rel_insert_input.on_conflict", pointer: true)
# @genqlient(for: "account_user_obj_rel_insert_input.on_conflict",  pointer: true)
# ...100 or so others ommited
mutation insertAccount(
  $tempUserId: bigint!,
  $account: account_insert_input!
) {
  insert_account_one(object: $account) {
	id
  }
  delete_user_temp(where: { id: {_eq: $tempUserId} }) {
	affected_rows
  }
}

The issue is the account table indirectly links to just about every other table in some way. Due to how rich the Hasura filtering is, it's possible to select an account based on any other relation. Even if I was willing to do all the pointer statements for this mutation, I'm a bit afraid I'll hit the same issue with a different one later.

Of course I could have use a blanket # @genqlient(pointer: true) statement, but then I'd have to use a pointer for even the basic types, which wouldn't allow things like:

account := Account_insert_input{
    Account_level:   &"trial", // error, cannot take address of untyped string contstant
}

That's why I think it'd be ideal if genqlient defaulted to pointers for properties that are a struct value, but not for basic types.

nathanstitt · 2021-10-29T00:56:03Z

@csilvers I'm wiling to add the omit empties as well to see what it'd look like, but I'm prevented from doing so due to this error:

I have a simple mutation:

input message_insert_input {
  body: String
  subject: String
  update_user: bigint
}
mutation insertMessage($msg:  message_insert_input!) {
  insert_message_one(object:$msg) {
    id
  }
}

Here I do not want to specify the update_user because that's set to current_user by the server and shouldn't need to be set.

If I attempt this:

mutation insertMessage($msg:  message_insert_input!) {
  # @genqlient(for: "message_insert_input.update_user", omitempty: true)
  insert_message_one(object:$msg) {
    id
  }
}

I receive error:
for is only applicable to operations and argument

I tried various other combinations and moving it to before the mutation line, but couldn't see any way to do it

nathanstitt · 2021-10-29T14:59:24Z

@csilvers to provide more clarity around this, I've created a example repo with the schema and a few queries/mutations.

It also has an example struct in main.go that is the main one giving me issues when I attempt to create the above accountInsert mutation.

https://github.com/nathanstitt/genqlient-stress-test

csilvers · 2021-10-29T15:26:48Z

~~The for takes a variable name, not a type name~~. (Totally not true!) And it should go before the mutation. So:

@genqlient(for: "msg.update_user", omitempty: true)
mutation insertMessage($msg:  message_insert_input!) { ... }

csilvers · 2021-10-29T15:29:50Z

OK, it looks like this should work:

# @genqlient(for: "message_insert_input.update_user", omitempty: false)
mutation ... { ... }

What error do you get when you try that?

Here is an example in the genqlient repo:

# @genqlient(omitempty: true)
# @genqlient(for: "UserQueryInput.id", omitempty: false)
query OmitEmptyQuery(
  $query: UserQueryInput,
  $queries: [UserQueryInput],
  $dt: DateTime,
  $tz: String,
  # @genqlient(omitempty: false)
  $tzNoOmitEmpty: String,
) {
  user(query: $query) { id }
  users(query: $queries) { id }
  maybeConvert(dt: $dt, tz: $tz)
  convert2: maybeConvert(dt: $dt, tz: $tzNoOmitEmpty)
}

What is the full definition of the message_insert_input type?

senseijack · 2021-10-29T15:53:18Z

FWIW, we're having the same issue with a schema generated by Hasura, illegal cycles in generated code due to the *_bool_exp types in the schema. We haven't been able to compile at all, even using directives the same way nathanstitt has. It would be nice if the generated code automatically used pointers for struct values.

nathanstitt · 2021-10-29T15:59:19Z

The definition of message_insert_input is here https://github.com/nathanstitt/genqlient-stress-test/blob/main/schema.graphql#L9553-L9564 which gets compiled to this: https://github.com/nathanstitt/genqlient-stress-test/blob/main/generated-gql.go#L1754-L1766

At first I wondered if it was because I was using omit on a mutation and your example is a query, but I just added an example to my test repo using the identical syntax as the test data query and still get the error 😡

I'll try to dig into this further by closely comparing what we're doing differently

nathanstitt · 2021-10-29T17:06:30Z

@senseijack I was fortunate that there was only one mutation that triggered the recursive cycles so it was somewhat feasible to use the directives as a workaround. When I was testing, if I added another similar mutation it would also require the same directives, which is completely unfeasible to maintain. My DB structure is also fairly simple with only 35 tables. I shudder to think what it'd be like for a larger one.

csilvers · 2021-10-29T17:20:04Z

I just added an example to my test repo using the identical syntax as the test data query and still get the error rage

Weird! Would be great to understand what's going on there.

FWIW, we're having the same issue with a schema generated by Hasura

It sounds like it will definitely be a good idea to do something to improve use with Hasura. @benjaminjkraft any ideas?

benjaminjkraft · 2021-11-02T20:41:10Z

Hello and thanks for raising this discussion! It sounds like there are a few separate problems here, so let me comment on each one in turn.

The first issue is the errors around omitempty. This is just an awkward syntactic choice which is that directives apply to all nodes on the next line. So in your example when you do

# @genqlient(omitempty: true)
# @genqlient(for: "account_bool_exp.id", omitempty: false)
query findAccount($query:  account_bool_exp) {
  ...
}

the directives actually apply to both query findAccount and $query: account_bool_exp; you only want the former as the latter is an error. I think it's clear by now that this choice is too confusing as-is, so I'll file a separate issue (edit: #151) to simplify things, but to avoid the problem you can use separate lines like so:

# @genqlient(omitempty: true)
# @genqlient(for: "account_bool_exp.id", omitempty: false)
query findAccount(
  $query:  account_bool_exp,
) {
  ...
}

(If that doesn't work, if you could file a separate issue with the details, including the exact error message, that would be great! We can keep this issue to discuss the other problems.)

The remaining issues I see are all around the need to add a lot of omitempty and pointer options for large (autogenerated) schemas, especially for mutation input types. But I want to break that out a little further, because I don't know that there will be a one size fits all approach. Here are the different causes I see:

For types input T { f: T, ... } (or even f: T!) you always need pointer because Go (reasonably) doesn't allow struct T { f T; ... }, so the default generated code doesn't compile.
For other object (struct) valued fields you find you always want pointers so as to handle nulls. Is this only when the object is optional, or always? Is it for input types only, or also for output types? But you don't want pointers for basic types, because Go makes it tricky to write those (see also proposal: expression to create pointer to simple types golang/go#45624).
For optional basic input types, you find you always want omitempty. I think if the above issue is resolved, you can do that by putting omitempty on each mutation, although maybe you'd prefer to use a global default. (That would also avoid the potential to run into Validate against conflicting options on the same type #123 if you're inconsistent.)

For problem 1, it's a bit of a special case but I think it would be reasonable for genqlient to just do the right thing automatically; there's no real alternative and the current code is broken. I'll file a separate issue for that (edit: #152)

For problem 2, I don't think it's a good idea to change the default at this point, but it could make sense to say you can ask for that as an option. (In principle you could do that at the query level or globally, but it seems to me that globally is going to make things simpler for a schema like yours; again see also #123.) A similar option might help with problem 3.

So that's what I propose for this issue: (an) option(s) in genqlient.yaml to configure the defaults for pointer and omitempty and/or for optional fields. I think that will provide you a lot of value without further complicating the list of options (I feel we've already gone too far down that road). For pointer we could allow different defaults for optional vs. required fields, and for structs vs. basic types. (I think omitempty as currently implemented only makes sense for optional basic types.) Does that make sense to you? For pointer, which of those would you want to configure? Any other cases you see where you think genqlient should be able to "do what you mean" better?

nathanstitt · 2021-11-03T15:23:54Z

That is a great summary @benjaminjkraft.

I do find errors around the omitempty statement, but you're correct we shouldn't distract this issue with them. I'll document them further and file a new new issue.

For types input T { f: T, ... } (or even f: T!) you always need pointer because Go (reasonably) doesn't allow struct T { f T; ... }, so the default generated code doesn't compile.

Yes, this is the biggest issue I found, and the cause of the huge list hand generated pointer: true statements I needed. I think if this was resolved, genqlient would at least be usable for my case.

For other object (struct) valued fields you find you always want pointers so as to handle nulls. Is this only when the object is optional, or always? Is it for input types only, or also for output types? But you don't want pointers for basic types, because Go makes it tricky to write those (see also proposal: expression to create pointer to simple types golang/go#45624).

I found that input are the real pain point here. Without struct valued fields being marked as optional, generated mutations grow to a huge size for even simple statements. I hadn't considered the optionality of the field, but after reflection, it seems to me that there's really no point (hah) having a pointer for a required field since it'll always be set

I don't think outputs being marked as pointer by default is too important. Using value types does have benefits for code safety because you won't unintentionally access nil pointer. The large struct trees likely cause some memory bloat though, so It's fine with me either way.

However, there are numerous shared types between query & mutations so whichever rule we establish would need to apply to most of my generated code regardless.

For optional basic input types, you find you always want omitempty. I think if the above issue is resolved, you can do that by putting omitempty on each mutation, although maybe you'd prefer to use a global default. (That would also avoid the potential to run into Validate against conflicting options on the same type #123 if you're inconsistent.)

Yes, I agree. I don't think we need to concern ourselves with setting a global default for omitempty on basic types. I'm fine with manually adding a few omitempty as needed. I looked at my current hand generated structs and I have only set omitempty on a handful of basic fields. I do have omitempty and a pointer type used for almost all the fields that are references to other struct types.

(an) option(s) in genqlient.yaml to configure the defaults for pointer and omitempty and/or for optional fields.

Yes that sounds perfect! Depending on how many issues arise in the future you can debate whether the default should be true or false but having it would give users like myself a much needed control

For pointer we could allow different defaults for optional vs. required fields, and for structs vs. basic types. (I think omitempty as currently implemented only makes sense for optional basic types.) Does that make sense to you? For pointer, which of those would you want to configure? Any other cases you see where you think genqlient should be able to "do what you mean" better?

I think we only need to configure optional field that contains a struct value. Required fields will always need to be present, so no need for a pointer. However, if it's easier to implement all struct values as pointer that's also ok. It also may be easier to explain in documentation if the rule is just "struct fields are pointers and optional"

I also think that if the field is a pointer, then omitempty should also be set A pointer is by definition optional since it can be nil.

Something that we haven't mentioned is array types. I do not think those should be a pointer. A pointer doesn't offer any benefits and in code it's much easier to just call len(ary) and not have to worry about it being nil. An array containing pointer types is fine, but not really needed.

Those were great clarifying questions - thanks so much for taking the time to think through this so deeply.

I think that a single preference to set the default on struct valued fields to be pointer: true, omitempty: true would fix most of my issues here. Naming is hard, but maybe use_optional_references or some such would convey that sentiment

While I don't want to over-promise in regards to my schedule and available time, I can take a crack at a PR if you guys are busy atm.

benjaminjkraft · 2021-11-03T19:24:53Z

Great! It sounds like we're very much agreed on the general approach. A few specific notes:

I'm fine with either optional structs or all structs being pointers for this option. I think with structs it doesn't make a huge difference. If one is significantly easier to implement, that seems great, although I'd guess it won't matter.
Arrays of structs are an interesting question. In principle if you have an [T] you do want to be able to pass [null], so pointers would make sense. In practice I think it's much rarer to do that, especially as an input type. So again I can go either way here, whichever is easier to implement or explain. It does make me think it should just be for optional structs, though, as I see less value in making [T!] a []*T`.
On omitempty: for pointer values it doesn't actually matter much semantically -- since pointers get serialized to null, the only difference is whether the input includes "field": null or simply omits field entirely. GraphQL treats both the same as far as I know, although since we're talking about large objects with few fields set probably if it's not too hard we may as well omit it, as you suggest, to save on bytes.

I am happy for you to go ahead and implement this! Or if you find you don't have time or it gets too involved I can probably do it too.

benjaminjkraft · 2021-11-13T03:54:55Z

Fixed by #155. Anyone using this option, please report back to let us know how it goes in the real world! We can add additional knobs if necessary.

senseijack · 2021-11-13T05:58:52Z

Thanks for the quick implementation of this feature. I'll test this in our staging environment next week and let you know how it works out.

senseijack · 2021-11-15T22:03:35Z

@benjaminjkraft I tested this option in production today, it worked perfectly.

kantum · 2023-01-18T18:26:51Z

@benjaminjkraft I'm also using hasura and used this option successfully, however when I want to add more directives on my operations, they sometimes do apply but if I run go run github.com/Khan/genqlient again it does not work anymore.

Is it possible to use both this option and directives or not?

benjaminjkraft · 2023-01-18T18:56:00Z

@kantum you should always be able to combine options and directives (directives take precedence of course). It sounds like you may be hitting another bug or misbehavior, could you file a new issue? (Or take a look at #151 to see if any of the confusions there explain your issue.)

benjaminjkraft · 2023-05-06T20:23:04Z

Hasura users -- please take a look at #272 and add any thoughts!

nathanstitt added the enhancement New feature or request label Oct 28, 2021

nathanstitt mentioned this issue Oct 28, 2021

allow omitempty even if arguments are required #148

Closed

This was referenced Nov 2, 2021

Clarify to which nodes genqlient directives apply #151

Open

Correctly handle self-recursive types automatically #152

Open

benjaminjkraft added help wanted Issues that anyone could pick up and implement if useful to them needs design Issues where we don't know exactly what we need yet; discuss on the issue before implementing labels Nov 2, 2021

nathanstitt mentioned this issue Nov 8, 2021

Add "struct_references" configuration #155

Merged

benjaminjkraft closed this as completed in 2fdbb62 Nov 13, 2021

benjaminjkraft mentioned this issue May 6, 2023

[Meta] Document and improve support for Hasura & Dgraph schemas #272

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pointer & omitempty is cumbersome/impossible with a large schema #149

pointer & omitempty is cumbersome/impossible with a large schema #149

nathanstitt commented Oct 28, 2021 •

edited

Loading

csilvers commented Oct 28, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

csilvers commented Oct 29, 2021 •

edited

Loading

csilvers commented Oct 29, 2021

senseijack commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

csilvers commented Oct 29, 2021

benjaminjkraft commented Nov 2, 2021 •

edited

Loading

nathanstitt commented Nov 3, 2021

benjaminjkraft commented Nov 3, 2021

benjaminjkraft commented Nov 13, 2021

senseijack commented Nov 13, 2021

senseijack commented Nov 15, 2021

kantum commented Jan 18, 2023

benjaminjkraft commented Jan 18, 2023 •

edited

Loading

benjaminjkraft commented May 6, 2023

pointer & omitempty is cumbersome/impossible with a large schema #149

pointer & omitempty is cumbersome/impossible with a large schema #149

Comments

nathanstitt commented Oct 28, 2021 • edited Loading

csilvers commented Oct 28, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

csilvers commented Oct 29, 2021 • edited Loading

csilvers commented Oct 29, 2021

senseijack commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

nathanstitt commented Oct 29, 2021

csilvers commented Oct 29, 2021

benjaminjkraft commented Nov 2, 2021 • edited Loading

nathanstitt commented Nov 3, 2021

benjaminjkraft commented Nov 3, 2021

benjaminjkraft commented Nov 13, 2021

senseijack commented Nov 13, 2021

senseijack commented Nov 15, 2021

kantum commented Jan 18, 2023

benjaminjkraft commented Jan 18, 2023 • edited Loading

benjaminjkraft commented May 6, 2023

nathanstitt commented Oct 28, 2021 •

edited

Loading

csilvers commented Oct 29, 2021 •

edited

Loading

benjaminjkraft commented Nov 2, 2021 •

edited

Loading

benjaminjkraft commented Jan 18, 2023 •

edited

Loading