New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Null response when resolving a non-null filed from a different subgraph #1308
Comments
this is a tricky one, because the router is doing the right thing here. It has to present the supergraph as if it was one monolithic graph, so the basic error handling rules apply: if But the behaviour of the gateway looks useful here. Perhaps that's something that should be provided by client controlled nullability instead https://github.com/graphql/graphql-wg/blob/main/rfcs/ClientControlledNullability.md#-behavior |
I understand that it technically violates composed schema but as I have said, returning |
@Geal Do you think you can add something like |
@Mithras modifying the error management in that way would be a significant amount of work, and would potentially break other parts, so this is not something we can do as a short term solution, it will require some thought before we reach a satisfying solution. First, can you give me some information about your setup with the gateway (ie package version, federation 1 or 2, are there some custom extensions)? Because we tried to reproduce with the gateway and it behaves like the router: https://stackblitz.com/edit/basic-federation-7ak1t8?file=one.js,index.js |
I tested it and yes, that's because we skip validation in Gateway for performance reasons (about 2x throughput without validation) with this yarn patch:
because we don't really see much benefit of re-validating already valid responses from subgraphs at a cost of halving throughput.
Personally I'm in favor of option 2 but I would probably opt-in option 3 as well if it were available because for me performance >>> a couple schema mismatches. Our subgraphs always return valid schema and re-validating it in Router/Gateway doesn't seem necessary to me. I wouldn't use option 3 without option 2 though because some generated clients might not work if there is nothing in response for a non-nullable field. |
about the performance issue, I think you can revisit that decision for the router, because validation is much cheaper there. Having this last validation step can prevent bugs like, as an example, one instance of a subgraph that did not get the latest upgrade and returns invalid data (but valid from its point of view). It's more a matter of how you design your graph. In some contexts, you can still process a response without one of the requested fields, so it should be nullable, like So none of the 3 options are really satisfactory. Option 1 is too coarse grained, you might still benefit from non nullable fields in some parts of the schema, 2 is bound to create surprising issues and 3 will not happen because the bare minimum we need to do is follow the GraphQL specification |
Sure. The only reason I did this in Gateway was 2x performance boost. Router seems to be faster than Gateway even with validation enabled.
I disagree. Again, the problem is adding a non-null field to an entity will ALWAYS make some requests return
What issues do you see with option 2? I'm suggesting an optional flag that changes non-null fields to null fields. First of all, it's an option. People that are ok with null responses (I really want to see anybody who is ok with that though) don't have to opt-in. Second, making non-null fields nullable will never result in any schema violations. I don't see any issues with that. Just to make sure we are on the same page. When more than one subgraph contributes to entity fields, it is not possible to guarantee these fields won't be null. It does not make sense to have them non-nullable in supergraph schema. Non-null fields can only be enforced at each individual subgraph level. Never at supergraph level. |
It cannot be guaranteed, in the same way that an entity's field in a monolithic service could fail for various reasons (missing data, can't find file, unresponsive DB, etc).
that's where we are not aligned. It has nothing to do with federation. Marking a field as non-nullable means one thing: when it is requested, it does not make sense to get the entity's data if that field is missing. If the entity should be available even without that field, then that field has to be nullable, and that's a decision taken in the schema design, not in the router or rover. Where I(personally) think non-nullability is creating issues is that it should not be a decision from the server side, the client should actually be able to decide which data it can live without. So we're looking at the ongoing work around client controlled nullability. This discussion is going beyond the scope of what we are doing in the router, as it follows the consensus around the GraphQL specification and federation. So if it has to change, you should start by opening an issue on https://github.com/apollographql/federation and propose updates to the nullability behaviour. |
Router doesn't fail. It returns null without any reason. There is a reason services return errors and instead of just null. It's not going anywhere but let me give another good try. Subgraph A is guranteed to return propertyA for all entities it knows about so it defines this:
Is this wrong? I don't think so. Subgraph B is guranteed to return propertyB for all entities it knows about so it defines this:
Is this wrong? I don't think so. Now comes Apollo and composes it into
Is this correct? I don't think so. There is no way to guarantee either
Does this make sense? There is no way to guarantee fields won't be null in composed schema even when all subgraphs and the Router are working exactly as they should. Expecting both |
If the lack of
This is the specified behavior of GraphQL. The "whole response is set to null" because the field error is bubbling up to the
Many users rely on behavior that is not what you are describing here, but we can certainly imagine how a federated directive could help here, and as already stated, Client-Controlled Nullability might be what you want. Linting rules for the graph could also convey to your developers that non-null fields should be avoided. Some particular requests across federated graphs do rely on non-nullability to enforce and benefit from a simplified client experience by not returning what are, to them, unstable results. This doesn't sound like it is your use case!
If this is the case, then you should make
I don't understand this. The entities are keyed by a primary key. That primary key is the way to look up the entity in both subgraphs. If one of the subgraphs can't be assured to find that entity by its primary key and you want the rest of the operation to execute in its absence then you should again specify that in the schema by removing the non-null operator. This is a schema designer's choice.
I find that saying "ALWAYS" and "some requests" here to be quite confusing. Some requests will return
As pointed out above might be a comfortable exception in your case, doesn't guard against mis-implementation in a defensive-enough way to still offer the contract against the schema which the client expects. I'm curious why you didn't decide to disable validation in the subgraphs rather than disabling them in the point closest to the client where the validation perhaps matters the most? You'd still be halving the amount of validations you're doing, while upholding the contract with the schema. I can certainly understand how the behavior you're encountering isn't what you want. Feel free to respond to anything I said above, but I'm going to close this as "won't fix" because I'm certain that, aside from the error bubbling I noted above (which again, is important to fix), is best fixed by tooling outside of the Router since the Router merely operates on the Supergraph that it's been offered as input and the schema is contract we should obey. To be clear, that doesn't mean that there isn't a good solution to live elsewhere — just not in the Router. Thanks for opening this conversation originally! |
@Mithras I just want to clarify something since both of these sentences give me the impression that you're talking about marking or making something null: an additional step which you could see being easily missed. (Which resonates with me, if I'm understanding.), Nullable is the default in GraphQL. It's the addition of the I don't honestly believe that's obvious about GraphQL. That's beyond the scope of this conversation, but probably worth clarifying since it sounds like we're not seeing eye to eye here on a couple things. I do still think you're looking for tooling though to help you govern the way your graph is built though (e.g., linting), and I believe that Rover and Studio are great places for tooling, as you've suggested. |
It doesn't matter if you mark nullable or mark not nullable. The result is the same - nullability can be only guaranteed at subgraph level, not at supergraph level |
Here is a simple graphql-inspector rule in case somebody else doesn't want to get module.exports = ({ changes, newSchema }) => {
typeMap = newSchema.getTypeMap();
for (const typeName in typeMap) {
var type = typeMap[typeName];
if (type[Symbol.toStringTag] !== "GraphQLObjectType") {
continue;
}
if (!type.astNode) {
continue;
}
var keyDirectives = type.astNode.directives.filter(x => x.name.value === "key");
if (!keyDirectives.length) {
continue;
}
// var keys = keyDirectives.map(x => x.arguments.find(y => y.name.value === "fields").value.value);
for (const field of type.astNode.fields) {
// if (keys.includes(field.name.value)) {
// continue;
// }
if (field.type.kind === "NonNullType") {
changes.push({
criticality: { level: "BREAKING" },
type: "NON_NULL_FIELD_IN_FEDERATED_ENTITY",
message: `Non-nullable field '${field.name.value}' in federated type '${typeName}'`,
path: `${typeName}.${field.name.value}`
});
}
}
}
return changes;
}; |
So we are seeing a (possibly?) related issue with nullable fields being skipped. A query like
returns (with variables
while
returns
Per https://spec.graphql.org/October2021/#sec-Data:
but this does not seem to be an error case (or vars vs no vars would be equivalent). The spec is a bit vague about what happens if there is no error, but also nothing to return. |
Describe the bug
This one is tricky. Let me try to explain with an example:
SubgraphA
andSubgraphB
.SubgraphA
definesMyEntity
like thisSubgraphB
definesMyEntity
like thisSubgraphA
also defines a fieldmyEntity: MyEntity!
which returnsMyEntity
with id=1.Now if we try to run a query like this
two things can happen:
SubgraphB
can resolveMyEntity
with id=1, everything works fineSubgraphB
can not resolveMyEntity
with id=1, the whole response is set to null.By "can not resolve" I mean
SubgraphB
doesn't know aboutMyEntity
with id=1 and returns{ data: [null] }
forExpected behavior
Gateway was just omitting
propertyB
for case 2 above. Router should probably do the same.Desktop (please complete the following information):
Additional context
Everything works as expected if we make
propertyB
nullable but that's not a solution because it would require making pretty much all fields except IDs nullable.Maybe related: #1290
Somewhat related: #1304
The text was updated successfully, but these errors were encountered: