
Incorrect ordering of fragments used for SHA256 hash in operationIdentifier #3269

Closed
scottasoutherland opened this issue Oct 30, 2023 · 10 comments · Fixed by apollographql/apollo-ios-dev#130
Assignees: AnthonyMDev
Labels: bug (Generally incorrect behavior) · needs investigation · planned-next (Slated to be included in the next release)

scottasoutherland

Summary

We have a test which computes the SHA256 of all of our operations to verify that each operationIdentifier is correct. The reasoning is that engineers would sometimes not check in the whole set of GraphQL changes, leaving the operationIdentifier inaccurate. However, in Apollo 1.6.0 we ran into a problem where our operationIdentifiers output would randomly change to the value of a different, unrelated operation. Digging into it, it seemed we were getting incorrect cache hits for unrelated operations inside the Apollo codegen code.
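
As a sketch, a verification test of the kind described might look like the following. The MyQuery type and its operationIdentifier / queryDocument properties are hypothetical stand-ins for an Apollo-generated operation; the exact generated API may differ.

import CryptoKit
import Foundation

// Sketch only: `MyQuery` stands in for a codegen-generated operation type.
func verifyOperationIdentifier() {
    let document = MyQuery().queryDocument        // operation source plus fragments
    let expectedId = MyQuery.operationIdentifier  // emitted by codegen

    let digest = SHA256.hash(data: Data(document.utf8))
    let computedId = digest.map { String(format: "%02x", $0) }.joined()

    assert(computedId == expectedId,
           "operationIdentifier does not match SHA256 of queryDocument")
}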

However, we noticed this code was recently refactored here, so we decided to update to the most recent revision of Apollo iOS Codegen (revision 724674751239f30176ebdd0581f2f70bc6e250f2) to see if that resolved the issue for us. It did! But now we are observing that, for a small number of our operations, the SHA256s no longer match. They stay consistent across multiple runs of codegen, but they always mismatch the value computed in our test. Digging into this, it seems the reason is that the order of fragments in the string used to compute the hash is DIFFERENT from the order they end up in within the final fragments array used in OperationDefinition.

We also tried updating to revision 21cacc4d5eabe49479d82f8c521b1d9f55da1ae4 of codegen, but the issue persists.

To clarify, it seems like the string used here, definitionSource (https://github.com/apollographql/apollo-ios-codegen/blob/21cacc4d5eabe49479d82f8c521b1d9f55da1ae4/Sources/ApolloCodegenLib/OperationIdentifierFactory.swift#L59), and the string used here, queryDocument (public var queryDocument: String), don't match: they have a different ordering of fragments. They are mostly the same, but with the order of a few fragments swapped.

And importantly, the new version of codegen matches neither Apollo 1.6.0 nor what our server would compute for the few queries that have a mismatch.
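
To illustrate why the ordering matters (an illustration only, not Apollo's actual implementation): the identifier is a SHA256 over the full document string, so concatenating the same fragment sources in a different order yields a completely different digest.

import CryptoKit
import Foundation

func sha256Hex(_ string: String) -> String {
    SHA256.hash(data: Data(string.utf8))
        .map { String(format: "%02x", $0) }
        .joined()
}

let operation = "query Q { ...A ...B }"
let fragA = "fragment A on Query { a }"
let fragB = "fragment B on Query { b }"

// Same operation and fragments, two concatenation orders, two different ids.
print(sha256Hex([operation, fragA, fragB].joined(separator: "\n")))
print(sha256Hex([operation, fragB, fragA].joined(separator: "\n")))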

Version

1.6.0 & codegen revision: 21cacc4d5eabe49479d82f8c521b1d9f55da1ae4

Steps to reproduce the behavior

  • Create queries with lots of fragments.
  • Run codegen with operationIdentifiers output enabled.
  • At runtime, compute the SHA256 of the queryDocument.
  • Compare it to the operationIdentifier and note that the values sometimes do not match.

Logs

No response

Anything else?

No response

scottasoutherland added the bug (Generally incorrect behavior) and needs investigation labels Oct 30, 2023

ihnatmoisieiev commented Nov 1, 2023

Hello,
we're observing the same issue as you described at the beginning of your report with versions 1.6.0 and 1.6.1:

"operationIdentifiers output would randomly change to the value of a different, unrelated operation."

We run into it from time to time, and it blocks our development process.
I'd really appreciate any advice on how to resolve the issue.
Thank you!

UPD: could be connected to #3270

AnthonyMDev self-assigned this Nov 1, 2023
AnthonyMDev added this to the Patch Releases (1.x.x) milestone Nov 1, 2023
AnthonyMDev added the planned-next (Slated to be included in the next release) label Nov 1, 2023
AnthonyMDev (Contributor)

Thanks so much for this detailed bug report. We're releasing version 1.7.0 today, so this fix won't make the cut for that release, but I'll start looking into this tomorrow and will release a patch version with a fix ASAP. I'll keep this issue updated with any findings and progress.

> To clarify, it seems like the string used here, definitionSource, and the string used here, queryDocument, don't match: they have a different ordering of fragments. They are mostly the same, but with the order of a few fragments swapped.

That seems likely to be the case! I'll confirm and should be able to get this resolved. Thank you for digging into it; providing that information makes it much quicker for me to get this fixed!

scottasoutherland (Author)

@ihnatmoisieiev I believe updating to the most recent commit of apollo-ios-codegen will fix that particular issue for you, but then you'll wind up with this one.

AnthonyMDev (Contributor)

@scottasoutherland I'd like to confirm something.

Is it important to you that the ordering of the fragments is a specific ordering (i.e., one that matches the behavior of 1.6.0), or just that the ordering is consistent (i.e., the same when computing operation ids and in the queryDocument source in the generated model)?

It would be a lot easier for us to re-order the fragments in the queryDocument to match the order used in id computation than vice versa.

scottasoutherland (Author)

> @scottasoutherland I'd like to confirm something.
>
> Is it important to you that the ordering of the fragments is a specific ordering (i.e., one that matches the behavior of 1.6.0), or just that the ordering is consistent (i.e., the same when computing operation ids and in the queryDocument source in the generated model)?
>
> It would be a lot easier for us to re-order the fragments in the queryDocument to match the order used in id computation than vice versa.

We don't really care, so long as they are consistent between the two and the hash can be verified and re-computed.
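
(A sketch of that direction, for illustration only, not Apollo's internal code: if both the id computation and the generated queryDocument derive their fragment list from one shared, deterministic traversal, the two can never disagree. The Fragment type below is hypothetical.)

// Hypothetical model of a named fragment and the fragments it references.
struct Fragment {
    let name: String
    let source: String
    let referencedFragments: [Fragment]
}

// Collects transitive fragment dependencies depth-first, keeping the first
// occurrence of each name, so every call site gets the same deterministic order.
func collectFragments(_ roots: [Fragment]) -> [Fragment] {
    var seen = Set<String>()
    var ordered: [Fragment] = []
    func visit(_ fragment: Fragment) {
        guard seen.insert(fragment.name).inserted else { return }
        ordered.append(fragment)
        fragment.referencedFragments.forEach(visit)
    }
    roots.forEach(visit)
    return ordered
}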

AnthonyMDev (Contributor)

Okay great. Thanks

AnthonyMDev (Contributor)

@scottasoutherland Would you be able to provide me an example case of an operation with fragments that reproduces this bug?

I'm having a hard time reproducing whatever edge case is causing this behavior.

scottasoutherland (Author)

I don't think I will be allowed to share the exact operation and fragments that are causing this, but I can say that we probably have 100+ operations and it's happening to 2 of them, both of which have a lot of fragments. I know that isn't especially helpful, but I wanted to give an idea of how often it is happening.

AnthonyMDev (Contributor) commented Nov 7, 2023

Sure, I don't expect you to be able to share your actual operation and fragments. I guess I'm wondering if you could build me an example in which the edge case occurs. It's unlikely that the sheer number of fragments is causing the issue; it's more likely something about the shape of the fragments and the dependencies between them.

I have a query like this that I'm testing with, and everything seems to be working fine. I've tried moving things around, but I'm not able to reproduce the issue yet.

query NameQuery {
  ...Fragment1
  ...Fragment4
}

fragment Fragment1 on Query {
  name
  ...Fragment2
}

fragment Fragment2 on Query {
  name
  ...Fragment3
}

fragment Fragment3 on Query {
  name
}

fragment Fragment4 on Query {
  name
  ...Fragment5
}

fragment Fragment5 on Query {
  name
}

scottasoutherland (Author)

Just wanted to confirm that I tested this change out and it seems to have fixed the issue for us. Thanks for the fast turnaround!
