Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GA4GH GKS API review #1472

Closed
3 tasks
theferrit32 opened this issue Apr 3, 2024 · 8 comments
Closed
3 tasks

GA4GH GKS API review #1472

theferrit32 opened this issue Apr 3, 2024 · 8 comments
Assignees

Comments

@theferrit32
Copy link

Some schema notes I took while testing the va API:

(1) Numeric type mismatch means graphql cannot map value fields to the same query field.

  • SequenceLocation.start is a
    • DefiniteRange
    • IndefiniteRange
    • Number
  • IndefiniteRange.value is a jsonschema "number" (float)
  • Number.value is a jsonschema "integer" (int)
    So currently, one/both of these must be aliased.
    e.g. Number.value -> number_value, IndefiniteRange.value -> indefinite_range_value

(2) CURIE is a complex type instead of a simple string.

  • CURIE structure in the graphQL schema nests the string in an object under a .value field.
  • e.g. "17-43044295-T-TG"
    ->
{
  "_id":{
    "value":"ga4gh:VA.wN_AHuGDip6BqcNMxK22dUwfVB3A-4hf"
  },
  "type":"Allele",
  "location":{
    "_id":{
      "value":"ga4gh:VSL.XulT2LDDRynmnS1xxK1BdlVtkIYlF1gt"
    },
    ...
  },
  ...
}

(3) VAFocusAlleleURI is a complex type instead of a simple string. Similar to CURIE.

No specific changes needed immediately, but another note:
(4) Slow-start for the scaled down cloud run / hail cluster is inconvenient,
first requests fail for a minute or so.

Action items:

  • (Kyle) I will update the upstream jsonschema for IndefiniteRange.value to make it an integer.
  • (?) Change the graphQL type for IndefiniteRange.value to be an integer.
  • (?) Change CURIE/URI typed/structured fields to just be normal raw strings. Downstream consumers can verify against the typed strings using jsonschema if they want.
@phildarnowsky-broad
Copy link
Contributor

(2) and (3) are necessary due to the limitations of GraphQL. I'll look into (1) when I get a chance.

@theferrit32
Copy link
Author

gist with the exact graphql query I was using and all of the exact changes I made to response objects to get them to validate against the jsonschema. A lot of it is just deleting nulls, which is fine.

https://gist.github.com/theferrit32/df380899e9974cb885c2780cfa82ce7a

@theferrit32
Copy link
Author

@phildarnowsky-broad we are okay with the graphQL schema specification not exactly matching the jsonschema, if that is necessary for the JSON response payloads to more closely match the jsonschema. For example I'm okay with just deleting the CURIE type from the graphQL, and making all the fields currently typed as CURIE just be string. The jsonschema CURIE class can be used downstream by consumers if they want, it is just a thin subclass of string in the jsonschema, which is something not supported by graphQL.

@phildarnowsky-broad
Copy link
Contributor

@theferrit32

making all the fields currently typed as CURIE just be string

That's not possible due to the limitation I refer to above, since in GraphQL the individual types that make up a union type (as VACURIE does for VAAlleleLocation) have to be concrete object types, rather than primitives like string. Likewise for VAFocusAlleleURI and VAFocusAllele respectively.

@theferrit32
Copy link
Author

Ah I see, @phildarnowsky-broad. I wasn't thinking about the limitations about what can go in graphql unions.

So I think we have two options:
(1) leave those classes and union types as-is, don't care about the minor discrepancy between the graphql schema and jsonschema. If a consumer wants to make it match the jsonschema exactly (for example if they define data classes based on the jsonschema and want to feed these responses into those), it's fairly easy to do, but it just adds another step.
(2) Remove the unions by making each of those fields typed solely as an object or solely as a primitive string. For example for focusAllele we may just want the graphql to always return the VAAllele type, not VAFocusAlleleURI, even though these two types are allowed in the jsonschema.

@ahwagner do you have an opinion on this?

@ahwagner
Copy link

ahwagner commented Apr 5, 2024

I prefer option #2 as laid out by @theferrit32. GraphQL allows users to select only the fields they want, so let's just use full objects so that the returned JSON is valid per the VRS Schema without any translation step.

@phildarnowsky-broad
Copy link
Contributor

I'm about to start revising this @theferrit32 and I wanted to see if you'd made the corresponding changes on your end for what we discussed above. If not NBD I can revise the GraphQL end in line with this plan anyway, we just won't be able to actually see it working ourselves until the upstream data is also updated.

@mattsolo1
Copy link
Contributor

Closed by #1426

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants