Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Connectors: change query type to a union of ObjectQuery and ArrayQuery #9229

Open
BenoitRanque opened this issue Nov 18, 2022 · 2 comments
Labels
k/enhancement New feature or improve an existing feature

Comments

@BenoitRanque
Copy link
Contributor

BenoitRanque commented Nov 18, 2022

Is your proposal related to a problem?

Data connector data types are an important part of the spec and any implementation.
This is a proposal to change the Query type to better reflect the realities of how it is used.

The query type may be used to represent either a top-level query, or a subquery corresponding to a relationship.
The proposal is to change this type to a tagged union with two variants, one for objects and one for arrays.

These would be applicable at both the relationship level and the root level.

Describe the solution you'd like

The new proposed type looks like this:

type Query = ObjectQuery | ArrayQuery;
type ArrayQuery = {
  type: 'array';
  aggregates?: Record<string, Aggregate> | null;
  fields?: Record<string, Field> | null;
  limit?: number | null;
  offset?: number | null;
  order_by?: OrderBy;
  where?: Expression;
};
type ObjectQuery = {
  type: 'object';
  fields?: Record<string, Field> | null;
  where?: Expression;
};

Effectively, this brings the type in-line with the reality of how the types are used.
It also eliminates a vector for either undefined behavior or errors once parsed: the current typing suggests that object relationships can have limit, offset, and order_by properties, and the behavior if those are provided is unclear.

This typing update would reject such queries.

@BenoitRanque BenoitRanque added the k/enhancement New feature or improve an existing feature label Nov 18, 2022
@daniel-chambers
Copy link
Contributor

The reason we don't make the distinction between an "object query" and an "array query" is that from a query generation perspective there's little actual difference between the two (from a SQL generation perspective, anyway). Yes, from an HGE's GraphQL schema there is a difference, but from the SQL generation perspective, a relationship (object or array) is just a join and whether or not you get one row back or many is simply a matter of data (and DB constraints), not of the form of the query. HGE takes care of only sending things that are only relevant to array relationships when the subquery is for an array relationship. You, as the agent author, don't need to care and can just blindly accept what HGE sends you.

From a following-the-types perspective, if you're writing code that splits between object and array relationships, you're not following the types! The types tell you to ignore the distinction and just implement a general subquery functionality. And if you do so, you will generate correct queries regardless of whether there are array or object relationships involved. I would argue that if you have separate query generation logic for object vs array relationship-based subqueries, you have made a mistake and your code could be simplified to having one piece of subquery generation logic that just works for both.

(The only exception to this is if you wish to defensively form your join subquery for object relationships in a way that prevents it from returning multiple rows if the user has misconfigured an array relationship as an object relationship. In this case, you may make a very minor distinction between the two, probably by "artificially" setting the limit to 1 for an object relationship and then using your general subquery generation logic).

Having said all this however, I acknowledge that data connectors may be created for data sources that are not SQL based and therefore for those types of sources, the distinction between what is possible with an array relationship's subquery vs an object relationship's subquery may be more important. In these cases, it may make more sense to have a tighter type definition like the one you have suggested.

However, I think your ObjectQuery type is not quite correct, I can't see a use case for aggregates over an object relationship (unless I'm missing something), so I think the aggregates property could be removed.


Given the above justification for the existing type definition, and that changing the types would be a somewhat breaking change (to the types, probably not to the JSON itself), I don't think making this change is an urgent priority. However, I do think it may be valuable in the medium term. Certainly, if we were to do it, it would need to be done before a public launch of the data connector API in order to avoid breaking changes for customers. I've created a card on our JIRA to capture this: GDC-644.

@BenoitRanque
Copy link
Contributor Author

However, I think your ObjectQuery type is not quite correct, I can't see a use case for aggregates over an object relationship (unless I'm missing something), so I think the aggregates property could be removed.

Fair, I've edited the types to remove the aggregates property.

I'll also admit I was misguided and implementing my SQL similarly to how hasura does it, with object relationships equating to JSON objects directly found on the keys of said relationships, which made the difference between relationship types relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
k/enhancement New feature or improve an existing feature
Projects
None yet
Development

No branches or pull requests

2 participants