[RFC] DataStore predicate grammar #8901

svidgen · 2021-09-15T03:05:45Z

This is a Request For Comments (RFC). RFCs are intended to elicit feedback regarding a proposed change to the Amplify Framework. Please feel free to post comments or questions here.

Summary

This RFC is to gather customer feedback around Amplify DataStore's grammar and handling of the logical operator grouping (and, or, not) used in query() and observe() predicates.

Selective sync predicates are out of scope.

Motivation

We have begun work to introduce nested predicates for querying against related models. In doing so, we have come to suspect that the existing predicate grammar is laden with ambiguities. It neither mirrors the behavior found in the native platform implementations nor that of other common query languages such as GraphQL. Because we are already changing predicates to add nesting, this is an opportune time to disambiguate this grammar.

Current Logical Operator Behavior

Consider a simple model:

type Task @model {
  id: ID!
  owner: String!
  title: String
}

Let's consider a few GraphQL queries and then how we'd perform these operation in iOS vs. JS. Note that iOS also supports an operator-overloaded syntax which JS cannot replicate. And also note that the Java/Android for each of these queries mirrors the iOS nearly exactly, with field-casing being the only difference.

For brevity, only the predicate portion of the query operation is shown. Each of these predicates would be the 2nd argument in a DataStore.query(Task, ...) call.

Simple `AND` query

E.g., "Onboarding tasks for a single owner."

GraphQL

query listTask {
  listTasks(filter: {
    owner: { eq: "some name" }
    title: { beginsWith: "[onboarding]" }
  }) {
    items { ... }
  }
}

iOS

Task.owner.eq("some name").and(
  Task.title.beginsWith("[onboarding]")
)

TypeScript/JavaScript

task => task
  .owner("eq", "some name")
  .title("beginsWith", "[onboarding]")
;

GraphQL and JavaScript have an implicit AND grouping. Each new AND'd condition is simply appended to the expression. This logical grouping must be done explicitly in iOS (and Android) with a wrapping and() call.

Simple OR query

E.g., "Tasks for two particular users plus all 'onboarding' tasks."

GraphQL

query listTask {
  listTasks(filter: {or: [
    { owner: { eq: "AAA" } }
    { owner: { eq: "BBB" } }
    { title: { beginsWith: "[onboarding]" } }
  ]}) {
    items { ... }
  }
}

iOS

Task.owner.eq("AAA")
  .or(Task.owner.eq("BBB"))
  .or(Task.title.beginsWith("[onboarding]"))
;

TypeScript/JavaScript

task => task.or(t => t
  .owner("eq", "AAA")
  .owner("eq", "BBB")
  .title("beginsWith", "[onboarding]")
);

GraphQL's or group accepts a list of "full" nested filters. It is not demonstrated here, but the normal implicit AND logic is applied to the conditions therein. iOS/Android requires explicit, wrapping or() call for each condition. JavaScript's or() method group creates a full new child predicate where all conditions are OR'd together.

More complicated example, mixing AND/OR groups

E.g., "Find onboarding tasks assigned to the wrong user."

GraphQL

query listTask {
  listTasks(
    filter: {
      title: { beginsWith: "[onboarding]" }
      or: [
        { owner: { eq: "AAA" }, title: { notContains: "[AAA]" } }
        { owner: { eq: "BBB" }, title: { notContains: "[BBB]" } }
        { owner: { eq: "CCC" }, title: { notContains: "[CCC]" } }
      ]
    }
  ) {
    items { ... }
  }
}

iOS

Task.title.eq("[onboarding]").and(
  Task.owner.eq("AAA").and(Task.title.notContains("[AAA]"))
  .or(Task.owner.eq("BBB").and(Task.title.notContains("[AAA]"))
  .or(Task.owner.eq("CCC").and(Task.title.notContains("[AAA]"))
);

TypeScript/JavaScript

task => task.or(t => t
  .title("beginsWith", "[onboarding]")
  .or(t => t
      .and(t => t.owner("eq", "AAA").title("notContains", "[AAA]"))
      .and(t => t.owner("eq", "BBB").title("notContains", "[BBB]"))
      .and(t => t.owner("eq", "CCC").title("notContains", "[CCC]"))
  )
);

GraphQL implies AND grouping at the top level. It's OR group applies to the full result of each AND-grouped child expression.

iOS and Android each require explicit and()/or() grouping for each condition.

JavaScript's logical groups are "modal". Each and()/or() call sets the "mode" for the child predicate. This pattern causes predicate structures in TS/JS to be the opposite of the native platforms, almost reversing the expected use of and() and or().

With Nested Predicates

As we consider this logical operator grammar, we must also consider more complicated cases that involve the upcoming nested predicates feature. Nested predicates will allow you to query for a Model against related models.

Lets extend our schema slightly:

type Project @model {
  id: ID!
  name: String!
}

type Task @model {
  id: ID!
  owner: String!
  title: String!
  project: Project!
  artifacts: [Artifact]
}

type Artifact @model {
  id: ID!
  title: String!
  task: Task!
}

A simple query, where we fetch all Tasks where the task's project name contains "[Research]", could look like this:

DataStore.query(Task, t => t.project.name('contains', '[Research]'));

Under the the existing grammar, we would be able to add intermediate conditions to additionally filter by Task owner for example, like so:

DataStore.query(Task, t => t.owner.name('eq', 'Jane').project.name('contains', '[Research]'));

But, what happens if we introduce logical operators under the existing grammar?

DataStore.query(Task, t => t
    .owner("eq", "John")
    .or(t =>
      t.title("contains", "[Spike]").project.name("contains", "[Research]")
    )
);

What filtering logic do you expect this predicate to perform? Is it Tasks where ...

task.owner = "John" OR ( task.title.contains("[Spike]") AND task.project.name.contains("[Research]") )

Or where ...

task.owner = "John" AND ( task.title.contains("[Spike]") OR task.project.name.contains("[Research]") )

(Or something else?!)

Under the current grammar, the 2nd interpretation would be correct. But, this is unexpected, disharmonious from other predicate grammars, and possibly too ambiguous as-is.

Proposed Solutions and Options

Let's consider a fairly complicated (albeit contrived) query and and look at some options for representing the same logic as a TS/JS DataStore predicate. We'll start with SQL, because it can unambiguously represent our logic.

Suppose we want all Tasks under "[Research]" projects with "[PDF]" artifacts, plus all "[SPIKE]" Tasks owned by our researchers ("Bob" and "Jane") with "[Report]" or "[REPORT]" artifacts.

SELECT DISTINCT Task.*
FROM
  Task
  LEFT JOIN Project ON (Project.id=Task.projectId)
  LEFT JOIN Artifact ON (Artifact.taskId=Task.id)
WHERE
  (
    Project.name LIKE "%[Research]%"
      AND
    Artifact.name LIKE "%[PDF]%"
  )
    OR
  (
    Task.title like "%[SPIKE]%"
      AND
    (
      Artifact.name LIKE "%[Report]%"
        OR
      Artifact.name LIKE "%[REPORT]%" )
    )
      AND
    (
      Task.owner = "Bob"
        OR
      Task.owner = "Jane"
    )

Logical Grouping Option A

Condition cannot be chained; hence there is no implicit logical grouping.

DataStore.query(
  Task,
  task => task.or(task => [
    task.and(task => [
      task.project.name.contains("[Research]"),
      task.artifact.name.contains("[PDF]")
    ]),
    task.and(task => [
      task.title.contains("[SPIKE"]),
      task.artifact.or(a => [a.name.contains("[Report]"), a.name.contains("[Report]")]),
      task.or(t => [t.owner.eq("Bob"), t.owner.eq("Jane")])
    ])
  ])
);

Logical Grouping Option B

Conditions can be chained with implicit AND grouping.

DataStore.query(
  Task,
  task => task.or(task => [
    task.project.name.contains("[Research]")
      .artifact.name.contains("[PDF]"),
    task
      .title.contains("[SPIKE"])
      .artifact.or(a => [a.name.contains("[Report]"), a.name.contains("[REPORT]")])
      .or(t => [t.owner.eq("Bob"), t.owner.eq("Jane")])
  ])
);

Questions for the community

What options do you prefer? (And why?)
Are any of the options problematic?
Do any of these options complicate or preclude a particular use-case?
What else haven't we thought of?

The text was updated successfully, but these errors were encountered:

rpostulart · 2021-09-15T17:32:44Z

This would be such a great addition to datastore to implement so many use cases and especially use cases where you need to do advanced filtering for multitenant applications.

Good to see you can query also on the relations of a model. This increases the speed of development.

For multi tenant it is key to mix and match AND / OR conditions to sync the right data.

I am really looking forward to have this available soon.

Is it possible to query lists? So if my id is in a list then query/sync?

List

rpostulart · 2021-09-15T20:44:42Z

Btw I prefer option A, because it is more visible what the code is doing.

svidgen · 2021-09-17T20:26:02Z

Hey @rpostulart,

Is it possible to query lists? So if my id is in a list then query/sync?

Hopefully you won't even need to do this! The idea is to empower you to write queries that handle these intermediate "query against list of x" operations internally. For example, whereas you would currently need do things like this:

const tasks = await DataStore.query(Task, task => task.owner('eq', 'rpostulart'));
const project_ids = [...new Set(tasks.map(t => t.id))];
const projects = [];
for (let id of project_ids) {
  projects.push(await DataStore.query(Project, id))
}

You'd now just do something like this:

const projects = await DataStore.query(Project, project => project.task.owner.eq('rpostulart'));

That said, if you do end up with a list of ID's at some point, the least ambiguous mechanism might just be a map().

DataStore.query(Task, task => task
  .or(t => ['a', 'b', 'c', 'x', 'y', 'z'].map(id => t.id.eq(id)))
);

I'd be a little concerned that allowing lists of values in the condition methods (eq([a,b,c]) for example) could lead us into ambiguity again. The map() solution operates within a logical group making it very explicit.

svidgen · 2021-09-20T16:32:17Z

Hey @rpostulart,

I need to clarify one thing that I read right past on Friday. Nested predicates are only designed to be used against local data. They will not be available for selective sync. Sorry if I set the wrong expectation there!

I'll update the issue body to clarify this.

rpostulart · 2021-10-07T10:35:53Z

OK, thanks for the feedback.

So how to deal with this use case:
I have a SAAS application with a tenant and in the tenant you have different teams. I only want data on my device (With datastore):

From my tenant, so not from other tenants
From the teams where I belong to. So not the data from my tenant from other teams

rpostulart · 2021-10-17T09:12:49Z

I believe this is the same like I needed above:

#7544 (comment)

jakejcheng · 2022-03-05T11:15:09Z

const projects = await DataStore.query(Project, project => project.task.owner.eq('rpostulart'));

does this apply to custom types? like how can I query based on custom type field?

svidgen · 2022-11-14T20:31:29Z

The new predicate grammar has been released as part of aws-amplify V5.

See https://github.com/aws-amplify/amplify-js#amplify-5xx-has-breaking-changes-please-see-the-breaking-changes-below

brucevilo1013 · 2024-07-12T20:38:43Z

Hi @svidgen
How can I implement sql query such as "not like 'passed%'" in datastore predicate?
The most similar implementation is to use notContains, but It is not correct in essential.
Thanks in advance.

svidgen added DataStore Related to DataStore category feature-request Request a new feature labels Sep 15, 2021

svidgen pinned this issue Sep 15, 2021

svidgen self-assigned this Sep 16, 2021

svidgen mentioned this issue Oct 6, 2021

feat(@aws-amplify/datastore): query and observe predicates redesign #8965

Merged

17 tasks

wvidana mentioned this issue Mar 23, 2022

Datastore syncExpressions in a one to many relationship? #7544

Open

abdallahshaban557 unpinned this issue Jul 1, 2022

svidgen mentioned this issue Oct 27, 2022

feature(@aws-amplify/datastore): lazy loading and nested/related-model predicates #10477

Merged

4 tasks

svidgen closed this as completed Nov 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] DataStore predicate grammar #8901

[RFC] DataStore predicate grammar #8901

svidgen commented Sep 15, 2021 •

edited

Loading

rpostulart commented Sep 15, 2021

rpostulart commented Sep 15, 2021

svidgen commented Sep 17, 2021

svidgen commented Sep 20, 2021

rpostulart commented Oct 7, 2021 •

edited

Loading

rpostulart commented Oct 17, 2021

jakejcheng commented Mar 5, 2022 •

edited

Loading

svidgen commented Nov 14, 2022 •

edited

Loading

brucevilo1013 commented Jul 12, 2024

[RFC] DataStore predicate grammar #8901

[RFC] DataStore predicate grammar #8901

Comments

svidgen commented Sep 15, 2021 • edited Loading

Summary

Motivation

Current Logical Operator Behavior

Simple AND query

GraphQL

iOS

TypeScript/JavaScript

Simple OR query

GraphQL

iOS

TypeScript/JavaScript

More complicated example, mixing AND/OR groups

GraphQL

iOS

TypeScript/JavaScript

With Nested Predicates

Proposed Solutions and Options

Logical Grouping Option A

Logical Grouping Option B

Questions for the community

rpostulart commented Sep 15, 2021

rpostulart commented Sep 15, 2021

svidgen commented Sep 17, 2021

svidgen commented Sep 20, 2021

rpostulart commented Oct 7, 2021 • edited Loading

rpostulart commented Oct 17, 2021

jakejcheng commented Mar 5, 2022 • edited Loading

svidgen commented Nov 14, 2022 • edited Loading

brucevilo1013 commented Jul 12, 2024

svidgen commented Sep 15, 2021 •

edited

Loading

Simple `AND` query

rpostulart commented Oct 7, 2021 •

edited

Loading

jakejcheng commented Mar 5, 2022 •

edited

Loading

svidgen commented Nov 14, 2022 •

edited

Loading