Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] DataStore predicate grammar #8901

Closed
svidgen opened this issue Sep 15, 2021 · 9 comments
Closed

[RFC] DataStore predicate grammar #8901

svidgen opened this issue Sep 15, 2021 · 9 comments
Assignees
Labels
DataStore Related to DataStore category feature-request Request a new feature

Comments

@svidgen
Copy link
Member

svidgen commented Sep 15, 2021

This is a Request For Comments (RFC). RFCs are intended to elicit feedback regarding a proposed change to the Amplify Framework. Please feel free to post comments or questions here.

Summary

This RFC is to gather customer feedback around Amplify DataStore's grammar and handling of the logical operator grouping (and, or, not) used in query() and observe() predicates.

Selective sync predicates are out of scope.

Motivation

We have begun work to introduce nested predicates for querying against related models. In doing so, we have come to suspect that the existing predicate grammar is laden with ambiguities. It neither mirrors the behavior found in the native platform implementations nor that of other common query languages such as GraphQL. Because we are already changing predicates to add nesting, this is an opportune time to disambiguate this grammar.

Current Logical Operator Behavior

Consider a simple model:

type Task @model {
  id: ID!
  owner: String!
  title: String
}

Let's consider a few GraphQL queries and then how we'd perform these operation in iOS vs. JS. Note that iOS also supports an operator-overloaded syntax which JS cannot replicate. And also note that the Java/Android for each of these queries mirrors the iOS nearly exactly, with field-casing being the only difference.

For brevity, only the predicate portion of the query operation is shown. Each of these predicates would be the 2nd argument in a DataStore.query(Task, ...) call.

Simple AND query

E.g., "Onboarding tasks for a single owner."

GraphQL
query listTask {
  listTasks(filter: {
    owner: { eq: "some name" }
    title: { beginsWith: "[onboarding]" }
  }) {
    items { ... }
  }
}
iOS
Task.owner.eq("some name").and(
  Task.title.beginsWith("[onboarding]")
)
TypeScript/JavaScript
task => task
  .owner("eq", "some name")
  .title("beginsWith", "[onboarding]")
;

GraphQL and JavaScript have an implicit AND grouping. Each new AND'd condition is simply appended to the expression. This logical grouping must be done explicitly in iOS (and Android) with a wrapping and() call.

Simple OR query

E.g., "Tasks for two particular users plus all 'onboarding' tasks."

GraphQL
query listTask {
  listTasks(filter: {or: [
    { owner: { eq: "AAA" } }
    { owner: { eq: "BBB" } }
    { title: { beginsWith: "[onboarding]" } }
  ]}) {
    items { ... }
  }
}
iOS
Task.owner.eq("AAA")
  .or(Task.owner.eq("BBB"))
  .or(Task.title.beginsWith("[onboarding]"))
;
TypeScript/JavaScript
task => task.or(t => t
  .owner("eq", "AAA")
  .owner("eq", "BBB")
  .title("beginsWith", "[onboarding]")
);

GraphQL's or group accepts a list of "full" nested filters. It is not demonstrated here, but the normal implicit AND logic is applied to the conditions therein. iOS/Android requires explicit, wrapping or() call for each condition. JavaScript's or() method group creates a full new child predicate where all conditions are OR'd together.

More complicated example, mixing AND/OR groups

E.g., "Find onboarding tasks assigned to the wrong user."

GraphQL
query listTask {
  listTasks(
    filter: {
      title: { beginsWith: "[onboarding]" }
      or: [
        { owner: { eq: "AAA" }, title: { notContains: "[AAA]" } }
        { owner: { eq: "BBB" }, title: { notContains: "[BBB]" } }
        { owner: { eq: "CCC" }, title: { notContains: "[CCC]" } }
      ]
    }
  ) {
    items { ... }
  }
}
iOS
Task.title.eq("[onboarding]").and(
  Task.owner.eq("AAA").and(Task.title.notContains("[AAA]"))
  .or(Task.owner.eq("BBB").and(Task.title.notContains("[AAA]"))
  .or(Task.owner.eq("CCC").and(Task.title.notContains("[AAA]"))
);
TypeScript/JavaScript
task => task.or(t => t
  .title("beginsWith", "[onboarding]")
  .or(t => t
      .and(t => t.owner("eq", "AAA").title("notContains", "[AAA]"))
      .and(t => t.owner("eq", "BBB").title("notContains", "[BBB]"))
      .and(t => t.owner("eq", "CCC").title("notContains", "[CCC]"))
  )
);

GraphQL implies AND grouping at the top level. It's OR group applies to the full result of each AND-grouped child expression.

iOS and Android each require explicit and()/or() grouping for each condition.

JavaScript's logical groups are "modal". Each and()/or() call sets the "mode" for the child predicate. This pattern causes predicate structures in TS/JS to be the opposite of the native platforms, almost reversing the expected use of and() and or().

With Nested Predicates

As we consider this logical operator grammar, we must also consider more complicated cases that involve the upcoming nested predicates feature. Nested predicates will allow you to query for a Model against related models.

Lets extend our schema slightly:

type Project @model {
  id: ID!
  name: String!
}

type Task @model {
  id: ID!
  owner: String!
  title: String!
  project: Project!
  artifacts: [Artifact]
}

type Artifact @model {
  id: ID!
  title: String!
  task: Task!
}

A simple query, where we fetch all Tasks where the task's project name contains "[Research]", could look like this:

DataStore.query(Task, t => t.project.name('contains', '[Research]'));

Under the the existing grammar, we would be able to add intermediate conditions to additionally filter by Task owner for example, like so:

DataStore.query(Task, t => t.owner.name('eq', 'Jane').project.name('contains', '[Research]'));

But, what happens if we introduce logical operators under the existing grammar?

DataStore.query(Task, t => t
    .owner("eq", "John")
    .or(t =>
      t.title("contains", "[Spike]").project.name("contains", "[Research]")
    )
);

What filtering logic do you expect this predicate to perform? Is it Tasks where ...

task.owner = "John" OR ( task.title.contains("[Spike]") AND task.project.name.contains("[Research]") )

Or where ...

task.owner = "John" AND ( task.title.contains("[Spike]") OR task.project.name.contains("[Research]") )

(Or something else?!)

Under the current grammar, the 2nd interpretation would be correct. But, this is unexpected, disharmonious from other predicate grammars, and possibly too ambiguous as-is.

Proposed Solutions and Options

Let's consider a fairly complicated (albeit contrived) query and and look at some options for representing the same logic as a TS/JS DataStore predicate. We'll start with SQL, because it can unambiguously represent our logic.

Suppose we want all Tasks under "[Research]" projects with "[PDF]" artifacts, plus all "[SPIKE]" Tasks owned by our researchers ("Bob" and "Jane") with "[Report]" or "[REPORT]" artifacts.

SELECT DISTINCT Task.*
FROM
  Task
  LEFT JOIN Project ON (Project.id=Task.projectId)
  LEFT JOIN Artifact ON (Artifact.taskId=Task.id)
WHERE
  (
    Project.name LIKE "%[Research]%"
      AND
    Artifact.name LIKE "%[PDF]%"
  )
    OR
  (
    Task.title like "%[SPIKE]%"
      AND
    (
      Artifact.name LIKE "%[Report]%"
        OR
      Artifact.name LIKE "%[REPORT]%" )
    )
      AND
    (
      Task.owner = "Bob"
        OR
      Task.owner = "Jane"
    )

Logical Grouping Option A

Condition cannot be chained; hence there is no implicit logical grouping.

DataStore.query(
  Task,
  task => task.or(task => [
    task.and(task => [
      task.project.name.contains("[Research]"),
      task.artifact.name.contains("[PDF]")
    ]),
    task.and(task => [
      task.title.contains("[SPIKE"]),
      task.artifact.or(a => [a.name.contains("[Report]"), a.name.contains("[Report]")]),
      task.or(t => [t.owner.eq("Bob"), t.owner.eq("Jane")])
    ])
  ])
);

Logical Grouping Option B

Conditions can be chained with implicit AND grouping.

DataStore.query(
  Task,
  task => task.or(task => [
    task.project.name.contains("[Research]")
      .artifact.name.contains("[PDF]"),
    task
      .title.contains("[SPIKE"])
      .artifact.or(a => [a.name.contains("[Report]"), a.name.contains("[REPORT]")])
      .or(t => [t.owner.eq("Bob"), t.owner.eq("Jane")])
  ])
);

Questions for the community

  • What options do you prefer? (And why?)
  • Are any of the options problematic?
  • Do any of these options complicate or preclude a particular use-case?
  • What else haven't we thought of?
@svidgen svidgen added DataStore Related to DataStore category feature-request Request a new feature labels Sep 15, 2021
@svidgen svidgen pinned this issue Sep 15, 2021
@rpostulart
Copy link

This would be such a great addition to datastore to implement so many use cases and especially use cases where you need to do advanced filtering for multitenant applications.

Good to see you can query also on the relations of a model. This increases the speed of development.

For multi tenant it is key to mix and match AND / OR conditions to sync the right data.

I am really looking forward to have this available soon.

Is it possible to query lists? So if my id is in a list then query/sync?

List

@rpostulart
Copy link

Btw I prefer option A, because it is more visible what the code is doing.

@svidgen svidgen self-assigned this Sep 16, 2021
@svidgen
Copy link
Member Author

svidgen commented Sep 17, 2021

Hey @rpostulart,

Is it possible to query lists? So if my id is in a list then query/sync?

Hopefully you won't even need to do this! The idea is to empower you to write queries that handle these intermediate "query against list of x" operations internally. For example, whereas you would currently need do things like this:

const tasks = await DataStore.query(Task, task => task.owner('eq', 'rpostulart'));
const project_ids = [...new Set(tasks.map(t => t.id))];
const projects = [];
for (let id of project_ids) {
  projects.push(await DataStore.query(Project, id))
}

You'd now just do something like this:

const projects = await DataStore.query(Project, project => project.task.owner.eq('rpostulart'));

That said, if you do end up with a list of ID's at some point, the least ambiguous mechanism might just be a map().

DataStore.query(Task, task => task
  .or(t => ['a', 'b', 'c', 'x', 'y', 'z'].map(id => t.id.eq(id)))
);

I'd be a little concerned that allowing lists of values in the condition methods (eq([a,b,c]) for example) could lead us into ambiguity again. The map() solution operates within a logical group making it very explicit.

@svidgen
Copy link
Member Author

svidgen commented Sep 20, 2021

Hey @rpostulart,

I need to clarify one thing that I read right past on Friday. Nested predicates are only designed to be used against local data. They will not be available for selective sync. Sorry if I set the wrong expectation there!

I'll update the issue body to clarify this.

@rpostulart
Copy link

rpostulart commented Oct 7, 2021

OK, thanks for the feedback.

So how to deal with this use case:
I have a SAAS application with a tenant and in the tenant you have different teams. I only want data on my device (With datastore):

  1. From my tenant, so not from other tenants
  2. From the teams where I belong to. So not the data from my tenant from other teams

@rpostulart
Copy link

I believe this is the same like I needed above:

#7544 (comment)

@jakejcheng
Copy link

jakejcheng commented Mar 5, 2022

const projects = await DataStore.query(Project, project => project.task.owner.eq('rpostulart'));

does this apply to custom types? like how can I query based on custom type field?

@svidgen
Copy link
Member Author

svidgen commented Nov 14, 2022

The new predicate grammar has been released as part of aws-amplify V5.

See https://github.com/aws-amplify/amplify-js#amplify-5xx-has-breaking-changes-please-see-the-breaking-changes-below

@svidgen svidgen closed this as completed Nov 14, 2022
@brucevilo1013
Copy link

Hi @svidgen
How can I implement sql query such as "not like 'passed%'" in datastore predicate?
The most similar implementation is to use notContains, but It is not correct in essential.
Thanks in advance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataStore Related to DataStore category feature-request Request a new feature
Projects
None yet
Development

No branches or pull requests

4 participants