Pagination with offset does not scale with var blocks #5808

@EnricoMi

What version of Dgraph are you using?

20.03.3

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, OS)?

Unrelated

Steps to reproduce the issue (command/config used to run Dgraph).

I have a predicate that is set on 40 million uids, with at most one triple per uid. When I read the first 1,000 uids that have this predicate with:

{
  pred as var(func: has(<http://www.w3.org/2000/01/rdf-schema#label>))

  result (func: uid(pred), first: 1000, offset: 0) {
    uid
    <http://www.w3.org/2000/01/rdf-schema#label>
  }
}

the query takes minutes. When I read them as:

{
  result (func: has(<http://www.w3.org/2000/01/rdf-schema#label>), first: 1000, offset: 0) {
    uid
    <http://www.w3.org/2000/01/rdf-schema#label>
  }
}

the query takes milliseconds.

My impression from the timings is that pred as var evaluates the entire result list (hence the consistently high query time, which depends on the size of the full result set), and result (func: uid(…)) then picks only the first 1000 results. Either result (func: uid(…)) should push its limit down into the evaluation of pred as var, or pred as var should be evaluated in a way that allows result (func: uid(…)) to seek and iterate.
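
The suspected difference can be illustrated with a toy model in Python. This is only a sketch of the reasoning above, not Dgraph's actual implementation: the store, the function names and the visit counter are all hypothetical, and the uid set is far smaller than the 40 million in this report.

```python
from itertools import islice


def has_predicate(store, counter):
    """Lazily yield uids that carry the predicate, counting each uid visited."""
    for uid in store:
        counter["visited"] += 1
        yield uid


def page_via_materialized_var(store, first, offset, counter):
    # Models the suspected behaviour: the var block collects every matching
    # uid before the result block applies first/offset.
    pred = list(has_predicate(store, counter))
    return pred[offset:offset + first]


def page_with_limit_pushdown(store, first, offset, counter):
    # Models the requested behaviour: iteration stops once offset + first
    # uids have been seen.
    return list(islice(has_predicate(store, counter), offset, offset + first))


store = range(1, 100_001)  # hypothetical uids
eager = {"visited": 0}
lazy = {"visited": 0}

page_a = page_via_materialized_var(store, 1000, 0, eager)
page_b = page_with_limit_pushdown(store, 1000, 0, lazy)

# Both strategies return the same page, but visit very different numbers of uids.
print(page_a == page_b, eager["visited"], lazy["visited"])
```

In this model the materialized variant always visits every uid regardless of the page requested, while the pushdown variant visits only offset + first of them, which matches the timings described above.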

If I add a limit to the var block of the first query:

{
  pred as var(func: has(<http://www.w3.org/2000/01/rdf-schema#label>), first: 1000)

  result (func: uid(pred), first: 1000, offset: 0) {
    uid
    <http://www.w3.org/2000/01/rdf-schema#label>
  }
}

I get my result in milliseconds.

With pred as var(func: has(<http://www.w3.org/2000/01/rdf-schema#label>), first: {limit + offset}), where {limit + offset} stands for the sum of the page size and the offset, this query scales with the offset in the same way as result (func: uid(…)) does (see #5807). For more details see this forum comment.
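
As a minimal sketch of this workaround, the query text could be rendered on the client side so that the var block always receives first: limit + offset. The helper name build_paged_query is hypothetical and is not part of any Dgraph client; it only formats the query string shown above and does no escaping of the predicate IRI.

```python
def build_paged_query(predicate: str, limit: int, offset: int) -> str:
    """Render the var-block query with the limit pushed into the var block.

    Hypothetical helper: it only builds the query text and does not talk
    to Dgraph.
    """
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return (
        "{\n"
        # The var block only needs the first limit + offset uids.
        f"  pred as var(func: has(<{predicate}>), first: {limit + offset})\n"
        "\n"
        # The result block then skips `offset` uids and returns `limit` of them.
        f"  result (func: uid(pred), first: {limit}, offset: {offset}) {{\n"
        "    uid\n"
        f"    <{predicate}>\n"
        "  }\n"
        "}\n"
    )


query = build_paged_query("http://www.w3.org/2000/01/rdf-schema#label", 1000, 5000)
print(query)
```

For a page size of 1000 and an offset of 5000, the var block is limited to the first 6000 uids, so the work still grows with the offset but no longer with the total number of uids that have the predicate.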

The reason I am using pred as var here is that I want to read multiple predicates this way (see this forum comment).

Expected behaviour and actual result.

Both queries should be equally fast.

    Labels

    area/performance: Performance related issues.
    area/querylang/pagination: Related to pagination: first, offset, etc.
    area/querylang/vars: Issues related to queries with GraphQL variables.
    exp/intermediate: Fixing this requires some experience with the project.
    kind/enhancement: Something could be better.
