
The query time is very long, 7 seconds on average. How to optimize it #4190

@tiankonghewo

Description


Dgraph version is 1.0.15.

Import stage:
RDF data size is 400 GB,
node memory size is 256 GB.
Originally I wanted to use the new version, v1.1.0, but when importing data the new version always reported an OOM error at the map stage, so I had to use the old version 1.0.x.
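Regarding the OOM at the map stage: Dgraph's bulk loader exposes flags that trade speed for memory during the map phase. A hedged sketch of an invocation with reduced concurrency and smaller map buffers (file names and flag values here are illustrative, not taken from this issue):

```shell
# Illustrative bulk-load invocation; tune the numbers for your hardware.
# --num_go_routines lowers map-phase parallelism (and peak memory use),
# --mapoutput_mb shrinks the in-memory map buffer before spilling to disk.
dgraph bulk -f data.rdf.gz -s data.schema \
  --num_go_routines 2 \
  --mapoutput_mb 64
```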

When it is up and running, my cluster server configuration is:
3 nodes: 16 cores / 64 GB each
data per node: 170 GB, 380 GB, 440 GB

Second, my query is very simple (pydgraph==1.2.0 against Dgraph 1.0.15), but the query time is very long, 7 seconds on average:

```python
import json
import time

# `client` is assumed to be an existing pydgraph.DgraphClient connected to the
# cluster, and `value` holds the string being looked up.
query = """query all($a: string, $value: string) {
    all(func: eq(type, $a)) @filter(eq(value, $value)) {
        uid
        type
        value
    }
}"""
variables = {'$a': 'PERSON', '$value': value}
t0 = time.time()
res = client.txn(read_only=True).query(query, variables=variables)
ppl = json.loads(res.json)
```
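One thing worth checking for a query like this: in Dgraph, eq() lookups are only fast when the predicate carries an index; without one, equality filtering has to scan candidate nodes. A hedged sketch of a schema alteration that may help (the choice of `hash` vs. `exact` tokenizer depends on the data; this is an assumption, not something stated in the issue):

```
type: string @index(hash) .
value: string @index(hash) .
```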

Finally, I would like to ask: for a TB-scale data set,
what is the recommended cluster size and node configuration?
Is there any production case for reference?
If I need to add nodes, how many should I add, 3 or 6?


Labels

status/more-info-needed (the issue has been sent back to the reporter asking for clarifications)
