Slow join query in clustered setup #9408

Closed
vzamashka opened this issue Jul 4, 2019 · 0 comments

My Environment

  • ArangoDB Version: 3.4.6
  • Storage Engine: RocksDB
  • Deployment Mode: Cluster
  • Deployment Strategy: ArangoDB Starter
  • Infrastructure: own local network
  • Operating System: macOS
  • Total RAM in your machine: 2 machines with 8 GB of RAM, 1 machine with 16 GB
  • Disks in use: SSD

Component, Query & Data

AQL query (if applicable):
FOR pobj IN PlatformObj
  LET user = (
    FOR user, link IN OUTBOUND pobj ParentChildLink
      RETURN { vertex: user, edge: link }
  )
  RETURN { r: pobj, c: user[0].vertex }
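
To make the intent of the query explicit, here is a rough plain-Python model of what it is meant to compute: for every document in PlatformObj, take the first vertex reachable over one outbound ParentChildLink edge. This is only a sketch of the semantics, not ArangoDB code. The function name `join_first_child`, the sample documents, and the assumption that the linked vertices also live in PlatformObj are made up for illustration.

```python
# Plain-Python model of the AQL query above; not ArangoDB code.
# All sample documents are hypothetical and only illustrate the shape of the data.

def join_first_child(platform_objs, parent_child_links):
    """For each object, attach the first vertex reached via an outbound edge."""
    # Assumption: linked vertices are also stored in PlatformObj.
    by_id = {doc["_id"]: doc for doc in platform_objs}

    # Group edges by _from, similar in spirit to the edge index on _from.
    outbound = {}
    for edge in parent_child_links:
        outbound.setdefault(edge["_from"], []).append(edge)

    results = []
    for pobj in platform_objs:
        edges = outbound.get(pobj["_id"], [])
        # In AQL, user[0].vertex evaluates to null when the subquery is empty.
        first = by_id.get(edges[0]["_to"]) if edges else None
        results.append({"r": pobj, "c": first})
    return results


objs = [{"_id": "PlatformObj/1"}, {"_id": "PlatformObj/2"}, {"_id": "PlatformObj/3"}]
links = [{"_from": "PlatformObj/1", "_to": "PlatformObj/2"}]

rows = join_first_child(objs, links)
for row in rows:
    print(row["r"]["_id"], "->", row["c"]["_id"] if row["c"] else None)
```

In this model the lookup per object is a single dictionary access, which is roughly the cost I would expect from one edge-index lookup per PlatformObj document.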

AQL explain (if applicable):
Execution plan:
 Id   NodeType                  Site     Est.   Comment
  1   SingletonNode             DBS         1   * ROOT
  2   EnumerateCollectionNode   DBS    100001     - FOR pobj IN PlatformObj   /* full collection scan, 3 shard(s) */
 13   RemoteNode                COOR   100001       - REMOTE
 14   GatherNode                COOR   100001       - GATHER
  7   SubqueryNode              COOR   100001       - LET user = ...   /* subquery */
  3   SingletonNode             COOR        1         * ROOT
  4   TraversalNode             COOR        1           - FOR user /* vertex */, link /* edge */ IN 1..1 /* min..maxPathDepth */ OUTBOUND pobj /* startnode */ ParentChildLink
  5   CalculationNode           COOR        1             - LET #8 = { "vertex" : user, "edge" : link }   /* simple expression */
 10   LimitNode                 COOR        1             - LIMIT 0, 1
  6   ReturnNode                COOR        1             - RETURN #8
  8   CalculationNode           COOR   100001       - LET #10 = { "r" : pobj, "c" : user[0].vertex }   /* simple expression */   /* collections used: pobj : PlatformObj */
  9   ReturnNode                COOR   100001       - RETURN #10

Indexes used:
 By   Type   Collection        Unique   Sparse   Selectivity   Fields        Ranges
  4   edge   ParentChildLink   false    false            n/a   [ _from ]     base OUTBOUND

Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Options                                   Filter / Prune Conditions
  4   1..1                         ParentChildLink    uniqueVertices: none, uniqueEdges: path

Optimization rules applied:
Id RuleName
1 optimize-subqueries
2 scatter-in-cluster
3 remove-unnecessary-remote-scatter

Dataset:
2 collections
PlatformObj - document, 100k rows
ParentChildLink - edge, 100k rows

Replication Factor & Number of Shards (Cluster only):
3 shards

Problem:
On a standalone setup, the join query above takes 3-5 seconds.
On the clustered setup, the same query takes 4-5 minutes.
Can you help, please?
