Query planning and traversal slowdowns #6161

Closed
flyingsilverfin opened this issue Feb 10, 2021 · 2 comments
Comments

@flyingsilverfin
Member

flyingsilverfin commented Feb 10, 2021

Using the synthetic benchmark RuleScalingTest (found at https://github.com/flyingsilverfin/grakn/tree/reasoner-bench), I found the following:

  • When generating many rules (setting N = 40 generates 4*40=160 rules, 2*40=80 relation types, and 1*40=40 entity types), validating the rules takes a long time. This may be because type resolution can produce query plans with between 30 and 50 variables and edges.
    • Given enough time, the query planner does reach OPTIMAL, but this can take 13 seconds or more.
    • If planning is cut short, executing the traversal can take 10 seconds or more. With optimal plans, the traversals finish quickly. Each type resolver query for rule validation returns only 256 results, regardless of the size of the synthetic benchmark.
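
The schema sizes quoted above follow directly from N. A minimal sketch of that arithmetic, assuming the 4x / 2x / 1x factors given in this issue (the function name is hypothetical and not part of the benchmark):

```python
def schema_size(n: int) -> dict:
    """Number of schema elements generated for a given N.

    The factors (4 rules, 2 relation types, 1 entity type per unit of N)
    are taken from the figures quoted in this issue.
    """
    return {
        "rules": 4 * n,
        "relation_types": 2 * n,
        "entity_types": 1 * n,
    }


print(schema_size(40))
```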

I also found:

  • Setting the "HIGHER_TIME_LIMIT_MILLIS" parameter in the query planner (which the type resolver triggers) to 0 means the planner always takes as much time as it needs to finish.
  • Setting it to 1 ms means we only ever plan for a short time and don't really reach OPTIMAL. The resulting plans also execute slowly.
  • Some of the slow query plans start at a Role type. Arbitrarily adding a 4x cost multiplier to vertices with scoped labels can improve the execution time of plans.
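
To illustrate the last point, here is a toy sketch of how a 4x multiplier on scoped labels can move the planner's starting point away from a Role type. This is not the planner's actual code: the `Vertex` class, the `base_cost` values, and the "a scoped label contains a colon" check are all assumptions made for the example.

```python
from dataclasses import dataclass

# The arbitrary 4x factor mentioned in this issue.
SCOPED_LABEL_MULTIPLIER = 4.0


@dataclass(frozen=True)
class Vertex:
    label: str        # hypothetical: scoped labels look like "relation:role"
    base_cost: float  # hypothetical planner cost estimate


def is_scoped(vertex: Vertex) -> bool:
    # Assumption for this sketch: a scoped label has the form "scope:name".
    return ":" in vertex.label


def cost(vertex: Vertex) -> float:
    # Penalise vertices with scoped labels so they are less attractive starts.
    factor = SCOPED_LABEL_MULTIPLIER if is_scoped(vertex) else 1.0
    return vertex.base_cost * factor


def pick_start(vertices: list) -> Vertex:
    # Start the traversal at the cheapest vertex after applying the penalty.
    return min(vertices, key=cost)


candidates = [Vertex("employment:employee", 10.0), Vertex("person", 30.0)]
print(pick_start(candidates).label)
```

Without the multiplier the role type would win on raw cost (10 against 30); with it, the role type costs 40 and the plan starts at the entity type instead.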

In this benchmark, the benchmark query itself also executes rather slowly: for each rule we trigger, we load the rule and then run the type resolver (again) to analyse it. This takes a significant amount of time. Preloading the rules avoids this overhead, so the benchmark measures "just" reasoning time.
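
The preloading idea amounts to analysing each rule once, up front, and reusing the result when reasoning triggers it. A minimal sketch, assuming a hypothetical `analyse` callable standing in for "load the rule and run the type resolver" (none of these names come from the codebase):

```python
class RuleAnalysisCache:
    """Caches the result of an expensive per-rule analysis step."""

    def __init__(self, analyse):
        # `analyse` is a stand-in for loading a rule and running type resolution.
        self._analyse = analyse
        self._cache = {}

    def preload(self, rule_labels):
        # Pay the analysis cost for every rule before the measured query runs.
        for label in rule_labels:
            self.get(label)

    def get(self, rule_label):
        # Analyse on first access only; later lookups reuse the stored result.
        if rule_label not in self._cache:
            self._cache[rule_label] = self._analyse(rule_label)
        return self._cache[rule_label]
```

With `preload` called before the timed query, every `get` during reasoning is a dictionary lookup, so the measurement no longer includes rule loading or type resolution.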

If we allow the schema to load fully (this can take a while with N = 40), and do the rule pre-loading trick, we find:

  • the reasoning query finishes in ~3 seconds
  • if we add printouts of how much time was spent on query planning during the execution of the reasoning query, the total comes to 7 seconds. The sum can exceed the wall-clock time because reasoning executes different queries in parallel.
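
The gap between summed planning time and wall-clock time can be reproduced with a small stand-alone experiment. In this sketch `fake_plan` simply sleeps to stand in for planning work; the function names and durations are assumptions for illustration, not measurements from the benchmark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_plan(seconds: float) -> float:
    """Stand-in for one planning call; returns how long it took."""
    start = time.perf_counter()
    time.sleep(seconds)
    return time.perf_counter() - start


def measure_parallel_planning(n_queries: int, seconds_each: float):
    """Run n_queries planning calls in parallel.

    Returns (sum of per-query planning times, wall-clock time).
    """
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_queries) as pool:
        per_query = list(pool.map(fake_plan, [seconds_each] * n_queries))
    wall = time.perf_counter() - wall_start
    return sum(per_query), wall


total, wall = measure_parallel_planning(4, 0.05)
print(f"summed planning time: {total:.3f}s, wall-clock: {wall:.3f}s")
```

Because the calls overlap, the summed figure is roughly the number of parallel queries times the wall-clock figure, which is the same effect as 7 seconds of planning inside a ~3 second query.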

In general, we could use this test to determine how to optimise query planning, type resolution, and how/when to pre-load rules if at all.

@haikalpribadi
Member

This issue will be solved by #6194. I will work on it next week, and have it out in the subsequent release.

@flyingsilverfin
Member Author

This has been implemented and is no longer an issue. Migrating the RuleScaling test is tracked in #6580.
