Provide framework for generic lazily evaluated operation results #1350
    result._resultPointer->resultTable()->idTable().numColumns();
LOG(DEBUG) << "Computed result of size " << resultNumRows << " x "
           << resultNumCols << std::endl;
}
Does this debug message provide any real benefit to make it worth somehow incorporating it into lazily evaluated operations?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1350      +/-   ##
==========================================
- Coverage   88.89%   88.06%   -0.84%
==========================================
  Files         327      329       +2
  Lines       28974    29430     +456
  Branches     3210     3271      +61
==========================================
+ Hits        25756    25917     +161
- Misses       2066     2331     +265
- Partials     1152     1182      +30
This PR contains all the changes from the infrastructure for lazy operation evaluation (#1350) that are simple and repetitive, but touch many files. In particular:
* Rename the `ResultTable` class to `Result` (a TODO suggested by @hannahbast some time ago).
* Add a new parameter `bool requestLaziness` to `Operation::computeResult`. This parameter is currently unused.
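To illustrate what a `requestLaziness` flag could mean for callers, here is a minimal C++17 sketch. It is not the code from this PR: `SketchResult`, `BlockSource`, `computeResultSketch`, and `materialize` are hypothetical names, and the assumption that a lazy result is a sequence of blocks pulled on demand is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins; none of these are the project's real types.
using Row = std::vector<int>;
using Table = std::vector<Row>;
// A lazy result hands out one block per call and std::nullopt when exhausted.
using BlockSource = std::function<std::optional<Table>()>;
// A result is either fully materialized or a source of blocks.
using SketchResult = std::variant<Table, BlockSource>;

// Produces `numBlocks` blocks of `rowsPerBlock` single-column rows.
// With `requestLaziness == false` everything is materialized up front.
SketchResult computeResultSketch(std::size_t numBlocks,
                                 std::size_t rowsPerBlock,
                                 bool requestLaziness) {
  assert(rowsPerBlock > 0);
  auto makeBlock = [rowsPerBlock](std::size_t blockIdx) {
    Table block;
    for (std::size_t i = 0; i < rowsPerBlock; ++i) {
      block.push_back({static_cast<int>(blockIdx * rowsPerBlock + i)});
    }
    return block;
  };
  if (requestLaziness) {
    // The position is captured by value and advanced on every call.
    return BlockSource{[makeBlock, numBlocks, next = std::size_t{0}]() mutable
                           -> std::optional<Table> {
      if (next == numBlocks) return std::nullopt;
      return makeBlock(next++);
    }};
  }
  Table all;
  for (std::size_t b = 0; b < numBlocks; ++b) {
    Table block = makeBlock(b);
    all.insert(all.end(), block.begin(), block.end());
  }
  return all;
}

// A consumer that needs the full table can always materialize a lazy result.
Table materialize(SketchResult result) {
  if (auto* table = std::get_if<Table>(&result)) return std::move(*table);
  Table all;
  auto& source = std::get<BlockSource>(result);
  while (auto block = source()) {
    all.insert(all.end(), block->begin(), block->end());
  }
  return all;
}
```

In this sketch a caller that cannot handle lazy input passes `false` or calls `materialize`, so adding the parameter does not force every consumer to change at once.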
This makes the code much simpler and makes no difference for almost all queries. The expensive part (reading from disk and decompressing) is still done in parallel; only the writing to the `IdTable` is now serialized, and there is one additional copy compared to before. An example of a query that is slower because of this change: materialize a large index scan (for example, for the predicate `rdf:type`) and group by subject (there is a shortcut for grouping by object when there are few objects). But such queries will become lazy soon anyway (see #1350), and then this will be irrelevant.
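The trade-off described above (parallel decompression, serialized writing, one extra copy) can be sketched as follows. This is an illustration under assumptions, not the project's scan code: `CompressedBlock`, `decompress`, and `scanSketch` are hypothetical, a toy run-length encoding stands in for the real compression, and one `std::async` task per block is used only for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

using Block = std::vector<int>;
// Hypothetical stand-in for an on-disk block: (value, repeat count) pairs.
using CompressedBlock = std::vector<std::pair<int, std::size_t>>;

// Stand-in for the expensive "read from disk and decompress" step.
Block decompress(const CompressedBlock& compressed) {
  Block out;
  for (const auto& [value, count] : compressed) {
    out.insert(out.end(), count, value);
  }
  return out;
}

std::vector<int> scanSketch(const std::vector<CompressedBlock>& blocks) {
  // Expensive part: every block is decompressed in its own asynchronous task.
  std::vector<std::future<Block>> pending;
  pending.reserve(blocks.size());
  for (const auto& compressed : blocks) {
    pending.push_back(std::async(std::launch::async, [&compressed] {
      return decompress(compressed);
    }));
  }
  // Cheap part: a single thread appends the blocks in their original order.
  // Each block is copied from its temporary buffer into the table, which is
  // the additional copy mentioned in the text.
  std::vector<int> table;
  for (auto& future : pending) {
    Block block = future.get();
    table.insert(table.end(), block.begin(), block.end());
  }
  return table;
}
```

Because the futures are consumed in order, the output order is deterministic without any locking around the table. A real implementation would presumably bound the number of concurrent tasks instead of starting one per block.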
Still WIP. Currently missing: