Provide framework for generic lazily evaluated operation results #1350

Draft. Wants to merge 64 commits into base branch `master`.

RobinTF (Collaborator) commented May 18, 2024

Still WIP. Currently missing:

  • Discussion about remaining TODOs
  • Lots of unit tests
  • Most likely some functions also need to be broken up into smaller pieces once we have found everything else to be working "correctly".
  • Documentation of all newly introduced functions once they become somewhat "final"
  • Cold Fusion & World domination?

result._resultPointer->resultTable()->idTable().numColumns();
LOG(DEBUG) << "Computed result of size " << resultNumRows << " x "
<< resultNumCols << std::endl;
}

Does this debug message provide enough benefit to be worth incorporating into lazily evaluated operations?

codecov bot commented May 18, 2024

Codecov Report

Attention: Patch coverage is 47.47664% with 281 lines in your changes missing coverage. Please review.

Project coverage is 88.06%. Comparing base (f9e730c) to head (c465685).

| Files | Patch % | Lines |
|---|---|---|
| src/engine/Result.cpp | 25.89% | 182 Missing and 4 partials ⚠️ |
| src/engine/Operation.cpp | 58.33% | 32 Missing and 3 partials ⚠️ |
| src/engine/Filter.cpp | 56.75% | 15 Missing and 1 partial ⚠️ |
| src/engine/IndexScan.cpp | 5.88% | 15 Missing and 1 partial ⚠️ |
| src/engine/ExportQueryExecutionTrees.cpp | 73.33% | 7 Missing and 5 partials ⚠️ |
| src/util/Cache.h | 82.08% | 0 Missing and 12 partials ⚠️ |
| src/util/CacheableGenerator.h | 0.00% | 2 Missing ⚠️ |
| src/engine/QueryExecutionTree.cpp | 83.33% | 0 Missing and 1 partial ⚠️ |
| src/engine/QueryPlanner.cpp | 50.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1350      +/-   ##
==========================================
- Coverage   88.89%   88.06%   -0.84%     
==========================================
  Files         327      329       +2     
  Lines       28974    29430     +456     
  Branches     3210     3271      +61     
==========================================
+ Hits        25756    25917     +161     
- Misses       2066     2331     +265     
- Partials     1152     1182      +30     


joka921 pushed a commit that referenced this pull request May 23, 2024
This PR contains all the changes from the infrastructure for lazy operation evaluation (#1350) that are simple and repetitive, but touch many files. In particular:

* Rename the `ResultTable` class to `Result` (a TODO suggested by @hannahbast some time ago).
* Add a new parameter `bool requestLaziness` to `Operation::computeResult`. This parameter is currently unused.
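
To illustrate the second bullet, here is a minimal sketch of what an unused `requestLaziness` parameter could look like. The `Result`, `Operation`, and `ExampleScan` types below are simplified, hypothetical stand-ins and not the actual QLever classes; only the parameter name and the method name `computeResult` come from the commit message.

```cpp
#include <vector>

// Hypothetical, heavily simplified stand-in for the renamed `Result` class.
struct Result {
  std::vector<int> rows;
};

// Hypothetical, simplified operation interface.
class Operation {
 public:
  virtual ~Operation() = default;
  // `requestLaziness` is a hint from the caller. In this sketch (as in the
  // commit described above) implementations do not act on it yet.
  virtual Result computeResult(bool requestLaziness) = 0;
};

// Example operation (an assumption for illustration only).
class ExampleScan : public Operation {
 public:
  Result computeResult([[maybe_unused]] bool requestLaziness) override {
    // The parameter is currently unused: the result is always fully
    // materialized, regardless of what the caller requested.
    return Result{{1, 2, 3}};
  }
};
```

Threading the parameter through all call sites first, without changing behavior, keeps the later functional change small and easier to review.
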
hannahbast pushed a commit that referenced this pull request Jun 13, 2024
This makes the code much simpler and makes no difference for almost all queries. The expensive part (reading from disk and decompressing) is still done in parallel; only the writing to the `IdTable` is now serialized, and there is one additional copy compared to before. An example of a query that is slower because of this change: materialize a large index scan (for example, for the predicate `rdf:type`) and group by subject (there is a shortcut for grouping by object when there are few objects). But such queries will become lazy soon anyway (see #1350), and then this will be irrelevant.
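
The pattern described in this commit message (parallel decompression, serialized writing with one extra copy) can be sketched roughly as follows. This is not the QLever implementation: `Block`, `decompressBlock`, and `scanBlocks` are hypothetical names, and a plain `std::vector<int>` stands in for the `IdTable`.

```cpp
#include <cstddef>
#include <future>
#include <vector>

using Block = std::vector<int>;

// Hypothetical stand-in for the expensive part: reading a block from disk
// and decompressing it. Here it just fabricates two values per block.
Block decompressBlock(std::size_t blockIndex) {
  return Block{static_cast<int>(2 * blockIndex),
               static_cast<int>(2 * blockIndex + 1)};
}

// Decompress all blocks in parallel, then append them to the result table
// one after another (serialized) in block order.
std::vector<int> scanBlocks(std::size_t numBlocks) {
  std::vector<std::future<Block>> pending;
  pending.reserve(numBlocks);
  for (std::size_t i = 0; i < numBlocks; ++i) {
    // Each block is decompressed on its own asynchronous task.
    pending.push_back(std::async(std::launch::async, decompressBlock, i));
  }

  std::vector<int> table;  // stand-in for the `IdTable`
  for (auto& future : pending) {
    // Serialized step: wait for the block, then copy it into the table.
    // This copy is the "additional copy" mentioned above.
    Block block = future.get();
    table.insert(table.end(), block.begin(), block.end());
  }
  return table;
}
```

Because only one thread ever writes to the table, no synchronization around the table is needed, which is where the simplification comes from.
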
realHannes pushed a commit to realHannes/qlever that referenced this pull request Jun 15, 2024
…eiburg#1323)
sonarcloud bot commented Jun 15, 2024

Quality Gate passed

Issues
32 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code
