Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent handling of implicit and explicit LIMIT clauses #1355

Merged
merged 7 commits into from
Jun 6, 2024

Conversation

RobinTF
Copy link
Collaborator

@RobinTF RobinTF commented May 23, 2024

There were some inconsistencies between explicit LIMIT clauses and the HTTP parameter send (which is used by the QLever UI) which are fixed by this PR.

Notable Changes after this PR is merged:

  • The analysis tree will now always display implicit limits applied during export
  • The "executed-implicitly-during-query-export" detail for the runtime information will only get applied if a limit is actually applied. Note that this will always be true for JSON exports because a default limit is always set there.
  • The "send" HTTP parameter that limits the maximum output is now also applied correctly to csv/tsv/turtle exports (but not set by default, previously it was ignored completely).

Copy link

codecov bot commented May 23, 2024

Codecov Report

Attention: Patch coverage is 98.27586% with 1 line in your changes missing coverage. Please review.

Project coverage is 88.87%. Comparing base (f9c3132) to head (205399a).
Report is 2 commits behind head on master.

Files Patch % Lines
src/engine/ExportQueryExecutionTrees.cpp 98.21% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1355      +/-   ##
==========================================
+ Coverage   88.77%   88.87%   +0.10%     
==========================================
  Files         324      326       +2     
  Lines       28676    28926     +250     
  Branches     3172     3205      +33     
==========================================
+ Hits        25456    25709     +253     
+ Misses       2067     2065       -2     
+ Partials     1153     1152       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mediaType == MediaType::qleverJson)) {
maxSend = MAX_NOF_ROWS_IN_RESULT;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously the exporter only applied maxSend to json results, tsv/csv results were not affected. For now I decided to keep this behaviour, because we can easily stream csv/tsv results without having to aggregate a giant string in memory, whereas we can't do that in JSON due to limitations of the JSON framework we use.

src/engine/Server.cpp Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a very minor suggestion for renaming things.

src/engine/ExportQueryExecutionTrees.cpp Outdated Show resolved Hide resolved
Copy link

sonarcloud bot commented Jun 5, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@joka921 joka921 changed the title Refactoring preliminaries for lazy operations (Part 2) Consistent handling of implicit and explicit LIMIT clauses. Jun 6, 2024
@joka921 joka921 changed the title Consistent handling of implicit and explicit LIMIT clauses. Consistent handling of implicit and explicit LIMIT clauses Jun 6, 2024
@joka921 joka921 merged commit ccf592e into ad-freiburg:master Jun 6, 2024
19 checks passed
@RobinTF RobinTF deleted the max-send-changes branch June 6, 2024 09:07
hannahbast pushed a commit that referenced this pull request Sep 8, 2024
#1355 changed the effect of
the QLever-specific `send` parameter by clamping the `LIMIT` of the
query accordingly, and adding one if none was there. This has the effect
that, in the QLever UI, the result size of all queries with a result
with more tha 100 rows was since shown as `100`.

This PR tries to revert this unwanted effect while keeping the good
stuff from #1355 (which unified the code for the various export formats
and fixed various inconsistencies). The behavior is now as follows:

1. The query is first processed according to the `LIMIT` specified in
   the query. In particular, if no `LIMIT` is specified, the full result
   is materialized internally.

2. At most `send` bindings are exported. In the QLever JSON, there are
   now two fields: `resultsize` (the size of the internally computed
   result) and `sent` (the number of bindings exported).

The idea is to add an option in the QLever UI for automatically adding a
`LIMIT' to each query. That option should be activated by default (in
order to save computation time), but a user should be able to deactivate
it in case they are interested in the full result size.

Future work: When one is only interested in the size of the full result,
that can als be computed without fully materializing it. That would save
RAM compared to the current approach.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants