FedX: Missing limit pushing of ASK queries with single statement pattern causes poor query performance #5033

Closed
aschwarte10 opened this issue Jun 18, 2024 · 0 comments · Fixed by #5034
Labels: 🐞 bug (issue is a bug), 📦 fedx (fedx: optimized federated query support)

Comments

@aschwarte10
Contributor

aschwarte10 commented Jun 18, 2024

Current Behavior

Currently, for ASK queries with a single statement pattern, the limit of 1 is not pushed into the sub-query, although this is implemented for SELECT queries.

The effect is poor query performance.

Example:

ASK { ?person a foaf:Person }

Currently, the federation implementation fetches all persons available in the federation members (and checks locally that there is at least one binding), even though we are only interested in whether a match exists.

Note: for SELECT queries FedX already has an optimizer that pushes the limit into the sub-query for such trivial cases.

Expected Behavior

The limit is pushed, and the same optimization as for SELECT queries with a single statement pattern is applied.
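
To illustrate the difference, here is a minimal, self-contained Java sketch. It does not use the FedX or RDF4J API: `Member`, `fetch`, `askFetchAll` and `askWithPushedLimit` are hypothetical names invented for this illustration, and the members are queried one after another for simplicity. It only models how many bindings each federation member has to return in the two cases.

```java
import java.util.ArrayList;
import java.util.List;

class AskLimitSketch {

    /** Hypothetical stand-in for a federation member (not a FedX class). */
    static final class Member {
        final int matches;      // number of triples matching the statement pattern
        long transferred = 0;   // bindings this member has sent back so far

        Member(int matches) {
            this.matches = matches;
        }

        /** Returns matching bindings; a negative upperLimit means "no limit". */
        List<String> fetch(long upperLimit) {
            long n = upperLimit < 0 ? matches : Math.min(matches, upperLimit);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                out.add("person" + i);
            }
            transferred += n;
            return out;
        }
    }

    /** Behavior described in the issue: fetch everything, then check locally. */
    static boolean askFetchAll(List<Member> members) {
        List<String> all = new ArrayList<>();
        for (Member m : members) {
            all.addAll(m.fetch(-1));
        }
        return !all.isEmpty();
    }

    /** Expected behavior: each member is asked for at most one binding. */
    static boolean askWithPushedLimit(List<Member> members) {
        List<String> all = new ArrayList<>();
        for (Member m : members) {
            all.addAll(m.fetch(1));
        }
        return !all.isEmpty();
    }

    /** Sums the bindings transferred by all members. */
    static long totalTransferred(List<Member> members) {
        long sum = 0;
        for (Member m : members) {
            sum += m.transferred;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Member> naive = List.of(new Member(100_000), new Member(50_000));
        System.out.println("fetch all:    " + askFetchAll(naive)
                + ", transferred=" + totalTransferred(naive));

        List<Member> pushed = List.of(new Member(100_000), new Member(50_000));
        System.out.println("pushed limit: " + askWithPushedLimit(pushed)
                + ", transferred=" + totalTransferred(pushed));
    }
}
```

Both variants return the same answer, but in this model the amount of data transferred with the pushed limit is bounded by the number of members rather than by the number of matching triples.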

Steps To Reproduce

To see the poor performance, run an ASK query on a large database with millions of instances.

The difference in the query plan is the "Upper Limit: N" entry attached to the statement pattern:

QueryRoot
   Slice (limit=1)
      StatementSourcePattern
         Var (name=person)
         Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
         Var (name=_const_e1df31e0_uri, value=http://xmlns.com/foaf/0.1/Person, anonymous)
         StatementSource (id=sparql_localhost:18080_repositories_endpoint1, type=REMOTE)
         StatementSource (id=sparql_localhost:18080_repositories_endpoint2, type=REMOTE)
         Upper Limit: 1

Version

4.3.12

Are you interested in contributing a solution yourself?

Yes

Anything else?

No response

@aschwarte10 aschwarte10 added the 🐞 bug issue is a bug label Jun 18, 2024
@aschwarte10 aschwarte10 self-assigned this Jun 18, 2024
@aschwarte10 aschwarte10 added the 📦 fedx fedx: optimized federated query support label Jun 18, 2024
@aschwarte10 aschwarte10 added this to the 5.0.0 milestone Jun 18, 2024
aschwarte10 added a commit that referenced this issue Jun 18, 2024
This change makes sure to push limits for simple ASK queries with a
single statement pattern into the query.

The optimization is the same as applied for simple SELECT queries with a
LIMIT.

Rationale: if the limit is not pushed, the federation engine will first fetch all data for the statement pattern and only then check locally whether there is at least one result, i.e., it will cause performance issues and memory pressure when many triples match the statement pattern (for instance, millions of persons).