Innocuous query churns through all triples #200

Closed
kentcb opened this issue Feb 18, 2015 · 6 comments

@kentcb
Contributor

kentcb commented Feb 18, 2015

When my app starts it runs this harmless-looking (to me) query to determine the version of the data store:

var version = entityContext
    .VersionEntities
    .OrderByDescending(x => x.VersionMajor)
    .ThenByDescending(x => x.VersionMinor)
    .ThenByDescending(x => x.VersionBuild)
    .FirstOrDefault();

This causes the StoreSparqlDataset.Triples member to be invoked, enumerating all triples in the store, which takes about 10 seconds even for my fairly modest data set.

I tried an alternative approach:

var version = entityContext
    .VersionEntities
    .OrderByDescending(x => x.Timestamp)
    .FirstOrDefault();

But that has the same result. So, too, does this:

var test = entityContext
    .VersionEntities
    .ToList();

A query that discriminates on ID returns very quickly:

var version = entityContext
    .VersionEntities
    .Where(x => x.Id == "0.0.1")
    .FirstOrDefault();

Something seems severely wrong here. It seems that if there is no explicit discrimination on a property of the entity then all entities are traversed, even though I've specified a particular entity set.

@kal
Contributor

kal commented Feb 18, 2015

I'll take a look at the SPARQL that we are generating but it may be something that I have no control over at the B* level.

@kentcb
Contributor Author

kentcb commented Feb 18, 2015

Ouch, really? So it might be impossible to obtain all entities of a given type without waiting literally minutes on a mobile device? This is a major blocker if that's the case, and I honestly can't see how it would ever be practical to use B*. Please say it isn't so! 😨

@kal
Contributor

kal commented Feb 19, 2015

As I say, I need to look at the SPARQL being generated and then how that is executed by the SPARQL query engine. The LINQ query should be resulting in a SPARQL query that matches on type. If it isn't then that is something that should be relatively easy to fix. However, if the SPARQL query is matching on type but the underlying query engine is still iterating all triples, fixing that requires overriding the dotNetRDF query engine which is something I can do, but it will not be a quick fix.

Don't give up yet ;-)

kal added this to the 1.10 milestone Feb 23, 2015
kal self-assigned this Feb 23, 2015
kal closed this as completed in 247a42f Feb 23, 2015

@kal
Contributor

kal commented Feb 23, 2015

I just pushed an update which I hope will address this issue. It turns out that the SPARQL generated by BrightstarDB is being processed in dotNetRDF as a join between a wildcard match on the left and a subquery on the right. The optimisation I've added detects this case and refactors the query as a LeftJoin with the subquery on the left and the wildcard match on the right. This should avoid enumerating all triples. Let me know if it improves performance for you!
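
To see why the join order matters, here is a toy C# sketch. It is not dotNetRDF or BrightstarDB code: `Triple`, `JoinOrderSketch`, the `rdf:type`/`Version` strings and the in-memory subject index are all hypothetical names invented for illustration. It only counts how many triples the wildcard pattern has to visit when it is evaluated first, versus when the type subquery runs first and its subjects drive the lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy model; none of these types exist in BrightstarDB or dotNetRDF.
public record Triple(string Subject, string Predicate, string Object);

public static class JoinOrderSketch
{
    private const string RdfType = "rdf:type";

    // Builds a store where each entity has one type triple and one property triple.
    public static List<Triple> BuildStore(int versions, int others)
    {
        var triples = new List<Triple>();
        for (var i = 0; i < versions; i++)
        {
            triples.Add(new Triple($"version/{i}", RdfType, "Version"));
            triples.Add(new Triple($"version/{i}", "versionMajor", i.ToString()));
        }
        for (var i = 0; i < others; i++)
        {
            triples.Add(new Triple($"other/{i}", RdfType, "Other"));
            triples.Add(new Triple($"other/{i}", "name", $"item {i}"));
        }
        return triples;
    }

    // Stand-in for the subquery: subjects typed as Version. A real store would
    // presumably answer this from an index; this scan is not counted below.
    public static List<string> VersionSubjects(IEnumerable<Triple> store) =>
        store.Where(t => t.Predicate == RdfType && t.Object == "Version")
             .Select(t => t.Subject)
             .ToList();

    // Wildcard pattern on the left: every triple is visited, then joined
    // against the subquery result.
    public static (int Matches, int Visited) WildcardFirst(List<Triple> store)
    {
        var wanted = new HashSet<string>(VersionSubjects(store));
        int matches = 0, visited = 0;
        foreach (var triple in store)
        {
            visited++;
            if (wanted.Contains(triple.Subject)) matches++;
        }
        return (matches, visited);
    }

    // Subquery on the left: only the triples of the subjects it returns are
    // visited, through a subject index (assumed to be available in the store).
    public static (int Matches, int Visited) SubqueryFirst(List<Triple> store)
    {
        var bySubject = store.ToLookup(t => t.Subject);
        int matches = 0, visited = 0;
        foreach (var subject in VersionSubjects(store))
        {
            var count = bySubject[subject].Count();
            visited += count;
            matches += count;
        }
        return (matches, visited);
    }

    public static void Main()
    {
        var store = BuildStore(versions: 3, others: 1000);
        var slow = WildcardFirst(store);
        var fast = SubqueryFirst(store);
        Console.WriteLine($"wildcard first: {slow.Matches} matches, {slow.Visited} triples visited");
        Console.WriteLine($"subquery first: {fast.Matches} matches, {fast.Visited} triples visited");
    }
}
```

With 3 version entities among 1,000 others, both orders find the same 6 triples, but the wildcard-first order visits all 2,006 triples in the toy store while the subquery-first order visits only 6. The real optimisation works on the SPARQL algebra rather than on in-memory lists, so treat this purely as an intuition aid.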

@kentcb
Contributor Author

kentcb commented Feb 24, 2015

Thanks Kal. I just grabbed the latest, and my startup time (in simulator, debug build) dropped from 30 seconds to 2. 🎉

@kal
Contributor

kal commented Feb 24, 2015

w00t! This is a big performance win across all platforms. Thank you for the report and simple repro that made it easy to track down!
