Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Caching #282

Closed
wants to merge 6 commits into from
Closed

Improve Caching #282

wants to merge 6 commits into from

Conversation

niklas88
Copy link
Member

@niklas88 niklas88 commented Sep 6, 2019

This adds cache pinning, meaning an element can be marked so that it is never cleared from the cache unless erased explicitly. Together with this major feature we don't keep aborted Operations in the cache, fix wrongly calculated variable column maps for IndexScan and use unique_ptr for the _rootOperation in QueryExecutionTree to make it explicit and tested that this is owned.

The theory behind caching aborted Operations was that they would fail
again anyway. However, under memory pressure Operations may fail due to
failing allocations and will succeed in the future once memory pressure
drops, if we kept them in the cache we wouldn't try again.

While we are at it, improve wording and use an extra abort() method on
Operation.
this allows to mark elements as pinned, meaning they will not be dropped
due to capacity constraints or clear() but only with an explicit erase
(e.g. as used for aborted queries).
The Operation is owned so it should be a unique_ptr not shared_ptr. This
will hopefully save quite a few memory barriers (for atomic ref
counting) and is cleaner.

We also prevent the copying of QueryExecutionTrees which was previously
done in several places and may be expensive but is definitely
unexpected.
They are actually kept in the order of the triple file as seen by the
setVariableColumn() calls in QueryPlanner.cpp
The name always suggested that it carried context per query while in
fact it held the SubtreeCache which should really live for many queries.
Since it is indeed useful to have a per query context we move the
SubtreeCache outside the QueryExecutionContext and make it truly per
query.
This allows to pin all subtrees of a query to the cache and have that as
a per query option. On the QLever server this is exposed as the
"pinsubtrees" URL param. If set to true all subtrees in the query will
be cached. This is especially useful for completion queries and the
like.
@niklas88
Copy link
Member Author

@floriankramer can you take a look so we can merge this? It's already live on the Wikidata Full backend and was used during the SPP presentation so it seems to work

@niklas88 niklas88 closed this Oct 1, 2019
@niklas88 niklas88 deleted the feat_cache_improve_two branch October 1, 2019 15:52
@niklas88 niklas88 restored the feat_cache_improve_two branch October 1, 2019 15:53
@niklas88 niklas88 reopened this Oct 1, 2019
@niklas88
Copy link
Member Author

niklas88 commented Oct 1, 2019

@lehmann-4178656ch for some weird reason Travis CI tries to build the wrong commit cb8e8f3 for this PR, do you have any idea why it doesn't use the correct commit 1c85b1f I see locally as well as in the PR on GitHub?

@lehmann-4178656ch
Copy link
Member

lehmann-4178656ch commented Oct 1, 2019

That's a strange issue, not the first time travis has had such a hickup but there it happened because of a change between triggering the build and checkout. This should not apply here because the selected commit is completely unrelated.

If I follow the steps mentioned in the forum in a different topic I get cb8e8f31488422c7562552583d21b6d88258c057 as HEAD:

$ git clone git@github.com:ad-freiburg/QLever.git
$ cd QLever
$ git fetch origin +refs/pull/282/merge
$ git checkout -qf FETCH_HEAD
$ git log
commit cb8e8f31488422c7562552583d21b6d88258c057 (HEAD)
Merge: f2ece56 1c85b1f
...

Sadly I don't know why this happens, but I think travis is using something like that to get the commit.

@niklas88
Copy link
Member Author

niklas88 commented Oct 1, 2019

That's a strange issue, not the first time travis has had such a hickup but there it happened because of a change between triggering the build and checkout. This should not apply here because the selected commit is completely unrelated.

If I follow the steps mentioned in the forum in a different topic I get cb8e8f31488422c7562552583d21b6d88258c057 as HEAD:

$ git clone git@github.com:ad-freiburg/QLever.git
$ cd QLever
$ git fetch origin +refs/pull/282/merge
$ git checkout -qf FETCH_HEAD
$ git log
commit cb8e8f31488422c7562552583d21b6d88258c057 (HEAD)
Merge: f2ece56 1c85b1f
...

Sadly I don't know why this happens, but I think travis is using something like that to get the commit.

Thanks for investigating!

I guess I have to recreate the PR

@niklas88
Copy link
Member Author

niklas88 commented Oct 1, 2019

Addendum I think it might be because I accidentally closed the PR and reopened it. Instead I should create a new PR from the branch.

@niklas88 niklas88 closed this Oct 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants