Improve Caching #282

niklas88 · 2019-09-06T15:09:56Z

This adds cache pinning, meaning an element can be marked so that it is never cleared from the cache unless erased explicitly. Together with this major feature we don't keep aborted Operations in the cache, fix wrongly calculated variable column maps for IndexScan and use unique_ptr for the _rootOperation in QueryExecutionTree to make it explicit and tested that this is owned.

The theory behind caching aborted Operations was that they would fail again anyway. However, under memory pressure Operations may fail due to failing allocations and will succeed in the future once memory pressure drops, if we kept them in the cache we wouldn't try again. While we are at it, improve wording and use an extra abort() method on Operation.

this allows to mark elements as pinned, meaning they will not be dropped due to capacity constraints or clear() but only with an explicit erase (e.g. as used for aborted queries).

The Operation is owned so it should be a unique_ptr not shared_ptr. This will hopefully save quite a few memory barriers (for atomic ref counting) and is cleaner. We also prevent the copying of QueryExecutionTrees which was previously done in several places and may be expensive but is definitely unexpected.

They are actually kept in the order of the triple file as seen by the setVariableColumn() calls in QueryPlanner.cpp

The name always suggested that it carried context per query while in fact it held the SubtreeCache which should really live for many queries. Since it is indeed useful to have a per query context we move the SubtreeCache outside the QueryExecutionContext and make it truly per query.

This allows to pin all subtrees of a query to the cache and have that as a per query option. On the QLever server this is exposed as the "pinsubtrees" URL param. If set to true all subtrees in the query will be cached. This is especially useful for completion queries and the like.

niklas88 · 2019-09-25T10:40:39Z

@floriankramer can you take a look so we can merge this? It's already live on the Wikidata Full backend and was used during the SPP presentation so it seems to work

niklas88 · 2019-10-01T17:33:04Z

@lehmann-4178656ch for some weird reason Travis CI tries to build the wrong commit cb8e8f3 for this PR, do you have any idea why it doesn't use the correct commit 1c85b1f I see locally as well as in the PR on GitHub?

lehmann-4178656ch · 2019-10-01T18:37:08Z

That's a strange issue, not the first time travis has had such a hickup but there it happened because of a change between triggering the build and checkout. This should not apply here because the selected commit is completely unrelated.

If I follow the steps mentioned in the forum in a different topic I get cb8e8f31488422c7562552583d21b6d88258c057 as HEAD:

$ git clone git@github.com:ad-freiburg/QLever.git
$ cd QLever
$ git fetch origin +refs/pull/282/merge
$ git checkout -qf FETCH_HEAD
$ git log
commit cb8e8f31488422c7562552583d21b6d88258c057 (HEAD)
Merge: f2ece56 1c85b1f
...

Sadly I don't know why this happens, but I think travis is using something like that to get the commit.

niklas88 · 2019-10-01T18:58:49Z

That's a strange issue, not the first time travis has had such a hickup but there it happened because of a change between triggering the build and checkout. This should not apply here because the selected commit is completely unrelated.

If I follow the steps mentioned in the forum in a different topic I get cb8e8f31488422c7562552583d21b6d88258c057 as HEAD:
$ git clone git@github.com:ad-freiburg/QLever.git
$ cd QLever
$ git fetch origin +refs/pull/282/merge
$ git checkout -qf FETCH_HEAD
$ git log
commit cb8e8f31488422c7562552583d21b6d88258c057 (HEAD)
Merge: f2ece56 1c85b1f
...
Sadly I don't know why this happens, but I think travis is using something like that to get the commit.

Thanks for investigating!

I guess I have to recreate the PR

niklas88 · 2019-10-01T19:01:37Z

Addendum I think it might be because I accidentally closed the PR and reopened it. Instead I should create a new PR from the branch.

niklas88 added 6 commits September 4, 2019 17:52

Add basic pinning functionality to LRUCache

08fbfd2

this allows to mark elements as pinned, meaning they will not be dropped due to capacity constraints or clear() but only with an explicit erase (e.g. as used for aborted queries).

Fix wrong variable columns in IndexScan

bace316

They are actually kept in the order of the triple file as seen by the setVariableColumn() calls in QueryPlanner.cpp

niklas88 requested a review from floriankramer September 6, 2019 15:09

niklas88 mentioned this pull request Sep 6, 2019

Remove variable/column map redundancy (WIP) #281

Closed

niklas88 closed this Oct 1, 2019

niklas88 deleted the feat_cache_improve_two branch October 1, 2019 15:52

niklas88 restored the feat_cache_improve_two branch October 1, 2019 15:53

niklas88 reopened this Oct 1, 2019

niklas88 closed this Oct 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Caching #282

Improve Caching #282

niklas88 commented Sep 6, 2019

niklas88 commented Sep 25, 2019

niklas88 commented Oct 1, 2019

lehmann-4178656ch commented Oct 1, 2019 •

edited

niklas88 commented Oct 1, 2019 •

edited

niklas88 commented Oct 1, 2019

Improve Caching #282

Improve Caching #282

Conversation

niklas88 commented Sep 6, 2019

niklas88 commented Sep 25, 2019

niklas88 commented Oct 1, 2019

lehmann-4178656ch commented Oct 1, 2019 • edited

niklas88 commented Oct 1, 2019 • edited

niklas88 commented Oct 1, 2019

lehmann-4178656ch commented Oct 1, 2019 •

edited

niklas88 commented Oct 1, 2019 •

edited