Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use separate BitSet cache in Doc Level Security #43669

Merged
merged 12 commits into from
Jul 3, 2019

Conversation

tvernum
Copy link
Contributor

@tvernum tvernum commented Jun 27, 2019

Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.

However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.

This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.

The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.

Closes: #30974

Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.

However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.

This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.

The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.
@tvernum tvernum added :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v7.3.0 v8.0.0 labels Jun 27, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security

@tvernum
Copy link
Contributor Author

tvernum commented Jun 27, 2019

@elasticmachine update branch

@tvernum tvernum marked this pull request as ready for review June 28, 2019 07:30
@tvernum tvernum requested review from jkakavas and jpountz June 28, 2019 07:30
@tvernum
Copy link
Contributor Author

tvernum commented Jun 28, 2019

I haven't written docs for these new cache config properties (ttl and max_bytes) but will do so on Monday.

I'm curious about other people's opinions on the default values for these.

  • I've set max_bytes to 50Mb, because it seems like it would be a problem to use more than that on a small (1Gb) node, but I don't really have the information at hand to know whether that's big enough to store typical workloads or whether users will have cache thrashing.

  • I set ttl to a week (7 * 24 hours) because I think we want to depend on the max_bytes setting to keep the cache under control, rather than aggresively expiring entries if they aren't used for a day or so.

I could easily be persuaded to a different set of values, but since the main purpose of this change it to reduce the amount of memory used by DLS bitsets, we don't want to make those numbers too big.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to find a better way to evict on closed segments. We used to have other caches that required linear scans for eviction and this caused issues, eg. when closing a node because then all its indices would close one after the other, and since close listeners are called synchonously from IndexReader#close, this would run in O(n^2) as a function of the number of segments on the node.


@Override
public void onClose(IndexReader.CacheKey ownerCoreCacheKey) {
bitsetCacheHelper.removeKeysIf(key -> key.matchesIndex(ownerCoreCacheKey));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have had performance issues with using linear scans over cache keys in close listeners in the past. I'm not sure how to address it, some other caches have two levels of keys, ie. they look like a Map<IndexReader.CacheKey, Map<Query, Value>>. But maybe we could also separately maintain a mapping between indexreader cache keys and queries that have a cache entry on that particular segment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look at maintaining a secondary mapping. The problem with the two-level cache is that we don't have good ways to manage expiry.
The lower level caches could have access-time expiry, but it's hard to do memory-size expiry.
The upper level caches can do memory-size expiry, but will expire everything under a IndexReader.CacheKey rather than expiring the LRU queries.

this.bitsetCache = CacheBuilder.<BitsetCacheKey, BitSet>builder()
.setExpireAfterAccess(ttl)
.setMaximumWeight(size.getBytes())
.weigher((key, bitSet) -> bitSet.ramBytesUsed()).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return 0 if bitSet == NULL_MARKER?

# Conflicts:
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/accesscontrol/SecurityIndexSearcherWrapper.java
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/accesscontrol/SecurityIndexReaderWrapperIntegrationTests.java
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/accesscontrol/SecurityIndexSearcherWrapperUnitTests.java
#	x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/Security.java
@tvernum tvernum requested a review from jpountz July 1, 2019 11:46
Copy link
Member

@jkakavas jkakavas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes / logic looks good to me, I just left some wording nits and a suggestion about the naming of the related settings. I don't feel particularly confident with this part but thankfully @jpountz makes up for it

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments but otherwise it looks good to me, including the mapping from queries to cache keys.

- Rename "max_bytes" to "size"
- Rename "dls_fls" to "dls"
- Fix typos in Javadoc
@tvernum
Copy link
Contributor Author

tvernum commented Jul 3, 2019

@elasticmachine update branch

@tvernum tvernum merged commit bb130f5 into elastic:master Jul 3, 2019
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Jul 3, 2019
Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.

However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.

This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.

The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.

The cache defaults to 50MB, and entries expire if unused for 7 days.

Backport of: elastic#43669
tvernum added a commit that referenced this pull request Jul 3, 2019
Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.

However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.

This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.

The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.

The cache defaults to 50MB, and entries expire if unused for 7 days.

Backport of: #43669
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Jul 9, 2019
Two new settings were introduced in elastic#43669 (bb130f5) to control the
behaviour of the Document Level Security BitSet cache.

This change adds documentation for these 2 settings.
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Jul 16, 2019
Two new settings were introduced in elastic#43669 (bb130f5) to control the
behaviour of the Document Level Security BitSet cache.

This change adds documentation for these 2 settings.

Backport of: elastic#44100
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Jul 16, 2019
Two new settings were introduced in elastic#43669 (bb130f5) to control the
behaviour of the Document Level Security BitSet cache.

This change adds documentation for these 2 settings.

Backport of: elastic#44100
tvernum added a commit that referenced this pull request Jul 16, 2019
Two new settings were introduced in #43669 to control the
behaviour of the Document Level Security BitSet cache.

This change adds documentation for these 2 settings.

Backport of: #44100
tvernum added a commit that referenced this pull request Jul 16, 2019
Two new settings were introduced in #43669 to control the
behaviour of the Document Level Security BitSet cache.

This change adds documentation for these 2 settings.

Backport of: #44100
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Dec 31, 2019
The Document Level Security BitSet Cache (see elastic#43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.
tvernum added a commit that referenced this pull request Jan 13, 2020
The Document Level Security BitSet Cache (see #43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.

Resolves: #49260
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Jan 14, 2020
The Document Level Security BitSet Cache (see elastic#43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.

Resolves: elastic#49260
Backport of: elastic#50535
tvernum added a commit that referenced this pull request Jan 14, 2020
The Document Level Security BitSet Cache (see #43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.

Backport of: #50535
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
The Document Level Security BitSet Cache (see elastic#43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.

Resolves: elastic#49260
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Give document-level security its own cache.
5 participants