Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy initialize checkpoint tracker bit sets #27179

Merged
merged 8 commits into from Nov 2, 2017

Conversation

jasontedor
Copy link
Member

This local checkpoint tracker uses collections of bit sets to track which sequence numbers are complete, eventually removing these bit sets when the local checkpoint advances. However, these bit sets were eagerly allocated so that if a sequence number far ahead of the checkpoint was marked as completed, all bit sets between the "last" bit set and the bit set needed to track the marked sequence number were allocated. If this sequence number was too far ahead, the memory requirements could be excessive. This commit opts for a different strategy for holding on to these bit sets and enables them to be lazily allocated.

Relates #10708

@jasontedor jasontedor changed the title Lazy seq no bit sets Lazy initialize checkpoint tracker bit sets Oct 30, 2017
This local checkpoint tracker uses collections of bit sets to track
which sequence numbers are complete, eventually removing these bit sets
when the local checkpoint advances. However, these bit sets were eagerly
allocated so that if a sequence number far ahead of the checkpoint was
marked as completed, all bit sets between the "last" bit set and the bit
set needed to track the marked sequence number were allocated. If this
sequence number was too far ahead, the memory requirements could be
excessive. This commit opts for a different strategy for holding on to
these bit sets and enables them to be lazily allocated.
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some nits.

FixedBitSet current = processedSeqNo.getFirst();
long bitArrayKey = getBitArrayKey(checkpoint);
FixedBitSet current = processedSeqNo.get(bitArrayKey);
if (current == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when can this happen? it seems we only clean bit sets here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we clean the set (e.g., the checkpoint is equal to size - 1) but the checkpoint can not advance to size. In this case, when size is marked as completed, checkpoint will still be equal to size - 1 but we have already cleaned the set.

"updateCheckpoint is called but the bit following the checkpoint is not set";
try {
// keep it simple for now, get the checkpoint one by one; in the future we can optimize and read words
FixedBitSet current = processedSeqNo.getFirst();
long bitArrayKey = getBitArrayKey(checkpoint);
FixedBitSet current = processedSeqNo.get(bitArrayKey);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is a FixedBitSet I wonder why renamed everything to use bitArray? I don't mind. Just curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I did not like the inconsistency that in some places it's referred to as a "bit array", and in some it's referred to as a "bit set". I wanted to make them all "bit set" but the problem is the setting index.seq_no.checkpoint.bit_arrays_size (currently never released to the world) refers to "bit array". If, and only if, you would be okay with changing the name of this setting to index.seq_no.checkpoint.bit_sets_size (in 6.0) then I would be okay with normalizing everything to "bit set".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with renaming the setting. I think I added it just be able to control things from tests and as an extra escape hatch if the default size proved wrong. We might want to just remove it. We can solve the test part differently and I'm less concerned now about the default size (I'm also thinking that 1024 is maybe too small)..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #27191.

* The checkpoint always falls in the current bit set or we have already cleaned it; if it falls on the last bit of the
* current bit set, we can clean it.
*/
if (checkpoint == (1 + bitArrayKey) * bitArraysSize - 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we maybe cache (1 + bitArrayKey) * bitArraysSize - 1 to something like lastSeqNoInArray ? I think it will be easier to read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 91fbab3.

* Previously this would allocate the entire chain of bit sets to the one for the sequence number being marked; for very large
* sequence numbers this could lead to excessive memory usage resulting in out of memory errors.
*/
tracker.markSeqNoAsCompleted(randomNonNegativeLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol. can we assert we allocated just one array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed b2cbfc5.

@@ -253,7 +271,17 @@ public void testResetCheckpoint() {
tracker.resetCheckpoint(localCheckpoint);
assertThat(tracker.getCheckpoint(), equalTo((long) localCheckpoint));
assertThat(tracker.getMaxSeqNo(), equalTo((long) maxSeqNo));
assertThat(tracker.processedSeqNo, empty());
assertThat(tracker.processedSeqNo, new BaseMatcher<LongObjectHashMap<FixedBitSet>>() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I despise assertTrue. Note that this gives a nicer error message if the assertion fails:

java.lang.AssertionError: 
Expected: empty
     but: was <[512=>org.apache.lucene.util.FixedBitSet@98761236]>
Expected :empty
     
Actual   :<[512=>org.apache.lucene.util.FixedBitSet@98761236]>

as opposed to

java.lang.AssertionError
	at __randomizedtesting.SeedInfo.seed([F9E3DE31F5D979AE:C4C54170CE328934]:0)
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.elasticsearch.index.seqno.LocalCheckpointTrackerTests.testResetCheckpoint(LocalCheckpointTrackerTests.java:276)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)

@jasontedor
Copy link
Member Author

Thanks @bleskes, I've responded to your feedback.

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left one suggestion.

return seqNo / bitArraysSize;
}

private FixedBitSet getBitArrayForSeqNo(final long seqNo) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use indexOf(...) and then indexInsert(...) and indexGet(...) respectively to avoid determining what the slot is for a key several times?

final int slot = processedSeqNo.indexOf(bitArrayKey);
if (processedSeqNo.indexExists(slot) == false) {
   processedSeqNo.indexInsert(slot, bitArrayKey, new FixedBitSet(bitArraysSize));
}
return processedSeqNo.indexGet(slot);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg, I pushed cabda87.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And 68c4eab. 😐

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the indexGet at the end is not correct because the returned index is negative if the slot is not actually used. Yet, ~slot also does not work because the table could have been resized after the insert. I pushed 97bb3ca.

Copy link
Member

@martijnvg martijnvg Oct 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I didn't realize that. Maybe we should do this to make optimal use of the slot?

final int slot = processedSeqNo.indexOf(bitArrayKey);
if (processedSeqNo.indexExists(slot) == false) {
   FixedBitSet bitSet = new FixedBitSet(bitArraysSize));
   processedSeqNo.indexInsert(slot, bitArrayKey, bitSet);
   return bitSet;
} else {
   return processedSeqNo.indexGet(slot);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you see a chance to avoid == false, you seize it. 😉

I pushed f4f0dae.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

* master:
  Remove checkpoint tracker bit sets setting
  Fix stable BWC branch detection logic
  Fix logic detecting unreleased versions
  Enhances exists queries to reduce need for `_field_names` (elastic#26930)
  Added new terms_set query
  Set request body to required to reflect the code base (elastic#27188)
  Update Docker docs for 6.0.0-rc2 (elastic#27166)
  Add version 6.0.0
  Docs: restore now fails if it encounters incompatible settings (elastic#26933)
  Convert index blocks to cluster block exceptions (elastic#27050)
  [DOCS] Link remote info API in Cross Cluster Search docs page
  Fix Laplace scorer to multiply by alpha (and not add) (elastic#27125)
  [DOCS] Clarify migrate guide and search request validation
  Raise IllegalArgumentException if query validation failed (elastic#26811)
  prevent duplicate fields when mixing parent and root nested includes (elastic#27072)
  TopHitsAggregator must propagate calls to `setScorer`. (elastic#27138)
@jasontedor jasontedor merged commit 59657ad into elastic:master Nov 2, 2017
jasontedor added a commit that referenced this pull request Nov 2, 2017
This local checkpoint tracker uses collections of bit sets to track
which sequence numbers are complete, eventually removing these bit sets
when the local checkpoint advances. However, these bit sets were eagerly
allocated so that if a sequence number far ahead of the checkpoint was
marked as completed, all bit sets between the "last" bit set and the bit
set needed to track the marked sequence number were allocated. If this
sequence number was too far ahead, the memory requirements could be
excessive. This commit opts for a different strategy for holding on to
these bit sets and enables them to be lazily allocated.

Relates #27179
@jasontedor jasontedor deleted the lazy-seq-no-bit-sets branch November 2, 2017 01:27
@clintongormley clintongormley added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Sequence IDs labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v6.1.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants