Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RAM usage estimation of LiveVersionMap. #20123

Merged

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Aug 23, 2016

I was writing tests for RAM usage estimation of LiveVersionMap and found a
couple issues:

  • The BytesRef objects used as uids were oversized since they were created
    via new BytesRef(CharSequence) which creates a byte[] whose size is 3x
    the length of the provided char sequence. Given that our uids are most of
    times ASCII sequences, this is a waste of memory.
  • VersionValue was using translogLocation.size instead of
    translogLocation.ramBytesUsed() for RAM estimation, which is completely
    unrelated to the memory footprint of the Translog.Location object.

In particular, the latter issue could cause RAM usage estimation to be
significantly overestimated, especially on large documents.

I also added tests for ram accounting.

Relates #19787

@jpountz
Copy link
Contributor Author

jpountz commented Aug 23, 2016

Fortunately this bug was never released, it is 5.0-only.

@@ -146,7 +157,7 @@ VersionValue getUnderLock(final Term uid) {

/** Adds this uid/version to the pending adds map. */
void putUnderLock(BytesRef uid, VersionValue version) {

assert uid.bytes.length == uid.length; // otherwise we are wasting memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe put the actual and expected length in the message

@s1monw
Copy link
Contributor

s1monw commented Aug 23, 2016

left a minor LGTM otherwise - good catch :)

I was writing tests for RAM usage estimation of LiveVersionMap and found a
couple issues:
 - The BytesRef objects used as uids were oversized since they were created
   via `new BytesRef(CharSequence)` which creates a `byte[]` whose size is 3x
   the length of the provided char sequence. Given that our uids are most of
   times ASCII sequences, this is a waste of memory.
 - `VersionValue` was using `translogLocation.size` instead of
   `translogLocation.ramBytesUsed()` for RAM estimation, which is completely
   unrelated to the memory footprint of the `Translog.Location` object.

In particular, the latter issue could cause RAM usage estimation to be
significantly overestimated, especially on large documents.

I also added tests for ram accounting.
@jpountz jpountz force-pushed the fix/live_version_map_ram_usage_estimation branch from 85507fa to 5d6c9b0 Compare August 24, 2016 07:54
@jpountz jpountz merged commit 5d6c9b0 into elastic:master Aug 24, 2016
@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Translog :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. v5.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants