Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hive-22760] Adding Clock based eviction strategy. #944

Closed
wants to merge 7 commits into from

Conversation

b-slim
Copy link
Contributor

@b-slim b-slim commented Mar 7, 2020

This Pr adds a new Eviction strategy based on simple clock algorithm.
Main goal is to bring down the memory overhead of the cache accounting and make it lock free.
To reduce the memory overhead i have removed the unwanted fields used by lrfu strategy.
To Avoid locking i have added one bit as part of the buffer state that is modified as cas operation.


This change is Reviewable

Change-Id: I6f2877f92557b05f9334700b007244dde3dc52c4

add methods to set and unset the clock ref bit

Change-Id: I0d8406befb214cf696d4f48982cff2fccc58a2a4
Change-Id: I0e3966e3ef5fcc3ecab1a5fbf0a6c1d2f57fcc2d
Change-Id: I17abbae9573a41faf969558d148051f0004caef8
Copy link
Contributor

@odraese odraese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments about style/efficiency but no real bugs.

@@ -4187,6 +4188,8 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
"partitions and tables) for reporting."),
LLAP_USE_LRFU("hive.llap.io.use.lrfu", true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you want to flag this as deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point.

@@ -4187,6 +4188,8 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
"partitions and tables) for reporting."),
LLAP_USE_LRFU("hive.llap.io.use.lrfu", true,
"Whether ORC low-level cache should use LRFU cache policy instead of default (FIFO)."),
LLAP_IO_CACHE_STRATEGY("hive.llap.io.replacement.strategy", "lrfu", new StringSet("fifo", "clock", "lrfu"),
"Cache replacement strategy used by low-level (default to LRFU)."),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used by low-level what?
would rewrite to "Replacement strategy for IO/data cache"

@@ -4196,6 +4199,9 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
LLAP_LRFU_BP_WRAPPER_SIZE("hive.llap.io.lrfu.bp.wrapper.size", 64, "thread local queue "
+ "used to amortize the lock contention, the idea hear is to try locking as soon we reach max size / 2 "
+ "and block when max queue size reached"),
LLAP_IO_MAX_CLOCK_ROTATION("hive.llap.io.clock.max.rotation",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to configure this? Sounds to me like something that a customer would never override (and which we would have a const)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with you this is not for customer at all, it is more for dev nob in case of issue we can push the nob on the customer side without deploying code. Will add more comments about it. this makes me thing that we need to add some annotation about what configs are suppose to be dev configs and the one that can be used by customer. Any other suggestion will be welcomed.

*
* @param buffer buffer that just got unlocked after a read.
*/
@Override public void notifyUnlock(LlapCacheableBuffer buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe unlock would be the better position to set the clock bit (this is the point until the buffer was used). While it is locked, the buffer is protected anyway and no-one is really looking at the clock bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bit set is lock free so i do not think will change much TBH. to avoid changing lot of the tests setup i will keep it as is and change it in the future if we see a strong case for it. Thanks

// clock position as soon we are done with looping
LlapCacheableBuffer currentClockHead = clockHand;

while (evicted < memoryToReserve && currentClockHead != null && fullClockRotation < maxCircles) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is currently no code path, which could lead to currentClockHead being null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point.


public final class LrfuCacheAttribute implements LlapCacheableBuffer.CacheAttribute {
/** Priority for cache policy (should be pretty universal). */
public double priority;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all these have public setter/getter methods. there isn't any reason to have the fields public as well...?

if (strings == null) {
return new int[0];
}
return Arrays.stream(strings).map(x -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that I don't appreciate a nice stream function but I suspect this is the least efficient method to covnert integer strings to int. each instance is wrapped into an optional just fo filter out the non-parsable directly again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed the wrapping

@@ -406,7 +405,8 @@ private boolean lockBuffer(LlapBufferOrBuffers buffers, boolean doNotifyPolicy)
for (int i = 0; i < bufferArray.length; ++i) {
if (lockOneBuffer(bufferArray[i], doNotifyPolicy)) continue;
for (int j = 0; j < i; ++j) {
unlockSingleBuffer(buffer, true);
//@TODO this is a bug by it self.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to either fix the bug while you're at it or to add more documentation to this comment, why this is a bug.

@@ -32,6 +32,11 @@
import org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator;
import org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator.ObjectEstimator;

/**
* Metadata Cache entry that does hold information about Files.
* No actual allocated bytebuffers is used by this class,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytebuffers -> bytebuffer

@@ -239,7 +239,7 @@ public int hashCode() {
/**
* CacheTag for tables with more than one partition level.
*/
public static final class MultiPartitionCacheTag extends PartitionCacheTag {
public static final class MultiPartitionCacheTag extends PartitionCacheTag {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume, these additional spaces were unintentional?

Change-Id: I81f896b067cda067dc4e695945851672701b5e84
Change-Id: Iccad6704f4f98020364f4ac05ebe63bbc62e1deb
Change-Id: I2bff7d5c044844a461ed439c26c71543922cd302
Change-Id: I807142e9ec312cf735dce079f85b9270b4e2caf0
* @return actual time where the buffer was touched last.
*/
long getLastUpdate();
void setTouchTime(long time);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add interface doc for the public methods?

* Eviction start at the current clock hand following the next pointer.
*
*/
public class ClockCachePolicy implements LowLevelCachePolicy {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I was expecting this to be ring buffer based lock free implementation as well. This can be done in a follow up too.

* @param buffer new entry to be added.
* @return the ring head.
*/
private static LlapCacheableBuffer appendToCircularList(LlapCacheableBuffer head, LlapCacheableBuffer buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes compiler/inlining better and testing much simpler.

* @param buffer input
* @return buffer
*/
private static LlapCacheableBuffer linkToItSelf(LlapCacheableBuffer buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. why static?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above those method are generic enough to be static

/**
* Signals to the policy that it has to evict some entries from the cache.
* Policy has to at least evict the amount memory requested.
* Not that is method will block until at least {@code memoryToReserve} bytes are evicted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Note?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be good to log how long it took to allocate the requested memory for debugging.

}
}

@VisibleForTesting protected Iterator<LlapCacheableBuffer> getIterator() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is required only for testing. Should there be protected access? Or can it be package private?

static final int FLAG_MEM_RELEASED = 0b01000; // The memory was released to memory manager.
static final int FLAG_NEW_ALLOC = 0b10000; // New allocation before the first use; cannot force-evict.

static final int FLAGS_WIDTH = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment of how these width's are laid out?

/**
* Utility class to manipulate the buffer state.
*/
static final class State {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be many global state manipulation. Can you lease a comment that the state manipulations are global. Can rename the class to be CacheState/GlobalCacheState to be more precise about what state it represents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will but, it will make the diff bigger therefore more code review :D

static LowLevelCachePolicy provideFromConf(Configuration conf) {
final long totalMemorySize = HiveConf.getSizeVar(conf, HiveConf.ConfVars.LLAP_IO_MEMORY_MAX_SIZE);
final int minAllocSize = (int) HiveConf.getSizeVar(conf, HiveConf.ConfVars.LLAP_ALLOCATOR_MIN_ALLOC);
String policyName = HiveConf.getVar(conf,HiveConf.ConfVars.LLAP_IO_CACHE_STRATEGY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or make the policy name lowercase here to be case insensitive always?

@github-actions
Copy link

github-actions bot commented Jun 6, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Jun 6, 2020
@github-actions github-actions bot closed this Jun 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants