-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove caching for CapacityBaseRandomPolicy #16187
Remove caching for CapacityBaseRandomPolicy #16187
Conversation
Automated checks report:
Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks. |
Automated checks report:
All checks passed! |
9aefd2f
to
a2f321f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a minor question
@Override | ||
protected long randomInCapacity(long totalCapacity) { | ||
protected long randomInCapacity(Long blockId, long totalCapacity) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you change the long to Long?
return null; | ||
// blockId base hash value to decide which worker to cache data, | ||
// so the same block will be routed to the same set of worker. | ||
long sourceValue = blockId + ThreadLocalRandom.current().nextInt(mMaxReplicaSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this policy wants to take block replication into consideration. I think master has a dedicated background job taking care of the replication adjustment periodically. On the other hand, the other policies like LocalFirstPolicy
etc. do not care about replications, either. WDYT @jiacheliu3 @beinan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand here either. @dengweisysu could you explain your routing logic here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbw9580 You are right, master can take care of replications, so we can remove this random logic. But different from LocalFirstPolicy, this policy is much like DeterministicHashPolicy which use 'alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards' to random target worker.
Anyway, I agree with you to remove this random logic in this policy.
// blockId base hash value to decide which worker to cache data, | ||
// so the same block will be routed to the same set of worker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have randomness introduced, so even if you are hashing that still doesn't give you the same set every time?
bf196ce
to
0add647
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dengweisysu @maobaolong as per our discussion offline, we want a separate CapacityBaseHashPolicy
, and the CapacityBaseRandomPolicy
will continue to be a random policy, but with the caching issue fixed.
core/client/fs/src/main/java/alluxio/client/block/policy/CapacityBaseRandomPolicy.java
Outdated
Show resolved
Hide resolved
core/client/fs/src/main/java/alluxio/client/block/policy/CapacityBaseRandomPolicy.java
Outdated
Show resolved
Hide resolved
I couldn't agree more. should I close the PR first? @dbw9580 |
Can you please make this policy a truly random one, as opposed to #16237 ? i.e. make |
@dbw9580 @maobaolong is it necessary to keep the local cache logic ? I prefer to remove and rollback to the first version of CapacityBaseRandomPolicy to keep it simple. Local cache solution only solve little problem and make the the logic much more complicated. |
This reverts commit 3b1bdb3
@dengweisysu Yes, I just reverted 3b1bdb3 |
@@ -5907,25 +5907,6 @@ public String toString() { | |||
.setConsistencyCheckLevel(ConsistencyCheckLevel.WARN) | |||
.setScope(Scope.CLIENT) | |||
.build(); | |||
public static final PropertyKey USER_UFS_BLOCK_READ_LOCATION_POLICY_CACHE_SIZE = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's our policy for removing PropertyKey
s? Should we deprecate them instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If alluxio client can ignore unknown config key without exception throwed, I prefer to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Jiacheng Liu <jiacheliu3@gmail.com>
Co-authored-by: Jiacheng Liu <jiacheliu3@gmail.com>
alluxio-bot, merge this please |
### What changes are proposed in this pull request? This effectively reverts Alluxio#15423 (Alluxio@3b1bdb3) For the problem that Alluxio#15423 was trying to solve, a new policy is proposed in Alluxio#16237. ### Why are the changes needed? 1. local cache is useless because BlockLocationPolicy will be created every time, because it is not a singleton 2. local cache can only be useful to limit replica in each client, but useless for a cluster with many client (eg. presto workers ) because each client has different cache result. 3. additionally, in general, block location policies should not be concerned about block replication, as master will take care of that. ### Does this PR introduce any user facing changes? Two property keys are deprecated: `alluxio.user.ufs.block.read.location.policy.cache.size` and `alluxio.user.ufs.block.read.location.policy.cache.expiration.time`, as the cache is removed from the policy. pr-link: Alluxio#16187 change-id: cid-32c4cc50d580bbc05c31d8bdd361889e4f70f66d
### What changes are proposed in this pull request? Add a new block location policy `CapacityBaseDeterministicHashPolicy`. ### Why are the changes needed? We want a `CapacityBaseRandomPolicy` that is deterministic. See also #16187. ### Does this PR introduce any user facing changes? Yes, a new block location policy is available for config item `alluxio.user.ufs.block.read.location.policy` and `alluxio.user.block.write.location.policy.class`. pr-link: #16237 change-id: cid-47ba9b1d197b5ad546ac1a993590d49e963c3811
### What changes are proposed in this pull request? This effectively reverts Alluxio#15423 (Alluxio@3b1bdb3) For the problem that Alluxio#15423 was trying to solve, a new policy is proposed in Alluxio#16237. ### Why are the changes needed? 1. local cache is useless because BlockLocationPolicy will be created every time, because it is not a singleton 2. local cache can only be useful to limit replica in each client, but useless for a cluster with many client (eg. presto workers ) because each client has different cache result. 3. additionally, in general, block location policies should not be concerned about block replication, as master will take care of that. ### Does this PR introduce any user facing changes? Two property keys are deprecated: `alluxio.user.ufs.block.read.location.policy.cache.size` and `alluxio.user.ufs.block.read.location.policy.cache.expiration.time`, as the cache is removed from the policy. pr-link: Alluxio#16187 change-id: cid-32c4cc50d580bbc05c31d8bdd361889e4f70f66d
### What changes are proposed in this pull request? Add a new block location policy `CapacityBaseDeterministicHashPolicy`. ### Why are the changes needed? We want a `CapacityBaseRandomPolicy` that is deterministic. See also Alluxio#16187. ### Does this PR introduce any user facing changes? Yes, a new block location policy is available for config item `alluxio.user.ufs.block.read.location.policy` and `alluxio.user.block.write.location.policy.class`. pr-link: Alluxio#16237 change-id: cid-47ba9b1d197b5ad546ac1a993590d49e963c3811
…sticHashPolicy Add a new block location policy `CapacityBaseDeterministicHashPolicy`. We want a `CapacityBaseRandomPolicy` that is deterministic. See also Alluxio#16187. Yes, a new block location policy is available for config item `alluxio.user.ufs.block.read.location.policy` and `alluxio.user.block.write.location.policy.class`. pr-link: Alluxio#16237 change-id: cid-47ba9b1d197b5ad546ac1a993590d49e963c3811
…sticHashPolicy Add a new block location policy `CapacityBaseDeterministicHashPolicy`. We want a `CapacityBaseRandomPolicy` that is deterministic. See also Alluxio#16187. Yes, a new block location policy is available for config item `alluxio.user.ufs.block.read.location.policy` and `alluxio.user.block.write.location.policy.class`. pr-link: Alluxio#16237 change-id: cid-47ba9b1d197b5ad546ac1a993590d49e963c3811
What changes are proposed in this pull request?
This effectively reverts #15423 (3b1bdb3)
For the problem that #15423 was trying to solve, a new policy is proposed in #16237.
Why are the changes needed?
Does this PR introduce any user facing changes?
Two property keys are deprecated:
alluxio.user.ufs.block.read.location.policy.cache.size
andalluxio.user.ufs.block.read.location.policy.cache.expiration.time
, as the cache is removed from the policy.