Remove caching for CapacityBaseRandomPolicy #16187

Merged
merged 8 commits into from
Nov 1, 2022

@dengweisysu dengweisysu commented Sep 13, 2022

What changes are proposed in this pull request?

This effectively reverts #15423 (3b1bdb3)

For the problem that #15423 was trying to solve, a new policy is proposed in #16237.

Why are the changes needed?

  1. The local cache is useless because BlockLocationPolicy is not a singleton: a new instance is created every time, so the cache does not persist between uses.
  2. A local cache can only limit replicas within a single client. It does not help in a cluster with many clients (e.g. Presto workers), because each client holds different cached results.
  3. Additionally, block location policies in general should not be concerned with block replication, as the master takes care of that.
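
Points 1 and 2 can be illustrated with a minimal sketch. The classes below (`PolicyCacheDemo`, `CachingPolicy`) are hypothetical stand-ins and not Alluxio code; they only assume that the cache lives in an instance field and that a fresh policy object is created for each lookup.

```java
import java.util.HashMap;
import java.util.Map;

final class PolicyCacheDemo {
  /** Hypothetical policy with an instance-level cache; not the Alluxio implementation. */
  static final class CachingPolicy {
    private final Map<Long, String> mCache = new HashMap<>();
    private int mHits = 0;

    /** Returns the cached worker for the block, or picks one and caches it. */
    String getWorker(long blockId) {
      String cached = mCache.get(blockId);
      if (cached != null) {
        mHits++;
        return cached;
      }
      // Placeholder choice; a real policy would consult the worker list.
      String picked = "worker-" + Math.floorMod(blockId, 3L);
      mCache.put(blockId, picked);
      return picked;
    }

    int hits() {
      return mHits;
    }
  }

  /** Looks up the same block repeatedly, creating a new policy for each lookup. */
  static int hitsWithFreshPolicyPerLookup(long blockId, int lookups) {
    int hits = 0;
    for (int i = 0; i < lookups; i++) {
      CachingPolicy policy = new CachingPolicy();
      policy.getWorker(blockId);
      hits += policy.hits();
    }
    return hits;
  }

  /** Looks up the same block repeatedly through one long-lived policy. */
  static int hitsWithSharedPolicy(long blockId, int lookups) {
    CachingPolicy policy = new CachingPolicy();
    for (int i = 0; i < lookups; i++) {
      policy.getWorker(blockId);
    }
    return policy.hits();
  }

  public static void main(String[] args) {
    System.out.println("fresh policy per lookup: " + hitsWithFreshPolicyPerLookup(42L, 5));
    System.out.println("shared policy: " + hitsWithSharedPolicy(42L, 5));
  }
}
```

With a fresh instance per lookup the cache is always empty when it is consulted, so it never yields a hit. Even a long-lived instance would only coordinate lookups inside one client process.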

Does this PR introduce any user facing changes?

Two property keys are deprecated: alluxio.user.ufs.block.read.location.policy.cache.size and alluxio.user.ufs.block.read.location.policy.cache.expiration.time, as the cache is removed from the policy.

@alluxio-bot
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word must be capitalized

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

@dengweisysu dengweisysu changed the title use blockId hash value to decide worker instead of local cache Use blockId hash value to decide worker instead of local cache Sep 13, 2022
@alluxio-bot
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

All checks passed!

@alluxio-bot alluxio-bot added the API Change Changes covering public API label Sep 13, 2022
@dengweisysu dengweisysu changed the title Use blockId hash value to decide worker instead of local cache Use blockId to decide worker instead of local cache in CapacityBaseRandomPolicy Sep 13, 2022
@maobaolong maobaolong changed the title Use blockId to decide worker instead of local cache in CapacityBaseRandomPolicy Support to use blockId to decide worker instead of local cache in CapacityBaseRandomPolicy Sep 15, 2022
Contributor

@maobaolong maobaolong left a comment

Left a minor question

@Override
- protected long randomInCapacity(long totalCapacity) {
+ protected long randomInCapacity(Long blockId, long totalCapacity) {
Contributor

Why did you change the long to Long?

return null;
// blockId base hash value to decide which worker to cache data,
// so the same block will be routed to the same set of worker.
long sourceValue = blockId + ThreadLocalRandom.current().nextInt(mMaxReplicaSize);
Contributor

I wonder why this policy wants to take block replication into consideration. I think master has a dedicated background job taking care of the replication adjustment periodically. On the other hand, the other policies like LocalFirstPolicy etc. do not care about replications, either. WDYT @jiacheliu3 @beinan?

Contributor

I don't quite understand here either. @dengweisysu could you explain your routing logic here?

Contributor Author

@dbw9580 You are right, the master can take care of replication, so we can remove this random logic. But unlike LocalFirstPolicy, this policy is much like DeterministicHashPolicy, which uses 'alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards' to randomize the target worker.

Anyway, I agree with removing this random logic from this policy.

Comment on lines 80 to 81
// blockId base hash value to decide which worker to cache data,
// so the same block will be routed to the same set of worker.
Contributor

You have introduced randomness here, so even with hashing, this still doesn't give you the same set every time, does it?
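
A small sketch of the expression under discussion may help. It reuses `blockId + ThreadLocalRandom.current().nextInt(mMaxReplicaSize)` from the diff; the class and method names are hypothetical, and how the resulting value is mapped to a worker is not shown in the excerpt, so that step is left out.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ThreadLocalRandom;

final class HashWithJitterDemo {
  /** Mirrors the reviewed expression: blockId plus a random offset in [0, maxReplicaSize). */
  static long sourceValue(long blockId, int maxReplicaSize) {
    return blockId + ThreadLocalRandom.current().nextInt(maxReplicaSize);
  }

  /** Collects the distinct source values seen for one block over many calls. */
  static Set<Long> observedSourceValues(long blockId, int maxReplicaSize, int trials) {
    Set<Long> seen = new TreeSet<>();
    for (int i = 0; i < trials; i++) {
      seen.add(sourceValue(blockId, maxReplicaSize));
    }
    return seen;
  }

  public static void main(String[] args) {
    // Prints the distinct values observed for block 1000 with at most 3 replicas.
    System.out.println(observedSourceValues(1000L, 3, 10_000));
  }
}
```

For a fixed block the value is confined to at most `maxReplicaSize` distinct inputs, which bounds the candidates, but any single call may land on any of them. Whether different clients end up with the same set of workers depends on the mapping step that follows.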

@jiacheliu3 jiacheliu3 assigned dbw9580 and unassigned tcrain Sep 23, 2022
Contributor

@dbw9580 dbw9580 left a comment

@dengweisysu @maobaolong as per our discussion offline, we want a separate CapacityBaseHashPolicy, and the CapacityBaseRandomPolicy will continue to be a random policy, but with the caching issue fixed.

@dengweisysu
Contributor Author

dengweisysu commented Sep 27, 2022

> @dengweisysu @maobaolong as per our discussion offline, we want a separate CapacityBaseHashPolicy, and the CapacityBaseRandomPolicy will continue to be a random policy, but with the caching issue fixed.

I couldn't agree more. Should I close the PR first? @dbw9580

@dbw9580
Contributor

dbw9580 commented Sep 27, 2022

> > @dengweisysu @maobaolong as per our discussion offline, we want a separate CapacityBaseHashPolicy, and the CapacityBaseRandomPolicy will continue to be a random policy, but with the caching issue fixed.
>
> I couldn't agree more. Should I close the PR first? @dbw9580

Can you please make this policy a truly random one, as opposed to #16237? That is, make randomInCapacity pick a random worker instead of computing the hash of the block ID and picking one based on that.
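
One way to read "pick a random worker" by capacity is sketched below. This is an illustration under assumptions, not the code merged in this PR: the class `CapacityRandomPick`, its method names, and the use of worker names as plain strings are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ThreadLocalRandom;

final class CapacityRandomPick {
  /** Builds a cumulative-capacity index: each key is the running total after adding a worker. */
  static TreeMap<Long, String> cumulative(Map<String, Long> capacityByWorker) {
    TreeMap<Long, String> cdf = new TreeMap<>();
    long total = 0;
    for (Map.Entry<String, Long> e : capacityByWorker.entrySet()) {
      if (e.getValue() <= 0) {
        continue; // workers without capacity are never picked
      }
      total += e.getValue();
      cdf.put(total, e.getKey());
    }
    return cdf;
  }

  /** Maps a position in [0, totalCapacity) to the worker that owns that slice. */
  static String workerAt(TreeMap<Long, String> cdf, long position) {
    return cdf.higherEntry(position).getValue();
  }

  /** Picks a worker with probability proportional to its capacity, or null if none qualify. */
  static String pickRandom(Map<String, Long> capacityByWorker) {
    TreeMap<Long, String> cdf = cumulative(capacityByWorker);
    if (cdf.isEmpty()) {
      return null;
    }
    long total = cdf.lastKey();
    long position = ThreadLocalRandom.current().nextLong(total);
    return workerAt(cdf, position);
  }

  public static void main(String[] args) {
    Map<String, Long> caps = new LinkedHashMap<>();
    caps.put("A", 10L);
    caps.put("B", 30L);
    System.out.println(pickRandom(caps)); // "B" is expected roughly three times as often as "A"
  }
}
```

Because the position is drawn fresh on every call, the block ID plays no role and nothing needs to be cached.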

@dengweisysu
Contributor Author

@dbw9580 @maobaolong Is it necessary to keep the local cache logic? I would prefer to remove it and roll back to the first version of CapacityBaseRandomPolicy to keep things simple. The local cache solves only a small part of the problem and makes the logic much more complicated.

@dbw9580
Contributor

dbw9580 commented Sep 28, 2022

@dengweisysu Yes, I just reverted 3b1bdb3

@dbw9580 dbw9580 changed the title Support to use blockId to decide worker instead of local cache in CapacityBaseRandomPolicy Revert "Add cache for CapacityBaseRandomPolicy" Sep 28, 2022
@@ -5907,25 +5907,6 @@ public String toString() {
.setConsistencyCheckLevel(ConsistencyCheckLevel.WARN)
.setScope(Scope.CLIENT)
.build();
public static final PropertyKey USER_UFS_BLOCK_READ_LOCATION_POLICY_CACHE_SIZE =
Contributor

What's our policy for removing PropertyKeys? Should we deprecate them instead?

Contributor Author

If the Alluxio client can ignore an unknown config key without throwing an exception, I would prefer to remove it.

@dbw9580 dbw9580 changed the title Revert "Add cache for CapacityBaseRandomPolicy" Remove caching for CapacityBaseRandomPolicy Oct 18, 2022
Contributor

@dbw9580 dbw9580 left a comment

LGTM

Contributor

@jiacheliu3 jiacheliu3 left a comment

LGTM

core/common/src/main/java/alluxio/conf/PropertyKey.java (two outdated review threads, resolved)
dbw9580 and others added 2 commits November 1, 2022 11:03
Co-authored-by: Jiacheng Liu <jiacheliu3@gmail.com>
@jiacheliu3
Contributor

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit 8a66a64 into Alluxio:master Nov 1, 2022
jja725 pushed a commit to jja725/alluxio that referenced this pull request Jan 27, 2023
### What changes are proposed in this pull request?

This effectively reverts Alluxio#15423
(Alluxio@3b1bdb3)

For the problem that Alluxio#15423 was trying to solve, a new policy is
proposed in Alluxio#16237.

### Why are the changes needed?

1. The local cache is useless because BlockLocationPolicy is not a
singleton: a new instance is created every time, so the cache does not
persist between uses.
2. A local cache can only limit replicas within a single client. It does
not help in a cluster with many clients (e.g. Presto workers), because
each client holds different cached results.
3. Additionally, block location policies in general should not be
concerned with block replication, as the master takes care of that.

### Does this PR introduce any user facing changes?

Two property keys are deprecated:
`alluxio.user.ufs.block.read.location.policy.cache.size` and
`alluxio.user.ufs.block.read.location.policy.cache.expiration.time`, as
the cache is removed from the policy.

pr-link: Alluxio#16187
change-id: cid-32c4cc50d580bbc05c31d8bdd361889e4f70f66d
alluxio-bot pushed a commit that referenced this pull request Feb 23, 2023
### What changes are proposed in this pull request?

Add a new block location policy `CapacityBaseDeterministicHashPolicy`.

### Why are the changes needed?

We want a `CapacityBaseRandomPolicy` that is deterministic.

See also #16187.

### Does this PR introduce any user facing changes?

Yes, a new block location policy is available for the config items
`alluxio.user.ufs.block.read.location.policy` and
`alluxio.user.block.write.location.policy.class`.

pr-link: #16237
change-id: cid-47ba9b1d197b5ad546ac1a993590d49e963c3811
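
To make "a `CapacityBaseRandomPolicy` that is deterministic" concrete, here is a minimal sketch under stated assumptions. It is not the implementation of `CapacityBaseDeterministicHashPolicy`; the class `CapacityHashPick`, the multiplicative mixing constant, and the string worker names are hypothetical choices made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class CapacityHashPick {
  /**
   * Picks a worker for a block by hashing the block ID into [0, totalCapacity)
   * and looking up the owner of that position in a cumulative-capacity index.
   * Returns null when no worker has positive capacity.
   */
  static String pickForBlock(long blockId, Map<String, Long> capacityByWorker) {
    TreeMap<Long, String> cdf = new TreeMap<>();
    long total = 0;
    for (Map.Entry<String, Long> e : capacityByWorker.entrySet()) {
      if (e.getValue() <= 0) {
        continue; // workers without capacity are never picked
      }
      total += e.getValue();
      cdf.put(total, e.getKey());
    }
    if (cdf.isEmpty()) {
      return null;
    }
    // Hypothetical mixing step: multiply by a large odd constant so that
    // consecutive block IDs do not map to adjacent positions.
    long mixed = blockId * 0x9E3779B97F4A7C15L;
    long position = Math.floorMod(mixed, total);
    return cdf.higherEntry(position).getValue();
  }

  public static void main(String[] args) {
    Map<String, Long> caps = new LinkedHashMap<>();
    caps.put("A", 10L);
    caps.put("B", 30L);
    // The same block ID and worker list always yield the same worker.
    System.out.println(pickForBlock(7L, caps).equals(pickForBlock(7L, caps)));
  }
}
```

The result is stable only if every client iterates the workers in the same order and sees the same capacities, so a real policy would need a canonical ordering of the worker list.
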
bzheng888 pushed a commit to bzheng888/alluxio that referenced this pull request Mar 8, 2023
YangchenYe323 pushed a commit to YangchenYe323/alluxio that referenced this pull request Apr 16, 2023
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request May 16, 2023
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request May 16, 2023
Labels
API Change Changes covering public API priority-high