Cache Node Assignment Service #958
Conversation
Force-pushed from 75e651e to da142b3
We already have code that does cache node assignment. How is this different from the current logic? On closer reading of the code, it seems we are adding dynamic chunks. It would be good to clarify the design intent here. If we want to support more chunks per cache node, why not increase the number of slots dynamically? Why do we need to change chunk assignment?
Hi Suman! Great to see you. Good question! Yes, we are adding dynamic chunks. However, the goal is not just to support more chunks per cache node but also to improve disk space utilization. I'm sure you already know, but chunks are not always a fixed size. While we aim for uniform size, they can vary. Since slots assume each chunk is the same size, this can lead to under-utilization of disk space if chunks are smaller than the assumed size. For example, with a 3TB disk and 15GB per slot, we get 200 slots. If a chunk is only 10GB, there’s still 5GB of unused space allocated for that slot. The change in assignment algorithm addresses this. We plan to use a bin-packing algorithm to pack all chunks into the least number of cache nodes, optimizing disk space usage and reducing costs. Let me know if that answers your question! And apologies for the lack of context in the PR—it's still very much a work in progress 😅
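To make the intent concrete, here is a minimal sketch of the kind of bin-packing pass described above (first-fit decreasing); the class and method names are hypothetical, not the PR's actual code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: pack chunks (by size in GB) onto the fewest cache nodes
// of fixed capacity using a first-fit-decreasing heuristic.
public class BinPackingSketch {
  record Node(long capacityGb, List<Long> chunksGb) {
    long usedGb() {
      return chunksGb.stream().mapToLong(Long::longValue).sum();
    }
  }

  static List<Node> pack(List<Long> chunkSizesGb, long nodeCapacityGb) {
    List<Long> sorted = new ArrayList<>(chunkSizesGb);
    sorted.sort(Comparator.reverseOrder()); // place the largest chunks first
    List<Node> nodes = new ArrayList<>();
    for (long chunkGb : sorted) {
      // use the first node with enough free space, otherwise open a new node
      Node target =
          nodes.stream()
              .filter(n -> n.usedGb() + chunkGb <= n.capacityGb())
              .findFirst()
              .orElseGet(
                  () -> {
                    Node fresh = new Node(nodeCapacityGb, new ArrayList<>());
                    nodes.add(fresh);
                    return fresh;
                  });
      target.chunksGb().add(chunkGb);
    }
    return nodes;
  }

  public static void main(String[] args) {
    // With fixed 15GB slots a 10GB chunk strands 5GB of disk; packing by actual
    // size lets the same node hold more chunks.
    List<Long> chunkSizesGb = List.of(10L, 12L, 15L, 8L, 14L, 11L);
    System.out.println(pack(chunkSizesGb, 3000L).size() + " cache node(s) needed");
  }
}
```

First-fit decreasing is only one possible heuristic; the point is that assignment is driven by actual chunk sizes instead of a fixed per-slot size.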
Hi Kai, great to see you again! Thanks for the context. It would be ideal to write a design doc or a discussion for this before we start coding, to document the goals of this change. It seems that the goal of this change is to improve disk utilization, which is a great goal. I am, however, curious why so much disk space goes unused in practice. If you are using size-based rollover, at any given time only a small percentage of chunks would differ from the cache slot size; in steady state the chunks should all be roughly the same size. If it is happening regularly, is it not a tuning issue? Also, even if we temporarily pack smaller chunks better (say, after a deployment), would we not get back to the same configuration as a fixed-slot setup once the next rollover happens in an hour? I am asking since I may be missing a case where there are longer-term advantages to this change.

Also, the current implementation with a static number of slots employs a constant-work paradigm for the entire cluster. This makes cluster operations mechanical, predictable, and easy to automate. The dynamic slot model proposed here makes the system much less predictable overall, which is a big sacrifice from a design perspective. To carefully evaluate the effect of these changes, it may be better to break this change down into a few smaller changes:
It is possible that we can recover the disk utilization after the first two changes. It seems we are also changing the cache node assignment in this PR; I think that change may be better done independently of this one. Overall, I think it may be ideal if we approach this change in a more incremental manner.
Force-pushed from da142b3 to 44f1592
Force-pushed from 99baf84 to 7c90558
astra/src/main/java/com/slack/astra/chunk/ReadOnlyChunkImpl.java
astra/src/test/java/com/slack/astra/chunk/ReadOnlyChunkImplTest.java
astra/src/test/java/com/slack/astra/chunkManager/CachingChunkManagerTest.java
astra/src/test/java/com/slack/astra/metadata/cache/CacheNodeAssignmentSerializerTest.java
astra/src/test/java/com/slack/astra/metadata/cache/CacheNodeAssignmentStoreTest.java
astra/src/test/java/com/slack/astra/metadata/cache/CacheNodeAssignmentTest.java
astra/src/test/java/com/slack/astra/metadata/cache/CacheNodeMetadataStoreTest.java
astra/src/main/java/com/slack/astra/clusterManager/CacheNodeAssignmentService.java
Force-pushed from 75e8c77 to 5e26cf0
Looking pretty good! This is a pretty huge change, but you've done a good job of isolating the new behavior behind a "default false" system property.
I took a look over the code coverage as well, and it looks like you did a decent job of exercising the new paths. I think with a couple small tweaks this is probably good to merge up.
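As a rough illustration of that gating pattern (the property name and method names below are hypothetical, not necessarily the ones used in this PR):

```java
// Hypothetical sketch of gating new behavior behind a system property that defaults
// to false, so existing clusters keep the current fixed-slot behavior unless the
// flag is explicitly enabled.
public class FeatureGateSketch {
  private static final boolean DYNAMIC_CHUNK_ASSIGNMENT_ENABLED =
      Boolean.parseBoolean(System.getProperty("astra.dynamicChunkAssignment", "false"));

  void initialize() {
    if (DYNAMIC_CHUNK_ASSIGNMENT_ENABLED) {
      // new path: follow assignments published by the cache node assignment service
      listenForAssignments();
    } else {
      // existing path: register a fixed number of cache slots
      createFixedSlots();
    }
  }

  void listenForAssignments() {}

  void createFixedSlots() {}
}
```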
astra/src/main/java/com/slack/astra/chunkManager/CachingChunkManager.java
astra/src/main/java/com/slack/astra/clusterManager/CacheNodeAssignmentService.java
astra/src/main/java/com/slack/astra/clusterManager/ClusterHpaMetricService.java
astra/src/main/java/com/slack/astra/clusterManager/ClusterMonitorService.java
Force-pushed from 1d2075c to a660974
Force-pushed from ace315d to 01852ca
Force-pushed from ad99d00 to 9a17959
Includes tests, serializers, proto file changes
Includes changes to ReadOnlyChunkImpl
Force-pushed from 9a17959 to 6c0c33e
Couple of nits, but I would consider addressing them in a future PR. I'm not seeing anything that sticks out that we haven't discussed, and you have already successfully deployed this to several test environments.
@@ -128,6 +171,142 @@ public ReadOnlyChunkImpl(
    LOG.debug("Created a new read only chunk - zkSlotId: {}", slotId);
  }

  /*
  ======================================================
Nit: These types of comments tend to go stale pretty quick, as folks refactor and miss a comment block several methods up when they're adding new methods.
    snapshotMetadataStore.close();
    searchMetadataStore.close();

    LOG.debug("Closed chunk");
Nit: This would be more useful if it contained information about which chunk was closed; otherwise this isn't much different from a metric.
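For example, something along these lines (hypothetical, assuming the slot id is in scope here the way it is in the constructor log above):

```java
// Identify which chunk was closed so individual chunk lifecycles can be traced from logs.
LOG.debug("Closed chunk - zkSlotId: {}", slotId);
```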
Summary
This PR adds a basic draft of the cache node assignment service plus tests. Several other changes were added in order to support the service; they include:
- CacheNodeAssignment and CacheNodeMetadata classes (plus serializers, ZK store classes, proto file changes)
- CachingChunkManager listens for changes in assignments and downloads data as appropriate

Requirements