Assignment Metadata Store #423
Closed
This is the initial check-in for the future development of the WAGED rebalancer. All the components are placeholders; they will be implemented gradually.
* Add the configuration items of the WAGED rebalancer: Instance Capacity Keys, Rebalance Preferences, Instance Capacity Details, and Partition Capacity (the weight) Details. Also add tests to cover the new configuration items.
* Introduce the cluster model classes to support the WAGED rebalancer. Implement the cluster model classes with the minimum information necessary to support rebalancing. Additional fields and logic might be added later once the detailed rebalance logic is implemented. Also add related tests.
…ead of IdealState. (apache#398) ResourceAssignment fits the usage better, and no unnecessary information is recorded or read during the rebalance calculation.
…nment. (apache#399) This is to avoid unnecessary information being recorded or read.
* Implement Cluster Model Provider. The model provider is called in the WAGED rebalancer to generate the Cluster Model based on the current cluster status. The major responsibility of the provider is to parse all the assignable replicas and identify which replicas need to be reassigned. Note that if the current best possible assignment is still valid, the rebalancer won't need to recalculate the partition assignment. Also, add unit tests to verify the main logic.
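The replica-selection step described in the commit message above could look roughly like the sketch below. This is an illustration only: `ReassignmentSketch`, its method, and the flat replica-to-instance map are hypothetical names and data shapes, not the Helix cluster model provider API, and "still valid" is reduced here to "the recorded instance is still live".

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Helix.
final class ReassignmentSketch {
  /**
   * Returns the replicas that need a new placement: those with no recorded
   * instance in the best possible assignment, or whose recorded instance is
   * no longer live. An empty result means no recalculation is needed.
   */
  static Set<String> replicasToReassign(Set<String> allReplicas,
      Map<String, String> bestPossible, Set<String> liveInstances) {
    Set<String> toReassign = new TreeSet<>();
    for (String replica : allReplicas) {
      String instance = bestPossible.get(replica);
      if (instance == null || !liveInstances.contains(instance)) {
        toReassign.add(replica);
      }
    }
    return toReassign;
  }
}
```

When the returned set is empty, the caller can skip the assignment calculation entirely, which matches the note in the commit message.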
To react efficiently to changes happening to the cluster in the new WAGED rebalancer, a new component called ChangeDetector was added. Changelist: 1. Add ChangeDetector interface 2. Implement ResourceChangeDetector 3. Add ResourceChangeCache, a wrapper for critical cluster metadata
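A change detector of this kind typically compares an old and a new snapshot of cluster metadata. The sketch below shows that comparison under stated assumptions: `SnapshotDiff` and its fields are hypothetical, and metadata is simplified to a map from item name to a serialized value. It is not the ChangeDetector or ResourceChangeDetector code from this change.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical snapshot comparison for illustration; not the Helix interface.
final class SnapshotDiff {
  final Set<String> added = new HashSet<>();
  final Set<String> removed = new HashSet<>();
  final Set<String> changed = new HashSet<>();

  /** Classifies every key as added, removed, or changed between two snapshots. */
  static SnapshotDiff between(Map<String, String> oldSnap, Map<String, String> newSnap) {
    SnapshotDiff diff = new SnapshotDiff();
    for (Map.Entry<String, String> entry : newSnap.entrySet()) {
      String key = entry.getKey();
      if (!oldSnap.containsKey(key)) {
        diff.added.add(key);
      } else if (!Objects.equals(oldSnap.get(key), entry.getValue())) {
        diff.changed.add(key);
      }
    }
    for (String key : oldSnap.keySet()) {
      if (!newSnap.containsKey(key)) {
        diff.removed.add(key);
      }
    }
    return diff;
  }

  /** True when the two snapshots are identical. */
  boolean isEmpty() {
    return added.isEmpty() && removed.isEmpty() && changed.isEmpty();
  }
}
```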
This config will be applied to the instance when there is no (or empty) capacity configuration in the Instance Config. Also add unit tests.
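The fallback rule in the commit message above can be sketched as follows. `CapacitySketch` and the `Map<String, Integer>` capacity representation are assumptions made for this example, not the accessors added by the commit.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of Helix.
final class CapacitySketch {
  /**
   * Returns the instance's own capacity map when it is configured, and the
   * cluster-level default when the instance has no (or an empty) capacity map.
   */
  static Map<String, Integer> effectiveCapacity(Map<String, Integer> instanceCapacity,
      Map<String, Integer> defaultCapacity) {
    if (instanceCapacity == null || instanceCapacity.isEmpty()) {
      return defaultCapacity;
    }
    return instanceCapacity;
  }
}
```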
This is a constant that is no longer used.
…apache#365) Issue: When updating its snapshot, CurrentStateCache would miss all the existing partitions that had a state change. The RoutingTableProvider callback runs on the main event thread, and its time is not accounted for in the log. Description: Fix the bug by updating the snapshot with the correct reloadkeys. Enhance the log to account for user callback code separately. Tests: mvn test passed.
Previously, ClusterConfig would be read from ZK on every pipeline run. This PR makes it a selective read and also adds it to the set of all changed types, so that the cluster change detector can more easily tell whether ClusterConfig changed without having to store two copies of ClusterConfig objects.
Stabilize the REST tests with the following changes: 1. Remove the temporary cluster, which impacts the ClusterAccessor test. 2. Add start/end messages to all tests for debugging purposes. 3. Disable the unstable monitoring test for default MBeans. Sometimes we can query them and sometimes not. It is not a critical test path. Let's make it stable later.
Currently, the HealthReport is read with a single call for each participant. This change batches the calls to ZK to reduce the number of calls.
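One way to picture the selective read is a cache that reloads only after a change notification and records the change type whenever it does reload. In the sketch below, `SelectiveConfigCache`, `SketchChangeType`, and the `Supplier`-based loader are all hypothetical stand-ins for illustration; the actual data provider code in this PR may be structured differently.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical change type for this sketch only.
enum SketchChangeType { CLUSTER_CONFIG }

// Hypothetical cache for illustration; not the Helix data provider.
final class SelectiveConfigCache<T> {
  private final Supplier<T> loader; // stands in for the read from ZK
  private T cached;
  private boolean dirty = true; // force a load on the first refresh

  SelectiveConfigCache(Supplier<T> loader) {
    this.loader = loader;
  }

  /** Marks the cached config as stale, e.g. from a change callback. */
  void notifyChanged() {
    dirty = true;
  }

  /** Reloads only when stale, and records the change type if a reload happened. */
  void refresh(Set<SketchChangeType> changedTypes) {
    if (!dirty) {
      return;
    }
    cached = loader.get();
    dirty = false;
    changedTypes.add(SketchChangeType.CLUSTER_CONFIG);
  }

  T get() {
    return cached;
  }
}
```

Because the change type is recorded at reload time, a downstream detector can check the set instead of comparing two config copies.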
Upon a Participant disconnect, the Participant would carry over from the last session. This would copy all previous task states to the current session and set their requested states to DROPPED (for INIT and RUNNING states). It came to our attention that sometimes these Participants experience connection issues while the tasks happen to be in TASK_ERROR or COMPLETED states. These tasks would get stuck on the Participant and never be dropped. This issue proposes adding logic so that all tasks whose requested state is DROPPED are dropped immediately. Changelist: 1. Make sure all tasks whose requested state is DROPPED get added to tasksToDrop 2. Add a unit test: TestDropTerminalTasksUponReset
…on (apache#395) * Fix the CallbackHandler registration logic in DistributedLeaderElection that may leave a leader node with no callback registered. Our current initialization logic assumes a strict sequence of leader acquire/relinquish events. However, due to ZK events possibly carried over from the previous ZK session, the controller node change event might be triggered in the following sequence: 1. CALLBACK (from the previous session): Create new leader node and add handlers. 2. FINALIZE (handle the previous session expiry): Clean up handlers. 3. INIT (for the new session establishment): Expect to add the handlers back again. As a result, if the INIT event processing does not recover the handlers, the leader controller won't be able to manage anything. This fix ensures that every acquireLeadership call tries to initialize the leader controller's callback handlers. Also, add test logic in TestHandleNewSession to verify the fix. * Improve the leader history update logic so that no duplicate entry is recorded.
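The first changelist item can be sketched as a filter on the requested state alone, ignoring the current state. `SketchTaskState`, `TaskEntry`, and `DropSketch` are hypothetical types for this example, and the enum lists only the states named in the message above; the real task state handling in Helix is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical subset of task states, limited to those named in the commit message.
enum SketchTaskState { INIT, RUNNING, TASK_ERROR, COMPLETED }

// Hypothetical record of one carried-over task.
final class TaskEntry {
  final String name;
  final SketchTaskState current;
  final String requested; // requested state, e.g. "DROPPED", or null if none

  TaskEntry(String name, SketchTaskState current, String requested) {
    this.name = name;
    this.current = current;
    this.requested = requested;
  }
}

final class DropSketch {
  /**
   * Collects every task whose requested state is DROPPED, regardless of its
   * current state, so tasks in TASK_ERROR or COMPLETED are not left behind.
   */
  static List<String> tasksToDrop(List<TaskEntry> tasks) {
    List<String> toDrop = new ArrayList<>();
    for (TaskEntry task : tasks) {
      if ("DROPPED".equals(task.requested)) {
        toDrop.add(task.name);
      }
    }
    return toDrop;
  }
}
```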
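The CALLBACK, FINALIZE, INIT sequence above can be modeled with a small state sketch. `LeaderSketch` and its methods are hypothetical and heavily simplified (the leader node is a boolean that survives FINALIZE, and handler registration is a flag); it only illustrates why making handler initialization part of every acquire call recovers the handlers.

```java
// Hypothetical, simplified model for illustration; not DistributedLeaderElection.
final class LeaderSketch {
  private boolean leaderNodeExists = false;
  private boolean handlersRegistered = false;

  /** Invoked for both CALLBACK and INIT events. */
  void acquireLeadership() {
    if (!leaderNodeExists) {
      leaderNodeExists = true; // stands in for creating the leader node
    }
    // Always attempt handler initialization, even when the leader node
    // already exists, so a prior FINALIZE cannot leave the leader handler-less.
    if (!handlersRegistered) {
      handlersRegistered = true;
    }
  }

  /** FINALIZE for the expired session: handlers are cleaned up. */
  void onFinalize() {
    handlersRegistered = false;
  }

  boolean canManageCluster() {
    return leaderNodeExists && handlersRegistered;
  }
}
```

Replaying the problematic order (`acquireLeadership()`, `onFinalize()`, `acquireLeadership()`) leaves this model with handlers registered, whereas a version that skipped initialization when the leader node already existed would not.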
This reverts commit f2746c8.
This reverts commit c7e8e63.
Also fix the missing helix-agent snapshot update logic in bump-up.command.
No description provided.