Comparing changes

Commits on Sep 09, 2012
@ctasada ctasada - bumped up release version (0.96)
- updated release notes (0.96)
- Build: Parametrized Java version
- Build: Added -source parameter when compiling
- Build: Don't include dist/classes folders in the release files
Commits on Sep 23, 2012
@ctasada ctasada Merge remote-tracking branch 'voldemort/master'
# By Lei Gao (18) and others
# Via Chinmay Soman (4) and Lei Gao (2)
* voldemort/master: (37 commits)
  Updating release notes for 0.96 open source release
  Final review comments correction for autobootstrapper: Copyright, documentation and variable naming convention
  increased sleep time in RoutedStoreTest
  Removed unnecessary variables from AsyncMetadataVersionManager
  Bug fix in AsyncMetadataVersionManager. Client config parameters added for System store. Changed AdminClient to use a Timestamp instead of a counter for metadata version
  Modified behavior of getall to comply with javadoc when key does not exist in store
  Updating release version to 0.96 in
  Bug fix: initializing system store client in AdminClient during every operation to account for cluster.xml changes
  Added an instantInit flag to LazyClientStore, removed DefaultSocketStoreClientTest, returning DefaultStoreClient for Http protocol
  Code cleanup, bug fixes for system stores and auto-rebootstrapper
  Added file backed storage engine. Factored out ZenStoreClient from DefaultStoreClient (with a configurable switch). Added unit tests.
  undo changes to mbean registration names - don't use client context as part of the name
  add close method for shutting down client's SchedulerService in AbstractStoreClientFactory
  avoid jmx id from being incremented when factory is created for system stores
  code clean up
  Adding tests for system stores, local pref routing and Version manager. Fixed some things in PipelineRoutedStore
  add SchedulerService in voldemort client
  allows fetch-keys to fetch keys from system stores
  resolve merge conflict in stores.xml
  merge code for automatic rebootstrap
Commits on Oct 18, 2012
@ctasada ctasada Merge branch 'master' of a235956
Commits on Mar 26, 2013
Chinmay Soman Basic working prototype for Coordinator and Sample Thin client 4140295
Chinmay Soman Added the coordinator package. Modified Benchmark to use thin client 939d2cb
Chinmay Soman Creating FatClientWrapper for Thread pool isolation. 4cd607d
Chinmay Soman First working version of Coordinator service. Includes REST request a…
…nd response handling, Error handling, automatic checking of metadata changes, fat client config management
Chinmay Soman Fixed .classpath which had an illegal entry 791ae59
Chinmay Soman A working implementation of the Coordinator and thin client. Includes…
… following things:

- Creating AbstractStore and AbstractStorageEngine to refactor
  the corresponding Store and StorageEngine interfaces.
- Refactored the fat client to accommodate dynamic per call timeout.
- Isolated Fat client wrapper to safeguard multitenancy
- Autobootstrap mechanism added to the Coordinator service
- Basic HTTP request/response parsing and Error handling
Chinmay Soman Adding the missing R2Store file e7c2ae9
Chinmay Soman Moving thin client to contrib. Also fixing Benchmark to use DefaultSt…
…oreClient instead of Thin client
Chinmay Soman - Added GetAll and Delete implementations on the Coordinator (and the…
… temporary rest client)

- Converted Coordinator into an AbstractService and added CoordinatorConfig
- Refactored Composite Voldemort request into different types
Commits on Mar 31, 2013
@ctasada ctasada Merge remote-tracking branch 'voldemort/master' 90c0b5f
Commits on Apr 03, 2013
@zhongjiewu zhongjiewu make working in Mac per ctasada ceb7de8
Commits on Apr 04, 2013
Chinmay Soman Added unit tests to ensure that slops are registered for different as…
…ynchronous put operation failures
Commits on Apr 05, 2013
Chinmay Soman Bug fixes to HintedHandoffFailureTest and added more tests to handle …
…3-2-2 config. Removed SleepyForceFailStore
Commits on Apr 10, 2013
@vinothchandar vinothchandar Tool to forklift data over for store migrations 7234b65
Commits on Apr 11, 2013
@vinothchandar vinothchandar Adding unit test for fork lift tool 560bd67
@vinothchandar vinothchandar Clarifying arbitrary choice to return BEFORE for equal vector clocks. 74bdfde
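The tie-breaking choice mentioned in this commit can be illustrated with a simplified vector clock comparison. This is a minimal sketch, not Voldemort's actual `VectorClock` code: the class name `ClockCompareSketch`, the `Occurred` enum, and the representation of a clock as a map from node ID to counter are all assumptions made for illustration. A node missing from a clock is treated as having counter 0.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a vector clock comparison.
final class ClockCompareSketch {

    enum Occurred { BEFORE, AFTER, CONCURRENTLY }

    // Compares clock a against clock b. Each clock maps node ID -> counter.
    static Occurred compare(Map<Integer, Long> a, Map<Integer, Long> b) {
        boolean aBigger = false;
        boolean bBigger = false;

        // Walk the union of node IDs so entries present in only one clock count.
        Set<Integer> nodes = new HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        for (Integer node : nodes) {
            long av = a.getOrDefault(node, 0L);
            long bv = b.getOrDefault(node, 0L);
            if (av > bv) {
                aBigger = true;
            } else if (bv > av) {
                bBigger = true;
            }
        }

        if (aBigger && bBigger) {
            return Occurred.CONCURRENTLY;
        }
        if (aBigger) {
            return Occurred.AFTER;
        }
        // Reached when a is strictly behind b, and also when the clocks are
        // equal. Returning BEFORE for equal clocks is the arbitrary choice
        // the commit message refers to.
        return Occurred.BEFORE;
    }
}
```

Callers that need to tell equal clocks apart from strictly older ones would have to check equality separately before calling `compare`.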
Commits on Apr 12, 2013
@vinothchandar vinothchandar Forklift tool fix to equally spread fetches b8e6f78
Commits on Apr 15, 2013
Chinmay Soman Adding another Hinted handoff failure test to ensure main thread returns with failure when all replicas don't respond
Chinmay Soman Added bigger timeout to testNoSlopsOnAllReplicaFailures test and doin…
…g better exception handling in HintedHandoffFailureTest
Chinmay Soman Added Null pointer check in teardown of ReadOnlyStorageEngineTest db6ac44
Commits on Apr 17, 2013
@ctasada ctasada Merge commit 'db6ac447895255c84ef0e0cbd0303ffa6b45e05b' 266221c
Commits on Apr 18, 2013
@vinothchandar vinothchandar Adding basic support for abortable rebalances, more to follow fb8f1a8
@vinothchandar vinothchandar Adding proxy write tests 927b02c
@vinothchandar vinothchandar Proxy to donor implementation + tests d24d0b1
@vinothchandar vinothchandar Fixing test cases, have RebalanceController use ProtoBuf 881d7b3
Commits on Apr 22, 2013
@vinothchandar vinothchandar Adding @deprecated for proxy put param + cosmetic debug msg fixes 736ccc8
Commits on Apr 23, 2013
@abh1nay abh1nay Added a mode to the forklift tool to allow all conflicting versions
to be copied over to the destination
@abh1nay abh1nay Added unit test b245e3b
@abh1nay abh1nay Cleaned up code
Chinmay Soman Making the methods of RebalancePartitionsInfo synchronized so as to a…
…void the race condition in asMap and removeStore
Chinmay Soman Removing synchronized keyword from the static and private methods cc89784
Commits on Apr 24, 2013
@vinothchandar vinothchandar Zone N ary helpers + StoreRoutingPlan tests 626ea52
Chinmay Soman Adding a missing else block for throwing a store def mismatch exception 9991605
@abh1nay abh1nay Remove redundant callbacks 29da9ef
@abh1nay abh1nay Cleaned up logger statements aefddee
@abh1nay abh1nay More cleanup
Commits on Apr 25, 2013
@abh1nay abh1nay Updated release notes 6ca2f83
Commits on Apr 26, 2013
Chinmay Soman * Added monitoring (JMX) for fat client wrapper, netty server and req…
…uest handler

* Fixed some bugs and cleanup
* Added unit tests for dynamic timeout store and REST API validation
Chinmay Soman Moved slow store to single-slow-store.xml. Previously it was breaking…
… the existing tests
Chinmay Soman Added the ability to do Versioned puts. Added another test case for t…
…he same
Chinmay Soman Standardized config and cleaned up some code 18e3346
Commits on May 01, 2013
Chinmay Soman Fixing the faulty read-repair logic (removed duplicates, removed unne…
…cessary repairs). Added a test case to verify this in the presence of concurrent versions
Commits on May 03, 2013
@vinothchandar vinothchandar Added hook to checkpoint all BDB envs c5a143f
@vinothchandar vinothchandar Fix Typo 73802e0
Commits on May 08, 2013
@vinothchandar vinothchandar reimplement proxy put based on zone n-ary replicas logic 82f97c4
@vinothchandar vinothchandar Eliminate unnecessary proxy fetches 63beb58
@vinothchandar vinothchandar Follow fixes to abortable rebalancing f31e48f
@zhongjiewu zhongjiewu vector clock fix 8fd7ea6
@zhongjiewu zhongjiewu using treemap to store clocks in VectorClock 3a5e6f6
Commits on May 13, 2013
@vinothchandar vinothchandar Implementing atomic multi version puts to storage b743678
@vinothchandar vinothchandar Rewrite of InMemoryStorageEngine + config to control multiVersionPuts b3147d3
@vinothchandar vinothchandar 1. Enabling proxy puts by default
2. Bug fix in proxy put stats
3. Changing order of state change updates for correctness
4. Setting proxy put tests to do one batch rebalancing
Commits on May 15, 2013
@vinothchandar vinothchandar Rewording comments 22686c8
Commits on May 16, 2013
@zhongjiewu zhongjiewu added additional logging; minor change in test store classes 7e60cc9
@zhongjiewu zhongjiewu added new hintedHandoffSendHint test and remove old hintedHandOff test cbd4dbd
@zhongjiewu zhongjiewu refactored PerformParallelPutAction and related to ensure a slop is r…
@zhongjiewu zhongjiewu more refactoring to make sure response is handled once and only once 208e78c
@zhongjiewu zhongjiewu added end-to-end test for slops a1ac2dd
@zhongjiewu zhongjiewu additional logging 7ef135d
@zhongjiewu zhongjiewu additional slop fix c1400bd
@zhongjiewu zhongjiewu deprecate send hint serial 204615f
@zhongjiewu zhongjiewu more commits on slop fix ffe892d
Commits on May 17, 2013
@abh1nay abh1nay Allow update metadata to take both stores and cluster xml 6250c10
@abh1nay abh1nay atomic update of stores and cluster xml during rebalance 818d96b
@abh1nay abh1nay Added new end to end test for verifying the atomic update is consistent
on bootstrap
cleaned up code based off last code review
@abh1nay abh1nay Made the getserverStateLocked explicit
adding the new test case this time around
@abh1nay abh1nay Cleaned up the test 361bc02
@abh1nay abh1nay more cleanup on the test aaf95d0
@abh1nay abh1nay cleanup on tests 995b30a
@abh1nay abh1nay Fix unused variable fbe2589
Chinmay Soman Adding a null check for the versioned value object in convertStringTo…
…Object in MetadataStore. This was causing a small problem while restarting the Voldemort server
Commits on May 20, 2013
@voldemort Make sure elapsed time is not negative
System.nanoTime() can sometimes go backwards as it relies on
performance counters. This commit fixes exceptions that can
surface due to requestTime being negative. See github issue:
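The kind of guard this commit describes can be sketched as a clamp on the computed difference. This is a minimal illustration under stated assumptions, not the actual Voldemort change: the class `ElapsedTimeSketch` and the method `elapsedNs` are hypothetical names.

```java
// Hypothetical helper: never report a negative elapsed time.
final class ElapsedTimeSketch {

    // Returns endNs - startNs, clamped to zero so that a timer reading that
    // appears to go backwards cannot produce a negative duration.
    static long elapsedNs(long startNs, long endNs) {
        return Math.max(0L, endNs - startNs);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long elapsed = elapsedNs(start, System.nanoTime());
        System.out.println("elapsed ns = " + elapsed);
    }
}
```

Clamping hides the anomaly rather than explaining it, so a caller that cares about the cause may also want to log when the raw difference is negative.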
Commits on May 21, 2013
Chinmay Soman Small bug fixes and cleanup in R2Store fc89e8b
@voldemort Add MANIFEST.MF to git ignore
This is an autogenerated file. Adding it to the ignore list.
Commits on May 24, 2013
@vinothchandar vinothchandar Fixes to KeySampler/KeyVersionFetcher
-- Handle larger fetches without OOME
-- Adding support to use query-keys on hexstring keys, so as to be compatible with the keySampler/keyVersionFetcher
Commits on May 29, 2013
@jayjwylie jayjwylie Dropped META-INF/MANIFEST.MF a53be08
@voldemort Upgrade Google Collections lib to Guava lib 797b7c6
Commits on May 30, 2013
@ctasada ctasada Merge branch 'master' of
# By Vinoth Chandar (16) and others
# Via Siddharth Singh
* 'master' of (56 commits)
  Upgrade Google Collections lib to Guava lib
  Fixes to KeySampler/KeyVersionFetcher
  Add MANIFEST.MF to git ignore
  Small bug fixes and cleanup in R2Store
  Make sure elapsed time is not negative
  Adding a null check for the versioned value object in convertStringToObject in MetadataStore. This was causing a small problem while restarting the Voldemort server
  Fix unused variable
  cleanup on tests
  more cleanup on the test
  Cleaned up the test
  Made the getserverStateLocked explicit adding the new test case this time around
  Added new end to end test for verifying the atomic update is consistent on bootstrap cleaned up code based off last code review
  atomic update of stores and cluster xml during rebalance
  Allow update metadata to take both stores and cluster xml
  more commits on slop fix
  deprecate send hint serial
  additional slop fix
  additional logging
  added end-to-end test for slops
@ctasada ctasada Solves compilation problem with Java 7 113d234
Commits on Jun 03, 2013
@zhongjiewu zhongjiewu Merge pull request #146 from ctasada/issue132
Issue 132 - Java 7 Compilation Bug tested for compilation and smoke test
Commits on Jun 04, 2013
@vinothchandar vinothchandar Adding a Production worthy sample configuration
do a
   $ bin/ config/prod_single_node_cluster
to get started
Commits on Jun 05, 2013
@abh1nay abh1nay Reverting back change to on wire protocol for update metadata
to preserve backwards compatibility
Commits on Jun 06, 2013
@voldemort Check that zone value for a node is not null 96ff6d9
@abh1nay abh1nay Added functionality to get store schema

 curl http://localhost:8080/schemata/dGVzdA==
{"key-serializer": "SerializerDefinition(name = string, schema-info =
{}, compression = null)", "value-serializer": "SerializerDefinition(name
= string, schema-info = {}, compression = null)"}
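The last path segment in the curl example, `dGVzdA==`, is the Base64 encoding of the string `test`. That the segment is the Base64-encoded store name is an inference from this example rather than something the commit message states. Under that assumption, a client could build and decode the segment as sketched below; `SchemataPathSketch` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for the Base64 path segment seen in the curl example.
final class SchemataPathSketch {

    // Encodes a store name as standard (padded) Base64 over its UTF-8 bytes.
    static String encode(String storeName) {
        return Base64.getEncoder()
                     .encodeToString(storeName.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes a Base64 path segment back into the original string.
    static String decode(String segment) {
        return new String(Base64.getDecoder().decode(segment),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("test"));     // prints dGVzdA==
        System.out.println(decode("dGVzdA==")); // prints test
    }
}
```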
@abh1nay abh1nay Cleaned up code based on Vinoth's review f6e7b9c
@vinothchandar vinothchandar Adding sample configs for Voldemort Coordinator 8db2a31
Commits on Jun 11, 2013
@vinothchandar vinothchandar Implement metadata bootstrapping for Java Rest Client 3f04f58
Commits on Jun 14, 2013
Chinmay Soman Bug fix in RESTErrorHandler - adding the right content length in the …
…response header
Commits on Jun 15, 2013
@vinothchandar vinothchandar Update CONTRIBUTORS
Long missing update for the current Voldemort Team at Linkedin
Commits on Jun 18, 2013
@abh1nay abh1nay Fixed case with statsTrackingStore not being wrapped around properly
with DynamicTimeoutStoreClient since ser. and compression were disabled
@abh1nay abh1nay For some reason the method that I had added to retrieve storedefs was
missing; adding it back
@abh1nay abh1nay Some cleanup 4846ecc
@abh1nay abh1nay Added comments b8a8a4f
Commits on Jun 20, 2013
@voldemort Add more locations to gitignore 7daac20
@voldemort Remove unneeded check
adminClient doesn't change in the try block so no
check needed in the finally block.
Chinmay Soman Added 3 zone tests to, fixed a small bug in GetA…
Chinmay Soman Removed BannagePeriodFailureDetector parameter from RoutedStoreTest a…
…nd some beautification changes
@jayjwylie jayjwylie Partition balance analysis tool for more than two zones.
- Started a directory for all of our tools (src/java/voldemort/tools)

- PartitionBalance :
  - broke this util out of ClusterInstance to make it more self-contained
  - this changed the declaration of a lot of the methods that invoked analyzeBalance and analyzeBalanceVerbose

- ClusterInstanceTest  :  Added a bunch of tests to confirm that partition balance can be constructed (or not) for combinations of 2 & 3 zones
@jayjwylie jayjwylie Added RepartitionCLI. It is the subset of the RebalanceCLI that does …
@jayjwylie jayjwylie Initial version of repartitioning tool 'RepartitionCLI'.
- removed all "generate" options and code from this tool (now tools/RepartitionCLI)
- removed "analysis" option from this tool (now tools/PartitionAnalysisCLI)
- Fixed errors in help/usage docs

- Removed unnecessary options that restricted partition movement to be 'within a zone'. This decouples the repartition tool from the planning tool. This also assumes that optimizations can be done at plan time to keep all movement within a zone (when possible).
- cleaned up option names to be more concise
- Added options for target cluster/stores to allow zone expansion to be fully specified

- New class extracted from ClusterInstance
- Made all partition analysis zone-aware. This prevents zones with fewer nodes, or partitions, from skewing any analysis. I.e., each zone's balance analysis is now correctly normalized to be comparable with other zones' balance analyses.
- added inner ZoneBalanceStats class that makes stats tracking clearer
- added getUtility methods to hide the exact utility method from the interfaces
- Changed utility methods to sum over all zones and to combine zone primary analysis with nary analysis.
- changed/extended balancePrimaryPartitions to handle zone expansion
- Stripped out a bunch of unnecessary options/code. I.e., options/code that tried to minimize cross zone moves during repartitioning (rather than during planning!)
- Added some helper methods to clean up repeated code.

- test new helper methods
@jayjwylie jayjwylie Refactoring. Renamed RebalanceClusterUtils->RepartitionUtils and move…
…d some tests into new file RepartitionUtilsTest.
@jayjwylie jayjwylie Added tests of repartitioning algorithms.
- added arg for greedy zone IDs; not bothering exposing it via CLI though
- split method that identifies contiguous partition ID runs into two, better tested and now correct, methods

- fix div by zero error

- Documented the repartition method
- Reused the internal balancePrimaryPartitions method in the method that breaks apart contiguous partition runs
- dropped unused methods
- switched greedy algorithms to do either zone-by-zone or cluster as a whole. Unless we find cluster-as-a-whole leads to too much data movement, it is the behavior that we want.

- made all the get*Cluster and get*StoreDefs methods static and added some more. These are generally useful for tests beyond ClusterInstance.

- Actually test all of the repartitioning algorithms!
@jayjwylie jayjwylie copyright fix 0e17e99
@jayjwylie jayjwylie Initial work on RebalancePlanCLI.
Extracted the planning logic out of RebalanceController. This will reduce cut-and-paste coding, clarify which arguments are needed to tune planning (versus execution of a plan), and isolate rebalance planning logic from execution logic. There are a ton of TODOs added in this commit. As the refactoring continues, and the controller is changed to use the new RebalancePlan, these TODOs will be addressed. This commit has a working RebalancePlanCLI that handles the use cases rebalance planning has historically handled (rebalance in place and cluster expansion). This commit does not address zone expansion. The output of plan statistics is much richer and includes detailed statistics on cross zone moves (from which zone to which zone) and nodes (how much in and how much out). All these statistics are at the partition-store level which should be more informative than prior statistics. Finally, these statistics include an estimation of storage overhead per-node which can be used to assess how much free disk space is required to execute a plan safely.

- TODOs to deprecate this. This logic belongs in the plan, not the execution.

- TODOs for refactoring
- Has logic for doing things "the old way" and the "new way". The new way has simpler data structures and better encapsulation.
- Many more getters for detailed plan stats

- Minor TODOs and some renaming of variables

- A ton of refactoring TODOs
- Fixed all move counting getters to count partition-stores

- getStateString method to pretty print ID, host name and ports.

- more argument handling

- getters for partition-store accounting of current cluster+store defs

- Some methods to deprecate
- validation methods for cluster arguments

- New class to encapsulate all rebalancing planning

- Sub-classes of RebalancePlan that are needed for converting static plan into executable plan

- CLI for generating a plan

MoveMap & MoveMapTest
- 2-d map counter used in partition-store accounting
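A two-dimensional counter of this kind can be sketched as a map of maps keyed by source and destination ID. The sketch below is illustrative only and is not the `MoveMap` class from this commit: the class name `MoveCounterSketch`, its method names, and the use of integer IDs are assumptions.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical 2-d counter: counts[fromId][toId] = number of moves.
final class MoveCounterSketch {

    private final Map<Integer, Map<Integer, Integer>> counts = new TreeMap<>();

    // Adds `amount` moves from one ID (zone or node) to another.
    void add(int fromId, int toId, int amount) {
        counts.computeIfAbsent(fromId, k -> new TreeMap<>())
              .merge(toId, amount, Integer::sum);
    }

    // Returns the number of moves recorded for the (fromId, toId) pair.
    int get(int fromId, int toId) {
        return counts.getOrDefault(fromId, Collections.emptyMap())
                     .getOrDefault(toId, 0);
    }

    // Returns the total number of moves leaving fromId, across all targets.
    int totalFrom(int fromId) {
        int sum = 0;
        for (int c : counts.getOrDefault(fromId, Collections.emptyMap()).values()) {
            sum += c;
        }
        return sum;
    }
}
```

Sorted maps keep the iteration order stable, which helps when the counts are pretty-printed as a table of rows and columns.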
@jayjwylie jayjwylie I mangled the merge after a rebase. This commit manually fixes the er…
…rors I introduced. Mea culpa.
@jayjwylie jayjwylie RebalancePlanCLI working for 2-to-3 zone expansion
Stealers steal from first donor with desired partition-store. This currently biases all work towards donors with lowest node IDs. This is a truly imbalanced plan!

- Decorated with more TODOs...
- Changed stealer to pick first donor with desired partition-store
- Added commented-out code section that outlines next step for better planning

- Decorated with more TODOs

- Much clean up refactoring. Construction now generates plan.
- Methods added to get the plan (getPlan) and to print the plan (toString)

- Helper method that derives targetCluster from initialCluster and finalCluster.
@jayjwylie jayjwylie RebalancePlan de-insanification
This is a messy commit:

1) There are a ton of TODOs, a ton of code marked to be deprecated, and a lot of gymnastics to continue using data structures that ought to be simplified or removed. All of this is necessary until the RebalanceController is switched to use the RebalancePlan, and until the abortable rebalance work and this work are merged to master. Cleaning up shared data structures is too high risk while on separate branches.

2) Some of the RebalanceTest tests are currently failing. Going to debug and fix in next commit.

Commit notes:

- renamed cluster members to be target & final rather than current & target
- determines batch plan at construction time
- added batchPlan() method that
  - is more clear than prior planning method
  - optimized to do n-ary to n-ary migration within a cluster so that only new zones require cross zone moves

- minor change for method name change

- added internal member & method to track all nary-partition-stores per node
- get zone replica type methods that take partition id as input

- added / refactored suite of methods that validate pairs of clusters (current&target, current&final, target&final)

There are tests missing at this point in development:
- simple & basic tests of RebalancePlan
- test StoreRoutingPlan.getNaryPartitionIds (and possibly other methods too)
- tests for RebalanceUtils.validate.* methods
@jayjwylie jayjwylie Fixed broken junit tests. Documented and commented out other parts of…
… unit tests that rely on donor-based rebalancing or that rely on partition-stores being deleted during rebalance. Both features are currently (necessarily) broken.

- Fix latent bug that treated node ids as dense/contiguous

- minor refactor

- handle the case of an "empty" plan differently. Because unnecessary moves are optimized out at plan time, some of the small tests now yield plans with no data movement. I.e., primary partition IDs move among nodes, but data does not because all nodes already host all data. This exposed a corner case in which rebalance was successful, but servers never updated their cluster xml with new partition ID mapping.
- Removed random/unnecessary 10s sleep from control path. Annotated with a TODO for discussion to confirm this is right thing to do.

- minor code clarification

- Fixed latent bug (wrong value passed to sleep). Added TODO wondering why this sleep is here in the first place.

- How are isPartitionScanSupported() and isPartitionAware() related?

- add isPartitionAware that returns true. Is this correct!?

- Commented out asserts that test deletion during rebalancing

- Removed donor-based rebalancing tests from this test suite.
@jayjwylie jayjwylie Added unit tests for RebalancePlan and for cluster xform & verificati…
…on methods in RebalanceUtils

- Test the core use cases for rebalance planning:
  - no-op,
  - rebalance (zone/node topology same, but partition id layout changed),
  - cluster expansion (zone topology same, but new nodes added & partitions moved to them),
  - zone expansion (zone topology changes with partitions moved to new zone).

- rename method

- added getters for aggregate plan statistics

- Changed getLocalZonedCluster to accept array of node IDs. Necessary to generalize other test methods.

- pass storage type into helper methods that construct store defs
- changed all get???Cluster.*() methods to use new getLocalZonedCluster interface. Also added more such helper methods.

- Test the cluster transformation & verification methods

- simple clean up
@jayjwylie jayjwylie copyright updates. 80dc13a
@jayjwylie jayjwylie Addressing reviews of balance analysis & repartitionerCLI
- print out invalid metadata rate analysis

- added members that allow reverse lookup of partitionID to either ZoneId or NodeId

- Added reverse lookup of nodeID to Zone Primaries hosted.

- moved constants into RepartitionUtils
- Added validation of clusters & storedefs passed in.

- use lookup from Cluster object
- changed all contig partition ID methods to call the one method that does all the complicated work.

- Added constants for use in the utility function
- Broke the core, complicated method into small, reasonably-sized pieces.
- Added verbose print out of ZoneReplicaTypes (to complement old "replica type" print out)

- Document the recommended default constants for repartitioning
- Dump invalid metadata rate
- Clean up comments and java doc
- dumpInvalidMetadataRate analyzes and pretty-prints how many pseudo-masters per-zone per-unique-store-def lead to invalid metadata exception.
@jayjwylie jayjwylie Refactoring: cleaned up util methods and moved them to appropriate ut…
…il classes

- all dump cluster methods are now here with more consistent interfaces
- analyzeInvalidMetadataRate is here (for lack of better place to put it)

- moved all helper/util methods out of this class

- added removeItemsToSplitListEvenly, distributeEvenlyIntoList, and distributeEvenlyIntoMap

Moved test methods around to match new locations of util methods.
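The commit names several even-distribution helpers without describing their behavior. One plausible reading of "distribute evenly into a list" is splitting a total into bucket sizes that differ by at most one. The sketch below shows that idea only; it is a guess at the intent, not the Voldemort implementation, and `EvenSplitSketch.splitEvenly` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting a count evenly across buckets.
final class EvenSplitSketch {

    // Splits `total` items into `buckets` sizes that differ by at most one.
    // Any remainder is handed out one item at a time to the first buckets.
    static List<Integer> splitEvenly(int total, int buckets) {
        if (buckets <= 0 || total < 0) {
            throw new IllegalArgumentException("need buckets > 0 and total >= 0");
        }
        List<Integer> sizes = new ArrayList<>(buckets);
        int base = total / buckets;
        int extra = total % buckets;
        for (int i = 0; i < buckets; i++) {
            sizes.add(i < extra ? base + 1 : base);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(splitEvenly(10, 3)); // prints [4, 3, 3]
    }
}
```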
@jayjwylie jayjwylie Additional minor refactoring of utils and utils tests. cb5d292
@jayjwylie jayjwylie Refactoring: Renamed some classes and reorganized their location
RepartitionCLI -> RepartitionerCLI
RepartitionUtils -> Repartitioner

PartitionBalance & Repartitioner : voldemort/utils -> voldemort/tools

Test classes & files updated to correspond with src/java.
@jayjwylie jayjwylie Refactoring: More class renaming and file reorganization
ClusterInstance has been dropped.

ClusterInstanceTest is now PartitionBalanceTest and is in tools.

ClusterTestUtils is a new util class. All of the getFOOClusterBar and getFOOStoreDef(s)BAR methods have been moved into this test helper.
@jayjwylie jayjwylie Minor changes in response to review feedback and documenting a set of tests that are known to fail at this time.
@jayjwylie jayjwylie Initial commit of new rebalance controller.
This is an interim commit. The new RebalanceControllerCLI has been added. Portions of the RebalanceController have been updated to use the RebalancePlan. All the tests that should pass, still pass for the old RebalanceController. About to switch tests to use new RebalanceController that works with a RebalancePlan.
@jayjwylie jayjwylie Unit tests work with new rebalance controller.
- helper method to construct RebalancePlan based on cluster read from servers
- divide by zero fix
- many TODOs and Deprecate annotations...

- drop RebalanceClientConfig from construction and directly pass in the config parameters needed.

- do not test donor-based rebalancing. (TODOs to add it back.)
- do not test that keys are actually deleted when delete is true in rebalancing.
- use the new controller/plan code path in all rebalancing unit tests.
@jayjwylie jayjwylie Switched to new RebalanceController based on RebalancePlan and droppe…
…d tons of now deprecated code.

- delete of partition-stores during rebalancing has been removed and is no longer an option.
- the old RebalanceCLI now only offers the entropy tool features. cannot be dropped until SREs are OK with KeySampler/KeyVersionFetcher tool chain for verifying rebalances.
- Dropped RebalanceClientConfig. Configuration parameters are now explicit in pertinent constructors.
- Tests are failing or disabled:
  - donor-based rebalancing is disabled in test suites. Does not currently work with new rebalancing.
  - RebalanceClusterPlanTest tests mostly fail at this time.
  - RebalanceTest.serverSideRouting[1] may timeout
  - ZonedRebalanceTest.testProxyGetDuringRebalancing[1] may timeout

- many small changes to work with all other changes
- this class is to-be-deprecated regardless

- removed tons of deprecated code that permitted old controller to continue to work

- Now uses RebalancePlan
- No longer offers delete-after-rebalancing feature
- Dropped tons of deprecated code...

- no longer extends RebalanceClusterPlan
- the classes that make a generic plan executable as either stealer- or donor-based
- still need to be renamed so purpose of these key classes is more clear

- moderate clean up

- dropped unnecessary helper methods

- Significant clean up:
  - use Rebalance*BatchPlan and RebalancePlan
  - don't check keys deleted since no longer supported
@jayjwylie jayjwylie Continued refactoring and clearing out deprecated code.
- removed unnecessary members:

- renamed some member variables for sake of clarity
- switched from orderedClusterTransition.getId to batchCount to uniquely ID batches of work...

- added TODOs
- removed unnecessary members:

Patched up all tests to not have 'compile time' errors. Have not run tests though...
@jayjwylie jayjwylie interim commit. Some work on taking logic of OrderedClusterTransition and moving it into ExecutableBatches. Logic is broken though, and, more importantly, unnecessary. Given proxy-gets server from local zone and there can only be nominal perf gain from ordering rebalancing tasks at node level, OrderedClusterTransition needs to be removed from the code, not incorporated into new planning/execution code.
@jayjwylie jayjwylie Interim commit: OrderedClusterTransition has been removed from the co…
…de base.
@jayjwylie jayjwylie Interim commit before dropping ExecutableBatch objects
- removed TODOs about trying to "intelligently" order rebalance tasks

- switched to just using RebalancePlan until tasks are actually constructed that are either stealer- or donor-based.

- static method for pretty printing a list thereof

- clean up to deal with other type & interface changes...
@jayjwylie jayjwylie Dropped Rebalance*, the classes I thought we needed to …
…encapsulate an executable plan.
@jayjwylie jayjwylie Dropping RebalanceNodePlan, the class that sorted lists of RebalanceP…
…artitionsInfo by node id (either stealer or donor node), and then flattened the sorted list out and returned it again. AKA the identity operation on a list of RebalancePartitionsInfo.
@jayjwylie jayjwylie Fix on-wire protocol for RebalancePartitionInfoMap
Completed deprecation of member 'attempt' of RebalancePartitionInfoMap by making it optional and prefixing name with OBSOLETE.

This leads to big changes to auto generated file

Some other minor tweaks to RebalancePartitionsInfo and comments as well.
@jayjwylie jayjwylie Fixed RebalanceClusterPlanTest to be junit 4 and to work as well as p…
…ossible at this time. Test needs to be re-written to test new planning rather than old planning.
@jayjwylie jayjwylie Renamed RebalanceClusterPlan(Test).java to RebalanceBatchPlan(Test).j…
…ava. Cleaned up TODOs and comments and variable names in RebalanceBatchPlan.
@jayjwylie jayjwylie Cleaned up existing TODOs and added some javadoc comments. a2b950c
@jayjwylie jayjwylie Renamed Rebalance(Long)Test to Added …
@jayjwylie jayjwylie Clean up Rebalance(Plan|Controller)CLI
- set batch size default to INTEGER.MAX_VALUE

- Cleaned up handling of all optional arguments
@jayjwylie jayjwylie Remove no-longer needed test from RebalancePartitionsInfoTest; annota…
…ted another test as mostly failing..
@jayjwylie jayjwylie Address review comments for RebalancePlan
- factored logic to decide which donor to steal from out of constructBatchPlan.
- huge header comment about other policies to consider

- dropped comment about historic (arbitrary) sleep command

- added helper hacky method to "clone" cluster

- TODO to add a .clone() method

- cleaned up javadoc, TODOs, and method names

- set sleep to 30 seconds. Made comments and logger messages consistent with code.

- cleaned up javadoc for isPartitionAware()

- fix validation method to use safe(r) comparison.

- fixed tests to (mostly) only test plan invariants. Prior test code focussed on exact plan details and so was hard-coded to the implementation, rather than the interface.
@jayjwylie jayjwylie Minor changes to address second round of review feedback on rebalance…
… plan.
@jayjwylie jayjwylie Fix to store routing plan to correctly track zone-primaries per-node. ad4b726
@jayjwylie jayjwylie Addressed review feedback on new RebalanceController. 91c0b18
@jayjwylie jayjwylie Added unit tests for RebalancePlan, RebalanceBatchPlan, and AbstractZ…

Added following rebalance test case coverage:
- RebalancePlanTest / RebalanceBatchPlanTest / AbstractZonedRebalanceTest
- no-op / shuffle / cluster expansion / zone expansion
- 2-zone / 3-zone

Also did some clean up of rebalance test utils and rebalance tests in general.
@jayjwylie jayjwylie Fixes after rebase/dropped propagateCluster method
After the rebase with the atomic update of cluster/stores branch, had
some clean up to do. Got code running and unit tests passing again. Note
that had to be regenerated after the merge.

Noted that RebalanceUtils.propagateCluster was unnecessary (and
potentially dangerous) and so dropped it from the code.
@jayjwylie jayjwylie RebalanceController uses atomic cluster/store
- split storeDefs into currentStoreDefs & finalStoreDefs
- clean up some variable naming

- marked all override methods with @Override

- Implemented getPartitionList/getReplicatingPartitionList/getMasterPartition
- Use a single imaginary partition id ("0") to implement these methods
- This allows StoreRoutingPlan to be constructed for this strategy

- fix proxy put test
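
The "single imaginary partition" idea in the list above can be illustrated with a minimal, self-contained sketch. It is not the project's actual strategy class: `RouteToAllSketch`, its constructor, and the use of plain integer node ids are assumptions made for illustration, and the method signatures are only loosely modeled on the method names mentioned in the commit message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a route-to-all strategy; not the real Voldemort class.
final class RouteToAllSketch {
    // Every key is mapped to this one made-up partition id.
    private static final int IMAGINARY_PARTITION_ID = 0;

    private final List<Integer> nodeIds;

    RouteToAllSketch(List<Integer> nodeIds) {
        this.nodeIds = Collections.unmodifiableList(new ArrayList<>(nodeIds));
    }

    // Route-to-all: every node receives every request, regardless of key.
    List<Integer> routeRequest(byte[] key) {
        return nodeIds;
    }

    // Partition-oriented queries all answer with the imaginary partition.
    List<Integer> getPartitionList(byte[] key) {
        return Collections.singletonList(IMAGINARY_PARTITION_ID);
    }

    List<Integer> getReplicatingPartitionList(int partitionId) {
        return Collections.singletonList(IMAGINARY_PARTITION_ID);
    }

    int getMasterPartition(byte[] key) {
        return IMAGINARY_PARTITION_ID;
    }
}
```

Because the partition queries always return a well-defined answer, code that expects a partition-based strategy can be handed this one without special casing.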
@jayjwylie jayjwylie Added more zone expansion test for RebalanceBatchPlan d344725
@jayjwylie jayjwylie Completed unit tests for zone expansion
Added zone expansion unit tests to AbstractZonedRebalanceTest. This required tweaking the helper methods in RebalanceUtils and ClusterTestUtils.
@jayjwylie jayjwylie Added junit-rebalance and junit-long-rebalance build targets 92faebd
@jayjwylie jayjwylie Address review of initial RebalanceController work
- Removed 'timeout' option for rebalancing since always setting timeouts
  to infinite (Long.MAX_VALUE) is the recommended practice.

- Cleaned up some TODOs and added some javadoc in response to code
  review from Lei
@jayjwylie jayjwylie Fix invalid metadata rate calculation
Fixed the calculation of invalid metadata rate in RebalanceUtils. Also added more/different pretty printing of a rebalance plan.
@jayjwylie jayjwylie Minor TODO and copyright cleanup. c794504
@jayjwylie jayjwylie Fix some bugs I introduced and added more TODOs
Fixed overflow introduced in AdminClient.waitForCompletion by passing in
Long.MAX_VALUE for duration.
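
This kind of overflow typically appears when a duration of Long.MAX_VALUE is added to the current time and the sum wraps to a negative deadline. A minimal sketch of a saturating deadline calculation follows; `DeadlineSketch` and `deadlineMillis` are hypothetical names for illustration and are not taken from AdminClient.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; illustrates overflow-safe deadline arithmetic only.
final class DeadlineSketch {
    /** Returns nowMillis + duration, clamped to Long.MAX_VALUE instead of wrapping. */
    static long deadlineMillis(long nowMillis, long duration, TimeUnit unit) {
        // TimeUnit.toMillis already saturates at Long.MAX_VALUE on positive overflow.
        long durationMillis = unit.toMillis(duration);
        long deadline = nowMillis + durationMillis;
        // Adding a non-negative duration must not move the deadline backwards;
        // if it did, the addition wrapped around.
        if (durationMillis >= 0 && deadline < nowMillis) {
            return Long.MAX_VALUE;
        }
        return deadline;
    }
}
```

A wait loop can then compare the current time against the returned deadline without ever seeing a negative value for an "infinite" timeout.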

Verify cluster store definition in StoreRoutingPlan. This requires
working around existing problems with how system stores are handled (the
store definition is hard-coded for two zones). Left some TODOs about
testing and fixing all of this.

Added TODOs about currentCluster vs interimCluster. Need to tweak
interface to RebalanceController and RebalancePlan to be consistent with
recommended usage (i.e., deploying interimCluster before starting

Minor tweaks to tests based on above changes.
@jayjwylie jayjwylie Fix potential overflow in AdminClient.waitForCompletion 2bf1543
@jayjwylie jayjwylie Fix arg checking typo in RebalanceControllerCLI. d7f81d7
@jayjwylie jayjwylie Add 'proxy pause' between cluster change & rebalance
The proxy pause is a window during which clients can pick up the new
cluster metadata, and servers can establish proxy bridges, before
servers start moving data around for rebalancing. This allows us to
observe the cost of proxying separate from rebalance and allows clients
to pick up new metadata before any data is moved.
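
The ordering described above (publish new cluster metadata, pause, then move data) can be sketched as follows. This is an illustration under assumptions, not the RebalanceController code: `ProxyPauseSketch`, `runBatch`, and the two `Runnable` hooks are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the "proxy pause" ordering; not the real controller.
final class ProxyPauseSketch {
    static void runBatch(Runnable updateClusterMetadata,
                         long proxyPauseSec,
                         Runnable startRebalanceTasks) {
        // Step 1: publish the new cluster metadata so clients and servers can see it.
        updateClusterMetadata.run();

        // Step 2: wait so clients can pick up the metadata and servers can set up
        // proxy bridges before any data moves.
        if (proxyPauseSec > 0) {
            try {
                TimeUnit.SECONDS.sleep(proxyPauseSec);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted during proxy pause", e);
            }
        }

        // Step 3: only now start moving data.
        startRebalanceTasks.run();
    }
}
```

Keeping the pause as an explicit step between the two hooks makes it possible to measure proxying cost on its own, as the commit message notes.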

Also added some more TODOs about documenting the various rebalanceState
@jayjwylie jayjwylie Added "progress bar" for rebalance batch plan
Added RebalanceBatchPlanProgressBar
- progress tracking object for each rebalancing batch
- integrated with RebalanceController, AsyncRebalanceTask, and

Did other general clean up of logging during rebalance to make it
appropriately verbose (more verbose in some places, less verbose in
@jayjwylie jayjwylie Fixed cluster nomenclature: current, interim, final
Touched a lot of code to normalize variable names, method names, and
comments about clusters:
- current : current cluster in prod
- interim : current cluster + new nodes/zones w empty partitions
- final : final cluster w shuffled partitions and/or populated new

Added TODO about voldemort server throwing an exception on invalid cluster xml

Dropped TODOs that were unnecessary
@jayjwylie jayjwylie Change variable name in RouteToAllStrategy. e876dd1
@jayjwylie jayjwylie Fixed typo. f816900
@jayjwylie jayjwylie Address minor review feedback
Mostly, made variable names more clear

- cleaned up comments & TODO

- fix subtle bug in when proxyPause invoked

- changed zone expansion test to cover two use cases: when invoked from
  RepartitionerCLI with a current cluster versus with an interim
@jayjwylie jayjwylie Make KeyVersionFetcherCLI ZoneNAry aware ffc9b68
@jayjwylie jayjwylie Fix to print one timestamp per-line from KeyVersionFetcherCLI e40f793
@voldemort Add comments to the tools 3122693
@jayjwylie jayjwylie Make KeyVersionFetcherCLI have generous timeouts 0ef8faf
@jayjwylie jayjwylie Minor fix to KeyVersionFetcherCLI ec9cd1e
@jayjwylie jayjwylie Correctly print RebalanceTask at creation time. 77f3b69
@jayjwylie jayjwylie Fix final-cluster to be required arg of RebalancePlanCLI f55aace
@voldemort Print the utility value 9ee4ebe
@voldemort Write plan to a file inside the output dir aa96bbf
@voldemort Add zone expansion script c1fb3aa
@voldemort Add an option for choosing zone id for shuffle
Add a new option that lets the user choose if they want the
shuffle to operate only on one of the zone ids.
@voldemort Edit script to use the new option 407d4e4
@voldemort Address code review comments 9d9b47e
@voldemort Add rebalance new cluster script f9936c1
@voldemort Copy final cluster xml file to output dir
When the final cluster.xml is generated it is placed in
outputdir/step3 folder. Copy it to outputdir as well.
@voldemort Move rebalance scripts to bin directory
Moving the script from top level to the bin directory
@voldemort Add dummy-stores.xml + Update the path in script d2848a1
@abh1nay abh1nay Test for 3 zones 4f135d5
@abh1nay abh1nay Addressed code review comments
added more assertions
check for getZoneNary and getNodeIdForZoneNary in zone 1 and 2
Added a test for method 'storeRoutingPlan.zoneNAryExists()'
@vinothchandar vinothchandar Modifying RoutedStoreTest to use Consistent Strategy 3512cff
@abh1nay abh1nay Tests for system store with a 3 zone cluster a758c2a
@abh1nay abh1nay Fixed systemstore constants and got rid of the superfluous zone rep
@jayjwylie jayjwylie Fix RebalanceUtils.getLatestCluster.
- changed getLatestCluster to examine contents of cluster.xml on different nodes rather than timestamps since timestamps are incomparable across servers.

- added a specific rebalance test that goes from current -> shuffle and then shuffle -> current. This confirms that rebalance can be invoked repeatedly (if need be).
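
One way to compare cluster.xml contents instead of timestamps is sketched below. The commit message does not say how disagreements between nodes are resolved, so this sketch simply assumes that all nodes must hold identical content and fails otherwise. `LatestClusterSketch`, `agreedClusterXml`, and the map of node id to XML string are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: decide on the cluster definition by content, not timestamp.
final class LatestClusterSketch {
    /**
     * Returns the cluster XML shared by all nodes.
     * Assumption: any disagreement is an error (the real policy may differ).
     */
    static String agreedClusterXml(Map<Integer, String> clusterXmlByNode) {
        if (clusterXmlByNode.isEmpty()) {
            throw new IllegalArgumentException("No nodes to compare");
        }
        Integer referenceNode = null;
        String referenceXml = null;
        for (Map.Entry<Integer, String> entry : clusterXmlByNode.entrySet()) {
            if (referenceXml == null) {
                // The first node seen becomes the reference copy.
                referenceNode = entry.getKey();
                referenceXml = entry.getValue();
            } else if (!referenceXml.equals(entry.getValue())) {
                throw new IllegalStateException("cluster.xml on node " + entry.getKey()
                        + " differs from node " + referenceNode);
            }
        }
        return referenceXml;
    }
}
```

Content equality is meaningful across servers, whereas timestamps depend on each server's own clock.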
Commits on Jun 21, 2013
@voldemort Step 1 : Generate cluster script cleanup d2452a9
@voldemort Step 2: Add logic to handle zoned clusters 5d1e8e3
@voldemort Cleanup the arguments and add sanity check 3944a3c
@voldemort Add separate dummy stores for 2 and 3 zone cluster 3d8eb81
@voldemort Fix comments and typos 155b934
@singhsiddharth singhsiddharth Add 1.3.4 and 1.4.0 release notes 4d72ef4
@voldemort Bump release version 93b9495
Commits on Jun 24, 2013
@zhongjiewu zhongjiewu bug fixes in coordinator 33ea454
Commits on Jun 28, 2013
@jayjwylie jayjwylie Tweaks to PartitionBalance and RebalancePlan
- calculate partition-stores per zone.
- This measure provides more context to evaluate the size of any plans to rebalance the cluster.

- fix typo in verbose usage message
@jayjwylie jayjwylie Initial hack at new rebalance scheduler.
Added RebalanceController.scheduler
- limits each node to participating in a single task as either a stealer or donor.
- randomizes the order in which tasks are attempted to be scheduled
- not a clean implementation, but enough to evaluate.
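
The two scheduling rules listed above (one task per node at a time, randomized attempt order) can be sketched as a greedy selection. This is an illustrative sketch, not the scheduler added in the commit: `RebalanceSchedulerSketch`, `Task`, and `pickRunnable` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of a scheduler that lets each node take part in one task at a time.
final class RebalanceSchedulerSketch {
    static final class Task {
        final int stealerId;
        final int donorId;

        Task(int stealerId, int donorId) {
            this.stealerId = stealerId;
            this.donorId = donorId;
        }
    }

    /** Picks tasks to start now, given the nodes that are already busy. */
    static List<Task> pickRunnable(List<Task> pending, Set<Integer> busyNodes, Random random) {
        // Randomize the order in which tasks are attempted.
        List<Task> shuffled = new ArrayList<>(pending);
        Collections.shuffle(shuffled, random);

        Set<Integer> busy = new HashSet<>(busyNodes);
        List<Task> chosen = new ArrayList<>();
        for (Task task : shuffled) {
            // Skip the task if either participant is already a stealer or donor elsewhere.
            if (busy.contains(task.stealerId) || busy.contains(task.donorId)) {
                continue;
            }
            busy.add(task.stealerId);
            busy.add(task.donorId);
            chosen.add(task);
        }
        return chosen;
    }
}
```

A caller would invoke `pickRunnable` again whenever a task finishes, passing the nodes that are still busy, so that skipped tasks get another chance.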
@jayjwylie jayjwylie Tweak new rebalance scheduler
Randomized the order of rebalance tasks in each stealer's list. This
will avoid biasing the rebalance based on the order tasks were
@jayjwylie jayjwylie Addressed review feedback on rebalance scheduler
- Changed default parallelism to "infinite" since scheduler throttles parallelism
- Added TODOs for cleanup of scheduler
- Added javadoc to document scheduler and its methods
- Catch exceptions, log them, re-throw as VoldemortRebalancingException

A ton of white space changes due to futzing with eclipse code formatter preferences. Sorry.
@jayjwylie jayjwylie Refactor StoreRoutingPlan. Break Base from rest to expedite construct…
@jayjwylie jayjwylie Adding BaseStoreRoutingPlan. bf7daad
@jayjwylie jayjwylie Making StoreRoutingPlan even lighter-weight 5459942
@vinothchandar vinothchandar Update release_notes and for 1.4.2 release 4a95638