1) Starting the Voldemort server sometimes fails with an ObsoleteVersionException. The code involved is an override used only by the tests, so it is safe to modify. 2) On Java 8, the ConsistencyFetcher test fails because the set has a different ordering, probably because of a difference in the hashCode implementation. Ordering is not important in this method, so I modified the test to ignore it.
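An order-insensitive check like the one described in item 2 can be sketched as follows. This is a minimal illustration, not the actual ConsistencyFetcher test code; the class and method names are hypothetical. Note that comparing through sets also ignores duplicates, which is only acceptable when the results are distinct.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Voldemort code base.
class OrderInsensitiveCompare {

    // Returns true when both lists hold the same distinct elements, in any order.
    // Duplicates are collapsed because the comparison goes through HashSet.
    static <T> boolean sameElements(List<T> expected, List<T> actual) {
        Set<T> expectedSet = new HashSet<T>(expected);
        Set<T> actualSet = new HashSet<T>(actual);
        return expectedSet.equals(actualSet);
    }
}
```

A test can then assert on `sameElements(expected, actual)` instead of comparing the iteration order of a hash-based collection, which is not guaranteed to be stable across Java versions.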
If you truncate a read-only store and then send read operations (get/getall) for one of the existing keys, the JVM crashes with a segmentation fault. Some of the data is cached in Java collections, and the code tries to access the memory-mapped file that was closed by the truncate. Now the Java collections are cleared and the code tries to handle this case gracefully. This is not ideal code; it is more of a workaround. Also stopped log messages from being spewed to standard output, as they are already captured by the log4j loggers. Added a unit test for the read-after-truncate scenario.
1) Made ReplaceNodeCLI unit-testing friendly: replaced the System.exit calls with thrown exceptions, and the program now quits in the main method instead of in each method. 2) Added post-condition checks to ReplaceNodeCLI to verify that stores.xml and cluster.xml are consistent. Added tests for ReplaceNode: 1) Check if a node can be swapped when it is down. 2) Check if a node can be swapped when it is up. 3) Check if you can move a hard disk and replace a node that is down. While the tests run, many client threads simulate traffic to make sure that clients don't see any exceptions.
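The pattern from item 1 can be sketched as below. This is an illustrative stand-in, not the real ReplaceNodeCLI; the class name, the single `<nodeId>` argument, and the exit codes are assumptions made for the example.

```java
// Hypothetical stand-in for a CLI class; names and arguments are illustrative only.
class ReplaceNodeSketch {

    // Work methods throw instead of calling System.exit, so a unit test
    // can call them directly and assert on the failure.
    static int parseNodeId(String[] args) {
        if (args.length != 1) {
            throw new IllegalArgumentException("expected exactly one argument: <nodeId>");
        }
        try {
            return Integer.parseInt(args[0]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("node id must be an integer: " + args[0], e);
        }
    }

    // Translates success or failure into an exit code without terminating the JVM.
    static int run(String[] args) {
        try {
            int nodeId = parseNodeId(args);
            System.out.println("Would replace node " + nodeId);
            return 0;
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return 1;
        }
    }

    // The only place that is allowed to terminate the JVM.
    public static void main(String[] args) {
        int exitCode = run(args);
        if (exitCode != 0) {
            System.exit(exitCode);
        }
    }
}
```

Keeping the JVM exit in `main` means a test can exercise `run` or `parseNodeId` with bad input and keep running afterwards.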
Share the same constants from SystemStoreConstants for stores.xml, cluster.xml, and metadata-versions. Previously each class declared its own constant, and searching for usages within the source was a nightmare: I had to do multiple file searches to piece the information together. This will make the dependencies easier to track in the future.
Refactored the client traffic verifier into its own class. No code change is made. This just moves code into a new separate class file and renames the instantiation and references. Other refactorings will follow this check-in.
* Fixing StoreClientConfigService and refactoring FileBasedStoreClientConfigService * Fixing Coordinator unit tests
1) Previously, when the value bytes could not be deserialized, it errored out immediately and did not print the values from other nodes. You could query node by node, but the failed nodes could never be retrieved. Now, if deserialization fails, the byte array output is printed. 2) When a node did not have a key, it printed invalid metadata exceptions for all other nodes. Now they are skipped in the output. 3) It does not report which nodes had the same value and which ones differed.
…onfigs now works.
Problems: 1) Exception handling of delete is very different at 4 places (on normal response, on required failure, on quorum failure, and after the pipeline is finished). 2) The exceptions are reported again and again (they are not removed from the map). 3) Some places ignore ObsoleteVersionException, while others report it. 4) There is a zombie abort state that can never be reached. 5) Multiple slops could be sent because of issue 2, and when the pipeline is aborted, no slops could be sent. 6) Refactored the QuotaLimitingStore test to add delete test cases. 7) Combined the PUT and GET quotas into 2 quotas. Solution: Defined a common method so that all 4 places call into the same method; only the condition for calling it differs. Race conditions still exist: an exception that arrives after the zone failure check but before the pipeline finishes will go missing and no slop will be sent for it, but the chances are reduced. Got rid of the state PerformDeletedHintHandoff, as QuotaExceededExceptions will not be reported as failures. The handoff is now done in place.
1) When the server fails to start because of an invalid store, the store is now logged in the error message so that the error is actionable. 2) Added checks to the zone proximity list to reject the zone's own id and duplicate zone ids. 3) Changed Zones from LinkedList to ArrayList, as the get operation is more efficient in ArrayList than in LinkedList. 4) Refactored the common code in zone calculations into common functions. 5) Added unit tests to cover the new checks on the zone proximity list.
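The kind of check described in item 2 might look like the sketch below. It is a simplified illustration under assumed names (`ZoneProximityCheck`, `validate`), not the validation code that was actually committed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical validator for illustration; not the actual Voldemort Zone code.
class ZoneProximityCheck {

    // Rejects a proximity list that contains the zone's own id or a repeated zone id.
    static void validate(int zoneId, List<Integer> proximityList) {
        Set<Integer> seen = new HashSet<Integer>();
        for (int otherZoneId : proximityList) {
            if (otherZoneId == zoneId) {
                throw new IllegalArgumentException("Zone " + zoneId
                        + " lists itself in its proximity list");
            }
            // Set.add returns false when the element is already present.
            if (!seen.add(otherZoneId)) {
                throw new IllegalArgumentException("Zone " + zoneId
                        + " lists zone " + otherZoneId + " more than once");
            }
        }
    }
}
```

For example, `validate(0, Arrays.asList(1, 2))` passes, while `validate(0, Arrays.asList(1, 0))` and `validate(0, Arrays.asList(1, 1))` both throw.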