Comparing changes

This comparison is big! We’re only showing the most recent 250 commits
Commits on Jun 19, 2012
@zhongjiewu zhongjiewu added unregister MBeans when closing pool 1197e9b
@zhongjiewu zhongjiewu registers MBean in so that ClientS…
…ocketStats object will exist for testing
@zhongjiewu zhongjiewu moved registering JMX inside ClientRequestExecutorPool 320203f
@zhongjiewu zhongjiewu moved unregistering of the child stats beans into the parent stats f527daa
@zhongjiewu zhongjiewu changed checkout record code to make sure timeouts are recorded be22441
@zhongjiewu zhongjiewu Added concurrent tests for Histogram related code 32aa7d5
Commits on Jun 22, 2012
@vinothchandar vinothchandar finer timeouts and partial getalls. 10211c7
@vinothchandar vinothchandar Refactor TimeoutConfig 35f5681
@vinothchandar vinothchandar Reuse VoldemortOpCode f70b3c2
Chinmay Soman Fixed the conflicts ea0c80b
Commits on Jun 27, 2012
Lei Gao add a java file to define all system store constants, including syste…
…m store defs
Lei Gao add clientId for voldemort client b59973e
Chinmay Soman Adding System store functionality a3e9359
Chinmay Soman Added the Voldemort Client automated re-bootstrap mechanism 441a936
Lei Gao add clientId for voldemort client a21bf21
Chinmay Soman Adding System store functionality 234ac9b
Chinmay Soman Added the Voldemort Client automated re-bootstrap mechanism b2cbdf8
Chinmay Soman Adding System store functionality ea64953
Lei Gao client registry impl 7b3c8e2
Lei Gao add version and update time to client registry dbc2f54
Lei Gao fix a merge error ba4127c
Lei Gao fix merge conflicts 3fbef94
Commits on Jun 28, 2012
Lei Gao fixed additional bugs during merge d4e28a8
Lei Gao merge code for automatic rebootstrap e695a64
Commits on Jun 29, 2012
Lei Gao resolve merge conflict in stores.xml cb23d9f
Lei Gao allows fetch-keys to fetch keys from system stores 9a582b4
Commits on Jul 03, 2012
Lei Gao add SchedulerService in voldemort client 20b47b7
Chinmay Soman Adding tests for system stores, local pref routing and Version manage…
…r. Fixed some things in PipelineRoutedStore
Commits on Jul 06, 2012
@vinothchandar vinothchandar BDB+ Data cleanup Monitoring changes 88cb37c
Commits on Jul 10, 2012
Lei Gao code clean up 05f23ec
@zhongjiewu zhongjiewu Deleted unnecessary mbean registration 70718b4
Lei Gao Merge remote branch 'leigao/client-registry' into creg 35a887e
Lei Gao avoid jmx id from being incremented when factory is created for syste…
…m stores
@pbailis pbailis Adding extra debug messages for tracing in Voldemort fa3d6f6
@zhongjiewu zhongjiewu Merge pull request #83 from pbailis/master
Debug messages for initial Voldemort profiling
Commits on Jul 11, 2012
@stotch stotch Added default behavior, handling of long and short options for modify…
…ing defaults, zone support, random seed generation, removed text width formatting as it is not necessary, cleaned up format, refactored code, sys module no longer necessary (now that argparse is used) so removed it, added default interpreter path
@stotch stotch Fixing typo a560629
@zhongjiewu zhongjiewu Added log4j properties folder for junit test ffff0f7
Commits on Jul 12, 2012
@stotch stotch Fixed --seed argument handling so that it actually works, now 3aa9d12
Commits on Jul 13, 2012
@vinothchandar vinothchandar BDB Cache partitioning 986c8f2
@vinothchandar vinothchandar add minimumSharedCache param + more tests. 4e628fe
Commits on Jul 20, 2012
@stotch stotch Getting rid of some unused and deprecated code, added in a comment to…
… help explain and separate the partition shuffling
Commits on Jul 23, 2012
@pbailis pbailis Include log4j resources.dir in all tests via build.xml 3886bb1
@pbailis pbailis PUT returns null, leading to a NullPointerException when debug loggin…
…g is enabled in SocketStore
Commits on Jul 24, 2012
@zsimic zsimic Removed 'test-view' from config/single_node_cluster as bin/voldemort-…
… fails to start with it
Commits on Jul 25, 2012
Lei Gao add close method for shutting down client's SchedulerService in Abstr…
Commits on Jul 26, 2012
@stotch stotch Per Jay Wylie's review, no longer using a dict to store the partition…
… sets and fixed two typos
Commits on Jul 27, 2012
@jayjwylie jayjwylie Merge pull request #86 from stotch/gencluster-enhance enhancements
Commits on Jul 28, 2012
@jghoman jghoman Don't make a copy of the buffer, as it may shrink down to one element.
Also, use a bufferedoutputstream.
Commits on Jul 30, 2012
Lei Gao undo changes to mbean registration names - don't use client context a…
…s part of the name
Commits on Jul 31, 2012
@icefury71 icefury71 Merge pull request #88 from jghoman/badarraycopy
Don't make a copy of the buffer, as it may shrink down to one element.
@jayjwylie jayjwylie Merge pull request #77 from glinmac/python-client-disconnect
Handle disconnection
Commits on Aug 01, 2012
Chinmay Soman Updating the checksum code to handle computation for a buffer range d15b605
@zhongjiewu zhongjiewu Merge pull request #85 from pbailis/master
Enable in all tests and fix hidden debug NPE
Commits on Aug 02, 2012
Chinmay Soman Added file backed storage engine. Factored out ZenStoreClient from De…
…faultStoreClient (with a configurable switch). Added unit tests.
Commits on Aug 06, 2012
@jayjwylie jayjwylie Merge pull request #87 from zsimic/master
Removed test-view from config/single_node_cluster
Commits on Aug 07, 2012
@zhongjiewu zhongjiewu Added CNAME for github docs to accept ec190a0
@zhongjiewu zhongjiewu Changed CNAME file for test 3feca72
Commits on Aug 09, 2012
@pbailis pbailis Improving debug messages in request tracing 98643ea
Commits on Aug 10, 2012
@zhongjiewu zhongjiewu Deleted CNAME in master 4f8c978
@jayjwylie jayjwylie Cleaned up help/usage messages within the client shell. In particular…
…, print out possible values of meta_key for getmetadata and clarified how to use fetchkeys and fetch.
Commits on Aug 13, 2012
@zhongjiewu zhongjiewu Merge pull request #89 from pbailis/master
Improve debug messages in Voldemort
Commits on Aug 17, 2012
@zhongjiewu zhongjiewu Revert "Improving debug messages in request tracing"
This reverts commit 98643ea.
Commits on Aug 28, 2012
@icefury71 icefury71 Merge pull request #90 from jayjwylie/client-shell
Cleaned up help/usage messages within the client shell.
Commits on Aug 30, 2012
@vinothchandar vinothchandar - 'retention-frequency' to control frequency for running DataCleanupJob
- Use same jmxid as the factory across the board
- add server config to control socket backlog
- add count() jmx call to obtain number of k-v pairs from store
@vinothchandar vinothchandar Adding unregisterMBean calls 33c809e
Commits on Sep 06, 2012
@zhongjiewu zhongjiewu Fixed a bug that makes GetAll go to one more node than preferred 2b048f3
@zhongjiewu zhongjiewu More tests for getall fix da04c85
@zhongjiewu zhongjiewu Minor changes to tests 604324d
@zhongjiewu zhongjiewu refactored streaming and writing of voldemort admin data printing 614eab9
@zhongjiewu zhongjiewu Added admin option to query keys on specified nodes b18840b
@zhongjiewu zhongjiewu Deleted multiple store detection for --store option ef91db7
@zhongjiewu zhongjiewu Reuses --stores option and added multiple store query support. Fixed …
…bug that prevents queryKeys, fetchEntries, fetchKeys to print results beyond the first store
@zhongjiewu zhongjiewu Readded peter's change 878af04
Chinmay Soman Added tests for auto-rebootstrap, Failure detector fix to track just …
…one state of the topology (instead of immutable states), added ZenStoreClient
@zhongjiewu zhongjiewu Added possible fix for rebalance tests random failures bca4a12
@zhongjiewu zhongjiewu Corrected help message for a33b280 31bab19
Commits on Sep 11, 2012
Chinmay Soman Code cleanup, bug fixes for system stores and auto-rebootstrapper 440832e
Chinmay Soman Added a instantInit flag to LazyClientStore, removed DefaultSocketSto…
…reClientTest, returning DefaultStoreClient for Http protocol
Commits on Sep 12, 2012
Chinmay Soman Bug fix: initializing system store client in AdminClient during every…
… operation to account for cluster.xml changes
Chinmay Soman Updating release version to 0.96 in c54100d
Commits on Sep 13, 2012
@zhongjiewu zhongjiewu Modified behavior of getall to comply javadoc when key does not exist…
… in store
Commits on Sep 14, 2012
Chinmay Soman Bug fix in AsyncMetadataVersionManager. Client config parameters adde…
…d for System store. Changed AdminClient to use a Timestamp instead of a counter for metadata version
Commits on Sep 17, 2012
Chinmay Soman Removed unnecessary variables from AsyncMetadataVersionManager 78ae5d5
Commits on Sep 19, 2012
@zhongjiewu zhongjiewu increased sleep time in RoutedStoreTest c26373f
Commits on Sep 20, 2012
Chinmay Soman Fixed the merge issues from master. Fixed a bug in SocketStoreClientF…
@brentdmiller brentdmiller [#108] allow getting byte and object arrays using 87914fe
Commits on Sep 21, 2012
Chinmay Soman Final review comments correction for autobootstrapper: Copyright, doc…
…umentation and variable naming convention
Chinmay Soman Merge branch 'master' of into autoboot…
Chinmay Soman Updating release notes for 0.96 open source release 5b70b50
Commits on Oct 02, 2012
@zhongjiewu zhongjiewu Merge pull request #109 from brentdmiller/master
allow to dump byte & object arrays
Commits on Oct 04, 2012
@abh1nay abh1nay Avro build and push support removes azkaban dependencies and refactors
out all classes from batch commons
@abh1nay abh1nay Refactoring reducer logic 9a78082
@abh1nay abh1nay Added support for specifying the key and value field in the avro record 65e8bee
@abh1nay abh1nay some clean up 29d370b
@abh1nay abh1nay Refactored partitioner and mapper d5f9c9c
@abh1nay abh1nay Refactoring mapper logic out a63be3c
@abh1nay abh1nay Fixed bug in partitioner d74d185
@abh1nay abh1nay Fixed error message in case fetcher raises an exception 67627cb
@abh1nay abh1nay Added retry logic to fetcher 4732f8d
@abh1nay abh1nay Added destination name to the status message in the fetcher b9a91c4
@abh1nay abh1nay Added copyright and fixed code formatting 8e403b2
Commits on Oct 09, 2012
@jayjwylie jayjwylie Minor cleanup in preparation of making socket checkouts along non-blo…
…cking code paths asynchronous.

- fix missing ns to ms conversion

- remove unnecessary selectionKey from checkTimeout interface

- removed extraneous maxCreateAttempts
- attemptGrow returns true if pool grew
- refactored checkoutOrCreateResource
  - renamed to attemptCheckout
  - attemptCheckout is non blocking
  - attemptCheckout does not check timeouts
- refactored checkout
  - removed unreachable code (only one attempt could ever be made so attempts and maxCreateAttempts are extraneous).
  - made it clearer that exceptions control flow through try block
* even though code looks a lot different, I believe exact
  functionality is preserved except for one thing: the
  possibility of a single additional non-blocking get on the
  pool (resources) when attemptGrow returns true.
@jayjwylie jayjwylie Adding a SlowStorageEngine to permit end-to-end testing with slow
servers in a cluster.

- The SlowStorageEngine/Configuration is inspired by the
  SlowStore used in other unit tests. The SlowStorageEngine
  produces delays on a per-operation-type basis. Delays can
  either be concurrent or queued: concurrent delays can overlap
  in time, queued delays occur in serial.

- Added config options for SlowStorageEngine

- Unit test to confirm queued/concurrent delay behavior of a single SlowStorageEngine
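The difference between concurrent and queued delays described in this commit can be sketched in a few lines. This is only an illustration under stated assumptions, not the SlowStorageEngine code: the class `DelaySketch` and all of its method names are hypothetical. A queued delay is modeled by sleeping while holding a shared lock, and a concurrent delay by sleeping in the calling thread without any lock.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the actual SlowStorageEngine implementation.
final class DelaySketch {
    private final Object queueLock = new Object();

    // Concurrent delay: each caller sleeps independently, so delays overlap in time.
    void concurrentDelay(long delayMs) {
        sleepQuietly(delayMs);
    }

    // Queued delay: callers sleep while holding a shared lock, so delays occur in serial.
    void queuedDelay(long delayMs) {
        synchronized (queueLock) {
            sleepQuietly(delayMs);
        }
    }

    private static void sleepQuietly(long delayMs) {
        try {
            TimeUnit.MILLISECONDS.sleep(delayMs);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return early.
            Thread.currentThread().interrupt();
        }
    }

    // Starts `threads` callers at once and returns the total wall-clock time in ms.
    static long timeCallers(int threads, Runnable call) {
        Thread[] workers = new Thread[threads];
        long startNs = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(call);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
    }
}
```

With three callers and a 100 ms delay, the queued variant should take roughly three times the single delay, while the concurrent variant should take roughly one delay.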
@jayjwylie jayjwylie The current nonblocking API can actually block if a server slows
down. The server slowdown can result in a socket checkout
blocking because all sockets are exhausted and there is a queue
of threads doing a blockingGet to get the socket. Adding an
integration test, E2ENonblockingCheckoutTest, that demonstrates
this undesirable behavior. Once a fix is in place, this
integration test may become a unit test for the correct behavior.

- Demonstrates the blocking behavior of the current socket
  checkout code. The integration test sets up a local voldemort
  instance with three servers, one of which is slow. Puts are
  done by two threads to a specific key that uses a fast node as
  the master, then the slow node, followed by the other fast node
  for the parallel puts. The second thread to attempt parallel
  operations blocks upon the first returning the checkout
  socket. This is observed in the console output.

- Added another startVoldemortServer method that takes a VoldemortConfig

- minor fix to logger.debug print out so .toString is not called on null object
@jayjwylie jayjwylie Code clean up and refactoring in preparation of adding truly nonblock…
…ing requests.

- Moved logic from request async into ClientRequestExecutorPool::submitAsync
- Also moved private class NonblockingStoreCallbackClientRequest into ClientRequestExecutorPool

- added private class AsyncRequestContext to start parceling up request context
- added submitAsync and submitAsyncRequest methods. These currently do exactly what was done by SocketStore:requestAsync
- I.e., I do not believe I have changed the behavior of any of this code. I just moved it to where I need to change it.

- made some private methods protected with the expectation of
  extending this class with a subclass that can queue up async
  requests for resources
- changed many variable names to be more consistent in naming/usage
- refactored attemptCheckout to be simpler; added attemptCheckoutGrowCheckout to parcel up old behavior of attemptCheckout
- Do not believe I changed any behavior with these code changes.
- added getResourcePoolForExistingKey method
  - Throws IllegalArgumentException whenever a non existing pool is requested
  - changed all bare resourcePoolMap.get(key) accesses in other methods to use this method
  - May have changed behavior in that NPEs due to pool==null will now be IllegalArgumentExceptions in some methods
- Documented that some getFooCount methods are approximate in face of concurrency via comments in the code
@jayjwylie jayjwylie Additional refactoring and code prep before making connection checkou…
…t async.


Substantial refactoring and preparation of
ClientRequestExecutorPool to actually do an asynchronous
checkout. I believe I have essentially preserved prior
behavior. I.e., checkout of destination still blocks. But, the
pattern now looks a lot more like what is needed to do the
checkout asynchronously and then callback with the checked-out
resource. The exact exception strings likely changed during this
refactoring (but control flow should be the same).


Skeleton of new class QueuedKeyedResourcePool that extends
@jayjwylie jayjwylie Initial commit of asynchronous checkouts.
- print out more concise timestamp (don't print the date)
- print thread at end of message

- reverted all protected data members back to private
- refactored attemptGrow to do the size check to determine if it
  is worth trying to attempt to grow. This made the method more
  useful to subclasses and cleaned up the local member that
  called it.
- added an internalClose method that returns whether or not the
  caller is "the one thread" responsible for closing everything

- first complete implementation of this class
- Four TODOs left in the code for the sake of interim code review

- minor tweaks to the test
@jayjwylie jayjwylie Clean up of KeyedResourcePool and significant hardening of the unit t…

- Documented the invariants (or lack thereof) guaranteed by this class.
- Documented this class's expectations of its users.
- Moved attemptGrow into the inner Pool class.
- Got rid of the attemptCheckoutGrowCheckout method. ;)

- tweaked to match revised attemptGrow interface

- Added a bunch of 'negative' tests. I.e., they demonstrate
  non-desirable behavior of current KeyedResourcePool.
- Added a contention test that has many threads checkout,
  possibly invalidate, and then checkin resources for some key.
@jayjwylie jayjwylie Added unit test for QueuedKeyedResourcePool. Minor clean up of
QueuedKeyedResourcePool and of KeyedResourcePoolTest. Still one
outstanding TODO in QueuedKeyedResourcePool wrt semantics of
close(K key) method.
@jayjwylie jayjwylie Made ant target 'junit-test' produce a report like all the other
junit ant targets. The report is useful because for a test class
with many sub-tests, the report clearly explains which sub-tests
failed. Unlike other ant junit targets, this target also produces
a plain text report. The plain text report is useful for running
a single test in a loop until it fails.

Clarified the help message for the 'junit-test' ant target.
@jayjwylie jayjwylie Change SlowStorageEngineTest to avoid JUnit assertions in worker
threads. Also removed JUnit4 style annotations (@Test) since this
is actually a Junit3 style test (because it extends other Junit3
style tests that extend TestCase).
@jayjwylie jayjwylie Wrapped tests that hang because of my changes with timeouts. This
is necessary for me to debug these tests and easily run all the
other tests. (By 'hang', I mean get into a state where clients
run forever complaining that there are not enough servers.)
Wrapping with timeouts is done both at the Junit
level (i.e., "@Test(timeout = ...)") and at the ant level (i.e.,
for an entire unit test).

Switched the tests I was touching to use Junit4 idiom rather than
Junit3 idiom. I.e., removed the 'extends TestCase' from the class

Upped the memory allowed for all junit tests in ant from 1024m to
2048m since I ran into ant out of memory errors.
@jayjwylie jayjwylie Addressed most feedback from reviews by refactoring:
- Made ResourceRequest a first class entity rather than a nested interface
- Refactored TimeoutConfig to tease apart an OpTimeMap which may be more generally useful.
- Renamed slow storage configs in Voldemort config with to make their testing nature more clear.
- Dropped OperationDelays object from SlowStorage in favor of OpTimeMap
@jayjwylie jayjwylie Moved SlowStorageEngine into tests/integration. 0c2bd87
@jayjwylie jayjwylie Fix the problems with the RebalanceTest when using the revised
KeyedResourcePool. The test setup worked before the changes
because the KeyedResourcePool did not acquire resources until
full. With the new behavior, there is some mismatch between pool
sizes and numbers of threads. Increasing the number of admin
threads fixes the issue. There may be a better way of fixing this
@jayjwylie jayjwylie Addressed most of the feedback from the code review.
- Renamed many variables, methods & classes
- Addressed most of the TODOs in my changes based on the feedback
@jayjwylie jayjwylie src/java/voldemort/store/routed/action/
- clean up handling of ObsoleteVersionException to do what the
  client said should be done. This will stop
  ObsoleteVersionExceptions from being escalated to
  InsufficientOperationalNodes exceptions.

- Add connections/node to configuration of benchmark tool
- Set client timeouts to recommended values

- refactored a synchronized method to isolate the bare minimum
  steps that need to be synchronized. This allows local.complete
  to be called outside of 'synchronized' which avoids deadlocking
  nio selector threads.
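The last bullet describes a common pattern: update shared state under the lock, decide there whether the completion callback should fire, and invoke the callback only after the lock is released. The sketch below illustrates that pattern under assumptions of my own; `ResponseCollector` and its members are hypothetical names and do not come from the Voldemort source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names do not come from the Voldemort source.
final class ResponseCollector {
    private final Object lock = new Object();
    private final List<String> responses = new ArrayList<>();
    private final int expected;
    private final Runnable onComplete;
    private boolean completed = false;

    ResponseCollector(int expected, Runnable onComplete) {
        this.expected = expected;
        this.onComplete = onComplete;
    }

    void onResponse(String response) {
        boolean fireCallback = false;
        // Only the bare minimum is synchronized: recording the response
        // and deciding whether this call is the one that completes the set.
        synchronized (lock) {
            responses.add(response);
            if (!completed && responses.size() >= expected) {
                completed = true;
                fireCallback = true;
            }
        }
        // The callback runs outside the lock, so it may block or take other
        // locks without stalling threads that are still delivering responses.
        if (fireCallback) {
            onComplete.run();
        }
    }

    int responseCount() {
        synchronized (lock) {
            return responses.size();
        }
    }
}
```

The `completed` flag ensures the callback fires exactly once even if more responses arrive after the expected count is reached.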
@jayjwylie jayjwylie Implementations of various async vs sync queueing policies for socket…
… checkout. Two commented out policies are included in this commit.

- a few TODOs to be investigated before completing work on async checkouts

- a couple helper methods for implementing/debugging queueing policies

- refactor to clean up checkin method
- TODOs for further code cleanup
- cleaned up all methods for tracking stats, added stats tracking of length of synchronous queue
- various aspects of (commented out) socket checkout queuing policies

- fixes to async socket checkout
- various aspects of (commented out) socket checkout queuing policies
- TODOs for further code cleanup
- cleaned up stats tracking for async queue length

- minor tweaks/cleanup
@jayjwylie jayjwylie Removed the commented out implementations of distinct policies for as…
…ync socket checkout.
@jayjwylie jayjwylie Added Jmx interfaces for all queue stats we now track. Updated Client…
…SocketStatsTest as well.

Added a big TODO expressing concern over how statistics are tracked with suggestions for improvements.
@jayjwylie jayjwylie Minor cleanup --- changed some todos to documentation and comments. a28bcd9
@jayjwylie jayjwylie Minor changes to deal with remaining TODOs in this change. I still be…
…lieve there are some ugly code paths that fire off too many exceptions when we tear down a connection. Hopefully, the connection re-write that is starting off will clean up these ugly code paths.
@jayjwylie jayjwylie Additional Jmx Getters so that we can better understand stats sample …
@jayjwylie jayjwylie Copyright statement cleanup. f2466fa
@jayjwylie jayjwylie bumped up test timeouts since Hudson seems slower than local machine …
…for contention experiments.
@jayjwylie jayjwylie synchronize the reset of a specific keyed pool to avoid invoking dest…
…royResource on the same resource multiple times.
@jayjwylie jayjwylie Fixed E2E non blocking checkout test to actually check for non-blocki…
…ng checkouts. (Addresses review feedback from Chinmay Soman.)
@jayjwylie jayjwylie Switched SlowStorageEngine to take a StorageEngine<K,V,T> in the cons…
…tructor to be more flexible.
@jayjwylie jayjwylie Clean up get stats methods in (Queued)KeyedResourcePool. Fix error in…
… test case.
@jayjwylie jayjwylie Changes to make tight timing tests for QueuedKeyedResourcePool and Sl…
…owStorageEngine less sensitive on slower machines.
@jayjwylie jayjwylie Added @Override to some methods as Eclipse asked me to. Fixed missing…
… include in test.
Commits on Oct 10, 2012
@jayjwylie jayjwylie Minor fixes to various comments to clarify some implementation/usage …
@jayjwylie jayjwylie Fixed up ClientSocketStatsTest to match change in return code from -1…
… to 0.
Commits on Oct 11, 2012
@jayjwylie jayjwylie Changed both serial (sync) operations and parallel (async) operations…
… to deduct the elapsed checkout time from the operation (routing) timeout for specific requests.
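Deducting checkout time from the operation timeout amounts to simple arithmetic on a time budget. The following is a minimal sketch of that arithmetic, assuming nanosecond timestamps; `TimeoutBudget` and `remainingNs` are hypothetical names, not identifiers from the commit.

```java
// Hypothetical illustration; not the code changed by this commit.
final class TimeoutBudget {
    private TimeoutBudget() {}

    // Returns the time left for the operation once the checkout has finished.
    // The result is clamped at zero: a checkout that used up the whole budget
    // leaves nothing for the request itself.
    static long remainingNs(long operationTimeoutNs, long checkoutStartNs, long checkoutEndNs) {
        long elapsedNs = checkoutEndNs - checkoutStartNs;
        return Math.max(0L, operationTimeoutNs - elapsedNs);
    }
}
```

For example, a 500 ms operation timeout with a 120 ms checkout leaves 380 ms for the request, and a checkout that outlasts the timeout leaves zero.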
@jayjwylie jayjwylie Added comment to explain why NPEs can end up in the log during shutdown if there are async requests queued up.
@abh1nay abh1nay Added support for schema evolution for the Avro generic serializer 4b83abd
@abh1nay abh1nay Added unit test 6aaecc8
@abh1nay abh1nay Fixed comment
@abh1nay abh1nay Added check for schema backwards compatibility for Avro
in Admin Client tool and server startup
@abh1nay abh1nay Refactored checkcompatibility into Validator class as a static method f66c20d
@abh1nay abh1nay Changed test case to Junit 4 dc4d7b6
@abh1nay abh1nay fixed test case d919558
Commits on Oct 12, 2012
@abh1nay abh1nay code cleanup 5e00c76
Chinmay Soman Added the ability to auto-bootstrap on store definition changes 8804f79
Commits on Oct 15, 2012
Chinmay Soman Updated AsyncMetadataVersionManagerTest for checking individual store…
… definition updates. Added mechanism in VoldemortAdminTool to update individual store metadata version
@abh1nay abh1nay Added support for Avro schema evolution in RO Stores 1ba1779
@abh1nay abh1nay Added check in serializer
if writer's schema greater than reader raise an exception
@abh1nay abh1nay - Refactored schema check method
- added fix to the versioned serializer to support writing of objects
  created using the old schema
Commits on Oct 16, 2012
@abh1nay abh1nay Added comments to the new functions 5aacdc7
@abh1nay abh1nay Added testcase for schema evolution check 1524721
@abh1nay abh1nay fixed try catch in versioned avro serializer d685b67
@jayjwylie jayjwylie Adding a specific mini test that exercises ServerTestUtils.startVolde…
…mortServer. This ~15 line program that simply starts some Voldemort servers using test utils can tickle two different intermittent failures:

(1) ObsoleteVersionException when loading cluster.xml

Testcase: startMultipleVoldemortServers took 0.385 sec
	Caused an ERROR
A successor version version()  to this version() exists for key cluster.xml
voldemort.versioning.ObsoleteVersionException: A successor version version()  to this version() exists for key cluster.xml


(2) A bind issue characterized as follows:

Testcase: startMultipleVoldemortServers took 2.066 sec
        Caused an ERROR Address already in use
voldemort.VoldemortException: Address already in use
	at voldemort.server.niosocket.NioSocketService.startInner(
	at voldemort.server.AbstractService.start(
	at voldemort.server.VoldemortServer.startInner(
	at voldemort.server.AbstractService.start(
	at voldemort.ServerTestUtils.startVoldemortServer(
	at voldemort.utils.ServerTestUtilsTest.setUp(
Caused by: Address already in use
	at Method)
	at voldemort.server.niosocket.NioSocketService.startInner(
@jayjwylie jayjwylie Hardening test utils and tests to reduce the number of intermittent
BindException errors due to a TOCTOU issue with getLocalCluster.

The main improvement is the addition of
ServerTestUtils.startVoldemortCluster that wraps getLocalCluster and a
bunch of startVoldemortServer calls in a retry loop based on whether a
BindException occurs. This is suitable for ~75% of our test cases that
use getLocalCluster.
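The retry idea in the paragraph above can be sketched generically: since a port chosen earlier may be taken by the time the server binds (the TOCTOU window), the whole choose-ports-and-start step is repeated when a BindException occurs. This is an illustration under my own assumptions, not the startVoldemortCluster implementation; `BindRetry`, `Starter`, and `retryOnBind` are hypothetical names, and the real code may need to unwrap the BindException from another exception type.

```java
import java.net.BindException;

// Hypothetical illustration; not the actual ServerTestUtils code.
final class BindRetry {
    private BindRetry() {}

    // One attempt at picking ports and starting servers on them.
    interface Starter<T> {
        T start() throws BindException;
    }

    // Repeats the whole start step while it fails with BindException,
    // giving up after maxAttempts tries.
    static <T> T retryOnBind(int maxAttempts, Starter<T> starter) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return starter.start();
            } catch (BindException e) {
                // Another process grabbed a port between selection and bind;
                // loop so that fresh ports are chosen on the next attempt.
                last = e;
            }
        }
        throw new IllegalStateException("still failing after " + maxAttempts + " attempts", last);
    }
}
```

Retrying the complete step, rather than only the bind, matters because the stale port choice is what caused the failure in the first place.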

- Added startVoldemortCluster

- Tests for ServerTestUtils to reproduce intermittent failures

- TODO note about method that plays a role in another intermittent test failure involving cluster.xml

Switched test to use startVoldemortCluster

Junit3 -> Junit 4

Annotated with a TODO about the test still needing to be hardened
against TOCTOU issue with getLocalCluster:
@jayjwylie jayjwylie Additional hardening of tests to reduce the number of intermittent
BindException errors due to a TOCTOU issue with getLocalCluster.

- switched to startVoldemortCluster

- hand-coded test-specific startParallelVoldemortCluster. Not pretty. Not pretty at all. But, should retry in the face of such exceptions.

Switched TODO to comment about possible susceptibility to BindExceptions:
- test/unit/voldemort/client/rebalance/
- test/unit/voldemort/scheduled/
@jayjwylie jayjwylie updated copyrights on touched files. 7d60d6f
@jayjwylie jayjwylie Fixed two small errors I introduced while fixing tests (or merging wi…
…th master).
@jayjwylie jayjwylie Cleaned up comments and TODOs from prior commits. 61c11c7
@jayjwylie jayjwylie Added two tools for repeatedly running junit tests. Either specific tests, or all of junit. The benefit of these scripts is that the results from each run are archived to a temp directory. This allows you to stress test big changes, find intermittent failures, and so on.
Commits on Oct 17, 2012
@jayjwylie jayjwylie Hardened junit long test Other test hardening.
- bumped all maxmemory settings to 2048m
- Placed a 90 minute timeout on the long test at ant level.

- null out some objects in the hopes of reducing the overall memory footprint of these tests. We are truly abusing junit with a long-running, multi-threaded test, that has 10 sub tests and 4 distinct parameter settings.

- start of tests is not clear in junit log output. Added to start of tests to make grepping through the log when tests have failed badly and/or are running in an infinite loop easier.
- Bumped each test timeout up to 10 minutes. Again, note abuse of junit: tests should not be defined at the abstract class level. This makes it hard to set appropriate limits (such as timeout) for each specific test. Long tests should have a different timeout than short tests...
@jayjwylie jayjwylie Bumped release to 1.0.0 and added release notes. c6a6e21
@jayjwylie jayjwylie Tweaked release notes b6447a5
@jayjwylie jayjwylie Explained the versioning number change. 922ec3a
Commits on Oct 18, 2012
@jayjwylie jayjwylie Revised NOTE in the release_notes about version numbering. 5a021db
Commits on Oct 19, 2012
@vinothchandar vinothchandar initial commit - new duplicate handling 692b63f
@vinothchandar vinothchandar Code review changes 0d449a7
@vinothchandar vinothchandar Upgrading to JE 4.1.17 07e509d
@vinothchandar vinothchandar Implementing partition scans 4c1064e
@vinothchandar vinothchandar Partition scans - more tests, typo fixes fbe56d2
@vinothchandar vinothchandar Code review - partition scan 98182b6
@vinothchandar vinothchandar Adding Partition Scan support for rebalancing 68b31b9
@vinothchandar vinothchandar Add BDB params -- background_proactive_migration, level based eviction 7c9e2b0
@vinothchandar vinothchandar BDB data conversion utility 5e48136
@vinothchandar vinothchandar Fixing SlowStorageEngine and FileBackedCachingStorageEngine build issues 3d51f85
@vinothchandar vinothchandar Added parameters to control retention job
1. day of the week the retention job starts
2. if the retention job starts at the same hour each day
@vinothchandar vinothchandar Add RetentionEnforcingStore, with support for online retention on reads cd16456
@vinothchandar vinothchandar Resolving Conflict: Adding Imports back in test 3f1ec39
@vinothchandar vinothchandar Updating release notes and version c227e34
Commits on Oct 29, 2012
@jayjwylie jayjwylie Fixes for connection leak and ZenStoreClient config
- Applied fix for socketChannel leak in ClientRequestExecutorFactory.create()
- Added comments to document other code paths at risk of leaking socketDestinations
- changed ClientConfig default from ZenStoreClient to DefaultStoreClient
- updated release notes
Commits on Oct 30, 2012
@jayjwylie jayjwylie Bumped curr.release to 1.1.1 3c99c01
Commits on Oct 31, 2012
@jayjwylie jayjwylie Revert return type of Versioned.getVersion() to be Version rather tha…
…n VectorClock
@jayjwylie jayjwylie Prepared release 1.1.2 84eda3a
Commits on Nov 21, 2012
@abh1nay abh1nay Fixed mapper issue 60a987c
Commits on Nov 26, 2012
@abh1nay abh1nay fixed avro mapper d56a614
Commits on Nov 27, 2012
@abh1nay abh1nay Changed release number 349f852
Commits on Nov 29, 2012
@abh1nay abh1nay Implemented a jna mlock to map and pin index files for RO stores in
@abh1nay abh1nay Code cleanup 32fb76a
@abh1nay abh1nay Hardcoding indexmlock to true 763ff89
@abh1nay abh1nay debug msgs# Explicit paths specified without -i nor -o; assuming --on…
…ly paths...
@abh1nay abh1nay Fixed bug with typesetting of the native args by wrapping it into a
native wrapper
@abh1nay abh1nay Fixed constructor to actually take the mlock parameter
made it true by default
@abh1nay abh1nay Adding mlock for RO stores 99dc97f
@abh1nay abh1nay Added release notes 374d02a
Commits on Dec 06, 2012
@abh1nay abh1nay Add support for kerberized grids in the job by supporting protocols daa49bf
Chinmay Soman Added configurable Kerberos support to HdfsFetcher and upgraded hadoo…
…p jars to 1.0.2
Chinmay Soman Fixed the main method params in HdfsFetcher 7693d6b
Chinmay Soman Doing authentication in a synchronized block for the Hdfs fetcher, se…
…tting correct permission for the hadoop files
Chinmay Soman Fixed a jmx unregister bug in hdfs fetcher 0f2f85e
Chinmay Soman Fix in FsPermission constructor to maintain hadoop jar backwards comp…
Chinmay Soman Fixing a bug where we dont need to have hadoop conf in the classpath 2965762
Chinmay Soman Correcting usage: proxyUser to kerberosUser 577378e
Chinmay Soman Standardizing the Kerberos login phase : explicitly specify the Hadoo…
…p config and keytab path. Also assumes that extra kerberos related config parameters are passed to the Java process
Chinmay Soman Finalized changes to HdfsFetcher to make it work with a Kerberized Ha…
…doop cluster over webhdfs
Chinmay Soman Adding a hack to bypass Kerberos authentication for hftp protocol. TO…
…DO: remove this bypass once the libhadoop on Solaris is well tested
Commits on Dec 07, 2012
Chinmay Soman Adding KDC info and other JVM arguments to the voldemort scripts 7d93e6d
Chinmay Soman Removing the additional JVM args from the Voldemort scripts ef5420f
Chinmay Soman Creating constants for the default kerberos principal and keytab path fbe6718
Commits on Jan 02, 2013
@vinothchandar vinothchandar Stats to understand NIO layer performance + BDB exception counts et al dd29d0e
@vinothchandar vinothchandar NIO + BDB stats - Code review comments 5aa3716
Commits on Jan 04, 2013
Chinmay Soman Updated release_notes and release version to 1.1.7 including Kerberos…
… related changes
Commits on Jan 10, 2013
@jayjwylie jayjwylie Fix tests for testStartVoldemortCluster to not consume so much memory
- remove stress test from normal testing
- test the method startVoldemortCluster once and confirm it returns a non-null Cluster object

- clarified a comment
Commits on Jan 11, 2013
@vinothchandar vinothchandar Monitoring for streaming operations 903e749
@vinothchandar vinothchandar Improve batch modifications on BDB-JE ec3b37e
Chinmay Soman Added the ability to delete old checksum files in the Build and Push …
…reducer. Also updated the hadoop core library to version 1.0.4-p2-rc2 containing the fix to the User ID command usage.
Chinmay Soman Updated the hadoop-core jar to 1.0.4-p2 c611a8c
Chinmay Soman Fixing .classpath which had the wrong hadoop-core version. Also chang…
…ed the Mlock related info messages to debug.
Commits on Jan 12, 2013
@vinothchandar vinothchandar Minor debug fixes f97699b
Commits on Jan 14, 2013
Chinmay Soman Bug fixes in HdfsFetcher revealed by HdfsFetcherTest (NPE and hiding …
…the VoldemortSerializationException)
@vinothchandar vinothchandar Fixing intermittent test failures 2d12189
Commits on Jan 15, 2013
@vinothchandar vinothchandar Release notes & version update for 1.1.8 adc3eb9
@jayjwylie jayjwylie Change async checkout behavior to be more like sync checkout: only c…
…reate resources (connections) when they are needed.

    Now, connections should only be created on demand. The initial code created connections until the max limit of the pool was

    Some minor tweaks to test that confirm the desired behavior at a unit test level.
@jayjwylie jayjwylie Changed reset and destroyRequestQueue to avoid any heavy weight synch…
…ronization methods. This should fix deadlock issue.
@jayjwylie jayjwylie Remove an unnecessary synchronization primitive from keyedresourcepool. 1ac1b32
@jayjwylie jayjwylie Fixed Histogram: halved memory footprint, test boundary conditions, d…
…ropped unnecessary binary search.

These changes preserve/correct behavior of the current Histogram.

- Halved memory footprint by dropping unnecessary "bounds" array
- Dropped unnecessary binary search, making insert O(1) rather than O(log(nBuckets))
- Improved documentation
- Made interface consistent for type of values inserted/got from histogram (i.e., all are 'long')

- Added tests for boundary conditions: negative values are dropped on
  insert, "too large" values are bucketed in the final bucket on

Minor fixes to RequestCounter calls to histogram to conform to 'long' interfaces.
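
The insert behavior described in this commit message can be sketched as follows. This is a minimal illustration, not the project's Histogram class: the class name SketchHistogram, its constructor parameters, and the accessor methods are all hypothetical, and it assumes fixed-width buckets so that the bucket index can be computed by division instead of a binary search over a "bounds" array.

```java
/**
 * Hypothetical fixed-width histogram sketch (NOT the Voldemort Histogram class).
 * Assumes every bucket covers `step` units, so no "bounds" array is needed.
 */
final class SketchHistogram {
    private final int nBuckets;
    private final long step;
    private final long[] buckets;
    private long dropped; // count of rejected (negative) values

    SketchHistogram(int nBuckets, long step) {
        if (nBuckets <= 0 || step <= 0) {
            throw new IllegalArgumentException("nBuckets and step must be positive");
        }
        this.nBuckets = nBuckets;
        this.step = step;
        this.buckets = new long[nBuckets];
    }

    /** O(1) insert: the bucket index is computed directly from the value. */
    void insert(long value) {
        if (value < 0) {
            dropped++; // negative values are dropped on insert
            return;
        }
        long index = value / step;
        // "Too large" values are counted in the final bucket.
        int bucket = index >= nBuckets ? nBuckets - 1 : (int) index;
        buckets[bucket]++;
    }

    long countInBucket(int i) {
        return buckets[i];
    }

    long droppedCount() {
        return dropped;
    }
}
```

With 4 buckets of width 10, for example, the values 0 and 9 land in bucket 0, 10 lands in bucket 1, and both 39 and 1000 land in the final bucket.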
@jayjwylie jayjwylie Added INFO level messages to better understand statistics tracking (S…
…toreStats, SocketClientStats, Histogram) behavior. Expect to remove most of these messages after debugging.

- one more check to harden the implementation of insert

- durationMs from int to long
- print out timing of histogram reset (q95, q99, reset)
@jayjwylie jayjwylie Added INFO level messages to better understand connection creation. E…
…xpect to remove most of this after debugging.

- print out time to establish connection (if it takes longer than 1 ms)

- print out info about object creation (connection establishment). In
  particular, how many outstanding creations (connection
  establishments) are in flight and how many idle resources are in the
  pool after the newly created resource is checked in.
@jayjwylie jayjwylie Added INFO level messages to better understand resetting stats in Cli…
…entSocketStats. Expect to remove this after debugging.
@jayjwylie jayjwylie Reverting minor-minor version change in 515b139
@jayjwylie jayjwylie Minor tweak to reduce INFO logging for KeyedResourcePool::attemptGrow 7b8d30c
@jayjwylie jayjwylie Added INFO level messages to better understand the performance of per…
…-request instrumentation. Expect to remove most of this after debugging.
@jayjwylie jayjwylie Clean up INFO debugging messages to be DEBUG. Minor other clean up too. 03214c9
@jayjwylie jayjwylie Change monitoring interval in ClientSocketStats to have larger window. 4c69fb6
@jayjwylie jayjwylie Minor code clean up based on review feedback
Protected all logger.(debug|info) statements I added with an is(Debug|Info)Enabled() check.

Made AsyncRecoveryFailureDetector less verbose. When it polls a server to see if it is available, it now prints out a clean INFO level message. It had been printing out a WARN level message *with* a stacktrace that made this expected behavior look much scarier than it really is.
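
The guard pattern mentioned above (wrapping logger calls in an is(Debug|Info)Enabled() check) can be sketched as follows. This is an illustration under assumptions: it uses java.util.logging and isLoggable(Level.FINE) as a stand-in so the example is self-contained, whereas the commit refers to isDebugEnabled()/isInfoEnabled() style checks; the class GuardedLoggingSketch and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of guarding an expensive log statement. */
final class GuardedLoggingSketch {
    private static final Logger logger = Logger.getLogger("sketch.guarded");

    // Counts how often the expensive message is actually built.
    static final AtomicInteger describeCalls = new AtomicInteger();

    /** Stand-in for a costly string computation (for example, dumping pool state). */
    static String expensiveDescription() {
        describeCalls.incrementAndGet();
        return "idle=3, inFlight=1";
    }

    /** The message is only constructed when the level is enabled. */
    static void logGuarded() {
        if (logger.isLoggable(Level.FINE)) {
            logger.fine("pool state: " + expensiveDescription());
        }
    }

    static void setLevel(Level level) {
        logger.setLevel(level);
    }
}
```

The point of the guard is that string concatenation and any helper calls in the argument are skipped entirely when the level is disabled, which matters on per-request code paths.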
@jayjwylie jayjwylie Less verbose logging in the face of expected exceptions and error log…
…ging in the face of a slop not being written.

- less verbose logging when a node is unavailable

- less verbose logging in the face of expected "exceptional" responses.

- ensure that an error message is logged if a slop is not written (we can grep for "Slop write of key.*was not written" in logs)
- added TODO because ObsoleteVersionExceptions are treated as neither failures nor successes in the callback to sendHintParallel.

- minor fix

- from junit3 to junit4
@jayjwylie jayjwylie Reduce granularity of failure detector locking and do not destroy all…
… enqueued requests upon setUnavailable.

- Reduce amount of work done within synchronized section to reduce lock granularity and so ensure "side effects" of node being marked (un)available are not w/in sync section.

- Added TODO/comment to decide whether we want to actively destroy all connections upon node being marked unavailable
- Switched behavior to lazily destroying connections.
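
The locking change described in this commit message can be sketched roughly as follows. This is a hedged illustration, not the failure detector's actual code: NodeStatusSketch, its Listener interface, and holdsLockForTest() are hypothetical names. The idea shown is that only the state flip happens inside the synchronized section, while side effects (here, listener callbacks) run after the lock is released.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: keep the synchronized section minimal. */
final class NodeStatusSketch {
    interface Listener {
        void onUnavailable(int nodeId);
    }

    private final Object lock = new Object();
    private boolean available = true;
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    void setUnavailable(int nodeId) {
        boolean changed;
        synchronized (lock) {
            // Only the state transition is done under the lock.
            changed = available;
            available = false;
        }
        // Side effects run outside the synchronized section, and only on a real transition.
        if (changed) {
            for (Listener listener : listeners) {
                listener.onUnavailable(nodeId);
            }
        }
    }

    boolean isAvailable() {
        synchronized (lock) {
            return available;
        }
    }

    /** Lets a caller confirm that callbacks are not invoked while the lock is held. */
    boolean holdsLockForTest() {
        return Thread.holdsLock(lock);
    }
}
```

Because callbacks never run while the lock is held, a slow or blocking listener cannot stall other threads that only need to read or flip the availability flag.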
@jayjwylie jayjwylie Refactor all PerformParallel*Request classes.
- got rid of anonymous call back classes
- factored out waitForResponses logic and processResponses logic for most of these classes. GetAll stands out as being fairly different from the others.
- did not refactor to the point of sharing common code across classes, just refactored within each class.
- added many TODOs to the code for further refactoring.
@jayjwylie jayjwylie Interim checkin with an ugly example of how threads could be created …
…to handle callback work. The ugly code is commented out.
@jayjwylie jayjwylie Added stress test and cleaned up client request executor pool reset.
- changed behavior to match original KeyedResourcePool implementation. The original QueuedKeyedResourcePool.reset() was an unnecessary/bad behavior change that canceled enqueued requests. The original behavior was to destroy idle resources whenever pool is reset.

- stress test that has put and get threads contend for slow servers in such a manner as to trigger failure detection to mark nodes unavailable. This exercises connection tear down, reset(), and build up again. This also exercises the code paths in which callbacks do heavyweight work.
@jayjwylie jayjwylie Clean up of TODOs and cruft code that had been introduced. efe9f21
@jayjwylie jayjwylie Fix for possible race condition for resource creation in KeyedResourc… 5686a52
@jayjwylie jayjwylie Minor tweak to single stress test parameters. 4a3b985
@jayjwylie jayjwylie Minor changes to end-to-end stress test. 1c15b5a
@jayjwylie jayjwylie Substantial refactoring of (Queued)KeyedResourcePool tests.
- factored out a common base class in which all of the nested helper classes are defined
- separated out the various types of tests into files:
  - simple/basic tests
  - contention tests that spawn threads to generate contention
  - specific race condition test
- The specific race condition test for KeyedResourcePool shows that google issue 276 is resolved:
@jayjwylie jayjwylie Minor changes to tests
- renamed base keyedresourcepool test to avoid pattern that ant/junit uses to try and run tests.
- tweaked stress tests parameters once more to make it easier to run locally.
@jayjwylie jayjwylie Updated copyright notices in all files I touched. 79277d1
@jayjwylie jayjwylie Moved stress test for connection checkout/checkin and failure detecto…
…r to long unit test. Cleaned up comments in ClientRequestExecutorPool to make intended semantics clearer.
@jayjwylie jayjwylie Reverted refactoring of PerformParallel* classes committed in f37b25e…
@jayjwylie jayjwylie Fix to copyright notices that I accidentally changed by one year... a2c30c0
@jayjwylie jayjwylie Update release to 1.1.9 and updated release notes. 10353a5
@jayjwylie jayjwylie Prepare to release version release-1.1.9 292d278