in the hdfsfetcher and simulate retry logic. This ensures checksum calculation is robust
…s used to catch the ClassNotFound exceptions
…This is used to handle the case where we might retry the copy in case of a Filesystem (hdfs) error.
…er. Renamed many classes and methods related to FetchStreamRequestHandler. - All sub-classes of FetchStreamRequestHandler have been renamed to have a more consistent nomenclature. - Did some further refactoring in the FullScan* classes to move more work from leaf classes to FullScanFetchRequestHandler.java - moved scan accounting to overall base class - Added getNodesPartitionIdForKey method to StoreInstance to help with some fetch logic
…amed it to KeyVersionFetcherCLI). - mostly usability changes about command line options... - one copyright fix
KeySamplerCLI - added options: --store-names, --partition-ids, --keys-per-second-limit, and --progress-period-ops - got rid of unnecessary (and weird) retry loop. Can add something like that later if needed. - pass all partitions to fetcher now instead of one-at-a-time Also did cosmetic fixes for KeyVersionSamplerCLI and Entropy.java
…andlers. Expanded AdminFetchTest. Added more common helper methods to common base class of all fetchers FetchStreamRequestHandler. Added abstract base classes for partition-based fetching and non-partition-based fetching: - FetchPartitionStreamRequestHandler (partition-based) - FetchItemsStreamRequestHandler (non-partition-based) Refactored some code up to abstract base classes and made implementations as similar as possible (without heroic efforts) across all fetchers: - FetchEntriesStreamRequestHandler - FetchKeysStreamRequestHandler - FetchPartitionEntriesStreamRequestHandler - FetchPartitionKeysStreamRequestHandler Significantly better test coverage in AdminFetchTest - tests fetching keys as well as fetching entries - tests partition-aware and non-partition-aware servers - tests per-partition limits on entries/keys fetched All of this clean up and additional testing led to minor correctness fixes. Minor other clean ups of comments, override annotations, and fixes for KeySamplerCLI.
These are cosmetic changes. The client-side and server-side code does not properly do recordsPerPartition yet. Added a few TODOs in the code too.
AFAIK skipRecords was never used. By inspection, the code that would have been exercised if it had been used has never been correct. Removing skipRecords from the code base. Also: - Added a number of TODOs to the code from the reviews - Changed some variable names
src/java/voldemort/client/protocol/admin/AdminClient.java - do not close down AdminStoreClient from queryKeys test/unit/voldemort/client/AdminServiceBasicTest.java - added some additional checks to test to confirm (non)existence of exceptions&values
… minor review feedback. build.xml - fixed commenting out of 'protobuff' target src/java/voldemort/client/protocol/admin/AdminClient.java - add ClientConfig to constructor. This is needed for AdminStoreClient creation. It is confusing that we need both an AdminClientConfig and ClientConfig, but that is because the *ClientConfig code is so clumsy. - changed ".stop()" methods to ".close()" to be consistent with other interfaces. et cetera - Updated all copyright notices that have changed on this branch since December. This touched a ton of files... - annotated some TODOs with "(refactor)" to make refactoring todos easier to find.
…eyResult. Many other fixes and cleanup: src/java/voldemort/utils/ConsistencyFix.java - tweak many variable names - add close method to stop adminClient - broke out BadKey to wrap a key with its string representation so that failed fixes of bad keys can be dumped in full to file to be retried (without any additional effort) - marked 'parseVersion' as deprecated since, if we do this again, we should dump bytes not strings - track obsolete version exceptions and various statuses in Stats src/java/voldemort/utils/ConsistencyFixCLI.java - clean up of arguments, variable names, etc. - cleanly close down fixer... src/java/voldemort/utils/ConsistencyFixWorker.java - more logger.trace output - minor cleanup test/common/voldemort/TestUtils.java - added getVersioned() helper method test/common/voldemort/config/stores.xml - added consistency-fix store test/unit/voldemort/store/routed/ReadRepairerTest.java - marked all tests as @Test test/unit/voldemort/utils/ConsistencyCheckTest.java - update copyright notice
…replacing "entropy" tool. Added another argument to bulk fetch operations that specifies maxRecords so that server can fetch a subset of a partition. src/java/voldemort/utils/KeySamplerCLI.java - Samples keys from a cluster src/java/voldemort/utils/KeyVersionSamplerCLI.java - Given file that lists keys per store, samples versions from each "responsible node" for that key src/java/voldemort/client/protocol/admin/AdminClient.java - passed maxRecords through - TODO for future clean up of some types src/java/voldemort/client/protocol/pb/VAdminProto.java - auto generated! src/java/voldemort/server/protocol/admin/AdminServiceRequestHandler.java - white space src/java/voldemort/server/protocol/admin/FetchStreamRequestHandler.java src/java/voldemort/server/protocol/admin/FetchEntriesStreamRequestHandler.java src/java/voldemort/server/protocol/admin/FetchKeysStreamRequestHandler.java - handle maxRecords src/java/voldemort/server/protocol/admin/FetchPartitionKeysStreamRequestHandler.java src/java/voldemort/server/protocol/admin/FetchPartitionEntriesStreamRequestHandler.java - handle maxRecords - fixed usage of skipRecords src/java/voldemort/utils/Entropy.java - added maxRecords src/proto/voldemort-admin.proto - added max_records to protobuff definition test/unit/voldemort/client/AdminFetchTest.java - added maxRecords field to test
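The maxRecords idea above (stop streaming once a cap is reached so only a subset of a partition is fetched) can be sketched roughly as follows. This is only an illustration, not the actual request handler code: the class BoundedFetch, its fetch method, and the convention that a non-positive limit means "no limit" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Voldemort: shows how a record cap
// could bound a streaming fetch over some underlying iterator.
class BoundedFetch {

    /**
     * Collects at most maxRecords items from source.
     * Assumption for this sketch: maxRecords <= 0 means "no limit".
     */
    static <T> List<T> fetch(Iterator<T> source, long maxRecords) {
        List<T> out = new ArrayList<>();
        long fetched = 0;
        // Stop as soon as the source is exhausted or the cap is hit.
        while (source.hasNext() && (maxRecords <= 0 || fetched < maxRecords)) {
            out.add(source.next());
            fetched++;
        }
        return out;
    }
}
```

Calling BoundedFetch.fetch(keys.iterator(), 2) on a five-element list would return only the first two elements, while a limit of 0 would return all five.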
…r analysis for cluster balance ("zone primary"). src/java/voldemort/client/rebalance/RebalancePartitionsInfo.java - print out hostname within plan to make it easier to read (rather than having to lookup node ID) src/java/voldemort/utils/ClusterInstance.java - calculate "zone primary" balance to understand which hosted partitions act as pseudo-master when zoned routing is used.
- added required argument for an output file name for bad keys - changed Reporter to print out 'just the key' to the output file; it outputs more info at DEBUG level in general. - removed 'quiet' option - throw exceptions: - if # partitions differ across clusters - if replication factor is hinky - if isExpired encounters unknown type - main catches exceptions and fails fast - changed System.out debugging to logger.trace
… files for each batch.
… file it outputs so that we have access to interim cluster configs.
…ug/trace messages in ConsistencyFix.java.
…'--parse-only' option to ConsistencyFix. src/java/voldemort/client/protocol/admin/AdminClient.java - Added hashCode & equals methods to AdminClient.Nodestore - cleaned up getSocketStore to not leak concurrently created socket stores. src/java/voldemort/utils/ConsistencyFix(CLI).java - added parse only flag which limits the actions of the fixer to bootstrapping and parsing the input file.
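As a rough illustration of the two AdminClient changes above (a key type with hashCode and equals, and a lookup that does not leak a store when two threads create one concurrently), here is a minimal sketch. It is not the actual AdminClient code: NodeStoreKey, FakeSocketStore, and SocketStoreCache are hypothetical stand-ins, and the putIfAbsent approach is one common way to get this behavior, assumed here for the example.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical key type: equal keys must compare equal and hash alike,
// otherwise the map below would create a new store on every lookup.
final class NodeStoreKey {
    final int nodeId;
    final String storeName;

    NodeStoreKey(int nodeId, String storeName) {
        this.nodeId = nodeId;
        this.storeName = Objects.requireNonNull(storeName);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NodeStoreKey)) return false;
        NodeStoreKey other = (NodeStoreKey) o;
        return nodeId == other.nodeId && storeName.equals(other.storeName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(nodeId, storeName);
    }
}

// Hypothetical stand-in for a socket-backed store that holds a resource.
class FakeSocketStore {
    volatile boolean closed = false;

    void close() {
        closed = true;
    }
}

class SocketStoreCache {
    private final ConcurrentMap<NodeStoreKey, FakeSocketStore> cache =
            new ConcurrentHashMap<>();

    FakeSocketStore get(int nodeId, String storeName) {
        NodeStoreKey key = new NodeStoreKey(nodeId, storeName);
        FakeSocketStore existing = cache.get(key);
        if (existing != null) return existing;

        FakeSocketStore created = new FakeSocketStore();
        FakeSocketStore raced = cache.putIfAbsent(key, created);
        if (raced != null) {
            // Another thread won the race: close our copy so it is not leaked.
            created.close();
            return raced;
        }
        return created;
    }
}
```

The design choice is to accept that two threads may briefly create two stores, and to make the loser clean up after itself, rather than holding a lock around store creation.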
'--dry-run' option goes through all of the read paths (reading files, reading from servers) and calculates what to write where, but does not actually do any writes! Should combine --dry-run with these log4j settings: log4j.logger.voldemort.utils.ConsistencyFix=TRACE log4j.logger.voldemort.utils.ConsistencyFixWorker=DEBUG
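The dry-run behavior described above (do all the reads and planning, skip only the writes) might look roughly like this in miniature. This is a sketch under assumptions, not the ConsistencyFix implementation: DryRunRepairer, its in-memory map standing in for a server, and the planned list are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical repairer: records what it would write, and only performs
// the write when dryRun is false.
class DryRunRepairer {
    private final boolean dryRun;
    // Stands in for the remote servers that would receive repair writes.
    final Map<String, String> store = new HashMap<>();
    // Every planned write is recorded, in dry-run mode or not.
    final List<String> planned = new ArrayList<>();

    DryRunRepairer(boolean dryRun) {
        this.dryRun = dryRun;
    }

    void repair(String key, String value) {
        // Planning always happens, so a dry run exercises the same path.
        planned.add(key + " -> " + value);
        if (dryRun) {
            return; // skip the only side-effecting step
        }
        store.put(key, value);
    }
}
```

With new DryRunRepairer(true), calling repair("k", "v") leaves store empty but adds one entry to planned, which is the property that makes a dry run safe to combine with verbose logging.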
src/java/voldemort/utils/ConsistencyFix.java - added .trace output for parsing of ugly input - pass the correct key-type into constructor src/java/voldemort/utils/ConsistencyFixWorker.java - substantially more .debug output to trace operation
TODOs for later cleanup.
src/java/voldemort/utils/ConsistencyFix.java - added BadKeyOrphanReader extends BadKeyReader to consume different input file src/java/voldemort/utils/ConsistencyFixCLI.java - added "orphan-format" flag to indicate that the 'bad-key-file-in' is of orphaned key/values. src/java/voldemort/utils/ConsistencyFixWorker.java - added constructor to take QueryKeyResult of orphaned keys - modified resolveReadConflicts to add orphaned key/values to imaginary nodes for the sake of determining the value/version to be repaired
Added a map of EventThrottle objects such that repair traffic to each server can be throttled. We care about throttling write rate because of its potential impact on GC and cleaning.
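A per-server throttle map of the kind described above could be sketched as follows. To keep the example self-contained it does not use Voldemort's EventThrottle; SimpleThrottle is a hypothetical stand-in with an invented reserve method, and PerNodeThrottles is likewise an assumed name. The point is only the shape: one throttle per node ID, created lazily, so a slow or busy server does not consume another server's write budget.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical rate limiter (not Voldemort's EventThrottle): spaces
// operations at least 1/opsPerSecond apart.
class SimpleThrottle {
    private final long minIntervalNanos;
    private long nextAllowedNanos = 0L;

    SimpleThrottle(int opsPerSecond) {
        if (opsPerSecond <= 0) {
            throw new IllegalArgumentException("opsPerSecond must be positive");
        }
        this.minIntervalNanos = 1_000_000_000L / opsPerSecond;
    }

    /**
     * Reserves the next slot and returns how many nanoseconds the caller
     * should wait before issuing the write. The clock value is passed in
     * so the logic can be exercised deterministically.
     */
    synchronized long reserve(long nowNanos) {
        long wait = Math.max(0L, nextAllowedNanos - nowNanos);
        nextAllowedNanos = Math.max(nowNanos, nextAllowedNanos) + minIntervalNanos;
        return wait;
    }
}

// One throttle per destination node, created on first use.
class PerNodeThrottles {
    private final ConcurrentMap<Integer, SimpleThrottle> throttles =
            new ConcurrentHashMap<>();
    private final int opsPerSecond;

    PerNodeThrottles(int opsPerSecond) {
        this.opsPerSecond = opsPerSecond;
    }

    SimpleThrottle forNode(int nodeId) {
        return throttles.computeIfAbsent(nodeId, id -> new SimpleThrottle(opsPerSecond));
    }
}
```

A repair worker would look up forNode(nodeId) before each write, call reserve(System.nanoTime()), and sleep for the returned duration before sending the write to that server.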