branch: fetcher_per_fi…
Commits on Mar 22, 2013
  1. @abh1nay

    Added test case to test runtime exceptions

    abh1nay authored
    Cleaned up the try-catch logic
  2. @abh1nay

    Adding test cases which simulate intermittent exceptions etc

    abh1nay authored
    in the hdfsfetcher and simulate retry logic
    this ensures checksum calculation is robust
Commits on Mar 21, 2013
  1. Adding an extra catch block for Exception and Throwable types. This i…

    Chinmay Soman authored
    …s used to catch the ClassNotFound exceptions
  2. Using a per file checksum generator in the file copy in HdfsFetcher. …

    Chinmay Soman authored
    …This is used to handle the case where we might retry the copy in case of a Filesystem (hdfs) error.
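
The per-file checksum idea in the commit above can be illustrated with a small sketch. This is not the HdfsFetcher code: `Source`, `copyWithRetry`, `flakySource`, the use of MD5, and the in-memory sink are all assumptions made for illustration. The point it shows is that a fresh digest is created for each copy attempt, so bytes consumed by a failed attempt do not end up in the checksum of the attempt that succeeds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PerFileChecksumCopy {

    /** Hypothetical source abstraction: opens a fresh stream on each attempt. */
    interface Source {
        InputStream open() throws IOException;
    }

    /**
     * Copies the source into the sink, retrying on IOException. A new
     * MessageDigest is created for every attempt, so bytes read during a
     * failed attempt never leak into the checksum of the successful one.
     */
    static byte[] copyWithRetry(Source source, ByteArrayOutputStream sink, int maxAttempts)
            throws IOException, NoSuchAlgorithmException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            MessageDigest digest = MessageDigest.getInstance("MD5"); // fresh per attempt
            sink.reset(); // discard partial output from a failed attempt
            try (InputStream in = source.open()) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    sink.write(buffer, 0, n);
                    digest.update(buffer, 0, n);
                }
                return digest.digest();
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IOException("copy failed after " + maxAttempts + " attempts", last);
    }

    /** Illustration helper: the first `failures` opens break partway through the data. */
    static Source flakySource(final byte[] data, final int failures) {
        final int[] opens = {0};
        return () -> {
            final boolean fail = opens[0]++ < failures;
            return new InputStream() {
                private int pos = 0;

                @Override
                public int read() throws IOException {
                    if (fail && pos >= data.length / 2) {
                        throw new IOException("simulated intermittent failure");
                    }
                    return pos < data.length ? (data[pos++] & 0xff) : -1;
                }
            };
        };
    }
}
```

With a single digest shared across attempts, the half-read bytes from each failed attempt would be folded into the final checksum and the comparison against the expected value would fail even though the copied data is correct.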
Commits on Mar 20, 2013
  1. @jayjwylie
  2. @jayjwylie
  3. @jayjwylie

    Addressed all code review comments for KeySampler and KeyVersionFetch…

    jayjwylie authored
    …er. Renamed many classes and methods related to FetchStreamRequestHandler.
    - All sub-classes of FetchStreamRequestHandler have been renamed to have a more consistent nomenclature.
    - Did some further refactoring in the FullScan* classes to move more work from leaf classes to
    - moved scan accounting to overall base class
    - Added getNodesPartitionIdForKey method to StoreInstance to help with some fetch logic
  4. @jayjwylie
  5. @jayjwylie

    Addressed review feedback and TODOs for KeyVersionSamplerCLI (and ren…

    jayjwylie authored
    …amed it to KeyVersionFetcherCLI).
    - mostly usability changes about command line options...
    - one copyright fix
  6. @jayjwylie

    Addressed all review feedback and TODOs for KeySamplerCLI

    jayjwylie authored
    - added options: --store-names, --partition-ids, --keys-per-second-limit, and --progress-period-ops
    - got rid of unnecessary (and weird) retry loop. Can add something like that later if needed.
    - pass all partitions to fetcher now instead of one-at-a-time
    Also did cosmetic fixes for KeyVersionSamplerCLI and
  7. @jayjwylie

    Correctness fixes and significant refactoring of Fetch*StreamRequestH…

    jayjwylie authored
    …andlers. Expanded AdminFetchTest.
    Added more common helper methods to common base class of all fetchers FetchStreamRequestHandler.
    Added abstract base classes for partition-based fetching and non-partition-based fetching:
    - FetchPartitionStreamRequestHandler (partition-based)
    - FetchItemsStreamRequestHandler (non-partition-based)
    Refactored some code up to abstract base classes and made implementations as similar as possible (without heroic efforts) across all fetchers:
    - FetchEntriesStreamRequestHandler
    - FetchKeysStreamRequestHandler
    - FetchPartitionEntriesStreamRequestHandler
    - FetchPartitionKeysStreamRequestHandler
    Significantly better test coverage in AdminFetchTest
    - tests fetching keys as well as fetching entries
    - tests partition-aware and non-partition-aware servers
    - tests per-partition limits on entries/keys fetched
    All of this cleanup and additional testing led to minor correctness fixes.
    Other minor cleanups of comments, override annotations, and fixes for KeySamplerCLI.
  8. @jayjwylie

    change maxRecords to recordsPerPartition in fetch API and protobuf

    jayjwylie authored
    These are cosmetic changes. The client-side and server-side code does not properly do recordsPerPartition yet.
    Added a few TODOs in the code too.
  9. @jayjwylie

    remove skipRecords from fetching API and protobuf

    jayjwylie authored
    AFAIK skipRecords was never used. By inspection, the code that would have been exercised if it had been used has never been correct. Removing skipRecords from the code base.
    - Added a number of TODOs to the code from the reviews
    - Changed some variable names
  10. @jayjwylie
  11. @jayjwylie

    Minor fix for change to AdminClient

    jayjwylie authored
    - do not close down AdminStoreClient from queryKeys
    - added some additional checks to test to confirm (non)existence of exceptions and values
  12. @jayjwylie

    Many minor tweaks to ConsistencyFix code and related files to address…

    jayjwylie authored
    … minor review feedback.
    - fixed commenting out of 'protobuff' target
    - add ClientConfig to constructor. This is needed for AdminStoreClient creation. It is confusing that we need both an AdminClientConfig and ClientConfig, but that is because the *ClientConfig code is so clumsy.
    - changed ".stop()" methods to ".close()" to be consistent with other interfaces.
    et cetera
    - Updated all copyright notices that have changed on this branch since December. This touched a ton of files...
    - annotated some TODOs with "(refactor)" to make refactoring todos easier to find.
  13. @jayjwylie

    Added unit tests for ConsistencyFix, ConsistencyFixWorker, and QueryK…

    jayjwylie authored
    Many other fixes and cleanup:
    - tweak many variable names
    - add close method to stop adminClient
    - broke out BadKey to wrap a key with its string representation, so that failed fixes of bad keys can be dumped in full to a file and retried (without any additional effort)
    - marked 'parseVersion' as deprecated since, if we do this again, we should dump bytes not strings
    - track obsolete version exceptions and various statuses in Stats
    - clean up of arguments, variable names, etc.
    - cleanly close down fixer...
    - more logger.trace output
    - minor cleanup
    - added getVersioned() helper method
    - added consistency-fix store
    - marked all tests as @Test
    - update copyright notice
  14. @jayjwylie
  15. @jayjwylie
  16. @jayjwylie

    Added KeySampler and KeyVersionSampler tools as a first step towards …

    jayjwylie authored
    …replacing "entropy" tool. Added another argument to bulk fetch operations that specifies maxRecords so that server can fetch a subset of a partition.
    - Samples keys from a cluster
    - Given a file that lists keys per store, samples versions from each "responsible node" for that key
    - passed maxRecords through
    - TODO for future clean up of some types
    - auto generated!
    - white space
    - handle maxRecords
    - handle maxRecords
    - fixed usage of skipRecords
    - added maxRecords
    - added max_records to protobuf definition
    - added maxRecords field to test
  17. @jayjwylie

    Made rebalance --show-plan slightly more verbose and added yet anothe…

    jayjwylie authored
    …r analysis for cluster balance ("zone primary").
    - print out hostname within plan to make it easier to read (rather than having to lookup node ID)
    - calculate "zone primary" balance to understand which hosted partitions act as pseudo-master when zoned routing is used.
  18. @jayjwylie

    Review and cleanup of consistency checker.

    jayjwylie authored
    - added required argument for an output file name for bad keys
    - changed Reporter to print out 'just the key' to the output file; it
      outputs more info at DEBUG level in general.
    - removed 'quiet' option
    - throw exceptions:
      - if # partitions differ across clusters
      - if replication factor is hinky
      - if isExpired encounters unknown type
    - main catches exceptions and fails fast
    - changed system.out debugging to logger.trace
  19. @jayjwylie
  20. @jayjwylie
  21. @jayjwylie

    Changed Rebalancer --output-dir option to append numbers to each .xml…

    jayjwylie authored
    … file it outputs so that we have access to interim cluster configs.
  22. @zhongjiewu @jayjwylie

    Refactored Consistency Check

    zhongjiewu authored jayjwylie committed
  23. @jayjwylie
  24. @jayjwylie

    Default to printing out BADKEYs from ConsistencyCheck. Cleaned up deb…

    jayjwylie authored
    …ug/trace messages in
  25. @jayjwylie

    Fixed hashmap issues in AdminClient raised during code review. Added …

    jayjwylie authored
    …'--parse-only' option to ConsistencyFix.
    - Added hashCode & equals methods to AdminClient.Nodestore
    - cleaned up getSocketStore to not leak concurrently created socket stores.
    - added parse-only flag, which limits the actions of the fixer to bootstrapping and parsing the input file.
  26. @jayjwylie

    Added 'dry-run' option and cleaned up help message.

    jayjwylie authored
    '--dry-run' option goes through all of the read paths (reading files, reading from servers) and calculates what to write where, but does not actually do any writes!
    Should combine --dry-run with these log4j settings:
  27. @jayjwylie

    Code fixes for the fixing of orphans.

    jayjwylie authored
    - added .trace output for parsing of ugly input
    - pass the correct key-type into constructor
    - substantially more .debug output to trace operation
  28. @jayjwylie
  29. @jayjwylie
  30. @jayjwylie

    Added basic code for repairing orphaned key/values.

    jayjwylie authored
    - added BadKeyOrphanReader extends BadKeyReader to consume different
      input file
    - added "orphan-format" flag to indicate that the 'bad-key-file-in' is
      of orphaned key/values.
    - added constructor to take QueryKeyResult of orphaned keys
    - modified resolveReadConflicts to add orphaned key/values to
      imaginary nodes for the sake of determining the value/version to be
  31. @jayjwylie

    Added per-server throttling to the Consistency Fixer.

    jayjwylie authored
    Added a map of EventThrottle objects such that repair traffic to each server can be throttled. We care about throttling write rate because of its potential impact on GC and cleaning.
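
The map-of-throttles approach described in the last commit can be sketched roughly as follows. This is an assumption-laden illustration, not the Consistency Fixer code: `SimpleThrottle` is a hypothetical stand-in (the EventThrottle API is not shown on this page), the one-second window is an arbitrary choice, and the sketch returns a suggested delay instead of sleeping so that it stays easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerServerThrottle {

    /** Hypothetical stand-in for a throttle object; not Voldemort's EventThrottle API. */
    static final class SimpleThrottle {
        private final long eventsPerSecond;
        private long windowStartMs;
        private long eventsInWindow = 0;

        SimpleThrottle(long eventsPerSecond, long nowMs) {
            this.eventsPerSecond = eventsPerSecond;
            this.windowStartMs = nowMs;
        }

        /** Records events and returns how long the caller should pause, in milliseconds. */
        synchronized long recordAndGetDelayMs(long events, long nowMs) {
            if (nowMs - windowStartMs >= 1000) { // start a new one-second window
                windowStartMs = nowMs;
                eventsInWindow = 0;
            }
            eventsInWindow += events;
            if (eventsInWindow <= eventsPerSecond) {
                return 0;
            }
            // Over budget: wait out the remainder of the current window.
            return 1000 - (nowMs - windowStartMs);
        }
    }

    // One throttle per server (node ID), created lazily on first use.
    private final Map<Integer, SimpleThrottle> throttles = new ConcurrentHashMap<>();
    private final long perServerEventsPerSecond;

    PerServerThrottle(long perServerEventsPerSecond) {
        this.perServerEventsPerSecond = perServerEventsPerSecond;
    }

    /** Records one repair write to the given node and returns the suggested delay in ms. */
    long recordWrite(int nodeId, long nowMs) {
        SimpleThrottle throttle = throttles.computeIfAbsent(
                nodeId, id -> new SimpleThrottle(perServerEventsPerSecond, nowMs));
        return throttle.recordAndGetDelayMs(1, nowMs);
    }
}
```

Keeping a separate throttle per node means heavy repair traffic to one server does not consume the write budget of the others, which matches the stated goal of limiting write rate on each server individually.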