Skip to content
This repository
branch: hdfsfetchertes…
Fetching contributors…

Cannot retrieve contributors at this time

file 463 lines (386 sloc) 21.493 kb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463
Release 1.1.7 on 01/03/2013

Changes made since release 1.1.4
* Upgrading Hadoop jar to 1.0.2
* Added support for Kerberos authentication in HdfsFetcher
* Extra config parameters for Kerberos config and keytab file

NOTE: Release 1.1.5 and 1.1.6 are special client side releases
not based off of master. 1.1.5 was rolled back to to a weird bug.
1.1.6 is a special client side release including Auto-
bootstrapper and Versioned Avro support.

Release 1.1.4 on 11/29/2012

Changes made since release 1.1.3
* Added BDB parameters to control LRU behavior in cache & proactive cleaner migration
* Added a mlock fix for pinning the indexes of RO stores in memory


Release 1.1.3 on 11/28/2012

Changes made since release 1.1.2
* Fixed a bug in the build and push job, specifically the Mapper
  that caused collisions
* Added retry mechanism with the HDFS fetcher for hftp


Release 1.1.2 on 10/31/2012

Changes made since release 1.1.1
* Reverted a change to voldemort.versioning.Versioned.getVersion() so
  that a Version is returned as our clients expect.


Release 1.1.1 on 10/30/2012

Changes made since release 1.1.0
* Fixed connection leak in ClientRequestExecutorFactory
* Changed client to default to DefaultStoreClient


Release 1.1.0 on 10/19/2012

Changes made since release 1.0.0

IMPORTANT NOTE : This release has significant changes to the BDB storage layer.
Users are required to read the bin/PREUPGRADE_FOR_1_1_X_README file
thoroughly before attempting to upgrade to 1.1.0. The necessary data
conversion will be done through bin/voldemort-convert-bdb.sh

* Upgrading to JE 4.1.17
* New data format that handles conflicting updates in Voldemort more
  efficiently
* Move data off heap and only use it for Index
* When scanning, evict whatever you bring in right away.
* Partition based scan api to dramatically speed up rebalancing & restore
  using Partition aware scans (you exactly scan whatever you want to fetch)
* Flexible knobs to control scheduling of DataCleanupJob


Release 1.0.0 on 10/17/2012

NOTE: The large version number jump from 0.96 to 1.0.0 is to
standardize on a version number of the sort MAJOR.MINOR.PATCH. This
change is part of our effort to treat internal and open source
releases in a much more similar manner. Along these lines, release
notes for internal releases (like this one) are committed on the
master branch. We hope this improves transparency as we work towards
the next open source release.

Changes made since release 0.96

* Auto bootstrapping: ZenStoreClient and System stores
  * Added server side system stores for managing metadata
  * ZenStoreClient interacts with system stores
  * ZenStoreClient auto bootstraps whenever cluster.xml or stores.xml changes
  * Added a new routing strategy to route to all with local preference
  * Added a client-registry for publishing client info and config values
  * Updated LazyClientStore to try to bootstrap during Init
  * Modified Failure Detector to work on a shared cluster object reference
* Avro: schema evolution and read only support
  * Added new Avro serializer type that supports schema evolution
  * Added Avro support to read only stores
  * Added LinkedIn build-and-push Azkaban jobs to build read only stores to contrib
  * Added a schema backwards compatibility check to VoldemortAdminTool and on server startup to prevent mishaps due to bad schemas
* Non-blocking IO: Fixed design flaw that blocked in the face of slow servers
  * Asynchronous operations no longer do a blocking checkout to get a SocketDestination
  * Added additional stats collection for better visibility into request queues
* Minor features
  * Enhanced VoldemortAdminTool to update store metadata version
  * Enhanced VoldemortAdminTool to work with the new system stores
  * Added feature to voldemort-shell.sh to dump byte & object arrays
  * Added a SlowStorageEngine for testing degraded mode performance
  * Added mechanism to isolate BDB cache usage among stores
  * Enhanced debug logging (for traffic analysis).
  * Python client bug fixes (from pull request)
  * Improved messages in request tracing
  * Cleaned up help/usage messages within the client shell
  * Added server config to control socket backlog
  * Added "--query-keys" option to query multiple keys of multiple stores from specific node
  * Added control to DataCleanupJob Frequency
  * Unified jmxid as the factory across the board
* Tools
  * bin/generate_cluster_xml.py to generate cluster.xml
  * bin/repeat-junit.sh and bin/repeat-junit-test.sh to repeatedly run tests
* Bug fixes
  * Changed getall return behavior to comply with javadoc
  * Fixed a bug that caused unnecessary serial requests in getall
  * HFTP performance issue bug fix (fix in byte buffer and copy process)
  * Fixed a bug that prevented "--fetch-keys" and "--fetch-entries" in admin tool from showing multiple store results
  * Fixed problem in sample config that prevented the server from starting
  * Fixed some intermittent BindException failures across many unit tests
  * Fixed some intermittent rebalance test failures
  * Wrapped long running tests with timeouts


Release 0.96 on 09/05/2012

Changes made since 0.90.1

 * Monitoring:
     * Append cluster name to various mbeans for better stats display
     * Implement average throughput in bytes
     * Add BDB JE stats
     * Add 95th and 99th latency tracking
     * Add stats for ClientRequestExecutorPool
     * Add error/exception count and max getall count
     * BDB+ Data cleanup Monitoring changes
 * Rebalancing:
     * Donor-based rebalancing and post cleanup (see https://github.com/voldemort/voldemort/wiki/Voldemort-Donor-Based-Rebalancing for more details)
     * Rebalancing integration testing framework (under test/integration/voldemort/rebalance/)
     * Generate multiple cluster.xml files based on the number specified when running the tool and choose the cluster with the smallest std dev as the final-cluster.xml
     * Add status output to log for updateEntries (used by rebalancing)
 * Read-only pipeline:
     * Add hftp and webhdfs support
     * Read-only bandwidth dynamic throttler
     * Add minimum throttle limit per store
     * Add rollback capability to the Admin tool
 * Voldemort-backed stack and index linked list impl
 * Change client requests to not process responses after timeout
 * Modified client request executor timeout to not factor in the NIO selector timeout
 * Added BDB native backup capabalities, checksum verification and incremental backups (well tested, but not yet used in production)
 * Add additional client-side tracing for debugging and consistency analytics
 * Clean up logging during exception at client-side
 * Security exception handling
 * Add snappy to CompressionStrategyFactory
 * Add configurable option to interrupt service being unscheduled
 * Add logging support for tracking ScanPermit owners (for debugging purposes)
 * Add a jmx terminate operation for async jobs
 * Add zone option for restore from replicas
 * Changing the enable.nio.connector to true by default
 * Better disconnection handling for python client
 * Split junit tests into a long and a short test suites
 * Add separate timeouts for different operations (put, get, delete, and getAll
 * Allow getAll to return partial results upon timeout
 * Improved cluster generation tool
 * Added log4j properties folder for junit test
 * Bug fixes:
     * httpclient 3.x to httpclient 4.x
     * Fix NPE in listing read-only store versions
     * Fixed 2 failure detector bugs during rebalancing or node swapping
     * Fixed a thread leak issue in StreamingSlopPusher
     * Fixed a NIO bug
     * Fixed a bug in TimeBasedInconsistency resolver.
     * Fixed race condition in client socket close
     * Fixed a potential deadlock issue in ScanPermitWrapper
     * Fixed a bug where a read returns null (on rare occations) when being concurrent with a write
     * Fixed a performance bug in HdfsFetcher when hftp is used


Changes made since 0.90

 * Updated the documentation for Voldemort shell tool in NOTES
 * Added Admin API to perform Bdb data cleanup (repairJob)
   and corresponding unit tests
 * Fixes in restore from replication, store creation code
 * Improved failure detector configuration. ThresholdFailureDetector
   is now the default option
 * Multiple Fixes to the Voldemort ruby client
 * Added additional Jmx metrics to expose Bdb environment statistics,
   caching statistics and Voldemort batch operation statistics
 * Updated default timeout for restore 'from replica' to 365 days
 * New feature in the performance tool: '--use-sample' option enables
   'read, write back unmodified' transactions in place of writes in
   the workload (allows for testing read-write transactions on stores
   with complex schemas)
 * Added the ability to dynamically update cluster.xml and
   reinitialize the scheduler

Release 0.90 on 7/10/2011

Changes made since 0.81
 * All upgrading instructions can be found here - https://github.com/voldemort/voldemort/wiki/Upgrading-from-0.81
 * Tooling
 - Single consolidated administrative tool - Got rid of the admin client
   shell and have a better Voldemort admin tool. More documentation -
   https://github.com/voldemort/voldemort/wiki/Voldemort-Admin-tool

 * Web manager
 - Basic GUI in JRuby ( and Sinatra ), capable of querying for store metadata.
 - Read contrib/web-manager/README.md for more details
  
 * Rebalancing
 - New better rebalancing support with (a) checkpointing (b) progress bar (c) support for RO store rebalancing
 - Documentation - https://github.com/voldemort/voldemort/wiki/Voldemort-Rebalancing
 - In the process solves Issue 203, 288, 305, 307, 311

 * Client side changes
 - Pipeline routed store ( with support for topology awareness ) - Changed the default client side routing
   to use a state machine like system. More documentation - https://github.com/voldemort/voldemort/wiki/RoutedStore-redesign
 - Lazy store client - Some of our clients are very inactive and don't really do many requests. For such
   clients it doesn't make sense to bootstrap the metadata at startup. Hence from now onwards clients instantiated
   from SocketStoreClientFactory give a LazyStoreClient which will not bootstrap till the first request is made
 - Caching store client factory - We have also checked-in a new store client factory which we use internally
   at LinkedIn to cache store clients for stores which we repeatedly.
 - Failure detector - We have fixed some bugs in the ThresholdFailureDetector. This is an improvement over the naive
   BannageFailureDetector which would aggressively mark nodes down after a single failure. More details -
   https://github.com/voldemort/voldemort/wiki/Client-side-failure-detector-implementations
 - Issue 219: StoreClient::put returns a Version - You may have to upgrade your clients since now the StoreClient returns
   a version to be used in successive requests.

 * Read-only store changes
 - Admin based store swapper - Till 0.81 we relied on a servlet based fetcher / swapper.
   Now we support an admin based store swapper which reports progress.
 - Other changes - (a) New directory format (b) Two new data format. Support for iterating over read-only data
   (c) Checksum of data in .metadata file (d) Some monitoring changes of RO stores
 
 * Jar upgrades
 - Introduction of JNA 3.2.7 - We have introduced JNA 3.2.7 in this release for supporting symbolic links in read-only stores.
   More details about the new RO format and changes can be found here - https://github.com/voldemort/voldemort/wiki/Upgrading-from-0.81
 - Protocol Buffers 2.3.0 - Upgrading protocol buffers from 2.2.0 to 2.3.0
 - Avro 1.4.0 - Upgraded Avro from 1.3.0 to 1.4.0

 * Storage engine
 - Added support for Krati 0.3.6 as a storage engine ( http://sna-projects.com/krati/ ) -
   https://github.com/voldemort/voldemort/tree/release-090/contrib/krati

 * Core features
 - Hinted handoff - https://github.com/voldemort/voldemort/wiki/Hinted-Handoff
 - Experimental support for server side transforms - https://github.com/voldemort/voldemort/wiki/Server-side-transforms-in-Voldemort
 - Topology awareness ( i.e. datacenter / rack ) - https://github.com/voldemort/voldemort/wiki/Topology-awareness-capability
 - Repair Job ( Job which will delete data if the node is not responsible for it )

 * Better monitoring
 - Tons of JMX changes ( average bytes, etc ) - More details ( https://github.com/voldemort/voldemort/wiki/JMX-Monitoring )
 - Key distribution generator - Ability to estimate skew of your cluster ( ./bin/run-class.sh voldemort.utils.KeyDistributionGenerator )

 * Clients
 - Python - Updated Python client with support for Binary JSON serialized stores ( ./clients/python/README )
 - Ruby - Updated Ruby client ( ./clients/ruby/README.md )

Release 0.81 on 6/15/2010

Changes made since 0.80.2:

* IMPORTANT: we have upgraded the Hadoop Core jar to v0.20.2. Since
  this version of Hadoop requires Java 6, in order to retain backwards
  compatibility with Java 5, you will need to replace this jar with
  the Hadoop 0.18.* core jar.
* Multiple donors for Voldemort Rebalancing: speeds up rebalancing, by
  allowing a single stealer node to transfer partitions from multiple
  nodes
* Features for EC2 Testing
* Read-only Stores: backwards compatibility with 0.70, bug fix in
  Checksum code
* Hadoop InputFormat, Pig LoadFunc in Contrib: ability to perform
  Map/Reduce (either directly via Hadoop or using Pig) over data in
  Voldemort stores. See:
  < contrib/hadoop/test/voldemort/hadoop/VoldemortWordCount.java > for
  an example.
* Voldemort Performance Tool: performance measurement tool based on
  YCSB (Yahoo Cloud Serving Benchmark) code. Run
  < bin/voldemort-performance-tool.sh --help > for more information.
* Reverted default failure detector implementation back to
  BannagedPeriodFailureDetector due to potential bugs in the
  ThresholdFailureDetector

Release 0.80.2 on 4/27/2010

Changes made since 0.80.1:

* Batched NIO writes (improves performance for AdminClient/Streaming
  operations when the NIO connector is enabled)
* VoldemortAdminTool: Added support to specify stores, support to
  fetch all keys to a binary file, support to save keys in ASCII
  (JSON) format [experimental], capability to add stores. Run
  < bin/voldemort-admin-tool.sh --help > for more information.
* Minor bug fixes for Rebalancing
* Issue 240: Voldemort fetcher should use different temp directories for
  different stores
* Checksum capability during construction of Voldemort read-only stores

Release 0.80.1 on 3/23/2010

Changes made since 0.80:

* Issue 133: Support for Apache Avro as a serialization format
* Issue 223: Changed the default client to use
  TresholdFailureDetector
* Fixed issue 222: Revised KeyedResourcePool.close(K key) to fix
  leaking sockets
* Fixed issue 198: Revised ReadRepairer to use a separate copy of the
  vector clock, fixing a situation where NoSuchElementException would
  be thrown
* Miscellaneous enhancement: support for TCP keep-alives, improved
  read only store utilities, command line interface to AdminClient,
  improved load testing tools

Release 0.80 on 2/18/2010

Changes made since 0.70.1:

* IMPORTANT: backwards compatibility between the client and server has
  changed. A backwards incompatibility in the wire protocol was found
  between releases 0.60 to 0.70.1 and releases prior to 0.60. We chose
  to make 0.80 compatible with 0.57.1 and earlier versions, while
  introducing an incompatibility with 0.60-0.70.1. What this means is
  that if you're presently running 0.60 and higher, you would need to
  upgrade the Voldemort jar files on *all* servers and clients.
* Upgraded the BDB storage engine to use BerkeleyDB-JE 4.0.92,
  retaining ability to use BerkeleyDB-JE 3.3.* if desired. IMPORTANT:
  if one switches from BerkeleyDB-JE 3.3.* to 4.0.92 they will be able
  to access all of existing data. Once a switch has been made to
  4.0.92 the data will not be readable by earlier versions of BDB. If
  there's a chance that a roll back to 3.3.* might be needed, the best
  course of action will be to make a backup of existing data
  prior to upgrading.
  Switching between 3.3.* and 4.0.* would also require rebuilding the
  class files (e.g., by running "ant clean && ant release" after
  replacing the BDB jar files).
* Compression support for read-only stores
* Increased the socket buffer size for transferring read-only
  stores from Hadoop for improved performance over high-latency links
* NIO support for the Admin Service, including Streaming
  Functionality
* Support for adding stores on the fly via the Admin
  Service
* Fixed issue 209: Incorrect object passed to List.contains in
  RebalanceUtils.getLatestCluster()
* Fixed issue 211: Unnecessary read repairs during getAll() with more
  than one key
* Other enhancements: better CLI for rebalancing, throttling in
  Admin Service is now based on all disk activity

Release 0.70.1 on 2/1/2010

Changes made since 0.70:

* Fixed issue 205: if no keys passed to getAll() were in partitions
  undergoing rebalancing, proxyGetAll() would be called with an
  empty list even if rebalancing wasn't happening

Release 0.70 on 1/27/2010

Changes made since 0.60.1:

* A beta of rebalancing (dynamic cluster expansion) support merged
  into the main branch. See the project's wiki for more information:
  http://wiki.github.com/voldemort/voldemort/voldemort-rebalancing
* New failure detector merged into the main branch:
  http://wiki.github.com/voldemort/voldemort/failure-detection
* Beta mechanism for restoring all of node's data from replicas on
  demand. This is an alternative to a more gradual mechanism provided
  by read-repair: useful when a machine is down for a prolonged period
  and is then re-inserted into the cluster.
  Invoked via JMX: the operation is restoreDataFromReplication in the
  voldemort.server.VoldemortServer MBean, with a mandatory parameter
  (integer >= 1) indicating the number of transfers to do in parallel.
* Simple gossip protocol (for cluster metadata) merged
  into the main branch. Disabled by default: use "enable.gossip=true"
  to enable, use "gossip.interval.ms" to set an interval at which gossip
  occurs (default: 30000 i.e., 30 seconds).
* Fixed issue 190: add a way of aggregating performance data over
  all stores
* Fixed issue 181: stack trackes shouldn't be filled for Obsolete
  version exception

Release 0.60.1 on 12/18/2009

Minor changes made since 0.60:

* Better logging in the exception thrown if config/.temp and
  config/.version are copied
* Bumping up the version to 0.60.1 in order to release updated
  archives, fixing an error in the stores.xml for single_node_cluster
  sample config

Release 0.60 on 12/15/2009

Changes made since 0.57.1:

* Admin Client/Server API: adds support for streaming-based transfer
  of entries between nodes, deleting entries on remote nodes,
  remotely deleting and updating metadata
* EC2 testing: a way to periodically run integration and performance
  tests which involve Voldemort instances on different machines
* Experimental support for views
* Interpolation search for read-only stores
* Support for large lists and strings in the JSON serializer
* LZF compression support
* Ruby client contributed
* Fixed issue 170: hanging if a port is used by another process
* Fixed issue 122: suspicious integer division in
  RequestCounter.getThroughput
* Miscellaneous improvements and bug fixes for read-only stores
* Fixed issue 168: added StorageEngine.keys()

Release 0.57.1 on 11/27/2009

Minor change made since 0.57:

* Modified build.xml to exclude .git directory from release tarballs/zipfiles

Release 0.57 on 11/16/2009
 
The following changes were made since 0.56:

* Fixed an issue in ReadOnlyEngine's close() method
* Fixed hidden logging in StorageService
* Fix for issue 163 (lock mode during get)
* Exposed bdb environment stats with setFast(false)

Release 0.56 on 10/26/2009

The following changes were made since 0.55:
 
* Fix for issue 164: Changed default bdb.max.logfile.size to 60MB
* Make file deletes asynchronous in read only store swap
* Added better debug logs for bdb stats
* Fixed race condition in AbstractSocketPoolTest
* Added improved monitoring for bdb stats
* Added backoff and retry logic in bootstrap code
* Not logging obosleteVersionException(s) or counting it in JMX exception count
* Fixed issue 159
* C++ client building on OS X
* Rely only on the number of versions returned to decide whether to retrieve
  the value for put(K,V)
* Implemented issue 152: getVersion() API in Store
 
Release 0.55 on 10/7/2009
 
The following changes were made since 0.52 (in summary):
 
* Add an event throttler
* Added DataCleanupJob
* Protocol buffers at version 2.2.0
* BDB JE upgraded to 3.3.87
* Added data compression support (enable by adding
    <compression>
       <type>gzip</type>
    </compression>
  to value-serializer or key-serializer sections in stores.xml)
* Added a resource pool/socket pool implementation (no longer using
  commons-pool)
* Added server-side NIO support (enabled by setting
  enable.nio.connector=true
  in server.properties)
* Improved the efficiency of the protocol buffers network protocol by using
  CodedOutputStream/CodedInputStream (zero copy transfers)
* Fix for issue #21: incorrectness in vector clock inconsistency resolver
* Upgrade google-collections
* Support for building read only stores in Hadoop
* Added a C++ client
Something went wrong with that request. Please try again.