Skip to content
This repository
tag: riak-1.1.1
Fetching contributors…

Octocat-spinner-32-eaf2f5

Cannot retrieve contributors at this time

file 148 lines (119 sloc) 7.578 kb

Riak 1.1.1 Release Notes

Major Features and Improvements for Riak

MapReduce Changes

This release includes fixes for systems with heavy MapReduce load.

Pipe maintains a limit on the number of active worker processes in the system. Workers are created on demand when the pipe executes - they are not reserved at pipe creation time.

Prior to 1.1.1 not all fittings (e.g. get object from riak, map value, reduce) were checking to make sure their output was being delivered to the next stage of the pipeline. This has been changed so that the pipe will error if processing errors or system limits are hit.

As a result of the change if the pipe is overloaded clients will see errors of this form: {error,<<"Error sending inputs: {error,worker_limit_reached}">>}

Clients can either decide to retry (preferably with a backoff/retries limit mechanism) or if the hardware has sufficient capacity the number of workers per vnode can be increased by setting worker_limit in etc/app.config.

{riak_pipe, [{worker_limit, #Workers}]}

Different numbers of workers are used based on the type of MapReduce query so tuning will be very workload dependent.

If you are using Javascript MapReduce you will also need to increase the number of Javascript virtual machines in the riak_kv section of app.config.

{map_js_vm_count,  #MapVMs},
{reduce_js_vm_count, #ReduceVMs}

Again, tuning will be workload dependent. If sufficient resources are available setting #MapVMs to the maximum number of JS MapReduce jobs being run in parallel will give best results and #ReduceVMs set to number of jobs / number of nodes plus a margin for overload.

Bugs Fixed

-Map-reduce jobs fail during rolling upgrade to 1.1 -Race Condition causes Javascript VMs to never be returned to VM Pool -Avoid {case_clause, ok} errors when vnode backend fails to start -Bitcask - Fix incorrect NIF error tuples

Riak 1.1.0 Release Notes

Major Features and Improvements for Riak

Riak Control

-Riak Control preview demo -Riak Control repository with documentation on how to get started

Riaknostic

-Blog post announcing riaknostic -Riaknostic Homepage

Bitcask Improvements

  • CRC checks have been added to hintfiles for bitcask

LevelDB Improvements

  • Snappy (compression algorithm from Google) has been enabled
  • The compaction algorithm has been tuned to reduce delays due to compaction

Other backend changes

  • Multi-backend now supports 2i properly

Lager Improvements

  • Tracing support (see the README)
  • Term printing is ~4x faster and much more correct (compared to io:format)
  • Bitstring printing support was added

MapReduce Improvements

  • The MapReduce interface now supports requests with empty queries. This allows the 2i, list-keys, and search inputs to return matching keys to clients without needing to include a reduce_identity query phase.
  • MapReduce error messages have been improved. Most error cases should now return helpful information all the way to the client, while also producing less spam in Riak’s logs.

Riak KV Improvements

Listkeys Backpressure

Backpressure has been added to listkeys to prevent the node listing keys from being overwhelemed. The change has required a protocol change so that the key lister can limit the rate it receives data.

In mixed clusters where some of the nodes are < 1.1 please set listkeys_backpressure false in the riak_kv section of app.config until all nodes are upgraded.

{listkeys_backpressure, false}

Once all nodes are upgraded, set listkeys_backpressue to true in the riak_kv section of app.config

{listkeys_backpressure, true}

Running nodes can be upgraded without restarting by running this snippet from the riak console

application:set_env(riak_kv, listkeys_backpressure, true).

Fresh 1.1.0 and above installs default to using listkeys backpressure - adjust app.config if different behavior is desired.

Don’t drop post-commit errors on floor

In previous releases there is no easy way to determine if a post-commit hook is failing. In this release two counters have been added to riak-admin status that will indicate pre/post-commit hook failures. They are precommit_fail and postcommit_fail. By default the errors themselves are not logged. The thought is that a bad hook could cause unnecessary IO overload.

If the error needs to be discovered then Lager, Riak’s logging system, will allow you to dynamically change the logging level to debug on the function executing the hook. To do that you need to riak attach on one of the nodes and run the following.

{ok, Trace} = lager:trace_file("<path>/failing-postcommits", [{module, riak_kv_put_fsm}, {function, decode_postcommit}], debug).

This will output all post-commit errors to <path>/failing-postcommits. When you’ve got enough samples you can stop the trace like so.

lager:stop_trace(Trace).

Other Additions

Default small_vclock to be equal to big_vclock

If you are using bidirectional cluster replication and you have overridden the defaults for either of these then you should consider setting both to the same value.

The default value of small_vclock has been changed to be equal to big_vclock in order to delay or even prevent unnecessary sibling creation in a Riak deployment with bidirectional cluster replication. When you replicate a pruned vector clock the other cluster will think it isn’t a descendent, even though it is, and create a sibling. By raising small_vclock to match big_vclock you reduce the frequency of pruning and thus siblings. Combined with vnode vclocks, sibling creation, for this particular reason, may be entirely avoided since the number of entries will almost always stay below the threshold in a well behaved cluster (i.e. one not under constant node membership change or network partitions).

Known Issues

-Luwak has been deprecated in the 1.1 release -bz1160 - Bitcask fails to merge on corrupt file

Bugs Fixed

-bz775 - Start-up script does not recreate /var/run/riak -bz1283 - erlang_js uses non-thread-safe driver function -bz1333 - Bitcask attempts to open backup/other files -Poolboy - Lots of potential bugs fixed, see detailed post by Andrew Thompson

Lager Specific Bugs Fixed

  • #26 - don’t make a crash log called ‘undefined’
  • #28 - R13A support (god only knows why I bothered merging this)
  • #29 - Don’t unnecessarily quote atoms
  • #31 - Better crash reports for proc_lib processes
  • #33 - Don’t assume supervisor children are named with atoms
  • #35 - Support printing bitstrings (binaries with trailing bits)
  • #37 - Don’t generate dynamic atoms
Something went wrong with that request. Please try again.