Merge pull request #88 from basho/jem-rel-notes-known-issues
Update release notes with known issues
jaredmorrow committed Sep 30, 2011
2 parents 3c921cb + 113be69 commit bf28217
Showing 1 changed file with 80 additions and 24 deletions.
104 changes: 80 additions & 24 deletions RELEASE-NOTES.org
@@ -1,25 +1,5 @@
* Riak 1.0.0 Release Notes



** Rolling Upgrade From Riak Search 0.14.2 to Riak 1.0.0

There are a couple of caveats for a rolling upgrade from Riak Search
0.14.2 to Riak 1.0.0.

First, there are some extra steps that need to be taken when
installing the new package. Instead of simply installing the new
package, you must uninstall the old one, move the data dir, and then
install the new package.

Second, while in a mixed cluster state some queries will return
incorrect results. It's tough to say which queries will exhibit this
behavior because it depends on which node stores the data and which
node coordinates the query. Essentially, if two nodes with different
versions need to coordinate on a query it will produce incorrect
results. Once all nodes have been upgraded to 1.0.0 all queries will
return the correct results.


** Major Features and Improvements for Riak
*** 2i
Secondary Indexes (2I) makes it easier to find your data in
@@ -129,10 +109,12 @@ return the correct results.
cluster. The leave command ensures that the exiting node will
hand off all its partitions before leaving the cluster. It should be
executed by the node intended to leave.
- =riak-admin remove= no longer exists. Use =riak-admin leave= to safely
remove a node from the cluster, or =riak-admin force-remove= to remove
an unrecoverable node.
- =riak-admin force-remove= immediately removes a node from the cluster
without having it first hand off data. All data replicas are therefore
lost. This is designed for cases where a node is unrecoverable.
- The new cluster changes require all nodes to be up and reachable in
order for new members to be integrated into the cluster and for the
data to be rebalanced. During brief node outages, the new protocol
@@ -143,6 +125,12 @@ return the correct results.
nodes and performing ring rebalances. Nodes marked as down will
automatically rejoin and reintegrate into the cluster when they come
back online.
- When performing a rolling upgrade, the cluster will auto-negotiate
the proper gossip protocol, using the legacy gossip protocol while
there is a mixed-version cluster. During the upgrade, executing
=riak-admin ringready= and =riak-admin transfers= from a non-1.0
node will fail. However, executing those commands from a 1.0 node
will succeed and give the desired information.
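
The membership and upgrade notes above reduce to a short command
sequence. A sketch, with hypothetical node names; per the notes, run
the status commands from a 1.0 node during a rolling upgrade:

```shell
# Run on the node that should leave the cluster;
# it hands off all of its partitions before exiting:
riak-admin leave

# Run from a live node to drop an unrecoverable node
# (no handoff occurs, so the replicas it held are lost):
riak-admin force-remove riak@host2.example.com

# Mark an unreachable node as down so rebalancing can proceed:
riak-admin down riak@host3.example.com

# During a rolling upgrade, these succeed only from a 1.0 node:
riak-admin ringready
riak-admin transfers
```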




*** Get/Put Improvements
@@ -301,6 +289,73 @@ access to the =other_field=.
- Fixed bug in =lucene_parser= to handle all errors returned from
calls to =lucene_scan:string=.



** Known Issues
*** Rolling Upgrade From Riak Search 0.14.2 to Riak 1.0.0

There are a couple of caveats for a rolling upgrade from
Riak Search 0.14.2 to Riak 1.0.0.

First, there are some extra steps that need to be taken when
installing the new package. Instead of simply installing the new
package, you must uninstall the old one, move the data dir, and then
install the new package.
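
The extra packaging steps can be sketched for a Debian-style system;
the package names and data paths below are assumptions, so verify
them for your platform before running anything:

```shell
# Stop the old Riak Search node and remove its package
# (remove only; do not purge, so the data directory survives):
riaksearch stop
dpkg -r riaksearch

# Move the data dir to where the new package expects it
# (both paths are assumptions; check your installation):
mv /var/lib/riaksearch /var/lib/riak

# Install the Riak 1.0.0 package and restart:
dpkg -i riak_1.0.0-1_amd64.deb
riak start
```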

Second, while in a mixed cluster state some queries will return
incorrect results. It's tough to say which queries will exhibit this
behavior because it depends on which node stores the data and which
node coordinates the query. Essentially, if two nodes with different
versions need to coordinate on a query it will produce incorrect
results. Once all nodes have been upgraded to 1.0.0 all queries will
return the correct results.

*** Intermittent CRASH REPORT on node leave (bz://1218)

There is a harmless race condition that sometimes triggers a crash
when a node leaves the cluster. It can be ignored. It shows up on the
console/logs as:

=(08:00:31.564 [notice] "node removal completed, exiting.")=

=(08:00:31.578 [error] CRASH REPORT Process riak_core_ring_manager with 0 neighbours crashed with reason: timeout_value)=

*** Node stats incorrectly report pbc_connects_total

The new code path for recording stats is not currently incrementing the
total number of protocol buffer connections made to the node, causing it
to incorrectly report 0 in both =riak-admin status= and =GET /stats=.
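
One way to observe the symptom, assuming the default HTTP interface
on =127.0.0.1:8098=:

```shell
# Even after clients have opened protocol buffer connections,
# an affected node keeps reporting this counter as 0:
curl -s http://127.0.0.1:8098/stats | grep -o '"pbc_connects_total":[0-9]*'
```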

*** Secondary Indexes not supported under Multi Backend

Multi Backend does not correctly expose all capabilities of its
child backends. This prohibits using Secondary Indexes with Multi
Backend. Currently, Secondary Indexing is only supported by the
eLevelDB backend (=riak_kv_eleveldb_backend=). Tracked as [[https://issues.basho.com/show_bug.cgi?id=1231][Bug 1231]].
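
For reference, a configuration like the following sketch will not get
2i, even though one of the children is eLevelDB; this is a
hypothetical =app.config= fragment with illustrative backend names:

```erlang
%% Sketch of an app.config fragment (illustrative backend names).
%% 2i does NOT work through riak_kv_multi_backend, even with an
%% eleveldb child; use riak_kv_eleveldb_backend directly instead.
{riak_kv, [
    %% {storage_backend, riak_kv_multi_backend},  %% 2i unsupported
    {storage_backend, riak_kv_eleveldb_backend},  %% 2i supported
    {multi_backend_default, <<"bitcask_be">>},
    {multi_backend, [
        {<<"bitcask_be">>,  riak_kv_bitcask_backend,  []},
        {<<"eleveldb_be">>, riak_kv_eleveldb_backend, []}
    ]}
]}.
```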

*** MapReduce reduce phase may run more often than requested

If a reduce phase of a MapReduce query is handed off from one Riak
Pipe vnode to another it immediately and unconditionally reduces the
inputs it has accumulated. This may cause the reduce function to be
evaluated more often than requested by the batch size configuration
options. Tracked as [[https://issues.basho.com/show_bug.cgi?id=1183][Bug 1183]] and [[https://issues.basho.com/show_bug.cgi?id=1184][Bug 1184]].
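
The batch size in question is configured per reduce phase via the
phase's =arg=. A hypothetical query over HTTP (the bucket, endpoint,
and built-in function names are illustrative):

```shell
# Even with reduce_phase_batch_size set, a Riak Pipe handoff can
# force an extra, early evaluation of the reduce function:
curl -s -X POST http://127.0.0.1:8098/mapred \
  -H "Content-Type: application/json" \
  -d '{"inputs": "mybucket",
       "query": [
         {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
         {"reduce": {"language": "javascript", "name": "Riak.reduceSum",
                     "arg": {"reduce_phase_batch_size": 100}}}]}'
```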

*** Potential Cluster/Gossip Overload

The new cluster protocol is designed to ensure that a Riak cluster
converges as quickly as possible. When running multiple Riak nodes on
a single machine, the underlying gossip mechanism may become CPU-bound
for a period of time and cause cluster-related commands to time
out. This includes the following =riak-admin= commands: =join,
leave, remove, member_status, ring_status=. Incoming client requests
and other Riak operations will continue to function, although latency
may be impacted. The cluster will continue to handle gossip messages
and will eventually converge, resolving this issue. Note: This
behavior only occurs when adding/removing nodes from the cluster, and
will not occur when a cluster is stable. Also, this behavior has only
been observed when running multiple nodes on a single machine, and has
not been observed when running Riak on multiple servers or EC2
instances.

** Bugs Fixed ** Bugs Fixed
-[[https://issues.basho.com/show_bug.cgi?id=0105][bz0105 - Python client new_binary doesn't set the content_type well]] -[[https://issues.basho.com/show_bug.cgi?id=0105][bz0105 - Python client new_binary doesn't set the content_type well]]
-[[https://issues.basho.com/show_bug.cgi?id=0123][bz0123 - default_bucket_props in app.config is not merged with the hardcoded defaults]] -[[https://issues.basho.com/show_bug.cgi?id=0123][bz0123 - default_bucket_props in app.config is not merged with the hardcoded defaults]]
Expand Down Expand Up @@ -382,4 +437,5 @@ access to the =other_field=.
-[[https://issues.basho.com/show_bug.cgi?id=1224][bz1224 - platform_data_dir (/data) is not being created before accessed for some packages]] -[[https://issues.basho.com/show_bug.cgi?id=1224][bz1224 - platform_data_dir (/data) is not being created before accessed for some packages]]
-[[https://issues.basho.com/show_bug.cgi?id=1226][bz1226 - Riak creates identical vtags for the same bucket/key with different values]] -[[https://issues.basho.com/show_bug.cgi?id=1226][bz1226 - Riak creates identical vtags for the same bucket/key with different values]]
-[[https://issues.basho.com/show_bug.cgi?id=1227][bz1227 - badstate crash in handoff]] -[[https://issues.basho.com/show_bug.cgi?id=1227][bz1227 - badstate crash in handoff]]
-[[https://issues.basho.com/show_bug.cgi?id=1229][bz1229 - "Downed" (riak-admin down) nodes don't rejoin cluster]]

