Symptom: a single slow node may cause user fetch failures.
Riak CS fetches a user object in two steps. The first step is a strong get with PR=all,
in which a single slow Riak node can cause a timeout error at the client.
When a timeout occurs in riakc_pb_socket, it closes the TCP connection
and enters a wait-and-retry loop.
Riak CS then attempts the second step, a get with weak options, but the reconnect
has most likely not happened yet, so the request fails with a disconnected error.
If the "slow" node is completely frozen (it produces no response at all),
then after the health check times out, the strong get fails with "insufficient vnodes"
and the weak get should work. In this case, the affected users cannot
access Riak CS for a finite period, 60 seconds by default.
Reproduction (or simulation):
1. Create a 4-node cluster ({get_user_timeout, 3000} in advanced.config may help).
2. Note the pid of dev2.
3. Freeze it: kill -s SIGSTOP $DEV2 (keep your fingers crossed; if you are unlucky, freeze another node 🙉).
4. Make any request to Riak CS.
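The freeze step can be tried safely on a throwaway process before touching a real node. The sketch below is only an illustration: it uses a background `sleep` as a stand-in for the dev2 Riak node (an assumption, not part of the original report), freezes it with SIGSTOP, checks the process state with `ps`, and then resumes it with SIGCONT. The signal names are written without the `SIG` prefix because that form is accepted by every POSIX `kill`.

```shell
#!/bin/sh
# Stand-in for the dev2 node (assumption): any long-running process will do.
sleep 300 &
DEV2=$!

# Freeze the process. It stays alive and keeps its sockets open,
# but it no longer answers anything, which is what makes it "slow".
kill -s STOP "$DEV2"
sleep 1
STATE_FROZEN=$(ps -o stat= -p "$DEV2" | cut -c1)
echo "state after STOP: $STATE_FROZEN"

# Resume the process once the experiment is finished.
kill -s CONT "$DEV2"
sleep 1
STATE_RESUMED=$(ps -o stat= -p "$DEV2" | cut -c1)
echo "state after CONT: $STATE_RESUMED"

# Clean up the stand-in process.
kill "$DEV2"
wait "$DEV2" 2>/dev/null || true
```

On Linux, `ps` reports a state beginning with `T` for a stopped process. Against a real cluster, `DEV2` would instead hold the pid of the dev2 Riak node, and `kill -s CONT $DEV2` brings the node back after the test.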
Basho-JIRA changed the title from "Timeout in get user with strong option makes subsequent weak get fail" to "Timeout in get user with strong option makes subsequent weak get fail [JIRA: RCS-250]" on Jul 29, 2015.
For the release notes, short version: Improve user object fetch logic when some nodes are slow or have silently failed.
For the longer version, please refer to the description of this issue.