In a 2-worker cluster, a modest workload sent to initializer triggers an intermittent behavior in worker1 where worker1 freezes after printing a series of 15 or more Muting DataChannel messages. Work continues on initializer until the point when the data channel(s) from initializer -> worker1 apply back pressure to initializer.
vagrant@ubuntu-bionic:/build2$ reset.sh
WARNING: all useful state files are deleted by this script!
vagrant@ubuntu-bionic:/build2$ start-cluster.sh 2
WARNING: all useful state files are deleted by this script!
Worker initializer: port = 7107
Worker worker1: port = 7117
Success
vagrant@ubuntu-bionic:/build2$ for i in `seq 1 3600`; do /bin/echo -n . ; cat /path/to/3118.csv | ./frame-text-lines.py | nc -w 1 localhost 7100; done
* Move increment of _ack_counter into _maybe_ack()
* Add call to _maybe_ack() to forward_barrier() to fix #3118.
Under very low application message rates and resilience=on,
the accumulation of barrier messages on the other end of the
boundary was caused by not acking here. The _ack_counter
could become an odd number, and _maybe_ack()'s modulo
arithmetic + comparison could never be true, which meant
never sending an ack.
Fixes #3118
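To make the failure mode in the commit message easier to follow, here is a toy Python model of it. This is not Wallaroo's Pony code: the class names, the `ACK_INTERVAL` value of 2, and the strict message/barrier alternation are all assumptions chosen for illustration. It only shows how a counter that is incremented in one place without the ack check can leave the modulo test permanently false at the one place where it is checked.

```python
ACK_INTERVAL = 2  # hypothetical value; the real interval is not stated in the issue


class BuggyReceiver:
    """Counter is incremented in two places, but only one of them checks for an ack."""

    def __init__(self):
        self.ack_counter = 0
        self.acks_sent = 0

    def _maybe_ack(self):
        # Ack only when the counter lands exactly on a multiple of the interval.
        if self.ack_counter % ACK_INTERVAL == 0:
            self.acks_sent += 1

    def receive_message(self):
        self.ack_counter += 1
        self._maybe_ack()

    def forward_barrier(self):
        # Counted, but no ack check follows (the assumed pre-fix behavior).
        self.ack_counter += 1


class FixedReceiver(BuggyReceiver):
    """Increment lives inside _maybe_ack(), and both paths call it."""

    def _maybe_ack(self):
        self.ack_counter += 1
        if self.ack_counter % ACK_INTERVAL == 0:
            self.acks_sent += 1

    def receive_message(self):
        self._maybe_ack()

    def forward_barrier(self):
        self._maybe_ack()


def simulate(receiver, rounds=10):
    # Very low message rate: one application message, then one barrier, repeated.
    for _ in range(rounds):
        receiver.receive_message()
        receiver.forward_barrier()
    return receiver.acks_sent


if __name__ == "__main__":
    print("buggy acks:", simulate(BuggyReceiver()))
    print("fixed acks:", simulate(FixedReceiver()))
```

In this model the buggy receiver only ever checks the counter when it is odd, so it never acks, while the fixed receiver acks on every second event.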
Is this a bug, feature request, or feedback?
Bug
What is the current behavior?
In a 2-worker cluster, a modest workload sent to initializer triggers an intermittent behavior in worker1 where worker1 freezes after printing a series of 15 or more Muting DataChannel messages. Work continues on initializer until the point when the data channel(s) from initializer -> worker1 apply back pressure to initializer.
What is the expected behavior?
No freeze
What OS and version of Wallaroo are you using?
Ubuntu Bionic/18.04 LTS + Wallaroo @ commit 35d2038
Steps to reproduce?
env PONYCFLAGS="--verbose=1 --debug -Dresilience -Dtrace -Dcheckpoint_trace -Didentify_routing_ids" make
when building Machida3. The bug may take up to an hour before manifesting. See full logs at http://wallaroolabs-dev.s3.amazonaws.com/logs/logs.1583818189.tar.gz. From /tmp/wallaroo.1: