Infinite retry loop with heavy CPU load #184
Comments
This edge case was probably not considered back then. The loop should definitely terminate if no progress is made and recovery fails. I would prefer exiting the loop and passing control back to the client application, which could handle the retry depending on the context.
I remember that we needed this recovery loop to handle spurious failures from a device that was connected via a Serial-over-USB adapter. The code is from pre-async Tokio 0.3, where recovery from errors in async code was much more complicated. Making the library code less stateful and opinionated should be the goal. If you see a chance to simplify the code by removing this poor-man's approach to error recovery, then don't hesitate to do it.
Recovering from spurious failures that actually happen in practice sounds like a good reason to keep this code instead of giving up immediately and passing control back to the caller. Instead of adding a short delay, another option would be to define a maximum number of retries.
Sure, a retry limit would also be reasonable. Ideally configurable, but a constant is probably sufficient for now.
I've implemented the retry limit in PR #186
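For readers following the thread, here is a minimal sketch of what a constant retry limit could look like. It is not the code from the PR: `MAX_RETRIES`, `DecodeError`, `looks_like_frame_start` and `find_frame_start` are hypothetical names made up for illustration, and the frame check is a trivial stand-in for the real length detection.

```rust
/// Hypothetical cap on recovery attempts (an assumption, not the value
/// used by tokio-modbus).
const MAX_RETRIES: usize = 20;

#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The retry limit was hit without finding a plausible frame start.
    RetryLimitExceeded,
}

/// Stand-in for real frame detection: a zero byte is never accepted
/// as the start of a frame.
fn looks_like_frame_start(byte: u8) -> bool {
    byte != 0
}

/// Skips leading garbage one byte per retry and gives up after
/// `MAX_RETRIES` retries instead of looping forever.
///
/// Returns `Ok(Some(offset))` when a plausible frame start is found,
/// `Ok(None)` when the buffer is exhausted and more data is needed.
fn find_frame_start(buf: &[u8]) -> Result<Option<usize>, DecodeError> {
    // Offset 0 is the initial attempt, offsets 1..=MAX_RETRIES are retries.
    for offset in 0..=MAX_RETRIES {
        match buf.get(offset) {
            None => return Ok(None),
            Some(&byte) if looks_like_frame_start(byte) => return Ok(Some(offset)),
            Some(_) => {} // drop this byte and try the next one
        }
    }
    Err(DecodeError::RetryLimitExceeded)
}
```

With a buffer full of zeros this returns an error after a bounded number of steps, so the caller regains control and can decide whether to reconnect, wait, or give up.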
I have tested tokio-modbus on a misbehaving Modbus RTU bus, which currently responds with an infinite stream of zeros to all read requests. This is not a theoretical situation, but can happen when a single device on the bus is misconfigured.
I don't expect tokio-modbus to recover from such situations, which is why I've encapsulated the `read_holding_registers` call in a `tokio::time::timeout` and drop the client session after every error/timeout.
However, tokio-modbus currently tries to recover from that and thereby enters an infinite loop without delay or even exponential backoff. My timeout kills that read operation after a few seconds without data, but in the meantime, it retries every millisecond and thereby causes heavy CPU load.
In particular, I'm talking about this loop: `tokio-modbus/src/codec/rtu.rs`, lines 242 to 263 in b8bdb90.
`get_pdu_len` reads a zero byte here and therefore returns an `Err` that is caught by the `or_else` branch. That branch calls `recover_on_error` to clear that zero byte, sets `retry` to `true`, and tries again immediately.
I'm open to any solution that adds a short delay here.
Either via a `retry_delay` parameter that is passed to `tokio::time::sleep` before every retry, or via a passed function that is called before every retry (allows the user to implement exponential backoff).
CC @flosse @uklotzde
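To make the second proposal concrete, here is a rough sketch of a retry loop that calls a user-supplied delay function before every retry. Everything in it is an assumption for illustration: `retry_with_delay` and `exponential_backoff` are hypothetical names, not part of tokio-modbus, and the sleep is passed in as a plain closure so the example needs only the standard library. In the async codec that slot would presumably be an awaited `tokio::time::sleep`.

```rust
use std::time::Duration;

/// Runs `op` until it succeeds or `max_retries` retries have been made.
/// Before each retry, `delay_for(attempt)` picks the pause and `sleep`
/// performs it. Hypothetical helper, not a tokio-modbus API.
fn retry_with_delay<T, E>(
    max_retries: u32,
    mut op: impl FnMut() -> Result<T, E>,
    delay_for: impl Fn(u32) -> Duration,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                if attempt >= max_retries {
                    // Give up and hand the last error back to the caller.
                    return Err(err);
                }
                sleep(delay_for(attempt));
                attempt += 1;
            }
        }
    }
}

/// Example policy: 1 ms, 2 ms, 4 ms, ... capped at 100 ms.
fn exponential_backoff(attempt: u32) -> Duration {
    let millis = 1u64.checked_shl(attempt).unwrap_or(u64::MAX).min(100);
    Duration::from_millis(millis)
}
```

A fixed `retry_delay` is the special case `|_| retry_delay` for the delay function, so the callback variant covers both options at the cost of a slightly larger API surface.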