Adds a connection health flag to the multiplexer #308

anarthal · 2025-09-18T12:22:32Z

Uses this flag in async_exec to make cancel_if_not_connected reliable
Deprecates basic_connection::run_is_canceled()
Introduces preconditions in multiplexer methods to be called by the reader and writer tasks
Ensures that requests are cancelled due to a lost connection only after the reader and writer have finished, avoiding potential race conditions

close #236

anarthal · 2025-09-18T12:28:40Z

The rationale behind this PR is to establish preconditions for multiplexer methods. At the moment, cancel_on_conn_lost is called directly from connection::cancel, so there is no ordering guarantee with respect to the reader and writer tasks. For instance, the following is possible:

The reader is parsing a message. More data is needed, so the task gets suspended while parsing.
The user calls cancel, so cancel_on_conn_lost is called, removing the request we were handling.
The reader then receives new data (before observing the cancellation) and invokes multiplexer::consume_next, probably yielding a crash because the request it was handling is gone.

This is very unlikely, but I'd like to avoid having to test it. With this PR, we guarantee that cancel_on_conn_lost is called only after the reader and writer tasks exit.
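To illustrate the ordering guarantee described above, here is a minimal sketch of a multiplexer-like class that checks its preconditions with assertions. Everything in it is hypothetical: `toy_multiplexer`, `on_task_start`, `on_task_exit`, `tasks_finished` and `pending` are names invented for this example, and the real `boost::redis::detail::multiplexer` is considerably more involved. Only `consume_next` and `cancel_on_conn_lost` echo names from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical, simplified model. NOT the real boost::redis::detail::multiplexer.
class toy_multiplexer {
   std::deque<std::string> pending_;  // requests still waiting for a response
   int running_tasks_ = 0;            // reader/writer tasks that have not exited yet

public:
   void add(std::string req) { pending_.push_back(std::move(req)); }

   // Called by the (hypothetical) reader and writer tasks when they start and exit.
   void on_task_start() { ++running_tasks_; }
   void on_task_exit()
   {
      assert(running_tasks_ > 0);
      --running_tasks_;
   }

   bool tasks_finished() const { return running_tasks_ == 0; }
   std::size_t pending() const { return pending_.size(); }

   // Precondition: there is a pending request to match the response against.
   std::string consume_next()
   {
      assert(!pending_.empty());
      std::string req = std::move(pending_.front());
      pending_.pop_front();
      return req;
   }

   // Precondition: the reader and writer have exited, so nobody can call
   // consume_next() on a request that this function has just removed.
   // Returns the number of requests that were cancelled.
   std::size_t cancel_on_conn_lost()
   {
      assert(tasks_finished());
      std::size_t n = pending_.size();
      pending_.clear();
      return n;
   }
};
```

With the precondition in place, the problematic sequence (cancel while the reader is suspended mid-parse) becomes a contract violation in the caller rather than a latent race inside the multiplexer.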

I've also taken the opportunity to add a flag that signals whether the connection is healthy, which makes cancel_if_not_connected actually do what it advertises.
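As a rough illustration of how such a flag could back cancel_if_not_connected, the sketch below uses a plain boolean that is set when the connection comes up and cleared when it goes down. The types `toy_request_config`, `submit_result` and `toy_health_flag`, and the `submit` function, are assumptions made for this example; the comments on `on_connection_up` and `on_connection_down` mirror the ones in the diff, but the bodies are guesses, not the PR's code.

```cpp
#include <cassert>

// Hypothetical request options; only the flag discussed in this PR is modelled.
struct toy_request_config {
   bool cancel_if_not_connected = false;
};

// Hypothetical outcome of trying to submit a request.
enum class submit_result
{
   queued,                  // request accepted; it will be written when possible
   rejected_not_connected,  // request failed immediately
};

// Hypothetical holder of the health flag (the PR keeps it in the multiplexer).
class toy_health_flag {
   bool conn_healthy_ = false;

public:
   // Might be called in any state.
   void on_connection_up() { conn_healthy_ = true; }

   // Must be called once, with is_connection_healthy() == true.
   void on_connection_down()
   {
      assert(conn_healthy_);
      conn_healthy_ = false;
   }

   bool is_connection_healthy() const { return conn_healthy_; }

   // A request that opted into cancel_if_not_connected fails fast while the
   // connection is not healthy; any other request is queued.
   submit_result submit(const toy_request_config& cfg) const
   {
      if (cfg.cancel_if_not_connected && !conn_healthy_)
         return submit_result::rejected_not_connected;
      return submit_result::queued;
   }
};
```

The point of the sketch is that the decision depends on a flag owned by the multiplexer rather than on whether the underlying stream happens to be open.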

Let me know your views on this. I can split this into two PRs if you prefer.

mzimbres · 2025-09-19T05:17:20Z

include/boost/redis/connection.hpp

-      mpx_.cancel_on_conn_lost();
   }

   bool is_open() const noexcept { return stream_.is_open(); }


Can we deprecate is_open() as well? Or at least it should be renamed to is_healthy().

mzimbres · 2025-09-19T05:22:53Z

include/boost/redis/connection.hpp

-               conn_->mpx_.reset();
               clear_response(conn_->setup_resp_);
+               conn_->read_buffer_.clear();
+               conn_->mpx_.on_connection_up();


I would like to move the implementation towards #104. For this PR, that would mean mpx.on_connection_up() should actually be named mpx.log_data_received() and be called somehow from redis_stream::async_read_some when the read size is different from 0. All my other comments in this PR assume this model.

Edit: perhaps it makes more sense to move the health functionality to the redis_stream, where data is actually received.

This would be equivalent to calling redis_stream::async_read_some with asio::cancel_after(ping_interval), and tearing down the connection if the timeout elapses, wouldn't it?

mzimbres · 2025-09-19T05:25:05Z

include/boost/redis/detail/multiplexer.hpp

+   // Handle connection health
+   void on_connection_up();  // Might be called in any state
+
+   void on_connection_down();  // Must be called once, with is_connection_healthy() == true


This function is not needed since the connection is considered down when data is older than the ping interval specified in config::health_check_interval.

mzimbres · 2025-09-19T05:30:55Z

include/boost/redis/detail/multiplexer.hpp

-   bool cancel_run_called_ = false;
   usage usage_;
   any_adapter receive_adapter_;
+   bool conn_healthy_ = false;


I think this could be a std::chrono::time_point that gets updated on each log_data_received() call. If creating a timestamp is too expensive, I would need to think more about it.

mzimbres · 2025-09-19T05:32:45Z

include/boost/redis/detail/multiplexer.hpp

+
+   void on_connection_down();  // Must be called once, with is_connection_healthy() == true
+
+   bool is_connection_healthy() const { return conn_healthy_; }


This function should check that the timestamp stored in conn_healthy_ is not older than config::health_check_interval.
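The timestamp-based model suggested in these review comments could look roughly like the sketch below. It is only an illustration under stated assumptions: `toy_health_tracker` is a made-up class, the current time and the interval are passed in explicitly to keep the example deterministic, and the choice of `std::chrono::steady_clock` and `std::optional` is mine rather than anything the PR or #104 prescribes. Only `log_data_received` and `is_connection_healthy` are names taken from the discussion.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical sketch of timestamp-based health tracking.
class toy_health_tracker {
public:
   using clock = std::chrono::steady_clock;

   // Records that a non-empty read completed at time `now`.
   void log_data_received(clock::time_point now) { last_data_ = now; }

   // The connection counts as healthy if data has been received and the last
   // data is not older than `health_check_interval`
   // (config::health_check_interval in the discussion).
   bool is_connection_healthy(
      clock::time_point now,
      clock::duration health_check_interval) const
   {
      return last_data_.has_value() && (now - *last_data_) <= health_check_interval;
   }

private:
   std::optional<clock::time_point> last_data_;  // empty until the first read
};
```

In this model there is no explicit "connection down" notification: the connection simply stops being healthy once no data has arrived within the interval.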

anarthal · 2025-09-19T10:34:04Z

I've clearly tried to get too many things together in a single PR :) I'm going to split this into two: first getting rid of the race condition with cancel_on_conn_lost, and then improving the health checker. I need to think about your comments on the latter.

anarthal · 2025-09-19T16:25:32Z

A subset of this PR (including tests) is available in #309. It doesn't include any of the health checking functionality - it only solves the race condition problem.

anarthal added 3 commits September 18, 2025 14:00

Initial impl (3704d68)
Deprecate run_is_canceled (b969158)
Use the flag in the multiplexer (3ca78e1)

mzimbres reviewed Sep 19, 2025

anarthal closed this Sep 19, 2025
