Adds Sentinel support #345

anarthal · 2025-10-30T13:56:08Z

close #237
close #269
close #268
close #229

anarthal · 2025-10-30T13:56:32Z

Still an early prototype, I want to check that my CI approach is viable.

include/boost/redis/detail/sentinel_resolve_fsm.hpp

include/boost/redis/detail/exec_one_fsm.hpp

include/boost/redis/impl/run_fsm.ipp

mzimbres · 2025-11-15T22:29:30Z

include/boost/redis/impl/sentinel_resolve_fsm.ipp

+
+         // Store the resulting address in a well-known place
+         if (st.cfg.sentinel.server_role == role::master) {
+            st.cfg.addr = resp.master_addr;


IMO this assignment should be moved to run_op, right after async_sentinel_resolve resumes, to make that function more readable. It would be more intuitive if async_sentinel_resolve completed with the value instead of changing the connection_impl state. Can we change the completion signature to void(error_code, address) to achieve that?

EDIT: We could also extend the completion signature with a third parameter holding the new sentinel addresses, which are also used in run_op. This way async_sentinel_resolve would not change any state in connection_impl.

Moving the assignment or the update_sentinel_list to run makes run (which is already big) have even more responsibilities. The tests are already gigantic, so I'm not very keen on this.

I don't think adding a 2nd (or 3rd) completion parameter to the run FSM does much good, because the parameter needs to be there for all code paths but has a valid value in only one of them. Unless you have a very clear opinion on this, I'd prefer for this to stay as is.
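
For readers following the thread, the alternative under discussion (completing with the resolved values rather than writing them into the connection state) could look roughly like the sketch below. Every name in it (`address`, `resolve_result`, `sentinel_response`, `config`, `resolve_from_response`, `apply_resolution`) is a hypothetical stand-in rather than a type from this PR, and the code is synchronous to keep it short. It also shows the objection raised above: on the error path, the extra members carry no meaningful value.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's address type.
struct address {
   std::string host;
   std::string port;
};

// Hypothetical payload of a void(error_code, address, sentinels) completion.
struct resolve_result {
   std::error_code ec;
   address server;
   std::vector<address> sentinels;
};

// Hypothetical parsed Sentinel response.
struct sentinel_response {
   address master_addr;
   std::vector<address> sentinels;
};

// Hypothetical slice of the connection state that run_op would own.
struct config {
   address addr;
   std::vector<address> sentinels;
};

// Returns the outcome by value and touches no shared state.
inline resolve_result resolve_from_response(const sentinel_response& resp)
{
   if (resp.master_addr.host.empty())
      return {std::make_error_code(std::errc::host_unreachable), {}, {}};
   return {std::error_code{}, resp.master_addr, resp.sentinels};
}

// The caller (run_op in the proposal) decides where the values are stored.
inline std::error_code apply_resolution(config& cfg, const sentinel_response& resp)
{
   resolve_result res = resolve_from_response(resp);
   if (res.ec)
      return res.ec;  // server/sentinels are empty placeholders on this path
   cfg.addr = std::move(res.server);
   cfg.sentinels = std::move(res.sentinels);
   return {};
}
```

The trade-off is the one described in the two comments above: the resolve step becomes free of side effects, but the caller grows, and the extra result members are only meaningful on success.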

include/boost/redis/impl/sentinel_utils.hpp

mzimbres · 2025-11-15T22:50:48Z

include/boost/redis/impl/sentinel_utils.hpp

+}
+
+// Parses a list of replicas or sentinels
+inline system::error_code parse_server_list(


Can we move this to .ipp?

Can we take a std::vector<node> instead and work with the .at() function instead of iterators? We don't need performance here, so I would prefer to be on the safer side.

Can we move this to .ipp?

I don't think it makes sense because this is an impl/ file. It gets included by .ipp files only, and is never seen by the end-user code.

Can we take a std::vector<node> instead and work with the .at() function instead of iterators? We don't need performance here, so I would prefer to be on the safer side.

I'm gonna defer this to #355. flat_tree contains the number of messages, and this makes parsing much easier. None of these asserts can be triggered unless there's a bug in the parser, so it's not much of a problem. I will address it though, since you're right about the security/performance tradeoff here.
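
To make the suggestion concrete, here is a minimal sketch of bounds-checked parsing with `.at()`. It assumes a simplified, hypothetical `node` layout (a header node whose `aggregate_size` counts the key/value strings that follow); this is not the library's actual node type, nor the signature used in the PR. Out-of-range access turns into an error code instead of undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical, simplified node: a header carries the number of strings
// that follow it; leaf nodes carry a value and aggregate_size == 0.
struct node {
   std::size_t aggregate_size;
   std::string value;
};

// Hypothetical stand-in for the library's address type.
struct address {
   std::string host;
   std::string port;
};

// Parses a flat list of servers. Each server is a header node followed by
// alternating key/value leaf nodes. Every access goes through .at(), so a
// truncated or malformed list yields protocol_error.
inline std::error_code parse_server_list(const std::vector<node>& nodes, std::vector<address>& out)
{
   const std::error_code bad = std::make_error_code(std::errc::protocol_error);
   out.clear();
   try {
      std::size_t i = 0;
      while (i < nodes.size()) {
         const std::size_t n = nodes.at(i++).aggregate_size;
         if (n % 2 != 0) {
            out.clear();
            return bad;  // keys and values must come in pairs
         }
         address addr;
         for (std::size_t j = 0; j < n; j += 2) {
            const std::string& key = nodes.at(i + j).value;
            const std::string& val = nodes.at(i + j + 1).value;
            if (key == "ip")
               addr.host = val;
            else if (key == "port")
               addr.port = val;
         }
         i += n;
         if (addr.host.empty() || addr.port.empty()) {
            out.clear();
            return bad;  // a server entry without ip or port is unusable
         }
         out.push_back(std::move(addr));
      }
   } catch (const std::out_of_range&) {
      out.clear();
      return bad;  // the list ended before a declared field
   }
   return {};
}
```

The cost is one bounds check per access, which is negligible for a response that is parsed once per resolution.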

mzimbres · 2025-11-15T22:57:01Z

I am half way through it. The sentinel_utils.hpp implements hard stuff so I will have a more detailed look at it, perhaps tomorrow.

mzimbres · 2025-11-16T20:20:54Z

I finished reviewing. I haven't found anything serious to fix, only things to consider. We can also do those in another PR to avoid delaying the merge; we have plenty of time until the next release. Many thanks, you implemented this pretty fast considering its size.

mzimbres · 2025-11-16T20:26:26Z

One last question: IIUC, we resolve with the sentinel using the same connection_impl that we use to interact with the server. Is it safe for users to start async_exec and receive, and to cancel these ops, while the master/replica is being resolved? Shouldn't we add a new connection_impl that is only used to resolve with sentinel? Or is it clear to you that mixing these ops won't interfere with each other?

mzimbres · 2025-11-19T10:10:41Z

When merged we can also close these tickets https://github.com/boostorg/redis/issues?q=is%3Aissue%20state%3Aopen%20label%3Asentinel

This reverts commit 44c7f4a.

anarthal

One last question: IIUC, we resolve with the sentinel using the same connection_impl that we use to interact with the server. Is it safe for users to start async_exec and receive, and to cancel these ops, while the master/replica is being resolved? Shouldn't we add a new connection_impl that is only used to resolve with sentinel? Or is it clear to you that mixing these ops won't interfere with each other?

It's safe. Actually, the exec test hits this situation, and I've added a test for receive as well. exec only touches the multiplexer queue, and receive2 only touches the channel. Everything else is managed by run. run ensures that you're either resolving addresses or communicating with the server, but not both.

We could extract the read buffer from the multiplexer to make this point even clearer, since it's used by async_exec_one, and the current usage is a little bit weird. But I'll probably defer this to another PR.
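
As a rough illustration of the separation described above, the toy model below has exec only enqueue requests while a single run step is in exactly one phase at a time. `phase`, `connection_sketch`, `exec`, and `run_step` are hypothetical names for this sketch only; the real connection_impl, multiplexer, and run FSM are asynchronous and considerably more involved.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical phases of the run loop: never both at once.
enum class phase { resolving, connected };

struct connection_sketch {
   phase current = phase::resolving;
   std::deque<std::string> queue;  // stand-in for the multiplexer queue
   std::vector<std::string> sent;  // stand-in for what reached the server

   // exec only enqueues; it never looks at the phase or the transport.
   void exec(std::string request) { queue.push_back(std::move(request)); }

   // One step of run: either finish resolving, or flush queued requests.
   void run_step()
   {
      if (current == phase::resolving) {
         current = phase::connected;  // pretend Sentinel returned an address
         return;
      }
      while (!queue.empty()) {
         sent.push_back(std::move(queue.front()));
         queue.pop_front();
      }
   }
};
```

In this model, a request issued during resolution simply waits in the queue until run reaches the connected phase.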

Added support for Sentinel

a228511

anarthal force-pushed the feature/sentinel branch from ae013c1 to a228511 on November 7, 2025 09:25

anarthal added 27 commits November 9, 2025 19:21

Add support for replicas

52c74ee

Test and fixes

070d645

Adjust discussion

e4c51f3

Use a random replica

68bb941

Unknown master test for replicas

fd62b08

Update the reference with references to replicas

081acec

Docs for server_role

af8e9c9

no_replicas docs

812b096

Adjust the error code in resolve

2fc8c39

no_replicas stringification

ac364e5

test enum values

161e1f0

better logging 1

62e983c

Decrease TLS-related verbosity in Sentinel logs

7ede365

better Sentinel logging 2

446224a

parse_sentinel_response 1st test

44cad21

parse_sentinel_response master tests

ba26343

replica test

551e3bb

fixture

bf3f5a1

more replica tests

571f940

Remove redundant parameter

6cc0809

error tests 1

47faa6a

SENTINEL GET-MASTER-ADDR-BY-NAME errors

44c1dcc

simplification

f5f33a6

SENTINEL SENTINELS errs

fadbe9f

replica tests

2ba18b2

Take update_sentinel_list out

4e51d85

operator== for address

c84d29a