RDSC-4633 How to perform HA failover #2818

Merged
andy-stark-redis merged 8 commits into redis:main from ilianiliev-redis:RDSC-4633-ha-failover-setup on Mar 2, 2026

@ilianiliev-redis ilianiliev-redis commented Feb 23, 2026

Ticket: https://redislabs.atlassian.net/browse/RDSC-4633

Document how to perform an HA failover test.


Note

Low Risk
Low risk documentation-only change; no product code or behavior is modified. Main risk is operational confusion if the iptables example is followed incorrectly during testing.

Overview
Adds a new doc page, ha-test.md, describing how to deliberately trigger and observe RDI HA failover by blocking traffic to the RDI database (via iptables) and watching operator logs, plus cleanup steps.

Updates the VM HA installation guide (install-vm.md) to point readers to the new failover testing instructions.

Written by Cursor Bugbot for commit 48e096a. This will update automatically on new commits.

@andy-stark-redis andy-stark-redis left a comment

Just a few style suggestions, but otherwise LGTM.

```
54.78.220.161
```

2. For each of the IPs returned by the above command, run the following command to block the traffic:
Contributor

Does this mean that you expect dig to return more than one IP address for a hostname? (Presumably you only need to run the command once on the leader node.) If so, maybe say that explicitly in step 1, because it currently says "Identify the database IP", which makes it sound like there is only one address, but "IP" might potentially be plural here.

Contributor Author

It is possible that one hostname points to multiple IPs, for example:

% dig +short example.com
104.18.26.120
104.18.27.120

The idea is that if DNS round-robin or some other DNS-level load balancing is in use, we need to block every IP the hostname resolves to, so that the leader cannot connect through any of them.
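
The "block every returned IP" idea above could be sketched roughly as follows. This is only an illustration: the helper name `print_rules` is hypothetical, and the `iptables ... OUTPUT -d <ip> -j DROP` rule is an assumed way to block outbound traffic, not necessarily the exact command used in ha-test.md. The sketch prints the commands rather than running them, and it handles IPv4 addresses only, skipping other lines such as CNAME targets that `dig +short` may print.

```shell
#!/bin/sh
# Hypothetical helper: read one address per line on stdin and print
# (not execute) an iptables command for each IPv4 address.
# $1 is the iptables action: -A to add the DROP rule, -D to delete it on cleanup.
# Assumption: outbound traffic is blocked with "OUTPUT -d <ip> -j DROP".
print_rules() {
  action="$1"
  while IFS= read -r ip; do
    # Skip blank lines and anything that is not digits and dots
    # (for example a CNAME target such as "cname.example.com.").
    case "$ip" in
      ''|*[!0-9.]*|*.) continue ;;
    esac
    printf 'iptables %s OUTPUT -d %s -j DROP\n' "$action" "$ip"
  done
}

# Demo with the two addresses from the comment above.
printf '104.18.26.120\n104.18.27.120\n' | print_rules -A
```

With live DNS, the input would come from `dig +short <hostname>` instead of `printf`, and running the same pipeline with `-D` would print the matching cleanup commands. Printing instead of executing lets you review the rules before applying them as root on the leader node.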

ilianiliev-redis and others added 5 commits March 2, 2026 13:08
Co-authored-by: andy-stark-redis <164213578+andy-stark-redis@users.noreply.github.com>

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

ilianiliev-redis and others added 2 commits March 2, 2026 13:17
Co-authored-by: andy-stark-redis <164213578+andy-stark-redis@users.noreply.github.com>

To perform HA, you can simulate a connection failure between the leader and the RDI database by blocking the network traffic. You can do this by running the following commands on the leader node:

1. Identify the RDI database IP (replace `<hostname>` with your own hostname):
Contributor Author

@andy-stark-redis I have also added RDI here to make it clear which database we are talking about.

@andy-stark-redis andy-stark-redis merged commit 6bae5e5 into redis:main Mar 2, 2026
4 checks passed
@ilianiliev-redis ilianiliev-redis deleted the RDSC-4633-ha-failover-setup branch March 4, 2026 09:22