Applier fiber — ER_READONLY: Can't modify data because this instance is in read-only mode. #1204

Closed
filonenko-mikhail opened this issue Dec 26, 2020 · 3 comments · Fixed by #1643
Labels: bug, cartridge, customer

Comments

@filonenko-mikhail
Collaborator

After some admin actions on a new cluster, I can't reboot the whole cluster.

It seems that it's somehow possible to end up with several masters while one of them is still not bootstrapped.

The results

repro.tar.gz

tar xzf repro.tar.gz

cd repro 
tarantoolctl rocks install cartridge 2.3.0
cartridge start --name makeshort

# storage-02 can't start: ER_READONLY
@rosik
Contributor

rosik commented Dec 26, 2020

It seems you're missing a stateboard.

If you restart a cluster without a stateboard, no instance can fetch the leader information, so every instance stays read-only.
If you then try to join an instance, it will fail.
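
To illustrate the point above, here is a toy model in plain Lua of how an instance might decide whether it is read-only. It is only a sketch: `is_read_only` and `fetch_leader` are hypothetical names, not Cartridge's API, and the real failover logic is more involved.

```lua
-- Toy model (hypothetical, not Cartridge code).
-- fetch_leader is a function that asks the stateboard for the current
-- leader UUID. It raises an error when the stateboard is unreachable.
local function is_read_only(instance_uuid, fetch_leader)
    local ok, leader = pcall(fetch_leader)
    if not ok or leader == nil then
        -- No leader information at all: stay read-only to be safe.
        return true
    end
    -- Only the appointed leader is writable.
    return leader ~= instance_uuid
end

-- Without a stateboard every instance ends up read-only.
local function no_stateboard()
    error('stateboard unreachable')
end

print(is_read_only('uuid-3301', no_stateboard))
print(is_read_only('uuid-3302', no_stateboard))
```

Under this model, a joining replica has no writable instance to bootstrap from, which matches the ER_READONLY failure.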

@rosik
Contributor

rosik commented Dec 26, 2020

But there's another real problem: the same error arises when the leader isn't first in the failover priority list.

Steps to reproduce:

  1. Start 3 unconfigured instances and a stateboard
  2. Compose a replicaset of two instances (3301 and 3302) with 3301 at the top of the failover priority list
  3. Enable stateful failover
  4. Promote the second instance (3302)
  5. Try to join the third instance (3303) to the same replicaset

Result:

I> bootstrapping replica from 4c4c9e4a-bdf9-4250-bb59-200155272620 at 127.0.0.1:3301
E> ER_READONLY: Can't modify data because this instance is in read-only mode. 

@rosik added the bug label Dec 26, 2020
@rosik
Contributor

rosik commented Dec 26, 2020

Here is the problem:

local function boot_instance(clusterwide_config)
    -- ...
    local leader_uuid = leaders_order[1] -- !!! with stateful failover this is not always the leader
    local leader = topology_cfg.servers[leader_uuid]
    box_opts.replication = {pool.format_uri(leader.uri)}
    box.cfg(box_opts)
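
One possible direction, sketched in plain Lua so it runs without Tarantool: prefer the leader appointed by the failover coordinator and fall back to the first entry of the priority list only when no appointment is known. `pick_bootstrap_leader`, `appointed_leader`, and the `uuid-...` keys are hypothetical and only for illustration. This is not necessarily how the actual fix is implemented.

```lua
-- Hypothetical helper, not Cartridge's actual code.
-- leaders_order:    instance UUIDs in failover priority order.
-- appointed_leader: UUID reported by the failover coordinator, or nil.
-- servers:          map from UUID to a table with a `uri` field.
local function pick_bootstrap_leader(leaders_order, appointed_leader, servers)
    if appointed_leader ~= nil and servers[appointed_leader] ~= nil then
        -- Stateful failover: trust the appointment, not the priority list.
        return appointed_leader
    end
    -- No appointment known: fall back to the top of the priority list.
    return leaders_order[1]
end

local servers = {
    ['uuid-3301'] = {uri = '127.0.0.1:3301'},
    ['uuid-3302'] = {uri = '127.0.0.1:3302'},
}
local leaders_order = {'uuid-3301', 'uuid-3302'}

-- 3302 was promoted, so the new replica should bootstrap from it.
local chosen = pick_bootstrap_leader(leaders_order, 'uuid-3302', servers)
print(servers[chosen].uri) -- prints 127.0.0.1:3302
```

With the reproduction steps above, this would point `box_opts.replication` at 3302, the writable instance, instead of the read-only 3301.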
