Ignore replication_connect_quorum at bootstrap #3428

Closed
lenkis opened this issue May 25, 2018 · 1 comment
Labels: feature, replication

lenkis commented May 25, 2018

At replica set bootstrap -- when calling box.cfg{} for the first time, when no replica set exists yet -- an instance currently ignores replication_connect_quorum and tries to connect to all replicas in box.cfg.replication.

It also ignores replication_connect_quorum when box.cfg{replication} is issued a second time, after the initial box.cfg{}.

lenkis added the bug and replication labels May 25, 2018
kyukhin added the feature label and removed the bug label Jul 9, 2018
kyukhin added this to the 1.9.2 milestone Jul 9, 2018
GeorgyKirichenko commented

This could cause cluster inconsistency, because all new nodes SHOULD be bootstrapped from a single master (exactly one at a time).
My proposal is to wait for all possible masters until the timeout is reached, but not to raise an error if the quorum has been reached (this is the change). This lets us choose the same leader from among all currently working replicas. The timeout should be large enough to connect to all working replicas.

sergepetrenko added a commit that referenced this issue Aug 7, 2018
On bootstrap and after replication reconfiguration,
replication_connect_quorum was ignored. The instance tried to connect to
every replica listed in the replication parameter, and raised an error if
that was not possible.
The patch alters this behaviour. The instance still tries to connect to
every node listed in replication, but does not raise an error if it was
able to connect to at least replication_connect_quorum instances.

Closes #3428
sergepetrenko added a commit that referenced this issue Aug 9, 2018
On bootstrap and after replication reconfiguration,
replication_connect_quorum was ignored. The instance tried to connect to
every replica listed in the replication parameter, and raised an error if
that was not possible.
The patch alters this behaviour. The instance still tries to connect to
every node listed in replication, but does not raise an error if it was
able to connect to at least replication_connect_quorum instances.

Closes #3428

@TarantoolBot document
Title: replication_connect_quorum is no longer ignored.
Now, on replica set bootstrap, an instance tries to connect
not to every replica listed in box.cfg.replication, but
only to replication_connect_quorum replicas. If, after
replication_connect_timeout seconds, the instance is not connected to
at least replication_connect_quorum other instances, we
throw an error.
sergepetrenko added a commit that referenced this issue Aug 10, 2018
Add start arguments to replication test instances to control
replication_timeout and replication_connect_timeout settings between
restarts.

Prerequisite #3428
sergepetrenko added a commit that referenced this issue Aug 10, 2018
On bootstrap and after replication reconfiguration,
replication_connect_quorum was ignored. The instance tried to connect to
every replica listed in the replication parameter, and raised an error if
that was not possible.
The patch alters this behaviour. An instance still tries to connect to
every node listed in replication, but does not raise an error if it was
able to connect to at least replication_connect_quorum instances.

Closes #3428

@TarantoolBot document
Title: replication_connect_quorum is no longer ignored.
Now, on replica set bootstrap and on replication reconfiguration
(e.g. calling box.cfg{replication=...} for the second time), tarantool
doesn't fail if it couldn't connect to every replica but could
connect to replication_connect_quorum replicas. If, after
replication_connect_timeout seconds, the instance is not connected to at
least replication_connect_quorum other instances, we throw an error.
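
The rule described in this commit message can be illustrated with a small Python model. This is only a sketch for reasoning about the behaviour, not Tarantool code: the function name, the exception, and the `connect_delays` input (seconds needed to reach each replica, or `None` if it never answers) are all hypothetical and exist only for this simulation.

```python
class QuorumNotReached(Exception):
    """Raised when too few replicas connected before the timeout."""


def connect_to_replicas(connect_delays, quorum, timeout):
    """Model the connection phase: try every replica, fail only below quorum.

    connect_delays: dict mapping a replica URI to the seconds it takes to
    connect, or None if the replica is unreachable (hypothetical input).
    quorum: stands in for replication_connect_quorum.
    timeout: stands in for replication_connect_timeout, in seconds.
    """
    # Every listed replica is attempted; only those that answer within
    # the timeout count as connected.
    connected = sorted(
        uri
        for uri, delay in connect_delays.items()
        if delay is not None and delay <= timeout
    )
    # Failing to reach some replicas is not an error by itself;
    # only falling short of the quorum is.
    if len(connected) < quorum:
        raise QuorumNotReached(
            f"connected to {len(connected)} of {len(connect_delays)} "
            f"replicas, quorum is {quorum}"
        )
    return connected


if __name__ == "__main__":
    delays = {"replica1:3301": 0.2, "replica2:3301": 1.5, "replica3:3301": None}
    # Two of three replicas answer in time, which satisfies a quorum of 2.
    print(connect_to_replicas(delays, quorum=2, timeout=4))
```

With a quorum of 3 and the same input, the model raises `QuorumNotReached`, which corresponds to the error thrown after replication_connect_timeout seconds.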
locker pushed a commit that referenced this issue Aug 14, 2018
Add start arguments to replication test instances to control
replication_timeout and replication_connect_timeout settings
between restarts.

Needed for #3428
@locker locker closed this as completed in c1a16b2 Aug 14, 2018