This repository has been archived by the owner on Apr 23, 2020. It is now read-only.

Rolling Ensemble Change

Randgalt edited this page Apr 16, 2012 · 11 revisions

Background

ZooKeeper ensembles are statically configured and difficult to change once running. Exhibitor makes changing the servers in the ensemble much easier through its support for Rolling Configuration Changes.

When you make a configuration change that requires ZooKeeper instances to be restarted, you are given the choice of applying the change all at once or as a “Rolling Release”. In a rolling release, the config changes are applied to the ZooKeeper instances one at a time, which keeps the ensemble in quorum and available while the change is made. During the rolling release you’ll see this message on the Config tab:

Rolling Configuration Change
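To see why restarting one instance at a time keeps the ensemble in quorum, here is a small worked check. It assumes ZooKeeper's default majority quorum (no hierarchical quorums or weights) and that every other server is healthy during the roll. The function names are illustrative only and are not part of Exhibitor.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble (default majority quorum assumed)."""
    return ensemble_size // 2 + 1


def survives_single_restart(ensemble_size: int) -> bool:
    """True if a majority is still up while exactly one server is restarting."""
    return ensemble_size - 1 >= quorum_size(ensemble_size)


for n in (2, 3, 5):
    print(n, quorum_size(n), survives_single_restart(n))
```

Under these assumptions, an ensemble of three or more servers keeps its majority with one server down, whereas a two-server ensemble does not.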

Scenarios

Please see the Workflows page for examples/scenarios.

Details

The rolling config change is accomplished via the Shared Configuration. When a rolling config change is in progress, two versions of the configuration are stored: a “master” set of values and a “rolling” set of values. Additionally, a rolling state value is stored that controls which instances in the ensemble should get the master values and which should get the rolling values. Here’s the pseudo flow:

  • Rolling values are written to the shared config store
  • The list of rolling hostnames is generated: new servers are listed first, followed by servers that are not changing. Servers that are being removed are not in the list because they can update “all at once” when the roll is complete. A rolling index is set to 0, i.e. the first hostname in the list should apply the changes.
  • Each instance polls for shared configuration changes. The first server in the rolling hostnames list applies the new “rolling values”, which causes it to rewrite zoo.cfg and restart its ZooKeeper instance. After the instance achieves stable quorum, the rolling index is advanced.
  • The next instance in the rolling hostnames list will now notice the change and apply it.
  • Once the list has been fully processed, the rolling state is turned off and the master values are updated with the rolling values, which causes any remaining instances (such as those being removed) to see the new values.
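The flow above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not Exhibitor's actual implementation: `SharedConfig`, `build_rolling_hosts`, `start_roll`, `values_for`, `poll`, and the `"ensemble"` key are all hypothetical names, the shared config store is a plain in-memory object, and "rewrite zoo.cfg and restart" is reduced to recording which values a host has applied. It also assumes that hosts at or before the rolling index receive the rolling values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedConfig:
    """Hypothetical in-memory stand-in for the shared config store."""
    master: dict
    rolling: Optional[dict] = None  # None means no roll is in progress
    rolling_hosts: list = field(default_factory=list)
    rolling_index: int = 0


def build_rolling_hosts(current_hosts, target_hosts):
    """New servers first, then unchanged ones; removed servers are omitted."""
    new = [h for h in target_hosts if h not in current_hosts]
    unchanged = [h for h in target_hosts if h in current_hosts]
    return new + unchanged


def start_roll(store, new_values, current_hosts, target_hosts):
    """Write the rolling values, the hostname list, and reset the index to 0."""
    store.rolling = new_values
    store.rolling_hosts = build_rolling_hosts(current_hosts, target_hosts)
    store.rolling_index = 0


def values_for(store, hostname):
    """Values a host should run with, given the current rolling state."""
    if store.rolling is None:
        return store.master
    reached = store.rolling_hosts[: store.rolling_index + 1]
    return store.rolling if hostname in reached else store.master


def poll(store, hostname, applied, quorum_is_stable=lambda: True):
    """One poll cycle for one instance; returns True if it changed its config."""
    wanted = values_for(store, hostname)
    changed = applied.get(hostname) != wanted
    if changed:
        applied[hostname] = wanted  # stands in for rewriting zoo.cfg and restarting
    rolling = store.rolling is not None
    if rolling and store.rolling_hosts[store.rolling_index] == hostname and quorum_is_stable():
        store.rolling_index += 1  # hand the roll to the next hostname
        if store.rolling_index == len(store.rolling_hosts):
            # List fully processed: promote rolling values and end the roll.
            store.master, store.rolling = store.rolling, None
            store.rolling_hosts, store.rolling_index = [], 0
    return changed


old_values = {"ensemble": ["a", "b", "c"]}
new_values = {"ensemble": ["b", "c", "d"]}  # "d" is added, "a" is removed
store = SharedConfig(master=old_values)
applied = {h: old_values for h in old_values["ensemble"]}
start_roll(store, new_values, old_values["ensemble"], new_values["ensemble"])

hosts = ["a", "b", "c", "d"]
order = []
while store.rolling is not None:
    for h in hosts:
        if poll(store, h, applied):
            order.append(h)
for h in hosts:  # one more poll so remaining hosts see the new master values
    if poll(store, h, applied):
        order.append(h)
print(order)  # ['d', 'b', 'c', 'a']
```

In this run the new server `d` applies the change first, then the unchanged servers `b` and `c`, and the removed server `a` only sees the new values after the roll completes. No host coordinates the others; each one acts only on what it reads from the store during its own poll.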

Notice that this method does not require a single worker instance to manage the roll. Because every instance polls for configuration changes, each instance manages its own part of the roll when the rolling index reaches it.