Stateful failover (#604)
This patch introduces a new failover mode: stateful.

There are 3 main concepts:

**Internal coordinator** is a new role that makes decisions regarding
leadership. Previously this logic was part of every instance's failover
module, but now it is split out. Only one coordinator may be active in
the cluster at a time. Its uniqueness is ensured by the external storage,
which manages the lock and saves appointments.

**External storage** (`kingdom.lua`) is a stand-alone Tarantool instance
which provides a locking mechanism and keeps the decisions made by the
coordinator.
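
The relationship between the lock and the stored appointments can be illustrated with a minimal in-memory sketch. It is not the API of `kingdom.lua`: the names `new_storage`, `acquire_lock`, `set_appointments` and `get_appointments`, the lease-based expiry, and the explicit `now` argument are all assumptions made for illustration.

```lua
-- Hypothetical in-memory model of the external storage: a single lock
-- with a lease, plus the appointments written by the lock holder.
-- None of these names are taken from kingdom.lua.
local Storage = {}
Storage.__index = Storage

local function new_storage(lease_seconds)
    return setmetatable({
        lease = lease_seconds,
        holder = nil,        -- id of the coordinator holding the lock
        expires_at = 0,      -- time after which the lock may be taken over
        appointments = {},   -- replicaset id -> appointed leader id
    }, Storage)
end

-- Grant the lock if it is free, expired, or already held by the caller.
function Storage:acquire_lock(coordinator_id, now)
    if self.holder ~= nil
    and self.holder ~= coordinator_id
    and now < self.expires_at
    then
        return false
    end
    self.holder = coordinator_id
    self.expires_at = now + self.lease
    return true
end

-- Only the current lock holder may record decisions.
function Storage:set_appointments(coordinator_id, updates, now)
    if self.holder ~= coordinator_id or now >= self.expires_at then
        return nil, 'lock is not held by ' .. coordinator_id
    end
    for replicaset_id, leader_id in pairs(updates) do
        self.appointments[replicaset_id] = leader_id
    end
    return true
end

-- Return a copy so that readers cannot modify the stored decisions.
function Storage:get_appointments()
    local copy = {}
    for replicaset_id, leader_id in pairs(self.appointments) do
        copy[replicaset_id] = leader_id
    end
    return copy
end
```

In this model a second coordinator calling `acquire_lock` while the lease is still valid gets `false`, so it cannot write appointments. That is the sense in which the storage ensures there is only one active coordinator.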

**Failover module** (the old one) operates on every instance in the
cluster and gathers leadership information for other modules. It was
refactored too, and now it can be described by 4 functions with clearly
separated responsibilities:

- `_get_appointments_*` generates the leadership map itself or polls it
  from the external storage, depending on the mode setting
  (disabled/eventual/stateful).
- `accept_appointments()` just refreshes the cache and tracks if
  anything changed.
- `failover_loop` (a fiber) repeatedly gets new appointments and accepts
  them using the corresponding functions above.
- `cfg` is called from `confapplier.apply_config()` (on restart or on
  committing a new clusterwide configuration). It first gets
  appointments synchronously and then starts the failover loop.
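
The split of responsibilities listed above can be sketched in plain Lua. This is a simplified model, not the module's code: the `topology` and `storage` shapes, the `getters` table, and `failover_step` (one loop iteration without a fiber) are hypothetical, and the per-mode election rules are assumptions.

```lua
-- Hypothetical sketch of the four responsibilities. Names and data
-- shapes are illustrative, not the failover module's real internals.
local vars = {
    mode = 'disabled',   -- 'disabled' | 'eventual' | 'stateful'
    cache = {},          -- replicaset id -> active leader id
}

-- One getter per mode; each returns a map: replicaset id -> leader id.
local getters = {
    -- Assumption: with failover off, the first instance in priority leads.
    disabled = function(topology)
        local appointments = {}
        for replicaset_id, replicaset in pairs(topology) do
            appointments[replicaset_id] = replicaset.priority[1]
        end
        return appointments
    end,
    -- Assumption: decide locally, first healthy instance in priority order.
    eventual = function(topology)
        local appointments = {}
        for replicaset_id, replicaset in pairs(topology) do
            for _, instance_id in ipairs(replicaset.priority) do
                if replicaset.healthy[instance_id] then
                    appointments[replicaset_id] = instance_id
                    break
                end
            end
        end
        return appointments
    end,
    -- Decide nothing locally: read what the coordinator stored.
    stateful = function(_, storage)
        return storage.get_appointments()
    end,
}

-- Refresh the cache and report whether anything changed.
local function accept_appointments(appointments)
    local changed = false
    for replicaset_id, leader_id in pairs(appointments) do
        if vars.cache[replicaset_id] ~= leader_id then
            changed = true
        end
    end
    for replicaset_id in pairs(vars.cache) do
        if appointments[replicaset_id] == nil then
            changed = true
        end
    end
    vars.cache = appointments
    return changed
end

-- One iteration of the loop; a real loop would repeat this in a fiber.
local function failover_step(topology, storage)
    local appointments = getters[vars.mode](topology, storage)
    return accept_appointments(appointments)
end

-- Get appointments synchronously once; the loop would be started after.
local function cfg(mode, topology, storage)
    vars.mode = mode
    return failover_step(topology, storage)
end
```

Keeping the getters separate from `accept_appointments` means that switching modes only changes where the leadership map comes from, while cache refresh and change tracking stay the same.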
rosik committed Mar 23, 2020
1 parent 02784e7 commit 887843a
Showing 14 changed files with 1,234 additions and 93 deletions.
7 changes: 7 additions & 0 deletions .luacheckrc
@@ -11,3 +11,10 @@ exclude_files = {
    'cartridge/graphql.lua',
    'cartridge/graphql/*.lua',
}
new_read_globals = {
    box = { fields = {
        session = { fields = {
            storage = {read_only = false, other_fields = true}
        }}
    }}
}
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
can be either absolute or relative - in the later case it's calculated
relative to `cartridge.workdir`.

- Implement stateful failover mode.

### Deprecated

Lua API:
5 changes: 5 additions & 0 deletions CMakeLists.txt
@@ -165,6 +165,11 @@ install(
  DESTINATION ${TARANTOOL_INSTALL_LUADIR}
)

install(
  FILES ${CMAKE_CURRENT_SOURCE_DIR}/kingdom.lua
  DESTINATION ${TARANTOOL_INSTALL_BINDIR}
)

install(
  FILES
    ${CMAKE_CURRENT_BINARY_DIR}/VERSION.lua
1 change: 1 addition & 0 deletions cartridge-scm-1.rockspec
@@ -29,6 +29,7 @@ build = {
        TARANTOOL_DIR = '$(TARANTOOL_DIR)',
        TARANTOOL_INSTALL_LIBDIR = '$(LIBDIR)',
        TARANTOOL_INSTALL_LUADIR = '$(LUADIR)',
        TARANTOOL_INSTALL_BINDIR = '$(BINDIR)',
    },
    copy_directories = {'doc'},
}
4 changes: 4 additions & 0 deletions cartridge.lua
@@ -509,6 +509,10 @@ local function cfg(opts, box_opts)
        service_registry.set('httpd', httpd)
    end

    local ok, err = roles.register_role('cartridge.roles.coordinator')
    if not ok then
        return nil, err
    end
    for _, role in ipairs(opts.roles or {}) do
        local ok, err = roles.register_role(role)
        if not ok then