
Improve bucket_discovery for large clusters (more than 1M buckets) #210

Closed
ochaton opened this issue Feb 19, 2020 · 0 comments
ochaton commented Feb 19, 2020

It is not possible to perform bucket_discovery on a storage with more than 200K buckets.
Discovering 500K buckets from a single storage takes more than 1 second and blocks other fibers.
It would actually be possible to cache the previous discovery result on a storage and invalidate it when rebalancing or pinning happens.

Also, it is not necessary to perform automatic bucket_discovery from routers, because the cluster's topology changes quite rarely.

Please support disabling automatic discovery after the router fills its routing table.
Even if bucket_send is called on a storage, the router is still able to discover the transferred bucket by calling bucket_stat on all replicasets.
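
Below is a minimal sketch of that fallback, assuming a router-side table of replicaset objects with vshard's `callrw()` method; the table name, error handling, and the shape of the `bucket_stat` result are illustrative, not the router's actual discovery code.

```Lua
-- Minimal sketch (assumed names): locate a bucket that moved after
-- bucket_send by asking every replicaset for its status directly.
local function locate_bucket(replicasets, bucket_id)
    for _, replicaset in pairs(replicasets) do
        -- vshard.storage.bucket_stat() returns info about the bucket
        -- when this storage owns it, or nil plus an error otherwise.
        local stat = replicaset:callrw('vshard.storage.bucket_stat',
                                       {bucket_id})
        if stat ~= nil and stat.status == 'active' then
            return replicaset
        end
    end
    return nil
end
```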

@Gerold103 Gerold103 self-assigned this Feb 19, 2020
@Gerold103 Gerold103 added the customer, feature (A new functionality), and router labels Feb 19, 2020
@Gerold103 Gerold103 added this to the 0.2 milestone Feb 19, 2020
@Gerold103 Gerold103 added the prio1 label Mar 5, 2020
Gerold103 added a commit that referenced this issue Mar 24, 2020
Since vshard's creation it has never needed any schema changes. The
vshard schema version was always '1', stored in _schema as
'oncevshard:storage:1', because it was created by
box.once('vshard:storage:1').

Now it is time to make some changes to the schema. The first reason
is the need to simplify introduction and removal of internal
functions. There are 3 of them at the moment: bucket_recv(),
rebalancer_apply_routes(), rebalancer_request_state(). Users see
them in the vshard.storage.* namespace and in the _func space. They
are going to be replaced by a single vshard.storage._call(). That
way it will be easy to update them without schema changes, and to
add new functions or remove old ones. _call() will also be able to
validate versions and return proper errors when a requested
function does not exist.

Secondly, _call() is going to be used by the new discovery on
routers. It is not possible to modify the existing
vshard.storage.buckets_discovery(), because it is a public
function, but a new discovery procedure can be added to _call().

Thirdly, there is a bug in rebalancing related to possibly big TCP
delays, which will require a change of the _bucket space format in
the future.

Fourthly, _call() makes it possible to introduce new functions on a
read-only replica using code reload only. This may be needed for
the bug about bucket_ref() not preventing bucket move.

The patch introduces versioning using 4 numbers: x.x.x.x. The
first 3 numbers are major, minor, and patch, the same as in the
tags. The last number is increased when the first 3 can't be
changed but the schema is modified. That happens when users take
the master branch between 2 tags, and the schema is then changed
again before the new tag is published.

An upgrade between two adjacent tags is supposed to be always safe
and automatic on reload or reconfiguration. However, currently
there are too few versions for any upgrade to be unsafe yet.

Needed for #227
Needed for #210
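
A hypothetical sketch of the idea, not the actual vshard code: a single exposed entry point dispatches to private functions kept in a plain Lua table, so the set can change on code reload without touching the schema.

```Lua
-- Hypothetical sketch: one public entry point, private functions in
-- a table that a code reload can replace without schema changes.
local service_funcs = {
    bucket_recv = function(...) --[[ receive a bucket ]] end,
    rebalancer_apply_routes = function(...) --[[ apply routes ]] end,
    rebalancer_request_state = function(...) --[[ report state ]] end,
}

local function service_call(name, ...)
    local f = service_funcs[name]
    if f == nil then
        return nil, string.format('Unknown service function %q', name)
    end
    return f(...)
end
```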
Gerold103 added a commit that referenced this issue Mar 24, 2020
_call() is a new internal function which is supposed to absorb all
the other internal functions, making them easier to change and
simplifying the addition of new functions and removal of old ones.

This also hides them from the vshard.storage.* namespace, so users
won't try to use them without knowing they are private.

Additionally, _call() allows changing the internal function set
even on read-only instances, because it has nothing to do with the
schema. The internal function set can be updated by a reload.

Closes #227
Part of #210
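
For illustration only (the URI is a placeholder and the exact `_call()` signature is an assumption based on the message above): a peer reaches the private functionality through the single entry point instead of a dedicated global function per operation.

```Lua
-- Illustration with assumed names: call one private entry point and
-- pass the wanted internal operation as its first argument.
local netbox = require('net.box')
local conn = netbox.connect('storage:3301')  -- placeholder URI
local state, err = conn:call('vshard.storage._call',
                             {'rebalancer_request_state'})
```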
Gerold103 added a commit that referenced this issue Mar 30, 2020
Disabling discovery may be useful when there are very many routers
and buckets, and a user does not want to pay the overhead of the
automatic mass discovery. It may be expensive in big clusters.

In that case users may want to turn off discovery when there is no
rebalancing, and turn it back on when rebalancing starts, to keep
the routers up to date with _bucket changes.

Part of #210

@TarantoolBot document
Title: vshard.router.discovery_enable/disable, and new cfg option

```Lua
vshard.router.discovery_disable()
```
Turns off the background discovery fiber used by the router to
find buckets. If discovery is already disabled, nothing happens.

```Lua
vshard.router.discovery_enable()
```
Turns on the background discovery fiber. If it is already turned
on, nothing happens.

If `vshard.router.discovery_wakeup()` is called while discovery is
disabled, nothing happens.

These methods are convenient for enabling/disabling discovery after
the router has already started, but it is enabled by default. If
you never want it enabled, even for a short time, specify the
`discovery_enable` option in the configuration. Set it to `true` to
enable discovery and to `false` to disable it. When it is disabled
through the option, discovery is never even started in
`vshard.router.cfg()`.

You may decide to turn off discovery if you have many routers, or
tons of buckets (hundreds of thousands and more), and you see that
the discovery process consumes a notable share of CPU on routers
and storages. In that case it may be wise to turn off discovery
when there is no rebalancing in the cluster, and turn it on for new
routers, as well as for all routers when rebalancing starts.
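
A usage sketch following the API exactly as documented in this commit message; the sharding table and bucket count are placeholders, and note that a later commit below reworks this API into `vshard.router.discovery_set()`.

```Lua
-- Sketch per the description above: start with discovery disabled
-- and toggle it only around rebalancing.
local vshard = require('vshard')

vshard.router.cfg({
    discovery_enable = false,  -- option name as documented above
    sharding = sharding_cfg,   -- placeholder for the usual sharding section
    bucket_count = 1000000,    -- placeholder
})

-- Right before rebalancing starts:
vshard.router.discovery_enable()
-- Once the route table is warm and the cluster is stable again:
vshard.router.discovery_disable()
```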
Gerold103 added a commit that referenced this issue Apr 27, 2020
Router tests sometimes need to wipe the route map. Simply resetting
it to {} may produce unexpected results, because the route map is
not just a table - there are also statistics in the replicaset
objects.

Inconsistent statistics may lead to failing tests in surprising
places. That becomes even more true with the forthcoming patches,
which rework the statistics a bit so that they actually affect
behaviour inside the router.

Part of #210
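
A hypothetical illustration of the consistency problem described above (field names are assumptions, not the router's real internals): wiping only the map leaves stale per-replicaset statistics behind, so a proper reset has to touch both.

```Lua
-- Hypothetical sketch: reset the route map together with the
-- statistics kept in the replicaset objects, not just the table.
local function route_map_wipe(router)
    router.route_map = {}
    for _, replicaset in pairs(router.replicasets) do
        replicaset.bucket_count = 0  -- assumed statistics field
    end
end
```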
Gerold103 added a commit that referenced this issue Apr 27, 2020
The known bucket count was calculated on demand when router.info()
was called. Now it is going to be needed for the advanced
discovery. The optimization is that when the known bucket count is
equal to the total bucket count, discovery enters an 'idle' mode in
which it works much less aggressively, reducing the load on the
cluster - which can be quite big when the bucket count is huge.

Part of #210
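
A hypothetical sketch of the optimization (names and the idle check are illustrative): keep the known-bucket counter up to date as the route map changes instead of recounting it in `router.info()`.

```Lua
-- Hypothetical sketch: update the counter on every route map change
-- so "all buckets known" is an O(1) check for the discovery fiber.
local known_bucket_count = 0

local function route_map_set(route_map, bucket_id, replicaset)
    local old = route_map[bucket_id]
    if old == nil and replicaset ~= nil then
        known_bucket_count = known_bucket_count + 1
    elseif old ~= nil and replicaset == nil then
        known_bucket_count = known_bucket_count - 1
    end
    route_map[bucket_id] = replicaset
end

local function discovery_can_idle(total_bucket_count)
    return known_bucket_count == total_bucket_count
end
```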
Gerold103 added a commit that referenced this issue May 1, 2020
The router does discovery once per 10 seconds. Discovery sends a
request to each replicaset to download all pinned and active
buckets from there. When there are millions of buckets, that
becomes a long operation taking seconds, during which the storage
is unresponsive.

The patch makes discovery work step by step, downloading no more
than 1000 buckets at a time. That gives the storage time to
process other requests.

Moreover, discovery now has some kind of 'state'. For each
replicaset it keeps an iterator which is advanced by 1k buckets on
every successfully discovered bucket batch. This means that if
discovery on a replicaset with 1 000 000 buckets fails after
999 999 buckets have already been discovered, it won't start over
from 0 - it will retry from the old position.

However, there is still room for improvement. Discovery could
avoid downloading anything once everything is downloaded, if it
could somehow see that the bucket space has not changed.
Unfortunately that is not so easy, since the bucket generation
(the version of the _bucket space) is not persisted, so after an
instance restart it is always equal to the bucket count.

Part of #210
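
A simplified sketch of the step-by-step idea: the space name follows vshard's `_bucket` layout, but the function itself, its batching, and its return format are illustrative, not the real implementation.

```Lua
-- Illustrative sketch: return at most BATCH active/pinned bucket ids
-- starting from a position the router remembers between calls, so a
-- single call never scans millions of buckets while other fibers wait.
local BATCH = 1000

local function buckets_discovery_step(from)
    local buckets = {}
    local next_from
    for _, b in box.space._bucket:pairs(from or 0, {iterator = 'GE'}) do
        if b.status == 'active' or b.status == 'pinned' then
            table.insert(buckets, b.id)
        end
        if #buckets == BATCH then
            next_from = b.id + 1  -- resume here on the next call
            break
        end
    end
    -- next_from == nil means the scan reached the end of _bucket.
    return {buckets = buckets, next_from = next_from}
end
```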
Gerold103 added a commit that referenced this issue May 1, 2020
Closes #210

@TarantoolBot document
Title: vshard.router.discovery_set() and new config option

```Lua
vshard.router.discovery_set(mode)
```
Turns on/off the background discovery fiber used by the router to
find buckets.

When `mode` is `"on"`, the discovery fiber works for the entire
lifetime of the router. Even after all buckets are discovered, it
will still go to the storages and download their buckets at a long
interval. This is useful if the bucket topology changes often and
the bucket count is not big. The router will keep its route table
up to date even when no requests are processed. This is the
default value.

When `mode` is `"off"`, discovery is disabled completely.

When `mode` is `"once"`, discovery starts, finds the locations of
all the buckets, and then the discovery fiber is terminated. This
is good for a large bucket count and for clusters where
rebalancing happens rarely.

The method is convenient for enabling/disabling discovery after
the router has already started, but discovery is enabled by
default. If you never want it enabled, even for a short time,
specify the `discovery_mode` option in the configuration. It takes
the same values as `vshard.router.discovery_set(mode)`.

You may decide to turn off discovery or set it to 'once' if you
have many routers, or tons of buckets (hundreds of thousands and
more), and you see that the discovery process consumes a notable
share of CPU on routers and storages. In that case it may be wise
to turn off discovery when there is no rebalancing in the cluster,
and turn it on for new routers, as well as for all routers when
rebalancing starts.
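
A usage sketch of the final API as documented above; the sharding section and bucket count are placeholders.

```Lua
-- Sketch: run a router with one-shot discovery, then toggle the mode
-- around a rebalance with vshard.router.discovery_set().
local vshard = require('vshard')

vshard.router.cfg({
    discovery_mode = 'once',  -- 'on' | 'off' | 'once', as described above
    sharding = sharding_cfg,  -- placeholder for the usual sharding section
    bucket_count = 1000000,   -- placeholder
})

-- When rebalancing is about to start, keep the route table fresh:
vshard.router.discovery_set('on')
-- When the cluster is stable again:
vshard.router.discovery_set('off')
```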