Ability to stop storage/router #121

Open
Gerold103 opened this issue Jun 15, 2018 · 8 comments
Labels
feature (A new functionality), router, storage, teamS (Scaling)

@Gerold103
Collaborator

When a replica or a router is removed from the cluster, it would be useful to be able to reset its state: stop the background fibers (discovery, failover, recovery, ...), remove the 'vshard.*' functions from _func, and close connections.

Khatskevich added a commit that referenced this issue Jun 22, 2018
Introduce functions:
 * vshard.router.destroy()
 * vshard.storage.destroy()

Those functions:
 * close connections
 * stop background fibers
 * delete vshard spaces
 * delete vshard functions
 * delete `once` metadata

Closes #121
Khatskevich added a commit that referenced this issue Jun 25, 2018
Introduce functions:
 * vshard.router.destroy()
 * vshard.storage.destroy()

Those functions:
 * close connections
 * stop background fibers
 * delete vshard spaces
 * delete vshard functions
 * delete `once` metadata

After destroy, the module can be configured as if it had just been loaded.

Extra changes:
 * introduce fiber_list function which returns names of non-tarantool
   fibers
 * introduce update_M function, which updates M (module internals) with
   values defined in the module

Closes #121
@Khatskevich
Contributor

I have started implementing the router/storage.destroy() feature and ran into a problem:

We cannot simply delete the _bucket space, because the deletion may be replicated and remove the space from other instances. This can happen when:

  1. The instance is replicated now
  2. The instance will be replicated in the future
    a. a replica suddenly comes back to life
    b. some replica gets configured to replicate from this instance

We want to design the API so that it prevents a user from losing any data.

Possible solutions:

  1. Split destroy into stop and destroy, and ask users to call destroy only after they are absolutely sure that everything is OK
  2. Implement only the stop feature, and let the user clean up/restart Tarantool
  3. Implement destroy as
    a. stop
    b. close connections
    c. change UUID
    d. cleanup _bucket

Here, stop means stopping the fibers and resetting the config;
destroy means stop plus deleting _bucket.
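
The replication hazard described above can be illustrated with a toy model. This is only a sketch in Python under loud assumptions: the `Instance` class, its operation log, and `replicate_from` are hypothetical and do not reflect Tarantool's replication protocol or vshard's API. The fiber names are placeholders taken from the issue description. The only point it makes is that stop touches in-memory state, while destroy writes a change that any current or future replica will replay.

```python
class Instance:
    """Hypothetical instance with one replicated space (toy model only)."""

    def __init__(self, name):
        self.name = name
        self.spaces = {"_bucket": {1: "active", 2: "active"}}
        self.fibers = ["discovery", "failover", "recovery"]  # placeholder names
        self.log = []      # replicated operations made by this instance
        self.applied = 0   # how many upstream log entries were replayed here

    def stop(self):
        # In-memory only: nothing is appended to the replicated log.
        self.fibers.clear()

    def destroy(self):
        # stop, then drop the space; the drop is logged and thus replicated.
        self.stop()
        self.spaces.pop("_bucket", None)
        self.log.append(("drop_space", "_bucket"))

    def replicate_from(self, upstream):
        # Replay every upstream operation that was not applied yet.
        for op, space in upstream.log[self.applied:]:
            if op == "drop_space":
                self.spaces.pop(space, None)
        self.applied = len(upstream.log)


removed = Instance("removed")
peer = Instance("peer")

removed.stop()
peer.replicate_from(removed)
peer_has_bucket_after_stop = "_bucket" in peer.spaces

removed.destroy()
peer.replicate_from(removed)  # e.g. a replica that comes back to life later
peer_has_bucket_after_destroy = "_bucket" in peer.spaces
```

In this model the peer keeps `_bucket` after `stop()` and loses it once it replays the log written by `destroy()`, which is the data-loss scenario that options 1 and 2 try to avoid.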

@Khatskevich
Contributor

@kostja voted for the second option

@Khatskevich
Contributor

Khatskevich commented Jun 26, 2018

1. We do not need the destroy feature.
2. storage.cfg{} should work as stop.

(citing @racktear)

@Gerold103
Collaborator Author

No, storage.cfg should not work as stop. That is inconsistent with box.cfg and breaks the storage.cfg syntax.

@knazarov

storage.cfg({sharding={}}) is perfectly fine by me.

@Gerold103
Collaborator Author

Bad idea as well. It is like allowing Tarantool to be stopped with box.cfg(box.NULL), which looks weird. In that case I prefer vshard.storage/router.stop(). It will stop all background fibers and close connections. Afterwards, once the storage is cleaned up, you will be able to call cfg again.
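
The stop-then-cfg lifecycle proposed in this comment can be sketched as follows. This is a minimal Python model, not vshard's implementation: the `Service` class, its worker names, the connection strings, and the rule that `cfg()` rejects an empty `sharding` section are all assumptions made for illustration, with threads standing in for background fibers.

```python
import threading


class Service:
    """Hypothetical stand-in for a storage or router (not vshard's API)."""

    def __init__(self):
        self._stop_event = None
        self._workers = []
        self.connections = []

    def cfg(self, config):
        # Assumption of this sketch: cfg() never doubles as stop(),
        # so an empty sharding section is rejected.
        if not config.get("sharding"):
            raise ValueError("cfg() needs a non-empty 'sharding' section")
        # Reconfiguring a running service simply restarts it in this sketch.
        self.stop()
        self._stop_event = threading.Event()
        self.connections = ["conn:" + uri for uri in config["sharding"]]
        for name in ("discovery", "failover"):  # placeholder worker names
            worker = threading.Thread(
                target=self._loop, name=name,
                args=(self._stop_event,), daemon=True,
            )
            worker.start()
            self._workers.append(worker)

    @staticmethod
    def _loop(stop_event):
        # Background work would go here; wake up periodically until stopped.
        while not stop_event.wait(0.01):
            pass

    def stop(self):
        # Signal the workers, wait for them to exit, drop the connections.
        if self._stop_event is not None:
            self._stop_event.set()
        for worker in self._workers:
            worker.join()
        self._workers = []
        self.connections = []

    def running(self):
        return [w.name for w in self._workers if w.is_alive()]
```

A caller would run `cfg(...)`, later `stop()`, and then `cfg(...)` again with a fresh configuration. Keeping `stop()` as a separate call leaves the meaning of `cfg()` unchanged, which is the consistency argument made in the comment.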

Gerold103 added this to the 0.2 milestone Jul 3, 2018
Gerold103 added the storage, router, and feature (A new functionality) labels Jul 3, 2018
@rosik
Contributor

rosik commented Nov 11, 2020

I'd like to state the relevance of this feature. We need it in cartridge for two reasons:

  1. Cartridge allows a user to disable a role (e.g. an empty storage), and the user expects all of its fibers to be stopped.
  2. In the context of cartridge#1100 ([2pt] Support roles hot-reload), I'd like to perform code hot-reload as stop, reload, init.

@R-omk
Contributor

R-omk commented Nov 17, 2020

Related issue #219

kyukhin added the teamS (Scaling) label Sep 17, 2021
rosik added a commit to tarantool/cartridge that referenced this issue Oct 4, 2021
Stopping the vshard-storage role still isn't implemented
(see tarantool/vshard#121).
So we use nasty workarounds to simulate it.

One of the internal netbox connections wasn't closed. It was reused by
the rebalancer, but roles reload had corrupted it by killing a control
fiber, so the rebalancer got stuck. 

This patch enhances the cleanup and fixes the rebalancer.
6 participants