Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a check of the session state before take a task #108

Merged

Conversation

LeonidVas
Copy link
Contributor

We need to check of a session state before take a task after wait()
because session maybe in disconnecting state and task will be
hang in a take state after the session will be disconnected

Fixes #104

@LeonidVas LeonidVas force-pushed the lvasiliev/gh-queue-104-take-task-after-disconnect branch from 79a544f to 4e96760 Compare February 3, 2020 15:48
Copy link
Contributor

@olegrok olegrok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some questions were discussed verbally

Copy link

@Mons Mons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for canceling fibers, which will not be able to proceed with the task. We even asked recently to do this automatically: tarantool/tarantool#4788 (just for information).

I would also describe what is the root of the problem in the test case. Very sketchy it may look so:

Put task, take it, take again (no more tasks, it'll be blocked in tube.take()), disconnect, schedule the fiber with blocked take (but don't run yet) (from method._on_consumer_disconnect), release the task (from method._on_consumer_disconnect), <here the scheduled fiber with blocked take will be executed>.

The result is that the task will be in 'taken' state, however there is no consumer, which may work with it. The task likely will be stuck in this state until a server restart.

At least look whether you can improve the description and if so, spend a hour on it, please. (I often stuck with changes made several years ago w/o described reasoning, so maybe I overdramatize this, but anyway.)

queue/abstract.lua Show resolved Hide resolved
t/110-take-task-after-reconnect.t Outdated Show resolved Hide resolved
t/110-take-task-after-reconnect.t Outdated Show resolved Hide resolved
t/110-take-task-after-reconnect.t Outdated Show resolved Hide resolved
t/110-take-task-after-reconnect.t Outdated Show resolved Hide resolved
queue/abstract.lua Show resolved Hide resolved
@LeonidVas LeonidVas force-pushed the lvasiliev/gh-queue-104-take-task-after-disconnect branch from 4e96760 to 7791258 Compare February 28, 2020 08:39
@LeonidVas
Copy link
Contributor Author

Updated

@LeonidVas
Copy link
Contributor Author

Now two implementations have been presented, and you need to choose one.

queue/abstract.lua Outdated Show resolved Hide resolved
@LeonidVas LeonidVas force-pushed the lvasiliev/gh-queue-104-take-task-after-disconnect branch 6 times, most recently from 6d6abe3 to 2344c46 Compare February 28, 2020 15:43
@Totktonada
Copy link
Member

Leonid said that he met problems with approach with fiber.cancel(). The reason likely is caching of fiber conds (see compat.lua) and that waiters field of struct fiber_cond remains filled after canceling the fiber.

Let's stick with the original Leonid's variant.

@Totktonada
Copy link
Member

Pushed two small fixups.

We need to check of a session state before take a task after wait()
because session maybe in disconnecting state and task will be
hang in a take state after the session will be disconnected

Fixes #104
@Totktonada Totktonada force-pushed the lvasiliev/gh-queue-104-take-task-after-disconnect branch from e791fb5 to 5f07d75 Compare February 28, 2020 23:05
@Totktonada
Copy link
Member

Squashed them (Leonid agreed).

Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@Totktonada Totktonada merged commit 161ab21 into master Feb 28, 2020
@Totktonada Totktonada deleted the lvasiliev/gh-queue-104-take-task-after-disconnect branch February 28, 2020 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: second take() hangs after disconnect during take()
4 participants