
stateless connections #11

Closed
pda opened this issue Jun 23, 2009 · 6 comments
Labels
Unplanned Issue is not planned for some reason, such as complexity, lack of clarity, or low priority.

Comments

@pda
Contributor

pda commented Jun 23, 2009

Stateful connections cause problems, especially when processing long-running jobs
or connecting over long distances or complicated network topologies. We can solve
these problems by learning from HTTP: prefer (rather, require) stateless, disposable,
short-lived connections.

However, statelessness poses challenges, especially with efficiency. For example,
repeating the tube name for every put command would use more bandwidth than
the current protocol. So this is a tradeoff.
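To make the bandwidth tradeoff concrete, here is a rough sketch. The `use`/`put` command syntax below is the real beanstalkd protocol; `put-in` is a purely hypothetical stateless variant (no such command exists) that names the tube on every put instead of relying on connection state:

```python
def stateful_puts(tube: bytes, body: bytes, n: int) -> bytes:
    """Current protocol: one 'use' sets connection state, then bare 'put's."""
    out = b"use %s\r\n" % tube  # sets the connection's tube once
    for _ in range(n):
        out += b"put 0 0 60 %d\r\n%s\r\n" % (len(body), body)
    return out

def stateless_puts(tube: bytes, body: bytes, n: int) -> bytes:
    """HYPOTHETICAL stateless variant: every put names its tube."""
    out = b""
    for _ in range(n):
        out += b"put-in %s 0 0 60 %d\r\n%s\r\n" % (tube, len(body), body)
    return out
```

For a run of small jobs, the stateless form pays the tube name (plus a longer verb) on every single put, which is exactly the efficiency cost described above.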

The current beanstalkd protocol is highly stateful, and its design pervasively assumes
that state is available; it exploits that state wherever possible for efficiency.

We should reevaluate whether the current statefulness of the protocol is the best tradeoff.

Original post follows.


Currently, when a client disconnects while it has a reserved job, that job is instantly released onto the front of the queue, without regard for its remaining TTR.

This means that:

  • A worker executing long jobs cannot disconnect/reconnect to beanstalkd during execution. Perhaps that would make it too hard to track which client owns which job reservation though...
  • A job that causes a worker to crash constantly hogs the front of the queue, blocking subsequent jobs from coming through.

Perhaps auto-release due to disconnect should at least release onto the end of the queue, at a lower priority, or with a delay?
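A toy model of why the release position matters for a crashing job (illustrative only; this is not beanstalkd's actual data structure):

```python
from collections import deque

queue = deque(["crasher", "job-a", "job-b"])

# Current behavior: on disconnect, the reserved job goes back to the FRONT.
job = queue.popleft()            # worker reserves "crasher", then crashes
queue.appendleft(job)            # auto-release to the front of the queue
assert queue[0] == "crasher"     # the bad job hogs the head; job-a is blocked

# Suggested behavior: release to the END (or with a delay / lower priority).
job = queue.popleft()
queue.append(job)
assert queue[0] == "job-a"       # the rest of the queue makes progress
```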

@Meta-phaze

While I understand the requested behavior, I just wanted to note that I rely on the 'front of queue' behavior. If this issue moves forward, I hope the behavior can be controlled during the reserve step (or exposed in some other way, so that both the new and old behavior remain available).

@SyBernot

I'm currently running into this as a problem, along with issue #109, in my workaround. My initial thought is another operation/state, check-out, where a job is reserved and placed in a checked-out state (a special case of delay) for $timeout seconds (0 = forever); the client can then disconnect and go about its business. If the timeout is reached, the job returns to the ready state. Jobs in the checked-out state can be released or deleted by any connection via their $id.

I've actually tried to implement this as reserve (get the job id/job data) -> release with a really long delay (which also has the benefit of adjusting the priority for when the job returns to ready), but I haven't yet found a way to delete a job from the delayed state (I get a NOT_FOUND). My end goal is something that does not depend on a persistent connection: our jobs can run from seconds to weeks and our workers can be on the other side of the globe, so pretty much anything can happen. But I also want any unfinished jobs to persist on the server until deleted, so they can be re-kicked if the worker dies mid-job.
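The reserve-then-park workaround above can be written down as raw protocol commands. A sketch: the `reserve`/`release`/`delete` syntax comes from the beanstalkd protocol doc, while the helper names and the week-long delay are illustrative assumptions:

```python
WEEK = 7 * 24 * 3600  # illustrative "park" delay, in seconds

def reserve_cmd() -> bytes:
    # Server replies RESERVED <id> <bytes>\r\n<data>\r\n
    return b"reserve\r\n"

def park_cmd(job_id: int, pri: int = 1024, delay: int = WEEK) -> bytes:
    # Release with a very long delay, so the connection can be dropped
    # while the job stays on the server (and its priority is adjusted).
    return b"release %d %d %d\r\n" % (job_id, pri, delay)

def finish_cmd(job_id: int) -> bytes:
    # From any later connection: delete the delayed job by id
    # (possible once the #30 fix landed).
    return b"delete %d\r\n" % job_id
```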

@kr
Member

kr commented Sep 27, 2012

The ability to delete delayed jobs is covered in #30, which
is fixed in recent versions of beanstalkd.

@SyBernot

I just updated to 1.7 and will test deleting delayed jobs again. If it works, that would be awesome, as I haven't found a way to get something out of the delayed state other than just waiting. A working kick-job (id) would do it too; I see it's in the protocol doc, but it's not a valid command in 1.7.

@kr
Member

kr commented Dec 6, 2012

In most popular programming languages, it's not hard to create
workers that effectively never crash, regardless of the contents
of a job. For example, if a job raises an unknown exception,
have the worker catch all exceptions in its main loop. The
strategy described in How to Handle Job Failures then helps
with identifying, triaging, and fixing job failure types.
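A minimal sketch of that main-loop strategy, assuming a generic Python client: the `jobs` iterable, `delete_job`, and `bury_job` are placeholders for whatever beanstalkd client library is in use, not real API names.

```python
import logging

def run_worker(jobs, handle, delete_job, bury_job):
    """jobs yields (job_id, body) pairs, e.g. from repeated reserve calls."""
    for job_id, body in jobs:
        try:
            handle(body)
        except Exception:
            # A bad job is logged and buried for later triage instead of
            # crashing the worker and re-blocking the head of the queue.
            logging.exception("job %s failed", job_id)
            bury_job(job_id)
        else:
            delete_job(job_id)
```

No job body can take down the loop; only external factors (kills, power loss) remain, which is the case the next paragraph covers.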

A more common, and harder to prevent, cause of worker crashes
is external factors, such as resource exhaustion, being killed,
power failures, etc. The current behavior is optimized for this, the
common case.

Having said that, there's still a lot of value in eliminating state
from the connection, and avoiding long-lived connections (or at
least making connections disposable). We're not going to add
configuration flags for this sort of thing, so the behavior will either
stay as it is, or change to be stateless. If we choose to go that
route, we should go all the way to fully stateless connections.

@ysmolski ysmolski added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed needs-label labels Jun 26, 2019
@ysmolski
Member

The answer by @kr sums it up rather well. I also second this post: https://xph.us/2010/05/02/how-to-handle-job-failures.html. From my experience it is possible to handle errors or crashes gracefully. I do not know why this issue was kept open for so long, since we are not going to solve it in the foreseeable future. I will keep it open, but mark it as unplanned.

@ysmolski ysmolski added Unplanned Issue is not planned for some reason, such as complexity, lack of clarity, or low priority. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Jun 30, 2019