Workers quit before all pending tasks are done #222

Closed
jimjh opened this issue Nov 14, 2013 · 2 comments

jimjh commented Nov 14, 2013

... and as a result some tasks are never run and eventually expire.

Consider the following set of tasks:

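import os
import time
import luigi

class Block(luigi.Task):
  def run(self):
    time.sleep(20)  # sleep for 20 seconds
    open('block.done', 'w').close()  # marker file; the name is illustrative
  def complete(self):
    # return True only after run() has slept for 20 seconds
    return os.path.exists('block.done')

class Dep(luigi.Task):
  def requires(self):
    return Block()
  def run(self):
    # do something, such as creating a file
    open('dep.done', 'w').close()  # illustrative file name

if __name__ == '__main__':
  luigi.run()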

Submit Block, then in another terminal submit Dep while Block is running.

$> ./block_task.py Block
$> ./block_task.py Dep
INFO: There are no more tasks to run at this time
INFO: Block() is currently run by worker worker-428203729
INFO: Worker was stopped. Shutting down Keep-Alive thread

Now visit the scheduler's web UI at http://localhost:8082. We see that Dep is still pending, but it never gets run because all workers have exited.

[Screenshot: scheduler web UI, 2013-11-14]

I believe this is related to the following comment in luigi/worker.py.

# TODO: sleep for a bit and query server again if there are
# pending tasks in the future we might be able to run

erikbern commented

Yeah, this is a feature or a bug depending on how you look at it. We thought about doing some kind of sleep/retry or even long-polling, but decided it was easier to just trigger the job repeatedly until it builds. Typically we run all our workflows from cron every minute. If the same workflow is already running, the new invocation just exits. If nothing is running, it traverses the dependency graph again and tries to schedule everything.
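The "exit if the same workflow is already running" behaviour can be sketched with an exclusive, non-blocking file lock. This is only an illustration of the idea, not how Luigi does it internally: `acquire_run_lock`, `run_once`, and the lock path are hypothetical names, and `fcntl` limits the sketch to Unix-like systems.

```python
import fcntl


def acquire_run_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken.

    Hypothetical helper: the lock file path is chosen by the caller.
    """
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def run_once(path, build):
    """Call build() only if no other invocation holds the lock.

    Returns True if build() ran, False if this invocation backed off.
    """
    lock = acquire_run_lock(path)
    if lock is None:
        return False  # same workflow already running: just exit
    try:
        build()
        return True
    finally:
        lock.close()  # closing the file releases the lock
```

A cron entry that fires every minute would call something like `run_once("/tmp/my_workflow.lock", build_workflow)`, where both the path and `build_workflow` are placeholders. Overlapping invocations return immediately, and the next free one re-traverses the dependency graph.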

Sleep/retry should be fairly easy to implement, but I think it should be configurable, because there's a risk of creating a lot of spinning processes. There are also some corner cases: you could end up with 100 processes waiting for Block to finish so they can build Dep, but once Block finishes, only one process will build Dep and the rest will exit.
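A configurable sleep/retry loop might look roughly like the sketch below. It is not Luigi's worker code: `get_work`, `has_pending`, `wait_interval`, and `max_retries` are hypothetical names standing in for "ask the scheduler for a runnable task", "ask whether anything is still pending", and the two knobs that keep the spinning bounded.

```python
import time


def wait_for_pending(get_work, has_pending, wait_interval=1.0,
                     max_retries=5, sleep=time.sleep):
    """Run tasks until nothing is pending or the retry budget is spent.

    get_work() returns a callable task or None when nothing is runnable.
    has_pending() reports whether tasks are still waiting on dependencies.
    Returns the list of tasks this worker actually ran.
    """
    ran = []
    retries = 0
    while True:
        task = get_work()
        if task is not None:
            task()
            ran.append(task)
            retries = 0  # progress was made, so reset the budget
            continue
        # Nothing runnable right now: give up if nothing is pending
        # or if we have already waited max_retries times in a row.
        if not has_pending() or retries >= max_retries:
            return ran
        retries += 1
        sleep(wait_interval)
```

With `max_retries=0` the loop exits as soon as nothing is runnable, which matches the behaviour reported in this issue. In the 100-process corner case, the 99 workers that lose the race for Dep would see `has_pending()` turn false and exit on their next poll.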

JoeEnnever pushed a commit to JoeEnnever/luigi that referenced this issue Feb 19, 2014
Break up worker#run into smaller methods and give each a meaningful
name so it's easier to figure out what it's doing and allow code reuse.

Collapse duplicate if constructs.

jimjh commented Dec 2, 2015

Oops forgot to close this. Added a flag to work around this in #225.

jimjh closed this as completed Dec 2, 2015