"Bug" introduced in 2022.04.2 requiring explicit cluster close #6255

Closed
evamaxfield opened this issue May 1, 2022 · 2 comments · Fixed by #6256
Closed

"Bug" introduced in 2022.04.2 requiring explicit cluster close #6255

evamaxfield opened this issue May 1, 2022 · 2 comments · Fixed by #6256

Comments

evamaxfield commented May 1, 2022

What happened:

If you exit a Python process while a threaded LocalCluster (processes=False) is still running, an error is raised during interpreter shutdown.

What you expected to happen:

No error is raised; the Python process exits with a zero exit code.

Minimal Complete Verifiable Example:

Step-by-step, for three installs:

  • distributed==2022.04.1
  • distributed==2022.04.2 (bug)
  • distributed==2022.04.2 (fixed)

2022.04.1

$ pip install distributed==2022.04.1
$ python
>>> from distributed import LocalCluster
>>> cluster = LocalCluster(processes=False)
>>> exit()

Exits with no error (exit code 0).

2022.04.2 (bug)

$ pip install distributed==2022.04.2
$ python
>>> from distributed import LocalCluster
>>> cluster = LocalCluster(processes=False)
>>> exit()

Raises RuntimeError: cannot schedule new futures after interpreter shutdown and exits with a nonzero exit code. Full traceback:
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/worker.py", line 1604, in close
    await to_thread(_close)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/base_events.py", line 819, in run_in_executor
    executor.submit(func, *args), loop=self)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/spec.py", line 654, in close_clusters
    cluster.close(timeout=10)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/cluster.py", line 194, in close
    return self.sync(self._close, callback_timeout=timeout)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 318, in sync
    return sync(
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 385, in sync
    raise exc.with_traceback(tb)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 358, in f
    result = yield future
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/spec.py", line 420, in _close
    assert w.status == Status.closed, w.status
AssertionError: Status.closing
Exception ignored in: <coroutine object Worker.close at 0x7f8e31f3cdc0>
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 779, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task was destroyed but it is pending!
task: <Task pending name='Task-68' coro=<Worker.close() done, defined at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py:757> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31f37eb0>()]> cb=[IOLoop.add_future.<locals>.<lambda>() at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/ioloop.py:688]>
Exception ignored in: <coroutine object Worker.close at 0x7f8e31f3cd40>
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 779, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task was destroyed but it is pending!
task: <Task pending name='Task-20' coro=<InProcListener._listen() running at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/comm/inproc.py:265> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31e6cfa0>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-30' coro=<Worker.handle_scheduler() running at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/worker.py:169> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31e74730>()]> cb=[IOLoop.add_future.<locals>.<lambda>() at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/ioloop.py:688]>

2022.04.2 (fixed)

$ pip install distributed==2022.04.2
$ python
>>> from distributed import LocalCluster
>>> cluster = LocalCluster(processes=False)
>>> cluster.close()
>>> exit()

Exits with no error (exit code 0).
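
For completeness, here is a minimal sketch (not part of the original report) of the equivalent context-manager form, which closes the cluster automatically when the block exits:

```python
# Minimal sketch (not from the original report): the context-manager form
# is equivalent to calling cluster.close() explicitly before exiting.
from distributed import Client, LocalCluster

with LocalCluster(processes=False) as cluster, Client(cluster) as client:
    # Both the client and the cluster are closed when the block exits,
    # so interpreter shutdown never reaches the failing atexit hook above.
    print(client.submit(sum, [1, 2, 3]).result())  # prints 6
```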

Anything else we need to know?:

Environment:

  • Python version: 3.9.12
  • Operating System: macOS 11.6.1
  • Install method (conda, pip, source): pip

🤷 Technically it's not really a bug, but it caused a lot of my cron jobs to be marked as failing because I wasn't explicitly closing my clusters. If this is the desired behavior, then great, but the change doesn't seem to be mentioned in the changelog outside of this PR: #6091. A workaround sketch for such scripts is shown below.
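
For scripts run under cron, a hedged workaround sketch (not part of the original comment) is to close the client and cluster in a finally block so the process always exits cleanly, even if the job body raises:

```python
# Minimal sketch (assumption, not taken from the issue): close the client
# and cluster in a finally block so a cron job exits with code 0 even if
# the job body raises.
from distributed import Client, LocalCluster

cluster = LocalCluster(processes=False)
client = Client(cluster)
try:
    client.submit(pow, 2, 10).result()  # placeholder for the real job body
finally:
    client.close()
    cluster.close()
```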

mrocklin added a commit to mrocklin/distributed that referenced this issue May 1, 2022
mrocklin (Member) commented May 1, 2022

Thank you for reporting, @JacksonMaxfield. Definitely a bug.

This should be fixed in #6256. I'll suggest to the team that we get a bugfix release out soon for this.

mrocklin (Member) commented May 1, 2022

@fjetter @jrbourbeau is this enough to trigger a bugfix release?
