[draft] Drop Python 3.7 and allow tests to work in isolation #565

Draft · wants to merge 7 commits into main

Conversation

graingert (Member)

No description provided.

@@ -94,7 +96,22 @@
""".strip()


class Job(ProcessInterface, abc.ABC):
class FixedProcessInterface(ProcessInterface):
graingert (Member Author) · Jul 27, 2022

Currently it's not safe to create a ProcessInterface without a running event loop.

I can adapt ProcessInterface upstream to support this use case (see distributed.actor._LateLoopEvent), and I will need to make a _LateLoopLock as well. However, it's better if jobs are not created until an event loop has started.

The problem line is the one commented "trigger property to ensure that the job is valid":

        ]
        self._dummy_job  # trigger property to ensure that the job is valid
        super().__init__(
            scheduler=scheduler,
https://github.com/graingert/dask-jobqueue/runs/7541364445?check_suite_focus=true#step:8:841

 dask_jobqueue/core.py:560: in __init__
    self._dummy_job  # trigger property to ensure that the job is valid
dask_jobqueue/core.py:589: in _dummy_job
    return self.job_cls(
dask_jobqueue/local.py:40: in __init__
    super().__init__(
dask_jobqueue/core.py:168: in __init__
    super().__init__()
/usr/share/miniconda/envs/dask-jobqueue/lib/python3.8/site-packages/distributed/deploy/spec.py:55: in __init__
    self.lock = asyncio.Lock()
/usr/share/miniconda/envs/dask-jobqueue/lib/python3.8/asyncio/locks.py:164: in __init__
    self._loop = events.get_event_loop()

I'm using @functools.cached_property here just to demo the fix
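
To make the idea concrete, here is a minimal sketch (an illustration, not the actual patch): defer creating asyncio primitives with functools.cached_property so they are only constructed once a loop is running. The class and attribute names below are made up for the demo.

import asyncio
import functools


class LazyLockDemo:
    # Hypothetical illustration: instead of creating the lock eagerly in
    # __init__ (which, as the traceback above shows, requires a current
    # event loop), create it on first access.
    @functools.cached_property
    def lock(self):
        # Only reached from coroutine code, where a loop is already running.
        return asyncio.Lock()


async def demo():
    obj = LazyLockDemo()  # safe: no loop access at construction time
    async with obj.lock:
        pass


asyncio.run(demo())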

guillaumeeb (Member)

OK, so as far as I'm aware, we never get into event-loop questions in dask-jobqueue. I guess we only use the default loop from the MainThread when using it.

From what I understand, it's the clean function of distributed.utils_test that removes the default loop when we use the loop fixture. Is it normal that this function wipes the MainThread loop?

See the last commit of my branch, d50cc2d (sorry, I did not cherry-pick some of your modifications; I still lack git knowledge on top of asyncio stuff).

Maybe this means we are not doing things right, but could this serve as a temporary workaround until a bigger refactor?
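
For context, here is a simplified sketch of the mechanism being asked about (an assumption about how such a fixture behaves, not the actual distributed.utils_test code): a loop fixture that creates its own loop and clears the thread's default loop on teardown.

import asyncio

# Sketch of a fixture-style setup/teardown. The teardown clears the default
# loop, so later asyncio.get_event_loop() calls in MainThread raise
# "There is no current event loop" instead of reusing a closed loop.
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    pass  # the test body would run here
finally:
    loop.close()
    asyncio.set_event_loop(None)  # this is the part that "wipes" the loop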

graingert (Member Author) · Jul 28, 2022

Yes, the default event loop is deprecated, and it is already removed once asyncio.run is used. I added this to our cleanup to make sure we (distributed) are ready for the removal and support running with multiple calls to asyncio.run:

import asyncio

import distributed
import dask_jobqueue

# creating a dask_jobqueue object works here

async def noop():
    pass

asyncio.run(noop())

# and raises a
# RuntimeError: There is no current event loop in thread 'MainThread'
# here
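
The flip side is the pattern that stays safe regardless: only construct job/cluster objects once an event loop is running, e.g. inside a coroutine driven by asyncio.run. A rough sketch, with the cluster class left as a placeholder:

import asyncio


async def main():
    # cluster = SomeCluster(...)  # hypothetical placeholder: constructed while
    #                             # a loop is running, so asyncio primitives
    #                             # created in __init__ can find their loop
    ...


asyncio.run(main())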

guillaumeeb (Member)

Thanks for the clarification. If I understand correctly, at some point dask-jobqueue will have to be ready for the removal of the default event loop, is that right? When will it be removed?

In the meantime, do you think it's reasonable to use the workaround I proposed, and in parallel open an issue about this problem and try to fix it later on?

graingert (Member Author) · Jul 28, 2022

> If I understand correctly, at some point dask-jobqueue will have to be ready for the removal of the default event loop

The default event loop already gets removed whenever anyone uses asyncio.run, so this has effectively been a bug since Python 3.7+.

It's deprecated in 3.10 and scheduled for removal in 3.12, so that version of Python will act as though someone has already called asyncio.run.

graingert (Member Author)

I have an idea for a workaround that I want to try tomorrow.

graingert (Member Author)

My workaround didn't work, so I've opened an issue upstream: dask/distributed#6858.

graingert (Member Author) commented Jul 27, 2022

For the last remaining tests, it looks like there's a ThreadPoolExecutor running and not being shut down. I'm not sure yet whether this is a bug in distributed or in dask-jobqueue.
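
A quick way to confirm that hypothesis (just a debugging sketch, not part of this PR) is to list whichever executor threads are still alive when the test ends, since ThreadPoolExecutor workers get a recognisable default name:

import threading

# Worker threads created by concurrent.futures.ThreadPoolExecutor are named
# "ThreadPoolExecutor-<counter>_<index>" unless a thread_name_prefix is given,
# so any that were never shut down are easy to spot.
leftover = [t for t in threading.enumerate() if "ThreadPoolExecutor" in t.name]
for t in leftover:
    print(t.name, "daemon" if t.daemon else "non-daemon")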

guillaumeeb (Member)

Thanks @graingert for all the contributions here. I think I'll need some time to digest all this.

I'm a bit confused by the modification for the running loop. Outside of the test environment, the current code of dask-jobqueue seems to work well with recent versions of distributed, and I didn't encounter these errors in my tests. How would you explain that?
