Add check to scheduler creation in SpecCluster #4605

Merged
merged 2 commits into from
Mar 19, 2021

Member

@jacobtomlinson jacobtomlinson commented Mar 18, 2021

While working on cluster discovery support, I found that when reconstructing a cluster, the scheduler and worker objects may already have been created before start is called.

This small PR adds a check and only creates the scheduler if it doesn't already exist.

This is the first step towards making SpecCluster declarative rather than imperative.

  • Tests added / passed
  • Passes black distributed / flake8 distributed

Member

@jrbourbeau jrbourbeau left a comment

Thanks @jacobtomlinson! In principle this check seems fine, though I'm wondering how we will encounter a case where self.scheduler isn't None in practice. Are you doing something like

cluster = SpecCluster(...)
cluster.scheduler = <an-existing-scheduler>
await cluster.start()

during cluster discovery?

@jacobtomlinson
Member Author

Are you doing something like ... during cluster discovery?

Kinda, self.scheduler is being set in KubeCluster.start() if a cluster with the same name already exists. So when it calls super().start() I'm running into problems.
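
To make the scenario concrete, here is a rough sketch of that call pattern. BaseCluster, DiscoveringCluster and the EXISTING registry are hypothetical stand-ins invented for illustration, not the real SpecCluster or KubeCluster code, and the schedulers are plain strings:

```python
import asyncio

# Hypothetical registry of schedulers that are already running, keyed by cluster name.
EXISTING = {"demo": "scheduler-for-demo"}


class BaseCluster:
    """Hypothetical, heavily simplified base class containing only the guard."""

    def __init__(self, name):
        self.name = name
        self.scheduler = None

    async def start(self):
        # Only create a scheduler if a subclass has not already attached one.
        if self.scheduler is None:
            self.scheduler = f"new-scheduler-for-{self.name}"
        return self


class DiscoveringCluster(BaseCluster):
    """Hypothetical subclass that reattaches to a cluster with the same name."""

    async def start(self):
        # Assumed discovery step: reuse the scheduler of an existing cluster.
        if self.name in EXISTING:
            self.scheduler = EXISTING[self.name]
        return await super().start()


async def demo():
    reattached = await DiscoveringCluster("demo").start()
    created = await DiscoveringCluster("other").start()
    return reattached.scheduler, created.scheduler


result = asyncio.run(demo())
print(result)
```

Without the guard in the base class, the call to super().start() would overwrite the scheduler that the subclass had just attached.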

I had already tried setting the status to running, which skips over the contents of SpecCluster.start, but then I had trouble: I still needed the rpc to be created and Cluster.start to be called in order to set up the status listeners correctly.

@jacobtomlinson
Member Author

The _lock also wasn't being created when I was short-circuiting SpecCluster._start by setting the status to running, hence me opening #4596. But this approach seems better.
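
A small sketch of why the status short-circuit falls short, assuming a hypothetical MiniCluster whose _start returns early when the status is already running. The attribute names (_lock, rpc) only mirror the discussion here; the real SpecCluster._start is more involved:

```python
import asyncio


class MiniCluster:
    """Hypothetical simplified cluster; not the real SpecCluster."""

    def __init__(self):
        self.status = "created"
        self.scheduler = None
        self._lock = None
        self.rpc = None

    async def _start(self):
        if self.status == "running":
            return  # early exit: none of the setup below runs
        self._lock = asyncio.Lock()
        if self.scheduler is None:  # the guard added in this PR
            self.scheduler = "new-scheduler"
        self.rpc = f"rpc-to-{self.scheduler}"  # placeholder for the real rpc
        self.status = "running"


async def demo():
    # Approach 1: pretend the cluster is already running.
    short = MiniCluster()
    short.scheduler = "existing-scheduler"
    short.status = "running"
    await short._start()

    # Approach 2: pre-set only the scheduler and let _start do the rest.
    guarded = MiniCluster()
    guarded.scheduler = "existing-scheduler"
    await guarded._start()
    return short, guarded


short, guarded = asyncio.run(demo())
print(short._lock, short.rpc)
print(guarded.scheduler, guarded.rpc)
```

In the first approach the lock and rpc are never created; in the second the existing scheduler is kept and the remaining setup still runs.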

Member

@jrbourbeau jrbourbeau left a comment

Ah, I see. Would you mind adding a comment which mentions this, or a test, to make sure we don't accidentally revert this if self.scheduler is None check?
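
For illustration, such a regression test might look roughly like the sketch below. StubCluster is a hypothetical stand-in written for this example; a real test would exercise SpecCluster itself with the project's own test helpers:

```python
import asyncio


class StubCluster:
    """Hypothetical stand-in that counts how often a scheduler is created."""

    def __init__(self):
        self.scheduler = None
        self.created = 0

    async def start(self):
        if self.scheduler is None:  # the check under test
            self.created += 1
            self.scheduler = object()


def test_start_keeps_existing_scheduler():
    async def run():
        cluster = StubCluster()
        existing = object()
        cluster.scheduler = existing  # simulate a reconstructed cluster
        await cluster.start()
        return cluster, existing

    cluster, existing = asyncio.run(run())
    # The pre-set scheduler must survive start() untouched.
    assert cluster.scheduler is existing
    assert cluster.created == 0


test_start_keeps_existing_scheduler()
```

If the check were removed, start() would replace the pre-set scheduler and both assertions would fail.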

Member

@jrbourbeau jrbourbeau left a comment

Thanks @jacobtomlinson!

@jrbourbeau jrbourbeau merged commit 3e2ea59 into dask:main Mar 19, 2021
@jacobtomlinson jacobtomlinson deleted the spec-optiona-start branch March 19, 2021 17:35