Skip to content

RuntimeError: Cluster failed to start: No module named 'bokeh' #7227

@seanlaw

Description

@seanlaw

Describe the issue:

I was attempting to re-run a previously passing Github Actions workflow (first passing attempt) as a sanity check and was surprised to find the following error (second failing attempt):

self = LocalCluster(a0a6f468, '<Not Connected>', workers=0, threads=0, memory=0 B)

    async def _start(self):
        while self.status == Status.starting:
            await asyncio.sleep(0.01)
        if self.status == Status.running:
            return
        if self.status == Status.closed:
            raise ValueError("Cluster is closed")
    
        self._lock = asyncio.Lock()
        self.status = Status.starting
    
        if self.scheduler_spec is None:
            try:
                import distributed.dashboard  # noqa: F401
            except ImportError:
                pass
            else:
                options = {"dashboard": True}
            self.scheduler_spec = {"cls": Scheduler, "options": options}
    
        try:
            # Check if scheduler has already been created by a subclass
            if self.scheduler is None:
                cls = self.scheduler_spec["cls"]
                if isinstance(cls, str):
                    cls = import_term(cls)
                self.scheduler = cls(**self.scheduler_spec.get("options", {}))
                self.scheduler = await self.scheduler
            self.scheduler_comm = rpc(
                getattr(self.scheduler, "external_address", None)
                or self.scheduler.address,
                connection_args=self.security.get_connection_args("client"),
            )
            await super()._start()
        except Exception as e:  # pragma: no cover
            self.status = Status.failed
            await self._close()
>           raise RuntimeError(f"Cluster failed to start: {e}") from e
E           RuntimeError: Cluster failed to start: No module named 'bokeh'

/opt/hostedtoolcache/Python/3.10.8/x64/lib/python3.10/site-packages/distributed/deploy/spec.py:319: RuntimeError
---------------------------- Captured stderr setup -----------------------------
2022-10-29 14:43:07,628 - distributed.deploy.spec - WARNING - Cluster closed without starting up
=========================== short test summary info ============================
ERROR tests/test_stumped.py::test_stumped_int_input - RuntimeError: Cluster failed to start: No module named 'bokeh'
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
=============================== 1 error in 1.73s ===============================
Error: pytest encountered exit code 1
Error: Process completed with exit code 1.

Specifically, RuntimeError: Cluster failed to start: No module named 'bokeh'. A quick Google search revealed that other folks are also running into this issue. It's not entirely clear if this is a distributed issue or a problem with Github Actions. However, I did notice that the passing attempts were performed on distributed version distributed-2022.9.2-py3-none-any.whl whereas the failed attempt was distributed-2022.10.1-py3-none-any.whl.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions