
"no enough slots available " error of running openMPI on linux clusters  #7719

@ghost

Description


Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.3.tar.gz

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

It was downloaded from
https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.3.tar.gz
and installed on a Databricks cluster.

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

It was built on a Databricks cluster running Ubuntu.

Please describe the system on which you are running

  • Operating system/version: Linux 4.4.0 Ubuntu
  • Computer hardware: x86_64
  • Network type: databricks

Details of the problem

I am trying to run:

    mpirun --allow-run-as-root -np 20 MY_c_Application

MY_c_Application is written in C and was compiled on Databricks Linux.

My Databricks cluster has 21 nodes, one of which is the driver. Each node has 36 cores.

When I run the mpirun command above, I get the error shown below.

What could be causing this? Am I missing something?

Thanks


There are not enough slots available in the system to satisfy the 20
slots that were requested by the application:

  MY_c_application

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
    processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
    hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
    RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
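
For illustration, the first option in the message above (a hostfile with "slots=N" clauses) might look roughly like the sketch below. It is not a confirmed fix for this cluster. The host names worker01..worker20 and the file name hosts.txt are hypothetical, and 36 slots per node simply mirrors the core count described earlier. The mpirun command is printed rather than executed, so the sketch runs even where Open MPI is not installed.

```shell
#!/bin/sh
# Sketch under assumptions: host names worker01..worker20 and the file name
# hosts.txt are hypothetical; 36 slots per node mirrors the cores described above.

HOSTFILE=hosts.txt
NP=20

# Write one "host slots=N" line per worker node (Open MPI hostfile syntax).
write_hostfile() {
    : > "$1"
    i=1
    while [ "$i" -le 20 ]; do
        printf 'worker%02d slots=36\n' "$i" >> "$1"
        i=$((i + 1))
    done
}

# Sum the slots=N values declared in a hostfile.
total_slots() {
    awk '{
        for (f = 2; f <= NF; f++)
            if ($f ~ /^slots=/) { sub(/^slots=/, "", $f); sum += $f }
    } END { print sum + 0 }' "$1"
}

write_hostfile "$HOSTFILE"
SLOTS=$(total_slots "$HOSTFILE")
echo "hostfile declares $SLOTS slots; requesting $NP processes"

if [ "$NP" -le "$SLOTS" ]; then
    # Printed, not executed: replace echo with the real call on a cluster.
    echo "mpirun --allow-run-as-root --hostfile $HOSTFILE -np $NP ./MY_c_Application"
else
    echo "not enough slots: lower -np or add hosts/slots" >&2
fi
```

On a real cluster the hostfile would list the actual worker host names, and mpirun would need to be able to reach them from the driver. The message also names `--use-hwthread-cpus` and `--oversubscribe` as alternatives when the slot count derived from processor cores is lower than expected.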




Labels: RTE (issue likely is in RTE or PMIx areas)
