Merged
4 changes: 1 addition & 3 deletions python/sparkdl/horovod/runner_base.py
@@ -47,7 +47,7 @@ def __init__(self, np):
which maps to a GPU on a GPU cluster or a CPU core on a CPU cluster.
Accepted values are:

-  - If -1, this will spawn a subprocess on the driver node to run the Horovod job locally.
+  - If <0, this will spawn -np subprocesses on the driver node to run Horovod locally.
Contributor

@mengxr I don't think "-np subprocesses" is straightforward for a layperson. Do you think it's OK for our customers? Do you plan to do the release as well?
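To make the "-np subprocesses" wording concrete, here is a minimal sketch of how the revised docstring reads. It is not code from this PR: `describe_np` is a hypothetical helper written only for illustration, and the positive case is an assumption, since that part of the docstring is not visible in this diff.

```python
def describe_np(np):
    """Hypothetical helper: summarize where the docstring says the job runs.

    Returns a (location, process_count) tuple. process_count is None when
    the count is "all task slots", which depends on the cluster.
    """
    if np < 0:
        # Revised wording: any negative value spawns -np local subprocesses,
        # e.g. np=-4 means 4 subprocesses on the driver node.
        return ("driver-local", -np)
    if np == 0:
        # "If 0, this will use all task slots on the cluster."
        return ("cluster", None)
    # Assumption: a positive np requests np task slots on the cluster.
    return ("cluster", np)


for value in (-4, -1, 0, 2):
    print(value, describe_np(value))
```

With this reading, `np=-1` keeps its old meaning (one local subprocess), while values such as `np=-4`, which the removed check used to reject with a `ValueError`, now mean four local subprocesses.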

Training stdout and stderr messages go to the notebook cell output, and are also
available in driver logs in case the cell output is truncated. This is useful for
debugging and we recommend testing your code under this mode first. However, be
@@ -63,8 +63,6 @@ def __init__(self, np):
- If 0, this will use all task slots on the cluster to launch the job.
"""
self.num_processor = np
-  if self.num_processor < -1:
-  raise ValueError("Invalid number of processes: np = %s" % str(self.num_processor))

def run(self, main, **kwargs):
"""