CLI - Add command sb benchmark [list,list-parameters] #279

Merged · 5 commits · Jan 18, 2022
63 changes: 62 additions & 1 deletion docs/cli.md
@@ -24,6 +24,68 @@ Welcome to the SB CLI!

The following lists the usage and examples of `sb` commands:

### `sb benchmark list`

List benchmarks that match the given name.
```bash title="SB CLI"
sb benchmark list [--name]
```

#### Optional arguments

| Name | Default | Description |
|---------------|---------|---------------------------------------|
| `--name` `-n` | `None` | Benchmark name or regular expression. |

#### Global arguments

| Name | Default | Description |
|---------------|---------|--------------------|
| `--help` `-h` | N/A | Show help message. |

#### Examples

List all benchmarks:
```bash title="SB CLI"
sb benchmark list
```

List all benchmarks ending with "-bw":
```bash title="SB CLI"
sb benchmark list --name [a-z]+-bw
```

### `sb benchmark list-parameters`

List parameters for benchmarks that match the given name.
```bash title="SB CLI"
sb benchmark list-parameters [--name]
```

#### Optional arguments

| Name | Default | Description |
|---------------|---------|---------------------------------------|
| `--name` `-n` | `None` | Benchmark name or regular expression. |

#### Global arguments

| Name | Default | Description |
|---------------|---------|--------------------|
| `--help` `-h` | N/A | Show help message. |

#### Examples

List parameters for all benchmarks:
```bash title="SB CLI"
sb benchmark list-parameters
```

List parameters for all benchmarks that start with "pytorch-":
```bash title="SB CLI"
sb benchmark list-parameters --name pytorch-[a-z]+
```

### `sb deploy`

Deploy the SuperBench environments to all managed nodes.
@@ -217,4 +279,3 @@ Print version:
```bash title="SB CLI"
sb version
```
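A practical note on the regex examples above: in most shells an unquoted pattern such as `[a-z]+-bw` is also a filename glob, so quoting the argument (for example `sb benchmark list --name '[a-z]+-bw'`) is the safer habit; the unquoted form only works as written when no matching files happen to exist in the working directory.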

14 changes: 13 additions & 1 deletion superbench/benchmarks/base.py
@@ -6,13 +6,25 @@
import argparse
import numbers
from datetime import datetime
from operator import attrgetter
from abc import ABC, abstractmethod

from superbench.common.utils import logger
from superbench.benchmarks import BenchmarkType, ReturnCode
from superbench.benchmarks.result import BenchmarkResult


class SortedMetavarTypeHelpFormatter(argparse.MetavarTypeHelpFormatter):
"""Custom HelpFormatter class for argparse which sorts option strings."""
def add_arguments(self, actions):
"""Sort option strings before original add_arguments.

Args:
actions (argparse.Action): Argument parser actions.
"""
super(SortedMetavarTypeHelpFormatter, self).add_arguments(sorted(actions, key=attrgetter('option_strings')))


class Benchmark(ABC):
"""The base class of all benchmarks."""
def __init__(self, name, parameters=''):
@@ -29,7 +41,7 @@ def __init__(self, name, parameters=''):
add_help=False,
usage=argparse.SUPPRESS,
allow_abbrev=False,
formatter_class=argparse.MetavarTypeHelpFormatter
formatter_class=SortedMetavarTypeHelpFormatter,
)
self._args = None
self._curr_run_index = 0
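The formatter change above is self-contained enough to try in isolation. Below is a minimal standalone sketch (not SuperBench code; the three options are made up for illustration) of how sorting actions by `option_strings` alphabetizes argparse help output:

```python
import argparse
from operator import attrgetter


class SortedMetavarTypeHelpFormatter(argparse.MetavarTypeHelpFormatter):
    """Help formatter that lists options alphabetically."""
    def add_arguments(self, actions):
        # Sort actions by their option strings before the base class adds them,
        # so help output is stable regardless of registration order.
        super().add_arguments(sorted(actions, key=attrgetter('option_strings')))


parser = argparse.ArgumentParser(
    add_help=False,
    usage=argparse.SUPPRESS,
    allow_abbrev=False,
    formatter_class=SortedMetavarTypeHelpFormatter,
)
# Registered deliberately out of alphabetical order.
parser.add_argument('--run_count', type=int, help='The run count of benchmark.')
parser.add_argument('--duration', type=int, help='The elapsed time of benchmark in seconds.')
parser.add_argument('--batch_size', type=int, help='The number of batch size.')

# The printed options appear as --batch_size, --duration, --run_count.
print(parser.format_help())
```

This reordering is also what the updated expected-settings strings in the test diffs below are checking.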
59 changes: 59 additions & 0 deletions superbench/cli/_benchmark_handler.py
@@ -0,0 +1,59 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""SuperBench CLI benchmark subgroup command handler."""

import re
from pprint import pformat

from knack.util import CLIError

from superbench.benchmarks import Platform, BenchmarkRegistry


def benchmark_list_command_handler(name=None):
"""List benchmarks which match the name.

Args:
name (str, optional): Benchmark name or regular expression. Defaults to None.

Raises:
CLIError: If cannot find the matching benchmark.

Returns:
list: Benchmark list.
"""
benchmark_list = list(BenchmarkRegistry.get_all_benchmark_predefine_settings().keys())
if name is None:
return benchmark_list
filter_list = list(filter(re.compile(name).match, benchmark_list))
if not filter_list:
raise CLIError('Benchmark {} does not exist.'.format(name))
return filter_list


def benchmark_list_params_command_handler(name=None):
"""List parameters for benchmarks which match the name.

Args:
name (str, optional): Benchmark name or regular expression. Defaults to None.

Raises:
CLIError: If cannot find the matching benchmark.
"""
for benchmark_name in benchmark_list_command_handler(name):
format_help = ''
for platform in Platform:
if platform in BenchmarkRegistry.benchmarks[benchmark_name]:
format_help = BenchmarkRegistry.get_benchmark_configurable_settings(
BenchmarkRegistry.create_benchmark_context(benchmark_name, platform=platform)
)
break
print(
(
f'=== {benchmark_name} ===\n\n'
f'{format_help}\n\n'
f'default values:\n'
f'{pformat(BenchmarkRegistry.benchmarks[benchmark_name]["predefine_param"])}\n'
)
)
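One behavioral detail of `benchmark_list_command_handler` worth calling out: `filter(re.compile(name).match, ...)` anchors the pattern at the start of each benchmark name but not at the end. A small sketch of the filtering semantics (the name list here is illustrative, not the full registry):

```python
import re

# Stand-in for BenchmarkRegistry.get_all_benchmark_predefine_settings().keys().
benchmark_list = ['mem-bw', 'ib-loopback', 'kernel-launch', 'pytorch-bert-base']

# re.match anchors only at the start, so '[a-z]+-bw' matches 'mem-bw' (and
# would also match a hypothetical 'mem-bw-v2'), while 'ib-loopback' and
# 'kernel-launch' are filtered out.
print(list(filter(re.compile(r'[a-z]+-bw').match, benchmark_list)))
# ['mem-bw']
print(list(filter(re.compile(r'pytorch-[a-z]+').match, benchmark_list)))
# ['pytorch-bert-base']
```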
8 changes: 8 additions & 0 deletions superbench/cli/_commands.py
@@ -23,6 +23,9 @@ def load_command_table(self, args):
g.command('deploy', 'deploy_command_handler')
g.command('exec', 'exec_command_handler')
g.command('run', 'run_command_handler')
with CommandGroup(self, 'benchmark', 'superbench.cli._benchmark_handler#{}') as g:
g.command('list', 'benchmark_list_command_handler')
g.command('list-parameters', 'benchmark_list_params_command_handler')
with CommandGroup(self, 'node', 'superbench.cli._node_handler#{}') as g:
g.command('info', 'info_command_handler')
with CommandGroup(self, 'result', 'superbench.cli._result_handler#{}') as g:
@@ -61,6 +64,10 @@ def load_arguments(self, command):
nargs='+',
help='Extra arguments to override config_file.'
)

with ArgumentsContext(self, 'benchmark') as ac:
ac.argument('name', options_list=('--name', '-n'), type=str, help='Benchmark name or regular expression.')

with ArgumentsContext(self, 'result') as ac:
ac.argument('raw_data_file', options_list=('--data-file', '-d'), type=str, help='Path to raw data file.')
ac.argument('rule_file', options_list=('--rule-file', '-r'), type=str, help='Path to rule file.')
@@ -73,4 +80,5 @@ def load_arguments(self, command):
help='Path to output directory, outputs/{datetime} will be used if not specified.'
)
ac.argument('output_file_format', type=str, help='Format of output file, excel or json.')

super().load_arguments(command)
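The `_commands.py` wiring follows knack's loader pattern: `CommandGroup` maps CLI verbs to handler functions through a dotted-path template, and `ArgumentsContext` binds option flags to handler parameters by name. A stripped-down sketch of the same pattern (module path and handler names are placeholders, not SuperBench's):

```python
from knack.arguments import ArgumentsContext
from knack.commands import CLICommandsLoader, CommandGroup


class MyCommandsLoader(CLICommandsLoader):
    def load_command_table(self, args):
        # '{}' in the template is replaced with each handler's function name.
        with CommandGroup(self, 'benchmark', 'mypkg.handlers#{}') as g:
            g.command('list', 'benchmark_list_command_handler')
        return super().load_command_table(args)

    def load_arguments(self, command):
        # 'name' matches the handler's keyword argument; knack exposes it on
        # the command line as --name / -n.
        with ArgumentsContext(self, 'benchmark') as ac:
            ac.argument('name', options_list=('--name', '-n'), type=str,
                        help='Benchmark name or regular expression.')
        super().load_arguments(command)
```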
47 changes: 41 additions & 6 deletions superbench/cli/_help.py
@@ -61,8 +61,31 @@
text: {cli_name} run --docker-image superbench/cuda:11.1 --host-file ./host.ini
""".format(cli_name=CLI_NAME)

helps['benchmark'] = """
type: group
short-summary: Commands to manage benchmarks.
"""

helps['benchmark list'] = """
type: command
examples:
- name: list all benchmarks
text: {cli_name} benchmark list
- name: list all benchmarks ending with "-bw"
text: {cli_name} benchmark list --name [a-z]+-bw
""".format(cli_name=CLI_NAME)

helps['benchmark list-parameters'] = """
type: command
examples:
- name: list parameters for all benchmarks
text: {cli_name} benchmark list-parameters
- name: list parameters for all benchmarks that start with "pytorch-"
text: {cli_name} benchmark list-parameters --name pytorch-[a-z]+
""".format(cli_name=CLI_NAME)

helps['node'] = """
type: Group
type: group
short-summary: Get detailed information or configurations on the local node.
"""

@@ -75,19 +98,31 @@
""".format(cli_name=CLI_NAME)

helps['result'] = """
type: Group
type: group
short-summary: Process or analyze the results of SuperBench benchmarks.
"""

helps['result diagnosis'] = """
type: command
short-summary: Filter the defective machines automatically from benchmarking results according to rules defined in rule file.
short-summary: >
Filter the defective machines automatically from benchmarking results
according to rules defined in rule file.
examples:
- name: run data diagnosis and output the results in excel format
text: {cli_name} result diagnosis --data-file 'outputs/results-summary.jsonl' --rule-file 'rule.yaml' --baseline-file 'baseline.json' --output-file-foramt 'excel'
text: >
{cli_name} result diagnosis
--data-file outputs/results-summary.jsonl
--rule-file rule.yaml
--baseline-file baseline.json
--output-file-foramt excel
- name: run data diagnosis and output the results in jsonl format
text: {cli_name} result diagnosis --data-file 'outputs/results-summary.jsonl' --rule-file 'rule.yaml' --baseline-file 'baseline.json' --output-file-foramt 'json'
""".format(cli_name=CLI_NAME) # noqa: E501
text: >
{cli_name} result diagnosis
--data-file outputs/results-summary.jsonl
--rule-file rule.yaml
--baseline-file baseline.json
--output-file-foramt json
""".format(cli_name=CLI_NAME)


class SuperBenchCLIHelp(CLIHelp):
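Besides adding the new `benchmark` help entries and normalizing `type: Group` to `type: group`, this hunk reflows the long `result diagnosis` examples with YAML folded scalars (`>`), which join the wrapped lines back into a single command when the help is rendered; that is what lets the `# noqa: E501` line-length suppression on the old one-line version be dropped.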
56 changes: 28 additions & 28 deletions tests/benchmarks/model_benchmarks/test_model_base.py
@@ -147,26 +147,26 @@ def test_arguments_related_interfaces():
settings = benchmark.get_configurable_settings()
expected_settings = (
"""optional arguments:
--run_count int The run count of benchmark.
--duration int The elapsed time of benchmark in seconds.
--num_warmup int The number of warmup step.
--num_steps int The number of test step.
--sample_count int The number of data samples in dataset.
--batch_size int The number of batch size.
--precision Precision [Precision ...]
Model precision. E.g. float16 float32 float64 bfloat16
uint8 int8 int16 int32 int64.
--model_action ModelAction [ModelAction ...]
Benchmark model process. E.g. train inference.
--distributed_backend DistributedBackend
Distributed backends. E.g. nccl mpi gloo.
--distributed_impl DistributedImpl
Distributed implementations. E.g. ddp mirrored
multiworkermirrored parameterserver horovod.
--distributed_backend DistributedBackend
Distributed backends. E.g. nccl mpi gloo.
--no_gpu Disable GPU training.
--pin_memory Enable option to pin memory in data loader.
--duration int The elapsed time of benchmark in seconds.
--force_fp32 Enable option to use full float32 precision.
--hidden_size int Hidden size.
--model_action ModelAction [ModelAction ...]
Benchmark model process. E.g. train inference.
--no_gpu Disable GPU training.
--num_steps int The number of test step.
--num_warmup int The number of warmup step.
--pin_memory Enable option to pin memory in data loader.
--precision Precision [Precision ...]
Model precision. E.g. float16 float32 float64 bfloat16
uint8 int8 int16 int32 int64.
--run_count int The run count of benchmark.
--sample_count int The number of data samples in dataset.
--seq_len int Sequence length."""
)
assert (settings == expected_settings)
@@ -181,26 +181,26 @@ def test_preprocess():
settings = benchmark.get_configurable_settings()
expected_settings = (
"""optional arguments:
--run_count int The run count of benchmark.
--duration int The elapsed time of benchmark in seconds.
--num_warmup int The number of warmup step.
--num_steps int The number of test step.
--sample_count int The number of data samples in dataset.
--batch_size int The number of batch size.
--precision Precision [Precision ...]
Model precision. E.g. float16 float32 float64 bfloat16
uint8 int8 int16 int32 int64.
--model_action ModelAction [ModelAction ...]
Benchmark model process. E.g. train inference.
--distributed_backend DistributedBackend
Distributed backends. E.g. nccl mpi gloo.
--distributed_impl DistributedImpl
Distributed implementations. E.g. ddp mirrored
multiworkermirrored parameterserver horovod.
--distributed_backend DistributedBackend
Distributed backends. E.g. nccl mpi gloo.
--no_gpu Disable GPU training.
--pin_memory Enable option to pin memory in data loader.
--duration int The elapsed time of benchmark in seconds.
--force_fp32 Enable option to use full float32 precision.
--hidden_size int Hidden size.
--model_action ModelAction [ModelAction ...]
Benchmark model process. E.g. train inference.
--no_gpu Disable GPU training.
--num_steps int The number of test step.
--num_warmup int The number of warmup step.
--pin_memory Enable option to pin memory in data loader.
--precision Precision [Precision ...]
Model precision. E.g. float16 float32 float64 bfloat16
uint8 int8 int16 int32 int64.
--run_count int The run count of benchmark.
--sample_count int The number of data samples in dataset.
--seq_len int Sequence length."""
)
assert (settings == expected_settings)
2 changes: 1 addition & 1 deletion tests/benchmarks/test_registry.py
@@ -113,9 +113,9 @@ def test_get_benchmark_configurable_settings():
settings = BenchmarkRegistry.get_benchmark_configurable_settings(context)

expected = """optional arguments:
--run_count int The run count of benchmark.
--duration int The elapsed time of benchmark in seconds.
--lower_bound int The lower bound for accumulation.
--run_count int The run count of benchmark.
--upper_bound int The upper bound for accumulation."""
assert (settings == expected)

17 changes: 16 additions & 1 deletion tests/cli/test_sb.py
@@ -6,11 +6,12 @@
import io
import contextlib
from functools import wraps
from knack.testsdk import ScenarioTest, StringCheck, NoneCheck
from knack.testsdk import ScenarioTest, StringCheck, NoneCheck, JMESPathCheck
from pathlib import Path

import superbench
from superbench.cli import SuperBenchCLI
from superbench.benchmarks import BenchmarkRegistry


def capture_system_exit(func):
@@ -83,6 +84,20 @@ def test_sb_run_nonexist_host_file(self):
result = self.cmd('sb run --host-file ./nonexist.yaml', expect_failure=True)
self.assertEqual(result.exit_code, 1)

def test_sb_benchmark_list(self):
"""Test sb benchmark list."""
self.cmd('sb benchmark list', checks=[JMESPathCheck('length(@)', len(BenchmarkRegistry.benchmarks))])

def test_sb_benchmark_list_nonexist(self):
"""Test sb benchmark list, give a non-exist benchmark name, should fail."""
result = self.cmd('sb benchmark list -n non-exist-name', expect_failure=True)
self.assertEqual(result.exit_code, 1)

def test_sb_benchmark_list_parameters(self):
"""Test sb benchmark list-parameters."""
self.cmd('sb benchmark list-parameters', checks=[NoneCheck()])
self.cmd('sb benchmark list-parameters -n pytorch-[a-z]+', checks=[NoneCheck()])
Comment on lines +98 to +99

Contributor: Do we need to check the exit_code?

Member (Author): By default `expect_failure` is False, so the exit code is expected to be 0; otherwise the case will fail.

def test_sb_node_info(self):
"""Test sb node info, should fail."""
self.cmd('sb node info', expect_failure=False)
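On the new tests: `ScenarioTest` checks from `knack.testsdk` evaluate JMESPath expressions against the command's parsed JSON output, so `JMESPathCheck('length(@)', len(BenchmarkRegistry.benchmarks))` asserts that `sb benchmark list` prints one entry per registered benchmark. A minimal sketch of the assertion using the `jmespath` package directly (the output list is a stand-in):

```python
import jmespath

# Stand-in for the JSON array that `sb benchmark list` prints.
output = ['kernel-launch', 'mem-bw']

# 'length(@)' is JMESPath for "length of the current document", so the check
# passes when the array has exactly the expected number of items.
assert jmespath.search('length(@)', output) == 2
```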