
Commit

Rename master -> coordinator
Signed-off-by: Thane Thomson <connect@thanethomson.com>
thanethomson committed Nov 1, 2021
1 parent fa3b67e commit 8e95c8b
Showing 12 changed files with 658 additions and 657 deletions.
54 changes: 27 additions & 27 deletions README.md
@@ -6,8 +6,8 @@ be the successor to [`tm-bench`](https://github.com/tendermint/tendermint/tree/m

Naturally, any transactions sent to a Tendermint network are specific to the
ABCI application running on that network. As such, the `tm-load-test` tool comes
with built-in support for the `kvstore` ABCI application, but you can
[build your own clients](./pkg/loadtest/README.md) for your own apps.
with built-in support for the `kvstore` ABCI application, but you can [build
your own clients](./pkg/loadtest/README.md) for your own apps.

**NB: `tm-load-test` is currently alpha-quality software. Semantic versioning is
not strictly adhered to prior to a v1.0 release, so breaking API changes can
@@ -28,7 +28,7 @@ make
## Usage

`tm-load-test` can be executed in one of two modes: **standalone**, or
**master/worker**.
**coordinator/worker**.

### Standalone Mode

@@ -46,23 +46,23 @@ To see a description of what all of the parameters mean, simply run:
tm-load-test --help
```

### Master/Worker Mode
### Coordinator/Worker Mode

In master/worker mode, which is best used for large-scale, distributed load
In coordinator/worker mode, which is best used for large-scale, distributed load
testing, `tm-load-test` allows you to have multiple worker machines connect to a
single master to obtain their configuration and coordinate their operation.
single coordinator to obtain their configuration and coordinate their operation.

The master acts as a simple WebSockets host, and the workers are WebSockets
The coordinator acts as a simple WebSockets host, and the workers are WebSockets
clients.

On the master machine:
On the coordinator machine:

```bash
# Run tm-load-test with similar parameters to the standalone mode, but now
# specifying the number of workers to expect (--expect-workers) and the host:port
# to which to bind (--bind) and listen for incoming worker requests.
tm-load-test \
master \
coordinator \
--expect-workers 2 \
--bind localhost:26670 \
-c 1 -T 10 -r 1000 -s 250 \
@@ -73,14 +73,14 @@ tm-load-test \
On each worker machine:

```bash
# Just tell the worker where to find the master - it will figure out the rest.
tm-load-test worker --master localhost:26680
# Just tell the worker where to find the coordinator - it will figure out the rest.
tm-load-test worker --coordinator localhost:26680
```

For more help, see the command line parameters' descriptions:

```bash
tm-load-test master --help
tm-load-test coordinator --help
tm-load-test worker --help
```
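
For a quick local smoke test of the coordinator/worker topology described above, the same commands can also be driven programmatically. This is only a sketch under assumptions that are not part of this commit: the `tm-load-test` binary is on your PATH, the Tendermint endpoint URL is a placeholder, and the worker address is assumed to match the coordinator's `--bind` port.

```go
package main

import (
	"log"
	"os"
	"os/exec"
	"sync"
)

// Minimal smoke-test sketch: run one coordinator and two workers as
// subprocesses. Assumes the tm-load-test binary is on PATH and that a
// kvstore-backed Tendermint node is reachable at the placeholder endpoint.
func main() {
	endpoint := "ws://tm-endpoint1.somewhere.com:26657/websocket" // placeholder

	coordinator := exec.Command("tm-load-test", "coordinator",
		"--expect-workers", "2",
		"--bind", "localhost:26670",
		"-c", "1", "-T", "10", "-r", "1000", "-s", "250",
		"--endpoints", endpoint,
	)
	// Workers connect to the coordinator's --bind address and keep retrying
	// until their connect-timeout expires, so start order is not critical.
	procs := []*exec.Cmd{
		coordinator,
		exec.Command("tm-load-test", "worker", "--coordinator", "localhost:26670"),
		exec.Command("tm-load-test", "worker", "--coordinator", "localhost:26670"),
	}

	var wg sync.WaitGroup
	for _, cmd := range procs {
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Start(); err != nil {
			log.Fatalf("failed to start %v: %v", cmd.Args, err)
		}
		wg.Add(1)
		go func(c *exec.Cmd) {
			defer wg.Done()
			if err := c.Wait(); err != nil {
				log.Printf("%v exited with error: %v", c.Args, err)
			}
		}(cmd)
	}
	wg.Wait()
}
```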

@@ -122,9 +122,9 @@ application, see the [`loadtest` package docs here](./pkg/loadtest/README.md).

## Monitoring

As of v0.4.1, `tm-load-test` exposes a number of metrics when in master/worker
mode, but only from the master's web server at the `/metrics` endpoint. So if
you bind your master node to `localhost:26670`, you should be able to get these
As of v0.4.1, `tm-load-test` exposes a number of metrics when in coordinator/worker
mode, but only from the coordinator's web server at the `/metrics` endpoint. So if
you bind your coordinator node to `localhost:26670`, you should be able to get these
metrics from:

```bash
@@ -133,16 +133,16 @@ curl http://localhost:26670/metrics

The following kinds of metrics are made available here:

* Total number of transactions recorded from the master's perspective (across
all workers)
* Total number of transactions recorded from the coordinator's perspective
(across all workers)
* Total number of transactions sent by each worker
* The status of the master node, which is a gauge that indicates one of the
* The status of the coordinator node, which is a gauge that indicates one of the
following codes:
* 0 = Master starting
* 1 = Master waiting for all peers to connect
* 2 = Master waiting for all workers to connect
* 0 = Coordinator starting
* 1 = Coordinator waiting for all peers to connect
* 2 = Coordinator waiting for all workers to connect
* 3 = Load test underway
* 4 = Master and/or one or more worker(s) failed
* 4 = Coordinator and/or one or more worker(s) failed
* 5 = All workers completed load testing successfully
* The status of each worker node, which is also a gauge that indicates one of
the following codes:
@@ -155,12 +155,12 @@ The following kinds of metrics are made available here:
* Standard Prometheus-provided metrics about the garbage collector in
`tm-load-test`
* The ID of the load test currently underway (defaults to 0), set by way of the
`--load-test-id` flag on the master
`--load-test-id` flag on the coordinator
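
Since these are plain Prometheus text-format metrics served over HTTP, they can be inspected without any Prometheus infrastructure at all. Below is a minimal sketch, assuming the coordinator was bound to `localhost:26670` as in the example above; the individual metric names are not listed in this README, so the snippet simply prints every non-comment sample line.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

// Minimal sketch: dump the coordinator's Prometheus metrics.
// Assumes the coordinator was started with --bind localhost:26670.
func main() {
	resp, err := http.Get("http://localhost:26670/metrics")
	if err != nil {
		log.Fatalf("failed to fetch metrics: %v", err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// Skip Prometheus HELP/TYPE comment lines; print everything else.
		if strings.HasPrefix(line, "#") {
			continue
		}
		fmt.Println(line)
	}
	if err := scanner.Err(); err != nil {
		log.Fatalf("error reading response: %v", err)
	}
}
```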

## Aggregate Statistics

As of `tm-load-test` v0.7.0, one can now write simple aggregate statistics to
a CSV file once testing completes by specifying the `--stats-output` flag:
As of `tm-load-test` v0.7.0, one can now write simple aggregate statistics to a
CSV file once testing completes by specifying the `--stats-output` flag:

```bash
# In standalone mode
@@ -169,9 +169,9 @@ tm-load-test -c 1 -T 10 -r 1000 -s 250 \
--endpoints ws://tm-endpoint1.somewhere.com:26657/websocket,ws://tm-endpoint2.somewhere.com:26657/websocket \
--stats-output /path/to/save/stats.csv

# From the master in master/worker mode
# From the coordinator in coordinator/worker mode
tm-load-test \
master \
coordinator \
--expect-workers 2 \
--bind localhost:26670 \
-c 1 -T 10 -r 1000 -s 250 \
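
The resulting CSV can then be post-processed with any tooling. Here is a minimal sketch that just loads and prints whatever rows `tm-load-test` wrote; the column layout is not documented in this README, and the file path below is the placeholder from the example above.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

// Minimal sketch: read the aggregate statistics CSV written via --stats-output.
// The column layout is whatever tm-load-test produced; we just print the rows.
func main() {
	f, err := os.Open("/path/to/save/stats.csv") // placeholder path from the README example
	if err != nil {
		log.Fatalf("failed to open stats file: %v", err)
	}
	defer f.Close()

	records, err := csv.NewReader(f).ReadAll()
	if err != nil {
		log.Fatalf("failed to parse CSV: %v", err)
	}
	for i, record := range records {
		fmt.Printf("row %d: %v\n", i, record)
	}
}
```
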
12 changes: 6 additions & 6 deletions cmd/tm-load-test/main.go
@@ -4,7 +4,7 @@ import (
"github.com/informalsystems/tm-load-test/pkg/loadtest"
)

const appLongDesc = `Load testing application for Tendermint with optional master/worker mode.
const appLongDesc = `Load testing application for Tendermint with optional coordinator/worker mode.
Generates large quantities of arbitrary transactions and submits those
transactions to one or more Tendermint endpoints. By default, it assumes that
you are running the kvstore ABCI application on your Tendermint network.
@@ -16,7 +16,7 @@ To run the application in a similar fashion to tm-bench (STANDALONE mode):
To run the application in MASTER mode:
tm-load-test \
master \
coordinator \
--expect-workers 2 \
--bind localhost:26670 \
--shutdown-wait 60 \
@@ -25,17 +25,17 @@ To run the application in MASTER mode:
--endpoints ws://tm-endpoint1.somewhere.com:26657/websocket,ws://tm-endpoint2.somewhere.com:26657/websocket
To run the application in SLAVE mode:
tm-load-test worker --master localhost:26680
tm-load-test worker --coordinator localhost:26680
NOTES:
* MASTER mode exposes a "/metrics" endpoint in Prometheus plain text format
which shows total number of transactions and the status for the master and
all connected workers.
which shows total number of transactions and the status for the coordinator
and all connected workers.
* The "--shutdown-wait" flag in MASTER mode is specifically to allow your
monitoring system some time to obtain the final Prometheus metrics from the
metrics endpoint.
* In SLAVE mode, all load testing-related flags are ignored. The worker always
takes instructions from the master node it's connected to.
takes instructions from the coordinator node it's connected to.
`

func main() {
2 changes: 1 addition & 1 deletion pkg/loadtest/README.md
@@ -83,7 +83,7 @@ func main() {
panic(err)
}
// The loadtest.Run method will handle CLI argument parsing, errors,
// configuration, instantiating the load test and/or master/worker
// configuration, instantiating the load test and/or coordinator/worker
// operations, etc. All it needs is to know which client factory to use for
// its load testing.
loadtest.Run(&loadtest.CLIConfig{
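
For orientation, the hunk above is the point where a custom ABCI-app client hooks into `loadtest.Run`, which in turn drives the standalone and coordinator/worker subcommands touched by this commit. The sketch below follows the kvstore-style example in `pkg/loadtest`; the interface method names and `CLIConfig` fields are assumptions drawn from that example rather than from this diff, so verify them against the package docs at this commit.

```go
package main

import (
	"github.com/informalsystems/tm-load-test/pkg/loadtest"
)

// MyABCIAppClientFactory creates clients for a hypothetical ABCI app.
type MyABCIAppClientFactory struct{}

// MyABCIAppClient generates transactions for that app.
type MyABCIAppClient struct{}

// Assumed interfaces, as in the kvstore example; check pkg/loadtest for the
// exact method sets at this commit.
var (
	_ loadtest.ClientFactory = (*MyABCIAppClientFactory)(nil)
	_ loadtest.Client        = (*MyABCIAppClient)(nil)
)

func (f *MyABCIAppClientFactory) ValidateConfig(cfg loadtest.Config) error {
	// Check, for example, that cfg.Size is large enough for your tx format.
	return nil
}

func (f *MyABCIAppClientFactory) NewClient(cfg loadtest.Config) (loadtest.Client, error) {
	return &MyABCIAppClient{}, nil
}

func (c *MyABCIAppClient) GenerateTx() ([]byte, error) {
	// Return one raw transaction; tm-load-test handles broadcasting it.
	return []byte("my-arbitrary-tx"), nil
}

func main() {
	if err := loadtest.RegisterClientFactory("my-abci-app", &MyABCIAppClientFactory{}); err != nil {
		panic(err)
	}
	// loadtest.Run handles CLI parsing, configuration and the
	// coordinator/worker orchestration, as noted in the hunk above.
	loadtest.Run(&loadtest.CLIConfig{
		AppName:              "my-load-tester",
		AppShortDesc:         "Load testing application for my ABCI app",
		AppLongDesc:          "Longer description of my load testing tool",
		DefaultClientFactory: "my-abci-app",
	})
}
```
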
30 changes: 15 additions & 15 deletions pkg/loadtest/cli.go
@@ -66,32 +66,32 @@ func buildCLI(cli *CLIConfig, logger logging.Logger) *cobra.Command {
rootCmd.PersistentFlags().StringVar(&cfg.StatsOutputFile, "stats-output", "", "Where to store aggregate statistics (in CSV format) for the load test")
rootCmd.PersistentFlags().BoolVarP(&flagVerbose, "verbose", "v", false, "Increase output logging verbosity to DEBUG level")

var masterCfg MasterConfig
masterCmd := &cobra.Command{
Use: "master",
var coordCfg CoordinatorConfig
coordCmd := &cobra.Command{
Use: "coordinator",
Short: "Start load test application in MASTER mode",
Run: func(cmd *cobra.Command, args []string) {
logger.Debug(fmt.Sprintf("Configuration: %s", cfg.ToJSON()))
logger.Debug(fmt.Sprintf("Master configuration: %s", masterCfg.ToJSON()))
logger.Debug(fmt.Sprintf("Coordinator configuration: %s", coordCfg.ToJSON()))
if err := cfg.Validate(); err != nil {
logger.Error(err.Error())
os.Exit(1)
}
if err := masterCfg.Validate(); err != nil {
if err := coordCfg.Validate(); err != nil {
logger.Error(err.Error())
os.Exit(1)
}
master := NewMaster(&cfg, &masterCfg)
if err := master.Run(); err != nil {
coord := NewCoordinator(&cfg, &coordCfg)
if err := coord.Run(); err != nil {
os.Exit(1)
}
},
}
masterCmd.PersistentFlags().StringVar(&masterCfg.BindAddr, "bind", "localhost:26670", "A host:port combination to which to bind the master on which to listen for worker connections")
masterCmd.PersistentFlags().IntVar(&masterCfg.ExpectWorkers, "expect-workers", 2, "The number of workers to expect to connect to the master before starting load testing")
masterCmd.PersistentFlags().IntVar(&masterCfg.WorkerConnectTimeout, "connect-timeout", 180, "The maximum number of seconds to wait for all workers to connect")
masterCmd.PersistentFlags().IntVar(&masterCfg.ShutdownWait, "shutdown-wait", 0, "The number of seconds to wait after testing completes prior to shutting down the web server")
masterCmd.PersistentFlags().IntVar(&masterCfg.LoadTestID, "load-test-id", 0, "The ID of the load test currently underway")
coordCmd.PersistentFlags().StringVar(&coordCfg.BindAddr, "bind", "localhost:26670", "A host:port combination to which to bind the coordinator on which to listen for worker connections")
coordCmd.PersistentFlags().IntVar(&coordCfg.ExpectWorkers, "expect-workers", 2, "The number of workers to expect to connect to the coordinator before starting load testing")
coordCmd.PersistentFlags().IntVar(&coordCfg.WorkerConnectTimeout, "connect-timeout", 180, "The maximum number of seconds to wait for all workers to connect")
coordCmd.PersistentFlags().IntVar(&coordCfg.ShutdownWait, "shutdown-wait", 0, "The number of seconds to wait after testing completes prior to shutting down the web server")
coordCmd.PersistentFlags().IntVar(&coordCfg.LoadTestID, "load-test-id", 0, "The ID of the load test currently underway")

var workerCfg WorkerConfig
workerCmd := &cobra.Command{
@@ -114,8 +114,8 @@ func buildCLI(cli *CLIConfig, logger logging.Logger) *cobra.Command {
},
}
workerCmd.PersistentFlags().StringVar(&workerCfg.ID, "id", "", "An optional unique ID for this worker. Will show up in metrics and logs. If not specified, a UUID will be generated.")
workerCmd.PersistentFlags().StringVar(&workerCfg.MasterAddr, "master", "ws://localhost:26670", "The WebSockets URL on which to find the master node")
workerCmd.PersistentFlags().IntVar(&workerCfg.MasterConnectTimeout, "connect-timeout", 180, "The maximum number of seconds to keep trying to connect to the master")
workerCmd.PersistentFlags().StringVar(&workerCfg.CoordAddr, "coordinator", "ws://localhost:26670", "The WebSockets URL on which to find the coordinator node")
workerCmd.PersistentFlags().IntVar(&workerCfg.CoordConnectTimeout, "connect-timeout", 180, "The maximum number of seconds to keep trying to connect to the coordinator")

versionCmd := &cobra.Command{
Use: "version",
@@ -129,7 +129,7 @@ },
},
}

rootCmd.AddCommand(masterCmd)
rootCmd.AddCommand(coordCmd)
rootCmd.AddCommand(workerCmd)
rootCmd.AddCommand(versionCmd)
return rootCmd
4 changes: 2 additions & 2 deletions pkg/loadtest/client_kvstore_test.go
@@ -91,8 +91,8 @@ func BenchmarkKVStoreClient_GenerateTx_100kB(b *testing.B) {
}

func TestKVStoreClient(t *testing.T) {
testCases := []struct{
config loadtest.Config
testCases := []struct {
config loadtest.Config
clientCount int
}{
{loadtest.Config{Size: 32, Count: 1000}, 5},
32 changes: 16 additions & 16 deletions pkg/loadtest/config.go
@@ -38,9 +38,9 @@ type Config struct {
NoTrapInterrupts bool `json:"no_trap_interrupts"` // Should we avoid trapping Ctrl+Break? Only relevant for standalone execution mode.
}

// MasterConfig is the configuration options specific to a master node.
type MasterConfig struct {
BindAddr string `json:"bind_addr"` // The "host:port" to which to bind the master node to listen for incoming workers.
// CoordinatorConfig is the configuration options specific to a coordinator node.
type CoordinatorConfig struct {
BindAddr string `json:"bind_addr"` // The "host:port" to which to bind the coordinator node to listen for incoming workers.
ExpectWorkers int `json:"expect_workers"` // The number of workers to expect before starting the load test.
WorkerConnectTimeout int `json:"connect_timeout"` // The number of seconds to wait for all workers to connect.
ShutdownWait int `json:"shutdown_wait"` // The number of seconds to wait at shutdown (while keeping the HTTP server running - primarily to allow Prometheus to keep polling).
@@ -49,9 +49,9 @@ type MasterConfig struct {

// WorkerConfig is the configuration options specific to a worker node.
type WorkerConfig struct {
ID string `json:"id"` // A unique ID for this worker instance. Will show up in the metrics reported by the master for this worker.
MasterAddr string `json:"master_addr"` // The address at which to find the master node.
MasterConnectTimeout int `json:"connect_timeout"` // The maximum amount of time, in seconds, to allow for the master to become available.
ID string `json:"id"` // A unique ID for this worker instance. Will show up in the metrics reported by the coordinator for this worker.
CoordAddr string `json:"coord_addr"` // The address at which to find the coordinator node.
CoordConnectTimeout int `json:"connect_timeout"` // The maximum amount of time, in seconds, to allow for the coordinator to become available.
}

var validBroadcastTxMethods = map[string]interface{}{
@@ -120,26 +120,26 @@ func (c Config) MaxTxsPerEndpoint() uint64 {
return uint64(c.Rate) * uint64(c.Time)
}

func (c MasterConfig) ToJSON() string {
func (c CoordinatorConfig) ToJSON() string {
b, err := json.Marshal(c)
if err != nil {
return fmt.Sprintf("%v", c)
}
return string(b)
}

func (c MasterConfig) Validate() error {
func (c CoordinatorConfig) Validate() error {
if len(c.BindAddr) == 0 {
return fmt.Errorf("master bind address must be specified")
return fmt.Errorf("coordinator bind address must be specified")
}
if c.ExpectWorkers < 1 {
return fmt.Errorf("master expect-workers must be at least 1, but got %d", c.ExpectWorkers)
return fmt.Errorf("coordinator expect-workers must be at least 1, but got %d", c.ExpectWorkers)
}
if c.WorkerConnectTimeout < 1 {
return fmt.Errorf("master connect-timeout must be at least 1 second")
return fmt.Errorf("coordinator connect-timeout must be at least 1 second")
}
if c.LoadTestID < 0 {
return fmt.Errorf("master load-test-id must be 0 or greater")
return fmt.Errorf("coordinator load-test-id must be 0 or greater")
}
return nil
}
@@ -156,11 +156,11 @@ func (c WorkerConfig) Validate() error {
if len(c.ID) > 0 && !isValidWorkerID(c.ID) {
return fmt.Errorf("Invalid worker ID \"%s\": worker IDs can only be lowercase alphanumeric characters", c.ID)
}
if len(c.MasterAddr) == 0 {
return fmt.Errorf("master address must be specified")
if len(c.CoordAddr) == 0 {
return fmt.Errorf("coordinator address must be specified")
}
if c.MasterConnectTimeout < 1 {
return fmt.Errorf("expected connect-timeout to be >= 1, but was %d", c.MasterConnectTimeout)
if c.CoordConnectTimeout < 1 {
return fmt.Errorf("expected connect-timeout to be >= 1, but was %d", c.CoordConnectTimeout)
}
return nil
}
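
Both renamed config structs are self-validating, so a coordinator/worker setup can be sanity-checked programmatically before anything is launched. A minimal sketch using only the fields and methods visible in this diff (import path as in `cmd/tm-load-test/main.go`); the concrete values are illustrative:

```go
package main

import (
	"fmt"

	"github.com/informalsystems/tm-load-test/pkg/loadtest"
)

func main() {
	// Coordinator-side settings, mirroring the README example values.
	coordCfg := loadtest.CoordinatorConfig{
		BindAddr:             "localhost:26670",
		ExpectWorkers:        2,
		WorkerConnectTimeout: 180,
		ShutdownWait:         60,
		LoadTestID:           0,
	}
	if err := coordCfg.Validate(); err != nil {
		panic(fmt.Sprintf("invalid coordinator config: %v", err))
	}
	fmt.Println("coordinator config:", coordCfg.ToJSON())

	// Worker-side settings; the ID must be lowercase alphanumeric.
	workerCfg := loadtest.WorkerConfig{
		ID:                  "worker1",
		CoordAddr:           "ws://localhost:26670",
		CoordConnectTimeout: 180,
	}
	if err := workerCfg.Validate(); err != nil {
		panic(fmt.Sprintf("invalid worker config: %v", err))
	}
	fmt.Println("worker config is valid")
}
```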
