Merge pull request #1854 from huafengw/fix#1833
fix #1833 update performance related doc
clockfly committed Jan 12, 2016
2 parents 0622a07 + 193b06c commit ba3037a
Showing 5 changed files with 13 additions and 39 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -20,7 +20,7 @@ We model streaming within the Akka actor hierarchy.

![](https://raw.githubusercontent.com/gearpump/gearpump/master/docs/img/actor_hierarchy.png)

Per initial benchmarks we are able to process 11 million messages/second (100 bytes per message) with a 17ms latency on a 4-node cluster.
Per initial benchmarks we are able to process nearly 18 million messages/second (100 bytes per message) with an 8ms latency on a 4-node cluster.

![](https://raw.githubusercontent.com/gearpump/gearpump/master/docs/img/dashboard.png)

2 changes: 1 addition & 1 deletion docs/features.md
@@ -39,7 +39,7 @@ Gearpump is a message level streaming engine, which means every task in the DAG

#### High Performance message passing

By implementing smart batching strategies, Gearpump is extremely effective in transferring small messages. In one test of 4 machines, the whole cluster throughput can reach 11 million messages per second, with message size of 100 bytes.
By implementing smart batching strategies, Gearpump is extremely effective in transferring small messages. In one test on 4 machines, the whole cluster's throughput can reach 18 million messages per second, with a message size of 100 bytes.
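The batching idea can be pictured with a minimal sketch (hypothetical code, not Gearpump's actual implementation; the `Batcher` class and the 4KB threshold are illustrative assumptions): accumulate small messages and hand them off as one batch once a size threshold is reached, amortizing the per-send network overhead.

```java
// Hypothetical sketch of batching small messages before a network send.
// Not Gearpump's implementation; the class and threshold are illustrative.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class Batcher {
    private final int maxBatchBytes;
    private final Consumer<List<byte[]>> send; // one "network write" per batch
    private final List<byte[]> pending = new ArrayList<>();
    private int pendingBytes = 0;

    public Batcher(int maxBatchBytes, Consumer<List<byte[]>> send) {
        this.maxBatchBytes = maxBatchBytes;
        this.send = send;
    }

    public void offer(byte[] msg) {
        pending.add(msg);
        pendingBytes += msg.length;
        if (pendingBytes >= maxBatchBytes) {
            flush(); // full batch: one send instead of many small ones
        }
    }

    public void flush() {
        if (!pending.isEmpty()) {
            send.accept(new ArrayList<>(pending));
            pending.clear();
            pendingBytes = 0;
        }
    }

    public static void main(String[] args) {
        int[] batches = {0};
        Batcher b = new Batcher(4096, batch -> batches[0]++);
        for (int i = 0; i < 100; i++) {
            b.offer(new byte[100]); // 100-byte messages, as in the benchmark
        }
        b.flush(); // flush the partial final batch
        System.out.println(batches[0] + " sends for 100 messages");
    }
}
```

With 100-byte messages and a 4KB threshold, 100 messages go out in just 3 sends rather than 100, which is the effect the batching strategy relies on.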
![Dashboard](img/dashboard.png)

#### High availability, No single point of failure
Binary file modified docs/img/dashboard.png
2 changes: 1 addition & 1 deletion docs/index.md
@@ -29,7 +29,7 @@ Gearpump's feature set includes:


### Gearpump Performance
Per initial benchmarks we are able to process 11 million messages/second (100 bytes per message) with a 17ms latency on a 4-node cluster.
Per initial benchmarks we are able to process 18 million messages/second (100 bytes per message) with an 8ms latency on a 4-node cluster.

![Dashboard](img/dashboard.png)

46 changes: 10 additions & 36 deletions docs/performance-report.md
@@ -9,59 +9,33 @@ description: Gearpump Performance Report

To illustrate the performance of Gearpump, we mainly focused on two aspects, throughput and latency, using a micro benchmark called SOL (an example in the Gearpump package) whose topology is quite simple. SOLStreamProducer delivers messages to SOLStreamProcessor constantly and SOLStreamProcessor does nothing. We set up a 4-node cluster with a 10GbE network; each node's hardware is briefly shown as follows:

Processor: 32 core Intel(R) Xeon(R) CPU E5-2680 2.70GHz
Memory: 128GB
Processor: 32 core Intel(R) Xeon(R) CPU E5-2690 2.90GHz
Memory: 64GB

## Throughput

Gearpump uses Graphite for the metrics dashboard. We tried to explore the upper bound of the throughput. After launching 64 SOLStreamProducers and 64 SOLStreamProcessors, the figure below shows that the whole throughput of the cluster can reach about 13 million messages/second (100 bytes per message).

Figure: Performance Evaluation, Throughput and Latency
We tried to explore the upper bound of the throughput. After launching 48 SOLStreamProducers and 48 SOLStreamProcessors, the figure below shows that the whole throughput of the cluster can reach about 18 million messages/second (100 bytes per message).

## Latency

When we transfer messages at the max throughput above, the average latency between two tasks is 17ms, with a standard deviation of 13ms.

Figure: Latency between two tasks (ms)
When we transfer messages at the max throughput above, the average latency between two tasks is 8ms.

## Fault Recovery time

When a failure is detected, for example when an Executor is down, Gearpump will reallocate the resources and restart the application. It takes about 10 seconds to recover the application.

![Dashboard](img/dashboard.png)

## How to setup the benchmark environment?

### Prepare the env

1). Set up a node running Graphite, see the guide in doc/dashboard/README.md.

2). Set up a 4-node Gearpump cluster with a 10GbE network and 3 Workers on each node. In our test environment, each node has 128GB memory and a 32-core Intel(R) Xeon(R) E5-2680 2.70GHz processor. Make sure metrics are enabled in Gearpump.

3). Submit a SOL application with 32 StreamProducers and 32 StreamProcessors:

```bash
bin/gear app -jar ./examples/sol/target/pack/lib/gearpump-examples-$VERSION.jar io.gearpump.streaming.examples.sol.SOL -streamProducer 32 -streamProcessor 32 -runseconds 600
```

4). Browse to http://$HOST:801/ and you should see a Grafana dashboard. The HOST should be the node that runs Graphite.

5). Copy the config file doc/dashboard/graphana_dashboard, and modify the `host` field to the actual hosts that run Gearpump, as well as the `source` and `target` fields. Please note that the format of the values should be exactly the same as the existing format, and you also need to manually add the remaining task IDs to the value of `All` under the `source` and `target` fields, since the number of each task type is now 32.

6). In the Grafana web page, click the "search" button and then import the config file mentioned above.
1). Set up a 4-node Gearpump cluster with a 10GbE network and 4 Workers on each node. In our test environment, each node has 64GB memory and a 32-core Intel(R) Xeon(R) E5-2690 2.90GHz processor. Make sure metrics are enabled in Gearpump.

### Metrics

We use the Codahale metrics library. Gearpump supports using Graphite to visualize the metrics data. Metrics are disabled by default. To use them, you need to configure 'conf/gear.conf':
2). Submit a SOL application with 48 StreamProducers and 48 StreamProcessors:

```
gearpump.metrics.reporter = graphite
gearpump.metrics.enabled = true  ## Default is false, thus metrics are not enabled.
gearpump.metrics.graphite.host = "your actual graphite host name or ip"
gearpump.metrics.graphite.port = 2003  ## Your graphite port
gearpump.metrics.sample.rate = 10  ## this means we will sample 1 message for every 10 messages
```

```bash
bin/gear app -jar ./examples/sol-$VERSION-assembly.jar -streamProducer 48 -streamProcessor 48
```
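The `gearpump.metrics.sample.rate` setting above can be pictured with a small sketch (hypothetical code, not Gearpump's implementation; the `MetricsSampler` class is an illustrative assumption): with a rate of 10, only 1 of every 10 messages has its metrics recorded, keeping measurement overhead low at high throughput.

```java
// Hypothetical sketch of rate-based metrics sampling. Not Gearpump's
// implementation; the class name and logic are illustrative only.
public class MetricsSampler {
    private final int rate;
    private long count = 0;

    public MetricsSampler(int rate) {
        this.rate = rate;
    }

    // Returns true for the 1st of every `rate` messages seen.
    public boolean shouldSample() {
        return (count++ % rate) == 0;
    }

    public static void main(String[] args) {
        MetricsSampler sampler = new MetricsSampler(10);
        int sampled = 0;
        for (int i = 0; i < 100; i++) {
            if (sampler.shouldSample()) sampled++; // record latency only here
        }
        System.out.println(sampled + " of 100 messages sampled");
    }
}
```

A higher sample rate trades metric precision for lower per-message overhead, which matters when the cluster is pushing millions of messages per second.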

For a guide on how to install and configure Graphite, please check the Graphite website http://graphite.wikidot.com/. For a guide on how to use Grafana, please check the guide in [doc/dashboard/README.md](https://github.com/gearpump/gearpump/blob/master/doc/dashboard/README.md)

Here is what the Grafana dashboard looks like:

![Dashboard](img/dashboard.png)
3). Launch Gearpump's dashboard and browse to http://$HOST:8090/, then switch to the Applications tab to see detailed information about your application. The HOST should be the node that runs the dashboard.
