Benchmark performance drops significantly when using map_and_batch #137

After updating to the latest benchmarks code, we noticed a performance drop on the inception3 and resnet152 models. We tested with TensorFlow r1.5 on 32 P100 GPUs (8 servers), ImageNet data, and a batch size of 64. The numbers below are before ==> after the update.

<h3>Inception3:</h3>

- grpc: **3350** ==> **3000**
- grpc + verbs: **3800** ==> **3150**

<h3>Resnet152:</h3>

- grpc: **2050** ==> **2000**
- grpc + verbs: **2450** ==> **2250**
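
For scale, the relative size of each drop can be worked out from the figures above. This is only a small arithmetic sketch: it assumes the left-hand value in each pair is the result before the change and the right-hand value is the result after it, and the names `results` and `relative_drop` are illustrative, not part of the benchmark code.

```python
# Figures copied from the lists above as (before, after) pairs.
# Assumption: the left value is the older result, the right value the newer one.
results = {
    ("inception3", "grpc"): (3350, 3000),
    ("inception3", "grpc + verbs"): (3800, 3150),
    ("resnet152", "grpc"): (2050, 2000),
    ("resnet152", "grpc + verbs"): (2450, 2250),
}


def relative_drop(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return 100.0 * (before - after) / before


# Percentage drop for every (model, transport) combination.
drops = {key: relative_drop(b, a) for key, (b, a) in results.items()}

for (model, transport), pct in drops.items():
    print(f"{model:<11} {transport:<13} {pct:5.1f}% lower")
```

Under that assumption the drop is roughly 10% and 17% for inception3, and roughly 2% and 8% for resnet152, so the verbs configurations lose more in both models.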

We isolated the 'problematic' change to: https://github.com/tensorflow/benchmarks/commit/82dd0539c76afa8491e50d8f796e686b4d97b988#diff-3269d1838b2ebc9c6c071802fb946ca1R521 

After replacing that call to `map_and_batch()` with the previous call to `map()` using 16 parallel calls (https://github.com/Mellanox/benchmarks/commit/56e0b2298f835905f7d8a53c5bf482ed1dce55fd), performance returns to the earlier, higher numbers. We don't have a theory to explain this.

Thanks

