
[slim] Performance drops when training cifarnet with multiple GPUs #490

@D-X-Y

Description

I want to train cifarnet on a single machine with 4 GPUs, but performance drops compared with training on only one GPU.

[slim] Train cifarnet using the default script slim/scripts/train_cifarnet_on_cifar10.sh

When using the default script, the speed is as follows:

INFO:tensorflow:global step 13900: loss = 0.7609 (0.06 sec/step)

Modify slim/scripts/train_cifarnet_on_cifar10.sh by setting num_clones=4.

The speed becomes much slower (I also tried num_preprocessing_threads = 1/2/4/8/16 and num_readers = 4/8; none of it helped):

INFO:tensorflow:global step 14000: loss = 0.7438 (0.26 sec/step)
INFO:tensorflow:global step 14100: loss = 0.6690 (0.26 sec/step)
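
For reference, the change boils down to passing --num_clones=4 through to train_image_classifier.py. A minimal sketch of the modified invocation (the flag names are real flags of train_image_classifier.py; the other values are just the script's defaults as I recall them):

# slim/scripts/train_cifarnet_on_cifar10.sh, with num_clones raised to 4
python train_image_classifier.py \
  --train_dir=${TRAIN_DIR} \
  --dataset_name=cifar10 \
  --dataset_split_name=train \
  --dataset_dir=${DATASET_DIR} \
  --model_name=cifarnet \
  --preprocessing_name=cifarnet \
  --batch_size=128 \
  --num_clones=4 \
  --num_preprocessing_threads=4 \
  --num_readers=4

Note that, if I read model_deploy correctly, each clone dequeues its own full batch, so with batch_size=128 the numbers above work out to 128 / 0.06 ≈ 2133 images/sec on one GPU versus 4 * 128 / 0.26 ≈ 1969 images/sec with four clones: four GPUs moving no more data than one, which is what an input-pipeline bottleneck looks like.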

Hardware

Four Titan X
02:00.0 VGA compatible controller: NVIDIA Corporation Device 17c2 (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 17c2 (rev a1)
06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
82:00.0 VGA compatible controller: NVIDIA Corporation Device 17c2 (rev a1)
83:00.0 VGA compatible controller: NVIDIA Corporation Device 17c2 (rev a1)
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 352.30                 Driver Version: 352.30                     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...    On | 0000:02:00.0     Off |                  N/A |
| 28%   67C    P2    75W / 250W |    228MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
... (rows for the other three GPUs truncated)

32 processors (from /proc/cpuinfo), each as follows:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Finally

Could anyone give some advice?
I have read some issues about multi-GPU training in this repo, but still can't solve this.
I think it is caused by I/O, because when training on a single GPU the GPU utilization is above 90%, while with 4 GPUs it is only about 20%.
And I don't think it's due to the hardware performance of my machine.
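
One way to confirm the I/O theory (my suggestion, not something from the script): in TF 1.x, tf.train.batch records a fraction_of_<capacity>_full scalar summary for its internal queue, so TensorBoard pointed at the train_dir already shows whether that queue is running dry. A self-contained sketch of the mechanism, with hypothetical stand-ins for the decoded CIFAR-10 example:

import tensorflow as tf  # TF 1.x, as used by the slim scripts

# Hypothetical stand-ins for one decoded, preprocessed CIFAR-10 example;
# in train_image_classifier.py these come from the slim dataset provider.
image = tf.random_uniform([32, 32, 3])
label = tf.constant(0, dtype=tf.int64)

# tf.train.batch builds an internal queue and exports a
# "fraction_of_<capacity>_full" scalar summary for it (visible in
# TensorBoard once queue runners are started in the training session).
images, labels = tf.train.batch(
    [image, label],
    batch_size=128,
    num_threads=8,      # cf. --num_preprocessing_threads
    capacity=5 * 128)

# If that summary sits near 0 while 4 clones train, the readers and
# preprocessing threads cannot feed the GPUs fast enough.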
