Examples are running very slow on AWS #53

pavelanni · 2017-06-18T18:49:59Z

Since I don't have a GPU available, I decided to try the examples on AWS. They run very slowly on an AWS p2.xlarge instance: much slower than in the video, and even slower than on my desktop, which has no GPU (about 2-3x slower, by my visual estimate).

My config: p2.xlarge instance, Ubuntu 16.04.2, TensorFlow 1.2, Python 3, CUDA 8.0, cuDNN 6 (installed as a .deb from NVIDIA)

What I have tested so far:

The CPU vs. GPU TensorFlow test from http://learningtensorflow.com/lesson10/ shows an 11x performance improvement when running matrix multiplication on the GPU.
When I run the mnist_1.0 example and run nvidia-smi in another window, it shows that the GPU is busy and that it is running the same Python process ID as the example.
When the example starts, it reports that it has found the GPU and is using it:

2017-06-18 18:35:07.317169: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
2017-06-18 18:35:07.317192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 
2017-06-18 18:35:07.317200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y 
2017-06-18 18:35:07.317209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)

All signs confirm that TensorFlow is using the GPU, so why is it so slow? It seems I'm missing something, but I can't find it.
Thanks,
Pavel

martin-gorner · 2017-06-22T21:25:44Z

The videos in the presentation are sped up so that each is 20 seconds long, regardless of how long the original training run took.
If you do not have a GPU, I recommend Google ML Engine. It's a training cluster as a service. There is a sample in the mlengine folder.
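
Because the videos are time-compressed, comparing a live run against them by eye is not meaningful. A steps-per-second measurement is a more reliable way to compare machines. Below is a minimal sketch, assuming a hypothetical zero-argument `train_step` callable that runs one training iteration. The function names and the stand-in workload are illustrative and are not taken from this repository.

```python
import time


def steps_per_second(train_step, n_steps=100, warmup=10):
    """Time `train_step` and return the measured throughput in steps/second.

    `train_step` is a hypothetical zero-argument callable that runs one
    training iteration (for example, a wrapper around a single sess.run call).
    """
    # Warm-up iterations are run but not timed: the first steps often include
    # one-time setup cost that would skew the measurement.
    for _ in range(warmup):
        train_step()

    start = time.perf_counter()
    for _ in range(n_steps):
        train_step()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed


def playback_speedup(real_seconds, video_seconds=20.0):
    """Return how many times faster than real time a fixed-length video plays."""
    return real_seconds / video_seconds


if __name__ == "__main__":
    # Stand-in workload so the sketch runs on its own; replace with a real step.
    def fake_step():
        sum(i * i for i in range(10000))

    print("steps/sec: %.1f" % steps_per_second(fake_step))
    # Example: a run that took 10 minutes (600 s), shown as a 20-second video.
    print("video speed-up: %.0fx" % playback_speedup(600))
```

Running the same measurement on the desktop and on the AWS instance gives two numbers that can be compared directly, independent of how fast the visualization redraws.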

martin-gorner closed this as completed Jun 22, 2017
