# Convolutional Neural Networks

# Contents
* Overview
* Highlights of the Tutorial
* CIFAR-10 Model
    - Model inputs
    - Model prediction
    - Model training 
* Launching and Training the Model
* Evaluating a Model
* Training a Model Using Multiple GPU Cards
    - Placing Variables and Operations on Devices
    - Launching and Training the Model on Multiple GPU cards
* Next Steps

#### Reference
* [2] https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/docker

# Environment Setup

* Pull the TensorFlow image (CPU + GPU)
    - docker pull b.gcr.io/tensorflow/tensorflow-devel-gpu
* On Ubuntu Linux, set up the host system GPU by following [2].
* Then start the Docker container with the following script
    - (related script) docker_run_gpu.sh
* Afterwards, run TensorBoard inside the container
    - (related script) run_tensorboard.sh

## Test Environment

<code>lspci | grep -i nvidia</code>

<font color="blue">06:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)</font>

<code>nvidia-smi</code> 

<img src="figures/gpuusage.png" width=600 />

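# Exercise Code
* The example code is located at the following path in the exercise Docker OS
    - /tensorflow/tensorflow/models/image/cifar10
* It is also copied to the current path of the study GitHub repository (the same location as this notebook).
* cifar10_eval_gpu.py is the original code with a few settings changed for use with cifar10_multi_gpu_train.py
* The log paths of the original code (for TensorBoard) were changed to the log directory under the current path.
* Run the Python code from a server terminal whenever possible, and save the output logs with the .plog extension (running it inside the notebook causes serious trouble)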
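
The last point, running a script outside the notebook and keeping its output in a `.plog` file, could be sketched as follows. This is only an illustration: `run_with_plog` is a hypothetical helper that is not part of the study repository, it is written for Python 3, and the demo runs a throwaway script instead of the real training code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_plog(script, plog_path, python=sys.executable):
    """Run `script` and write its stdout and stderr to `plog_path`.

    Hypothetical helper: returns the exit code of the child process.
    """
    with open(plog_path, "w") as log_file:
        completed = subprocess.run(
            [python, script],
            stdout=log_file,
            stderr=subprocess.STDOUT,  # keep errors in the same log
        )
    return completed.returncode


# Demo with a tiny stand-in script, so nothing heavy is launched here.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello.py"
    script.write_text("print('step 0, loss = 4.67')\n")
    plog = Path(tmp) / "hello.plog"
    exit_code = run_with_plog(str(script), str(plog))
    logged = plog.read_text()

print(exit_code, logged.strip())
```

For a real run one would pass something like `cifar10_train.py` and `train.plog`. The call blocks until the script finishes, so a long training job is still better started from a server terminal.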

In [7]:
!ls cifar10* log

cifar10.py	     cifar10_input_test.py	 train.plog
cifar10_eval.py      cifar10_multi_gpu_train.py  train_gpu.plog
cifar10_eval_gpu.py  cifar10_train.py
cifar10_input.py     eval_gpu.plog

log:
cifar10_eval  cifar10_eval_gpu	cifar10_train  cifar10_train_gpu


In [8]:
!ls *.plog

eval_gpu.plog  train.plog  train_gpu.plog


# Overview

* The problem is to classify 32x32-pixel RGB images into 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

<img src="https://www.tensorflow.org/versions/0.6.0/images/cifar_samples.png" />
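
To make the input format concrete, the sketch below decodes one image record. It assumes the binary version of the CIFAR-10 dataset, where each record is 1 label byte followed by 3072 pixel bytes (1024 red, then 1024 green, then 1024 blue, each plane in row-major order). That layout is not described in these notes, `decode_record` is a hypothetical name, and the record used here is synthetic.

```python
IMAGE_SIZE = 32
CHANNELS = 3
LABEL_BYTES = 1
IMAGE_BYTES = IMAGE_SIZE * IMAGE_SIZE * CHANNELS  # 3072 pixel bytes
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES          # 3073 bytes per record

# Label index -> category name, in the order listed above.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]


def decode_record(record):
    """Split one binary record into (class_name, pixels).

    pixels[row][col] is an (r, g, b) tuple of ints in 0..255.
    """
    if len(record) != RECORD_BYTES:
        raise ValueError("expected %d bytes, got %d" % (RECORD_BYTES, len(record)))
    label = record[0]
    plane = IMAGE_SIZE * IMAGE_SIZE
    red = record[1:1 + plane]
    green = record[1 + plane:1 + 2 * plane]
    blue = record[1 + 2 * plane:]
    pixels = [
        [(red[r * IMAGE_SIZE + c], green[r * IMAGE_SIZE + c], blue[r * IMAGE_SIZE + c])
         for c in range(IMAGE_SIZE)]
        for r in range(IMAGE_SIZE)
    ]
    return CLASS_NAMES[label], pixels


# Synthetic record: label 3 with a uniform orange image.
fake = bytes([3]) + bytes([255]) * 1024 + bytes([128]) * 1024 + bytes([0]) * 1024
name, pixels = decode_record(fake)
print(name, pixels[0][0])
```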

# Goals

The goal of this tutorial is to build a relatively small convolutional neural network (CNN) for recognizing images. In the process, this tutorial:

1. Highlights a canonical organization for network architecture, training and evaluation.
2. Provides a template for constructing larger and more sophisticated models.
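
The organization mentioned in item 1 (inputs, prediction, loss, training, evaluation) can be illustrated on a toy problem. The sketch below is not the tutorial's code: it swaps the CNN for a linear softmax classifier on synthetic data, uses plain Python instead of TensorFlow, and all function names and sizes are hypothetical.

```python
import math
import random

NUM_CLASSES = 10    # CIFAR-10 has 10 categories
NUM_FEATURES = 10   # toy feature size; a real image has 32*32*3 values


def inputs(batch_size, rng):
    """Stage 1: produce a batch of (features, label) pairs (synthetic)."""
    batch = []
    for _ in range(batch_size):
        label = rng.randrange(NUM_CLASSES)
        features = [rng.gauss(0.0, 0.1) for _ in range(NUM_FEATURES)]
        features[label] += 1.0  # make the label learnable
        batch.append((features, label))
    return batch


def inference(weights, features):
    """Stage 2: map features to one logit per class (a linear layer here)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss(logits, label):
    """Stage 3: cross-entropy between softmax(logits) and the true label."""
    return -math.log(softmax(logits)[label])


def train_step(weights, batch, learning_rate):
    """Stage 4: one gradient-descent update; returns the pre-update mean loss."""
    total = 0.0
    grads = [[0.0] * NUM_FEATURES for _ in range(NUM_CLASSES)]
    for features, label in batch:
        logits = inference(weights, features)
        total += loss(logits, label)
        probs = softmax(logits)
        for k in range(NUM_CLASSES):
            delta = probs[k] - (1.0 if k == label else 0.0)
            for j, x in enumerate(features):
                grads[k][j] += delta * x
    for k in range(NUM_CLASSES):
        for j in range(NUM_FEATURES):
            weights[k][j] -= learning_rate * grads[k][j] / len(batch)
    return total / len(batch)


def evaluate(weights, batch):
    """Evaluation: fraction of examples whose top-scoring class is correct."""
    hits = 0
    for features, label in batch:
        logits = inference(weights, features)
        if logits.index(max(logits)) == label:
            hits += 1
    return hits / len(batch)


rng = random.Random(0)
weights = [[0.0] * NUM_FEATURES for _ in range(NUM_CLASSES)]
first_loss = train_step(weights, inputs(64, rng), learning_rate=0.5)
for _ in range(200):
    last_loss = train_step(weights, inputs(64, rng), learning_rate=0.5)
accuracy = evaluate(weights, inputs(256, rng))
print(first_loss, last_loss, accuracy)
```

Keeping the stages as separate functions is what lets training and evaluation reuse the same `inference` step, which is the point of the layout.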

In [1]:
!python cifar10_input_test.py

I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 32
I tensorflow/core/common_runtime/gpu/gpu_init.cc:103] Found device 0 with properties: 
name: Tesla K40m
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:06:00.0
Total memory: 11.25GiB
Free memory: 11.12GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:127] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40m, pci bus id: 0000:06:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Allocating 3.34GiB bytes.
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:52] GPU 0 memory begins at 0x42047a0000 extends to 0x42da041ccc
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:66] Creating bin of max chunk size 1.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:66] Creating bin of max chunk size 2.0KiB

In [3]:
!python cifar10_input.py

In [14]:
!head train.plog -n 50

Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2016-02-03 09:49:56.909836: step 0, loss = 4.67 (2.0 examples/sec; 65.123 sec/batch)
2016-02-03 09:50:02.208112: step 10, loss = 4.66 (272.5 examples/sec; 0.470 sec/batch)
2016-02-03 09:50:06.935291: step 20, loss = 4.63 (309.4 examples/sec; 0.414 sec/batch)
2016-02-03 09:50:11.677843: step 30, loss = 4.61 (269.0 examples/sec; 0.476 sec/batch)
2016-02-03 09:50:16.492321: step 40, loss = 4.59 (261.3 examples/sec; 0.490 sec/batch)
2016-02-03 09:50:21.212418: step 50, loss = 4.57 (286.6 examples/sec; 0.447 sec/batch)
2016-02-03 09:50:25.903630: step 60, loss = 4.56 (289.0 examples/sec; 0.443 sec/batch)
2016-02-03 09:50:30.640395: step 70, loss = 4.54 (277.4 examples/sec; 0.461 sec/batch)
2016-02-03 09:50:35.276647: step 80, loss = 4.52 (287.4 examples/sec; 0.445 sec/batch)
2016-02-03 09:50:39.996685: step 90, loss = 4.50 (262.0 examples/sec; 0.489 sec/batch)
2016-02-03 09:50:44.778525:
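
The training log above has a regular line format, so it can be turned into numbers for plotting or sanity checks. The sketch below is one possible way to do it; `parse_log` and the regular expression are hypothetical and only cover the line shape shown in this output.

```python
import re

# Matches lines such as:
# 2016-02-03 09:50:02.208112: step 10, loss = 4.66 (272.5 examples/sec; 0.470 sec/batch)
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step (?P<step>\d+), loss = (?P<loss>\d+\.\d+) "
    r"\((?P<examples_per_sec>\d+\.\d+) examples/sec; "
    r"(?P<sec_per_batch>\d+\.\d+) sec/batch\)$"
)


def parse_log(lines):
    """Return one dict per well-formed step line; other lines are skipped."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # banner text or a truncated line
        records.append({
            "step": int(match.group("step")),
            "loss": float(match.group("loss")),
            "examples_per_sec": float(match.group("examples_per_sec")),
            "sec_per_batch": float(match.group("sec_per_batch")),
        })
    return records


sample = [
    "Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.",
    "2016-02-03 09:49:56.909836: step 0, loss = 4.67 (2.0 examples/sec; 65.123 sec/batch)",
    "2016-02-03 09:50:02.208112: step 10, loss = 4.66 (272.5 examples/sec; 0.470 sec/batch)",
    "2016-02-03 09:50:44.778525:",
]
records = parse_log(sample)

# examples/sec multiplied by sec/batch approximates the batch size.
implied_batch = round(records[1]["examples_per_sec"] * records[1]["sec_per_batch"])
print(len(records), implied_batch)
```

On a real run the same function could be fed `open("train.plog")`. The implied batch size is only an estimate, because the logged rates are rounded.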

In [None]:
!python cifar10_eval.py

In [None]:
!python cifar10.py

In [None]:
!python cifar10_multi_gpu_train.py

In [4]:
from IPython.core.display import HTML
HTML('<iframe src=cnn_tensorboard/TensorBoard_0.html width=1000 height=900></iframe>')

# References
* [1]
* [2] https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/docker