<a href="https://colab.research.google.com/github/retuyu88/digitalentkominfo/blob/master/Copy_of_%5B26_3%5D_TensorFlow_dengan_TPU_ADF.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src = "https://i.imgur.com/UjutVJd.jpg" align = "center">

# TensorFlow with TPU

This time we will check how TensorFlow performs when it runs with TPU acceleration.

To do that, make sure this Google Colab notebook uses the **TPU** runtime accelerator:
* Open the Runtime menu
* Choose Change Runtime Type
* Select **TPU**

<img src = "https://i.ibb.co/6t5CCyL/tpu.png" align = "center">



## Checking the Libraries

Once again, let's check the TensorFlow and Keras installation.

In [1]:
import tensorflow as tf
print(tf.__version__)
print(tf.keras.__version__)

1.14.0
2.2.4-tf


In [2]:
import keras
keras.__version__

Using TensorFlow backend.


'2.2.4'

## Checking the GPU
Next, let's check whether TensorFlow can use a GPU.

Because this Google Colab runtime is set to TPU acceleration, no GPU device is available, so the device name printed below is empty.

In [3]:
print('GPU name:', tf.test.gpu_device_name())

GPU name: 


## Checking the TPU

Now let's check whether TensorFlow can use the TPU.

Because this Google Colab runtime is set to TPU acceleration, the names and number of the available devices will be displayed.

In [4]:
import os
import pprint
import tensorflow as tf

if 'COLAB_TPU_ADDR' not in os.environ:
  print('ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!')
else:
  tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
  print ('TPU address is', tpu_address)

  with tf.Session(tpu_address) as session:
    devices = session.list_devices()
    
  print('TPU devices:')
  pprint.pprint(devices)

TPU address is grpc://10.48.43.2:8470
TPU devices:
[_DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:CPU:0, CPU, -1, 4361477511126539627),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 16664257338021085502),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 14706666575299204974),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 10673233711504101602),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 4320409414066120778),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 12031429207439053541),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 13731042377140655935),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 10325777939450944264),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 636878606504

## Training with TensorFlow

Finally, let's see how fast TensorFlow is when it runs on a TPU.

In [0]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf

Here we use a simple 3-layer artificial neural network to train on the CIFAR-10 dataset.

The network is trained for 10 epochs with a batch size of 1000.

We will see how long the training process takes.
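As a quick sanity check on these settings, the sketch below works out how many batches one epoch contains and how many trainable parameters the network has. It assumes the standard CIFAR-10 training split of 50,000 images of 32x32x3 pixels and the layer sizes used in the model cell of this notebook (500 hidden units, 10 outputs); the variable names are illustrative only.

```python
# Assumed: the CIFAR-10 training split holds 50,000 images of shape 32x32x3.
num_train = 50000
batch_size = 1000

input_dim = 32 * 32 * 3      # size of one flattened image
hidden_units = 500           # Dense(500, ...)
num_classes = 10             # Dense(10, ...)

# Number of batches Keras processes per epoch.
steps_per_epoch = num_train // batch_size

# A Dense layer has (inputs * units) weights plus one bias per unit.
hidden_params = input_dim * hidden_units + hidden_units
output_params = hidden_units * num_classes + num_classes
total_params = hidden_params + output_params

print('steps per epoch:', steps_per_epoch)
print('trainable parameters:', total_params)
```

The 50 steps per epoch agree with the `50/50` counter shown in the training log.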

In [0]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

In [7]:
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [0]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

mean_image = np.mean(X_train, axis = 0)
X_train -= mean_image
X_test -= mean_image

y_train_hot = to_categorical(y_train)
y_test_hot = to_categorical(y_test)
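To make the preprocessing in the cell above concrete, here is a minimal NumPy sketch on toy data. It mirrors the two steps (subtracting the per-pixel mean, then one-hot encoding the labels). The toy arrays and the use of `np.eye` are assumptions made for illustration; this is a stand-in for what `to_categorical` produces, not its actual implementation.

```python
import numpy as np

# Toy stand-in for X_train: 2 "images" with 2 "pixels" each.
X_toy = np.array([[0.0, 2.0], [4.0, 6.0]], dtype='float32')
mean_toy = np.mean(X_toy, axis=0)    # per-pixel mean over the samples
X_centered = X_toy - mean_toy        # each pixel column now averages to 0

# Toy stand-in for y_train: integer labels with shape (N, 1), 3 classes.
y_toy = np.array([[0], [2], [1]])
num_classes = 3
# Row i of the identity matrix is the one-hot vector for class i.
y_hot = np.eye(num_classes, dtype='float32')[y_toy.ravel()]

print(X_centered)
print(y_hot)
```

Note that the test images are centered with the mean of the training images, as in the original cell, so both sets go through the same transformation.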

In [9]:
import os
import time


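# Locate the TPU that Colab attached to this runtime.
resolver = tf.contrib.cluster_resolver.TPUClusterResolver('grpc://' + os.environ['COLAB_TPU_ADDR'])
# Initialize the TPU system before building the model.
tf.contrib.distribute.initialize_tpu_system(resolver)
# Strategy that replicates the model across the TPU cores.
strategy = tf.contrib.distribute.TPUStrategy(resolver)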

with strategy.scope():
  model = Sequential()
  model.add(Flatten(input_shape=(32,32,3)))
  model.add(Dense(500, activation='sigmoid'))
  model.add(Dense(10, activation='softmax'))
  model.compile(loss='categorical_crossentropy',  optimizer='sgd', metrics=['accuracy'])



W0805 01:45:31.568508 140416748312448 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

W0805 01:45:37.909815 140416748312448 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
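The first warning above notes that `tf.contrib` is not part of TensorFlow 2.0. As a hedged sketch only, the helper below shows how the same setup might look with TensorFlow 2.x APIs. The function name `build_tpu_strategy` is hypothetical, and it assumes the runtime still exposes the TPU address through the `COLAB_TPU_ADDR` environment variable, which may not hold on newer Colab runtimes.

```python
import os


def build_tpu_strategy():
    """Return a TPUStrategy built with TensorFlow 2.x APIs (illustrative sketch)."""
    # Imported lazily so that defining this helper does not require TensorFlow.
    import tensorflow as tf

    # Assumption: the TPU address is available via COLAB_TPU_ADDR,
    # as in the cells of this notebook.
    tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    # tf.distribute.TPUStrategy is available from TF 2.3; earlier 2.x releases
    # expose it as tf.distribute.experimental.TPUStrategy.
    return tf.distribute.TPUStrategy(resolver)
```

With such a helper, the model would still be built and compiled inside `with strategy.scope():` exactly as in the cell above.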


In [11]:
num_epochs = 10
batch_size = 1000

tic = time.time()
model.fit(X_train, y_train_hot, epochs=num_epochs, batch_size=batch_size, verbose=2)
toc = time.time()
print('\n\ntraining speed =', toc-tic,'seconds')

Epoch 1/10
50/50 - 3s - loss: 1.6328 - acc: 0.4454
Epoch 2/10
50/50 - 1s - loss: 1.6174 - acc: 0.4545
Epoch 3/10
50/50 - 1s - loss: 1.6006 - acc: 0.4609
Epoch 4/10
50/50 - 1s - loss: 1.5863 - acc: 0.4670
Epoch 5/10
50/50 - 1s - loss: 1.5721 - acc: 0.4742
Epoch 6/10
50/50 - 1s - loss: 1.5578 - acc: 0.4785
Epoch 7/10
50/50 - 1s - loss: 1.5454 - acc: 0.4854
Epoch 8/10
50/50 - 1s - loss: 1.5328 - acc: 0.4912
Epoch 9/10
50/50 - 1s - loss: 1.5209 - acc: 0.4973
Epoch 10/10
50/50 - 1s - loss: 1.5106 - acc: 0.5021


training speed = 33.039337396621704 seconds


## Conclusion

TensorFlow training on a **TPU** also takes only about **1 second per epoch** on the CIFAR-10 dataset with a **batch size of 1000**.

However, the total training time is longer (about 33 seconds, although the ten epochs account for only about 12 seconds).

This is because preparing the data on the TPU device takes somewhat longer than on a GPU.

Only when the data is scaled up to **millions of samples** does the difference in capability between TPU and GPU become visible.
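The gap between per-epoch time and total time can be estimated from the training log above. The sketch below uses the rounded per-epoch durations printed by Keras (3 s for the first epoch, 1 s for each of the other nine), so the result is only approximate.

```python
# Rounded per-epoch durations as printed in the training log (seconds).
epoch_seconds = [3] + [1] * 9
# Total wall-clock time printed after model.fit (seconds).
total_seconds = 33.039337396621704

epoch_total = sum(epoch_seconds)
overhead = total_seconds - epoch_total

print('time inside epochs: about', epoch_total, 'seconds')
print('time outside epochs: about', round(overhead, 1), 'seconds')
```

Under these rounded figures, well over half of the wall-clock time falls outside the reported epoch durations, which suggests that setup cost dominates on a workload this small.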

<p>Copyright &copy; 2019 <a href=https://www.linkedin.com/in/andityaarifianto/>ADF</a> </p>