# TensorFlow with GPU

This notebook provides an introduction to computing on a [GPU](https://cloud.google.com/gpu) in Colab. In this notebook you will connect to a GPU, and then run some basic TensorFlow operations on both the CPU and a GPU, observing the speedup provided by using the GPU.


## Enabling and testing the GPU

First, you'll need to enable GPUs for the notebook:

- Navigate to Edit→Notebook Settings.
- Select GPU from the Hardware Accelerator drop-down.

Next, we'll confirm that we can connect to the GPU with TensorFlow:

In [2]:
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


## Observe TensorFlow speedup on GPU relative to CPU

This example constructs a typical convolutional neural network layer over a
random image and manually places the resulting ops on either the CPU or the GPU
to compare execution speed.
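Before running the benchmark, the size of the workload can be estimated by hand. The sketch below computes the output shape and the number of multiply-accumulate operations for one forward pass. It assumes the Keras `Conv2D` defaults of stride 1 and `'valid'` padding with channels-last input; the helper names `conv2d_output_shape` and `conv2d_macs` are hypothetical and not part of TensorFlow.

```python
def conv2d_output_shape(batch, height, width, filters, kernel, stride=1):
    """Output shape of a 2-D convolution with 'valid' padding (channels last)."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return (batch, out_h, out_w, filters)


def conv2d_macs(output_shape, kernel, in_channels):
    """Multiply-accumulates: one kernel*kernel*in_channels dot product per output value."""
    batch, out_h, out_w, filters = output_shape
    return batch * out_h * out_w * filters * kernel * kernel * in_channels


# 100 random 100x100 RGB images, 32 filters of size 7x7 (the values used in this notebook).
shape = conv2d_output_shape(batch=100, height=100, width=100, filters=32, kernel=7)
macs = conv2d_macs(shape, kernel=7, in_channels=3)
print('Output shape:', shape)
print('Multiply-accumulates per call:', macs)
```

A count in the billions per call gives a rough sense of why a massively parallel device helps here.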

In [3]:
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

def cpu():
  with tf.device('/cpu:0'):
    random_image_cpu = tf.random.normal((100, 100, 100, 3))
    net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
    return tf.math.reduce_sum(net_cpu)

def gpu():
  with tf.device('/device:GPU:0'):
    random_image_gpu = tf.random.normal((100, 100, 100, 3))
    net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
    return tf.math.reduce_sum(net_gpu)

# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

# Run the op several times.
print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of ten runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=10, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=10, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))

Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images (batch x height x width x channel). Sum of ten runs.
CPU (s):
3.3545092379999915
GPU (s):
0.08736958600002254
GPU speedup over CPU: 38x
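The warm-up-then-measure pattern used above can be wrapped in a small helper. This is a minimal sketch, not the notebook's own method: `time_op` and `workload` are hypothetical names, and `workload` is a pure-Python stand-in for the `cpu` or `gpu` functions. Passing the callable directly to `timeit.Timer` avoids the `from __main__ import ...` setup string, and taking the minimum over several repeats reduces noise from other processes.

```python
import timeit


def time_op(fn, number=10, repeat=3):
    """Warm up with one call, then return the best total time (s) for `number` calls."""
    fn()  # warm-up call, excluded from the measurement
    timer = timeit.Timer(fn)
    return min(timer.repeat(repeat=repeat, number=number))


def workload():
    # Stand-in for an expensive op; in the notebook this would be cpu or gpu.
    return sum(i * i for i in range(10_000))


elapsed = time_op(workload)
print('Best of 3 repeats, 10 calls each: {:.6f} s'.format(elapsed))
```

In the notebook itself, `time_op(cpu)` and `time_op(gpu)` would be the analogous calls, assuming both functions are defined as above.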


In [4]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [14]:
import string

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical


In [15]:
dataset = pd.read_csv("/content/drive/MyDrive/Language_Detection.csv")

In [46]:
dataset.head()


| | Text | Language |
|---|---|---|
| 0 | Nature, in the broadest sense, is the natural... | English |
| 1 | "Nature" can refer to the phenomena of the phy... | English |
| 2 | The study of nature is a large, if not the onl... | English |
| 3 | Although humans are part of nature, human acti... | English |
| 4 | [1] The word nature is borrowed from the Old F... | English |


In [47]:
dataset.describe()

| | Text | Language |
|---|---|---|
| count | 10337 | 10337 |
| unique | 10267 | 17 |
| top | Jag är ledsen. | English |
| freq | 3 | 1385 |


In [17]:
# Strip ASCII punctuation from the text and convert it to lowercase.
def clean(text):
  for p in string.punctuation:
    text = text.replace(p, '')
  return text.lower()

text = dataset['Text'].apply(clean)
text.head(12)


0      nature in the broadest sense is the natural p...
1     nature can refer to the phenomena of the physi...
2     the study of nature is a large if not the only...
3     although humans are part of nature human activ...
4     1 the word nature is borrowed from the old fre...
5     2 in ancient philosophy natura is mostly used ...
6     34 \nthe concept of nature as a whole the phys...
7     during the advent of modern scientific method ...
8     56 with the industrial revolution nature incre...
9     however a vitalist vision of nature closer to ...
10    1 within the various uses of the word today na...
11    nature can refer to the general realm of livin...
Name: Text, dtype: object
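The same cleaning step can be done in a single pass with `str.translate`, which avoids one `replace` call per punctuation character. This is a sketch of an alternative, not the notebook's own code; `clean_fast` is a hypothetical name. Note that `string.punctuation` covers ASCII punctuation only, so marks such as `¿` or `。` are left in place by both versions.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)


def clean_fast(text):
    """Remove ASCII punctuation and lowercase the text in one pass."""
    return text.translate(_PUNCT_TABLE).lower()


print(clean_fast('"Nature" can refer to the phenomena of the physical world.'))
```

With pandas, this would be applied the same way as `clean`, for example `dataset['Text'].apply(clean_fast)`.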

In [18]:
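# Split the dataset into input features (text) and target labels (language).
# Note: this uses the raw Text column, not the cleaned `text` series above.

x = dataset.iloc[:, 0].values  # Text
y = dataset.iloc[:, 1].values  # Language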

In [19]:
# Encode the language labels as integers if they are not already numeric.
if not np.issubdtype(y.dtype, np.number):
  le = LabelEncoder()
  y = le.fit_transform(y)
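To make the encoding step concrete, here is a minimal pure-Python sketch of the same idea: each distinct label is assigned an integer according to its position in the sorted list of classes. The function name `fit_label_mapping` and the example labels are illustrative assumptions, not taken from the dataset.

```python
def fit_label_mapping(labels):
    """Map each distinct label to its index in the sorted list of classes."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


# Made-up example labels for illustration.
labels = ['English', 'French', 'English', 'Danish']
mapping = fit_label_mapping(labels)
encoded = [mapping[label] for label in labels]

# The inverse mapping recovers the original label from a predicted class index.
inverse = {index: label for label, index in mapping.items()}

print(mapping)
print(encoded)
print(inverse[encoded[0]])
```

With scikit-learn, the fitted encoder's `classes_` attribute and `inverse_transform` method play the roles of `mapping` and `inverse`.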

In [20]:
# Split the dataset into training (70%) and testing (30%) sets.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42)

In [24]:
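# Fit a word-level tokenizer on the cleaned text.
# (This tokenizer is replaced by the character-level one in a later cell.)
tokenizer = Tokenizer()
tokenizer.fit_on_texts(text)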

In [26]:
# Word-level tokenization (disabled)
# processed_text = tokenizer.texts_to_sequences(text)

In [27]:
#processed_text

In [32]:
# Fit a character-level tokenizer on the training text only.
tokenizer = Tokenizer(char_level=True)
tokenizer.fit_on_texts(x_train)
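As an illustration of what character-level tokenization produces, the sketch below builds a character-to-index table by descending frequency, starting at 1 so that 0 stays free for padding, and drops characters not seen during fitting. It is a simplified stand-in written for this explanation, not the Keras `Tokenizer` implementation; `build_char_index` and `encode_chars` are hypothetical names.

```python
from collections import Counter


def build_char_index(texts):
    """Assign indices 1..N to characters, most frequent first (0 is left for padding)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower())
    return {ch: i for i, (ch, _) in enumerate(counts.most_common(), start=1)}


def encode_chars(text, char_index):
    """Convert text to a list of character indices, skipping unseen characters."""
    return [char_index[ch] for ch in text.lower() if ch in char_index]


char_index = build_char_index(['aab', 'a'])
print(char_index)
print(encode_chars('aba', char_index))
print(encode_chars('abz', char_index))  # 'z' was never seen, so it is dropped
```

Fitting on the training split only, as the cell above does, means characters that appear solely in the test set are dropped in the same way.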

In [35]:
#chartokens = tokenizer.texts_to_sequences(x_train)

In [36]:
#chartokens

In [40]:
# Convert the texts to integer sequences and pad them to a fixed length.
max_length = 200
x_train_sequences = tokenizer.texts_to_sequences(x_train)
x_train_padded = pad_sequences(x_train_sequences, maxlen=max_length, padding='post')
x_test_sequences = tokenizer.texts_to_sequences(x_test)
x_test_padded = pad_sequences(x_test_sequences, maxlen=max_length, padding='post')
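The effect of `padding='post'` can be sketched in plain Python. The helper below, `pad_post` (a hypothetical name), appends zeros to short sequences and, mirroring the `pad_sequences` default of `truncating='pre'`, keeps only the last `maxlen` items of long ones. It is an illustration of the behaviour, not the Keras implementation, and it returns nested lists rather than a NumPy array.

```python
def pad_post(sequences, maxlen, value=0):
    """Pad each sequence at the end to `maxlen`; truncate long ones from the front."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep the last maxlen items (truncating='pre')
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded


print(pad_post([[1, 2, 3], [1, 2, 3, 4, 5]], maxlen=4))
```

Because truncation removes the beginning of a long text, any text longer than `max_length` characters is represented by its final characters only.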

In [43]:
# Convert labels to categorical
num_classes = len(np.unique(y_train))
y_train = to_categorical(y_train, num_classes = num_classes)
y_test = to_categorical(y_test, num_classes = num_classes)
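For readers unfamiliar with one-hot encoding, this is a minimal sketch of the transformation applied to the integer labels. `one_hot` is a hypothetical helper that returns nested lists, whereas `to_categorical` returns a NumPy array.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows of length `num_classes`."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


print(one_hot([0, 2, 1], num_classes=3))
```

One-hot targets pair with the `categorical_crossentropy` loss; keeping the integer labels and using `sparse_categorical_crossentropy` is an alternative that skips this step.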

In [48]:
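embedding_dim = 200
# Size the embedding from the fitted tokenizer (+1 because index 0 is reserved
# for padding). A fixed value such as 500 may be too small for a multilingual
# character vocabulary, which would produce out-of-range embedding indices.
vocab_size = len(tokenizer.word_index) + 1

# Embedding -> two stacked LSTMs -> dense hidden layer -> softmax over languages.
inputs = Input(shape=(max_length,))
x = Embedding(vocab_size, embedding_dim)(inputs)
x = LSTM(256, return_sequences=True)(x)
x = LSTM(256)(x)
x = Dense(256, activation='relu')(x)
outputs = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)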
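The size of this architecture can be worked out by hand with the standard parameter-count formulas for embedding, LSTM, and dense layers. The sketch below assumes a vocabulary of 500 entries (a value chosen for illustration) and the 17 classes reported by `dataset.describe()`; the helper names are hypothetical. In the notebook, `model.summary()` would report the actual figures.

```python
def embedding_params(vocab_size, dim):
    """One `dim`-sized vector per vocabulary entry."""
    return vocab_size * dim


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    """Weight matrix plus one bias per unit."""
    return input_dim * units + units


counts = {
    'embedding': embedding_params(500, 200),  # assumed vocabulary size of 500
    'lstm_1': lstm_params(200, 256),
    'lstm_2': lstm_params(256, 256),
    'dense': dense_params(256, 256),
    'output': dense_params(256, 17),          # 17 languages
}
total = sum(counts.values())

for name, count in counts.items():
    print('{:<10}{:>10,}'.format(name, count))
print('{:<10}{:>10,}'.format('total', total))
```

Under these assumptions the two LSTM layers account for most of the parameters, which is also where most of the training time goes.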

In [52]:
learning_rate = 0.001
optimizer = Adam(learning_rate=learning_rate)

# Compile the model.
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

In [53]:
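# Train the model.
# TensorFlow places ops on the GPU automatically when one is available, so an
# explicit `with tf.device('/GPU:0'):` scope is optional here.
# The epoch count can be raised (for example, above 1000) for longer training.
model.fit(x_train_padded, y_train,
          validation_data=(x_test_padded, y_test),
          epochs=113, batch_size=64)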


Epoch 1/113
Epoch 2/113
Epoch 3/113
Epoch 4/113
Epoch 5/113
Epoch 6/113
Epoch 7/113
Epoch 8/113
Epoch 9/113
Epoch 10/113
Epoch 11/113
Epoch 12/113
Epoch 13/113
Epoch 14/113
Epoch 15/113
Epoch 16/113
Epoch 17/113
Epoch 18/113
Epoch 19/113
Epoch 20/113
Epoch 21/113
Epoch 22/113
Epoch 23/113
Epoch 24/113
Epoch 25/113
Epoch 26/113
Epoch 27/113
Epoch 28/113
Epoch 29/113
Epoch 30/113
Epoch 31/113
Epoch 32/113
Epoch 33/113
Epoch 34/113
Epoch 35/113
Epoch 36/113
Epoch 37/113
Epoch 38/113
Epoch 39/113
Epoch 40/113
Epoch 41/113
Epoch 42/113
Epoch 43/113
Epoch 44/113
Epoch 45/113
Epoch 46/113
Epoch 47/113
Epoch 48/113
Epoch 49/113
Epoch 50/113
Epoch 51/113
Epoch 52/113
Epoch 53/113
Epoch 54/113
Epoch 55/113
Epoch 56/113
Epoch 57/113
Epoch 58/113
Epoch 59/113

KeyboardInterrupt: ignored
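The run above was stopped by hand. An alternative to interrupting training manually is a patience rule: stop once the validation loss has failed to improve for a fixed number of consecutive epochs. The sketch below shows only that decision logic in plain Python; `stopping_epoch` is a hypothetical helper and the loss values are made up for illustration, not results from this notebook. Keras provides this behaviour through the `tf.keras.callbacks.EarlyStopping` callback.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop under a patience rule."""
    best = float('inf')
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0  # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)  # never triggered: ran all epochs


# Made-up validation losses for illustration only.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.64]
print('Would stop after epoch', stopping_epoch(losses, patience=3))
```

In the notebook, the analogous step would be passing an `EarlyStopping` instance in the `callbacks` argument of `model.fit`, assuming validation loss is the quantity to monitor.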