# Customization Basics: Tensors and Operations

In this notebook we will import the required packages, create and use tensors, use GPU acceleration, and demonstrate tf.data.Dataset.

In [2]:
# Importing tensorflow

import tensorflow as tf 

## Tensors
A Tensor is a multi-dimensional array. Similar to NumPy ndarray objects, tf.Tensor objects have a data type and a shape. Additionally, tf.Tensors can reside in accelerator memory (like a GPU). TensorFlow offers a rich library of operations (tf.add, tf.matmul, tf.linalg.inv, etc.) that consume and produce tf.Tensors. These operations automatically convert native Python types to tensors, for example:

In [3]:
# some example tensors 

print(tf.add(1,2))


tf.Tensor(3, shape=(), dtype=int32)


In [4]:
print(tf.add([1, 2], [3, 4]))

tf.Tensor([4 6], shape=(2,), dtype=int32)


In [5]:
print(tf.square(5))

tf.Tensor(25, shape=(), dtype=int32)


In [6]:
print(tf.reduce_sum([1,2,3]))

tf.Tensor(6, shape=(), dtype=int32)


Each tensor has a shape and a data type.

In [7]:
x = tf.matmul([[1]], [[2,3]])
print(x)
print(x.shape)
print(x.dtype)

tf.Tensor([[2 3]], shape=(1, 2), dtype=int32)
(1, 2)
<dtype: 'int32'>
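
As a small illustration of shape and data type, the sketch below builds the same values with an inferred and an explicit dtype, then changes dtype and shape with `tf.cast` and `tf.reshape`. The variable names are chosen for this example only and do not appear elsewhere in the notebook.

```python
import tensorflow as tf

ints = tf.constant([[1, 2], [3, 4]])                      # dtype inferred from Python ints
floats = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)  # dtype requested explicitly
as_float = tf.cast(ints, tf.float32)                      # convert an existing tensor
flat = tf.reshape(ints, [4])                              # same values, new shape

print(ints.dtype, floats.dtype, as_float.dtype)
print(ints.shape, flat.shape)
```

Casting and reshaping return new tensors; `ints` itself keeps its original dtype and shape.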


## Tensors and NumPy

Although the two types are similar, there are some underlying differences between tensors and NumPy arrays. Tensors can be backed by accelerator memory (such as a GPU), whereas NumPy arrays are always backed by host memory. Tensors are also immutable.

In [8]:
import numpy as np 

ndarray = np.ones([3,2])
# here we will see that tf operations convert numpy arrays to tensors automatically

tensor = tf.multiply(ndarray, 42)
print(tensor)



tf.Tensor(
[[42. 42.]
 [42. 42.]
 [42. 42.]], shape=(3, 2), dtype=float64)


In [9]:
# And NumPy operations convert Tensors to numpy arrays automatically

print(np.add(tensor, 1))

[[43. 43.]
 [43. 43.]
 [43. 43.]]


In [10]:
# The .numpy() method explicitly converts a Tensor to a numpy array

print(tensor.numpy())

[[42. 42.]
 [42. 42.]
 [42. 42.]]
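
One difference that is easy to see in code is mutability: a NumPy array can be modified in place, while a tf.Tensor does not support item assignment. The following is a minimal sketch; the `assigned` flag is a helper introduced for this example only.

```python
import numpy as np
import tensorflow as tf

ndarray = np.ones([3, 2])
tensor = tf.ones([3, 2])

ndarray[0, 0] = 7.0  # NumPy arrays can be modified in place

try:
    tensor[0, 0] = 7.0  # tf.Tensor does not support item assignment
    assigned = True
except TypeError as err:
    assigned = False
    print("Tensor assignment failed:", type(err).__name__)

print(ndarray[0, 0])
```

To get a modified tensor, build a new one with an operation such as tf.add or tf.multiply rather than writing into the existing one.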


## GPU Acceleration 

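Many TensorFlow operations are accelerated using the GPU. Without any annotations, TensorFlow automatically decides whether to use the GPU or the CPU for an operation, copying the tensor between CPU and GPU memory if necessary.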

In [11]:
x = tf.random.uniform([3,3])

# Check whether a GPU is available on this machine.
# On Colab, change the runtime type to GPU before running the following.
print(tf.config.list_physical_devices("GPU"))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


The output shows that one GPU, GPU 0, is available.

In [14]:
# check whether the tensor is placed on GPU 0

print(x.device.endswith('GPU:0'))

True


## Explicit Device Placement 

In TensorFlow, placement refers to how individual operations are assigned to (placed on) a device for execution. When no explicit guidance is provided, TensorFlow automatically decides which device should execute an operation and copies tensors to that device if needed. Operations can also be explicitly placed on a specific device using the tf.device context manager.

In [19]:
# Example 

import time 

def time_matmul(x):
  start = time.time()
  for _ in range(10):
    tf.matmul(x,x)

  result = time.time() - start 

  print("10 loops : {:0.2f}ms".format(1000*result))


# force execution on the CPU

print('On CPU')
with tf.device("CPU:0"):
  x = tf.random.uniform([1000,1000]) # 1000 x 1000 matrix of uniform random values
  assert x.device.endswith("CPU:0") # make sure the tensor is placed on CPU 0
  time_matmul(x) # multiply x by itself 10 times and print the elapsed time

# force execution on the GPU, if one is available

if tf.config.list_physical_devices("GPU"): # if GPU is available
  print("On GPU")
  with tf.device("GPU:0"):
    x = tf.random.uniform([1000,1000])
    assert x.device.endswith("GPU:0")
    time_matmul(x)


On CPU
10 loops : 393.74ms
On GPU
10 loops : 2565.94ms
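
In the run above the GPU is slower than the CPU. One plausible explanation, which is an assumption and was not verified on the machine that produced these numbers, is that the first GPU call pays one-time startup costs that the timing loop then includes. The sketch below uses a hypothetical variant, `time_matmul_warm`, that runs one untimed warm-up multiplication and calls `.numpy()` on the final result so the clock stops only after the work has finished.

```python
import time
import tensorflow as tf

def time_matmul_warm(x, loops=10):
    """Hypothetical variant of time_matmul that excludes first-call overhead."""
    tf.matmul(x, x).numpy()  # untimed warm-up; .numpy() waits for the result
    start = time.perf_counter()
    for _ in range(loops):
        result = tf.matmul(x, x)
    result.numpy()  # wait for the last multiplication before stopping the clock
    return 1000 * (time.perf_counter() - start)

# Always time the CPU; add the GPU only if one is available.
devices = ["CPU:0"]
if tf.config.list_physical_devices("GPU"):
    devices.append("GPU:0")

timings = {}
for device in devices:
    with tf.device(device):
        x = tf.random.uniform([1000, 1000])
        timings[device] = time_matmul_warm(x)
    print("{}: 10 loops : {:0.2f}ms".format(device, timings[device]))
```

Timings depend heavily on hardware and matrix size, so no expected numbers are given here.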


## Datasets
This section uses the tf.data.Dataset API to build a pipeline for feeding data to your model. The tf.data.Dataset API is used to build performant, complex input pipelines from simple, re-usable pieces that will feed your model's training or evaluation loops.

In [21]:
# create a source dataset with tf factory functions 

ds_tensors = tf.data.Dataset.from_tensor_slices([1,2,3,4,5,6])

# create a temporary text file with a few lines
import tempfile 
_, filename = tempfile.mkstemp()

with open(filename, 'w') as f:
  f.write("""Line 1
Line 2
Line 3
  """)

ds_file = tf.data.TextLineDataset(filename)

In [24]:
print(ds_tensors)

<TensorSliceDataset shapes: (), types: tf.int32>


In [25]:
ds_file

<TextLineDatasetV2 shapes: (), types: tf.string>

## Apply Transformations 

Use transformation functions such as map, batch, and shuffle to apply transformations to dataset records.

In [26]:
ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)

ds_file = ds_file.batch(2)
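
To see what each transformation does on its own, the sketch below applies map, batch, and shuffle separately to the same six integers. The names `squared`, `batched`, and `shuffled`, and the batch size of 4, are chosen for this example only.

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])

squared = ds.map(tf.square)                # apply tf.square to every element
batched = squared.batch(4)                 # group consecutive elements; the last batch may be smaller
shuffled = squared.shuffle(buffer_size=6)  # buffer as large as the dataset, so any order is possible

squared_values = [int(v) for v in squared.as_numpy_iterator()]
batched_values = [b.tolist() for b in batched.as_numpy_iterator()]
shuffled_values = [int(v) for v in shuffled.as_numpy_iterator()]

print(squared_values)   # [1, 4, 9, 16, 25, 36]
print(batched_values)   # [[1, 4, 9, 16], [25, 36]]
print(shuffled_values)  # same six values, order varies between runs
```

shuffle draws each output element from a buffer of `buffer_size` candidates, so a small buffer such as `shuffle(2)` only reorders elements locally.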

## Iterate 

tf.data.Dataset objects support iteration, so you can loop over their records directly.

In [27]:
print('Elements of ds_tensors:')
for x in ds_tensors:
  print(x)

print('\n Elements of ds_file:')
for x in ds_file:
  print(x) 

Elements of ds_tensors:
tf.Tensor([4 9], shape=(2,), dtype=int32)
tf.Tensor([16 25], shape=(2,), dtype=int32)
tf.Tensor([ 1 36], shape=(2,), dtype=int32)

 Elements of ds_file:
tf.Tensor([b'Line 1' b'Line 2'], shape=(2,), dtype=string)
tf.Tensor([b'Line 3' b'  '], shape=(2,), dtype=string)

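
The last batch of ds_file above contains a whitespace-only element, because the text written to the file ends with a line of spaces. If such lines are unwanted, one option is to filter them out before batching. This is a minimal sketch that recreates a similar temporary file; `non_blank` and `lines` are names introduced for this example.

```python
import os
import tempfile
import tensorflow as tf

# Recreate a small text file whose last line contains only spaces.
fd, filename = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("Line 1\nLine 2\nLine 3\n  ")

ds_file = tf.data.TextLineDataset(filename)

# Keep only lines that still have characters after stripping whitespace.
non_blank = ds_file.filter(
    lambda line: tf.strings.length(tf.strings.strip(line)) > 0)

lines = [line.numpy().decode("utf-8") for line in non_blank]
print(lines)  # ['Line 1', 'Line 2', 'Line 3']

os.remove(filename)  # clean up the temporary file
```

Filtering before calling batch keeps every batch free of blank entries.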