## TensorFlow - Eager Execution

In [1]:
import tensorflow as tf

# TF 1.x API; eager execution is enabled by default in TF 2.x.
tf.enable_eager_execution()

#### Tensors

A **tensor** is a multi-dimensional array. Like NumPy `ndarray` objects, `Tensor` objects have a data type and a shape. Additionally, Tensors can reside in accelerator memory (such as a GPU). TensorFlow offers a rich library of operations that consume and produce Tensors, and these operations automatically convert native Python types to Tensors.

The most obvious differences between NumPy arrays and TensorFlow Tensors are:
1. Tensors can be backed by accelerator memory (such as GPU or TPU).
2. Tensors are immutable.

In [2]:
print(tf.add(1, 2))
print(tf.add([1, 2], [3, 4]))
print(tf.square(5))
print(tf.reduce_sum([1, 2, 3]))
print(tf.encode_base64("hello world"))

tf.Tensor(3, shape=(), dtype=int32)
tf.Tensor([4 6], shape=(2,), dtype=int32)
tf.Tensor(25, shape=(), dtype=int32)
tf.Tensor(6, shape=(), dtype=int32)
tf.Tensor(b'aGVsbG8gd29ybGQ', shape=(), dtype=string)
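
The last output line is not ordinary Base64: `tf.encode_base64` uses the web-safe alphabet (`-` and `_` instead of `+` and `/`) and, by default, omits the trailing `=` padding. As a rough cross-check using only the Python standard library (an illustration, not how TensorFlow implements it; the helper name `websafe_b64_no_pad` is made up for this sketch):

```python
import base64

def websafe_b64_no_pad(data: bytes) -> bytes:
    """Web-safe Base64 with the trailing '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")

encoded = websafe_b64_no_pad(b"hello world")
print(encoded)  # b'aGVsbG8gd29ybGQ'
```

The standard encoder alone would return `b'aGVsbG8gd29ybGQ='`, so the missing `=` in the tensor output is expected rather than a truncation.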


In [3]:
print(tf.square(2) + tf.square(3)) # Operator overloading

tf.Tensor(13, shape=(), dtype=int32)


In [4]:
x = tf.matmul([[4]], [[2, 3]])
print(x)
print(x.shape)
print(x.dtype)

tf.Tensor([[ 8 12]], shape=(1, 2), dtype=int32)
(1, 2)
<dtype: 'int32'>


#### NumPy Compatibility

TensorFlow operations automatically convert NumPy `ndarray`s to `Tensor`s, and NumPy operations automatically convert `Tensor`s to `ndarray`s, before carrying out the operation.

Tensors can be explicitly converted to NumPy `ndarray`s by calling their `.numpy()` method. These conversions are typically cheap, because a `Tensor` and an `ndarray` can share the same memory representation when possible. However, if the tensor lives in GPU memory, the conversion requires copying it to host memory, which can be expensive.

In [5]:
import numpy as np

ndarray = np.ones([3, 3])
tensor = tf.multiply(ndarray, 42)
print(tensor) # First convert ndarray to tensor then multiply

print(np.add(tensor, 1)) # First convert tensor to ndarray then add

print(tensor.numpy()) # Explicitly convert tensor to ndarray

tf.Tensor(
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]], shape=(3, 3), dtype=float64)
[[43. 43. 43.]
 [43. 43. 43.]
 [43. 43. 43.]]
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]]


#### GPU Acceleration

Without any annotations, TensorFlow automatically decides whether to use the GPU or the CPU for an operation (and copies the tensor between GPU and CPU memory if necessary). Tensors produced by an operation are typically backed by the memory of the device on which the operation executed:

In [6]:
x = tf.random_uniform([3, 3])

print(tf.test.is_gpu_available())
print(x.device.endswith('GPU:0'))

False
False


The `Tensor.device` property provides a fully qualified string name of the device hosting the contents of the tensor. The string ends with `GPU:<N>` if the tensor is placed on the `N`-th GPU on the host.
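
To make the naming convention concrete, the sketch below pulls the device type and index off the end of such a string. The helper `parse_device_suffix` is hypothetical (it is not part of TensorFlow), and the example string is an assumed format for a local, single-process run.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper, not a TensorFlow API: match a trailing "<TYPE>:<N>".
_DEVICE_SUFFIX = re.compile(r"(?:^|[/:])(CPU|GPU|TPU):(\d+)$")

def parse_device_suffix(device: str) -> Optional[Tuple[str, int]]:
    """Return (device_type, index) from the end of a device string, or None."""
    match = _DEVICE_SUFFIX.search(device)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

# Assumed example of a fully qualified device name.
example = "/job:localhost/replica:0/task:0/device:GPU:1"
print(parse_device_suffix(example))  # ('GPU', 1)
```

For a quick yes/no check, `x.device.endswith('GPU:0')` as used in the cell above is simpler; parsing is only useful when the index itself is needed.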

#### Explicit Device Placement

The term *placement* in TF refers to how individual operations are assigned to (placed on) a device for execution. If no explicit guidance is provided, TF automatically decides which device to use. You can also place an operation on a specific device by using the `tf.device` context manager:

In [7]:
def time_matmul(x):
    %timeit tf.matmul(x, x)
    
# CPU Execution
print("On CPU:")
with tf.device("CPU:0"):
    x = tf.random_uniform([1000, 1000])
    assert x.device.endswith("CPU:0")
    time_matmul(x)
    
# GPU Execution
if tf.test.is_gpu_available():
    with tf.device("GPU:0"):
        x = tf.random_uniform([1000, 1000])
        assert x.device.endswith("GPU:0")
        time_matmul(x)

On CPU:
26.5 ms ± 244 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
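
`%timeit` is an IPython magic, so `time_matmul` as written only runs inside a notebook or an IPython shell. A rough plain-Python counterpart can be sketched with the standard `timeit` module. The helper `time_callable` and the stand-in workload are assumptions made for this sketch; in the notebook, the lambda would wrap `tf.matmul(x, x)` instead.

```python
import timeit

def time_callable(fn, number=10, repeat=7):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    return min(totals) / number

# Stand-in workload so the sketch runs without TensorFlow;
# replace with `lambda: tf.matmul(x, x)` in the notebook.
best = time_callable(lambda: sum(i * i for i in range(1000)))
print("best of 7 runs: {:.3f} ms per call".format(best * 1e3))
```

Unlike `%timeit`, which reports a mean and standard deviation, this sketch reports the fastest run, so the two numbers are not directly comparable.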


#### Datasets

You can use the `tf.data.Dataset` API to build pipelines to feed data to your model. It is recommended to use this API for building performant, complex input pipelines from simple, re-usable pieces that will feed your model's training or evaluation loops.

The example below uses Python's `tempfile` module to create a temporary text file; see https://docs.python.org/3/library/tempfile.html for details.

In [8]:
ds_tensors = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])

# Create a temporary text file (plain lines, not CSV)
import tempfile
_, filename = tempfile.mkstemp()

with open(filename, 'w') as f:
    f.write("""Line 1
    Line 2
    Line 3
    """)
    
ds_file = tf.data.TextLineDataset(filename) # A `Dataset` comprising lines from one or more text files.
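
Note that the triple-quoted literal in the cell above is indented along with the `with` block, so every line after the first is written with four leading spaces, and the final line consists only of spaces. If that is not intended, one option is `textwrap.dedent` from the standard library. The snippet below is a sketch of that alternative, not the notebook's original code; it also closes the file descriptor that `tempfile.mkstemp` returns.

```python
import os
import tempfile
import textwrap

# mkstemp returns an already-open OS-level descriptor plus the path.
fd, path = tempfile.mkstemp(suffix=".txt")

# The backslash suppresses the initial newline; dedent strips the common indent.
text = textwrap.dedent("""\
    Line 1
    Line 2
    Line 3
    """)

# os.fdopen takes ownership of the descriptor and closes it on exit.
with os.fdopen(fd, "w") as f:
    f.write(text)

with open(path) as f:
    lines = f.read().splitlines()
print(lines)  # ['Line 1', 'Line 2', 'Line 3']

os.remove(path)  # clean up the temporary file
```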

In [9]:
ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2) # Apply transformation
ds_file = ds_file.batch(2) # batch(batch_size): combines consecutive elements of this dataset into batches

In [10]:
for x in ds_tensors: print(x)
for x in ds_file: print(x)

tf.Tensor([4 1], shape=(2,), dtype=int32)
tf.Tensor([ 9 25], shape=(2,), dtype=int32)
tf.Tensor([36 16], shape=(2,), dtype=int32)
tf.Tensor([b'Line 1' b'    Line 2'], shape=(2,), dtype=string)
tf.Tensor([b'    Line 3' b'    '], shape=(2,), dtype=string)
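
The ordering of the first three batches follows from how `shuffle(2)` works: it keeps a buffer of two elements, emits one of them at random, and refills the freed slot from the input. The pure-Python generators below are a rough model of `map`, `shuffle`, and `batch` under that assumption. The function names are invented for this sketch, and this is not TensorFlow's implementation.

```python
import random
from itertools import islice

def shuffle_buffer(items, buffer_size, rng):
    """Yield items in a locally shuffled order (buffer_size must be >= 1)."""
    it = iter(items)
    buffer = list(islice(it, buffer_size))
    for item in it:
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = item  # refill the freed slot with the next input
    rng.shuffle(buffer)     # drain whatever is left once the input ends
    yield from buffer

def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    it = iter(items)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk

squared = (x * x for x in [1, 2, 3, 4, 5, 6])  # stands in for map(tf.square)
shuffled = list(shuffle_buffer(squared, 2, random.Random(0)))
batches = list(batch(shuffled, 2))
print(batches)
```

With a buffer of two, an element can appear at most one position earlier than in the input (though it may be delayed for longer), which is why the printed batches stay close to sorted order.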
