<a href="https://colab.research.google.com/github/deenukhan/deep_learning/blob/main/1_1_Basic_TF_Operations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# This notebook is inspired by the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

import tensorflow as tf
import numpy as np
from tensorflow import keras

### Tensors and Operations

In [7]:
# You can create a tensor with tf.constant(). For example, here is a tensor representing a matrix with two rows and three columns of integers:
matrix_constant = tf.constant([[1, 2, 3], [4, 5, 6]]) #Matrix
print(matrix_constant)
print(matrix_constant.shape)

scalar_constant = tf.constant(1)  # Scalar
print(scalar_constant)
print(scalar_constant.shape)

tf.Tensor(
[[1 2 3]
 [4 5 6]], shape=(2, 3), dtype=int32)
(2, 3)
tf.Tensor(1, shape=(), dtype=int32)
()


In [11]:
# Indexing in TensorFlow works much like in NumPy
matrix_constant[:, :2]

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [4, 5]], dtype=int32)>

In [17]:
matrix_constant[:,:, tf.newaxis].shape
# matrix_constant[..., tf.newaxis].shape gives the same result
# The ellipsis (...) stands for all existing dimensions, here both the rows and the columns

TensorShape([2, 3, 1])
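
To check the equivalence noted in the comments above, here is a minimal sketch that appends a trailing axis in three ways. The name `t` is a placeholder introduced for this example, and `tf.expand_dims()` is an alternative that the original cell does not use.

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])  # same values as matrix_constant

# Explicit slice for every existing axis, then a new trailing axis
explicit = t[:, :, tf.newaxis]
# The ellipsis stands for "all existing axes", however many there are
with_ellipsis = t[..., tf.newaxis]
# Function-call alternative: insert an axis of size 1 at the last position
expanded = tf.expand_dims(t, axis=-1)

print(explicit.shape, with_ellipsis.shape, expanded.shape)
```

The ellipsis form is convenient when the rank of the tensor is not known in advance, since it does not need one slice per axis.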

In [21]:
# Most of the operations we use in NumPy are also available in TensorFlow
print(matrix_constant + 20)
print(tf.square(matrix_constant))
print(tf.transpose(matrix_constant))

tf.Tensor(
[[21 22 23]
 [24 25 26]], shape=(2, 3), dtype=int32)
tf.Tensor(
[[ 1  4  9]
 [16 25 36]], shape=(2, 3), dtype=int32)
tf.Tensor(
[[1 4]
 [2 5]
 [3 6]], shape=(3, 2), dtype=int32)


In [22]:
matrix_constant @ tf.transpose(matrix_constant)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[14, 32],
       [32, 77]], dtype=int32)>

You will find all the basic math operations you need (tf.add(), tf.multiply(), tf.square(), tf.exp(), tf.sqrt(), etc.) and most operations that you can find in NumPy (e.g., tf.reshape(), tf.squeeze(), tf.tile()). Some functions have a different name than in NumPy; for instance, tf.reduce_mean(), tf.reduce_sum(), tf.reduce_max(), and tf.math.log() are the equivalent of np.mean(), np.sum(), np.max() and np.log().

When the name differs, there is often a good reason for it. For example, in TensorFlow you must write tf.transpose(t); you cannot just write t.T like in NumPy. The reason is that the tf.transpose() function does not do exactly the same thing as NumPy’s T attribute: in TensorFlow, a new tensor is created with its own copy of the transposed data, while in NumPy, t.T is just a transposed view on the same data.

Similarly, the tf.reduce_sum() operation is named this way because its GPU kernel (i.e., GPU implementation) uses a reduce algorithm that does not guarantee the order in which the elements are added: because 32-bit floats have limited precision, the result may change ever so slightly every time you call this operation. The same is true of tf.reduce_mean() (but of course tf.reduce_max() is deterministic).
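
The naming differences described above can be checked side by side. The following is a minimal sketch; the variable names are placeholders introduced here. It also shows one caveat that is easy to miss: tf.reduce_mean() keeps the dtype of its input, so an integer tensor yields an integer mean.

```python
import numpy as np
import tensorflow as tf

a = np.array([[1., 2., 3.], [4., 5., 6.]], dtype=np.float32)
t = tf.constant(a)

# Same reductions under different names
mean_np, mean_tf = np.mean(a), tf.reduce_mean(t).numpy()
sum_np, sum_tf = np.sum(a), tf.reduce_sum(t).numpy()
max_np, max_tf = np.max(a), tf.reduce_max(t).numpy()
print(mean_np, mean_tf, sum_np, sum_tf, max_np, max_tf)

# Caveat: tf.reduce_mean() keeps the input dtype, so integer inputs
# give an integer (truncated) mean, unlike np.mean()
int_mean_tf = tf.reduce_mean(tf.constant([1, 0, 1, 0])).numpy()
int_mean_np = np.mean([1, 0, 1, 0])
print(int_mean_tf, int_mean_np)

# NumPy's .T is a view that shares memory with the original array
shares = np.shares_memory(a, a.T)
print(shares)

# tf.transpose() returns a new tensor object
t_transposed = tf.transpose(t)
print(t_transposed.shape)
```

Casting to a float dtype before calling tf.reduce_mean() avoids the truncation when a fractional mean is wanted.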

### Tensors and NumPy

In [27]:
# Tensors play nice with NumPy: you can create a tensor from a NumPy array, and vice versa. 
# You can even apply TensorFlow operations to NumPy arrays and NumPy operations to tensors:

a = np.array([1,2,4])
print(tf.constant(a))

print(matrix_constant.numpy())
print(tf.square(a))

# Notice that NumPy uses 64-bit precision by default, while TensorFlow uses 32-bit. 
# This is because 32-bit precision is generally more than enough for neural networks, 
# plus it runs faster and uses less RAM. So when you create a tensor from a NumPy array, make sure to set dtype=tf.float32.

tf.Tensor([1 2 4], shape=(3,), dtype=int64)
[[1 2 3]
 [4 5 6]]
tf.Tensor([ 1  4 16], shape=(3,), dtype=int64)
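
The advice about setting dtype=tf.float32 can be sketched as follows. The array values and variable names are placeholders chosen for this example; the cell above uses integers, which is why its output shows int64 rather than float64.

```python
import numpy as np
import tensorflow as tf

a64 = np.array([1.0, 2.0, 4.0])           # NumPy floats default to float64
t64 = tf.constant(a64)                     # the tensor inherits float64
t32 = tf.constant(a64, dtype=tf.float32)   # request 32-bit explicitly
t32_cast = tf.cast(t64, tf.float32)        # or convert an existing tensor

print(a64.dtype, t64.dtype, t32.dtype, t32_cast.dtype)
```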


### Type Conversions
Type conversions can significantly hurt performance, and they can easily go unnoticed when they are done automatically. To avoid this, TensorFlow does not perform any type conversions automatically: it just raises an exception if you try to execute an operation on tensors with incompatible types. For example, you cannot add a float tensor and an integer tensor, and you cannot even add a 32-bit float and a 64-bit float:

In [28]:
tf.constant(4.) + tf.constant(40)

# Check the error: a float tensor and an integer tensor cannot be added

InvalidArgumentError: ignored

In [29]:
tf.constant(4.) + tf.constant(40., dtype = tf.float64)
# Check the error: a 32-bit float and a 64-bit float cannot be added

InvalidArgumentError: ignored

In [30]:
# Use tf.cast() to convert one operand so that both dtypes match
tf.constant(4.) + tf.cast(tf.constant(40), tf.float32)

<tf.Tensor: shape=(), dtype=float32, numpy=44.0>
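
The three cells above can be combined into one runnable sketch that catches the error instead of stopping the notebook. The variable names are placeholders. TypeError is caught as well, on the assumption that the exception class may differ depending on how the operation is executed (for example inside a tf.function).

```python
import tensorflow as tf

f32 = tf.constant(4.)   # float32
i32 = tf.constant(40)   # int32

try:
    f32 + i32  # incompatible dtypes: no automatic conversion
    mismatch_raised = False
except (tf.errors.InvalidArgumentError, TypeError) as err:
    mismatch_raised = True
    print(type(err).__name__)

# Cast explicitly so that both operands share a dtype
result = f32 + tf.cast(i32, f32.dtype)
print(result)
```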

### Variables

The tf.Tensor values we’ve seen so far are immutable: you cannot modify them. This means that we cannot use regular tensors to implement weights in a neural network, since they need to be tweaked by backpropagation. Plus, other parameters may also need to change over time (e.g., a momentum optimizer keeps track of past gradients). What we need is a tf.Variable:

In [31]:
v = tf.Variable([[1,2,3], [5,5,7]])

In [34]:
v.assign(v*2)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=int32, numpy=
array([[ 2,  4,  6],
       [10, 10, 14]], dtype=int32)>

In [35]:
v[0, 1].assign(42)  

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=int32, numpy=
array([[ 2, 42,  6],
       [10, 10, 14]], dtype=int32)>

In [37]:
v[:, 2].assign([0, 1])

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=int32, numpy=
array([[ 2, 42,  0],
       [10, 10,  1]], dtype=int32)>

In [39]:
v.scatter_nd_update(indices=[[0, 0], [1, 2]], updates=[10, 200])

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=int32, numpy=
array([[ 10,  42,   0],
       [ 10,  10, 200]], dtype=int32)>

In practice you will rarely have to create variables manually, since Keras provides an add_weight() method that will take care of it for you, as we will see. Moreover, model parameters will generally be updated directly by the optimizers, so you will rarely need to update variables manually.
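
To make the contrast between immutable tensors and variables concrete, here is a minimal sketch. It first shows that a regular tensor rejects item assignment, then performs a few manual gradient descent steps on a variable. The toy loss, the learning rate, and all names are assumptions made for illustration; an optimizer would normally do the update step for you.

```python
import tensorflow as tf

# Regular tensors are immutable: item assignment is not supported
t = tf.constant([1, 2, 3])
try:
    t[0] = 10
    tensor_is_mutable = True
except TypeError:
    tensor_is_mutable = False

# A variable can be updated in place, which is what training needs
w = tf.Variable(3.0)     # a single "weight"
learning_rate = 0.1      # hypothetical value

for step in range(3):
    with tf.GradientTape() as tape:
        loss = tf.square(w)            # toy loss: w^2
    grad = tape.gradient(loss, w)      # d(loss)/dw = 2 * w
    w.assign_sub(learning_rate * grad) # w <- w - learning_rate * grad
    print(step, w.numpy())
```

Each step multiplies w by 0.8 here, so the value moves toward the minimum of the toy loss at zero.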

# **Loading and Preprocessing Data with TensorFlow**

## Data API

In [41]:
# TFRecord is a flexible and efficient binary format usually containing protocol buffers (an open source binary format)

# The whole Data API revolves around the concept of a dataset: 
# as you might suspect, this represents a sequence of data items. 
# Usually you will use datasets that gradually read data from disk, 
# but for simplicity let’s create a dataset entirely in RAM using tf.data.Dataset.from_tensor_slices():

x = tf.range(10)
dataset = tf.data.Dataset.from_tensor_slices(x)
dataset

<TensorSliceDataset shapes: (), types: tf.int32>

In [42]:
# We can simply iterate over the dataset's items

for data in dataset:
    print(data)

tf.Tensor(0, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(3, shape=(), dtype=int32)
tf.Tensor(4, shape=(), dtype=int32)
tf.Tensor(5, shape=(), dtype=int32)
tf.Tensor(6, shape=(), dtype=int32)
tf.Tensor(7, shape=(), dtype=int32)
tf.Tensor(8, shape=(), dtype=int32)
tf.Tensor(9, shape=(), dtype=int32)


In [44]:
# Once you have a dataset, you can apply all sorts of transformations to it by calling its transformation methods. 
# Each method returns a new dataset, so you can chain transformations like this

dataset = dataset.repeat(3).batch(5)
for data in dataset:
    print(data)

# The dataset methods do not modify datasets, they create new ones, so make sure to keep a reference to 
# these new datasets (e.g., with dataset = ...), or else nothing will happen.

tf.Tensor([0 1 2 3 4], shape=(5,), dtype=int32)
tf.Tensor([5 6 7 8 9], shape=(5,), dtype=int32)
tf.Tensor([0 1 2 3 4], shape=(5,), dtype=int32)
tf.Tensor([5 6 7 8 9], shape=(5,), dtype=int32)
tf.Tensor([0 1 2 3 4], shape=(5,), dtype=int32)
tf.Tensor([5 6 7 8 9], shape=(5,), dtype=int32)


In this example, we first call the repeat() method on the original dataset, and it returns a new dataset that will repeat the items of the original dataset three times. Of course, this will not copy all the data in memory three times! (If you call this method with no arguments, the new dataset will repeat the source dataset forever, so the code that iterates over the dataset will have to decide when to stop.) Then we call the batch() method on this new dataset, and again this creates a new dataset. This one will group the items of the previous dataset in batches of five items.
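
The following sketch illustrates two points from the explanation above: what happens when the number of items is not a multiple of the batch size, and why the returned dataset must be kept. A batch size of 7 is chosen here only because 30 items do not divide evenly by it; the variable names are placeholders.

```python
import tensorflow as tf

source = tf.data.Dataset.range(10)

# 30 items in batches of 7: four full batches plus a final batch of 2
batched = source.repeat(3).batch(7)
sizes = [int(batch.shape[0]) for batch in batched]
print(sizes)

# drop_remainder=True discards the incomplete final batch
full_only = source.repeat(3).batch(7, drop_remainder=True)
sizes_full = [int(batch.shape[0]) for batch in full_only]
print(sizes_full)

# Calling a method without keeping the result leaves the source unchanged
source.batch(5)
n_items = sum(1 for _ in source)  # still 10 scalar items
print(n_items)
```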

In [45]:
# We can use the map() method as well to transform every item
dataset = dataset.map(lambda x: x * 2)

In [None]:
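# It is also possible to filter the dataset using the filter() method.
# The predicate must return a scalar boolean, and the dataset is batched at this point,
# so each batch is reduced to a single True/False with tf.reduce_all().
# The result is stored under a new name so that the cells below still see the unfiltered dataset.
filtered_dataset = dataset.filter(lambda x: tf.reduce_all(x < 10))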

In [46]:
# You will often want to look at just a few items from a dataset. You can use the take() method for that:
for data in dataset.take(3):
    print(data)

tf.Tensor([0 2 4 6 8], shape=(5,), dtype=int32)
tf.Tensor([10 12 14 16 18], shape=(5,), dtype=int32)
tf.Tensor([0 2 4 6 8], shape=(5,), dtype=int32)
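
Because filter() expects its predicate to return a single boolean per dataset element, filtering a batched dataset needs some care. The sketch below shows two possible approaches on a small dataset built for this example; the names `ds`, `items`, and `small_batches` are placeholders.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10).batch(5)  # two batches of five items

# Option 1: go back to individual items, then filter item by item
items = ds.unbatch().filter(lambda x: x < 5)
kept_items = [int(x) for x in items]
print(kept_items)

# Option 2: keep the batches and reduce the predicate to one boolean
small_batches = ds.filter(lambda batch: tf.reduce_all(batch < 5))
kept_batches = [batch.numpy().tolist() for batch in small_batches]
print(kept_batches)
```

Option 1 drops individual items, while option 2 keeps or drops whole batches, so the right choice depends on what the filter is meant to express.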


In [47]:
# Shuffling the Data

dataset = tf.data.Dataset.range(10).repeat(3)
dataset = dataset.shuffle(buffer_size=5, seed=56).batch(7)

for data in dataset:
    print(data)

tf.Tensor([0 1 2 6 7 8 4], shape=(7,), dtype=int64)
tf.Tensor([5 0 3 2 9 3 7], shape=(7,), dtype=int64)
tf.Tensor([5 4 6 8 0 9 2], shape=(7,), dtype=int64)
tf.Tensor([5 1 3 6 4 8 1], shape=(7,), dtype=int64)
tf.Tensor([7 9], shape=(2,), dtype=int64)
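
In the output above, small values tend to appear early and the order is only partly random. That follows from the way a shuffle buffer works: it holds buffer_size items, emits one of them at random, and refills the freed slot with the next source item. The pure-Python sketch below imitates that idea. It is an illustration under that assumption, not TensorFlow's implementation, and it will not reproduce the exact order shown above; `buffered_shuffle` is a hypothetical helper.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)        # still filling the buffer
            continue
        idx = rng.randrange(len(buffer))
        yield buffer[idx]              # emit a random buffered item
        buffer[idx] = item             # refill the slot with the new item
    rng.shuffle(buffer)                # source exhausted: drain the rest
    yield from buffer

shuffled = list(buffered_shuffle(range(30), buffer_size=5, seed=56))
print(shuffled)
```

With a buffer of 5, the item emitted at position k can never be larger than k + 4, which is why a small buffer gives only local shuffling. A buffer at least as large as the dataset is needed for a uniform shuffle.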
