Learn TensorFlow Fundamentals


-----------------------

# Tensors

A **tensor** is a container for *numeric* data. A tensor can hold data along an arbitrary number of dimensions: it can be zero-dimensional (0D), one-dimensional (1D), two-dimensional (2D), three-dimensional (3D), and so on. In the context of tensors, a dimension is often called an **axis**.

Tensors are thus a generalization of matrices, represented as n-dimensional arrays. The number of axes a tensor has is called its **rank**, and it describes the tensor's dimensionality: a scalar has rank 0, a vector has rank 1, and a matrix has rank 2.

![Imgur](https://imgur.com/342G6Hk.png) 

[source](https://www.guru99.com/tensor-tensorflow.html)


![Imgur](https://imgur.com/duJyL5S.png)

[source](https://www.tensorflow.org/guide/tensor)

## Scalars (0D tensors)

A **scalar** is a tensor that contains only one number. For example, a NumPy float32 or float64 number is a scalar tensor (or scalar array).
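As a minimal sketch of the statement above (the variable names `np_scalar` and `tf_scalar` are illustrative and not part of the original notebook), a NumPy scalar and a TensorFlow scalar both report zero axes and an empty shape:

```python
import numpy as np
import tensorflow as tf

# A NumPy float32 number is a 0D (scalar) array.
np_scalar = np.float32(3.5)

# The TensorFlow counterpart: a rank-0 tensor.
tf_scalar = tf.constant(3.5)

# Number of axes and shape of each scalar.
print(np_scalar.ndim, np_scalar.shape)
print(tf_scalar.ndim, tf_scalar.shape)
```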


In [20]:
import tensorflow as tf
print(tf.__version__)

import warnings
warnings.filterwarnings('ignore')

2.6.0


In [21]:
tensor = tf.constant(5)
print(tensor)

tf.Tensor(5, shape=(), dtype=int32)


In [22]:
tensor = tf.constant([1, 2, 3])
print(tensor)

tf.Tensor([1 2 3], shape=(3,), dtype=int32)


In [23]:
tensor = tf.constant([[1, 2, 3], [2,3,9]])
# print(tensor)
print(tf.rank(tensor).numpy())

2



It is easy to display the number of axes (or dimensionality) of a tensor with the *ndim* attribute. For the rank-2 tensor defined in the previous cell:

In [24]:
print(tensor.ndim)

2
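Besides `ndim`, a tensor exposes a few other descriptive properties. The following is a small sketch, using an illustrative variable named `matrix`, that reads the rank, the shape, and the total number of elements of the same 2D tensor:

```python
import tensorflow as tf

matrix = tf.constant([[1, 2, 3], [2, 3, 9]])

rank = matrix.ndim                  # number of axes
shape = matrix.shape.as_list()      # size along each axis
n_elements = int(tf.size(matrix))   # total number of elements

print(rank, shape, n_elements)
```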


In [25]:
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      
                      [[7, 8, 9],
                       [10, 11, 12]],
                      
                      [[13, 14, 15],
                       [16, 17, 18]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]], dtype=int32)>

In [26]:
tensor_float16 = tf.constant([[10, 7], [14, 15]], dtype=tf.float16)
tensor_float16

<tf.Tensor: shape=(2, 2), dtype=float16, numpy=
array([[10.,  7.],
       [14., 15.]], dtype=float16)>

In [27]:
mutable_tensor = tf.Variable([4, 5])

immutable_tensor = tf.constant([5, 6])

mutable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([4, 5], dtype=int32)>

In [28]:
# mutable_tensor[0] = 12  # raises TypeError: item assignment is not supported, use assign() instead
mutable_tensor[0].assign(12)
mutable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([12,  5], dtype=int32)>
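The cells above contrast `tf.Variable` with `tf.constant`. A hedged sketch of the difference follows; the names `variable`, `constant`, and the two boolean flags are illustrative. It assumes eager execution, which is the default in TensorFlow 2.x:

```python
import tensorflow as tf

variable = tf.Variable([4, 5])
constant = tf.constant([5, 6])

# Python item assignment is not supported on a tf.Variable ...
try:
    variable[0] = 12
    item_assignment_ok = True
except TypeError:
    item_assignment_ok = False

# ... but assign() updates the variable in place.
variable[0].assign(12)
variable.assign_add([1, 1])  # element-wise in-place addition

# A tensor created with tf.constant has no assign() method.
constant_has_assign = hasattr(constant, "assign")

print(item_assignment_ok, variable.numpy(), constant_has_assign)
```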

## Creating Random Tensors

In [29]:
random_gen = tf.random.Generator.from_seed(42)

random_normal = random_gen.normal(shape=(2, 3))
random_normal

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[-0.7565803 , -0.06854702,  0.07595026],
       [-1.2573844 , -0.23193763, -1.8107855 ]], dtype=float32)>

In [30]:
random_uniform = random_gen.uniform(shape=(2, 3))
random_uniform

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[0.7647915 , 0.03845465, 0.8506975 ],
       [0.20781887, 0.711869  , 0.8843919 ]], dtype=float32)>
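A `tf.random.Generator` carries its own state, so two generators created from the same seed should produce the same first draw. This is a small sketch with illustrative names (`gen_a`, `gen_b`), not code from the original notebook:

```python
import tensorflow as tf

gen_a = tf.random.Generator.from_seed(42)
gen_b = tf.random.Generator.from_seed(42)

first_a = gen_a.normal(shape=(2, 3))
first_b = gen_b.normal(shape=(2, 3))

# Both generators started from the same state, so their first draws match.
same_first_draw = bool(tf.reduce_all(first_a == first_b))

# gen_a's state has advanced, so this draw continues its sequence
# instead of repeating the first one.
second_a = gen_a.normal(shape=(2, 3))

print(same_first_draw)
```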

In [31]:
a = tf.random.shuffle([1,2,3,4,5], seed=2345).numpy()
b = tf.random.shuffle([1,2,3,4,5], seed=2345).numpy()
print(a == b)


[False False False False False]


## Special Notes on tf.random.set_seed()

In [32]:
tf.random.set_seed(2345)
a = tf.random.shuffle([1,2,3,4,5], seed=2345).numpy()
tf.random.set_seed(2345)
b = tf.random.shuffle([1,2,3,4,5], seed=2345).numpy()
print(a == b)

[ True  True  True  True  True]


[Official Doc](https://www.tensorflow.org/api_docs/python/tf/random/set_seed)


Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. `tf.random.set_seed` sets the global seed.

Its interactions with operation-level seeds are as follows:

* If neither the global seed nor the operation seed is set: A randomly picked seed is used for this op.
* If the global seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the global seed so that it gets a unique random sequence. Within the same version of tensorflow and user code, this sequence is deterministic. However across different versions, this sequence might change. If the code depends on particular seeds to work, specify both global and operation-level seeds explicitly.
* If the operation seed is set, but the global seed is not set: A default global seed and the specified operation seed are used to determine the random sequence.
* If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence.

To illustrate the user-visible effects, consider the first case: if neither the global seed nor the operation seed is set, we get different results for every call to the random op and on every re-run of the program.
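A minimal sketch of these seed rules, assuming eager execution (the TensorFlow 2.x default) and illustrative variable names. The first two calls use no seed at all; the remaining calls set only the global seed, then reset it to replay the same sequence:

```python
import tensorflow as tf

# Neither seed set: each call (and each program run) draws fresh values.
print(tf.random.uniform([1]))
print(tf.random.uniform([1]))

# Global seed set, no operation seed: the sequence of calls is reproducible.
tf.random.set_seed(1234)
a1 = tf.random.uniform([1])
a2 = tf.random.uniform([1])

# Resetting the global seed restarts the sequence from the beginning.
tf.random.set_seed(1234)
b1 = tf.random.uniform([1])
b2 = tf.random.uniform([1])

same_first = bool(tf.reduce_all(a1 == b1))
same_second = bool(tf.reduce_all(a2 == b2))
print(same_first, same_second)
```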


# The tf.data API

In [33]:
X = tf.range(5)
X

<tf.Tensor: shape=(5,), dtype=int32, numpy=array([0, 1, 2, 3, 4], dtype=int32)>

In [34]:
X.numpy()

array([0, 1, 2, 3, 4], dtype=int32)

In [35]:
dataset = tf.data.Dataset.from_tensor_slices(X)
print(dataset)

<TensorSliceDataset shapes: (), types: tf.int32>


In [36]:
for item in dataset:
  print(item)

tf.Tensor(0, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(3, shape=(), dtype=int32)
tf.Tensor(4, shape=(), dtype=int32)


Depending on where the data is stored and in what format, the `tf.data.Dataset` class offers
several static methods for creating a dataset easily:

* Tensors in memory: `tf.data.Dataset.from_tensors` or `tf.data.Dataset.from_tensor_slices`. In this case, the tensors can be NumPy arrays or `tf.Tensor` objects.

* A Python generator: `tf.data.Dataset.from_generator`.

* A list of files that match a pattern: `tf.data.Dataset.list_files`.

There are also two specializations of the `tf.data.Dataset` object for working
with two commonly used file formats:

* `tf.data.TFRecordDataset`: works with TFRecord files.

* `tf.data.TextLineDataset`: works with text files, reading them line by line.
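The two in-memory constructors listed above differ in how they treat the first axis. The sketch below uses an illustrative NumPy array named `data`: `from_tensors` wraps the whole array as a single element, while `from_tensor_slices` yields one element per row.

```python
import numpy as np
import tensorflow as tf

data = np.array([[1, 2], [3, 4], [5, 6]])

whole = tf.data.Dataset.from_tensors(data)        # a single element holding the full array
rows = tf.data.Dataset.from_tensor_slices(data)   # one element per slice along axis 0

whole_shapes = [item.shape.as_list() for item in whole]
row_shapes = [item.shape.as_list() for item in rows]

print(whole_shapes)
print(row_shapes)
```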


In [37]:
def noise():
  while True:
    yield tf.random.uniform((100, 0))  # note: shape (100, 0) holds zero elements; (100,) would give a 100-component vector
    
dataset = tf.data.Dataset.from_generator(noise, (tf.float32))
dataset

<FlatMapDataset shapes: <unknown>, types: tf.float32>

The only peculiarity of the `from_generator` method is that the type of the generated
elements (`tf.float32`, in this case) must be passed as the second argument; this is required
because the element types must be known in advance in order to build the graph.
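Newer TensorFlow 2.x releases also accept an `output_signature` argument that declares both the type and the shape of each element. The following is a sketch under that assumption; the generator `noise_vectors` is hypothetical and yields NumPy arrays rather than the notebook's tensors:

```python
import numpy as np
import tensorflow as tf

def noise_vectors():
    # Hypothetical generator: yields 100-component float32 noise vectors forever.
    while True:
        yield np.random.uniform(size=(100,)).astype(np.float32)

# Declare the dtype and shape of every element up front.
signature = tf.TensorSpec(shape=(100,), dtype=tf.float32)
noise_dataset = tf.data.Dataset.from_generator(
    noise_vectors, output_signature=signature)

# The generator is infinite, so take only a few elements to inspect.
shapes = [item.shape.as_list() for item in noise_dataset.take(3)]
print(shapes)
```

Because the shape is declared, later transformations such as `batch` can report a static shape instead of `<unknown>`.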

Using method chaining, it is possible to create new dataset objects, transforming
the one just built to get the data our machine learning model expects as input.

For example, suppose we want to

- add 10 to every component of the noise vector,
- shuffle the dataset content, and
- create batches of 32 vectors each.

We can do so by calling just three methods chained together:

In [38]:
buffer_size = 10
batch_size = 32

dataset = dataset.map(lambda x: x + 10).shuffle(buffer_size).batch(batch_size)

# Use `batch` as the loop variable so the `noise` generator function is not shadowed.
for idx, batch in enumerate(dataset):
  if idx == 2:
    break
  print(idx)
  print(batch.shape)

0
(32, 100, 0)
1
(32, 100, 0)


The `map` method is the most widely used method of the `tf.data.Dataset` object, since
it allows us to apply a function to every element of the input dataset, producing a
new, transformed dataset.
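As a small illustration on a finite dataset (the names `numbers` and `doubled` are illustrative), `map` leaves the source dataset untouched and returns a new one:

```python
import tensorflow as tf

numbers = tf.data.Dataset.range(5)

# map() returns a new dataset; `numbers` itself is not modified.
doubled = numbers.map(lambda x: x * 2)

original_values = [int(v) for v in numbers]
doubled_values = [int(v) for v in doubled]

print(original_values)
print(doubled_values)
```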

The `shuffle` method is used in nearly every training pipeline: it randomly shuffles
the input dataset using a fixed-size buffer. The transformation first fills the buffer
with `buffer_size` elements from its input, then draws each output element at random
from the buffer and replaces it with the next input element.
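One consequence of the fixed-size buffer is that an element can only appear in the output once it has entered the buffer. The sketch below, with illustrative names and an arbitrary `seed=0`, checks this on a dataset of 100 integers shuffled with a buffer of 10:

```python
import tensorflow as tf

buffer_size = 10
shuffled = tf.data.Dataset.range(100).shuffle(buffer_size, seed=0)
values = [int(v) for v in shuffled]

# The i-th output can only come from the first (buffer_size + i) inputs.
within_window = all(v < buffer_size + i for i, v in enumerate(values))

# Shuffling reorders elements but never drops or duplicates them.
is_permutation = sorted(values) == list(range(100))

print(within_window, is_permutation)
```

For a full shuffle, the buffer needs to be at least as large as the dataset.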

The `batch` method gathers `batch_size` elements from its input and creates a
batch as output. The only constraint of this transformation is that all elements of
the batch must have the same shape.
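When the dataset size is not a multiple of `batch_size`, the last batch is smaller. A brief sketch on a finite dataset (names are illustrative) shows this, along with the `drop_remainder` argument that discards the incomplete batch:

```python
import tensorflow as tf

numbers = tf.data.Dataset.range(10)

# 10 elements in batches of 4: the final batch holds the 2 leftover elements.
batch_sizes = [int(tf.shape(b)[0]) for b in numbers.batch(4)]

# drop_remainder=True keeps only full batches.
full_only = [int(tf.shape(b)[0]) for b in numbers.batch(4, drop_remainder=True)]

print(batch_sizes)
print(full_only)
```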