# In this notebook, we cover some of the most fundamental concepts of tensors using TensorFlow
More specifically, we're going to cover:
- Introduction to tensors
- Getting information from tensors
- Manipulating tensors
- Tensors and NumPy
- Using `@tf.function` (a way to speed up your regular Python functions)
- Using GPUs with TensorFlow (or TPUs)
- Exercises

# Introduction to Tensors

In [1]:
# Import TensorFlow
import tensorflow as tf

In [5]:
# let's check the version of TensorFlow
print(tf.__version__)

2.19.0


In [6]:
# Create tensors with tf.constant()
scalar = tf.constant(7)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

In [8]:
# Check the number of dimensions of a tensor (ndim)
scalar.ndim

0

In [9]:
# Create a vector
vector = tf.constant([10,20])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 20], dtype=int32)>

In [10]:
# check the dimension of the vector
vector.ndim

1

In [11]:
# let's create a matrix
matrix = tf.constant([[1,9],
                      [16,1]])
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 1,  9],
       [16,  1]], dtype=int32)>

In [12]:
# check the dimension of the matrix
matrix.ndim

2

In [13]:
# Create another matrix
matrix2 = tf.constant([[1., 4.],
                       [9., 16.],
                       [25., 36.]], dtype=tf.float16) # specify the data type with dtype parameter
matrix2

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[ 1.,  4.],
       [ 9., 16.],
       [25., 36.]], dtype=float16)>

In [16]:
# you can check that matrix2 has a dimension of 2
matrix2.ndim

2

In [18]:
# Now let's create a rank-3 tensor
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [0, 0, 0]],
                      [[13, 19, 1],
                       [9, 3, 11]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [ 0,  0,  0]],

       [[13, 19,  1],
        [ 9,  3, 11]]], dtype=int32)>

In [19]:
# check the dimension
tensor.ndim

3

**What we have covered so far**:
* Scalar: a single number
* Vector: a 1-dimensional array of numbers (a number with direction, e.g. wind speed and direction)
* Matrix: a 2-dimensional array of numbers
* Tensor: an n-dimensional array of numbers

## Creating tensors with `tf.Variable`
### `tf.constant` vs `tf.Variable` in TensorFlow:

| **Feature**   | **tf.constant** (a function) | **tf.Variable** (a class)|
|--------------|---------------|---------------|
| **Mutability** | Immutable (cannot be changed) | Mutable (can be changed) |
| **Purpose**   | Fixed data, constants | Trainable parameters, state |
| **Initialization** | Required at creation | Required at creation |
| **Use Cases** | Input data, fixed parameters | Model weights, biases, accumulators |

`tf.constant` is great for fixed values that don't change during computation, while `tf.Variable` is essential for parameters that need to be updated during training, such as weights in a neural network.




In [16]:
# Create the same tensor with tf.constant() and with tf.Variable()
my_tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
my_variable = tf.Variable([[1.0, 2.0], [3.0, 4.0]]) # or my_variable = tf.Variable(my_tensor)

In [17]:
my_tensor, my_variable

(<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[1., 2.],
        [3., 4.]], dtype=float32)>,
 <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
 array([[1., 2.],
        [3., 4.]], dtype=float32)>)

In [14]:
# Variables can be all kinds of types, just like tensors
bool_variable = tf.Variable([False, False, False, True])
complex_variable = tf.Variable([5 + 4j, 6 + 1j])

In [15]:
bool_variable, complex_variable

(<tf.Variable 'Variable:0' shape=(4,) dtype=bool, numpy=array([False, False, False,  True])>,
 <tf.Variable 'Variable:0' shape=(2,) dtype=complex128, numpy=array([5.+4.j, 6.+1.j])>)

##### Change the elements of a variable using the `.assign()` method

In [19]:
my_variable[0].assign([0,0])
my_variable

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[0., 0.],
       [3., 4.]], dtype=float32)>

In [20]:
# let's try the same on the constant tensor (this fails: constants are immutable)
my_tensor[0].assign([0,0])
my_tensor

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

**🗝 Note:** In practice you will rarely need to decide whether to use `tf.constant` or `tf.Variable` to create tensors, as TensorFlow does this for us. However, if in doubt, use `tf.constant` and change it later if needed.
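To make the mutability difference concrete, here is a minimal sketch. The names `weights`, `fixed` and `patched` are illustrative and not part of the notebook, and `tf.tensor_scatter_nd_update` is just one possible way to build a modified copy of an immutable tensor.

```python
import tensorflow as tf

# A variable can be updated in place.
weights = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
weights.assign_add(tf.ones_like(weights))  # add 1 to every element
weights[0].assign([0.0, 0.0])              # overwrite the first row

# A constant cannot be changed, but we can build a modified copy of it.
fixed = tf.constant([[1.0, 2.0], [3.0, 4.0]])
patched = tf.tensor_scatter_nd_update(fixed, indices=[[0, 0]], updates=[0.0])

print(weights.numpy())  # the variable itself has changed
print(fixed.numpy())    # the constant is untouched
print(patched.numpy())  # a new tensor with element [0, 0] replaced
```

The variable keeps its identity across updates, which is why it suits model weights; the constant stays as it was and only the copy differs.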

## Creating random tensors
In TensorFlow, random tensors are used to initialize variables, create randomized data samples, or generate inputs for models. You can create them using various functions from the `tf.random` module.

In [21]:
# Create two random (but the same) tensors
g1 = tf.random.Generator.from_seed(1) # set local seed for reproducibility 
random_normal_tensor1 = g1.normal(shape=[2, 3])
g2 = tf.random.Generator.from_seed(1) # set local seed for reproducibility 
random_normal_tensor2 = g2.normal(shape=[2, 3])

In [23]:
# are they equal
random_normal_tensor1, random_normal_tensor2, random_normal_tensor1 == random_normal_tensor2

(<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
 array([[ 0.43842274, -0.53439844, -0.07710262],
        [ 1.5658046 , -0.1012345 , -0.2744976 ]], dtype=float32)>,
 <tf.Tensor: shape=(2, 3), dtype=float32, numpy=
 array([[ 0.43842274, -0.53439844, -0.07710262],
        [ 1.5658046 , -0.1012345 , -0.2744976 ]], dtype=float32)>,
 <tf.Tensor: shape=(2, 3), dtype=bool, numpy=
 array([[ True,  True,  True],
        [ True,  True,  True]])>)

## Shuffle the order of the elements in a tensor
Read Here üîó: [tf.random.shuffle](https://www.tensorflow.org/api_docs/python/tf/random/shuffle)

In [25]:
# Define a tensor
tensor = tf.constant([1, 2, 3, 4, 5])

# Shuffle the tensor randomly
shuffled_tensor = tf.random.shuffle(tensor, seed=24)

print("Original Tensor:", tensor.numpy())
print("Shuffled Tensor:", shuffled_tensor.numpy())

Original Tensor: [1 2 3 4 5]
Shuffled Tensor: [5 4 1 3 2]


If both the global and the operation seeds are set, we get the same results on every re-run of the program: [random seed](https://www.tensorflow.org/api_docs/python/tf/random/set_seed)
> Rule 4: If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence.

In [28]:
tf.random.set_seed(55) # global level random seed
# Define a tensor
tensor2 = tf.constant([[1, 2], [3, 4], [5,6]])

# Shuffle the tensor randomly
shuffled_tensor2 = tf.random.shuffle(tensor2, seed=55) # operation level random seed

print("Original Tensor:\n", tensor2.numpy())
print("Shuffled Tensor:\n", shuffled_tensor2.numpy())

Original Tensor:
 [[1 2]
 [3 4]
 [5 6]]
Shuffled Tensor:
 [[5 6]
 [1 2]
 [3 4]]
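The rule quoted above can be checked with a small sketch. It assumes eager execution (the TensorFlow 2 default) and uses the illustrative names `values`, `first` and `again`; resetting the global seed before the second call is what makes the two shuffles match.

```python
import tensorflow as tf

values = tf.constant([1, 2, 3, 4, 5])

tf.random.set_seed(55)                       # global seed
first = tf.random.shuffle(values, seed=55)   # operation seed

tf.random.set_seed(55)                       # reset the global seed
again = tf.random.shuffle(values, seed=55)   # same seeds, so the same order

print(first.numpy(), again.numpy())
```

Without the second `tf.random.set_seed` call, the second shuffle would generally differ from the first, because the operation keeps an internal counter between calls.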


## Other ways to create tensors
- **[tf.ones](https://www.tensorflow.org/api_docs/python/tf/ones):** Creates a tensor with all elements set to one (1), similar to `np.ones` in NumPy.
- **[tf.zeros](https://www.tensorflow.org/api_docs/python/tf/zeros):** Creates a tensor with all elements set to zero (0), similar to `np.zeros` in NumPy.

In [30]:
# create a tensor of all ones
tf.ones([3,4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], dtype=float32)>

In [31]:
# create a tensor of all zeros
tf.zeros([3,4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

## Turn NumPy arrays into tensors
The main difference between NumPy arrays and TensorFlow tensors is that tensors can be run on a GPU (much faster for numerical computing). Otherwise they are very similar.

In [33]:
# import NumPy
import numpy as np
np_vector = np.arange(1, 25, dtype=np.int32)
np_vector

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24], dtype=int32)

In [34]:
# Convert to tensors
tensor1 = tf.constant(np_vector)
tensor2 = tf.constant(np_vector, shape=(8,3))
tensor3 = tf.constant(np_vector, shape=(2,3,4))
tensor1, tensor2, tensor3

(<tf.Tensor: shape=(24,), dtype=int32, numpy=
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24], dtype=int32)>,
 <tf.Tensor: shape=(8, 3), dtype=int32, numpy=
 array([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12],
        [13, 14, 15],
        [16, 17, 18],
        [19, 20, 21],
        [22, 23, 24]], dtype=int32)>,
 <tf.Tensor: shape=(2, 3, 4), dtype=int32, numpy=
 array([[[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12]],
 
        [[13, 14, 15, 16],
         [17, 18, 19, 20],
         [21, 22, 23, 24]]], dtype=int32)>)

In [35]:
# check the dimensions
tensor1.ndim, tensor2.ndim, tensor3.ndim

(1, 2, 3)
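The `shape` argument works here because each target shape holds exactly 24 elements. The sketch below spells that out and also shows `tf.reshape` with `-1`, which lets TensorFlow infer one dimension; the names `as_matrix`, `as_cube` and `inferred` are illustrative.

```python
import numpy as np
import tensorflow as tf

np_vector = np.arange(1, 25, dtype=np.int32)        # 24 elements

# Any target shape works as long as its dimensions multiply to 24.
as_matrix = tf.constant(np_vector, shape=(8, 3))    # 8 * 3 = 24
as_cube = tf.constant(np_vector, shape=(2, 3, 4))   # 2 * 3 * 4 = 24

# tf.reshape can infer one dimension when it is given as -1.
inferred = tf.reshape(as_cube, [4, -1])             # 24 / 4 = 6 columns

print(as_matrix.shape, as_cube.shape, inferred.shape)
```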

## Getting Information from Tensors
When working with tensors in TensorFlow, it's important to be aware of the following properties:  

- **Shape (`tensor.shape`)** – The number of elements along each dimension of a tensor.  
- **Rank (`tf.rank(tensor)`)** – The number of dimensions a tensor has. A scalar has rank `0`, a vector has rank `1`, a matrix has rank `2`, and higher-dimensional tensors have rank `n`.  
- **ndim (`tensor.ndim`)** – Also the number of dimensions, but returned as a plain Python integer rather than as a tensor.  
- **Axis or Dimension (`tensor[i]`, `tensor[:, n]`, etc.)** – A particular dimension of a tensor that can be accessed or manipulated.  
- **Size (`tf.size(tensor)`)** – The total number of elements in a tensor.
>Both `tf.rank(tensor)` and `tensor.ndim` give the number of dimensions of a tensor; `tf.rank` returns it as a scalar tensor, while `tensor.ndim` returns a Python `int`.
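A short sketch of how these properties relate, using an illustrative tensor named `example`: the rank is available both as a tensor and as a Python integer, and the size equals the product of the entries of the shape.

```python
import tensorflow as tf

example = tf.zeros(shape=(2, 3, 2, 4))

rank_as_tensor = tf.rank(example)   # a scalar int32 tensor
rank_as_int = example.ndim          # a plain Python int

print(rank_as_tensor.dtype, type(rank_as_int))
print(int(rank_as_tensor) == rank_as_int)

# The size is the product of the entries of the shape.
total = 1
for dim in example.shape.as_list():
    total *= dim
print(total == int(tf.size(example)))
```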

In [36]:
# Create a rank 4 tensor (4 dimensions)
tensor1 = tf.zeros(shape=(2,3,2,4))
tensor1

<tf.Tensor: shape=(2, 3, 2, 4), dtype=float32, numpy=
array([[[[0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.]]],


       [[[0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.]]]], dtype=float32)>

In [37]:
tensor1[0]

<tf.Tensor: shape=(3, 2, 4), dtype=float32, numpy=
array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]]], dtype=float32)>

In [39]:
tensor1.shape, tf.rank(tensor1), tensor1.ndim, tf.size(tensor1)

(TensorShape([2, 3, 2, 4]),
 <tf.Tensor: shape=(), dtype=int32, numpy=4>,
 4,
 <tf.Tensor: shape=(), dtype=int32, numpy=48>)

In [41]:
# Get various attributes of the tensor
print("Datatype of every element: ", tensor1.dtype)
print("Number of dimensions (rank): ", tensor1.ndim, " or ", tf.rank(tensor1).numpy())
print("Shape of tensor: ",  tensor1.shape)
print("Elements along the 0 axis: ", tensor1.shape[0])
print("Elements along the last axis: ", tensor1.shape[-1])
print("Total number of elements in tensor: ", tf.size(tensor1).numpy())

Datatype of every element:  <dtype: 'float32'>
Number of dimensions (rank):  4  or  4
Shape of tensor:  (2, 3, 2, 4)
Elements along the 0 axis:  2
Elements along the last axis:  4
Total number of elements in tensor:  48


## Indexing tensors
Tensors can be indexed just like Python lists.

In [47]:
# Get the first 2 elements of each dimension
tensor1[:2, :2, :2, :2]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [53]:
# Get the first element of every dimension except the last one, which is kept whole
tensor1[:1, :1, :1, :]

<tf.Tensor: shape=(1, 1, 1, 4), dtype=float32, numpy=array([[[[0., 0., 0., 0.]]]], dtype=float32)>

In [60]:
# Create a rank 2 tensor (2 dimensions)
rank2_tensor = tf.random.uniform([2,2], minval=2, maxval=5, dtype=tf.int32)
rank2_tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[2, 3],
       [3, 3]], dtype=int32)>

In [61]:
# Get the last item of each row of our rank 2 tensor
rank2_tensor[:, -1]

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 3], dtype=int32)>

## Adding an extra dimension to a tensor
- **Using `tf.newaxis`:** It is an alias for `None`, allowing you to insert a new dimension into a tensor when indexing.
- **Using `tf.expand_dims()`:** It is more explicit; it lets you specify the axis at which the new dimension is inserted.


In [62]:
# add axis using newaxis
tensor_new = rank2_tensor[..., tf.newaxis]  # '...' stands for all existing axes, here ':, :'
tensor_new

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[2],
        [3]],

       [[3],
        [3]]], dtype=int32)>

In [63]:
# Add axis using expand_dims
tensor_exp = tf.expand_dims(rank2_tensor, axis=-1) # -1 means insert the new axis at the end
tensor_exp

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[2],
        [3]],

       [[3],
        [3]]], dtype=int32)>

> 👆 We got exactly the same output.

In [64]:
# Expand the 0-axis
tensor_exp1 = tf.expand_dims(rank2_tensor, axis=0) 
tensor_exp1

<tf.Tensor: shape=(1, 2, 2), dtype=int32, numpy=
array([[[2, 3],
        [3, 3]]], dtype=int32)>
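The new axis does not have to go at the start or the end. As a small sketch (the names `small_matrix`, `middle_newaxis`, `middle_expand` and `restored` are illustrative), both approaches can insert an axis in the middle, and `tf.squeeze` removes it again:

```python
import tensorflow as tf

small_matrix = tf.constant([[2, 3],
                            [3, 3]])

middle_newaxis = small_matrix[:, tf.newaxis, :]        # shape (2, 1, 2)
middle_expand = tf.expand_dims(small_matrix, axis=1)   # shape (2, 1, 2)

# Removing the size-1 axis gives back the original matrix.
restored = tf.squeeze(middle_expand, axis=1)           # shape (2, 2)

print(middle_newaxis.shape, middle_expand.shape, restored.shape)
```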

## Manipulating tensors (tensor operations)
### I. Basic operations
- Addition ➕
- Multiplication ✖
- Subtraction ➖
- Division ➗

In [66]:
tensor = tf.random.uniform([2,2], minval=2, maxval=9, dtype=tf.int32, seed=7)
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[3, 8],
       [6, 5]], dtype=int32)>

In [68]:
tensor+10, tensor-20, tensor*2, tensor/3

(<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[13, 18],
        [16, 15]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[-17, -12],
        [-14, -15]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[ 6, 16],
        [12, 10]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=float64, numpy=
 array([[1.        , 2.66666667],
        [2.        , 1.66666667]])>)

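>The Python operators (`+`, `-`, `*`, `/`) are overloaded to call the corresponding TensorFlow ops, so both forms run on the same device (CPU or GPU). The explicit functions are handy when you want to name the op or pass extra arguments.

- `tf.math.divide` or `tf.divide`, and similarly for the other operations. ➡ **[tf.math.add](https://www.tensorflow.org/api_docs/python/tf/math/add)**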

In [70]:
tf.add(tensor, 10), tf.subtract(tensor, 20), tf.multiply(tensor, 2), tf.divide(tensor, 3)

(<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[13, 18],
        [16, 15]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[-17, -12],
        [-14, -15]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[ 6, 16],
        [12, 10]], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=float64, numpy=
 array([[1.        , 2.66666667],
        [2.        , 1.66666667]])>)
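The outputs above show that the operators and the functions agree, and that dividing an `int32` tensor with `/` yields `float64`. A minimal sketch of both points, with floor division (`//`) added for comparison; `values` is an illustrative tensor holding the same numbers as the output above.

```python
import tensorflow as tf

values = tf.constant([[3, 8],
                      [6, 5]])

# The operator and the function give the same result.
same_sum = tf.reduce_all(values + 10 == tf.add(values, 10))

true_div = values / 3     # true division: the result is float64
floor_div = values // 3   # floor division: the result stays int32

print(bool(same_sum))
print(true_div.dtype, floor_div.dtype)
print(floor_div.numpy())
```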

### II. Matrix multiplication
In machine learning, matrix multiplication is one of the most common tensor operations.
#### II(a): Matrix multiplication using the transpose of a matrix
$\implies$ **Syntax**: `tf.linalg.matmul` or **[tf.matmul](https://www.tensorflow.org/api_docs/python/tf/linalg/matmul)**

In [72]:
# let's generate two random tensors
A = tf.random.uniform([2,2], minval=2, maxval=9, dtype=tf.int32, seed=7)
B = tf.random.uniform([2,2], minval=2, maxval=9, dtype=tf.int32, seed=7)
A.numpy(), B.numpy()

(array([[2, 4],
        [3, 8]], dtype=int32),
 array([[6, 2],
        [6, 3]], dtype=int32))

In [73]:
# multiply using matmul
C = tf.linalg.matmul(A, B)
C

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[36, 16],
       [66, 30]], dtype=int32)>

In [76]:
# Take another example of matrix multiplication
X = tf.random.uniform([3, 2], minval=2, maxval=9, dtype=tf.int32, seed=7)
Y = tf.random.uniform([3, 2], minval=2, maxval=9, dtype=tf.int32, seed=7)
Y.numpy(), X.numpy()

(array([[7, 8],
        [5, 2],
        [7, 7]], dtype=int32),
 array([[6, 2],
        [5, 2],
        [6, 5]], dtype=int32))

In [79]:
# X and Y are both (3, 2), so the inner dimensions do not match; transpose Y to get (3, 2) @ (2, 3)
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[58, 34, 56],
       [51, 29, 49],
       [82, 40, 77]], dtype=int32)>

**Another way is to change the shape of a matrix so that the inner dimensions match.**
#### II(b): Matrix multiplication using reshape
$\implies$ **Syntax**: **[tf.reshape](https://www.tensorflow.org/api_docs/python/tf/reshape)** followed by `tf.matmul`
>**Note:** reshape $\ne$ transpose
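The note above is easy to verify. In this sketch, `Y_example` holds the same values as `Y` in the earlier output: the transpose flips rows and columns, whereas the reshape keeps the elements in row-major order and only regroups them, so the two results differ even though both have shape `(2, 3)`.

```python
import tensorflow as tf

Y_example = tf.constant([[7, 8],
                         [5, 2],
                         [7, 7]])

transposed = tf.transpose(Y_example)             # columns become rows
reshaped = tf.reshape(Y_example, shape=[2, 3])   # same element order, new grouping

print(transposed.numpy())
print(reshaped.numpy())
```

Both make the matrix multiplication with a `(3, 2)` matrix valid, but only the transpose corresponds to the usual linear-algebra operation.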

In [80]:
# reshape Y only so that the multiplication is valid; the result differs from using the transpose
tf.matmul(X, tf.reshape(Y, shape=[2,3]))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[46, 62, 44],
       [39, 54, 39],
       [52, 83, 65]], dtype=int32)>

**🗝 Note:** Since Python 3.5, the `@` operator is supported (PEP 465). In TensorFlow, it simply calls the `tf.matmul()` function, so the following cell is equivalent to the previous one:

In [81]:
X @ tf.reshape(Y, shape=[2,3])

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[46, 62, 44],
       [39, 54, 39],
       [52, 83, 65]], dtype=int32)>

In [82]:
# Perform matrix multiplication between X reshaped and Y
tf.matmul(tf.reshape(X, shape=[2, 3]), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[87, 87],
       [79, 63]], dtype=int32)>

#### II(c): Tensordot (also known as tensor contraction) sums the product of elements
$\implies$ **Syntax**: `tf.linalg.tensordot` or **[tf.tensordot](https://www.tensorflow.org/api_docs/python/tf/tensordot)**

| Feature | `tf.tensordot(X, Y, axes)` | `tf.matmul(X, Y)` |
|---|---|---|
| **Purpose** | Computes **generalized dot products** with customizable axes. | Performs **standard matrix multiplication**. |
| **Axes Required?** | Yes, you **must specify `axes`** for contraction. | No, it directly multiplies matrices following standard rules. |
| **Usage** | Use `tf.tensordot()` for **more complex contractions**, where you define which axes to sum over. | Use `tf.matmul()` when performing **standard matrix multiplication** (dot product along the second axis of the first tensor and first axis of the second). |
| **Shape Constraints** | Can work with tensors of **any rank**. | Requires valid matrix dimensions: `(M, N) * (N, P) → (M, P)`. |
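The comparison in the table can be sketched as follows. `X_example` and `Y_example` hold the same values as `X` and `Y` in the earlier output; the choice of `axes=[[0], [0]]` is just one illustrative contraction.

```python
import tensorflow as tf

X_example = tf.constant([[6, 2], [5, 2], [6, 5]])   # shape (3, 2)
Y_example = tf.constant([[7, 8], [5, 2], [7, 7]])   # shape (3, 2)

# With axes=1, tensordot contracts the last axis of the first tensor with
# the first axis of the second, which is ordinary matrix multiplication.
via_matmul = tf.matmul(X_example, tf.transpose(Y_example))
via_tensordot = tf.tensordot(X_example, tf.transpose(Y_example), axes=1)

# Explicit axes: contract axis 0 of both tensors, i.e. transpose(X) @ Y.
contracted = tf.tensordot(X_example, Y_example, axes=[[0], [0]])

print(via_tensordot.numpy())
print(contracted.numpy())
```

With explicit axes no transpose is needed, which is the main convenience of `tf.tensordot` over `tf.matmul`.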

In [86]:
# Let's take the dot product of X and the transpose of Y
tf.tensordot(X, tf.transpose(Y), axes=1)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[58, 34, 56],
       [51, 29, 49],
       [82, 40, 77]], dtype=int32)>

In [88]:
# axes=0 contracts nothing and gives the outer (tensor) product, shape (3, 2, 3, 2)
tf.tensordot(X, Y, axes=0)

<tf.Tensor: shape=(3, 2, 3, 2), dtype=int32, numpy=
array([[[[42, 48],
         [30, 12],
         [42, 42]],

        [[14, 16],
         [10,  4],
         [14, 14]]],


       [[[35, 40],
         [25, 10],
         [35, 35]],

        [[14, 16],
         [10,  4],
         [14, 14]]],


       [[[42, 48],
         [30, 12],
         [42, 42]],

        [[35, 40],
         [25, 10],
         [35, 35]]]], dtype=int32)>

## Changing the datatype of a tensor using `tf.cast`
**Signature**: `tf.dtypes.cast` or **[tf.cast](https://www.tensorflow.org/api_docs/python/tf/cast)**
>Casting to a smaller datatype (for example `float32` to `float16`) reduces the precision of the numbers.

In [91]:
# Create a tensor
A = tf.constant([1, 2, 3, 4])
A.dtype

tf.int32

In [92]:
# change the datatype
B = tf.cast(A, tf.int16)
B.dtype

tf.int16

In [94]:
# change the datatype to float
tf.cast(A, tf.float32)

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([1., 2., 3., 4.], dtype=float32)>
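The integers above survive every cast unchanged, so the loss of precision is not visible there. The following sketch uses illustrative values chosen to make it visible: `float16` cannot represent `1000.1`, and casting a float to an integer drops the fractional part.

```python
import tensorflow as tf

precise = tf.constant([1000.1, 3.14159265], dtype=tf.float32)
half = tf.cast(precise, tf.float16)         # fewer bits, coarser values

fractional = tf.constant([1.7, -1.7])
truncated = tf.cast(fractional, tf.int32)   # the fractional part is dropped

print(precise.numpy(), half.numpy())
print(truncated.numpy())
```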

## Aggregating tensors
In the context of **machine learning (ML) and deep learning (DL)**, **aggregators** refer to functions or mechanisms that **combine multiple values** into a **single result**. They are essential in summarizing data, making computations more efficient, and guiding optimization processes.

### **Types of Aggregators in ML/DL**
1. **Statistical Aggregators**  
   - Summarize numerical data from multiple sources.  
   - Examples:  
     - `tf.abs(tensor)` (element-wise absolute value; not a reduction itself, but often applied before one)
     - `tf.reduce_sum(tensor)` or `tf.math.reduce_sum`
     - `tf.reduce_mean(tensor)` or `tf.math.reduce_mean`
     - `tf.reduce_max(tensor)` or `tf.math.reduce_max`
     - `tf.reduce_min(tensor)` or `tf.math.reduce_min`
     - `tf.math.reduce_std` or `tfp.stats.stddev` (with `import tensorflow_probability as tfp`)
     - `tf.math.reduce_variance` or `tfp.stats.variance` (with `import tensorflow_probability as tfp`)  

2. **Gradient Aggregators**  
   - Used in **backpropagation** to compute weight updates across batches.  
   - Example: Aggregating gradients from mini-batches before updating the model.

3. **Attention Aggregators (Neural Networks)**  
   - Used in **Transformer models** (like GPT, BERT) to aggregate attention scores from multiple input tokens.  
   - Example: Softmax-weighted sum of input embeddings in attention mechanisms.

4. **Ensemble Aggregators**  
   - Used in **ensemble learning** to combine predictions from multiple models.  
   - Examples:  
     - Majority Voting (classification)  
     - Averaging (regression)  

In [109]:
# Example Tensor
tensor = tf.constant(np.random.randint(-50, 50, size=64))
tensor = tf.reshape(tensor, [8, 8])
tensor

<tf.Tensor: shape=(8, 8), dtype=int32, numpy=
array([[ 36,  26,  -5, -40,  18, -25,  -6,  10],
       [  3,  -6, -46,  33, -32, -10,  46,  40],
       [ 28, -28, -14, -35,  12,  21, -19, -32],
       [ 48, -34,  32,  43, -16, -15,  27,   3],
       [-33, -33, -35, -22,  47,  49,  35, -27],
       [-10,  32,  15,  35, -45,   5, -22,  48],
       [ 24,  24,  22,  -4, -14, -42,  18, -45],
       [-45,  -2, -13,   1,  43, -46,  48, -47]], dtype=int32)>

In [118]:
# Aggregating values
absolute = tf.abs(tensor)
sum_result = tf.reduce_sum(tensor)  
mean_result = tf.reduce_mean(tensor)
min_result = tf.reduce_min(tensor)
max_result = tf.reduce_max(tensor)
variance = tf.math.reduce_variance(tf.cast(tensor, tf.float32)) # cast to float first: reduce_variance does not accept integer tensors
standard_deviation = tf.math.reduce_std(tf.cast(tensor, tf.float32))

In [119]:
print("Absolute:\n", absolute.numpy())
print("Min:", min_result.numpy())
print("Max:", max_result.numpy())
print("Sum:", sum_result.numpy())
print("Mean:", mean_result.numpy())
print("Variance:", variance.numpy())
print("Standard Deviation:", standard_deviation.numpy())

Absolute:
 [[36 26  5 40 18 25  6 10]
 [ 3  6 46 33 32 10 46 40]
 [28 28 14 35 12 21 19 32]
 [48 34 32 43 16 15 27  3]
 [33 33 35 22 47 49 35 27]
 [10 32 15 35 45  5 22 48]
 [24 24 22  4 14 42 18 45]
 [45  2 13  1 43 46 48 47]]
Min: -47
Max: 49
Sum: 24
Mean: 0
Variance: 933.9844
Standard Deviation: 30.561157
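Note that the mean printed above is `0` even though the sum is `24` over 64 elements (0.375): `tf.reduce_mean` keeps the integer dtype of its input. The sketch below shows this on a small illustrative tensor named `small`, and also how the `axis` argument reduces along one dimension only.

```python
import tensorflow as tf

small = tf.constant([[1, 2],
                     [3, 5]])

int_mean = tf.reduce_mean(small)                          # 11 / 4 in integer arithmetic
float_mean = tf.reduce_mean(tf.cast(small, tf.float32))   # 11 / 4 as a float

column_sums = tf.reduce_sum(small, axis=0)   # collapse the rows
row_sums = tf.reduce_sum(small, axis=1)      # collapse the columns

print(int_mean.numpy(), float_mean.numpy())
print(column_sums.numpy(), row_sums.numpy())
```

Casting to a float dtype before taking the mean avoids the silent loss of the fractional part.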


## Find the positional maximum and minimum
The **positional maximum** and **positional minimum** refer to the locations (indices) of the **maximum** and **minimum** values within a tensor. In TensorFlow, you can find these using:
| **Function** | **Purpose** |
|-------------|------------|
| `tf.argmax(tensor, axis)` | Returns the index of the maximum value along a given axis |
| `tf.argmin(tensor, axis)` | Returns the index of the minimum value along a given axis |

In [123]:
# Find positional maximum and minimum along axis 1 (row-wise)
max_indices = tf.argmax(tensor, axis=1)  # Index of max value in each row
min_indices = tf.argmin(tensor, axis=1)  # Index of min value in each row

In [124]:
print("Tensor:\n", tensor.numpy())
print("Max Position Indices:", max_indices.numpy())  # one index per row
print("Min Position Indices:", min_indices.numpy())  # one index per row

Tensor:
 [[ 36  26  -5 -40  18 -25  -6  10]
 [  3  -6 -46  33 -32 -10  46  40]
 [ 28 -28 -14 -35  12  21 -19 -32]
 [ 48 -34  32  43 -16 -15  27   3]
 [-33 -33 -35 -22  47  49  35 -27]
 [-10  32  15  35 -45   5 -22  48]
 [ 24  24  22  -4 -14 -42  18 -45]
 [-45  -2 -13   1  43 -46  48 -47]]
Max Position Indices: [0 6 0 0 5 7 0 6]
Min Position Indices: [3 2 3 1 2 4 7 7]


In [129]:
# Find the overall max and min positional values
overall_max_index = tf.argmax(tf.reshape(tensor, [-1])) # Flatten the tensor and get overall argmax
overall_min_index = tf.argmin(tf.reshape(tensor, [-1])) # Flatten the tensor and get overall argmin
assert tf.reshape(tensor, [-1])[overall_max_index] == tf.reduce_max(tf.reshape(tensor, [-1]))
assert tf.reshape(tensor, [-1])[overall_min_index] == tf.reduce_min(tf.reshape(tensor, [-1]))

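>Note: **Use `tf.gather()` to fetch elements by a list of indices**
- TensorFlow supports basic indexing and slicing (`tensor[i]`, `tensor[:, -1]`), including a scalar tensor as the index, as in the cell above.
- It does not support NumPy-style fancy indexing with a list or array of indices (`tensor[[0, 2]]`); `tf.gather()` is the way to fetch elements at several indices.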
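A minimal sketch of `tf.gather`, plus one possible way to turn a flat argmax back into a (row, column) position with `tf.unravel_index`. The tensor `grid` and the other names are illustrative and not part of the notebook.

```python
import tensorflow as tf

grid = tf.constant([[3, 9, 1],
                    [7, 2, 8]])
flat = tf.reshape(grid, [-1])                  # [3, 9, 1, 7, 2, 8]

# Fetch several elements at once by index.
picked = tf.gather(flat, [0, 4])               # elements at positions 0 and 4
columns = tf.gather(grid, [2, 0], axis=1)      # columns 2 and 0, in that order

# Position of the overall maximum as (row, column).
flat_index = tf.argmax(flat)                   # int64 scalar
row_col = tf.unravel_index(flat_index, tf.shape(grid, out_type=tf.int64))

print(picked.numpy(), columns.numpy(), row_col.numpy())
```

`tf.shape` is asked for `int64` here so that its dtype matches the `int64` index returned by `tf.argmax`.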


## Squeezing a tensor: remove all **singleton dimensions** (dimensions with size `1`).
This is useful when you have unnecessary extra dimensions that don't carry any information and you want to simplify the tensor shape.

 - **Signature:** `tf.squeeze()`


In [130]:
# Create a tensor
Z = tf.random.uniform([1, 1, 1, 24, 1], minval=2, maxval=19, seed=5)  # already a tensor, no tf.constant() needed
Z

<tf.Tensor: shape=(1, 1, 1, 24, 1), dtype=float32, numpy=
array([[[[[ 8.305319 ],
          [14.736302 ],
          [ 9.158327 ],
          [ 8.556081 ],
          [13.738097 ],
          [18.751696 ],
          [ 9.115873 ],
          [18.608292 ],
          [ 3.108357 ],
          [12.848146 ],
          [14.04483  ],
          [12.017685 ],
          [14.425882 ],
          [ 9.267995 ],
          [ 5.0534716],
          [15.717931 ],
          [ 6.8860636],
          [ 2.9985175],
          [ 7.9140534],
          [14.921027 ],
          [ 6.069438 ],
          [18.644289 ],
          [ 8.900104 ],
          [ 5.51715  ]]]]], dtype=float32)>

In [131]:
# check the shape
Z.shape

TensorShape([1, 1, 1, 24, 1])

In [132]:
# Squeeze the tensor
Z_squeezed = tf.squeeze(Z)
Z_squeezed

<tf.Tensor: shape=(24,), dtype=float32, numpy=
array([ 8.305319 , 14.736302 ,  9.158327 ,  8.556081 , 13.738097 ,
       18.751696 ,  9.115873 , 18.608292 ,  3.108357 , 12.848146 ,
       14.04483  , 12.017685 , 14.425882 ,  9.267995 ,  5.0534716,
       15.717931 ,  6.8860636,  2.9985175,  7.9140534, 14.921027 ,
        6.069438 , 18.644289 ,  8.900104 ,  5.51715  ], dtype=float32)>

## One-Hot encoding
**One-Hot Encoding** is a technique used to convert categorical data into a binary matrix representation. In TensorFlow, you can achieve this using `tf.one_hot()`.

### **How Does One-Hot Encoding Work?**
- Each unique category is assigned an **integer index**.
- A **binary vector** is created where only the index corresponding to that category is `1`, and all others are `0`.

In [135]:
# create a list of indices
mylist = range(4) # indices 0-3; they could stand for red, green, purple, blue

In [136]:
# one-hot encode our list of indices
tf.one_hot(mylist, depth=4) # depth = 4 = len(mylist)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>

In [137]:
# Specify custom values for one hot encoding
tf.one_hot(mylist, depth=4, on_value="I love DL", off_value="I like ML")

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'I love DL', b'I like ML', b'I like ML', b'I like ML'],
       [b'I like ML', b'I love DL', b'I like ML', b'I like ML'],
       [b'I like ML', b'I like ML', b'I love DL', b'I like ML'],
       [b'I like ML', b'I like ML', b'I like ML', b'I love DL']],
      dtype=object)>
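In practice the categories are usually labels rather than ready-made indices. The two steps described earlier (assign an integer index to each category, then build the binary vectors) can be sketched as below; the label list and the helper names `vocabulary`, `index_of` and `decoded` are assumptions for illustration.

```python
import tensorflow as tf

colours = ["red", "green", "purple", "blue", "green"]

# Step 1: assign an integer index to each unique category.
vocabulary = sorted(set(colours))                 # ['blue', 'green', 'purple', 'red']
index_of = {name: i for i, name in enumerate(vocabulary)}
indices = [index_of[name] for name in colours]

# Step 2: one row per label, with a single 1 at the label's index.
encoded = tf.one_hot(indices, depth=len(vocabulary))

# argmax along each row recovers the index, and hence the label.
decoded = [vocabulary[i] for i in tf.argmax(encoded, axis=1).numpy()]

print(indices)
print(encoded.numpy())
print(decoded)
```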

## Some mathematical functions
- tf.math.log
- tf.math.square
- tf.math.sqrt
- tf.math.sin
  etc...
> Most of these functions require a floating-point input, so cast integer tensors to a float datatype first.

In [139]:
# Create a tensor
K = tf.range(1, 10)
K = tf.cast(K, tf.float32)
K

<tf.Tensor: shape=(9,), dtype=float32, numpy=array([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)>

In [140]:
tf.math.log(K), tf.math.square(K), tf.math.sqrt(K), tf.math.sin(K)

(<tf.Tensor: shape=(9,), dtype=float32, numpy=
 array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
        1.9459102, 2.0794415, 2.1972246], dtype=float32)>,
 <tf.Tensor: shape=(9,), dtype=float32, numpy=array([ 1.,  4.,  9., 16., 25., 36., 49., 64., 81.], dtype=float32)>,
 <tf.Tensor: shape=(9,), dtype=float32, numpy=
 array([1.       , 1.4142135, 1.7320508, 2.       , 2.236068 , 2.4494898,
        2.6457512, 2.828427 , 3.       ], dtype=float32)>,
 <tf.Tensor: shape=(9,), dtype=float32, numpy=
 array([ 0.84147096,  0.9092974 ,  0.14112   , -0.7568025 , -0.9589243 ,
        -0.2794155 ,  0.6569866 ,  0.98935825,  0.4121185 ], dtype=float32)>)

## Tensors and NumPy
TensorFlow interacts beautifully with NumPy arrays.

In [142]:
# Create a tensor directly from a NumPy array
J = tf.constant(np.array([1., 4., 9.]))
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([1., 4., 9.])>

In [143]:
# Convert the tensor back to a NumPy array
np.array(J), type(np.array(J))

(array([1., 4., 9.]), numpy.ndarray)

In [144]:
# Convert tensor J to a NumPy array
J.numpy(), type(J.numpy())

(array([1., 4., 9.]), numpy.ndarray)

In [145]:
# index into the NumPy array
J.numpy()[0]

np.float64(1.0)

In [147]:
# The default types of each are slightly different
np_tensor = tf.constant(np.array([1., 4., 9.]))
tf_tensor = tf.constant ([1., 4., 9.])
np_tensor.dtype, tf_tensor.dtype

(tf.float64, tf.float32)

**🗝 Note:** The two tensors have different datatypes: NumPy defaults to `float64`, while TensorFlow defaults to `float32`.
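Because TensorFlow ops generally expect both operands to share a dtype, it can help to fix the dtype when converting from NumPy. A minimal sketch with illustrative names, showing two ways to end up with `float32`:

```python
import numpy as np
import tensorflow as tf

np_values = np.array([1., 4., 9.])                          # float64 in NumPy

default_tensor = tf.constant(np_values)                     # keeps float64
float32_tensor = tf.constant(np_values, dtype=tf.float32)   # set the dtype at creation
cast_tensor = tf.cast(default_tensor, tf.float32)           # or cast afterwards

# Both operands are float32, so they can be combined directly.
total = float32_tensor + cast_tensor

print(default_tensor.dtype, float32_tensor.dtype, cast_tensor.dtype)
print(total.numpy())
```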