In [2]:
import torch
import tensorflow as tf
import numpy as np

## Multiplication

#### Multiplying a tensor with a scalar

In [2]:
# TensorFlow: multiply a tensor with a scalar
tensor = tf.constant(
    [[1, 2], 
     [3, 4]], 
    dtype=tf.float32)
scalar = tf.constant(2, dtype=tf.float32)
print(tensor * scalar)

tf.Tensor(
[[2. 4.]
 [6. 8.]], shape=(2, 2), dtype=float32)


In [3]:
# PyTorch: multiply a tensor with a scalar
tensor = torch.tensor(
    [[1, 2], 
     [3, 4]], 
    dtype=torch.float32)
scalar = torch.tensor(2, dtype=torch.float32)
print(tensor * scalar)

tensor([[2., 4.],
        [6., 8.]])


#### Multiplying a tensor with a vector

In [19]:
# TensorFlow: multiply a tensor with a vector
tensor = tf.constant(
    [[1, 2],
     [3, 4]],
    dtype=tf.float32)
vector = tf.constant([2, 3], dtype=tf.float32)
print(tensor.shape, " ", vector.shape)
print(tensor * vector)

(2, 2)   (2,)
tf.Tensor(
[[ 2.  6.]
 [ 6. 12.]], shape=(2, 2), dtype=float32)


In [9]:
# PyTorch: multiply a tensor with a vector
tensor = torch.tensor(
    [[1, 2],
     [3, 4],
     [5, 6]],
    dtype=torch.float32)
vector = torch.tensor([0.5, 1], dtype=torch.float32)
print(tensor.shape, " ", vector.shape)
print(tensor * vector)

torch.Size([3, 2])   torch.Size([2])
tensor([[0.5000, 2.0000],
        [1.5000, 4.0000],
        [2.5000, 6.0000]])


#### Multiplying a tensor with a matrix

In [25]:
# TensorFlow: multiply a tensor with a matrix
tensor = tf.constant(
    [
        [
            [1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]
        ],
        [
            [10, 11, 12],
            [13, 14, 15],
            [16, 17, 18]
        ]
    ],dtype=tf.float32)
matrix = tf.constant(
    [
        [2, 0, 1],
        [1, 2, 3],
        [0, 1, 2]
    ],
    dtype=tf.float32
)
print(tensor.shape, " ", matrix.shape)
print(tensor * matrix)

(2, 3, 3)   (3, 3)
tf.Tensor(
[[[ 2.  0.  3.]
  [ 4. 10. 18.]
  [ 0.  8. 18.]]

 [[20.  0. 12.]
  [13. 28. 45.]
  [ 0. 17. 36.]]], shape=(2, 3, 3), dtype=float32)


In [26]:
# PyTorch: multiply a tensor with a matrix
tensor = torch.tensor(
    [
        [
            [1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]
        ],
        [
            [10, 11, 12],
            [13, 14, 15],
            [16, 17, 18]
        ]
    ],
    dtype=torch.float32
)
matrix = torch.tensor(
    [
        [2, 0, 1],
        [1, 2, 3],
        [0, 1, 2]
    ],
    dtype=torch.float32
)
print(tensor.shape, " ", matrix.shape)
print(tensor * matrix)

torch.Size([2, 3, 3])   torch.Size([3, 3])
tensor([[[ 2.,  0.,  3.],
         [ 4., 10., 18.],
         [ 0.,  8., 18.]],

        [[20.,  0., 12.],
         [13., 28., 45.],
         [ 0., 17., 36.]]])
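
The element-wise products above work on operands of different shapes because both libraries follow NumPy-style broadcasting: shapes are aligned from the last dimension, and each pair of dimensions must either match or contain a 1. The following is a minimal plain-Python sketch of that shape rule; the helper name `broadcast_shape` is hypothetical and is not part of TensorFlow or PyTorch.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and 1 not in (dim_a, dim_b):
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A dimension of size 1 is stretched to match the other operand.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))


print(broadcast_shape((2, 2), ()))         # (2, 2)     tensor * scalar
print(broadcast_shape((3, 2), (2,)))       # (3, 2)     tensor * vector
print(broadcast_shape((2, 3, 3), (3, 3)))  # (2, 3, 3)  tensor * matrix
```

A vector of shape `(3,)` could not be multiplied with the `(3, 2)` tensor this way, because the trailing dimensions 2 and 3 neither match nor contain a 1.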


#### Dot product

__Example analogy__: Matching people to projects

Think of tensor1 as a group of people, with each row representing a person's skills. For instance, the person represented by the first row in tensor1 has skills valued at `[1, 2]`.

Now, think of tensor2 as a list of projects, with each column representing the required skills for a project. For example, the project represented by the first column in tensor2 requires skills valued at `[5, 6]`.

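The matrix multiplication operation (`torch.matmul` in PyTorch; the TensorFlow cell below uses `tf.tensordot` with `axes=1`, which gives the same result for 2D tensors) is like matching each person from tensor1 to each project in tensor2, and calculating a score that represents how well the person's skills fit the project's requirements. Each score is the dot product of one row of tensor1 with one column of tensor2.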

For the first person (with skills `[1, 2]`) and the first project (requiring skills `[5, 6]`), the match score is calculated as `(1x5) + (2x6)`, which equals `17`. This means that the first person's skills `[1, 2]` meet the first project's requirements `[5, 6]` with a score of `17`.

We repeat this operation for every combination of people and projects, and end up with a new tensor that represents the match scores of each person (from tensor1) for each project (from tensor2). Therefore, torch.matmul(tensor1, tensor2) gives us an idea of how well the skills of each person match the requirements of each project, with higher scores indicating better matches.
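
To make the "every person against every project" loop explicit, here is a minimal plain-Python sketch of the same computation, using the numbers from this section. The names `people`, `projects`, and `match_scores` are illustrative only; in practice you would call `torch.matmul` or its TensorFlow counterpart.

```python
# Each row of `people` is one person's skills;
# each column of `projects` is one project's requirements.
people = [[1, 2],
          [3, 4]]
projects = [[5, 7],
            [6, 8]]


def match_scores(a, b):
    """Plain-Python matrix product: score[i][j] = sum over k of a[i][k] * b[k][j]."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must agree"
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


scores = match_scores(people, projects)
print(scores)  # [[17, 23], [39, 53]]
```

Entry `scores[0][0]` is the `17` from the worked example: person one's skills `[1, 2]` against project one's requirements `[5, 6]`.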

In [47]:
# TensorFlow: Dot product of two tensors
tensor1 = tf.constant(
    [
        [1, 2],
        [3, 4]
    ], 
    dtype=tf.float32)
tensor2 = tf.constant(
    [
        [5, 7],
        [6, 8]
    ],
    dtype=tf.float32)
print(tensor1.shape, " ", tensor2.shape)
print(tf.tensordot(tensor1, tensor2, axes=1))

(2, 2)   (2, 2)
tf.Tensor(
[[17. 23.]
 [39. 53.]], shape=(2, 2), dtype=float32)


In [48]:
# PyTorch: Dot product of two tensors
tensor1 = torch.tensor(
    [
        [1, 2],
        [3, 4]
    ],
    dtype=torch.float32
)
tensor2 = torch.tensor(
    [
        [5, 7],
        [6, 8]
    ],
    dtype=torch.float32
)
# [1,2]   [5,7]   [1x5 + 2x6, 1x7 + 2x8]   [17, 23]
# [3,4] . [6,8] = [3x5 + 4x6, 3x7 + 4x8] = [39, 53]
print(tensor1.shape, " ", tensor2.shape)
print(torch.matmul(tensor1, tensor2))

torch.Size([2, 2])   torch.Size([2, 2])
tensor([[17., 23.],
        [39., 53.]])


#### Element-wise multiplication vs. Dot product

__Element-wise multiplication analogy__:

Let's say you're a farmer with three apple trees, and you're trying to predict your apple yield for next year. You know that the number of apples each tree produces depends on the amount of water and sunlight it gets.

You represent the amount of water each tree got this year as a vector:
```
Water = [15, 20, 25]  # in gallons
```
And you do the same for sunlight:
```
Sunlight = [200, 250, 300]  # in hours
```
You know that each tree's apple yield is proportional to the product of the amount of water and sunlight it gets. So you can predict next year's yield for each tree using element-wise multiplication:
```
Yield = Water * Sunlight = [15*200, 20*250, 25*300] = [3000, 5000, 7500]  # in apples
```
Here, element-wise multiplication helped you make a simple model of how water and sunlight affect your apple yield.
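
As a quick sketch of the yield model in plain Python (the variable names are illustrative), element-wise multiplication pairs up entries at the same position and keeps one result per tree:

```python
water = [15, 20, 25]        # gallons per tree
sunlight = [200, 250, 300]  # hours per tree

# Element-wise product: one output per input position.
yield_per_tree = [w * s for w, s in zip(water, sunlight)]
print(yield_per_tree)  # [3000, 5000, 7500]

# For contrast, a dot product would sum these products into a single number.
total = sum(yield_per_tree)
print(total)  # 15500
```

The output has the same shape as the inputs, which is the key difference from the dot product.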

__Dot product analogy__:

Suppose you're a game developer, and you're programming the behavior of an NPC (non-player character) who follows the player around in a 2D world. Let's say the NPC is at point A, and the player is at point B. You can represent the positions of A and B as 2D vectors:
```
A = [3, 2]
B = [5, 4]
```
The difference vector, D, which points from A to B, can be calculated as B - A:
```
D = [5-3, 4-2] = [2, 2]
```
Now, let's say the NPC can only move in four cardinal directions: up, down, left, and right, represented as unit vectors:
```
Up = [0, 1]
Down = [0, -1]
Left = [-1, 0]
Right = [1, 0]
```
You want to decide in which direction the NPC should move next to get closer to the player. This is where the dot product comes in. The dot product of D with each of the four directions will tell you how much D aligns with each direction:
```
Dot(D, Up) = 2*0 + 2*1 = 2
Dot(D, Down) = 2*0 + 2*(-1) = -2
Dot(D, Left) = 2*(-1) + 2*0 = -2
Dot(D, Right) = 2*1 + 2*0 = 2
```
The direction with the highest dot product is the best direction for the NPC to move in. In this case, both Up and Right have the highest dot product (2), so the NPC should move either up or right to get closer to the player.
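
The direction choice can be sketched in a few lines of plain Python. The helper `dot` and the dictionary layout are assumptions made for illustration, not part of any game engine API.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


npc = (3, 2)     # point A
player = (5, 4)  # point B
d = tuple(p - n for p, n in zip(player, npc))  # vector from A to B: (2, 2)

directions = {
    "Up": (0, 1),
    "Down": (0, -1),
    "Left": (-1, 0),
    "Right": (1, 0),
}

# Score each direction by how well it aligns with d.
scores = {name: dot(d, vec) for name, vec in directions.items()}
best = max(scores.values())
best_directions = [name for name, s in scores.items() if s == best]

print(scores)           # {'Up': 2, 'Down': -2, 'Left': -2, 'Right': 2}
print(best_directions)  # ['Up', 'Right']
```

Collecting all directions that tie for the maximum, rather than taking the first, leaves the tie-breaking policy to the caller.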

### Addition

In [50]:
# TensorFlow: Add a scalar to a tensor
tensor = tf.constant(
    [[1, 2],
     [3, 4]],
    dtype=tf.float32)
scalar = tf.constant(2, dtype=tf.float32)
print(tensor + scalar)

tf.Tensor(
[[3. 4.]
 [5. 6.]], shape=(2, 2), dtype=float32)


In [11]:
# PyTorch: Add a scalar to a tensor
tensor = torch.tensor(
    [[1, 2],
     [3, 4]],
    dtype=torch.float32)
scalar = torch.tensor(2, dtype=torch.float32)
print(tensor + scalar)

tensor([[3., 4.],
        [5., 6.]])


In [13]:
# TensorFlow: Add a vector to a tensor
tensor = tf.constant(
    [[1, 2],
     [3, 4]],
    dtype=tf.float32)
vector = tf.constant([-1, -.5], dtype=tf.float32)
print(tensor + vector)

tf.Tensor(
[[0.  1.5]
 [2.  3.5]], shape=(2, 2), dtype=float32)


### Transpose

In [5]:
# Define a tensor in PyTorch
A = torch.tensor([
    [
        [1, 2],
        [3, 4]
    ],
    [
        [5, 6],
        [7, 8]
    ]
    ], dtype=torch.float32)
A

tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]]])

In [12]:
# Transposed A
A_t = torch.transpose(A, 0, 1)
A_t

tensor([[[1., 2.],
         [5., 6.]],

        [[3., 4.],
         [7., 8.]]])

In [14]:
# Define a tensor in TensorFlow
B = tf.constant([
    [
        [1, 2],
        [3, 4]
    ],
    [
        [5, 6],
        [7, 8]
    ]
    ], dtype=tf.float32)
B

<tf.Tensor: shape=(2, 2, 2), dtype=float32, numpy=
array([[[1., 2.],
        [3., 4.]],

       [[5., 6.],
        [7., 8.]]], dtype=float32)>

In [15]:
# Transposed B
B_t = tf.transpose(B, perm=[1, 0, 2])
B_t

<tf.Tensor: shape=(2, 2, 2), dtype=float32, numpy=
array([[[1., 2.],
        [5., 6.]],

       [[3., 4.],
        [7., 8.]]], dtype=float32)>
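
Both transposes above swap the first two axes and leave the last one alone, so element `[i][j][k]` of the original ends up at `[j][i][k]`. A minimal plain-Python sketch with nested lists (an illustration only, not how either library implements it) reproduces the same result:

```python
A = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]

# Swap axes 0 and 1: A_t[j][i] is A[i][j]; the innermost axis is untouched.
A_t = [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
print(A_t)  # [[[1, 2], [5, 6]], [[3, 4], [7, 8]]]
```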

## Lp Norms

#### L1 Norm

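For a 2D vector `[x, y]`, the L1 norm is the sum of the absolute values of its components (the region `|x| + |y| ≤ 1` is its unit ball):
```
||v||_1 = |x| + |y|
```
Example, for `v = [3, 4]`:
```
L1 norm = |3| + |4| = 7
```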

#### L2 Norm

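For a 2D vector `[x, y]`, the L2 norm is the Euclidean length (the region `sqrt(x² + y²) ≤ 1` is its unit ball):
```
||v||_2 = sqrt(x² + y²)
```
Example, for `v = [3, 4]`:
```
L2 norm = sqrt(3² + 4²) = 5
```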
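
Both norms are special cases of the general Lp norm, `(sum of |x_i|^p)^(1/p)`. Here is a minimal plain-Python sketch; the function name `lp_norm` is hypothetical, and it assumes `p >= 1`.

```python
def lp_norm(v, p):
    """Lp norm of a vector: (sum of |x|^p) ** (1/p). Assumes p >= 1."""
    return sum(abs(x) ** p for x in v) ** (1 / p)


v = [3, 4]
print(lp_norm(v, 1))  # 7.0
print(lp_norm(v, 2))  # 5.0
```

With NumPy, `np.linalg.norm(v, ord=1)` and `np.linalg.norm(v, ord=2)` compute the same quantities.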

#### Frobenius Norm

The Frobenius norm measures the "size" of a matrix by adding up the squares of all its elements and then taking the square root.

$\|A\|_F = \sqrt{\sum_{i,j} A_{ij}^2}$

Example:
Consider a 2x2 matrix A:
```
A = [[1, 2], [3, 4]]
```
The Frobenius norm of this matrix would be calculated as follows:
```
||A||_F = sqrt((1)^2 + (2)^2 + (3)^2 + (4)^2)
= sqrt(1 + 4 + 9 + 16)
= sqrt(30)
```
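
The same calculation as a short plain-Python sketch (the variable names are illustrative). It treats the matrix as one long list of entries, which shows that the Frobenius norm is simply the L2 norm of the flattened matrix:

```python
import math

A = [[1, 2],
     [3, 4]]

# Sum the squares of every entry, then take the square root.
fro = math.sqrt(sum(x * x for row in A for x in row))
print(round(fro, 4))  # 5.4772
```

With NumPy, `np.linalg.norm(A, 'fro')` computes the same value.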

### Eigendecomposition

The eigendecomposition of a matrix is a factorization that expresses a square matrix A in terms of its eigenvalues and eigenvectors, as A = PDP^-1 (this requires A to have a full set of linearly independent eigenvectors). It is useful because the diagonal matrix D is easy to work with: for example, powers of A reduce to powers of the eigenvalues, since A^k = PD^kP^-1.

The basic process of eigendecomposition involves three steps:

1. Find the eigenvalues of the matrix.
2. For each eigenvalue, find the corresponding eigenvectors.
3. Assemble the eigenvectors and eigenvalues into the factorization A = PDP^-1.

Let's work through this for a concrete 2x2 matrix A.

**Step 1: Find the Eigenvalues**

We first need to solve the characteristic equation, which is det(A - λI) = 0, where A is the matrix in question, λ are the eigenvalues, I is the identity matrix, and det is the determinant of a matrix.

Our matrix A is:

```
A = [[4,1],[2,3]]
```

So, A - λI is:

```
A - λI = [[4 - λ, 1], [2, 3 - λ]]
```

Now, we take the determinant of this and set it equal to zero. The determinant of a 2x2 matrix [[a,b], [c,d]] is (a*d - b*c). So:

```
det(A - λI) = (4 - λ)*(3 - λ) - 2*1
```

Setting this equal to zero gives us a quadratic equation:

```
(4 - λ)*(3 - λ) - 2 = 0
```

which simplifies to

```
λ^2 - 7λ + 10 = 0
```

The solutions to this equation are the eigenvalues. Solving this quadratic equation (for example, by using the quadratic formula) gives us λ1 = 2, λ2 = 5.
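
For a 2x2 matrix the characteristic equation is always `λ^2 - trace(A)*λ + det(A) = 0`, so the quadratic formula gives the eigenvalues directly. A minimal sketch in plain Python, assuming the eigenvalues are real (non-negative discriminant):

```python
import math

A = [[4, 1],
     [2, 3]]

trace = A[0][0] + A[1][1]                    # 7
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # 10

# Quadratic formula applied to λ^2 - trace*λ + det = 0.
disc = trace ** 2 - 4 * det                  # 9
assert disc >= 0, "this sketch only handles real eigenvalues"
lam1 = (trace - math.sqrt(disc)) / 2
lam2 = (trace + math.sqrt(disc)) / 2
print(lam1, lam2)  # 2.0 5.0
```

For larger matrices a numerical routine such as `np.linalg.eig` is the usual choice.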

**Step 2: Find the Eigenvectors**

Now that we have the eigenvalues, we need to find the corresponding eigenvectors. To do this, we substitute each eigenvalue back into the equation (A - λI)v = 0, and solve for v.

1. For λ1 = 2, the matrix A - λI is:

```
A - λI = [[4 - 2, 1], [2, 3 - 2]] = [[2, 1], [2, 1]]
```

Setting (A - λI)v = 0 gives us a system of linear equations:
```
2*v1 + v2 = 0
2*v1 + v2 = 0
```

The two equations are identical, so any non-zero vector with v2 = -2*v1 is a solution, such as v = [1, -2]. This is the eigenvector corresponding to λ1 = 2 (eigenvectors are often normalized to unit length, but we keep it as is for simplicity).

2. For λ2 = 5, the matrix A - λI is:

```
A - λI = [[4 - 5, 1], [2, 3 - 5]] = [[-1, 1], [2, -2]]
```

Setting (A - λI)v = 0 gives us a system of linear equations:

```
-v1 + v2 = 0
2*v1 - 2*v2 = 0
```

Again, these equations are equivalent, and we can take any non-zero solution, such as v = [1, 1]. This is the eigenvector corresponding to λ2 = 5.
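
A quick way to check an eigenpair is to confirm the defining property `A v = λ v` directly. The following plain-Python sketch does that for both pairs; the helper `matvec` is illustrative.

```python
A = [[4, 1],
     [2, 3]]


def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


eigenpairs = [(2, [1, -2]), (5, [1, 1])]
checks = []
for lam, v in eigenpairs:
    lhs = matvec(A, v)          # A v
    rhs = [lam * x for x in v]  # λ v
    checks.append(lhs == rhs)
    print(lam, lhs, rhs)
# 2 [2, -4] [2, -4]
# 5 [5, 5] [5, 5]
```

Any non-zero multiple of an eigenvector passes the same check, which is why the choice of `[1, -2]` and `[1, 1]` is a matter of convenience.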

**Step 3: Formulate the Eigendecomposition**

So we have the eigenvalues and corresponding eigenvectors, which are:

```
λ1 = 2, v1 = [1, -2]
λ2 = 5, v2 = [1, 1]
```

We typically write the eigendecomposition of A as A = PDP^-1, where P is a matrix whose columns are the eigenvectors of A, D is a diagonal matrix whose entries are the eigenvalues of A, and P^-1 is the inverse of P.

In this case, we have:

```
P = [v₁, v₂] = [[1, 1], [-2, 1]]
D = [[λ₁, 0], [0, λ₂]] = [[2, 0], [0, 5]]
P^-1 = (1/3) * [[1, -1], [2, 1]] = [[1/3, -1/3], [2/3, 1/3]]
```

So the eigendecomposition of A is given by $A = PDP^{-1}$.

Conversely, multiplying out $PDP^{-1}$ recovers A.

Given:
```
P = [[1, 1], [-2, 1]]
D = [[2, 0], [0, 5]]
```
and the inverse of P, from the 2x2 inverse formula with det(P) = 3:
```
P^-1 = [[1/3, -1/3], [2/3, 1/3]]
```

Computing PDP^-1 takes two matrix multiplications.

First, we compute PD:
```
PD = P * D
= [[1, 1], [-2, 1]] * [[2, 0], [0, 5]]
= [[1*2 + 1*0, 1*0 + 1*5], [-2*2 + 1*0, -2*0 + 1*5]]
= [[2, 5], [-4, 5]]
```

Next, we compute PDP^-1:
```
PDP^-1 = PD * P^-1
= [[2, 5], [-4, 5]] * [[1/3, -1/3], [2/3, 1/3]]
= [[2*1/3 + 5*2/3, 2*(-1/3) + 5*1/3], [-4*1/3 + 5*2/3, -4*(-1/3) + 5*1/3]]
= [[4, 1], [2, 3]]
```
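
The reconstruction can also be checked mechanically. This is a minimal sketch using Python's `fractions.Fraction` so that the thirds in P^-1 stay exact; the helper `matmul` is illustrative, and the inverse uses the 2x2 formula only.

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product for nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


P = [[1, 1],
     [-2, 1]]
D = [[2, 0],
     [0, 5]]

# 2x2 inverse: (1/det) * [[d, -b], [-c, a]] for [[a, b], [c, d]].
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]  # 3
P_inv = [
    [Fraction(P[1][1], det_P), Fraction(-P[0][1], det_P)],
    [Fraction(-P[1][0], det_P), Fraction(P[0][0], det_P)],
]

PD = matmul(P, D)
A = matmul(PD, P_inv)
A_int = [[int(x) for x in row] for row in A]
print(PD)     # [[2, 5], [-4, 5]]
print(A_int)  # [[4, 1], [2, 3]]
```

With floating-point libraries the same check is usually done with a tolerance, since `1/3` cannot be represented exactly.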