<a href="https://colab.research.google.com/github/gt-cse-6040/bootcamp/blob/main/Module%201/Session%2024/s24nb0_numpy_multiply_dot_matrix_multiplication_FA25.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Numpy multiply and matrix multiplication

## When to use np.dot(), np.matmul(), np.multiply()

## Today we are covering the three functions above for NumPy: what they do, and when you should use each. np.dot() and np.matmul() perform matrix multiplication, while np.multiply() performs element-wise multiplication.

We will focus on 2D arrays in this discussion, as most of what we will be working with is 2D.

np.dot() and np.matmul() behave very differently on arrays with more than two dimensions. The `rules` below can be treated as `guidelines` for arrays that are 2-D or lower, but they are `unbreakable rules` for arrays that are 3-D or higher.

We recommend, therefore, that you apply the rules in all dimensions, so that you never run into issues.

In [None]:
import numpy as np

### np.multiply() function

Use this function to return the element-wise multiplication of two arrays.

The arrays must have the same shape (same number of rows and columns in each), or shapes that NumPy can broadcast together, such as an array and a scalar.
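As a small sketch of what "element-wise" means, the cell below multiplies two same-shaped arrays and checks that `np.multiply(A, B)` matches the `*` operator. The last lines rely on NumPy broadcasting, where a 1-D array of length 3 is applied to each row of a shape (2, 3) array. The array names and values are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[10, 20, 30],
              [40, 50, 60]])

# Element-wise: each entry of A is multiplied by the entry of B in the same position.
C = np.multiply(A, B)
print(C)

# The * operator does the same thing for NumPy arrays.
same_as_star = np.array_equal(C, A * B)
print(same_as_star)

# Broadcasting: a 1-D array of length 3 is applied to every row of A.
row = np.array([1, 0, 2])
D = np.multiply(A, row)
print(D)
```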

If you have not yet watched the Understanding Array Shapes video in the Topic 10 Module on Canvas/edX, please do so. It provides foundational material required for this discussion.

Documentation link:  https://numpy.org/doc/stable/reference/generated/numpy.multiply.html

Good detailed explanation of the multiply() function:  https://www.sharpsightlabs.com/blog/numpy-multiply/


**If you are going to be required to do an element-wise multiplication of two (or more) matrices in the class, you will be told that this is what you want to do, either directly or via a hint.**

![np_multiply.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/np_multiply.png?raw=1)

In [None]:
# let's look at an example
A = np.array([[1,2,3], [4,5,6]])
B = np.array([[1,2,3], [4,5,6]])

print("Matrix A is:\n",A)
print("Matrix B is:\n",B)

C = np.multiply(A,B)
print("Element-wise product of matrix A and B is:\n",C)

#### Questions on np.multiply()?

## Now let's look at matrix multiplication conceptually.

### Matrix multiplication returns the "matrix product" of two given arrays.

### What is the definition of the "matrix product"?

Fundamentally, it is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first and the number of columns of the second matrix.

If `a` is an *m × n* matrix and `b` is an *n × p* matrix, then the matrix product

**c = ab**

is defined to be the *m × p* matrix such that:

![case_1_summation.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/case_1_summation.png?raw=1)

Each entry of the resulting matrix `c` is a dot product: the entry in row *i* and column *j* is the dot product of the *i*th row of `a` and the *j*th column of `b`.

The matrix itself is the **matrix product**, sometimes informally called the dot product matrix.
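To make the row-times-column idea concrete, here is a minimal sketch that computes one entry of the product by hand and compares it with the full matrix product. The matrices and the indices `i` and `j` are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3): m = 2, n = 3
b = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2): n = 3, p = 2

c = a @ b                        # full matrix product, shape (2, 2)

i, j = 1, 0
# Entry c[i, j] is the dot product of row i of a with column j of b.
entry = np.sum(a[i, :] * b[:, j])
print(entry, c[i, j])
```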

**`Why is dot product useful?`**

The dot product formula says: take two vectors of equal length (each has *n* entries), multiply each term in vector A by the corresponding term in vector B, and then add all *n* products together.

The dot product is the basis of matrix multiplication. If you understand the dot product well, it becomes easier to understand the intent behind the way two matrices are multiplied together.
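The "multiply corresponding terms, then add" recipe can be sketched in a few lines. The vectors here are illustrative values, and the hand-written sum is only there to show what `np.dot` computes for 1-D arrays.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

# Multiply corresponding terms, then add the products together.
by_hand = sum(x * y for x, y in zip(A, B))

# np.dot performs the same calculation for 1-D arrays.
with_numpy = np.dot(A, B)
print(by_hand, with_numpy)
```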

![dot_useful.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/dot_useful.png?raw=1)

**`Matrix Multiplication`**

As we noted above in our definition, for matrix multiplication to work, the columns of the second matrix must have the same number of entries as do the rows of the first matrix.

![matrix_mult.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/matrix_mult.png?raw=1)

**There are just 3 simple rules to help you remember how to do the calculations.**

`1. Rows come first, so the first matrix provides the number of rows.`


`2. Columns come second, so the second matrix provides the number of columns.`


`3. The 'inner' values must be equal to each other.`


`Matrix multiplication is just a way of organizing vectors we want to find the dot product of.`
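The three rules above can be sketched as a small shape check. `product_shape` is a hypothetical helper written for this illustration (it is not a NumPy function), and it only handles the 2-D case.

```python
import numpy as np

def product_shape(left_shape, right_shape):
    """Return the shape of the 2-D matrix product, or raise if the inner values differ."""
    left_rows, left_cols = left_shape
    right_rows, right_cols = right_shape
    # Rule 3: the inner values must be equal.
    if left_cols != right_rows:
        raise ValueError(f"inner values differ: {left_cols} != {right_rows}")
    # Rules 1 and 2: rows from the first matrix, columns from the second.
    return (left_rows, right_cols)

a = np.ones((2, 3))
b = np.ones((3, 4))

predicted = product_shape(a.shape, b.shape)
actual = np.matmul(a, b).shape
print(predicted, actual)
```

Calling `product_shape((2, 3), (2, 3))` raises a `ValueError`, which mirrors what happens when the inner values do not match.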

**`row x column vs. column x row`**

![row_x_column.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/row_x_column.png?raw=1)

In the picture above, the left matrix has shape (1, 3), while the right matrix has shape (3, 1).

By the first rule, the left matrix provides the rows, so there will be 1 row in the result.

By the second rule, the right matrix provides the columns, so there will be 1 column in the result.

The inner values are both 3, so they equal each other.

**Note that in the below examples we are using np.dot(), for simplicity.**

In [None]:
a = np.array([[2,3,4]])
b = np.array([[6],
              [4],
              [3]])
print("array shapes")
display(a.shape)
display(b.shape)

c = np.dot(a,b)
print("shape of c")
display(c.shape)

display(c)

![column_x_row.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/column_x_row.png?raw=1)

In the picture above, the left matrix has shape (3, 1), while the right matrix has shape (1, 3).

By the first rule, the left matrix provides the rows, so there will be 3 rows in the result.

By the second rule, the right matrix provides the columns, so there will be 3 columns in the result.

The inner values are both 1, so they equal each other.

In [None]:
a = np.array([[2],
              [3],
              [4]])
b = np.array([[6,4,3]])

display(a.shape)
display(b.shape)

c = np.dot(a,b)
display(c.shape)
c

**`Two ways of looking at a matrix multiplication`**

![two_ways_1.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/two_ways_1.png?raw=1)

![two_ways_2.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/two_ways_2.png?raw=1)

**`So what do you do when the two array shapes do not correspond to the rules?`**

Remember the transpose function, from the video on array shapes?

#### NumPy's transpose() function (also available as the `.T` attribute) reverses the order of the dimensions of the given array. For a 2-D array, rows become columns and columns become rows.

In [None]:
a = np.array([[2,3,4]])
b = np.array([[6,4,3]])

display(a.shape)
display(b.shape)

# c = np.dot(a,b)  # errors out: shapes (1, 3) and (1, 3) do not satisfy the inner-value rule

In [None]:
display(a.T.shape)
display(b.shape)

c = np.dot(a.T,b)  # this is good, as we are transposing a
display(c.shape)
c

#### As a final note, the Wikipedia page for matrix multiplication is a decent starting reference.

https://en.wikipedia.org/wiki/Matrix_multiplication

### There are two functions in numpy to do matrix multiplication (and therefore return a dot product):

#### 1. np.matmul() (also using `@` notation)

#### 2. np.dot().

Both functions return the same result for 1-D and 2-D arrays, but which function to use depends on the shapes of the two arguments, including whether either one is a scalar.

**`So why not just use loops?`**

1. Loops are slow in real life, when you have millions of rows.

2. Matrix multiplication functions do the arithmetic in optimized, compiled code in the back end (and some of the fastest frameworks are built with C++ to speed them up even further).

Here is what such a loop would look like:

![def_dot.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/def_dot.png?raw=1)

It should be very easy to see that this is a very inefficient way of computing the dot product.
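A loop version can be sketched as follows. This is not necessarily the exact code shown in the image above: `loop_dot` is a hypothetical name, and the sketch assumes two 1-D sequences of equal length.

```python
import numpy as np

def loop_dot(x, y):
    """Dot product with an explicit Python loop (illustrative only)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0
    for k in range(len(x)):
        total += x[k] * y[k]
    return total

x = np.arange(1, 6)      # [1, 2, 3, 4, 5]
y = np.arange(6, 11)     # [6, 7, 8, 9, 10]

looped = loop_dot(x, y)
vectorized = np.dot(x, y)
print(looped, vectorized)
```

Both calls give the same number. On large arrays the vectorized call is typically much faster, and `%timeit` in a notebook is one way to measure the difference yourself.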

### What are your questions on the matrix multiplication (in general)?

## Now let's look at when to use np.dot() and np.matmul() (AKA @).

### Here are the rules for when to use each:

Please hold your questions on these until we have a chance to go through the examples below, then ask in chat.

#### Use np.matmul() (or `@`) in the following scenario:

1. **If both `a` and `b` are 2-D arrays**, it is matrix multiplication, and use matmul or a @ b.

2. **If either `a` or `b` is a 3-D (or higher) array and the other is a 2-D (or higher) array** you **MUST USE** matmul or a @ b.

#### Use np.dot() (or the `a.dot(b)` method) in the following scenarios:

1. **If both `a` and `b` are 1-D arrays**, it is inner product of vectors (without complex conjugation).

2. **If `a` is an N-D array and `b` is a 1-D array**, it is a sum product over the last axis of a and b.

#### Finally, use np.multiply() (or `*`) in the following scenario:

1. **If either `a` or `b` is 0-D (scalar)**, it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.

One last note:  When working with nothing higher than 2-D arrays, np.dot() will generally work in all of the above scenarios. This is NOT PREFERRED, and your code may be less efficient and less clear about which operation you intend, but you will get the correct answer to the exercise when your arrays are 2-D or lower.

Additionally, in the case that both are 1-D arrays, np.matmul() will return the same as np.dot(). Again, not preferred, but this will return the correct result.

**From the documentation:**

np.matmul() differs from np.dot() in two important ways:

1. Multiplication by scalars is not allowed in np.matmul().

2. Stacks of matrices are broadcast together as if the matrices were elements.

The last point makes it clear that np.dot() and np.matmul() methods behave differently when passed 3D (or higher dimensional) arrays.
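The two differences above can be checked with a short sketch. The array values and the names `stack` and `single` are illustrative. The `except` clause catches both `ValueError` and `TypeError` so that the example does not depend on exactly which exception a given NumPy version raises.

```python
import numpy as np

a = np.array([3, 4, 5])

# 1. np.matmul rejects scalar operands, while np.dot accepts them.
try:
    np.matmul(a, 5)
    matmul_accepts_scalar = True
except (ValueError, TypeError):
    matmul_accepts_scalar = False
print(matmul_accepts_scalar)
print(np.dot(a, 5))

# 2. A stack of two 2x3 matrices times a single 3x4 matrix:
#    matmul multiplies each matrix in the stack by the single matrix.
stack = np.ones((2, 2, 3))
single = np.ones((3, 4))
stacked_result = np.matmul(stack, single)
print(stacked_result.shape)
```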

Documentation link for np.matmul():  https://numpy.org/doc/stable/reference/generated/numpy.matmul.html

Documentation link for np.dot():  https://numpy.org/doc/stable/reference/generated/numpy.dot.html

### And as we noted above, our guidance is to learn and apply the rules in all cases, so that you are consistent, and you never run into problems in your exercises.

**The syntax for the two functions is similar:**

`numpy.matmul(a, b, out=None)`

`a @ b`

`numpy.dot(a, b, out=None)`

In the parameters:

`a` is the left array.

`b` is the right array.

`out` is an optional ndarray in which to store the result. It is rarely used.

### Both `a` and `b` are 2-D arrays:

![both%202%20D%20arrays.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/both%202%20D%20arrays.png?raw=1)

![case_2_summation.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/case_2_summation.png?raw=1)

In [None]:
a = np.array([[3,4,5],
              [6,7,8]])
b = np.array([[10,11],
             [12,13],
             [14,15]])

display(a.shape)
display(b.shape)

In [None]:
np.matmul(a,b)

In [None]:
a@b

In [None]:
np.dot(a,b)

As we can see, both functions return the same result for 2-D arrays.

### Both `a` and `b` are 1-D arrays:

![both%201%20D%20arrays.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/both%201%20D%20arrays.png?raw=1)

In [None]:
a = np.array([3,4,5])
b = np.array([7,8,9])

display(a.shape)
display(b.shape)

In [None]:
np.dot(a,b)

In [None]:
np.matmul(a,b)

Again, we see in this case that both functions return the same result.

### `a` is an N-D array and `b` is a 1-D array.

![A%201D%20B%202D%20array.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%201/Session%206/A%201D%20B%202D%20array.png?raw=1)

In [None]:
a = np.array([[1,2,3],[1,2,1]])
# b is a Python list; NumPy converts it to a 1-D array automatically
b = [2,3,4]

display(a.shape)
# display(b.shape)  # errors out, why?
# display(len(b))

In [None]:
np.dot(a,b)

In [None]:
np.matmul(a,b)

In [None]:
a @ b

### Either `a` or `b` is scalar.

#### From the rules above:

**If either a or b is 0-D (scalar)**, it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.

We prefer using np.multiply() in this case, although np.dot() will also work.

In [None]:
a = np.array([3,4,5])
b = 5

display(a.shape)
#  display(b.shape) # error, because a Python int has no .shape attribute

In [None]:
np.multiply(a,b)

In [None]:
a * b

In [None]:
np.dot(a,b)

In [None]:
# a @ b  # errors out, why?

### Finally, let's look at a pair of 3-D arrays, and why understanding and applying the above rules are important.

With 3-D arrays, the best way to visualize them is as **CUBES.**

In [None]:
# create two 3x3x3 numpy arrays
a = np.random.randint(0,10,(3,3,3))
b = np.random.randint(0,10,(3,3,3))
print('Shape of a')
display(a.shape)
print('Shape of b')
display(b.shape)

In [None]:
display(a)
display(b)

In [None]:
# So now perform matrix multiplication on them.
c = np.dot(a,b)
display(c.shape)

In [None]:
d = np.matmul(a,b)
display(d.shape)

In [None]:
e = a @ b
display(e.shape)

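We can see that np.dot() returns a 4-D result of shape (3, 3, 3, 3), because it combines every matrix in `a` with every matrix in `b`. np.matmul() (or a@b) instead multiplies the matrices pairwise and returns a 3-D result of shape (3, 3, 3), which is the result we want here.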

So this is why understanding and correctly applying the rules above is so important.
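The difference in shapes can be unpacked with a short sketch. It assumes a seeded random generator so the run is repeatable (the seed and variable names are arbitrary), and it checks what each function computes: `np.matmul` pairs up the matrices in the two stacks, while `np.dot` multiplies every matrix in `a` by every matrix in `b`.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded so the run is repeatable
a = rng.integers(0, 10, size=(3, 3, 3))
b = rng.integers(0, 10, size=(3, 3, 3))

c = np.dot(a, b)       # shape (3, 3, 3, 3)
d = np.matmul(a, b)    # shape (3, 3, 3)

# matmul treats each array as a stack of 3 matrices and pairs them up: d[i] = a[i] @ b[i].
pairs_match = all(np.array_equal(d[i], a[i] @ b[i]) for i in range(3))

# dot multiplies every matrix in a by every matrix in b: c[i, :, k, :] = a[i] @ b[k].
all_pairs_match = all(
    np.array_equal(c[i, :, k, :], a[i] @ b[k])
    for i in range(3) for k in range(3)
)
print(c.shape, d.shape, pairs_match, all_pairs_match)
```

In other words, the `np.matmul` result is contained in the `np.dot` result (where the two stack indices are equal), but `np.dot` also computes all of the cross terms, which is rarely what an exercise asks for.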

Here is a good (simple) article with an explanation:  https://www.geeksforgeeks.org/numpy-3d-matrix-multiplication/

## What are your questions?