
# A Quick Introduction to Numerical Data Manipulation with Python and NumPy

## What is NumPy?

[NumPy](https://numpy.org/doc/stable/index.html) stands for Numerical Python. It's the backbone of scientific and numerical computing in Python.


## Why NumPy?

You can do numerical calculations using pure Python. In the beginning, you might think Python is fast, but once your data gets large, you'll start to notice slowdowns.

One of the main reasons to use NumPy is that it's fast. Behind the scenes, the code has been optimized to run in C, another programming language that can do these calculations much faster than Python.

The benefit of this being behind the scenes is you don't need to know any C to take advantage of it. You can write your numerical computations in Python using NumPy and get the added speed benefits.

If you're curious about what causes this speed benefit, it's a process called vectorization. [Vectorization](https://en.wikipedia.org/wiki/Vectorization) performs a calculation on a whole array at once instead of looping over the elements in Python, because Python-level loops are a common bottleneck.

NumPy extends vectorized operations to arrays of different shapes through a process called [broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html#module-numpy.doc.broadcasting).
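As a rough sketch of what vectorization looks like in practice, the snippet below squares every element of an array twice: once with an explicit Python loop and once with a single NumPy expression. The variable names are illustrative only.

```python
import numpy as np

values = np.arange(5)  # array([0, 1, 2, 3, 4])

# Loop version: one Python-level operation per element
squared_loop = np.array([v ** 2 for v in values])

# Vectorized version: a single expression, the looping happens in compiled code
squared_vectorized = values ** 2

print(squared_loop)        # [ 0  1  4  9 16]
print(squared_vectorized)  # [ 0  1  4  9 16]
```

Both versions give the same result; the vectorized one is shorter and usually much faster on large arrays.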

## 0. Importing NumPy

In [1]:
import numpy as np

# Check the version
print(np.__version__)

1.25.2


## 1. DataTypes and attributes


In [2]:
# 1-dimensional array, also referred to as a vector
a1 = np.array([1, 2, 3])

# 2-dimensional array, also referred to as matrix
a2 = np.array([[1, 2.0, 3.3],
               [4, 5, 6.5]])

# 3-dimensional array, loosely also referred to as a matrix (here: two 3x3 matrices stacked together)
a3 = np.array([[[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]],
                [[10, 11, 12],
                 [13, 14, 15],
                 [16, 17, 18]]])

In [3]:
a1.shape, a1.ndim, a1.dtype, a1.size, type(a1)

((3,), 1, dtype('int64'), 3, numpy.ndarray)

In [4]:
a2.shape, a2.ndim, a2.dtype, a2.size, type(a2)

((2, 3), 2, dtype('float64'), 6, numpy.ndarray)

In [5]:
a3.shape, a3.ndim, a3.dtype, a3.size, type(a3)

((2, 3, 3), 3, dtype('int64'), 18, numpy.ndarray)

### Anatomy of an array

Key terms:
* **Array** - A list of numbers, can be multi-dimensional.
* **Scalar** - A single number (e.g. `7`).
* **Vector** - A list of numbers with 1-dimension (e.g. `np.array([1, 2, 3])`).
* **Matrix** - A (usually) multi-dimensional list of numbers (e.g. `np.array([[1, 2, 3], [4, 5, 6]])`).

a2

## 2. Creating arrays

* `np.array()`
* `np.ones()`
* `np.zeros()`
* `np.random.rand(5, 3)`
* `np.random.randint(10, size=5)`
* `np.random.seed()` - pseudo random numbers

In [8]:
# Create a simple array
simple_array = np.array([1, 2, 3])
simple_array

array([1, 2, 3])

In [9]:
simple_array = np.array((1, 2, 3))
simple_array, simple_array.dtype

(array([1, 2, 3]), dtype('int64'))

In [10]:
# Create an array of ones
ones = np.ones((10, 2))
ones

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

In [11]:
# The default datatype is 'float64'
ones.dtype

dtype('float64')

In [12]:
# You can change the datatype with .astype()
ones.astype(int)

array([[1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1]])

In [13]:
# Create an array of zeros
zeros = np.zeros((5, 3, 3))
zeros

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

In [14]:
zeros.dtype

dtype('float64')

In [15]:
# Create an array within a range of values
range_array = np.arange(0, 10, 2)
range_array

array([0, 2, 4, 6, 8])

In [16]:
# Random array
random_array = np.random.randint(10, size=(5, 3))
random_array

array([[8, 0, 4],
       [7, 7, 5],
       [2, 7, 2],
       [0, 0, 2],
       [1, 0, 7]])

In [17]:
# Random array of floats (between 0 & 1)
np.random.random((5, 3))

array([[0.77289001, 0.55910452, 0.77505403],
       [0.96695001, 0.53099171, 0.55587505],
       [0.76253041, 0.00344647, 0.43558407],
       [0.82501644, 0.57667688, 0.93152114],
       [0.01914063, 0.13681395, 0.15061463]])

In [18]:
np.random.random((5, 3))

array([[0.00113812, 0.52734613, 0.59970252],
       [0.60831144, 0.16156476, 0.77385713],
       [0.37394521, 0.67674385, 0.92764362],
       [0.78728415, 0.13483956, 0.72600094],
       [0.04392253, 0.56320572, 0.03263898]])

In [19]:
# Random 5x3 array of floats (between 0 & 1), similar to above
np.random.rand(5, 3)

array([[0.02991605, 0.24750135, 0.7125273 ],
       [0.69335338, 0.24564803, 0.21402424],
       [0.67979309, 0.69611799, 0.11017338],
       [0.77582959, 0.17772384, 0.80107197],
       [0.26507493, 0.56516986, 0.9495781 ]])

In [21]:
np.random.rand(5, 3)

array([[0.62735075, 0.08519877, 0.69435221],
       [0.88717293, 0.35061696, 0.95013075],
       [0.61963845, 0.42265947, 0.13842524],
       [0.44482289, 0.11697494, 0.79559523],
       [0.02817484, 0.10305526, 0.30682099]])

### NumPy seed

NumPy uses pseudo-random numbers, which means the numbers look random but aren't really: they're determined by a starting value called the seed.

To get the same "random" numbers every time your code runs, you can set the seed with [`np.random.seed()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.seed.html).


In [24]:
# Set random seed to 0
np.random.seed(0)

# Make 'random' numbers
np.random.randint(10, size=(5, 3))

array([[5, 0, 3],
       [3, 7, 9],
       [3, 5, 2],
       [4, 7, 6],
       [8, 8, 1]])

In [27]:
# Make more random numbers
np.random.randint(10, size=(5, 3))

array([[6, 7, 7],
       [8, 1, 5],
       [9, 8, 9],
       [4, 3, 0],
       [3, 5, 0]])

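Because the first cell calls `np.random.seed(0)`, its output is the same every time it runs, and the same as any other cell that sets the seed to 0 before making the same call. The second cell did not reset the seed, so it continued the sequence and returned different numbers.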

Setting `np.random.seed()` is not 100% necessary but it's helpful to keep numbers the same throughout your experiments.

For example, say you wanted to split your data randomly into training and test sets.

Every time you randomly split, you might get different rows in each set.

If you shared your work with someone else, they'd get different rows in each set too.

Setting `np.random.seed()` ensures there's still randomness; it just makes the randomness repeatable. Hence the 'pseudo-random' numbers.
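A minimal sketch of this repeatability is shown below. The seed value `42` and the variable names are arbitrary choices for illustration. The last part uses `np.random.default_rng()`, NumPy's newer generator interface, as an alternative way to get the same effect without touching global state.

```python
import numpy as np

np.random.seed(42)
first = np.random.randint(10, size=5)

np.random.seed(42)  # resetting the seed restarts the sequence
second = np.random.randint(10, size=5)

third = np.random.randint(10, size=5)  # no reset: the sequence simply continues

print(np.array_equal(first, second))  # True

# Alternative: independent generator objects seeded with the same value
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(np.array_equal(rng_a.integers(10, size=5), rng_b.integers(10, size=5)))  # True
```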

In [28]:
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randint(10, size=(5, 3)))
df

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 5 | 0 | 3 |
| 1 | 3 | 7 | 9 |
| 2 | 3 | 5 | 2 |
| 3 | 4 | 7 | 6 |
| 4 | 8 | 8 | 1 |


## 3. Viewing arrays and matrices (indexing)

Remember, because arrays and matrices are both `ndarray`s, they can be viewed in similar ways.

Let's check out our 3 arrays again.

In [29]:
a1

array([1, 2, 3])

In [30]:
a2

array([[1. , 2. , 3.3],
       [4. , 5. , 6.5]])

In [31]:
a3

array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])

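Array shapes are listed from the outermost dimension to the innermost. For a 2-dimensional array this is `(row, column)`; for higher-dimensional arrays the extra dimensions come first, as in `(n, row, column)`. For example, `a3` has shape `(2, 3, 3)`: two arrays of 3 rows and 3 columns each.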

In [32]:
a1[0]

1

In [33]:
a2[0]

array([1. , 2. , 3.3])

In [34]:
a3[0]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [35]:
# Get 2nd row (index 1) of a2
a2[1]

array([4. , 5. , 6.5])

In [36]:
# Get the first 2 values of the first 2 rows of both 3x3 arrays inside a3
a3[:2, :2, :2]

array([[[ 1,  2],
        [ 4,  5]],

       [[10, 11],
        [13, 14]]])

In [37]:
a4 = np.random.randint(100, size=(2, 3, 4, 5))
a4

array([[[[39, 87, 46, 88, 81],
         [37, 25, 77, 72,  9],
         [20, 80, 69, 79, 47],
         [64, 82, 99, 88, 49]],

        [[29, 19, 19, 14, 39],
         [32, 65,  9, 57, 32],
         [31, 74, 23, 35, 75],
         [55, 28, 34,  0,  0]],

        [[36, 53,  5, 38, 17],
         [79,  4, 42, 58, 31],
         [ 1, 65, 41, 57, 35],
         [11, 46, 82, 91,  0]]],


       [[[14, 99, 53, 12, 42],
         [84, 75, 68,  6, 68],
         [47,  3, 76, 52, 78],
         [15, 20, 99, 58, 23]],

        [[79, 13, 85, 48, 49],
         [69, 41, 35, 64, 95],
         [69, 94,  0, 50, 36],
         [34, 48, 93,  3, 98]],

        [[42, 77, 21, 73,  0],
         [10, 43, 58, 23, 59],
         [ 2, 98, 62, 35, 94],
         [67, 82, 46, 99, 20]]]])

In [38]:
a4.shape

(2, 3, 4, 5)

In [39]:
# Get only the first 4 numbers of each single vector
a4[:, :, :, :4]

array([[[[39, 87, 46, 88],
         [37, 25, 77, 72],
         [20, 80, 69, 79],
         [64, 82, 99, 88]],

        [[29, 19, 19, 14],
         [32, 65,  9, 57],
         [31, 74, 23, 35],
         [55, 28, 34,  0]],

        [[36, 53,  5, 38],
         [79,  4, 42, 58],
         [ 1, 65, 41, 57],
         [11, 46, 82, 91]]],


       [[[14, 99, 53, 12],
         [84, 75, 68,  6],
         [47,  3, 76, 52],
         [15, 20, 99, 58]],

        [[79, 13, 85, 48],
         [69, 41, 35, 64],
         [69, 94,  0, 50],
         [34, 48, 93,  3]],

        [[42, 77, 21, 73],
         [10, 43, 58, 23],
         [ 2, 98, 62, 35],
         [67, 82, 46, 99]]]])

`a4`'s shape is `(2, 3, 4, 5)`, which means it gets displayed like so:
* Innermost array = size 5
* Next array = size 4
* Next array = size 3
* Outermost array = size 2
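One way to check this nesting is to index into the array one level at a time and watch the shape shrink from the left. The sketch below uses a zero-filled stand-in array (named `a` here purely for illustration) with the same shape as `a4`.

```python
import numpy as np

a = np.zeros((2, 3, 4, 5))  # same shape as a4, values don't matter here

print(a.shape)           # (2, 3, 4, 5)
print(a[0].shape)        # (3, 4, 5)  one step inside the outermost array
print(a[0, 0].shape)     # (4, 5)
print(a[0, 0, 0].shape)  # (5,)       an innermost vector of size 5
print(len(a))            # 2          len() reports the outermost size
```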

## 4. Manipulating and comparing arrays
* Arithmetic
    * `+`, `-`, `*`, `/`, `//`, `**`, `%`
    * `np.exp()`
    * `np.log()`
    * [Dot product](https://www.mathsisfun.com/algebra/matrix-multiplying.html) - `np.dot()`
    * Broadcasting
* Aggregation
    * `np.sum()` - faster than Python's built-in `sum()` for NumPy arrays
    * `np.mean()`
    * `np.std()`
    * `np.var()`
    * `np.min()`
    * `np.max()`
    * `np.argmin()` - find index of minimum value
    * `np.argmax()` - find index of maximum value
    * These work on all `ndarray`s
        * `a4.min(axis=0)` -- you can use axis as well
* Reshaping
    * `np.reshape()`
* Transposing
    * `a3.T`
* Comparison operators
    * `>`
    * `<`
    * `<=`
    * `>=`
    * `x != 3`
    * `x == 3`
    * `np.sum(x > 3)`

### Arithmetic

In [41]:
a1

array([1, 2, 3])

In [42]:
ones = np.ones(3)
ones

array([1., 1., 1.])

In [43]:
# Add two arrays
a1 + ones

array([2., 3., 4.])

In [44]:
# Subtract two arrays
a1 - ones

array([0., 1., 2.])

In [45]:
# Multiply two arrays
a1 * ones

array([1., 2., 3.])

In [46]:
# Multiply arrays of different shapes (a1 is broadcast across each row of a2)
a1 * a2

array([[ 1. ,  4. ,  9.9],
       [ 4. , 10. , 19.5]])

In [47]:
a1.shape, a2.shape

((3,), (2, 3))

In [48]:
# This will error because the shapes (2, 3) and (2, 3, 3) can't be broadcast together:
# (2, 3) is padded to (1, 2, 3), and the middle dimensions (2 vs. 3) don't match
a2 * a3

ValueError: operands could not be broadcast together with shapes (2,3) (2,3,3) 

In [49]:
a3

array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])

### Broadcasting

- What is broadcasting?
    - Broadcasting is a feature of NumPy which performs an operation across multiple dimensions of data without replicating the data. This saves time and space. For example, if you have a 3x3 array (A) and want to add a 1x3 array (B), NumPy will add the row of (B) to every row of (A).

- Rules of Broadcasting
    1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
    2. If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
    3. If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
    
    
**The broadcasting rule:**
To broadcast, the shapes of the two arrays are compared axis by axis, starting from the trailing (rightmost) axis. Each pair of sizes must either be equal or one of them must be 1.
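The rules above can be tried out without creating any data. The sketch below uses `np.broadcast_shapes()` (available in NumPy 1.20 and later) to ask what shape a broadcast would produce; the `compatible` flag is just an illustrative variable.

```python
import numpy as np

# Compatible: (3,) is padded to (1, 3), then stretched to (2, 3)
print(np.broadcast_shapes((3,), (2, 3)))          # (2, 3)

# Compatible: (2, 3, 1) and (2, 3, 3) differ only where one size is 1
print(np.broadcast_shapes((2, 3, 1), (2, 3, 3)))  # (2, 3, 3)

# Incompatible: (2, 3) is padded to (1, 2, 3); the middle axis is 2 vs. 3
try:
    np.broadcast_shapes((2, 3), (2, 3, 3))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```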

In [50]:
a1

array([1, 2, 3])

In [51]:
a1.shape

(3,)

In [52]:
a2.shape

(2, 3)

In [53]:
a2

array([[1. , 2. , 3.3],
       [4. , 5. , 6.5]])

In [54]:
a1 + a2

array([[2. , 4. , 6.3],
       [5. , 7. , 9.5]])

In [55]:
a2 + 2

array([[3. , 4. , 5.3],
       [6. , 7. , 8.5]])

In [56]:
# Raises an error because there's a shape mismatch (2, 3) vs. (2, 3, 3)
a2 + a3

ValueError: operands could not be broadcast together with shapes (2,3) (2,3,3) 

In [57]:
# Divide two arrays
a1 / ones

array([1., 2., 3.])

In [58]:
# Divide using floor division
a2 // a1

array([[1., 1., 1.],
       [4., 2., 2.]])

In [59]:
# Take an array to a power
a1 ** 2

array([1, 4, 9])

In [60]:
# You can also use np.square()
np.square(a1)

array([1, 4, 9])

In [61]:
# Modulus divide (what's the remainder)
a1 % 2

array([1, 0, 1])

You can also find the log or exponential of an array using `np.log()` and `np.exp()`.

In [62]:
# Find the log of an array
np.log(a1)

array([0.        , 0.69314718, 1.09861229])

In [63]:
# Find the exponential of an array
np.exp(a1)

array([ 2.71828183,  7.3890561 , 20.08553692])

### Aggregation

Aggregation means reducing many values to a single summary value (or to one value per row or column), such as a sum, mean or maximum.

In [64]:
sum(a1)

6

In [65]:
np.sum(a1)

6

**Tip:** Use NumPy's `np.sum()` on NumPy arrays and Python's `sum()` on Python `list`s.

In [66]:
massive_array = np.random.random(100000)
massive_array.size, type(massive_array)

(100000, numpy.ndarray)

In [67]:
%timeit sum(massive_array) # Python sum()
%timeit np.sum(massive_array) # NumPy np.sum()

11.6 ms ± 3.05 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
39.9 µs ± 909 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


Notice `np.sum()` is faster on the NumPy array (`numpy.ndarray`) than Python's `sum()`.

Now let's try it out on a Python list.

In [68]:
import random
massive_list = [random.randint(0, 10) for i in range(100000)]
len(massive_list), type(massive_list)

(100000, list)

In [69]:
massive_list[:10]

[5, 2, 0, 8, 0, 2, 3, 3, 0, 8]

In [70]:
%timeit sum(massive_list)
%timeit np.sum(massive_list)

734 µs ± 14.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
6.93 ms ± 1.94 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


NumPy's `np.sum()` is still fast, but Python's `sum()` is faster on Python `list`s, because `np.sum()` first has to convert the list into an array.

In [71]:
a2

array([[1. , 2. , 3.3],
       [4. , 5. , 6.5]])

In [72]:
# Find the mean
np.mean(a2)

3.6333333333333333

In [73]:
# Find the max
np.max(a2)

6.5

In [74]:
# Find the min
np.min(a2)

1.0

In [75]:
# Find the standard deviation
np.std(a2)

1.8226964152656422

In [76]:
# Find the variance
np.var(a2)

3.3222222222222224

In [77]:
# The standard deviation is the square root of the variance
np.sqrt(np.var(a2))

1.8226964152656422

In [78]:
# The unique values of an array
np.unique(a2)

array([1. , 2. , 3.3, 4. , 5. , 6.5])

**What's mean?**

Mean is the same as average. You can find the average of a set of numbers by adding them up and dividing them by how many there are.

**What's standard deviation?**

[Standard deviation](https://www.mathsisfun.com/data/standard-deviation.html) is a measure of how spread out numbers are.

**What's variance?**

The [variance](https://www.mathsisfun.com/data/standard-deviation.html) is the average of the squared differences from the mean.
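To make these definitions concrete, here is a small sketch that computes the mean, variance and standard deviation by hand and compares them with NumPy's functions. It assumes the default behaviour of `np.var()` and `np.std()`, which divide by the number of elements (`ddof=0`). The array `data` is an arbitrary example.

```python
import numpy as np

data = np.array([2, 4, 6, 8, 10])

mean = data.sum() / data.size                       # 6.0
variance = ((data - mean) ** 2).sum() / data.size   # average squared difference from the mean: 8.0
std = np.sqrt(variance)                             # square root of the variance

print(mean, variance)                       # 6.0 8.0
print(np.isclose(mean, np.mean(data)))      # True
print(np.isclose(variance, np.var(data)))   # True
print(np.isclose(std, np.std(data)))        # True
```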

In [79]:
# Demo of variance
high_var_array = np.array([1, 100, 200, 300, 4000, 5000])
low_var_array = np.array([2, 4, 6, 8, 10])

np.var(high_var_array), np.var(low_var_array)

(4296133.472222221, 8.0)

In [80]:
np.std(high_var_array), np.std(low_var_array)

(2072.711623024829, 2.8284271247461903)

In [81]:
# The standard deviation is the square root of the variance
np.sqrt(np.var(high_var_array))

2072.711623024829

In [82]:
np.mean(low_var_array), np.mean(high_var_array)

(6.0, 1600.1666666666667)

### Reshaping

In [83]:
a2

array([[1. , 2. , 3.3],
       [4. , 5. , 6.5]])

In [84]:
a2.shape

(2, 3)

In [85]:
a2 + a3

ValueError: operands could not be broadcast together with shapes (2,3) (2,3,3) 

In [86]:
a2.reshape(2, 3, 1)

array([[[1. ],
        [2. ],
        [3.3]],

       [[4. ],
        [5. ],
        [6.5]]])

In [87]:
a2.reshape(2, 3, 1) + a3

array([[[ 2. ,  3. ,  4. ],
        [ 6. ,  7. ,  8. ],
        [10.3, 11.3, 12.3]],

       [[14. , 15. , 16. ],
        [18. , 19. , 20. ],
        [22.5, 23.5, 24.5]]])
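As a short sketch of how reshaping behaves more generally: the total number of elements must stay the same, and `-1` can stand in for one dimension that NumPy should work out itself. The arrays below are illustrative and separate from `a2` and `a3`.

```python
import numpy as np

a = np.arange(6)        # array([0, 1, 2, 3, 4, 5])
b = a.reshape(2, 3)     # 2 rows, 3 columns
c = a.reshape(3, -1)    # -1 lets NumPy infer the missing size (2 here)
d = b.reshape(2, 3, 1)  # add a trailing axis of size 1, useful for broadcasting

print(b.shape, c.shape, d.shape)  # (2, 3) (3, 2) (2, 3, 1)

# Reshaping to a shape with a different number of elements raises an error
try:
    a.reshape(4, 2)     # 8 slots for 6 elements
except ValueError as error:
    print("reshape failed:", error)
```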

### Transpose

A transpose reverses the order of the axes.

For example, an array with shape `(2, 3)` becomes `(3, 2)`.

In [88]:
a2.shape

(2, 3)

In [89]:
a2.T

array([[1. , 4. ],
       [2. , 5. ],
       [3.3, 6.5]])

In [90]:
a2.transpose()

array([[1. , 4. ],
       [2. , 5. ],
       [3.3, 6.5]])

In [91]:
a2.T.shape

(3, 2)

For arrays with more dimensions, the default transpose still reverses the order of all the axes. For a 3-dimensional array, this is the same as swapping the first and last axes.

For example, `(5, 3, 3)` -> `(3, 3, 5)`.

In [92]:
matrix = np.random.random(size=(5, 3, 3))
matrix

array([[[0.93403976, 0.44815431, 0.65217734],
        [0.17761123, 0.30906708, 0.98105221],
        [0.6965136 , 0.97409949, 0.46277196]],

       [[0.37507357, 0.83906482, 0.56186619],
        [0.51873391, 0.76998937, 0.66149819],
        [0.57913005, 0.5433098 , 0.76886136]],

       [[0.97085988, 0.30496649, 0.67530963],
        [0.22337143, 0.74435904, 0.48892904],
        [0.38805839, 0.88669099, 0.73350055]],

       [[0.53682108, 0.86580175, 0.14591431],
        [0.89764846, 0.87863886, 0.76680462],
        [0.39979513, 0.24851335, 0.03043538]],

       [[0.77255815, 0.73953937, 0.54434284],
        [0.82291795, 0.97261967, 0.9903615 ],
        [0.6358931 , 0.15307913, 0.31523468]]])

In [93]:
matrix.shape

(5, 3, 3)

In [94]:
matrix.T

array([[[0.93403976, 0.37507357, 0.97085988, 0.53682108, 0.77255815],
        [0.17761123, 0.51873391, 0.22337143, 0.89764846, 0.82291795],
        [0.6965136 , 0.57913005, 0.38805839, 0.39979513, 0.6358931 ]],

       [[0.44815431, 0.83906482, 0.30496649, 0.86580175, 0.73953937],
        [0.30906708, 0.76998937, 0.74435904, 0.87863886, 0.97261967],
        [0.97409949, 0.5433098 , 0.88669099, 0.24851335, 0.15307913]],

       [[0.65217734, 0.56186619, 0.67530963, 0.14591431, 0.54434284],
        [0.98105221, 0.66149819, 0.48892904, 0.76680462, 0.9903615 ],
        [0.46277196, 0.76886136, 0.73350055, 0.03043538, 0.31523468]]])

In [95]:
matrix.T.shape

(3, 3, 5)

In [96]:
# Check to see if the reversed shape is the same as the transpose shape
matrix.T.shape == matrix.shape[::-1]

True

In [97]:
# Check to see if the first and last axes are swapped (equivalent to .T for 3-dimensional arrays)
matrix.T == matrix.swapaxes(0, -1) # swap first (0) and last (-1) axes

array([[[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]],

       [[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]],

       [[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]]])

You can see more advanced forms of transposing in the NumPy documentation under [`numpy.transpose`](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html).
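For a quick illustration with a 4-dimensional array, the sketch below compares `.T`, `swapaxes()` and `transpose()` with an explicit axis order. The array `x` is a zero-filled placeholder; only its shape matters.

```python
import numpy as np

x = np.zeros((2, 3, 4, 5))

# .T reverses the order of all axes
print(x.T.shape)                      # (5, 4, 3, 2)

# swapaxes(0, -1) only exchanges the first and last axes
print(x.swapaxes(0, -1).shape)        # (5, 3, 4, 2)

# transpose() with arguments lets you choose any axis order
print(x.transpose(0, 2, 1, 3).shape)  # (2, 4, 3, 5)
```

With four or more dimensions, `.T` and `swapaxes(0, -1)` no longer give the same shape.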

### Dot product

The main two rules for dot product to remember are:

1. The **inner dimensions** must match:
  * `(3, 2) @ (3, 2)` won't work
  * `(2, 3) @ (3, 2)` will work
  * `(3, 2) @ (2, 3)` will work
  
2. The resulting matrix has the shape of the **outer dimensions**:
 * `(2, 3) @ (3, 2)` -> `(2, 2)`
 * `(3, 2) @ (2, 3)` -> `(3, 3)`

**Note:** In NumPy, `np.dot()` and `@` can be used to achieve the same result for 1- and 2-dimensional arrays. However, their behaviour differs for arrays with 3 or more dimensions.
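A small sketch of that difference, using placeholder arrays filled with ones: `@` treats the leading axis as a batch of matrices, while `np.dot()` combines the last axis of the first array with the second-to-last axis of the second array for every pair of leading indices.

```python
import numpy as np

# 2-D: both give the same result
m1 = np.ones((2, 3))
m2 = np.ones((3, 2))
print(np.array_equal(np.dot(m1, m2), m1 @ m2))  # True

# 3-D: the output shapes differ
a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))
print((a @ b).shape)       # (2, 3, 5)     batch of two (3, 4) @ (4, 5) products
print(np.dot(a, b).shape)  # (2, 3, 2, 5)  every matrix in a against every matrix in b
```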

In [98]:
np.random.seed(0)
mat1 = np.random.randint(10, size=(3, 3))
mat2 = np.random.randint(10, size=(3, 2))

mat1.shape, mat2.shape

((3, 3), (3, 2))

In [99]:
mat1

array([[5, 0, 3],
       [3, 7, 9],
       [3, 5, 2]])

In [100]:
mat2

array([[4, 7],
       [6, 8],
       [8, 1]])

In [101]:
np.dot(mat1, mat2)

array([[ 44,  38],
       [126,  86],
       [ 58,  63]])

In [102]:
# Can also achieve np.dot() with "@"
# (however, they may behave differently at 3D+ arrays)
mat1 @ mat2

array([[ 44,  38],
       [126,  86],
       [ 58,  63]])

In [103]:
np.random.seed(0)
mat3 = np.random.randint(10, size=(4,3))
mat4 = np.random.randint(10, size=(4,3))
mat3

array([[5, 0, 3],
       [3, 7, 9],
       [3, 5, 2],
       [4, 7, 6]])

In [104]:
mat4

array([[8, 8, 1],
       [6, 7, 7],
       [8, 1, 5],
       [9, 8, 9]])

In [105]:
# This will fail as the inner dimensions of the matrices do not match
np.dot(mat3, mat4)

ValueError: shapes (4,3) and (4,3) not aligned: 3 (dim 1) != 4 (dim 0)

In [106]:
mat3.T.shape

(3, 4)

In [107]:
# Dot product
np.dot(mat3.T, mat4)

array([[118,  96,  77],
       [145, 110, 137],
       [148, 137, 130]])

In [108]:
# Element-wise multiplication, also known as Hadamard product
mat3 * mat4

array([[40,  0,  3],
       [18, 49, 63],
       [24,  5, 10],
       [36, 56, 54]])

### Comparison operators

Finding out, element by element, whether the values of one array are larger than, smaller than or equal to those of another.

In [109]:
a1

array([1, 2, 3])

In [110]:
a2

array([[1. , 2. , 3.3],
       [4. , 5. , 6.5]])

In [111]:
a1 > a2

array([[False, False, False],
       [False, False, False]])

In [112]:
a1 >= a2

array([[ True,  True, False],
       [False, False, False]])

In [113]:
a1 > 5

array([False, False, False])

In [114]:
a1 == a1

array([ True,  True,  True])

In [115]:
a1 == a2

array([[ True,  True, False],
       [False, False, False]])
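Comparison results are boolean arrays, which makes them handy for counting and filtering. The sketch below shows a few common patterns, including the `np.sum(x > 3)` idiom from the list at the start of this section; the array `x` is an arbitrary example.

```python
import numpy as np

x = np.array([1, 5, 2, 8, 3, 9])

mask = x > 3
print(mask)          # [False  True False  True False  True]

# True counts as 1, so summing a mask counts the matching elements
print(np.sum(mask))  # 3

# A mask can also be used as an index to keep only the matching elements
print(x[mask])       # [5 8 9]

# Check whether all or any elements satisfy a condition
print(np.all(x > 0), np.any(x > 8))  # True True
```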

## 5. Sorting arrays

* [`np.sort()`](https://numpy.org/doc/stable/reference/generated/numpy.sort.html) - sort values in a specified dimension of an array.
* [`np.argsort()`](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html) - return the indices to sort the array on a given axis.
* [`np.argmax()`](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html) - return the index/indices of the highest value(s) along an axis.
* [`np.argmin()`](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html) - return the index/indices of the lowest value(s) along an axis.

In [121]:
random_array

array([[8, 0, 4],
       [7, 7, 5],
       [2, 7, 2],
       [0, 0, 2],
       [1, 0, 7]])

In [122]:
np.sort(random_array)

array([[0, 4, 8],
       [5, 7, 7],
       [2, 2, 7],
       [0, 0, 2],
       [0, 1, 7]])

In [143]:
np.argsort(random_array)

array([[1, 2, 0],
       [2, 0, 1],
       [0, 2, 1],
       [0, 1, 2],
       [1, 0, 2]])

In [144]:
a1 = np.array([34, 1, 24, 5, 45, 424])

In [145]:
# Return the indices that would sort an array
np.argsort(a1,axis=0)

array([1, 3, 2, 0, 4, 5])

In [146]:
# No axis
np.argmin(a1)

1

In [148]:
np.argmax(a1)

5

In [149]:
random_array

array([[8, 0, 4],
       [7, 7, 5],
       [2, 7, 2],
       [0, 0, 2],
       [1, 0, 7]])

In [150]:
# Across the horizontal: index of the max value within each row
np.argmax(random_array, axis=1)

array([0, 0, 1, 2, 2])

In [151]:
# Down the vertical: index of the min value within each column
np.argmin(random_array, axis=0)

array([3, 0, 2])
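As a closing sketch, the indices returned by `np.argsort()` can be used to reorder an array, and the `axis` argument controls whether `np.argmax()` and `np.argmin()` work per row or per column. The names `scores` and `grid` are illustrative.

```python
import numpy as np

scores = np.array([34, 1, 24, 5, 45, 424])

order = np.argsort(scores)  # indices that would sort the array
print(order)                # [1 3 2 0 4 5]
print(scores[order])        # [  1   5  24  34  45 424]

grid = np.array([[8, 0, 4],
                 [7, 7, 5]])

# axis=1: one result per row (ties return the first occurrence)
print(np.argmax(grid, axis=1))  # [0 0]

# axis=0: one result per column
print(np.argmin(grid, axis=0))  # [1 0 0]
```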