# NumPy Arrays

A NumPy array is a grid of values that all share the same type. It usually takes one of two forms:
1. a vector - A vector is a one-dimensional array
2. a matrix - A matrix is a two-dimensional array (arrays with more dimensions are also possible)

Let's start with the basics and see how to create an array.

## Creating Arrays

### Basics

In [1]:
import numpy as np

One way to create an array is to build it from an existing list (one-dimensional) or from an existing nested list (multi-dimensional).

In [2]:
# Create a regular list
my_list = [1, 2, 3]

# Make an array from the list
np.array(my_list)

array([1, 2, 3])

In [3]:
# Create a regular nested list
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Make an array from the list
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

However, more often than not, we will use the NumPy built-in functions to create arrays for us.

### Methods

#### np.arange

The `np.arange` method is very similar to Python's built-in `range`. It takes a `start`, a `stop` and a `step` argument and builds an array of evenly spaced values from `start` (included) up to `stop` (excluded) in increments of `step`.

In [4]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [5]:
np.arange(10, 50, 2)

array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
       44, 46, 48])
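
As a quick check of the excluded `stop` value, here is a minimal sketch comparing `np.arange` with Python's `range`. The variable names are illustrative only and are not used elsewhere in this notebook. It also shows a non-integer `step`, which `range` does not accept.

```python
import numpy as np

# Same start/stop/step as Python's range: stop is excluded in both
from_range = list(range(10, 20, 5))
from_arange = np.arange(10, 20, 5)
print(from_range)            # [10, 15]
print(from_arange.tolist())  # [10, 15]

# Unlike range, np.arange also accepts a non-integer step
halves = np.arange(0, 2, 0.5)
print(halves.tolist())       # [0.0, 0.5, 1.0, 1.5]
```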

#### np.zeros

The `np.zeros` method takes a `shape` argument and will create an array with the dimensions of the given `shape` where each element of the array will be equal to `0`.<br/>For example, if we write `np.zeros(5)` we will get an array of 5 elements. Each of these elements will be a `0`.

Passing a tuple as the `shape` argument creates a multi-dimensional array of the given dimensions. For a tuple of 2 numbers, the first number represents the number of **rows** and the second represents the number of **columns**.

Passing a tuple of 3 numbers will create a 3-dimensional array where the numbers correspond to:
- number of nested arrays
- number of rows per nested array
- number of columns per nested array

In [6]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [7]:
# Create an array of 3 rows and 5 columns
np.zeros((3, 5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [8]:
np.zeros((2, 3, 5))

array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]])

#### np.ones

The `np.ones` method is identical to the `np.zeros` method except it will populate the array with `1` instead of `0`.
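
A short sketch of `np.ones`, mirroring the `np.zeros` calls above (the variable names are illustrative only):

```python
import numpy as np

# One-dimensional array of three ones
ones_1d = np.ones(3)
print(ones_1d.tolist())   # [1.0, 1.0, 1.0]

# Two rows and four columns, all filled with 1.0
ones_2d = np.ones((2, 4))
print(ones_2d.shape)      # (2, 4)
print(ones_2d.sum())      # 8.0
```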

#### np.linspace

The `np.linspace` method is similar to the `np.arange` method. Like `np.arange`, it takes a `start` and a `stop` argument, but the difference lies in the third argument.

`np.arange` expects the **increment** (`step`) added to each number. `np.linspace` expects the number of elements to create (`num`). The generated elements are **evenly spaced** between `start` and `stop`, and by default `stop` itself is included.

The result of the `np.linspace` method is a one-dimensional array with `num` elements.

In [9]:
np.linspace(0, 5, 6)

array([0., 1., 2., 3., 4., 5.])

In [10]:
np.linspace(0, 5, 3)

array([0. , 2.5, 5. ])
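
To make the difference with `np.arange` concrete, here is a minimal sketch over the same interval. The variable names are illustrative only.

```python
import numpy as np

# linspace: ask for 5 points, both endpoints included
points = np.linspace(0, 1, 5)
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]

# arange: ask for a step of 0.25, stop value excluded
steps = np.arange(0, 1, 0.25)
print(steps.tolist())    # [0.0, 0.25, 0.5, 0.75]
```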

#### np.eye

The `np.eye` method makes it easy to create an **identity matrix** with `n` rows and `n` columns.

An identity matrix is a 2-dimensional matrix with an equal number of rows and columns, populated by `0` except on the main diagonal (where the column index is equal to the row index), which is populated by `1`.

In [11]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

#### Arrays with random numbers

There are multiple ways to create NumPy arrays containing random numbers.

##### ***np.random.rand***

The `np.random.rand()` method will create an array of the given dimensions. It takes as many arguments as there are dimensions: with one number we get a 1D array, with two numbers a 2D array, and so on.

The created array will be populated with random samples from a uniform distribution over `[0, 1)`, meaning `0` is included and `1` is excluded.

In [12]:
np.random.rand(5)

array([0.39352018, 0.22639418, 0.36400704, 0.89217093, 0.92816237])

In [13]:
np.random.rand(3, 5)

array([[0.98996785, 0.81821535, 0.21977257, 0.35599894, 0.04697529],
       [0.69970075, 0.60412965, 0.36020962, 0.09862006, 0.66141586],
       [0.62391203, 0.37389056, 0.89640647, 0.48693083, 0.49530425]])

##### ***np.random.randn***

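The `np.random.randn()` method takes the same arguments as the `np.random.rand()` method, but it draws samples from the **standard normal distribution** (mean `0`, standard deviation `1`) instead of a uniform distribution. The values can be negative and are not limited to `[-1, 1]`, although most of them fall close to `0`.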

In [14]:
np.random.randn(5)

array([ 1.26259031,  0.29504051,  0.07740469, -0.86034266, -0.25798844])
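
The following sketch draws a large sample to check these properties. The sample size of 100,000 is an arbitrary choice for illustration, and the printed values differ from run to run.

```python
import numpy as np

# Draw a large sample so the summary statistics are stable
samples = np.random.randn(100_000)

print(samples.mean())   # close to 0
print(samples.std())    # close to 1

# Fraction of values outside [-1, 1]: roughly a third for a standard normal
outside = (np.abs(samples) > 1).mean()
print(outside)
```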

##### ***np.random.randint***

The `np.random.randint` method will generate an array of random integers. It takes 3 arguments: `low`, the lowest value (inclusive); `high`, the highest value (exclusive); and `size`, the shape of the array.

In [15]:
np.random.randint(5, 10, 3)

array([5, 7, 6])

In [16]:
np.random.randint(5, 10, (3, 5))

array([[9, 9, 6, 5, 9],
       [6, 5, 6, 6, 5],
       [9, 7, 8, 5, 8]])
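
A small sketch to confirm the bounds (the sample size of 1000 is arbitrary): every generated value is at least `5` and strictly below `10`.

```python
import numpy as np

values = np.random.randint(5, 10, 1000)

print(values.min() >= 5)   # True: the lower bound is inclusive
print(values.max() < 10)   # True: the upper bound is exclusive
```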

## Array Methods

### array.reshape

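The `reshape` method changes the shape of an existing array. Its arguments correspond to the desired shape, which must hold exactly as many elements as the original array. It returns the reshaped array and leaves the shape of the original array unchanged.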

In [17]:
arr = np.arange(10)
arr.reshape(2, 5)

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [18]:
# Now what happens if the shape doesn't correspond exactly
arr = np.arange(10)
arr.reshape(3, 3)

ValueError: cannot reshape array of size 10 into shape (3,3)
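
A minimal sketch of both cases, with the failing call wrapped in `try`/`except` so the cell runs to the end. The name `reshaped` is illustrative only.

```python
import numpy as np

arr = np.arange(10)
reshaped = arr.reshape(2, 5)

# reshape returns a new array object; arr keeps its one-dimensional shape
print(arr.shape)        # (10,)
print(reshaped.shape)   # (2, 5)

# The total number of elements must match: 2 * 5 == 10
print(reshaped.size == arr.size)   # True

# A mismatched shape raises ValueError instead of truncating or padding
try:
    arr.reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```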

### max and min

The `max` and `min` methods return the maximum and minimum values of an array respectively.

To get the index of the maximum and minimum values, you can use the `argmax` and `argmin` methods.

In [19]:
arr = np.random.randint(10, 100, 50)
arr_max = arr.max()
arr_min = arr.min()
arr_max_index = arr.argmax()
arr_min_index = arr.argmin()

print({
    'max': arr_max,
    'max_index': arr_max_index,
    'min': arr_min,
    'min_index': arr_min_index
})

{'max': 98, 'max_index': 8, 'min': 11, 'min_index': 34}
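
Because the array above is random, its output changes on every run. Here is the same idea on a fixed, hand-picked array, so the results can be checked by eye:

```python
import numpy as np

arr = np.array([42, 17, 93, 8, 93])

print(arr.max())      # 93
print(arr.argmax())   # 2 (index of the first occurrence of the maximum)
print(arr.min())      # 8
print(arr.argmin())   # 3

# The index returned by argmax points back at the maximum value
print(arr[arr.argmax()] == arr.max())   # True
```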


### Shape

The `shape` property returns a tuple with the size of each dimension.

In [20]:
arr = np.random.rand(np.random.randint(1, 4), np.random.randint(1, 4), np.random.randint(1, 4))
arr.shape

(1, 2, 2)

### dtype

The `dtype` property indicates the type of the elements present in an array.

In [21]:
arr = np.random.randint(10, 100, 50)
arr.dtype

dtype('int64')
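
All elements of an array share a single `dtype`. As a small sketch with illustrative variable names: mixing integers and floats produces a float array, and `np.zeros` defaults to floats, which is why its output above shows `0.` rather than `0`.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])
zeros = np.zeros(3)

print(ints.dtype.kind)   # i (a signed integer type; the exact width depends on the platform)
print(mixed.dtype)       # float64
print(zeros.dtype)       # float64
```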

## Indexing and Selection from a NumPy Array

### Indexing a one-dimensional array (Vector)

This works the same as Python lists. We can select an index with `array[n]` and a slice with `array[n:m]`.

In [22]:
# Create an array
arr = np.arange(11)

In [23]:
# Get value at specific index
arr[8]

8

In [24]:
# Get slice between 2 indexes
a = arr[0:5]
b = arr[:6]
c = arr[4:]
print(a)
print(b)
print(c)

[0 1 2 3 4]
[0 1 2 3 4 5]
[ 4  5  6  7  8  9 10]


#### Watch out

One major difference between a list and an array is how slices behave. If I create a variable that is a slice of the array and change its values, the values of the original array change as well. (Assigning a single value to a whole slice, as in `slice_of_arr[:3] = 99`, is an example of what NumPy calls broadcasting.)

This means that **the data is not copied** but the slice **is just a view of the original array**.

The reason for this is that NumPy is designed for working with large amounts of data. To save memory, it won't copy the data by default.<br/>
It is, however, possible to copy the data with the `copy()` method.

In [25]:
# Example of a slice acting as a view on the original array
arr = np.arange(11)
slice_of_arr = arr[:6]
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [26]:
slice_of_arr[:3] = 99
slice_of_arr

array([99, 99, 99,  3,  4,  5])

In [27]:
arr

array([99, 99, 99,  3,  4,  5,  6,  7,  8,  9, 10])

In [28]:
# If we copy the array, it allocates new memory to the copy
slice_of_arr = arr.copy()[6:]
slice_of_arr[:] = 100
arr

array([99, 99, 99,  3,  4,  5,  6,  7,  8,  9, 10])
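
One way to tell a view from a copy is `np.shares_memory`. The sketch below uses illustrative variable names (`view`, `copied`) and starts from a fresh array.

```python
import numpy as np

arr = np.arange(11)

view = arr[:6]            # a slice is a view on the same data
copied = arr[:6].copy()   # copy() allocates new memory

print(np.shares_memory(arr, view))     # True
print(np.shares_memory(arr, copied))   # False

# Writing through the copy leaves the original untouched
copied[:] = 100
print(arr[:6].tolist())   # [0, 1, 2, 3, 4, 5]

# Writing through the view changes the original
view[:] = 99
print(arr[:6].tolist())   # [99, 99, 99, 99, 99, 99]
```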

### Indexing a 2D-Array (Matrix)

There are 2 ways to index a matrix: with _double brackets_ and with _single brackets_.

- The _double bracket_ notation selects one dimension at a time: `arr[n1][n2]` first takes row `n1`, then element `n2` of that row
- The _single bracket comma_ notation selects along both dimensions at once: `arr[n1, n2]`

In [29]:
arr = np.array([[5, 10, 15], [20, 25, 30], [25, 40, 45]])
arr

array([[ 5, 10, 15],
       [20, 25, 30],
       [25, 40, 45]])

In [30]:
# The double bracket method
arr[1][2]

30

In [31]:
# The single bracket comma notation
arr[1, 2]

30

#### Grabbing an entire row

To grab an entire row, just give one integer representing the row: `arr[n]`.

***Note:*** _This returns an array._

In [None]:
# Selecting a row
arr[2]

#### Grabbing a column

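To grab a column, use the single bracket comma notation and replace the row integer with `:` or `...`. The double bracket notation does not work here: `arr[:]` is the whole array, so `arr[:][0]` returns the first **row**, not the first column.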

In [32]:
# Careful: arr[:] is the whole array, so this returns row 0, not a column
arr[:][0]

array([ 5, 10, 15])

In [33]:
arr[:, 1]

array([10, 25, 40])

In [34]:
arr[..., 2]

array([15, 30, 45])
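
A side-by-side sketch of the two notations on the same matrix shows why only the comma notation selects a column:

```python
import numpy as np

arr = np.array([[5, 10, 15], [20, 25, 30], [25, 40, 45]])

# arr[:] is the whole array, so arr[:][0] is simply arr[0]: the first row
print(arr[:][0].tolist())   # [5, 10, 15]

# The comma notation selects along the second axis: the first column
print(arr[:, 0].tolist())   # [5, 20, 25]
```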

#### Use Slice notation to grab a selection

In [35]:
# Grab all elements where:
# -- the row has index 0 or 1
# -- the column has index 1 or 2
arr[:2, 1:]

array([[10, 15],
       [25, 30]])

### Conditional selection

We can create a **boolean array** from any array by applying a comparison operator to it.

It is possible to select the elements of an array based on a boolean array. This is easier to show than to explain:

In [36]:
arr = np.random.randint(0, 100, 10)
arr

array([45, 60, 26, 23, 11, 53, 42, 90, 73, 78])

In [37]:
# Creating a boolean array
bool_arr = arr > 65
bool_arr

array([False, False, False, False, False, False, False,  True,  True,
        True])

In [38]:
# Select only the elements where the condition is true in the original array
arr[bool_arr]

array([90, 73, 78])

In [39]:
# Or the shorthand:
arr[arr > 65]

array([90, 73, 78])
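
Conditions can also be combined. The sketch below reuses the values printed above as a fixed array and combines comparisons with `&` (and) and `|` (or). The thresholds are arbitrary choices for illustration.

```python
import numpy as np

arr = np.array([45, 60, 26, 23, 11, 53, 42, 90, 73, 78])

# Elements strictly between 40 and 80; each comparison needs its own parentheses
in_range = arr[(arr > 40) & (arr < 80)]
print(in_range.tolist())    # [45, 60, 53, 42, 73, 78]

# Elements below 20 or above 80
extremes = arr[(arr < 20) | (arr > 80)]
print(extremes.tolist())    # [11, 90]
```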

## Operations on arrays

### Arithmetics

For any array, we can perform arithmetic operations. For example, I can _'add 5'_ to each element of my array simply by typing `array + 5`.<br/>This also works for subtraction `- 5`, multiplication `* 5` and division `/ 5`.

In [41]:
arr = np.arange(10)
arr + 5

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

It is also possible to add two arrays together.

If we write `array_1 + array_2`, each element of the first array is added to the element at the same index in the second array.

In [43]:
arr_1 = np.array([1, 1, 5, 1, 1])
arr_2 = np.array([3, 3, 5, 3, 3])
arr_1 + arr_2

array([ 4,  4, 10,  4,  4])
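
This is another place where arrays differ from Python lists. As a small sketch with illustrative names: `+` concatenates lists, but adds arrays element by element.

```python
import numpy as np

list_1 = [1, 1, 5]
list_2 = [3, 3, 5]

# With lists, + concatenates
print(list_1 + list_2)            # [1, 1, 5, 3, 3, 5]

# With arrays, + and * work element by element
arr_1 = np.array(list_1)
arr_2 = np.array(list_2)
print((arr_1 + arr_2).tolist())   # [4, 4, 10]
print((arr_1 * arr_2).tolist())   # [3, 3, 25]
```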

NumPy also provides universal functions (or `ufunc`), which apply an operation to every element of an array. A list of them can be found in [the official documentation](https://numpy.org/doc/stable/reference/ufuncs.html).
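
As a brief illustration, here are three universal functions applied to a small array. The choice of `np.sqrt`, `np.square` and `np.maximum` is arbitrary; many more are available.

```python
import numpy as np

arr = np.array([1, 4, 9, 16])

# Square root of every element
print(np.sqrt(arr).tolist())         # [1.0, 2.0, 3.0, 4.0]

# Square of every element
print(np.square(arr).tolist())       # [1, 16, 81, 256]

# Element-wise maximum between each element and 5
print(np.maximum(arr, 5).tolist())   # [5, 5, 9, 16]
```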