# Intro to NumPy

[NumPy](https://numpy.org/) is a fundamental Python package for scientific computation. It has grown in popularity because it provides powerful, performant numerical computing tools while maintaining ease of use.

NumPy is used in many scientific domains, such as quantum computing, 3D visualization, bioinformatics, astronomy, data science, and machine learning. Wherever large amounts of numerical data need to be processed, NumPy is likely to be involved.


## Basics of NumPy Arrays

The main object provided by NumPy is its N-dimensional array (`ndarray`). This is a table of elements which are all of the same type.

The 'dimensionality' of an array refers to the level of nesting of arrays within the array. For example, a flat sequence such as `[1, 2, 3]` is 1-dimensional because there is only the top-level array. We can create a 1-dimensional array in NumPy as follows.

In [None]:
# We import numpy with the alias np by convention
import numpy as np

a1 = np.array([1, 2, 3])
print("a1: ", a1)

arrNums = np.array([1,2,3,4,5,6,7,8,9,10])
arrStrs = np.array(["as", "ds", "fr"])

print("Numbers array: ", arrNums)
print("Strings array: ", arrStrs)



We can also create arrays with more dimensions. This is where the NumPy module starts to become useful.

In [None]:
a2 = np.array([[1, 2, 3], [4, 5, 6]])
print("a2: ", a2)

# Four levels of nesting give a 4-dimensional array
arr = np.array([[[[1,3,4], [3,4,5]], [[5,6,7], [6,7,8]]]])
print(arr)
print(arr.shape)
print(f"This is a {arr.ndim}-dimensional array")

In NumPy the dimensions of an array are called *axes*. 1-dimensional arrays have one axis, 2-dimensional arrays have two axes, and so on.

Our first example above, `a1`, has a single axis with 3 elements, so we say its axis has a length of 3.

Our second example, `a2`, has two axes. The first axis (the rows) has a length of 2, and the second axis (the columns) has a length of 3.

Arrays in NumPy can have an arbitrary number of axes, which is why they are called N-dimensional arrays. The object we create with `np.array` is an `ndarray`. There are some important attributes of the `ndarray`:

We can get the number of axes (dimensions) through the `ndim` attribute:

In [None]:
print(f'This is a {a1.ndim}-dimensional array.')
print(f'This is a {a2.ndim}-dimensional array.')

We can see the length of each axis with `shape`. `shape` is a tuple with an integer for each dimension, so it has length `ndim`.

In [None]:
print('The dimensions of this array are:', a1.shape)
print('The dimensions of this array are:', a2.shape)

We can get the total number of elements in the array with `size`.

In [None]:
print(f'This array has {a1.size} elements.')
print(f'This array has {a2.size} elements.')
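
The three attributes above are related: `ndim` is the length of `shape`, and `size` is the product of the entries in `shape`. The following is a small sketch of that relationship. The array `grid` is a hypothetical example, and `math.prod` (Python 3.8+) is used only to multiply the axis lengths together.

```python
import math

import numpy as np

# Hypothetical example array with 2 rows and 3 columns
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.ndim)              # number of axes
print(grid.shape)             # length of each axis
print(len(grid.shape))        # same value as ndim
print(math.prod(grid.shape))  # same value as size
print(grid.size)
```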

We can see the data type of the elements of an array with `dtype`.

In [None]:
print(f'This array has {a2.dtype} type elements in it.')
a_flt = np.array([1.3])
print(f'This array has {a_flt.dtype} type elements in it.')
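
Because every element of an array shares one type, NumPy has to pick a single `dtype` when the input mixes integers and floats. The sketch below shows that behaviour, along with the `dtype` argument of `np.array` for requesting a type explicitly. The names `mixed` and `as_float` are only illustrative.

```python
import numpy as np

# Mixing ints and floats: NumPy picks a type that can hold every element
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Requesting a type explicitly with the dtype argument
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)
print(as_float)
```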

We often know the dimensions of an array before we know what it will contain. In this case it is good practice to create an array of the correct size with some initial filler content, because the alternative, growing an array as you discover the contents, is an expensive operation.

There are a couple of ways we can create arrays of a given size with initial filler content.

We can use `np.zeros` to create an array filled with zeros. This function takes a tuple representing the shape of the array to create.

In [None]:
np.zeros((2,4,3))

`np.ones` works in the same way as `zeros` but fills the array with ones instead.

In [None]:
np.ones((2,2))

Another alternative is `np.empty` which creates an array of the given shape but makes no guarantees about its contents. The contents are determined by whatever is in memory at the time of creation and you cannot assume they are all consistent, for example all zeros.

In [None]:
np.empty((2,3))
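
As a minimal sketch of the preallocation pattern described above, we can allocate an array once and then overwrite every slot. The size `n` and the squared values are made up for illustration. Because `np.empty` gives no guarantee about its initial contents, this only makes sense when every element is assigned before it is read.

```python
import numpy as np

n = 5  # hypothetical number of results, known in advance

# Allocate once, then overwrite every slot
squares = np.empty(n)
for i in range(n):
    squares[i] = i ** 2

print(squares)
```

Growing the result with repeated calls to `np.append` would instead copy the data into a new array on every call.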

For functionality analogous to Python's `range`, we can use `np.arange` to generate an `ndarray` filled with a range of values. As with `range`, the stop value is not included.

In [None]:
print(np.arange(10, 20))
print(np.arange(10, 30, 5))
print(np.arange(10, 31, 5))

If you want to generate evenly spaced values that are not integers, `np.linspace` is usually the better choice. This function is similar to `arange` except it takes the number of values to generate instead of the step, and by default it includes the stop value. `linspace` determines the step automatically and evenly (linearly, hence 'lin') spaces the generated values.

In [None]:
print(np.linspace(0, 2, 9))
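
To make the difference between the two functions concrete, here is a small side-by-side sketch. It uses the `retstep=True` argument of `np.linspace`, which also returns the spacing that was chosen. The variable names are illustrative.

```python
import numpy as np

# arange takes a step and excludes the stop value
by_step = np.arange(0, 2, 0.25)

# linspace takes a count and includes the stop value by default;
# retstep=True also returns the spacing it chose
by_count, step = np.linspace(0, 2, 9, retstep=True)

print(by_step)
print(by_count)
print(step)
```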

**Challenge:** Create a 2D array of shape (2, 3) (2 rows, 3 columns), filled with ones, in two different ways. Make sure their data types are the same.

In [None]:
print(np.array([[1., 1., 1.], [1., 1., 1.]]))
print(np.ones((2, 3)))

## Indexing, Slicing, and Iterating

One-dimensional NumPy arrays work very similarly to Python lists when it comes to accessing elements and iterating over elements.

In [None]:
a = np.array([1,2,3,4, 5])
# print(a)

# Access the 3rd element (remember indices start at 0)
# print(a[2])

# Access from the 2nd element up to but not including the 5th
# print(a[1:4])
# The same range, taking every second element
# print(a[1:4:2])

# Access every second element
# print(a[::2]) # equivalent to a[0:len(a):2]

# From the start up to but not including the 4th element, set every second element to 1000
# a[:3:2] = 1000 # equivalent to a[0:3:2] = 1000
# print(a)

# Get the reverse of a
print(a[::-1])

# Loop through a
for el in a:
    print(el)
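
The sketch below compares a Python list and a 1D array holding the same hypothetical sample values. Reading elements looks identical, but assigning a single value to a stepped slice is something the array accepts and the list does not.

```python
import numpy as np

values = [10, 20, 30, 40, 50]  # hypothetical sample data
lst = list(values)
arr = np.array(values)

# Indexing and slicing read the same way for both
print(lst[2], arr[2])
print(lst[1:4], arr[1:4])
print(lst[::-1], arr[::-1])

# Assigning one value to a stepped slice works on the array...
arr[:3:2] = 1000
print(arr)

# ...but a list rejects it, because it expects an iterable here
try:
    lst[:3:2] = 1000
except TypeError as err:
    print("list:", err)
```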

**Challenge:** Create an array of 100 elements from 0.1 to 1 named `my_array` and change every 10th element to 100.

In [None]:
my_array = np.linspace(0.1, 1, num=100)
my_array[::10] = 100
print(my_array)

Multidimensional arrays can have one index per axis, each separated by a comma within the square brackets: `a[rows, cols]`. In a 2D array, the first index selects the rows and the second selects the columns. Each index works in the same way as with 1D arrays, but operates only on its own axis.

In [40]:
# A 2D array
a2 = np.array([[1, 2, 3], [4, 5, 6]])
print(a2)

# Select the 2nd row and the 3rd column
# print(a2[1,2])

# Select all the rows and the 2nd column in three different ways
# Note: the second way is preferred because it works regardless of the number of rows
# print(a2[0:2,1])
# print(a2[:,1])
# The ellipsis tells NumPy you only want to specify the very last axis and get everything from the rest.
# This is useful for arrays with more dimensions
# print(a2[...,1])

# The 1st row and all columns in three different ways
print(a2[0,:])
print(a2[0])
print(a2[0,...])

[[1 2 3]
 [4 5 6]]
[1 2 3]
[1 2 3]
[1 2 3]


Note that when we don't provide all the indices, for example by leaving out the column index, NumPy assumes we want everything from the missing axes.
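
The same rule is easier to see with one more axis. The following sketch assumes a hypothetical 3D array built with `np.arange` and the `reshape` method (not covered above), and checks the shapes produced when trailing or leading indices are omitted.

```python
import numpy as np

# Hypothetical 3D array: 2 blocks, each with 3 rows and 4 columns
a3d = np.arange(24).reshape(2, 3, 4)

# These three expressions select the same block
print(a3d[1].shape)
print(a3d[1, :, :].shape)
print(a3d[1, ...].shape)

# The ellipsis can also stand in for the leading axes:
# this takes the first column of every row in every block
print(a3d[..., 0].shape)
```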


**Challenge:** Create an array of zeros of shape (5, 5) and index into it in such a way that the result is an array of shape (2, 3).

In [88]:
a5 = np.zeros((5, 5))
print(a5)
# print(a5[:2,:3])
# print(a5[3:5,2:5])

a5 = a5[0:2, ...]
a5 = a5[..., 0:3]
print(a5)


[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[0. 0. 0.]
 [0. 0. 0.]]



When you iterate over a multidimensional array, the loop runs over its first axis. For a 2D array, that means it loops over the rows.

In [89]:
for row in a5:
    print("row", row)
    for r in row:
        print(r)
        
        

row [0. 0. 0.]
0.0
0.0
0.0
row [0. 0. 0.]
0.0
0.0
0.0


If you want to loop through all the elements of an array, use the `flat` attribute.

In [97]:
for el in a5.flat:
    print(el)

# The above is the same as this...
# for row in a5:
    # for el in row:
        # print(el)
# Except flat works for any number of dimensions in the array

0.0
0.0
0.0
0.0
0.0
0.0
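
Since every element of `a5` is zero, the output above does not reveal the order in which `flat` visits the elements. As a small sketch with a hypothetical array `b` of distinct values, we can see that it goes row by row:

```python
import numpy as np

# Hypothetical array with distinct values so the order is visible
b = np.array([[1, 2, 3], [4, 5, 6]])

# Collect the elements in the order flat yields them
visited = [int(el) for el in b.flat]
print(visited)
```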


**Challenge:** Create an empty array of shape (3,3,3) and iterate over its elements in two different ways.

In [None]:
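a3 = np.empty((3,3,3))

# Iterating directly yields the three 2D sub-arrays along the first axis
for sub in a3:
    print(sub)

# Way #1: visit every element with flat
for el in a3.flat:
    print(el)

# Way #2: nested loops, one per axis (visits the same elements as Way #1)
# for axis1 in a3:
#     for axis2 in axis1:
#         for el in axis2:
#             print(el)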

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
