# Day 1-part 2: Arrays

Like lists, arrays are collections of items, but of the same type (e.g., all numbers or all strings). To use arrays in Python, you need to import the `numpy` library:

In [None]:
import numpy as np # import numpy library under the alias np

However, unlike lists, arrays must be created explicitly, here with the `np.array` function:

In [None]:
my_array = np.array([-10, 30, 60, 90, 120, 150, -100]) # create array
print(my_array)
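
As a small illustration of the "same type" rule, the sketch below passes mixed values to `np.array`. The variable names are hypothetical and only serve this example. `numpy` converts all elements to a single common type:

```python
import numpy as np

# mixing integers and a float: every element is stored as a float
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers, mixed_numbers.dtype)

# mixing numbers and a string: every element is stored as a string
mixed_text = np.array([1, 2, "three"])
print(mixed_text, mixed_text.dtype)
```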

Let's modify this array:

In [None]:
my_array[0] = 0 # modify first element
my_array[-1] = 180 # modify last element
print(my_array)

In [None]:
my_array = np.append(my_array,[240, 270, 300, 330, 360]) # append elements
print(my_array)

In [None]:
my_array = np.insert(my_array,7,210) # insert element at index 7
print(my_array) 

In [None]:
my_array = np.delete(my_array,-1) # delete last element
print(my_array)

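Notice that `my_array` does not have append, insert or delete methods. Instead, we need to use the `numpy` `append`, `insert`, or `delete` functions as shown above. Each of these functions returns a new array rather than changing the original, which is why we assign the result back to `my_array`.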
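
A minimal sketch of this behavior, using hypothetical array names: the array passed to `np.append` is left untouched, and the result is a separate array.

```python
import numpy as np

original = np.array([1, 2, 3])
extended = np.append(original, [4, 5])  # returns a new array

print(original)  # the original array still has 3 elements
print(extended)  # the new array has 5 elements
```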

Now, let's use index ranges to print some elements of the array:

In [None]:
print(my_array[4:8]) # print elements with indexes 4 to 7

Index ranges are quite powerful. Suppose we want to calculate the differences between successive elements of `my_array`. We can do this in one line of code as follows:

In [None]:
# differences between successive elements of my_array
diffs = my_array[1:] - my_array[:-1]
print(diffs)

`my_array[1:]` contains the second to the last element of `my_array`, while `my_array[:-1]` contains the first to the penultimate element of `my_array`. Subtracting these two arrays gives us the differences between the elements of `my_array`.
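
For comparison, `numpy` also provides the `np.diff` function, which computes the same differences. The sketch below uses a short array with made-up values (an assumption for illustration, not the array above) and checks that both approaches agree:

```python
import numpy as np

angles = np.array([0, 30, 60, 90, 150])  # hypothetical example values

diffs_sliced = angles[1:] - angles[:-1]  # differences using index ranges
diffs_numpy = np.diff(angles)            # differences using np.diff

print(diffs_sliced)
print(diffs_numpy)
print(np.array_equal(diffs_sliced, diffs_numpy))  # both give the same result
```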

We can use the `size` attribute of the array to retrieve the number of elements in the array:

In [None]:
print("number of elements in array =", my_array.size) # print the number of elements in array

and the `dtype` attribute to find out the type of elements in the array:

In [None]:
my_array.dtype
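
The element type matters when modifying an array. In the sketch below (array names are hypothetical), assigning a decimal number to an integer array silently drops the decimals. If decimals are needed, one option is to convert the array with the `astype` method first:

```python
import numpy as np

int_array = np.array([10, 20, 30])
int_array[0] = 2.7  # stored as 2, because the array holds integers
print(int_array, int_array.dtype)

float_array = int_array.astype(float)  # new array with float elements
float_array[0] = 2.7                   # now the decimals are kept
print(float_array, float_array.dtype)
```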

### 2D arrays

A 2D array is an array of 1D arrays. It can be constructed as follows:

In [None]:
# create a 3 x 4 array
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8], 
                        [9, 10, 11, 12]])
print(my_2d_array)

To access an element of the array, we use two indexes within brackets. The first index refers to the row, and the second index to the column of the array. This is illustrated with a library cabinet below for the box at row index 2 and column index 2:<br><br>

<img src="../figures/2dArray.png" alt="varTypes" width="600"/><br><br>

In [None]:
# print element in third row and second column of my_2d_array
print(my_2d_array[2,1])

Index ranges allow us to quickly access several elements of the array. This is referred to as *slicing* the array. This figure shows how to access the row with index 3 in my cabinet:

<img src="../figures/slicingRow.png" alt="varTypes" width="600"/><br><br>

In [None]:
# print the first row of my_2d_array
print(my_2d_array[0,:]) # print(my_2d_array[0]) does the same

And this figure shows how to select the column with index 3 in my cabinet:

<img src="../figures/slicingColumn.png" alt="varTypes" width="600"/><br><br>

In [None]:
# print the first column of my_2d_array
print(my_2d_array[:,0]) # : means all rows in first column

In [None]:
# print the first two rows of my_2d_array
print(my_2d_array[:2,:]) # print(my_2d_array[:2]) does the same

In [None]:
# print the last two columns of my_2d_array
print(my_2d_array[:,2:4]) # : means all rows, 2:4 means the last two columns
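
Index ranges can also be applied to rows and columns at the same time. The following sketch, with a hypothetical array named `grid`, extracts a 2 x 2 block and shows that negative indexes are another way to select the last two columns:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

block = grid[1:3, 1:3]  # rows with indexes 1 to 2, columns with indexes 1 to 2
print(block)

last_columns = grid[:, -2:]  # all rows, last two columns
print(last_columns)
```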

We can use the `shape` attribute of the array to obtain the number of rows and columns in the array. This is a tuple whose first element is the number of rows, and second element is the number of columns:

In [None]:
# print number of rows in my_2d_array
print("number of rows in array =", my_2d_array.shape[0]) # shape[0] is number of rows

# print number of columns in my_2d_array
print("number of columns in array =", my_2d_array.shape[1]) # shape[1] is number of columns
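
Since the shape is a tuple, it can also be unpacked into two variables in one step. This is a small sketch with hypothetical names, not a required pattern:

```python
import numpy as np

table = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

n_rows, n_cols = table.shape  # unpack the tuple into two variables
print("rows =", n_rows, "columns =", n_cols)

# the total number of elements is rows times columns
print(table.size == n_rows * n_cols)
```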

### 3D arrays

3D arrays work the same way. They are arrays of 2D arrays:

In [None]:
my_3d_array = np.arange(24).reshape(2,3,4) # constructing a 2 x 3 x 4 array
print(my_3d_array)

Here we use the `numpy.arange` function to generate 24 elements from 0 to 23, and the array's `reshape` method to arrange these elements into a 2 x 3 x 4 array. Let's run some of the operations above for this 3D array:

In [None]:
print(my_3d_array[0], "\n") # print the first 2D array in the 3D array
print(my_3d_array[1], "\n") # print the second 2D array in the 3D array
print(my_3d_array[0,1], "\n") # print the second row of first 2D array 
print(my_3d_array[1,:,2], "\n") # print the third column of second 2D array 
print("shape of array", my_3d_array.shape) # print the shape of the 3D array
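
To see how the three indexes relate to the original sequence from 0 to 23, the sketch below (with a hypothetical array named `cube`) accesses a single element. In a 2 x 3 x 4 array filled this way, each 2D array holds 12 elements and each row holds 4, so the element at indexes `i, j, k` is `i * 12 + j * 4 + k`:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

i, j, k = 1, 2, 3  # second 2D array, third row, fourth column
print(cube[i, j, k])       # element accessed with three indexes
print(i * 12 + j * 4 + k)  # position of that element in the sequence 0 to 23
```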

Finally, the `ndim` attribute tells us the number of dimensions of the array:

In [None]:
print(my_array.ndim)
print(my_2d_array.ndim)
print(my_3d_array.ndim)

### Arrays versus lists

There are two main reasons to use arrays as opposed to lists:

- Arrays are more efficient for storing large amounts of data than lists. 
- Arrays are great for numerical operations; lists cannot directly handle math operations.

To make the last point clear, let's look at the following example:

In [None]:
my_sines = np.sin(np.radians(my_array)) # compute the sine of the elements in my_array
print(np.around(my_sines,2)) # np.around rounds the elements to two decimal places

Here, we use the `numpy.radians` function to convert the elements of the array from degrees to radians, and the `numpy.sin` function to compute the sine of the elements in the array. All in one line of code! In the `print` statement, we use the `numpy.around` function to round the array elements to two decimal places. Operating on all elements of an array at once makes your code faster than looping over the elements one by one. By the way, this is very similar to how arrays in programs such as [Matlab](https://www.mathworks.com) or [Octave](https://www.gnu.org/software/octave/) work.
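
To illustrate the difference with lists, here is a minimal sketch using a short list of angles (hypothetical values chosen for this example). Multiplying a list by 2 repeats the list, and computing the sines requires visiting the elements one by one with the `math` module. The array version handles all elements at once:

```python
import math
import numpy as np

angles_list = [0, 30, 90]            # angles in degrees, as a list
angles_array = np.array(angles_list) # the same angles, as an array

print(angles_list * 2)   # repeats the list: [0, 30, 90, 0, 30, 90]
print(angles_array * 2)  # doubles each element

# with a list, we need to visit each element
sines_list = [math.sin(math.radians(a)) for a in angles_list]
# with an array, one expression operates on all elements
sines_array = np.sin(np.radians(angles_array))

print(np.allclose(sines_list, sines_array))  # both give the same sines
```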

So, when should we use a list or an array?
- If you are storing a relatively short sequence of items and don't plan to do any mathematical operations with it, use a list.
- If you have a very long list of items, and you plan to do numerical operations with them, use an array.

More about array operations in the next section.

## Array operations

There are two main groups of operations that involve `numpy` arrays:

- Element-wise operations
- Linear algebra operations

### Element-wise operations

These are simple element-wise operations that involve an array and a scalar, or two arrays of the same shape. For example:

In [None]:
# create a 3 x 3 array
array_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(array_a)

In [None]:
print(array_a + 2) # array plus scalar

In [None]:
print(array_a - 2) # array minus scalar

In [None]:
print(array_a * 2) # array times scalar

In [None]:
print(array_a / 2) # array divided by scalar

In [None]:
print(array_a ** 2) # array raised to a scalar power

In [None]:
# Create another 3 x 3 array
array_b = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

print(array_a + array_b) # element-wise sum

In [None]:
print(array_a - array_b) # element-wise difference

In [None]:
print(array_a * array_b) # element-wise multiplication

In [None]:
print(array_a / array_b) # element-wise division

In [None]:
print(array_a ** array_b) # element-wise exponentiation
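
Two details of the operations above are worth checking. The sketch below uses two hypothetical arrays, `small` and `large`. Division with `/` returns floats even for integer arrays, while floor division with `//` keeps integers. Adding a 2 x 2 array and a 3 x 3 array raises a `ValueError`, because their shapes do not match:

```python
import numpy as np

small = np.array([[1, 2], [3, 4]])                   # 2 x 2 array
large = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # 3 x 3 array

print((small / 2).dtype)   # division returns floats
print((small // 2).dtype)  # floor division keeps integers

mismatch_detected = False
try:
    small + large  # shapes (2, 2) and (3, 3) do not match
except ValueError as error:
    mismatch_detected = True
    print("cannot add these arrays:", error)
```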

### Linear algebra operations

Linear algebra is very important in geosciences and engineering (check our [online resource in computational geosciences](https://github.com/nfcd/compGeo)). Let's look at some examples:

In [None]:
# create two vectors (1D arrays with 3 elements each)
vector_u = np.array([1, 2, 3])
vector_v = np.array([4, 5, 6])
# compute the magnitude of the vector u
length_u = np.linalg.norm(vector_u)
print("{:.3f}".format(length_u)) # print just 3 decimal places

In [None]:
# make the vector a unit vector by dividing it by its magnitude
vector_uu = vector_u / length_u
print(np.linalg.norm(vector_uu)) # this should print 1.0 (up to floating-point rounding)
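
Because decimal numbers are stored with limited precision, a computed magnitude may differ from 1.0 in the last decimal places. A safer way to check the result, sketched here with hypothetical variable names, is the `np.isclose` function, which compares two numbers within a small tolerance:

```python
import numpy as np

vector = np.array([1, 2, 3])
unit_vector = vector / np.linalg.norm(vector)  # divide by the magnitude
length = np.linalg.norm(unit_vector)

print(np.isclose(length, 1.0))  # True if length is 1.0 within a small tolerance
```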

In [None]:
# compute the dot product of the vectors, this gives a scalar
print(np.dot(vector_u, vector_v))

In [None]:
# compute the cross product of the vectors, this gives another vector
print(np.cross(vector_u, vector_v))
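
A useful property to verify is that the cross product is perpendicular to both input vectors, so its dot product with each of them is zero. The sketch below repeats the two vectors under hypothetical names, so that it can run on its own:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

w = np.cross(u, v)  # vector perpendicular to u and v
print(w)
print(np.dot(w, u))  # dot product with u is 0
print(np.dot(w, v))  # dot product with v is 0
```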

In [None]:
# create two conformable matrices
# columns in matrix a = rows in matrix b
matrix_a = np.array([[1, 2, 3], [4, 5, 6]]) # 2 x 3 matrix
matrix_b = np.array([[7, 8], [9, 10], [11, 12]]) # 3 x 2 matrix
print("matrix a has", matrix_a.shape[0], "rows and", matrix_a.shape[1], "columns")
print("matrix b has", matrix_b.shape[0], "rows and", matrix_b.shape[1], "columns")

In [None]:
# multiply the matrices, this gives a 2 x 2 matrix
print(np.dot(matrix_a, matrix_b))
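
Python also has the `@` operator for matrix multiplication, which for 2D arrays gives the same result as `np.dot`. The sketch below uses hypothetical names for the two matrices. Note that `*` is still the element-wise product, so it cannot be used to multiply matrices:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])        # 2 x 3 matrix
b = np.array([[7, 8], [9, 10], [11, 12]])   # 3 x 2 matrix

product_dot = np.dot(a, b)  # matrix product with np.dot
product_at = a @ b          # matrix product with the @ operator

print(product_at)
print(np.array_equal(product_dot, product_at))  # both give the same 2 x 2 matrix
```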

In [None]:
# create a square (rows = columns) 3 x 3 matrix
matrix_c = np.array([[1, 7, 9], [3, 5, 8], [4, 2, 6]])

# compute the determinant of the matrix
print(np.linalg.det(matrix_c)) 

In [None]:
# compute the inverse of the matrix
matrix_ci = np.linalg.inv(matrix_c)
print(matrix_ci) 

In [None]:
# the matrix times its inverse is equal to the identity matrix,
# a matrix with 1s along the diagonal, and 0s outside (up to floating-point rounding)
print(np.dot(matrix_c, matrix_ci))
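
The printed product may show values such as very small numbers instead of exact 0s, because the inverse is computed with limited precision. One way to confirm the result, sketched here with hypothetical names, is to compare it with the identity matrix from `np.eye` using `np.allclose`:

```python
import numpy as np

matrix = np.array([[1, 7, 9], [3, 5, 8], [4, 2, 6]])
inverse = np.linalg.inv(matrix)

product = np.dot(matrix, inverse)  # should be close to the identity matrix
identity = np.eye(3)               # 3 x 3 identity matrix

print(np.allclose(product, identity))  # True if all elements agree within a tolerance
```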

Here we use the `numpy` linear algebra functions (some of them in the `linalg` module) to perform these operations.

`numpy` includes many more [operations](https://numpy.org/doc/stable/reference/routines.html). It is not surprising that data analytics and data science rely so heavily on `numpy` arrays.

To practice the first two notebooks, try the exercises in [lab1_1](../lab/lab1_1.ipynb).