
## Numpy

Numpy is an extremely useful library that lets Python work with numerical data at high speed. You will be using it regularly throughout your code. First, you need to import the package at the start of your notebook.

In [22]:
import numpy as np

One of the most important things numpy can do is import data, but this is dealt with in its own section here ([importing data](https://github.com/alvaphelan/Python-HowTos/blob/main/Importing_Data.ipynb)).

Arrays are the central feature of numpy and are used widely. An array is just a grid of values and can have one or more dimensions.

In [23]:
# example of a one-dimensional array
one_d = np.array([2, 4, 6, 8, 10])
# example of a two-dimensional array
two_d = np.array([(1, 3, 5), (2, 4, 6)])  # (2x3 matrix in this case)

Numpy has a lot of tools for finding out about arrays (very useful when dealing with long ones) and for creating them!

#### **Finding information about arrays**

It is possible to find the size of an array (the total number of elements) by using the function 'np.size()', or you can use the pure Python function 'len()'.


In [24]:
print(np.size(one_d))
print(len(one_d))

5
5


However, you have to be careful with the 'len()' function: for two-dimensional arrays, it only returns the number of rows!


In [25]:
print(np.size(two_d))
print(len(two_d))

6
2


For multidimensional arrays, 'np.shape()' will give you the dimensions of the array as (rows, columns).

In [26]:
print(np.shape(two_d))

(2, 3)
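As a small sketch of the same idea, the shape, number of dimensions and size can also be read directly from the array as attributes. This repeats the 'two_d' array from above so the cell runs on its own; the names 'n_rows' and 'n_cols' are just illustrative.

```python
import numpy as np

two_d = np.array([(1, 3, 5), (2, 4, 6)])

# unpack the (rows, columns) tuple into two separate variables
n_rows, n_cols = two_d.shape
print(n_rows, n_cols)

# number of dimensions (2 for a matrix) and total number of elements
print(two_d.ndim)
print(two_d.size)
```

Note that the total size is always the product of the numbers in the shape, here 2 x 3 = 6.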


It is possible to get a particular element by writing the name of the array, followed by square brackets containing the position (index) of interest.


*note: indexing starts at 0, so the first element is at position 0, not 1!*

In [27]:
# to find the first and third elements of the one-dimensional array
print(one_d[0])
print(one_d[2])

2
6


The same works for a two-dimensional array, but remember to give the full position: the row index first, then the column index!

In [28]:
#to find the element in the first row, third column
print(two_d[0, 2])

5


##### **Slicing**
If creating a smaller array from a larger one that already exists, e.g. to look at a particular part of interest, slicing is a good way to do so. It works similarly to getting a specific element, except in this case you put a ':' between the index of the first element of interest and the index one after the last. (For example, [2:6] means the third element up to but not including the seventh element.)

*note: simply putting [:] will return the whole array, while [:4] will return everything up to but not including the fifth element, and [4:] will return everything from the fifth element onwards.*

In [29]:
# for example, to make a smaller array of the second, third, fourth and fifth elements of the one-dimensional array
sliced_array = one_d[1:5]
print(sliced_array)

[ 4  6  8 10]
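The slice forms mentioned in the note can be tried out on a longer array. This is only an illustrative sketch; the array 'values' is made up for the example and is not used elsewhere in this notebook.

```python
import numpy as np

# illustrative array holding the numbers 0 to 9
values = np.arange(10)

middle = values[2:6]       # third element up to, but not including, the seventh
first_four = values[:4]    # everything before the fifth element
from_fifth = values[4:]    # the fifth element onwards
everything = values[:]     # the whole array

print(middle)
print(first_four)
print(from_fifth)
print(everything)
```

The length of a slice [a:b] is simply b - a, which is a quick way to check that you have picked the right limits.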


This also works for two-dimensional arrays. You just need to put a ',' between the row slice and the column slice.

In [30]:
# to get the second row, all columns
print(two_d[1, :])
# to get all rows, but only the second column
print(two_d[:, 1])

[2 4 6]
[3 4]


#### **Creating arrays**

If creating an array from scratch, it is possible to make an evenly spaced array with 'np.arange()', where you choose the start point, end point and step size.

*Be mindful: as with slicing, the array goes up to but does not include the end point*

In [35]:
# for example to get from 1 to 5 in steps of 0.5
space = np.arange(1, 5.5, 0.5)
print(space)

[1.  1.5 2.  2.5 3.  3.5 4.  4.5 5. ]
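To see why the end point above was set to 5.5 rather than 5, the two choices can be compared side by side. This is just a quick sketch; the variable names are illustrative.

```python
import numpy as np

# end point of 5 is excluded, so the array stops at 4.5
without_five = np.arange(1, 5, 0.5)
# end point of 5.5 is excluded, so the array stops at 5.0
with_five = np.arange(1, 5.5, 0.5)

print(without_five[-1], len(without_five))
print(with_five[-1], len(with_five))
```

A common habit is to set the end point to one step past the last value you actually want.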


If you want to create an array with a certain number of points, rather than a certain step size, you can use the function 'np.linspace()', where you choose the start point, end point and number of points. Unlike 'np.arange()', the end point is included by default.

In [36]:
# for example, go from 1 to 10 using 5 evenly spaced points
numbers = np.linspace(1,10,5)
print(numbers)

[ 1.    3.25  5.5   7.75 10.  ]
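If you also want to know the step size that linspace ended up using, the 'retstep' argument returns it alongside the array. The sketch below also shows 'endpoint=False', which leaves the end point out in the same way arange does. The variable names are illustrative.

```python
import numpy as np

# retstep=True returns a tuple: (the array, the spacing between points)
numbers, step = np.linspace(1, 10, 5, retstep=True)
print(numbers)
print(step)

# endpoint=False excludes the end point, so the spacing becomes (10 - 1) / 5
open_ended = np.linspace(1, 10, 5, endpoint=False)
print(open_ended)
```

With the end point included, 5 points means only 4 gaps, which is why the step is 9 / 4 rather than 9 / 5.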


Another function similar to linspace is 'np.logspace()'. This function returns an array of numbers that are evenly spaced on a logarithmic scale. It has the following parameters:
numpy.logspace(start, stop, num=50, endpoint=True, base=10.0)
Here 'start' and 'stop' are exponents, not the values themselves: the array runs from the base raised to the power 'start' up to the base raised to the power 'stop'. 'num' is the number of points you want, 'base' is the log base, and setting 'endpoint' to 'True' includes the last point in the array.
Example (from 10^2 = 100 to 10^4 = 10000):

In [37]:
print(np.logspace(2, 4, num = 6, endpoint = True, base = 10))

[  100.           251.18864315   630.95734448  1584.89319246
  3981.07170553 10000.        ]
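One way to check your understanding of logspace is to build the same array by hand: make evenly spaced exponents with linspace, then raise the base to those powers. This is a sketch for illustration, with made-up variable names.

```python
import numpy as np

# evenly spaced exponents from 2 to 4
exponents = np.linspace(2, 4, num=6)
# raise the base (10) to each exponent
by_hand = 10.0 ** exponents

from_numpy = np.logspace(2, 4, num=6, base=10)

# np.allclose compares two arrays while allowing for tiny rounding differences
print(np.allclose(by_hand, from_numpy))
```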


The function 'np.geomspace()' works very similarly, but here 'start' and 'stop' are the actual first and last values of the array rather than exponents.
Example:

In [38]:
print(np.geomspace(2, 4, num = 6, endpoint = True))

[2.         2.29739671 2.63901582 3.03143313 3.48220225 4.        ]
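The difference between the two functions is easiest to see by asking both for the same array. In this sketch, geomspace is given the values 100 and 10000 directly, while logspace is given the exponents 2 and 4. The variable names are illustrative.

```python
import numpy as np

from_geomspace = np.geomspace(100, 10000, num=6)
from_logspace = np.logspace(2, 4, num=6)
print(np.allclose(from_geomspace, from_logspace))

# in a geometric sequence, each point is the previous one times a constant ratio
ratios = from_geomspace[1:] / from_geomspace[:-1]
print(np.allclose(ratios, ratios[0]))
```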


##### **Various Arrays**

There are many numpy functions that can be used to create arrays. If looking to create one containing only 0's, you can use 'np.zeros()' and fill in the bracket with the length (or the shape, e.g. (2, 3)) of the array you want. 'np.ones()' will do the same but fill it with only 1's, and 'np.eye(4)' will create a 4x4 identity matrix (replace the 4 with whatever size of square matrix is needed).

You can also easily create an array full of ones or zeros with the same dimensions as an array that is already defined, using the functions 'np.ones_like()' or 'np.zeros_like()'.

For example:

In [39]:
print(np.zeros_like(two_d))

[[0 0 0]
 [0 0 0]]
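The other functions mentioned above can be sketched in the same way. The sizes chosen here are arbitrary and the variable names are illustrative.

```python
import numpy as np

two_d = np.array([(1, 3, 5), (2, 4, 6)])

zeros_1d = np.zeros(4)       # one-dimensional array of four zeros
ones_2d = np.ones((2, 3))    # pass a tuple to get a 2x3 array of ones
identity = np.eye(3)         # 3x3 identity matrix

print(zeros_1d)
print(ones_2d)
print(identity)

# zeros, ones and eye produce floats by default,
# while zeros_like copies the data type of the array it is given
print(zeros_1d.dtype, np.zeros_like(two_d).dtype)
```

This is why the output of 'np.zeros_like(two_d)' above shows whole numbers without a decimal point.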


#### **Operations on arrays**

In order to perform an operation on every item in an array, it is possible to create a for loop (as seen in [this section](https://github.com/UCD-Physics/Python-HowTos/blob/main/Writing_Loops.ipynb)). However, a much faster way to do this is to perform the operation on the whole array at once.

For example, if you need to multiply every element by 3:


In [40]:
one_d_by_three = 3*one_d
print(one_d_by_three)

[ 6 12 18 24 30]
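For comparison, here is a sketch of the loop version next to the whole-array version, to show that they give the same result. The names 'looped' and 'whole_array' are illustrative.

```python
import numpy as np

one_d = np.array([2, 4, 6, 8, 10])

# loop version: fill an empty array one element at a time
looped = np.zeros_like(one_d)
for i in range(len(one_d)):
    looped[i] = 3 * one_d[i]

# whole-array version: one line, no loop
whole_array = 3 * one_d

print(np.array_equal(looped, whole_array))
```

The results are identical, but the whole-array version is shorter to write and, for large arrays, runs much faster.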


or, if you want to create a log/log plot, you can just take the log of the whole array ('np.log()' is the natural log; use 'np.log10()' for base 10):

In [41]:
print(np.log(one_d))

[0.69314718 1.38629436 1.79175947 2.07944154 2.30258509]


There are also many functions to find out various bits of information about the array without having to create a loop, such as:

In [42]:
print(f"max value in array is {np.max(one_d)}")
print(f"min value in array is {np.min(one_d)}")
print(f"sum of all numbers in array is {np.sum(one_d)}")
print(f"mean of the numbers in array is {np.mean(one_d)}")
print(f"standard deviation of numbers in array is {np.std(one_d):.3f}")

max value in array is 10
min value in array is 2
sum of all numbers in array is 30
mean of the numbers in array is 6.0
standard deviation of numbers in array is 2.828


#### **Min and Max**

To find the position of the maximum and minimum in the array, you can use the 'np.argmax()' and 'np.argmin()' functions. Be careful not to mix these up with 'np.max()' and 'np.min()': the 'arg' versions return the position, not the value.


In [43]:
print(f"The max value in the array is {np.max(one_d)} and occurs at the position {np.argmax(one_d)}")
print(f"The min value in the array is {np.min(one_d)} and occurs at the position {np.argmin(one_d)}")


The max value in the array is 10 and occurs at the position 4
The min value in the array is 2 and occurs at the position 0
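The position returned by argmax can be used straight away as an index. The sketch below also shows what happens with a two-dimensional array: by default argmax counts positions as if the array were flattened into one long row, and 'np.unravel_index()' converts that back into a (row, column) position. The variable names are illustrative.

```python
import numpy as np

one_d = np.array([2, 4, 6, 8, 10])
two_d = np.array([(1, 3, 5), (2, 4, 6)])

# use the position as an index to get the value back
max_pos = np.argmax(one_d)
print(one_d[max_pos] == np.max(one_d))

# for a 2D array, the position counts along the flattened array
flat_pos = np.argmax(two_d)
print(flat_pos)

# convert the flat position into (row, column)
row, col = np.unravel_index(flat_pos, two_d.shape)
print(row, col)
```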


#### **Matrix Operations**

Matrices can be very useful and numpy has a lot of 'built-in' functions to deal with them. A two-dimensional array can be treated as a matrix, like the 2x3 array 'two_d' defined earlier. Each row is written in its own set of brackets, and the rows are separated by a ','.
An example of a 3x4 matrix is:

In [51]:
three_by_four = np.array([[0, 2, 4, 6],
                          [3, 2, 1, 5],
                          [2, 4, 2, 1]])

In [56]:
# remember, np.shape will return (number of rows, number of columns)
np.shape(three_by_four)

(3, 4)

It is possible to easily transpose this matrix (swap its rows and columns) with the 'np.transpose()' function.

In [57]:
three_by_four_t = np.transpose(three_by_four)
print(three_by_four_t)

[[0 3 2]
 [2 2 4]
 [4 1 2]
 [6 5 1]]
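As a quick check of what transposing does, the sketch below uses the '.T' shorthand, which gives the same result as 'np.transpose()' for a numpy array, and confirms that the element in row i, column j ends up in row j, column i. The matrix is repeated so the cell runs on its own.

```python
import numpy as np

three_by_four = np.array([[0, 2, 4, 6],
                          [3, 2, 1, 5],
                          [2, 4, 2, 1]])

# .T is shorthand for np.transpose on a numpy array
transposed = three_by_four.T

print(transposed.shape)
print(np.array_equal(transposed, np.transpose(three_by_four)))

# first row, fourth column of the original is fourth row, first column of the transpose
print(three_by_four[0, 3], transposed[3, 0])
```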


Another useful function is 'np.reshape()'. This can reshape a matrix that is already defined into a new n x m matrix, as long as the total number of elements stays the same.

Example:

In [63]:
# changing the 3x4 matrix to 6x2
np.reshape(three_by_four, (6, 2))
# note: the new shape must hold the same total number of elements (here 12), or you will get an error

array([[0, 2],
       [4, 6],
       [3, 2],
       [1, 5],
       [2, 4],
       [2, 1]])
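To see the error mentioned in the note, the sketch below tries a shape that does not fit and catches the resulting 'ValueError'. It also shows that one dimension can be given as -1, in which case numpy works out the right number for you. The variable names are illustrative.

```python
import numpy as np

three_by_four = np.array([[0, 2, 4, 6],
                          [3, 2, 1, 5],
                          [2, 4, 2, 1]])

# 5 x 2 = 10 elements, but the matrix has 12, so this raises a ValueError
try:
    np.reshape(three_by_four, (5, 2))
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print(f"reshape failed: {error}")

# -1 tells numpy to work out that dimension from the number of elements
inferred = np.reshape(three_by_four, (-1, 2))
print(inferred.shape)
```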