# Introduction to numpy

   Numpy is a package that provides types and functions for mathematical calculations on arrays. The numpy library is vast and encapsulates a wide range of tools for linear algebra, Fourier analysis, statistics and much more. The full manual for numpy can be found here:
    https://docs.scipy.org/doc/numpy/reference/
    
   In this tutorial, we will touch on the basics of numpy and see how numpy can be a convenient tool for arithmetic operations on arbitrarily large arrays. 
   In addition, we will give a brief introduction to matplotlib, which contains tools to display diagrams or images that help illustrate the results of your calculations.
   
   Numpy is almost universally imported using the alias 'np'. Given that code using numpy generally makes frequent calls to numpy functions or objects, saving those 3 letters actually matters.

In [1]:
# Let's first import the package
import numpy as np
#Tadaaaa now we have all the power of the mighty numpy at our disposal. 
#Let's use it responsibly

## Numpy's ndarray

One of the reasons that make numpy a great tool for computations on arrays is its ndarray class. This class lets us declare arrays with a number of convenient methods and attributes that make our life easier when programming complex algorithms on large arrays.
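
As a small illustration of the attributes mentioned above, the sketch below builds an array from a nested list with `np.array` and inspects a few of its attributes. The variable name `demo` and its values are only illustrative.

```python
import numpy as np

# A 2x3 array built from a nested list (the name `demo` is illustrative)
demo = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print('shape: ', demo.shape)  # number of elements along each dimension
print('ndim:  ', demo.ndim)   # number of dimensions
print('size:  ', demo.size)   # total number of elements
print('dtype: ', demo.dtype)  # type shared by all the elements
```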

In [2]:
#Let's take a look at the class:
print(np.ndarray.__doc__)


ndarray(shape, dtype=float, buffer=None, offset=0,
            strides=None, order=None)

    An array object represents a multidimensional, homogeneous array
    of fixed-size items.  An associated data-type object describes the
    format of each element in the array (its byte-order, how many bytes it
    occupies in memory, whether it is an integer, a floating point number,
    or something else, etc.)

    Arrays should be constructed using `array`, `zeros` or `empty` (refer
    to the See Also section below).  The parameters given here refer to
    a low-level method (`ndarray(...)`) for instantiating an array.

    For more information, refer to the `numpy` module and examine the
    methods and attributes of an array.

    Parameters
    ----------
    (for the __new__ method; see Notes below)

    shape : tuple of ints
        Shape of created array.
    dtype : data-type, optional
        Any object that can be interpreted as a numpy data type.
    buffer : object exposing buf

In [3]:
#And its attributes and methods
dir(np.ndarray)

['T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_finalize__',
 '__array_interface__',
 '__array_prepare__',
 '__array_priority__',
 '__array_struct__',
 '__array_ufunc__',
 '__array_wrap__',
 '__bool__',
 '__class__',
 '__complex__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__ilshift__',
 '__imatmul__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_e

In [4]:
#Now let's see what one of its instances looks like (the values are uninitialised memory, hence arbitrary):
a = np.ndarray(4)
b = np.ndarray([3,4])
print(type(b))
print('a: ', a)
print('b: ', b)

<class 'numpy.ndarray'>
a:  [2.35541533e-312 6.79038654e-313 2.22809558e-312 2.14321575e-312]
b:  [[0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000]
 [0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000]
 [0.00000000e+000 1.66148838e-287 1.28822975e-231 1.28822975e-231]]


There is a wide range of numpy functions that let you declare ndarrays filled with your favourite flavours:

https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html

In [5]:
# zeros
z = np.zeros(5)
print(type(z))
print(z)

<class 'numpy.ndarray'>
[0. 0. 0. 0. 0.]


In [6]:
# ones
o = np.ones((4,2))
print(type(o))
print(o)

<class 'numpy.ndarray'>
[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]


In [7]:
# ordered integers
oi = np.arange(10) #Only one-dimensional
print(type(oi))
print(oi)

<class 'numpy.ndarray'>
[0 1 2 3 4 5 6 7 8 9]
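
A few other creation functions from the page linked above follow the same pattern. This is only a short sketch; the names `f`, `e` and `zl` are illustrative.

```python
import numpy as np

# A 2x3 array where every element is 7
f = np.full((2, 3), 7)
# The 3x3 identity matrix
e = np.eye(3)
# An array of zeros with the same shape and type as f
zl = np.zeros_like(f)

print(f)
print(e)
print(zl)
```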


### Operations on ndarrays

Arithmetic operations on ndarrays are possible using python's usual operators. It is important to notice that these operations are performed element by element on arrays of the same shape. It is also possible to perform operations between ndarrays and numbers, in which case the same operation is applied to all the elements of the array. More generally, this holds for operations between arrays where one array lacks one or several dimensions, as long as the remaining dimensions are compatible (numpy calls this broadcasting).
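
To make the last case concrete, here is a minimal sketch of an operation between a 2-dimensional array and a 1-dimensional one. The names `m` and `row` are illustrative and do not appear elsewhere in this tutorial.

```python
import numpy as np

# A 3x4 array and a 1-D array whose length matches the last dimension
m = np.arange(12).reshape(3, 4)
row = np.array([10, 20, 30, 40])

# `row` is added to every line of `m`
print(m + row)
# A plain number is applied to every element
print(m * 2)
```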

In [8]:
#An array of ordered integers
x = np.arange(5)
#An array of random values drawn uniformly between 0 and 1
y = np.random.rand(5)
print('x: ', x)
print('y: ', y)

x:  [0 1 2 3 4]
y:  [0.71002536 0.43486962 0.63831997 0.08011913 0.32726266]


In [9]:
print('addition: ', x + y)
print('multiplication: ', x * y)
print('power: ', x ** y)

addition:  [0.71002536 1.43486962 2.63831997 3.08011913 4.32726266]
multiplication:  [0.         0.43486962 1.27663994 0.2403574  1.30905065]
power:  [0.         1.         1.55651553 1.09200981 1.57409796]


In [10]:
#Operation with numbers
print('subtraction: ', x - 3)
print('fraction: ', x / 2)
print('power: ', x ** 0.5)

subtraction:  [-3 -2 -1  0  1]
fraction:  [0.  0.5 1.  1.5 2. ]
power:  [0.         1.         1.41421356 1.73205081 2.        ]


In [11]:
#Beware incompatible shapes: (play with the dimensions of y)
y = np.ones((6))
print('addition: ', x + y)
print('multiplication: ', x * y)
print('power: ', x ** y)

ValueError: operands could not be broadcast together with shapes (5,) (6,) 
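
The sketch below contrasts a pair of shapes that fails with a pair that works, assuming the usual rule that shapes are compared starting from the last dimension. The names `u`, `v` and `w` are illustrative.

```python
import numpy as np

u = np.arange(5)
v = np.ones(6)

# Shapes (5,) and (6,) differ along their only dimension: this fails
try:
    u + v
except ValueError as err:
    print('ValueError:', err)

# Shapes (5, 3, 4) and (4,) match along the last dimension: this works
w = np.ones((5, 3, 4)) + np.arange(4)
print(w.shape)
```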

ndarrays and numpy also have methods or functions to perform matrix operations:

In [12]:
#Let's just declare some new arrays
x = (np.random.rand(4,5)*10).astype(int) # note: astype is a method that changes the type of all the elements in the ndarray
y = np.ones((5))+1
# Note: arrays with different shapes can still be added when their trailing dimensions match:
#np.ones((5,3,4))+np.random.randn(4)

In [13]:
#transpose
print('the array x: \n', x)
print('its transpose: \n', x.T)

the array x: 
 [[1 2 1 1 5]
 [6 3 1 6 1]
 [3 2 9 3 4]
 [0 3 7 1 5]]
its transpose: 
 [[1 6 3 0]
 [2 3 2 3]
 [1 1 9 7]
 [1 6 3 1]
 [5 1 4 5]]


In [14]:
#Matrix multiplication (play with the dimensions of y to see how this impacts the results)
z1 = np.dot(x,y)
z2 = x.dot(y)
print(z1)
print(z2)


[20. 34. 42. 32.]
[20. 34. 42. 32.]
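
Python's `@` operator offers a third spelling of the same product. The sketch below uses two small illustrative matrices `A` and `B` (not the `x` and `y` above) to show how the shapes combine.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
B = np.arange(12).reshape(3, 4)  # shape (3, 4)

# Matrix product: the last dimension of A must match the first of B
P = A @ B
print(P.shape)
print(np.array_equal(P, np.dot(A, B)))
```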


### ndarray methods for simple operations on array elements

Here I list a small number of ndarray methods that are very convenient and often used in astronomy and image processing. It is always good to keep them in mind to simplify your code. Of course, we only take a look at a few of them, but there are plenty more where they came from.

In [15]:
a = np.linspace(1,6,3) # 3 values evenly spaced between 1 and 6
b = np.arange(16).reshape(4,4)
c = np.random.randn(3,4)*10 # random draws from a normal distribution with standard deviation 10
print('Here are 3 new arrays:\n {}, \n\n {}\n and \n {}'.format(a, b, c))

Here are 3 new arrays:
 [1.  3.5 6. ], 

 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
 and 
 [[  1.15176078  -2.83274596 -11.26326049  10.40245995]
 [ 22.78299066 -12.72256218   1.53934741  27.99788344]
 [ -8.40377001  13.40163784   9.4279334   -2.5229953 ]]


In [16]:
#Sum the elements of an array
print('Sum over all of the array\'s elements: ', a.sum())
print('Sum along the lines: ', b.sum(axis = 1))
print('Sum along the columns: ', b.sum(axis = 0))
#The axis option will be available for most numpy functions/methods

Sum over all of the array's elements:  10.5
Sum along the lines:  [ 6 22 38 54]
Sum along the columns:  [24 28 32 36]


In [17]:
#Compute the mean and standard deviation:
print('mean of an array: ', b.mean())
print('std of an array: ', c.std())

mean of an array:  7.5
std of an array:  12.440977946221222
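
As with sum, mean and std accept the axis option. The following sketch, on an illustrative array `data`, uses it to centre and rescale each column.

```python
import numpy as np

data = np.arange(12, dtype=float).reshape(3, 4)

# Mean and standard deviation of each column (computed along axis 0)
col_mean = data.mean(axis=0)
col_std = data.std(axis=0)

# Centre and rescale every column; the 1-D statistics are applied
# to each line of `data`
standardised = (data - col_mean) / col_std

print(standardised.mean(axis=0))  # close to 0 for every column
print(standardised.std(axis=0))   # close to 1 for every column
```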


In [18]:
#min and max of an array and their positions
print('the minimum value of array b is {} and it is at position {}'.format(b.min(), b.argmin()))
print('the maximum value of array c is {} and it is at position {}'.format(c.max(), c.argmax()))

the minimum value of array b is 0 and it is at position 0
the maximum value of array c is 27.99788343724404 and it is at position 7
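
The positions returned by argmin and argmax refer to the flattened array. If a (line, column) position is needed, `np.unravel_index` can convert it, as in this sketch on an illustrative array `grid`.

```python
import numpy as np

grid = np.array([[3, 8, 1],
                 [9, 2, 7]])

# Position of the maximum in the flattened array
flat_pos = grid.argmax()
# Matching (line, column) position in the 2-D array
row, col = np.unravel_index(flat_pos, grid.shape)

print(int(flat_pos), int(row), int(col), grid[row, col])
```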


In [19]:
#sort an array's elements along one axis or return the indexes of the sorted array's element:
argc = c.argsort()
print('c, the indexes that sort c and a sorted version of c: \n {}, \n \n {} and \n {} \n'.format(c,argc,c.sort()))

c, the indexes that sort c and a sorted version of c: 
 [[-11.26326049  -2.83274596   1.15176078  10.40245995]
 [-12.72256218   1.53934741  22.78299066  27.99788344]
 [ -8.40377001  -2.5229953    9.4279334   13.40163784]], 
 
 [[2 1 0 3]
 [1 2 0 3]
 [0 3 2 1]] and 
 None 



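Oops, not what we were expecting: the last value printed is None. What happened is that c.sort() sorts c in place and returns None, so c itself was replaced by its sorted version. 
This is an in-place computation. Note that the indexes were computed before the sort, so they refer to the original order of c.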

In [20]:
print(c)

[[-11.26326049  -2.83274596   1.15176078  10.40245995]
 [-12.72256218   1.53934741  22.78299066  27.99788344]
 [ -8.40377001  -2.5229953    9.4279334   13.40163784]]
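
When the original array must be preserved, the function `np.sort` returns a sorted copy instead. The sketch below compares both behaviours on a small illustrative array `d`.

```python
import numpy as np

d = np.array([[3, 1, 2],
              [9, 7, 8]])

s = np.sort(d)   # returns a sorted copy (along the last axis by default)
print(s)
print(d)         # d is unchanged

r = d.sort()     # sorts d in place and returns None
print(r)
print(d)         # d is now sorted
```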


In [21]:
#Your turn now: give me ALL the elements of c sorted (not just along one axis). 

#Your answer....

In [22]:
#Then, create an array with the same shape as c whose elements are the elements of c sorted in decreasing order.
#The returned sorted array should be read from left to right and top to bottom.

#Your answer....

### array shapes
It is possible to access the shape and size (there is a difference!) of an array, and even to alter its shape in various ways.

In [23]:
print('Shape of x: ',x.shape) # From ndarray attributes
print('Shape of y: ',np.shape(y)) # From numpy function

Shape of x:  (4, 5)
Shape of y:  (5,)


In [24]:
print('Size of x: ', x.size) # From ndarray attributes
print('Size of y: ', np.size(y)) # From numpy function

Size of x:  20
Size of y:  5


Now this is how we can change an array's shape:

In [25]:
print('the original array: \n', x)
print('change of shape: \n', x.reshape((10,2)))#reshape 4x5 into 10x2
print('change of shape and number of dimensions: \n', x.reshape((5,2,2)))#reshape 4x5 into 5x2x2
print('the size has to be conserved: \n', x.reshape((10,2)).size)

the original array: 
 [[1 2 1 1 5]
 [6 3 1 6 1]
 [3 2 9 3 4]
 [0 3 7 1 5]]
change of shape: 
 [[1 2]
 [1 1]
 [5 6]
 [3 1]
 [6 1]
 [3 2]
 [9 3]
 [4 0]
 [3 7]
 [1 5]]
change of shape and number of dimensions: 
 [[[1 2]
  [1 1]]

 [[5 6]
  [3 1]]

 [[6 1]
  [3 2]]

 [[9 3]
  [4 0]]

 [[3 7]
  [1 5]]]
the size has to be conserved: 
 20


In [26]:
#flattening an array:
xflat = x.flatten()

print('flattened array: \n {} \n with shape {}'.format(xflat, xflat.shape))


flattened array: 
 [1 2 1 1 5 6 3 1 6 1 3 2 9 3 4 0 3 7 1 5] 
 with shape (20,)
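
Two close relatives of flatten are worth knowing: `ravel`, which returns a view on the original data when it can, and `reshape(-1)`. The sketch below, on an illustrative array `g`, shows the practical difference between a copy and a view.

```python
import numpy as np

g = np.arange(6).reshape(2, 3)

flat_copy = g.flatten()   # always a new array
flat_view = g.ravel()     # a view on g's data when possible

flat_copy[0] = 100        # does not affect g
flat_view[1] = 200        # modifies g as well

print(g)
print(g.reshape(-1))      # -1 lets numpy infer the length
```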


### Indexing with numpy

For the most part, indexing in numpy works exactly as we saw in python. We are going to use this section to introduce a couple of indexing features (some native to python) that can significantly improve your code. In particular, numpy introduces a particularly useful object: np.newaxis.

In [27]:
#conventional indexing
print(x)
print('first line of x: {}'.format(x[0,:]))
print('second column of x: {}'.format(x[:,1]))
print('last element of x: {}'.format(x[-1,-1]))

[[1 2 1 1 5]
 [6 3 1 6 1]
 [3 2 9 3 4]
 [0 3 7 1 5]]
first line of x: [1 2 1 1 5]
second column of x: [2 3 2 3]
last element of x: 5


In [28]:
#selection 
print('One element in 3 between the second and 14th element: ', xflat[1:14:3])
#This selection is written as array[begin:end:step], where end is excluded
#Equivalent to:
print('One element in 3 between the second and 14th element: ', xflat[slice(1,14,3)])
#Both notations are strictly equivalent, but slice lets us declare slices that can be reused on different arrays:
sl1 = slice(1,3,1)
sl2 = slice(0,-1,2)
print('sliced array: ', x[sl1, sl2])

One element in 3 between the second and 14th element:  [2 5 1 3 3]
One element in 3 between the second and 14th element:  [2 5 1 3 3]
sliced array:  [[6 1]
 [3 9]]


In [29]:
#conditional indexing
print('all numbers greater than 3: ', x[x>3])
bool_array = (x == 8)
print('bool_array is an array of booleans that can be used as indices: \n',bool_array)
print('all numbers equal to 8: ', x[bool_array])

all numbers greater than 3:  [5 6 6 9 4 7 5]
bool_array is an array of booleans that can be used as indices: 
 [[False False False False False]
 [False False False False False]
 [False False False False False]
 [False False False False False]]
all numbers equal to 8:  []
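
Boolean arrays can also be combined, used for assignment, or turned into indices with `np.where`. Here is a short sketch on an illustrative array `vals`.

```python
import numpy as np

vals = np.array([[1, 2, 1, 1, 5],
                 [6, 3, 1, 6, 1]])

# Combine conditions with & (and) or | (or); the parentheses are required
mask = (vals > 1) & (vals < 6)
print(vals[mask])

# A boolean index can also be used to assign values in place
clipped = vals.copy()
clipped[clipped > 3] = 3
print(clipped)

# np.where returns the indices where the condition holds
lines, columns = np.where(vals == 6)
print(lines, columns)
```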


In [30]:
#Ellipsis: select all across all missing dimensions
x_multi = np.arange(32).reshape(2,2,4,2)
print(x_multi)
print(x_multi[0,...,1])
print(x_multi[0,:,:,1])

[[[[ 0  1]
   [ 2  3]
   [ 4  5]
   [ 6  7]]

  [[ 8  9]
   [10 11]
   [12 13]
   [14 15]]]


 [[[16 17]
   [18 19]
   [20 21]
   [22 23]]

  [[24 25]
   [26 27]
   [28 29]
   [30 31]]]]
[[ 1  3  5  7]
 [ 9 11 13 15]]
[[ 1  3  5  7]
 [ 9 11 13 15]]


Now, we are going to see an important feature of numpy. While one can live without knowing this trick, one cannot be a good python coder without using it. I am talking about the mighty:
# Newaxis!!
Newaxis adds a dimension of length 1 to an array. This makes it possible to expand arrays cheaply, which leads to faster operations on large arrays.
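
As a first taste, the sketch below builds a small multiplication table without any loop. The arrays `p` and `q` are illustrative; the point is only how the added axes let every element of one array meet every element of the other.

```python
import numpy as np

p = np.arange(1, 4)   # shape (3,)
q = np.arange(1, 5)   # shape (4,)

# p becomes a column of shape (3, 1), q a line of shape (1, 4);
# the product then combines every element of p with every element of q
table = p[:, np.newaxis] * q[np.newaxis, :]

print(table.shape)
print(table)
```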


In [58]:
import numpy as np
#A couple of arrays first:
x_arr = np.arange(10)
y_arr = np.arange(10)
print(x_arr.shape)
x = x_arr[np.newaxis,:]
print(x.shape)
print(x_arr)
print(x)
print(x+x_arr)

(10,)
(1, 10)
[0 1 2 3 4 5 6 7 8 9]
[[0 1 2 3 4 5 6 7 8 9]]
[[ 0  2  4  6  8 10 12 14 16 18]]


In [54]:
#Now let's index these with newaxes:
print('Newaxis indexed array {} and its shape {}'.format(x_arr[:,np.newaxis],x_arr[:,np.newaxis].shape))
print('None leads to the same result: array {} and shape {}'.format(y_arr[None,:],y_arr[None,:].shape))

Newaxis indexed array [[0]
 [1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]] and its shape (10, 1)
None leads to the same result: array [[0 1 2 3 4 5 6 7 8 9]] and shape (1, 10)


In [55]:
#Sum of elements
print('sum of the arrays:', (x_arr + y_arr))


sum of the arrays: [ 0  2  4  6  8 10 12 14 16 18]


In [56]:
#Now sum every element of x_arr with every element of y_arr: how would you do that?


# A quick intro to matplotlib

When writing complex algorithms, it is important to be able to check that calculations are done properly, but also to be able to display results in a clear manner. When dimensionality and size are small, it is still possible to rely on printing, but more generally and for better clarity, drawing graphs will come in handy.

In [None]:
import matplotlib.pyplot as plt

In [None]:
%matplotlib inline

In [None]:
x = np.linspace(0,5,100)

In [None]:
#Plotting a curve (without x values, the x-axis is the index of each sample)
plt.plot(np.exp(x))
plt.show()

In [None]:
#The same curve with the right x-axis, drawn as a red dashed line
plt.plot(x, np.exp(x), '--r')
plt.show()
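
A plot is much easier to read with axis labels and a legend. The sketch below is one possible way to add them, using matplotlib's `plt.subplots` interface; the second curve and the label texts are illustrative choices, not part of the examples above.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 5, 100)

# One figure containing a single set of axes
fig, ax = plt.subplots()
ax.plot(t, np.exp(t), '--r', label='exp(t)')   # red dashed line
ax.plot(t, t ** 2, '-b', label='t squared')    # blue solid line
ax.set_xlabel('t')
ax.set_ylabel('value')
ax.set_title('Two curves on the same axes')
ax.legend()
```

In a notebook with `%matplotlib inline` the figure is rendered at the end of the cell; in a script, a call to `plt.show()` displays it.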