# RIT IMGS 211: Probability and Statistics for Imaging Scientists
### Created by Gabriel J. Diaz

![](https://www.rit.edu/brandportal/sites/rit.edu.brandportal/files/inline-images/new_RIT_full_RGB_hor_k_0.png?export=view&id=XXX)


This notebook covers:

* Importing packages like numpy and matplotlib
* Importing packages in Google colab
* Creating and indexing into numpy arrays
* Creating numpy arrays using np.arange(), np.linspace(), np.ones(), and np.zeros()
* Creating arrays of random numbers with the np.random package

# 1- Packages



Python has a large standard library, commonly cited as one of its greatest strengths, providing tools suited to many tasks. As of May 2017, the Python Package Index (PyPI), the official repository of third-party software for Python, contained over 107,000 packages.

A *package* is a collection of *modules*, i.e. groups of functions, classes, constants, types, etc.

## Built-in modules

To use a module, you have to *import* it using the command `import`.

In [2]:
import math

You can now use it in the code by using its name as a prefix.

In [3]:
x = math.cos(2 * math.pi)

print(x)

1.0


To explore the functions and other contents of a module or library:
 * Use the web documentation (e.g. for the `math` library <a href="https://docs.python.org/3/library/math.html">Doc for Python 3</a>)
 * Use the built-in `help`

In [4]:
help(math.cos)

# You can also execute help(math), but that provides a LOT of output, so I'm not demonstrating that here.

Help on built-in function cos in module math:

cos(x, /)
    Return the cosine of x (measured in radians).



Using the full name as a prefix can make code verbose and hard to read (e.g. `scipy.optimize.minimize`), so Python provides shorter ways to import:
* `import name as nickname`: the prefix to call is now `nickname`

In [5]:
import math as m

print(m.pi)

3.141592653589793


* `from name import function1, constant1`: `function1` and `constant1` can now be called directly. You can even import all contents with `from name import *`, but this can be dangerous because imported names may conflict with or override existing ones. It is therefore not advised, except for modules you wrote yourself.

In [6]:
from math import e,log

print(log(e**4))

4.0
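
To make the name-conflict warning concrete, here is a small sketch using two standard-library modules, `math` and `cmath`, which both define a function called `sqrt`. The choice of these two modules is only for illustration; the same thing can happen with any pair of modules that share a name.

```python
from math import sqrt
print(sqrt(4))        # math.sqrt returns a float: 2.0

# cmath also defines sqrt; importing it rebinds the name sqrt
from cmath import sqrt
print(sqrt(4))        # cmath.sqrt returns a complex number: (2+0j)
print(sqrt(-1))       # 1j (math.sqrt(-1) would raise a ValueError)

# Importing the modules under nicknames keeps both versions available
import math as m
import cmath as cm
print(m.sqrt(4), cm.sqrt(4))
```

The second import replaces the first without any warning, which is why prefixed imports are usually the safer choice.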



## Packages and Google colab

One of the advantages of using Google colab is that many packages are available to import without any additional work.  If you were running a local copy of Python on your machine, you would have to maintain your own Python "environment."

From:
https://www.dataquest.io/blog/a-complete-guide-to-python-virtual-environments/

"Python virtual environments create isolated contexts to keep dependencies required by different projects separate so they don't interfere with other projects or system-wide packages. Basically, setting up virtual environments is the best way to isolate different Python projects, especially if these projects have different and conflicting dependencies. As a piece of advice for new Python programmers, always set up a separate virtual environment for each Python project, and install all the required dependencies inside it — never install packages globally."

Virtual environments can be tricky to get going for the first time, and that's one reason we're using Google colab:  to save time.  However, if you continue to use Python beyond this course, you will likely want to install your own local copy and maintain your own collection of virtual environments.  [Anaconda](https://www.anaconda.com/) can be a valuable tool for getting this started.

Once installed, you can import the packages as above.

In [7]:
import scipy

In [8]:
# help(scipy)

Since a package (like `scipy`) is a collection of modules, you can import only a part of it. See https://docs.scipy.org/doc/scipy/reference/tutorial/index.html#user-guide for instance.

In [9]:
from scipy.stats import norm

norm.pdf(0.5)

0.3520653267642995
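
Called with no extra arguments, `norm.pdf(x)` evaluates the density of the standard normal distribution (mean 0, standard deviation 1), which is `exp(-x**2 / 2) / sqrt(2 * pi)`. As a quick sanity check, the sketch below computes that formula by hand with the `math` module and compares it with scipy's result. The variable name `by_hand` is only for illustration.

```python
import math
from scipy.stats import norm

x = 0.5

# Standard normal density written out by hand
by_hand = math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

print(by_hand)
print(norm.pdf(x))
print(math.isclose(by_hand, norm.pdf(x)))  # True
```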

# 2- Numpy and Matplotlib



**Numpy** is a widely used package for numerical computing and linear algebra. Notably, it provides the *array* (vector/matrix) type used in almost all numerical projects.  [Documentation](https://numpy.org/doc/stable/) and [Reference](https://numpy.org/doc/stable/reference/index.html)


**Matplotlib** is a package for generating 2D and 3D graphics. [Documentation](https://matplotlib.org/stable/contents.html)


It is common to import them with the respective nicknames **np** and **plt** (for `matplotlib.pyplot`).

In [10]:
import numpy as np

## Numpy *arrays*

In Numpy, the *array* type is used for vectors, matrices, and tensors (a matrix type also exists but is seldom used).

Numpy arrays can be defined directly from a list or returned by a function.


### Convert a Python list into a numpy array

In [11]:
x = [1, 2.5, 5, 10]
print(x,type(x))

x = np.array([1, 2.5, 5, 10])
print(x,type(x))

[1, 2.5, 5, 10] <class 'list'>
[ 1.   2.5  5.  10. ] <class 'numpy.ndarray'>


### Create a multidimensional array

In [19]:
# Notice that m is a list of two lists.
m = [[1,2,3],[4,5,6]]
print(m)

M = np.array(m)
print(M)

[[1, 2, 3], [4, 5, 6]]
[[1 2 3]
 [4 5 6]]


## Array shape

The `size` of an array is the number of elements while the `shape` gives how they are arranged.

The notion of array *shape* is very important, especially when arrays are too large to inspect visually.  For example, an image might be 640 rows x 480 columns x 3 color channels (RGB) large!

In [72]:
m = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(m)

print(f'\nThe array size is: {np.size(m)}')  # or equivalently m.size
print(f'\nThe array shape is: {np.shape(m)}')


[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

The array size is: 12

The array shape is: (3, 4)
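
As a sketch of the image example mentioned above, the cell below builds a blank array with the shape of a 640 x 480 RGB image and inspects it. The array is filled with zeros and is a stand-in for illustration, not a real image.

```python
import numpy as np

# A hypothetical blank RGB image: 640 rows, 480 columns, 3 color channels
image = np.zeros((640, 480, 3))

print(image.shape)  # (640, 480, 3)
print(image.ndim)   # 3 dimensions
print(image.size)   # 640 * 480 * 3 = 921600 elements

# The shape is a tuple, so it can be unpacked into separate names
rows, cols, channels = image.shape
print(rows, cols, channels)
```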


## Array indexing

np arrays are powerful, in part, because they allow you to easily index into specific elements.

In [73]:
m = np.array([[1,2,3],[4,5,6]])
print(m)

# Remember, python indexing starts at 0.
print( f'The value in the first row and second column is: {m[0,1]}')
print( f'The value in the second row and second column is: {m[1,1]}')

# Notice that negative indexing still works.
print( f'The value in the last row and last column is: {m[-1,-1]}.')


[[1 2 3]
 [4 5 6]]
The value in the first row and second column is: 2
The value in the second row and second column is: 5
The value in the last row and last column is: 6.


## Slicing into an array with the ":" operator

Use `<start>:<end>` to specify a range or "slice" of the data that you want to grab. The element at `<start>` is included and the element at `<end>` is not.

You can also use ":" on its own to indicate that you want ALL data in a dimension.



In [74]:
m = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(m)

# Use ":" on its own to get all values in a dimension.
print( f'\nAll values in the third row are: {m[2,:]}.')
print( f'All values in the second column are: {m[:,1]}.')

# Use "<start>:<end>" to specify a range or "slice" of the data that you want to grab.
print( f'\nThe second and third values in the third row are: {m[2,1:3]}.')

# Use "<start>:" or ":<end>" to get everything before/after the start/end.
print( f'\nIn the first row, the values that come after the first are: {m[0,1:]}.')
print( f'The first three values in the second row are: {m[1,:3]}.')

# You can combine the techniques.
print( f'\nThe second and third values in every row are:\n {m[:,1:3]}.')


[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

All values in the third row are: [ 9 10 11 12].
All values in the second column are: [ 2  6 10].

The second and third values in the third row are: [10 11].

In the first row, the values that come after the first are: [2 3 4].
The first three values in the second row are: [5 6 7].

The second and third values in every row are:
 [[ 2  3]
 [ 6  7]
 [10 11]].
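
The same slicing rules extend to arrays with more than two dimensions, such as a color image. The sketch below uses a made-up 4 x 6 x 3 array as a stand-in for a tiny image; the values are just the numbers 0 to 71, and the assumption that channel 0 is "red" is only for illustration.

```python
import numpy as np

# A hypothetical tiny "image": 4 rows, 6 columns, 3 color channels,
# filled with the numbers 0..71 so that every element is unique
image = np.arange(4 * 6 * 3).reshape((4, 6, 3))

# All rows and all columns of the first channel (red, if the order is RGB)
red = image[:, :, 0]
print(red.shape)   # (4, 6)

# Crop rows 1-2 and columns 2-4, keeping all channels
crop = image[1:3, 2:5, :]
print(crop.shape)  # (2, 3, 3)

# A single pixel: all three channel values at row 0, column 1
print(image[0, 1, :])  # [3 4 5]
```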


## Numpy array generation

* np.linspace()
* np.arange()
* np.zeros()
* np.ones()

See the corresponding [documentation](https://numpy.org/doc/stable/user/basics.creation.html)

### Number sequences


`arange` returns an array of evenly spaced numbers from `start` up to, but not including, `stop`, with a fixed increment `step`.


`linspace` returns an array of `num` evenly spaced numbers from `start` to `stop`, with `stop` included.

In [None]:
x = np.arange(0, 10, 1.5)
print(x,type(x))

[0.  1.5 3.  4.5 6.  7.5 9. ] <class 'numpy.ndarray'>


In [None]:
y = np.linspace(0, 10, 25)
print(y,type(y))

[ 0.          0.41666667  0.83333333  1.25        1.66666667  2.08333333
  2.5         2.91666667  3.33333333  3.75        4.16666667  4.58333333
  5.          5.41666667  5.83333333  6.25        6.66666667  7.08333333
  7.5         7.91666667  8.33333333  8.75        9.16666667  9.58333333
 10.        ] <class 'numpy.ndarray'>
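
A short side-by-side sketch of the difference at the end point, using a small range so the output is easy to read. The variable names are only for illustration.

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)    # step of 0.25, the stop value 1 is excluded
by_count = np.linspace(0, 1, 5)    # 5 points, the stop value 1 is included

print(by_step)   # [0.   0.25 0.5  0.75]
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(len(by_step), len(by_count))  # 4 5
```

As a rule of thumb, `arange` is convenient when you know the spacing you want, and `linspace` when you know how many points you want.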


## Zeros and Ones


`zeros` returns an array (of floats) of zeros with the specified `shape`

`ones` returns an array (of floats) of ones with the specified `shape`

`eye` returns a square 2D-array (of floats) with ones on the diagonal and zeros elsewhere

In [None]:
x = np.zeros(3)
print(x,x.shape,type(x),x.dtype)

x = np.zeros((3,))
print(x,x.shape,type(x),x.dtype)

[0. 0. 0.] (3,) <class 'numpy.ndarray'> float64
[0. 0. 0.] (3,) <class 'numpy.ndarray'> float64


In [None]:
try:
    x = np.zeros(3,3) # This raises an error: the shape must be a single tuple, (3,3), so double parentheses are needed
except Exception as error:
    print(error)

# The call above failed, so x still holds the array from the previous cell
print(x,x.shape,type(x),x.dtype)

Cannot interpret '3' as a data type
[0. 0. 0.] (3,) <class 'numpy.ndarray'> float64


In [None]:
x = np.zeros((3,3))
print(x,x.shape,type(x),x.dtype)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]] (3, 3) <class 'numpy.ndarray'> float64


In [None]:
y = np.ones(2)
y

array([1., 1.])
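
`eye` is described above but not demonstrated, so here is a minimal sketch. It also shows `ones` taking a tuple shape in the same way as `zeros`.

```python
import numpy as np

identity = np.eye(3)
print(identity)
print(identity.shape, identity.dtype)  # (3, 3) float64

# ones accepts a tuple shape, just like zeros
two_by_three = np.ones((2, 3))
print(two_by_three)
```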

## Creating arrays of random data with numpy


Random arrays can be generated with Numpy's [random](https://numpy.org/doc/stable/reference/random/index.html) module.


`rand` returns an array (of floats) of uniformly distributed numbers in [0,1) with the specified dimensions

`randn` returns an array (of floats) of numbers drawn from the standard normal distribution with the specified dimensions

`randint` returns an array of integers drawn from the discrete uniform distribution (the lower bound is included and the upper bound is excluded)

In [None]:
np.random.rand(5)

array([0.09708256, 0.28084182, 0.03623694, 0.83693802, 0.89064811])

In [None]:
np.random.randn(5,2)

array([[-0.13167876,  0.53288997],
       [ 1.94922176, -0.55397441],
       [-0.8463868 , -1.96584426],
       [ 0.23209953,  0.58799903],
       [-0.08861985,  0.27334093]])

In [None]:
np.random.randint(0,100,size=(10,))

array([88,  6, 78,  3, 18, 17, 96, 13, 13, 90])
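
Because these numbers are random, the outputs above will differ every time the cells run. The sketch below fixes the seed with `np.random.seed` so that results can be reproduced, and checks the ranges described above. The seed value 211 and the sample size 1000 are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(211)  # arbitrary seed, chosen only for this illustration

u = np.random.rand(1000)                  # uniform on [0, 1)
z = np.random.randn(1000)                 # standard normal
k = np.random.randint(0, 100, size=1000)  # integers 0..99 (100 is excluded)

print(u.min() >= 0, u.max() < 1)    # True True
print(k.min() >= 0, k.max() <= 99)  # True True
print(k.dtype)                      # an integer type, not a float
print(z.mean(), z.std())            # close to 0 and 1, but not exactly

# Re-seeding with the same value reproduces the same numbers
np.random.seed(211)
print(np.array_equal(u, np.random.rand(1000)))  # True
```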

## Mathematical operations on np.arrays


In [97]:
v = np.arange(0, 5)

print(f'    v = {v}')
print(f'v+2.5 = {v+2.5}')
print(f'  v*2 = {v*2}')
print(f' v**2 = {v**2}') # ** is the power operator, so v**2 squares each element


    v = [0 1 2 3 4]
v+2.5 = [2.5 3.5 4.5 5.5 6.5]
  v*2 = [0 2 4 6 8]
 v**2 = [ 0  1  4  9 16]
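
The operations above combine an array with a single number. The same operators also work element by element between two arrays of the same shape, as in this small sketch. The array `w` is made up for illustration.

```python
import numpy as np

v = np.arange(0, 5)                  # [0 1 2 3 4]
w = np.array([10, 20, 30, 40, 50])

print(v + w)       # [10 21 32 43 54]
print(v * w)       # [  0  20  60 120 200]
print(np.sqrt(v))  # square root of each element

# Summary statistics over the whole array
print(v.sum(), v.mean())  # 10 2.0
```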


### Transposition

Transposing an array is done by suffixing `.T` (or equivalently by calling the function `np.transpose`). For complex arrays, `.real` and `.imag` give the real and imaginary parts, `np.abs` gives the modulus, and the Hermitian conjugate can be written `.conj().T` (the `.H` shortcut exists only on the `np.matrix` type). The function equivalents are `np.real`, `np.imag`, and `np.conjugate`.

In [113]:
A = np.array([[n+m*10 for n in range(5)] for m in range(4)])

print(f'A=\n{A} \n')
print(f'A.T=\n{A.T}. \nThis is the transpose of A.')

A=
[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]] 

A.T=
[[ 0 10 20 30]
 [ 1 11 21 31]
 [ 2 12 22 32]
 [ 3 13 23 33]
 [ 4 14 24 34]]. 
This is the transpose of A.
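
For complex-valued arrays, the real part, imaginary part, modulus, and conjugate transpose can be obtained as in the sketch below. The array `C` is made up for illustration.

```python
import numpy as np

# A small complex-valued array, made up for this illustration
C = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])

print(C.real)      # real parts
print(C.imag)      # imaginary parts
print(np.abs(C))   # modulus of each element
print(C.conj().T)  # conjugate transpose (Hermitian conjugate)
```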


## Flatten and reshape

* `<np.array>.reshape()`
* `<np.array>.flatten()`

See the Documentation on [arrays](https://numpy.org/doc/stable/reference/arrays.ndarray.html)  and  [array creation](https://numpy.org/doc/stable/reference/routines.array-creation.html).

*Warning:* Operations such as transpose, reshape, etc. do not modify the original array. If you want to keep the result of the operation, you have to assign it to a variable. The notable exceptions are identified as *in-place* in the documentation.

In [None]:
A.reshape((2,10))

array([[ 0,  1,  2,  3,  4, 10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24, 30, 31, 32, 33, 34]])

In [None]:
print(A)

[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]]
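
Since `A` is unchanged by the call above, the reshaped result has to be assigned to a name if you want to keep it. The sketch below does that, and also shows two related behaviors of `reshape`: passing `-1` for a dimension that numpy should work out, and the error raised when the new shape does not match the number of elements. The names `B` and `C` are only for illustration.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(4)])

B = A.reshape((2, 10))   # keep the result by assigning it to a new name
print(A.shape, B.shape)  # (4, 5) (2, 10)

# -1 asks numpy to work out that dimension from the array size
C = A.reshape((10, -1))
print(C.shape)           # (10, 2)

# The new shape must hold exactly A.size elements
try:
    A.reshape((3, 7))
except ValueError as error:
    print(error)
```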


In [125]:
A = np.array([[n+m*10 for n in range(5)] for m in range(4)])

print(f'A:\n{A}\n')
print(f'The shape of A is: {np.shape(A)}.')
print(f'That means it has {np.shape(A)[0]} rows and {np.shape(A)[1]} columns')
print(f'The total number of elements is the product of those two numbers: {np.prod(np.shape(A))}.')

print(f'\nUsing the <np.array>.flatten() operation concatenates all the rows of the multidimensional array into a one dimensional array')
print(f'It makes sense then that the shape of A.flatten() is: {np.shape(A.flatten())}.  Notice that it is a one dimensional array.')
print(f'...with a length equal to np.size(A): {np.size(A)}')


A:
[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]]

The shape of A is: (4, 5).
That means it has 4 rows and 5 columns
The total number of elements is the product of those two numbers: 20.

Using the <np.array>.flatten() operation concatenates all the rows of the multidimensional array into a one dimensional array
It makes sense then that the shape of A.flatten() is: (20,).  Notice that it is a one dimensional array.
...with a length equal to np.size(A): 20


In [114]:
print(A.max())

34


## Concatenation and stacking

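In [None]:
# a and b are defined here so that the cells in this section can run
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

np.concatenate((a, b), axis=0)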

array([[1, 2],
       [3, 4],
       [5, 6]])

In [None]:
np.concatenate((a, b.T), axis=1)

array([[1, 2, 5],
       [3, 4, 6]])

In [None]:
np.vstack((a,b))

array([[1, 2],
       [3, 4],
       [5, 6]])

In [None]:
np.hstack((a,b.T))

array([[1, 2, 5],
       [3, 4, 6]])

### Iterating on arrays

In [None]:
v = np.array([1,2,3,4])

for element in v:
    print(element)

1
2
3
4


In [None]:
a = np.array([[1,2], [3,4]])

for row in a:
    print("row", row)

    for element in row:
        print(element)

row [1 2]
1
2
row [3 4]
3
4


`enumerate` can be used to get indexes along with elements.

In [None]:
for row_idx, row in enumerate(a):
    print("row_idx", row_idx, "row", row)

    for col_idx, element in enumerate(row):
        print("col_idx", col_idx, "element", element)

        # update the matrix a: square each element
        a[row_idx, col_idx] = element ** 2

row_idx 0 row [1 2]
col_idx 0 element 1
col_idx 1 element 2
row_idx 1 row [3 4]
col_idx 0 element 3
col_idx 1 element 4


In [None]:
a

array([[ 1,  4],
       [ 9, 16]])
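
The nested loop above is useful for seeing how iteration and indexing work, but the same result can be had with a single array expression. The sketch below compares the two approaches on a fresh copy of the original array. The names `looped` and `vectorized` are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Loop version: square each element one at a time, writing into a copy
looped = a.copy()
for row_idx, row in enumerate(a):
    for col_idx, element in enumerate(row):
        looped[row_idx, col_idx] = element ** 2

# Vectorized version: one expression, no explicit loop
vectorized = a ** 2

print(vectorized)
print(np.array_equal(looped, vectorized))  # True
```

For large arrays such as images, the vectorized form is usually both shorter and much faster.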