## NumPy
### BIOINF 575 



_____


### NumPy - Numerical Python <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/1a/NumPy_logo.svg/1200px-NumPy_logo.svg.png" alt="NumPy logo" width = "100">

____
#### A list contains references to each of its values.
#### An array refers to a block of memory containing all values one after the other.
- <b>that is why we need to know the size of the array in advance, and why the array size cannot change</b><br>


<img src = "https://www.python-course.eu/images/list_structure.png" width = 350 /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src = "https://www.python-course.eu/images/array_structure.png" width = 350 />
____

#### Arrays of different dimensions (`shape` gives the number of elements on each dimension):
<img src="https://raw.githubusercontent.com/elegant-scipy/elegant-scipy/master/figures/NumPy_ndarrays_v2.svg" alt="data structures" width="600">  

https://github.com/elegant-scipy/elegant-scipy
_____


#### <b>NumPy basics</b>

Arrays are designed to:
* <b>handle vectorized operations (lists cannot do that)</b>
    - if you apply a function it is performed on every item in the array, rather than on the whole array object
    - both arrays and lists have 0-based indexing
* <b>store multiple items of the same data type</b>
* <b>handle missing values </b>
    - missing numerical values are represented using the `np.nan` object (not a number), which is a floating-point value
    - the object `np.inf` represents infinity  
* <b>have an unchangeable size</b>
    - array size cannot be changed; create a new array if you need a different size
    - you know when you create the array how much space you need for it and that will not change  
* <b>have efficient memory usage</b>
    - a NumPy array occupies much less space than an equivalent Python list of lists
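
A minimal sketch of the last few points, assuming 64-bit integers and using CPython's `sys.getsizeof` as a rough measure of list overhead. The variable names are illustrative only, and the exact list size depends on the Python version.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# the array stores its values in one block: 8 bytes per int64 element
array_bytes = arr.nbytes

# the list stores references, and every int is a separate Python object
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
print(array_bytes, list_bytes)

# mixed input is converted to one common data type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                 # float64

# np.nan is a float and propagates through ordinary arithmetic
with_missing = np.array([1.0, np.nan, 3.0])
print(with_missing.sum())          # nan
print(np.nansum(with_missing))     # 4.0
```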

#### <b>Basic array attributes:</b>
* shape: array dimensions - tuple with the number of elements in each dimension
* size: number of elements in the array
* ndim: number of array dimensions (`len(arr.shape)`)
* dtype: data type of the array elements

#### <b>Importing NumPy</b>
The recommended convention to import numpy is to use the <b>np</b> alias:

In [1]:
import numpy as np


##### -----

In [5]:
# all functionality available in numpy
# dir(np)


##### -----

#### <b>Documentation and help</b>
https://numpy.org/doc/

In [7]:
# note: np.lookfor was removed in NumPy 2.0; the output below comes from an older release
np.lookfor('sum') 

Search results for 'sum'
------------------------
numpy.sum
    Sum of array elements over a given axis.
numpy.cumsum
    Return the cumulative sum of the elements along a given axis.
numpy.einsum
    einsum(subscripts, *operands, out=None, dtype=None, order='K',
numpy.nansum
    Return the sum of array elements over a given axis treating Not a
numpy.nancumsum
    Return the cumulative sum of array elements over a given axis treating Not a
numpy.einsum_path
    Evaluates the lowest cost contraction order for an einsum expression by
numpy.trace
    Return the sum along diagonals of the array.
numpy.ma.sum
    Return the sum of the array elements over the given axis.
numpy.bool_.sum
    Scalar method identical to the corresponding array attribute.
numpy.polyadd
    Find the sum of two polynomials.
numpy.ma.cumsum
    Return the cumulative sum of the array elements over the given axis.
numpy.logaddexp
    Logarithm of the sum of exponentiations of the inputs.
numpy.bool_.cumsum
    Scalar

In [9]:
np.me*?

np.mean
np.median
np.memmap
np.meshgrid

In [11]:
np.mean?

Signature:
np.mean(
    a,
    axis=None,
    dtype=None,
    out=None,
    keepdims=<no value>,
    *,
    where=<no value>,
)
Call signature:  np.mean(*args, **kwargs)
Type:            _ArrayFunctionDispatcher
String form:     <function mean at 0x11349ba60>
File:

In [13]:
help(np.mean)

Help on _ArrayFunctionDispatcher in module numpy:

mean(a, axis=None, dtype=None, out=None, keepdims=<no value>, *, where=<no value>)
    Compute the arithmetic mean along the specified axis.

    Returns the average of the array elements.  The average is taken over
    the flattened array by default, otherwise over the specified axis.
    `float64` intermediate and return values are used for integer inputs.

    Parameters
    ----------
    a : array_like
        Array containing numbers whose mean is desired. If `a` is not an
        array, a conversion is attempted.
    axis : None or int or tuple of ints, optional
        Axis or axes along which the means are computed. The default is to
        compute the mean of the flattened array.

        .. versionadded:: 1.7.0

        If this is a tuple of ints, a mean is performed over multiple axes,
        instead of a single axis or all the axes as before.
    dtype : data-type, optional
        Type to use in computing the mean.  For

#### <b>Motivating example</b> - transform temperatures from Celsius to Fahrenheit

In [17]:
temp_list_C = [-20, 25, 3, 10]

In [19]:
# using lists we need a loop to apply the formula to 
# each element of the list

temp_list_F = []

for temp in temp_list_C:
    temp_list_F.append(temp * 1.8 + 32)

temp_list_F

[-4.0, 77.0, 37.4, 50.0]

In [21]:
temp_list_F = [temp * 1.8 + 32 for temp in temp_list_C]
temp_list_F

[-4.0, 77.0, 37.4, 50.0]

In [23]:
# using arrays we can apply the formula directly to the array and 
# it will be applied to each element

temp_array_C = np.array(temp_list_C)
temp_array_C

array([-20,  25,   3,  10])

In [25]:
temp_array_F = temp_array_C * 1.8 + 32
temp_array_F

array([-4. , 77. , 37.4, 50. ])
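
For contrast, a small sketch of what happens when the same arithmetic is applied to the list itself rather than to the array. It reuses the temperatures from the example; the `message` and `repeated` names are illustrative.

```python
import numpy as np

temp_list_C = [-20, 25, 3, 10]

# multiplying a list by a float is not defined
try:
    temp_list_C * 1.8 + 32
    message = ""
except TypeError as err:
    message = str(err)
print(message)

# multiplying a list by an int repeats the list instead of scaling the values
repeated = temp_list_C * 2
print(repeated)    # [-20, 25, 3, 10, -20, 25, 3, 10]

# the array version applies the arithmetic to every element
temp_array_F = np.array(temp_list_C) * 1.8 + 32
print(temp_array_F)
```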

#### <b>Functions for creating arrays</b>
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.creation.html

##### np.array() - array from lists - e.g. 2D array from a list of lists

In [27]:
help(np.array)



Help on built-in function array in module numpy:

array(...)
    array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
          like=None)

    Create an array.

    Parameters
    ----------
    object : array_like
        An array, any object exposing the array interface, an object whose
        ``__array__`` method returns an array, or any (nested) sequence.
        If object is a scalar, a 0-dimensional array containing object is
        returned.
    dtype : data-type, optional
        The desired data-type for the array. If not given, NumPy will try to use
        a default ``dtype`` that can represent the values (by applying promotion
        rules when necessary.)
    copy : bool, optional
        If true (default), then the object is copied.  Otherwise, a copy will
        only be made if ``__array__`` returns a copy, if obj is a nested
        sequence, or if a copy is needed to satisfy any of the other
        requirements (``dtype``, ``order``, etc.).
  

In [29]:
np.array("AACGT")

array('AACGT', dtype='<U5')

In [31]:
np.array(("A","C"))

array(['A', 'C'], dtype='<U1')
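
A short sketch of the `dtype` and `copy` parameters described in the help text above. The example values are made up for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])                 # integer dtype inferred from the values
floats = np.array([1, 2, 3], dtype=float)  # dtype requested explicitly
strings = np.array(["A", "CG", "TTT"])     # unicode strings, as wide as the longest item
print(ints.dtype.kind, floats.dtype, strings.dtype)

# with the default copy=True the new array does not share data with the source
source = np.array([1, 2, 3])
copied = np.array(source)
copied[0] = 99
print(source)    # [1 2 3]
print(copied)    # [99  2  3]
```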


##### -----

In [None]:
# all functionality of a numpy array
# dir(np.array([1]))

'T', 
 'all',
 'any',
 'argmax',
 'argmin',
 'argpartition',
 'argsort',
 'astype',
 'base',
 'byteswap',
 'choose',
 'clip',
 'compress',
 'conj',
 'conjugate',
 'copy',
 'ctypes',
 'cumprod',
 'cumsum',
 'data',
 'diagonal',
 'dot',
 'dtype',
 'dump',
 'dumps',
 'fill',
 'flags',
 'flat',
 'flatten',
 'getfield',
 'imag',
 'item',
 'itemset',
 'itemsize',
 'max',
 'mean',
 'min',
 'nbytes',
 'ndim',
 'newbyteorder',
 'nonzero',
 'partition',
 'prod',
 'ptp',
 'put',
 'ravel',
 'real',
 'repeat',
 'reshape',
 'resize',
 'round',
 'searchsorted',
 'setfield',
 'setflags',
 'shape',
 'size',
 'sort',
 'squeeze',
 'std',
 'strides',
 'sum',
 'swapaxes',
 'take',
 'tobytes',
 'tofile',
 'tolist',
 'tostring',
 'trace',
 'transpose',
 'var',
 'view'


##### -----

##### np.arange() - vector of evenly spaced values from a range (arange) given by start, stop and step

In [41]:
# help(np.arange)

np.arange(2,7, dtype = float)


array([2., 3., 4., 5., 6.])
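
The example above uses only start and stop. A small sketch of the step argument, with illustrative values:

```python
import numpy as np

evens = np.arange(0, 10, 2)        # start, stop (excluded), step
print(evens)                       # [0 2 4 6 8]

countdown = np.arange(10, 0, -3)   # a negative step counts down
print(countdown)                   # [10  7  4  1]
```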

In [45]:
x = np.arange(5)
x

array([0, 1, 2, 3, 4])

In [47]:
x.size

5

In [49]:
x.ndim

1

In [51]:
x.shape

(5,)

In [53]:
x.dtype

dtype('int64')

##### np.linspace() - vector of evenly spaced values (known number, linspace) given by start, stop and number of points

In [61]:
# help(np.linspace)
np.linspace(1,100, 5)


array([  1.  ,  25.75,  50.5 ,  75.25, 100.  ])
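
Unlike `np.arange`, `np.linspace` includes the stop value by default. A brief sketch of the `endpoint` and `retstep` options, with illustrative values:

```python
import numpy as np

with_end = np.linspace(0, 1, 5)                     # stop is included by default
without_end = np.linspace(0, 1, 5, endpoint=False)  # stop is left out
print(with_end)       # [0.   0.25 0.5  0.75 1.  ]
print(without_end)    # [0.  0.2 0.4 0.6 0.8]

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(1, 100, 5, retstep=True)
print(step)           # 24.75
```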

##### np.zeros() - array of zeros (e.g. 3D array), there is also a np.ones()

In [73]:
# help(np.zeros)

np.zeros(shape = (3,2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [75]:
np.ones((4,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
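
The shape tuple can have any number of dimensions. A minimal sketch of a 3D array and of the `dtype` argument (the names `cube` and `int_ones` are illustrative):

```python
import numpy as np

cube = np.zeros((2, 3, 4))     # 2 blocks of 3 rows and 4 columns
print(cube.shape, cube.ndim, cube.size)   # (2, 3, 4) 3 24

int_ones = np.ones((2, 2), dtype=int)     # the default dtype is float unless specified
print(int_ones)
```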

##### More functions to create special arrays:      
    np.identity(n) - 2D square array filled with 1 on the diagonal      
    np.eye(n,m) - 2D array filled with 1 on the diagonal      
    np.full((n,m), val) - array filled with a given value     

In [79]:
np.identity(5, int)

array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])

In [81]:
np.eye(4,5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.]])

In [83]:
np.eye(5,4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])

In [85]:
np.full((3,4), 50)

array([[50, 50, 50, 50],
       [50, 50, 50, 50],
       [50, 50, 50, 50]])

#### <b>Basic array attributes:</b>
* shape: array dimensions
* size: number of elements in the array
* ndim: number of array dimensions (`len(arr.shape)`)
* dtype: data type of the array elements

In [87]:
# nested lists give us multi dimensional arrays

matrix = np.array([[1,2,3],[4,5,6]])
matrix

array([[1, 2, 3],
       [4, 5, 6]])

In [91]:
# dir(matrix)

In [95]:
# transpose
matrix.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [97]:
# .size - total number of elements in the array

matrix.size

6

In [99]:
# .shape gives the size along each dimension and, implicitly, the number of dimensions

matrix

array([[1, 2, 3],
       [4, 5, 6]])

In [101]:
matrix.shape

(2, 3)

In [103]:
# .ndim - number of array dimensions

matrix.ndim

2

In [105]:
# .dtype - type of the data stored in the array
matrix.dtype


dtype('int64')

In [107]:
matrix

array([[1, 2, 3],
       [4, 5, 6]])

In [109]:
# .T - transpose of the array (rows and columns switched)
matrix.T

array([[1, 4],
       [2, 5],
       [3, 6]])

#### <b>Reshaping</b> - changing the numbers of rows and columns - data and size stay the same

In [111]:
# .reshape((n,m)) - Reshaping

matrix.reshape(3,2)

array([[1, 2],
       [3, 4],
       [5, 6]])

In [115]:
matrix.T


array([[1, 4],
       [2, 5],
       [3, 6]])

In [113]:
matrix

array([[1, 2, 3],
       [4, 5, 6]])

In [117]:
matrix.reshape(1,6)

array([[1, 2, 3, 4, 5, 6]])

In [119]:
matrix.reshape(6,1)

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])
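
A short sketch of a few reshape details not shown above: letting NumPy infer one dimension with `-1`, the error raised when the sizes do not match, and the fact that the reshaped array here shares its data with the original.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

# -1 asks NumPy to infer that dimension from the total size
three_rows = matrix.reshape(3, -1)
flat = matrix.reshape(-1)
print(three_rows.shape, flat.shape)   # (3, 2) (6,)

# the total number of elements must stay the same
try:
    matrix.reshape(4, 2)
    reshape_error = ""
except ValueError as err:
    reshape_error = str(err)
print(reshape_error)

# here reshape returns a view: it shares data with the original array
print(np.shares_memory(matrix, three_rows))   # True
```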

#### <b>Indexing/Slicing(subsetting): [][] or [,]</b>
___
<img src = "http://scipy-lectures.org/_images/numpy_indexing.png" width = 400/>

In [121]:
matrix = np.full((6,6),range(6)) + 10 * np.full((6,6),range(6)).T
matrix

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [123]:
range(6)

range(0, 6)

In [127]:
np.full((6,6),[0,1,2,3,4,5])

array([[0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5]])

In [129]:
np.full((6,6),range(6))

array([[0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4, 5]])

In [131]:
np.full((6,6),range(6)) * 10

array([[ 0, 10, 20, 30, 40, 50],
       [ 0, 10, 20, 30, 40, 50],
       [ 0, 10, 20, 30, 40, 50],
       [ 0, 10, 20, 30, 40, 50],
       [ 0, 10, 20, 30, 40, 50],
       [ 0, 10, 20, 30, 40, 50]])

In [133]:
(np.full((6,6),range(6)) * 10).T

array([[ 0,  0,  0,  0,  0,  0],
       [10, 10, 10, 10, 10, 10],
       [20, 20, 20, 20, 20, 20],
       [30, 30, 30, 30, 30, 30],
       [40, 40, 40, 40, 40, 40],
       [50, 50, 50, 50, 50, 50]])

In [135]:
np.full((6,6),range(6)) + (np.full((6,6),range(6)) * 10).T

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [137]:
matrix

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

#### Indexing/Slicing

In [149]:
# [][] - list-like chained indexing: the second slice is applied to the rows of the first result, not to the columns

matrix[1:3][1:3]


array([[20, 21, 22, 23, 24, 25]])

In [153]:
# [,] - using both row and column indices to get a sub-matrix

matrix[2:5, 3:5]

array([[23, 24],
       [33, 34],
       [43, 44]])
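
The two notations can be compared side by side. This sketch rebuilds the same 6x6 matrix with `np.arange` (an alternative construction, chosen here for brevity):

```python
import numpy as np

matrix = np.arange(6) + 10 * np.arange(6).reshape(6, 1)   # same 6x6 values as above

chained = matrix[1:3][1:3]   # rows 1-2, then rows 1-2 of that result (only one row is left)
both = matrix[1:3, 1:3]      # rows 1-2 and columns 1-2
print(chained)   # [[20 21 22 23 24 25]]
print(both)      # [[11 12]
                 #  [21 22]]

# for single elements the two notations agree
print(matrix[2][3], matrix[2, 3])   # 23 23
```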

In [155]:
matrix_reshaped = matrix.reshape(4,9)
matrix_reshaped

array([[ 0,  1,  2,  3,  4,  5, 10, 11, 12],
       [13, 14, 15, 20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35, 40, 41, 42],
       [43, 44, 45, 50, 51, 52, 53, 54, 55]])

In [157]:
# Using both rows and columns indices to get a sub-matrix

matrix_reshaped[:2,:3]

array([[ 0,  1,  2],
       [13, 14, 15]])

In [159]:
# Fun arrays - build and display a checkers_board array
checkers_board = np.zeros((6,6),dtype=int)
print(checkers_board)

[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]


In [161]:
checkers_board[1::2,::2] = 1
print(checkers_board)

[[0 0 0 0 0 0]
 [1 0 1 0 1 0]
 [0 0 0 0 0 0]
 [1 0 1 0 1 0]
 [0 0 0 0 0 0]
 [1 0 1 0 1 0]]


In [163]:
checkers_board[::2,1::2] = 1
print(checkers_board)

[[0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]]


#### Array of indices subsetting - use array/list of indices to subset array with only the elements given by the indices

In [167]:
matrix 

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [165]:
indices = [0,2,3]
matrix[indices,]

array([[ 0,  1,  2,  3,  4,  5],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35]])

In [171]:
# columns
matrix[:,indices]


array([[ 0,  2,  3],
       [10, 12, 13],
       [20, 22, 23],
       [30, 32, 33],
       [40, 42, 43],
       [50, 52, 53]])
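
When index lists are given for both rows and columns, NumPy pairs them element by element rather than taking every combination. A small sketch of that behaviour, with `np.ix_` as one way to get the full sub-matrix:

```python
import numpy as np

matrix = np.arange(6) + 10 * np.arange(6).reshape(6, 1)
indices = [0, 2, 3]

# two index lists are paired up element by element: (0,0), (2,2), (3,3)
paired = matrix[indices, indices]
print(paired)    # [ 0 22 33]

# np.ix_ builds the full sub-matrix of the selected rows and columns
block = matrix[np.ix_(indices, indices)]
print(block)
# [[ 0  2  3]
#  [20 22 23]
#  [30 32 33]]
```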

#### conditional subsetting - use array of booleans to subset array with only the elements where the bool array is True

In [173]:
matrix

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [175]:
matrix[:, 5]

array([ 5, 15, 25, 35, 45, 55])

In [181]:
matrix[:, 5:6]

array([[ 5],
       [15],
       [25],
       [35],
       [45],
       [55]])

In [185]:
matrix[5:6,5:6]

array([[55]])

In [187]:
matrix[5:6,5]

array([55])

In [189]:
matrix

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [195]:
matrix[:,0]

array([ 0, 10, 20, 30, 40, 50])

In [197]:
matrix[:,0] > 20

array([False, False, False,  True,  True,  True])

In [None]:
# conditional subsetting
matrix[(matrix[:,0] > 20)]

In [199]:
# deconstruct: the same boolean mask written out by hand, first applied to the columns

matrix[:, [False, False, False,  True,  True,  True]]

array([[ 3,  4,  5],
       [13, 14, 15],
       [23, 24, 25],
       [33, 34, 35],
       [43, 44, 45],
       [53, 54, 55]])

In [201]:
matrix[[False, False, False,  True,  True,  True],]

array([[30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [203]:
matrix

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [207]:
# multiple conditions  
c = (matrix[:,0] > 20) & (matrix[:,0] <= 40)
c

array([False, False, False,  True,  True, False])

In [211]:
matrix[c]

array([[30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45]])

In [213]:
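# Python's `and` needs a single True/False value, so it fails on arrays; use & for element-wise logic
(matrix[:,0] > 20) and (matrix[:,0] <= 40)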

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [215]:
 (matrix[:,0] > 20)

array([False, False, False,  True,  True,  True])

In [219]:
 (matrix[:,0] <= 40)

array([ True,  True,  True,  True,  True, False])

In [223]:
# [False, False, False,  True,  True,  True] &
# [True,  True,  True,  True,  True, False]

# array([False, False, False,  True,  True, False])
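
A compact sketch of the element-wise logical operators on boolean arrays, using the first column of the matrix as a standalone array (the variable names are illustrative):

```python
import numpy as np

first_col = np.array([0, 10, 20, 30, 40, 50])

both = (first_col > 20) & (first_col <= 40)     # element-wise AND
either = (first_col < 10) | (first_col > 40)    # element-wise OR
neither = ~either                               # element-wise NOT
print(both)      # [False False False  True  True False]
print(either)    # [ True False False False False  True]

# np.logical_and is the function form of &
same = np.logical_and(first_col > 20, first_col <= 40)
print(np.array_equal(both, same))   # True

# .any() and .all() reduce a boolean array to a single truth value
print(both.any(), both.all())       # True False
```

The parentheses around each comparison are needed because `&` and `|` bind more tightly than `>` and `<=`.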

#### <b>Matrix operations</b>

https://www.tutorialspoint.com/matrix-manipulation-in-python<br>
Arithmetic operators on arrays apply element-wise. <br> 
A new array is created and filled with the result.
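
A minimal sketch of element-wise arithmetic on two small arrays (values chosen for illustration), showing that the operands stay unchanged unless an in-place operator is used:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

total = a + b        # element-wise sum, stored in a new array
product = a * b      # element-wise product (not matrix multiplication)
squares = a ** 2
print(total)
print(product)
print(squares)

# the operands are left unchanged
print(a)             # [[1 2]
                     #  [3 4]]

# an augmented assignment modifies the existing array in place instead
a += 1
print(a)             # [[2 3]
                     #  [4 5]]
```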


#### <b>Array broadcasting</b><br>

https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html<br>
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. <br>
Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.

<img src = "https://www.tutorialspoint.com/numpy/images/array.jpg" height=10/>


https://www.tutorialspoint.com/numpy/numpy_broadcasting.htm
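
The compatibility rule can be checked without building any arrays. This sketch assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the shapes are illustrative.

```python
import numpy as np

# shapes are compared from the last dimension backwards;
# two dimensions are compatible if they are equal or one of them is 1
print(np.broadcast_shapes((3, 4), (4,)))     # (3, 4)
print(np.broadcast_shapes((3, 4), (3, 1)))   # (3, 4)

try:
    np.broadcast_shapes((3, 4), (3,))
    broadcast_error = ""
except ValueError as err:
    broadcast_error = str(err)
print(broadcast_error)

# np.newaxis turns a 1D array of 3 values into a (3, 1) column
column = np.array([10, 20, 30])[:, np.newaxis]
result = np.zeros((3, 4), dtype=int) + column
print(result)
```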

In [225]:
matrix = np.arange(1,13).reshape(3,4)
matrix


array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [227]:
# create an array with 4 values

a = np.array([10,20,30,40])
a

array([10, 20, 30, 40])

In [231]:
# addition using a data row

matrix + a

array([[11, 22, 33, 44],
       [15, 26, 37, 48],
       [19, 30, 41, 52]])

In [233]:
a = np.array([10,20,30])
a

array([10, 20, 30])

In [235]:
matrix + a

ValueError: operands could not be broadcast together with shapes (3,4) (3,) 

In [None]:
####

In [237]:
# create an array with 3 values

a

array([10, 20, 30])

In [None]:
matrix

In [249]:
# addition using a data row - error if dimensions do not match

matrix + a

ValueError: operands could not be broadcast together with shapes (3,4) (3,) 

In [241]:
##########

matrix


array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [255]:
# column vector
a = a.reshape(3,1)
a

array([[10],
       [20],
       [30]])

In [257]:


matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [259]:
##########

matrix +a

array([[11, 12, 13, 14],
       [25, 26, 27, 28],
       [39, 40, 41, 42]])

In [261]:
# column vec

a

array([[10],
       [20],
       [30]])

In [263]:
# multiplication with a data column


matrix * a

array([[ 10,  20,  30,  40],
       [100, 120, 140, 160],
       [270, 300, 330, 360]])

In [265]:
matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

#### Simple multiplication `*` of two matrices of the same shape multiplies the elements at the corresponding indices 
#### Mathematical matrix multiplication of two matrices (`n1 x m1`, `n2 x m2`) can be done using the `.dot` method or the `@` operator, but the dimensions need to be compatible: `m1 == n2` 
* the resulting matrix will be `n1 x m2`: it has `n1` rows and `m2` columns
* each value in the resulting matrix is the sum of the products of the paired elements from the respective row and column 

<img src = "https://miro.medium.com/max/1400/1*YGcMQSr0ge_DGn96WnEkZw.png" width = "400"/>
     
https://towardsdatascience.com/a-complete-beginners-guide-to-matrix-multiplication-for-data-science-with-python-numpy-9274ecfc1dc6
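
The row-by-column rule can be written out as explicit loops. This is only a sketch to illustrate the definition (in practice `@` or `.dot` should be used); the matrices are small illustrative values.

```python
import numpy as np

m1 = np.array([[1, 2, 3], [4, 5, 6]])          # 2 x 3
m2 = np.array([[10, 11], [20, 21], [30, 31]])  # 3 x 2

n1, inner = m1.shape
m2_cols = m2.shape[1]
manual = np.zeros((n1, m2_cols), dtype=int)
for i in range(n1):
    for j in range(m2_cols):
        # pair row i of m1 with column j of m2, multiply, and add up
        manual[i, j] = sum(m1[i, k] * m2[k, j] for k in range(inner))

print(manual)
# [[140 146]
#  [320 335]]
print(np.array_equal(manual, m1 @ m2))   # True
```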
     

#### <b>More matrix computation</b> - basic aggregate functions are available - min, max, sum, mean

In [267]:
matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [269]:
matrix * matrix

array([[  1,   4,   9,  16],
       [ 25,  36,  49,  64],
       [ 81, 100, 121, 144]])

In [5]:
# import numpy as np
m1 = np.array([[1,2,3],[4,5,6]])
m2 = np.array([[10,11], [20,21], [30,31]])
m1, m2

(array([[1, 2, 3],
        [4, 5, 6]]),
 array([[10, 11],
        [20, 21],
        [30, 31]]))

In [7]:
m1 * m2

ValueError: operands could not be broadcast together with shapes (2,3) (3,2) 

In [15]:
m1 * m2.T

array([[ 10,  40,  90],
       [ 44, 105, 186]])

In [17]:
m1

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
m2.T

array([[10, 20, 30],
       [11, 21, 31]])

In [21]:
m1 @ m2

array([[140, 146],
       [320, 335]])

In [23]:
m1.dot(m2)

array([[140, 146],
       [320, 335]])

In [25]:
m2.dot(m1)

array([[ 54,  75,  96],
       [104, 145, 186],
       [154, 215, 276]])

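#### Use the axis argument to compute an aggregate (e.g. sum, mean) for each column or row
#### axis = 0 - one value per column (aggregates down the rows)
#### axis = 1 - one value per row (aggregates across the columns)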

In [33]:
matrix = np.array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [35]:
help(matrix.sum)

Help on built-in function sum:

sum(...) method of numpy.ndarray instance
    a.sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)

    Return the sum of the array elements over the given axis.

    Refer to `numpy.sum` for full documentation.

    See Also
    --------
    numpy.sum : equivalent function



In [41]:
np.sum?

Signature:
np.sum(
    a,
    axis=None,
    dtype=None,
    out=None,
    keepdims=<no value>,
    initial=<no value>,
    where=<no value>,
)
Call signature:  np.sum(*args, **kwargs)
Type:            _ArrayFunctionDispatcher
String

In [37]:
matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [45]:
matrix.sum()

78

In [55]:
# col sum 

col_sum = matrix.sum(0)
col_sum


array([15, 18, 21, 24])

In [57]:
# row sum

row_sum = matrix.sum(1)
row_sum

array([10, 26, 42])

In [63]:
matrix.mean(1)

array([ 2.5,  6.5, 10.5])

In [65]:
matrix.mean(0)

array([5., 6., 7., 8.])

In [67]:
matrix.std(1)

array([1.11803399, 1.11803399, 1.11803399])
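
A short sketch of some optional arguments of the aggregate methods: the named `axis` keyword, `keepdims`, and the `ddof` argument of `std`. The matrix is rebuilt here so the block runs on its own.

```python
import numpy as np

matrix = np.arange(1, 13).reshape(3, 4)

# naming the axis makes the intent easier to read
col_mean = matrix.mean(axis=0)
print(col_mean)                  # [5. 6. 7. 8.]

# keepdims=True keeps the reduced axis with length 1
row_sum = matrix.sum(axis=1, keepdims=True)
print(row_sum.shape)             # (3, 1)

# std uses ddof=0 (population formula) by default; ddof=1 gives the sample version
population_std = matrix.std(axis=1)
sample_std = matrix.std(axis=1, ddof=1)
print(population_std)
print(sample_std)
```

Keeping the reduced axis is convenient because the `(3, 1)` result broadcasts directly against the original `(3, 4)` matrix.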

https://www.w3resource.com/python-exercises/numpy/index.php


Create a matrix of 2 rows and 3 columns with every fifth number starting from 1 (e.g. 1,6,11,16,...)


In [69]:
matrix = np.arange(1, 2*3*5+1, 5).reshape(2,3)

matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

#### <font color = "red">Exercise:</font>   


Normalize the values in the matrix to be between 0 and 1 (min-max normalization).     
Subtract the minimum value, then divide by the maximum of the resulting values.

In [89]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [93]:
mm = matrix.min()
mm

1

In [99]:
m1 = matrix - mm
m1 

array([[ 0,  5, 10],
       [15, 20, 25]])

In [97]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [103]:
m1m = m1.max()
m1/m1m

array([[0. , 0.2, 0.4],
       [0.6, 0.8, 1. ]])

In [105]:
mm = matrix.min()
m1 = matrix - mm
m1m = m1.max()
mn = m1/m1m
mn

array([[0. , 0.2, 0.4],
       [0.6, 0.8, 1. ]])

In [101]:
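# not the same: this divides by the original maximum (26) instead of the maximum after subtraction (25)
(matrix - matrix.min())/matrix.max()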

array([[0.        , 0.19230769, 0.38461538],
       [0.57692308, 0.76923077, 0.96153846]])
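
The steps of the exercise can be collected into one function. This is a sketch only: `min_max_normalize` is a hypothetical helper name, not a NumPy function, and the guard for constant input is an added assumption about how that edge case should be handled.

```python
import numpy as np

def min_max_normalize(values):
    """Scale values to the range [0, 1] (hypothetical helper, not part of NumPy)."""
    values = np.asarray(values, dtype=float)
    value_range = values.max() - values.min()
    if value_range == 0:
        # all values are equal: return zeros rather than dividing by zero
        return np.zeros_like(values)
    return (values - values.min()) / value_range

matrix = np.arange(1, 2 * 3 * 5 + 1, 5).reshape(2, 3)
normalized = min_max_normalize(matrix)
print(normalized)
# [[0.  0.2 0.4]
#  [0.6 0.8 1. ]]
```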

In [81]:
%whos

Variable               Type             Data/Info
-------------------------------------------------
NamespaceMagics        MetaHasTraits    <class 'IPython.core.magi<...>mespace.NamespaceMagics'>
col_sum                ndarray          4: 4 elems, type `int64`, 32 bytes
dataframe_columns      function         <function dataframe_columns at 0x137973240>
dataframe_hash         function         <function dataframe_hash at 0x137973060>
dtypes_str             function         <function dtypes_str at 0x137973ba0>
get_dataframes         function         <function get_dataframes at 0x137971d00>
get_ipython            function         <function get_ipython at 0x104b8d440>
getpass                module           <module 'getpass' from '/<...>b/python3.12/getpass.py'>
hashlib                module           <module 'hashlib' from '/<...>b/python3.12/hashlib.py'>
import_pandas_safely   function         <function import_pandas_safely at 0x1379734c0>
is_data_frame          function         <function

In [73]:
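# careful: assigning to the name np shadows the numpy module in this session
np = 5

In [77]:
# deleting the name removes it entirely, so numpy has to be imported again
del np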

In [87]:
import numpy as np
np.array([])

array([], dtype=float64)

#### <font color = "red">Exercise:</font>   

Do the same normalization at the row level

In [107]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [111]:
minr = matrix.min(1)
minr

array([ 1, 16])

In [113]:
matrix - minr

ValueError: operands could not be broadcast together with shapes (2,3) (2,) 

In [115]:
minr

array([ 1, 16])

In [119]:
minr.reshape(matrix.shape[0],1)

array([[ 1],
       [16]])

In [121]:
matrix.shape

(2, 3)

In [123]:
matrix.shape[0]

2

In [125]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [143]:
minr = matrix.min(1)
c = minr.reshape(matrix.shape[0],1)
m1 = matrix - c
rmax = m1.max(1)
cmax = rmax.reshape(matrix.shape[0],1)
m1/cmax

array([[0. , 0.5, 1. ],
       [0. , 0.5, 1. ]])
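
An alternative sketch of the row-level normalization that uses `keepdims=True` instead of reshaping the row minima and maxima by hand (the variable names are illustrative):

```python
import numpy as np

matrix = np.array([[1, 6, 11], [16, 21, 26]])

row_min = matrix.min(axis=1, keepdims=True)   # shape (2, 1)
row_max = matrix.max(axis=1, keepdims=True)   # shape (2, 1)
row_normalized = (matrix - row_min) / (row_max - row_min)
print(row_normalized)
# [[0.  0.5 1. ]
#  [0.  0.5 1. ]]
```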

#### <font color = "red">Exercise:</font>   


* Return the even numbers from the matrix.
* Try to return the indices of the even numbers  (hint: look at the `np.where` function).

In [None]:
# help(np.where)

In [145]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [159]:
c = (matrix == 6) | (matrix == 26)
c 

array([[False,  True, False],
       [False, False,  True]])

In [161]:
matrix[c]

array([ 6, 26])

In [165]:
pos = np.where(matrix == 6)
pos

(array([0]), array([1]))

In [167]:
matrix[pos]

array([6])

In [169]:
matrix

array([[ 1,  6, 11],
       [16, 21, 26]])

In [171]:
4 % 2

0

In [177]:
matrix % 2

array([[1, 0, 1],
       [0, 1, 0]])

In [179]:
c = matrix % 2 == 0
c

array([[False,  True, False],
       [ True, False,  True]])

In [181]:
matrix[c]

array([ 6, 16, 26])
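
For the second part of the exercise, a sketch of how the positions of the even numbers could be obtained with `np.where` and `np.argwhere`. The matrix is redefined so the block runs on its own.

```python
import numpy as np

matrix = np.array([[1, 6, 11], [16, 21, 26]])
is_even = matrix % 2 == 0

# with one argument, np.where returns one index array per dimension
rows, cols = np.where(is_even)
print(rows, cols)        # [0 1 1] [1 0 2]

# np.argwhere returns the same positions as (row, column) pairs
pairs = np.argwhere(is_even)
print(pairs)
# [[0 1]
#  [1 0]
#  [1 2]]

# with three arguments, np.where chooses between two values element by element
labels = np.where(is_even, "even", "odd")
print(labels)
```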

In [183]:
matrix[c] += 2     # matrix[c] = matrix[c] + 2 

In [185]:
matrix

array([[ 1,  8, 11],
       [18, 21, 28]])

In [189]:
# dir(matrix)

In [191]:
matrix.diagonal?

Docstring:
a.diagonal(offset=0, axis1=0, axis2=1)

Return specified diagonals. In NumPy 1.9 the returned array is a
read-only view instead of a copy as in previous NumPy versions.  In
a future version the read-only restriction will be removed.

Refer to :func:`numpy.diagonal` for full documentation.

See Also
--------
numpy.diagonal : equivalent function
Type:      builtin_function_or_method

In [193]:
matrix

array([[ 1,  8, 11],
       [18, 21, 28]])

In [195]:
matrix.diagonal()

array([ 1, 21])

In [203]:
matrix.diagonal(1)

array([ 8, 28])
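
A brief sketch of a negative offset and of the read-only behaviour mentioned in the docstring. It assumes a NumPy version where the restriction is still in place, which the docstring says may change.

```python
import numpy as np

matrix = np.array([[1, 8, 11], [18, 21, 28]])

print(matrix.diagonal(-1))   # [18]  (negative offset: below the main diagonal)

# the returned diagonal is a read-only view, so writing to it fails
main_diag = matrix.diagonal()
try:
    main_diag[0] = 100
    write_error = ""
except ValueError as err:
    write_error = str(err)
print(write_error)

# copy it first if a writable array is needed
writable = main_diag.copy()
writable[0] = 100
print(writable, matrix[0, 0])   # [100  21] 1
```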

#### RESOURCES

http://scipy-lectures.org/intro/numpy/array_object.html#what-are-numpy-and-numpy-arrays   
https://www.python-course.eu/numpy.php   
https://numpy.org/devdocs/user/quickstart.html#universal-functions   
https://www.geeksforgeeks.org/python-numpy/

_____

### Pandas
<img src = "https://upload.wikimedia.org/wikipedia/commons/e/ed/Pandas_logo.svg" width = 200/>

https://commons.wikimedia.org/wiki/File:Pandas_logo.svg

[Pandas](https://pandas.pydata.org/) is a high-performance library that makes familiar data structures, like `data.frame` from R, and appropriate data analysis tools available to Python users.

<img src = "https://media.geeksforgeeks.org/wp-content/uploads/finallpandas.png" width = 550/>

https://www.geeksforgeeks.org/python-pandas-dataframe/