# The numpy library

numpy is an abbreviation for the **num**eric **py**thon library. It is a library that is based upon a main data structure:

* the ```ndarray``` class

The ```ndarray``` class is a numeric data structure similar to a Python ```list```. Unlike a ```list```, it broadcasts numeric operators and mathematical functions across its elements.
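As a minimal sketch of this difference, assuming ```numpy``` is installed and using illustrative values and names that are not part of this notebook, the ```+``` operator can be compared on a ```list``` and on an ```ndarray```:

```python
import numpy as np  # third-party; assumed to be installed

# hypothetical example values, not taken from the notebook
list_a = [1, 2, 3]
list_b = [10, 20, 30]
arr_a = np.array(list_a)
arr_b = np.array(list_b)

concatenated = list_a + list_b   # list: + joins the two collections
elementwise = arr_a + arr_b      # ndarray: + adds element by element
roots = np.sqrt(arr_a)           # functions are applied to every element

print(concatenated)
print(elementwise)
print(roots)
```

The ```list``` result has 6 elements, whereas the ```ndarray``` result keeps the original length of 3.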

```numpy``` is one of the most commonly used third-party Python libraries. It is fundamental to other popular data science libraries:

* The Python Data Analysis Library - ```pandas```
* The Matrix Plotting Library - ```matplotlib```
* The Data Visualization Library - ```seaborn``` 

These libraries are based upon ```numpy``` and are collectively known as the ```numpy``` stack.

## Categorize_Identifiers Module

This notebook uses the functions ```dir2```, ```variables``` and ```view``` from the custom module ```categorize_identifiers```, which is found in the same directory as this notebook file. ```dir2``` is a variant of ```dir``` that groups identifiers into a ```dict``` under categories, ```variables``` is an IPython-based variable inspector and ```view``` is used to view a ```Collection``` in more detail:

In [1]:
from categorize_identifiers import dir2, variables, view

## Tuples and Lists

The ```list``` is a ```builtins``` collection that can be used to store numeric data:

In [2]:
nums1 = [1, 2, 3, 4, 5]
nums2 = [2, 4, 6, 8, 10]

However, operators are set up for collections; the ```+``` operator, for example, performs concatenation instead of addition:

In [3]:
nums1 + nums2

[1, 2, 3, 4, 5, 2, 4, 6, 8, 10]

Numeric addition and other mathematical operations can be applied element by element to a ```list``` using a ```for``` loop:

In [4]:
summed = []

for idx in range(len(nums1)):
    summed.append(nums1[idx] + nums2[idx])

print(summed)

[3, 6, 9, 12, 15]


Or a slightly more elegant list comprehension:

In [5]:
[num1 + num2 for num1, num2 in zip(nums1, nums2)]

[3, 6, 9, 12, 15]

## Array Module

The ```tuple``` and ```list``` collections are very versatile and each record can be a Python ```object``` from a different class:

```python
nums1 = [True, '2', 'three', 4, 5]
```

This versatility however becomes disadvantageous when the intent is to work with only numeric data using a ```for``` loop as seen above. 

Having the wrong datatype for an element will result in a ```TypeError```.
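A minimal sketch of this failure, using the mixed ```list``` above and a second ```list``` of numbers (the names ```mixed```, ```evens``` and ```errors``` are illustrative only):

```python
# hypothetical mixed-type data, mirroring the list shown above
mixed = [True, '2', 'three', 4, 5]
evens = [2, 4, 6, 8, 10]

summed = []
errors = []
for idx, (left, right) in enumerate(zip(mixed, evens)):
    try:
        summed.append(left + right)
    except TypeError as error:
        # str + int is not defined, so these elements cannot be added
        errors.append((idx, str(error)))

print(summed)
print(errors)
```

Only the ```bool``` and ```int``` elements can be added; the two ```str``` elements at index 1 and 2 each raise a ```TypeError```.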

Python has an ```array``` module. It can be imported using:

In [6]:
import array

The ```array``` module has the following identifiers:

In [7]:
dir2(array)

{'attribute': ['typecodes'],
 'lower_class': ['array'],
 'upper_class': ['ArrayType'],
 'datamodel_attribute': ['__doc__', '__name__', '__package__', '__spec__'],
 'datamodel_method': ['__loader__'],
 'internal_method': ['_array_reconstructor']}


The two main identifiers are the attribute ```typecodes```:

In [8]:
array.typecodes

'bBuhHiIlLqQfd'

And the ```array``` class which can be used to create an ```array``` of a uniform datatype:

In [9]:
dir2(array.array, object, unique_only=True)

{'attribute': ['itemsize', 'typecode'],
 'method': ['append',
            'buffer_info',
            'byteswap',
            'count',
            'extend',
            'frombytes',
            'fromfile',
            'fromlist',
            'fromunicode',
            'index',
            'insert',
            'pop',
            'remove',
            'reverse',
            'tobytes',
            'tofile',
            'tolist',
            'tounicode'],
 'datamodel_attribute': ['__module__'],
 'datamodel_method': ['__add__',
                      '__buffer__',
                      '__class_getitem__',
                      '__contains__',
                      '__copy__',
                      '__deepcopy__',
                      '__delitem__',
                      '__getitem__',
                      '__iadd__',
                      '__imul__',
                      '__iter__',
                      '__len__',
                      '__mul__',
                      '__release_buffer_

In [10]:
array.array?

Init signature: array.array(self, /, *args, **kwargs)
Docstring:
array(typecode [, initializer]) -> array

Return a new array whose items are restricted by typecode, and
initialized from the optional initializer value, which must be a list,
string or iterable over elements of the appropriate type.

Arrays represent basic values and behave very much like lists, except
the type of objects stored in them is constrained. The type is specified
at object creation time by using a type code, which is a single character.
The following type codes are defined:

    Type code   C Type             Minimum size in bytes
    'b'         signed integer     1
    'B'         unsigned integer   1
    'u'         Unicode character  2 (see note)
    'h'         signed integer     2
    'H'         unsigned integer   

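For example, the type code ```'l'``` can be used to create an array where each element is a signed ```int``` of at least 4 bytes. The size is platform dependent: it is 4 bytes on Windows, which the examples below assume, and typically 8 bytes on 64 bit Linux and macOS: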

In [11]:
a_nums1 = array.array('l', [1, 2, 3, 4, 5])
a_nums2 = array.array('l', [2, 4, 6, 8, 10])
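The number of bytes used for each type code on the current platform can be checked with the ```itemsize``` attribute. The following is a small sketch; the type code ```'u'``` is deliberately left out because it is deprecated in recent Python versions, and the names ```sample``` and ```long_size``` are illustrative:

```python
import array

# typecode 'u' is left out because it is deprecated in recent Python versions
for code in 'bBhHiIlLqQfd':
    sample = array.array(code)
    print(code, sample.itemsize)

# size in bytes of the 'l' typecode on this platform
long_size = array.array('l').itemsize
print(long_size)
```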

In [12]:
variables(['nums1', 'a_nums1', 'nums2', 'a_nums2'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|nums1|list|5|[1, 2, 3, 4, 5]|
|a_nums1|array|5|array('l', [1, 2, 3, 4, 5])|
|nums2|list|5|[2, 4, 6, 8, 10]|
|a_nums2|array|5|array('l', [2, 4, 6, 8, 10])|


Recall that 4 bytes (32 bits) can represent the following number of distinct values:

In [13]:
2 ** (4 * 8)

4294967296

However, when the number is signed, these values must be split between negative and non-negative numbers. The largest positive value is 1 smaller in magnitude than the most negative value because one of the non-negative values is used to represent ```0```:

In [14]:
-(2 ** (4 * 8)) // 2, (2 ** (4 * 8)) // 2 - 1

(-2147483648, 2147483647)
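The same arithmetic can be sketched for any number of bytes. The helper ```signed_range``` below is hypothetical and not part of the ```array``` module. The 1 byte signed type code ```'b'``` is then used to show that a value outside the range raises an ```OverflowError```:

```python
import array

def signed_range(n_bytes):
    """Return (minimum, maximum) for a two's complement signed integer."""
    n_values = 2 ** (8 * n_bytes)
    return -n_values // 2, n_values // 2 - 1

print(signed_range(1))
print(signed_range(4))

# 'b' is a 1 byte signed integer, so 127 fits but 128 does not
small = array.array('b', [127])
try:
    array.array('b', [128])
    overflowed = False
except OverflowError:
    overflowed = True

print(overflowed)
```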

The ```array``` instance is a ```Collection``` and otherwise behaves consistently with a ```list```. The ```+``` operator, for example, performs concatenation:

In [15]:
a_nums1 + a_nums2

array('l', [1, 2, 3, 4, 5, 2, 4, 6, 8, 10])

```list``` comprehension can be used for addition:

In [16]:
array.array('l', [num1 + num2 for num1, num2 in zip(a_nums1, a_nums2)])

array('l', [3, 6, 9, 12, 15])

The type code ```'d'``` can be used to create an array where each element is an 8 byte ```float``` instance:

In [17]:
a_nums3 = array.array('d', [0.1, 0.2, 0.3, 0.4, 0.5])
a_nums4 = array.array('d', [0.2, 0.4, 0.6, 0.8, 1.0])

In [18]:
variables(['a_nums3', 'a_nums4'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|a_nums3|array|5|array('d', [0.1, 0.2, 0.3, 0.4, 0.5])|
|a_nums4|array|5|array('d', [0.2, 0.4, 0.6, 0.8, 1.0])|


Each ```float``` in this array behaves consistently with a ```builtins``` ```float```: it is displayed in decimal but encoded in binary. Therefore the recurring rounding errors encountered previously when the ```float``` class was examined still apply:

In [19]:
array.array('d', [num3 + num4 for num3, num4 in zip(a_nums3, a_nums4)])

array('d', [0.30000000000000004, 0.6000000000000001, 0.8999999999999999, 1.2000000000000002, 1.5])

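It is possible to use other type codes to conserve memory, however this comes at the expense of precision and dynamic range. Changing the type code from ```'d'``` (8 byte double precision) to ```'f'``` (4 byte single precision) halves the storage for each element and reduces the precision from roughly 15 to roughly 7 significant decimal digits, which can be seen in the values below: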

In [20]:
array.array('f', [num3 + num4 for num3, num4 in zip(a_nums3, a_nums4)])

array('f', [0.30000001192092896, 0.6000000238418579, 0.8999999761581421, 1.2000000476837158, 1.5])

Note that each element is converted to a ```builtins``` ```float```, which is double precision (```'d'```), when it is retrieved. This is why the lower precision ```'f'``` values above are displayed with the number of digits of a ```'d'```. The digits beyond roughly the seventh significant figure are not meaningful.
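The difference between the two type codes can be sketched with a single value. The names ```single```, ```double``` and ```difference``` are illustrative:

```python
import array

single = array.array('f', [0.1])   # 4 byte single precision
double = array.array('d', [0.1])   # 8 byte double precision

print(single.itemsize, double.itemsize)

# elements are returned as builtins float instances (double precision)
print(type(single[0]), single[0])
print(type(double[0]), double[0])

# the single precision value only agrees with 0.1 to about 7 significant figures
difference = abs(single[0] - 0.1)
print(difference)
```

The ```'d'``` element compares equal to ```0.1``` but the ```'f'``` element does not, because it was rounded to single precision when it was stored.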

## Dimensionality

In mathematics, a matrix has rows and columns:

$$ \begin{bmatrix} 
   1 & 2 & 3 \\
   4 & 5 & 6 \\
   7 & 8 & 9 \\
   \end{bmatrix} $$

By definition "a row" consists of 1 row by n (in this case 3) columns:

$$\begin{bmatrix}1&2&3\end{bmatrix}$$

By definition "a column" consists of multiple rows (in this case 3) by 1 column:

$$  \begin{bmatrix}
    1 \\
    4 \\
    7 \\
    \end{bmatrix} $$ 

The ```array```, ```list``` and ```tuple``` are 1 dimensional collections. These are normally represented as a row vector:

In [21]:
vec = array.array('l', [1, 2, 3])

In [22]:
vec

array('l', [1, 2, 3])

In [23]:
len(vec)

3

If they are input as a column then nothing changes:

In [24]:
vec = array.array('l', [1, 
                        4, 
                        7])

The default representation returned shows that the array is still represented as a row:

In [25]:
vec

array('l', [1, 4, 7])

In [26]:
len(vec)

3

Strictly the ```array``` instance is neither a row nor a column but is a ```Collection``` that has a single dimension with a length of ```3```.

Each element in an ```array``` instance is a fixed fundamental datatype however a ```list``` or ```tuple``` can nest other ```Collections```:

$$ \begin{bmatrix} 
   1 & 2 & 3 \\
   4 & 5 & 6 \\
   7 & 8 & 9 \\
   \end{bmatrix} $$

Each row in the matrix can be represented as a 3-element ```tuple```:

In [27]:
row0 = (1, 2, 3)
row1 = (4, 5, 6)
row2 = (7, 8, 9)

And each of these ```tuple``` rows can be an element in a ```list```:

In [28]:
[row0, 
 row1, 
 row2]

[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In the above, the different brackets for the 2 collections are used to clearly distinguish rows and columns. The ```,``` delimiter in each ```tuple``` is an instruction to move onto a new column within a row and the ```,``` delimiter in the ```list``` is an instruction to move onto a new row.

It is more typical to use a ```list``` of nested ```list``` instances. It can be spaced out as a matrix for clarity but the default representation will show a flattened format:

In [29]:
nums = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

Notice now that the outer list is indexed into to get a row:

In [30]:
nums[0]

[1, 2, 3]

Then this row (inner list) is indexed into to get the column:

In [31]:
nums[0][1]

2

This means that a nested ```for``` loop is needed to perform a numeric operation, for example adding the scalar ```1``` to every element of this matrix, and the syntax becomes quite cumbersome:

In [32]:
outer = []

for row in nums:
    inner = []
    for num in row:
        inner.append(num + 1) 
    outer.append(inner)
    
outer

[[2, 3, 4], [5, 6, 7], [8, 9, 10]]
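The nested loop can be shortened to a nested list comprehension. For comparison, the sketch below also shows the equivalent operation on an ```ndarray```, assuming ```numpy``` is installed. The names ```grid```, ```plus_one_lists``` and ```plus_one_array``` are illustrative:

```python
import numpy as np  # third-party; assumed to be installed

# the same values as the nested list above, under a hypothetical name
grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# nested list comprehension: outer loop over rows, inner loop over elements
plus_one_lists = [[num + 1 for num in row] for row in grid]

# ndarray: the scalar is broadcast to every element
plus_one_array = np.array(grid) + 1

print(plus_one_lists)
print(plus_one_array)
```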

## NDArray

The ```numpy``` library can be imported using:

In [33]:
import numpy as np

Because the ```numpy``` library includes a large number of functions that are commonly used, it is usually imported using the 2 letter alias ```np```. 

```numpy``` is a third-party Python library. It is preinstalled in the Anaconda base Python environment of the Anaconda Python distribution but is not preinstalled with Python. If a ```ModuleNotFoundError``` is encountered, you will need to manage your Python environment. See previous tutorials on installation for more details.

```np``` has a huge number of identifiers. A large number of obsolete identifiers that are pending deprecation have been excluded to clean up the output below. These are due to be removed in numpy 2.0.0, which was still under development at the time of writing (see the [numpy 2.0.0 release notes](https://numpy.org/devdocs/release/2.0.0-notes.html)).

The lower case classes include the ```ndarray```, which is the main data structure used by the ```np``` library. An ```ndarray```, like an ```array```, uses a fixed datatype and the remaining lower case classes correspond to these datatypes.

The attributes include the constants found in the ```math``` module such as ```e```, ```pi```, ```inf``` and ```nan```. There were upper case versions of these but they are pending deprecation; upper case identifiers are being reserved mainly for constants and flags that are typically used internally by ```np```.

There are a number of modules such as ```linalg``` and ```random``` which compartmentalise functions for linear algebra and functions for random number generation.

There are a large number of functions which are used to manipulate an ```ndarray```. Although this list of identifiers is very long, many of the identifiers are recognisable and have similar names to ```builtins``` identifiers, methods from ```builtins``` classes, or identifiers found in the ```math``` module. These identifiers behave consistently but are designed to broadcast across an ```ndarray```.

In [34]:
from obsolete import pending_depreciation
dir2(np, exclude_external_modules=True, exclude_identifier_list=pending_depreciation)

{'attribute': ['c_',
               'e',
               'euler_gamma',
               'index_exp',
               'inf',
               'little_endian',
               'mgrid',
               'nan',
               'newaxis',
               'numarray',
               'ogrid',
               'oldnumeric',
               'pi',
               'r_',
               's_',
               'sctypeDict',
               'typecodes'],
 'constant': ['ERR_CALL',
              'ERR_DEFAULT',
              'ERR_IGNORE',
              'ERR_LOG',
              'ERR_PRINT',
              'ERR_RAISE',
              'ERR_WARN',
              'False_',
              'NAN',
              'SHIFT_DIVIDEBYZERO',
              'SHIFT_INVALID',
              'SHIFT_OVERFLOW',
              'SHIFT_UNDERFLOW',
              'ScalarType',
              'True_'],
 'module': ['char',
            'ctypeslib',
            'dtypes',
            'emath',
            'exceptions',
            'fft',
            'lib',
     

The identifiers for the ```ndarray``` class can also be examined, notice it has a number of attributes that are consistent with numeric instances. It also has a large number of methods which are analogous to the functions seen in the ```np``` library and a large number of datamodel methods which are setup to be consistent with numeric instances and carry out numeric operations:

In [35]:
dir2(np.ndarray, object, unique_only=True, exclude_external_modules=True, exclude_identifier_list=pending_depreciation)

{'attribute': ['base',
               'ctypes',
               'data',
               'dtype',
               'flags',
               'flat',
               'imag',
               'itemsize',
               'ndim',
               'real',
               'shape',
               'size',
               'strides'],
 'constant': ['T'],
 'method': ['all',
            'any',
            'argmax',
            'argmin',
            'argpartition',
            'argsort',
            'astype',
            'byteswap',
            'choose',
            'clip',
            'compress',
            'conj',
            'conjugate',
            'copy',
            'cumprod',
            'cumsum',
            'diagonal',
            'dot',
            'dump',
            'dumps',
            'fill',
            'flatten',
            'getfield',
            'item',
            'itemset',
            'max',
            'mean',
            'min',
            'newbyteorder',
            'nonzero',
          

## NDArray Class Factory Functions


The n-dimensional array ```ndarray``` class is the data structure the ```numpy``` library is based around:

In [36]:
np.ndarray?

Init signature: np.ndarray(self, /, *args, **kwargs)
Docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
        strides=None, order=None)

An array object represents a multidimensional, homogeneous array
of fixed-size items.  An associated data-type object describes the
format of each element in the array (its byte-order, how many bytes it
occupies in memory, whether it is an integer, a floating point number,
or something else, etc.)

Arrays should be constructed using `array`, `zeros` or `empty` (refer
to the See Also section below).  The parameters given here refer to
a low-level method (`ndarray(...)`) for instantiating an array.

For more information, refer to the `numpy` module and examine the
methods and attributes of an array.

Parameters
----------
(for the __new__ method; see N

The initialisation signature is not normally used directly. Instead the docstring states that arrays should be constructed using factory functions such as ```array```, ```zeros``` or ```empty```; ```ones``` and ```full``` are related factory functions. These functions make it easier to supply initialisation data to construct a new ```np.ndarray```. When the ```np.ndarray``` class is used directly, initialisation data must be supplied as a ```buffer```, normally in the form of a ```bytearray```, which is machine readable but not very human readable. Nevertheless, the docstring outlines some important parameters for ```ndarray``` creation:

|parameter|description|
|---|---|
|shape|tuple of ints representing the shape of the created array.|
|dtype|data-type, optional object that can be interpreted as a numpy data type.|
|buffer|bytearray, used to supply array with data|
|order|{'C', 'F'}, optional Row-major (C-style) or column-major (Fortran-style) order.|

The docstring also outlines some of the important attributes of an ```ndarray```:

|attribute|description|
|---|---|
|size|int number of elements in the array|
|ndim|int number of dimensions of the array|
|shape|tuple of ints representing the shape of the array.|
|T|transpose of the array.|
|flat|flattened version of the array as an iterator.|
|real|real part of the array.|
|imag|imaginary part of the array.|
|data|buffer object pointing to the start of the array's data in memory.|
|itemsize|the memory use of each array element in bytes.|
|nbytes|the total number of bytes required to store the array data, size * itemsize|
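These attributes can be sketched on a small array. The name ```demo``` and its values are illustrative, and the 4 byte ```np.int32``` datatype is specified explicitly so that the byte counts do not depend on the platform:

```python
import numpy as np  # third-party; assumed to be installed

# hypothetical 2 by 3 example with an explicit 4 byte integer dtype
demo = np.array([[1, 2, 3],
                 [4, 5, 6]], dtype=np.int32)

print(demo.size)       # number of elements
print(demo.ndim)       # number of dimensions
print(demo.shape)      # length of each dimension
print(demo.itemsize)   # bytes per element
print(demo.nbytes)     # total bytes, size * itemsize
print(demo.T.shape)    # the transpose swaps the dimensions
```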

The following matrix:

$$ \begin{bmatrix} 
   0 & 1 & 2 & 3\\
   4 & 5 & 6 & 7\\
   8 & 9 & 10 & 11\\
   12 & 13 & 14 & 15\\
   -1 & -2 & -3 & -4\\
   -5 & -6 & -7 & -8\\
   -9 & -10 & -11 & -12\\
   -13 & -14 & -15 & -16\\      
   \end{bmatrix} $$

has 8 rows by 4 columns and therefore a ```shape``` of ```(8, 4)```.

In this example each value has a 32 bit integer datatype (```np.int32```; on the platform used here this is also what the ```builtins``` ```int``` maps to) and therefore 32 bits are used under the hood to encode each element in the array. Recall that a hexadecimal character represents 4 bits, so 8 hexadecimal characters are needed for the 32 bits. The 4 bytes of each value are arranged in little endian order (least significant byte first) and negative values are encoded using two's complement. For clarity the hexadecimal values will be represented as a ```list```:

In [37]:
hexvals = ['00000000', '01000000', '02000000', '03000000',
           '04000000', '05000000', '06000000', '07000000',
           '08000000', '09000000', '0a000000', '0b000000',
           '0c000000', '0d000000', '0e000000', '0f000000',
           'ffffffff', 'feffffff', 'fdffffff', 'fcffffff',
           'fbffffff', 'faffffff', 'f9ffffff', 'f8ffffff',
           'f7ffffff', 'f6ffffff', 'f5ffffff', 'f4ffffff',
           'f3ffffff', 'f2ffffff', 'f1ffffff', 'f0ffffff']

Which will be joined into a single hexadecimal ```str```:

In [38]:
hexvals = ''.join(hexvals)

In [39]:
hexvals

'000000000100000002000000030000000400000005000000060000000700000008000000090000000a0000000b0000000c0000000d0000000e0000000f000000fffffffffefffffffdfffffffcfffffffbfffffffafffffff9fffffff8fffffff7fffffff6fffffff5fffffff4fffffff3fffffff2fffffff1fffffff0ffffff'

And then cast into a ```bytearray```:

In [40]:
b = bytearray.fromhex(hexvals)

In [41]:
b

bytearray(b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00\r\x00\x00\x00\x0e\x00\x00\x00\x0f\x00\x00\x00\xff\xff\xff\xff\xfe\xff\xff\xff\xfd\xff\xff\xff\xfc\xff\xff\xff\xfb\xff\xff\xff\xfa\xff\xff\xff\xf9\xff\xff\xff\xf8\xff\xff\xff\xf7\xff\xff\xff\xf6\xff\xff\xff\xf5\xff\xff\xff\xf4\xff\xff\xff\xf3\xff\xff\xff\xf2\xff\xff\xff\xf1\xff\xff\xff\xf0\xff\xff\xff')
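The hexadecimal strings do not need to be typed by hand. As a sketch, the ```int``` method ```to_bytes``` can generate the same little endian, two's complement encoding. The names ```values```, ```generated_hex``` and ```generated_buffer``` are illustrative:

```python
# the 32 values of the matrix above, in row order
values = list(range(16)) + list(range(-1, -17, -1))

# 4 bytes per value, least significant byte first, two's complement for negatives
generated_hex = [value.to_bytes(4, byteorder='little', signed=True).hex()
                 for value in values]
generated_buffer = bytearray.fromhex(''.join(generated_hex))

print(generated_hex[1], generated_hex[16], generated_hex[31])
print(len(generated_buffer))
```

The buffer has 32 values of 4 bytes each, giving 128 bytes in total.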

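The ```np.ndarray``` class can now be used with ```shape=(8, 4)```, ```dtype=np.int32``` and ```buffer=b```. The 4 byte ```np.int32``` is specified explicitly because the size selected by ```dtype=int``` is platform dependent:

In [42]:
mat = np.ndarray(shape=(8, 4), dtype=np.int32, buffer=b)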

In [43]:
variables(['mat'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|mat|ndarray|(8, 4)|[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [-1, -2, -3, -4], [-5, -6, -7, -8], [-9, -10, -11, -12], [-13, -14, -15, -16]]|


The formal ```str``` representation shown in the cell output shows the preferred way of constructing this ```np.ndarray``` instance, using the ```np.array``` factory function:

In [44]:
repr(mat)

'array([[  0,   1,   2,   3],\n       [  4,   5,   6,   7],\n       [  8,   9,  10,  11],\n       [ 12,  13,  14,  15],\n       [ -1,  -2,  -3,  -4],\n       [ -5,  -6,  -7,  -8],\n       [ -9, -10, -11, -12],\n       [-13, -14, -15, -16]])'

It is clearer when shown in a cell output as the ```\n``` new line escape characters are processed:

In [45]:
mat

array([[  0,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15],
       [ -1,  -2,  -3,  -4],
       [ -5,  -6,  -7,  -8],
       [ -9, -10, -11, -12],
       [-13, -14, -15, -16]])

The informal ```str``` representation instead shows a form more similar to the way a matrix is represented in formatted TeX:

In [46]:
str(mat)

'[[  0   1   2   3]\n [  4   5   6   7]\n [  8   9  10  11]\n [ 12  13  14  15]\n [ -1  -2  -3  -4]\n [ -5  -6  -7  -8]\n [ -9 -10 -11 -12]\n [-13 -14 -15 -16]]'

This is clearer when printed as the ```\n``` escape character is processed:

In [47]:
print(mat)

[[  0   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]
 [ 12  13  14  15]
 [ -1  -2  -3  -4]
 [ -5  -6  -7  -8]
 [ -9 -10 -11 -12]
 [-13 -14 -15 -16]]


Note that the informal representation removes the ```array``` function name and the ```,``` delimiter and looks similar to the typical representation used in formatted TeX, however the outer ```[]``` is still shown to indicate dimensionality.

The ```ndim```, ```shape``` and ```dtype``` are given as attributes:

In [48]:
mat.ndim

2

In [49]:
mat.shape

(8, 4)

In [50]:
mat.dtype

dtype('int32')

The function ```np.frombuffer``` will instead create a 1d array from the supplied ```buffer```:

In [51]:
vec = np.frombuffer(buffer=b, dtype=np.int32)

Notice that there is only 1 set of square brackets ```[]```:

In [52]:
vec

array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  -1,  -2,  -3,  -4,  -5,  -6,  -7,  -8,  -9, -10,
       -11, -12, -13, -14, -15, -16])

Because there is only 1 dimension:

In [53]:
vec.ndim

1

In [54]:
vec.shape

(32,)

In [55]:
vec.dtype

dtype('int32')

When the ```np.ndarray``` initialisation signature or ```np.frombuffer``` is used, the ```buffer``` supplied must match what is expected for the ```dtype```. In the example above a simple integer datatype was selected; the byte encoding of other datatypes such as ```float``` is much more complicated.
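As a sketch of a ```float``` buffer, the standard library ```struct``` module can pack values into bytes, which ```np.frombuffer``` can read back when the ```dtype``` matches. The values and the names ```float_buffer```, ```floats``` and ```mismatched``` are illustrative:

```python
import struct
import numpy as np  # third-party; assumed to be installed

# hypothetical values; '<3d' packs three little endian 8 byte floats
float_buffer = struct.pack('<3d', 0.5, 1.5, -2.0)
print(float_buffer.hex())

# '<f8' is the matching little endian 8 byte float dtype
floats = np.frombuffer(float_buffer, dtype='<f8')
print(floats)

# reading the same bytes with a mismatched dtype gives unrelated values
mismatched = np.frombuffer(float_buffer, dtype='<i4')
print(mismatched)
```

The mismatched array has 6 elements instead of 3 because the same 24 bytes are split into 4 byte integers.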

Because the ```bytearray``` ```buffer``` is not very human readable, the factory function ```np.array``` is usually used for ```np.ndarray``` instantiation:

In [56]:
np.array?

Docstring:
array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
      like=None)

Create an array.

Parameters
----------
object : array_like
    An array, any object exposing the array interface, an object whose
    ``__array__`` method returns an array, or any (nested) sequence.
    If object is a scalar, a 0-dimensional array containing object is
    returned.
dtype : data-type, optional
    The desired data-type for the array. If not given, NumPy will try to use
    a default ``dtype`` that can represent the values (by applying promotion
    rules when necessary.)
copy : bool, optional
    If true (default), then the object is copied.  Otherwise, a copy will
    only be made if ``__array__`` returns a copy, if obj is a nested
    sequence, or if a copy is needed to satisfy any of the other
    requirements (``dtype``, ``order``, etc.).
order : {'K', 'A', 'C', 'F'}, optional
    Specify the memory layout of the array. If object is not an array, the
   

The positional input argument is normally assigned to a ```list``` of numbers or a ```list``` of equally sized ```list``` instances, that in turn contain numeric values.

If only ```object``` is supplied to the ```np.array``` factory function, the ```dtype``` and the number of dimensions are inferred from the elements in the ```list``` (or ```list``` of nested ```list``` instances):

|parameter|description|
|---|---|
|object|any (nested) sequence, usually a list.|
|dtype|data-type, optional object that can be interpreted as a numpy data type.|
|copy|if true (default), then the object is copied.|
|ndmin|specifies the minimum number of dimensions that the resulting array should have. Ones will be prepended to the shape to satisfy this requirement|
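The default behaviour of the ```copy``` parameter can be sketched as follows. The names ```original```, ```copied``` and ```same``` are illustrative, and ```np.asarray``` is used only as a contrast because it returns its input unchanged when no conversion is needed:

```python
import numpy as np  # third-party; assumed to be installed

original = np.array([1, 2, 3])

copied = np.array(original)      # copy=True by default: a new, independent array
same = np.asarray(original)      # returns the input when no conversion is needed

copied[0] = 99                   # does not change original

print(original)
print(same is original)
print(np.shares_memory(original, copied))
```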

If the ```object``` supplied is a ```list```, a 1d ```ndarray``` instance will be instantiated:

In [57]:
vec = np.array(object=[1, 2, 3])

In [58]:
vec

array([1, 2, 3])

In [59]:
vec.dtype

dtype('int32')

In [60]:
vec.ndim

1

In [61]:
vec.shape

(3,)

If the ```object``` is a ```list``` of nested ```list``` instances a 2d ```ndarray``` will be instantiated:

In [62]:
mat = np.array(object=[[1, 2, 3], 
                       [4, 5, 6], 
                       [7, 8, 9]])

In [63]:
mat

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [64]:
mat.dtype

dtype('int32')

In [65]:
mat.ndim

2

In [66]:
mat.shape

(3, 3)

In [67]:
mat.size

9

Normally the ```object``` is supplied positionally and the other keyword arguments are only supplied when they differ from the defaults. For example ```ndmin``` can be assigned to ```2```, which will create a 2d ```ndarray``` that is a row vector as opposed to a 1d ```ndarray```:

In [68]:
row = np.array([1, 2, 3], ndmin=2)

In [69]:
row

array([[1, 2, 3]])

In [70]:
row.dtype

dtype('int32')

In [71]:
row.ndim

2

In [72]:
row.shape

(1, 3)

In [73]:
row.size

3

The transpose ```T``` attribute can be used to convert the row into a column:

In [74]:
col = row.T

In [75]:
col

array([[1],
       [2],
       [3]])

In [76]:
col.dtype

dtype('int32')

In [77]:
col.ndim

2

In [78]:
col.shape

(3, 1)

In [79]:
col.size

3
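Two further properties of the transpose can be sketched with illustrative names (```flat_vec```, ```row_vec``` and ```col_vec```): transposing a 1d array has no effect on its ```shape```, and the transpose of a 2d array is a view that shares memory with the original:

```python
import numpy as np  # third-party; assumed to be installed

flat_vec = np.array([1, 2, 3])            # 1d, shape (3,)
row_vec = np.array([1, 2, 3], ndmin=2)    # 2d, shape (1, 3)
col_vec = row_vec.T                       # 2d, shape (3, 1)

# transposing a 1d array leaves the shape unchanged
print(flat_vec.T.shape)

# the transpose is a view: it shares memory with the original
print(np.shares_memory(row_vec, col_vec))
col_vec[0, 0] = 100
print(row_vec)
```

Because the memory is shared, the assignment made through ```col_vec``` is also visible in ```row_vec```.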

The 1d array or *1d vector* has a single dimension that spans over n columns (in this case n=3):

In [80]:
vec.shape

(3,)

The 2d array or *2d row vector*, has 1 row and n columns (in this case n=3):

In [81]:
row.shape

(1, 3)

The 2d array or *2d col vector*, has n rows and 1 column (in this case n=3):

In [82]:
col.shape

(3, 1)

If the following are compared:

In [83]:
vec.shape

(3,)

In [84]:
row.shape

(1, 3)

The ```shape``` is a ```tuple``` containing dimensions. 

Notice that the new dimension is added on the left of the ```shape``` ```tuple```; it is the length of the outer ```list```. The next element is the length of a nested ```list``` instance:

In [85]:
len([[1, 2, 3]])

1

In [86]:
len([1, 2, 3])

3

```vec```, ```row``` and ```col``` can be viewed side by side:

In [87]:
variables(['vec', 'row', 'col'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|vec|ndarray|(3,)|[1, 2, 3]|
|row|ndarray|(1, 3)|[[1, 2, 3]]|
|col|ndarray|(3, 1)|[[1], [2], [3]]|


The datatype ```dtype``` can be specified:

In [88]:
rowf = np.array([1, 2, 3], dtype=np.float64, ndmin=2)

Now all the numeric values can be seen to be ```float``` instances:

In [89]:
rowf

array([[1., 2., 3.]])

This can be confirmed by examining the ```dtype``` attribute:

In [90]:
rowf.dtype

dtype('float64')

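The ```np``` classes are equivalent to those in ```builtins```, however their names contain the number of bits each instance occupies. The default integer size is platform dependent; ```int32``` is the default on the platform used here: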

|np|builtins|
|---|---|
|int32|int|
|float64|float|
|complex128|complex|

When the datatype ```dtype``` of an array is set to a ```builtins``` class, the equivalent ```np``` class will be selected:

In [91]:
np.array([1, 2, 3], dtype=float, ndmin=2).dtype

dtype('float64')
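The mapping can be inspected directly with ```np.dtype```. This is a small sketch with illustrative names; the integer result is deliberately not assumed because it depends on the platform and the ```numpy``` version:

```python
import numpy as np  # third-party; assumed to be installed

float_dtype = np.dtype(float)
complex_dtype = np.dtype(complex)
int_dtype = np.dtype(int)       # platform and version dependent: int32 or int64

for dtype in (int_dtype, float_dtype, complex_dtype):
    # itemsize is in bytes; the number in the name is in bits
    print(dtype.name, dtype.itemsize, dtype.itemsize * 8)
```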

The docstring also includes details about other factory functions to create an ```ndarray```:

|factory function|description|
|---|---|
|empty_like|Return an empty array with shape and type of input.|
|ones_like|Return an array of ones with shape and type of input.|
|zeros_like|Return an array of zeros with shape and type of input.|
|full_like|Return a new array with shape of input filled with value.|
|empty| Return a new uninitialized array.|
|ones|Return a new array setting values to one.|
|zeros|Return a new array setting values to zero.|
|full|Return a new array of given shape filled with value.|

Supposing the following matrix ```mat1``` is constructed:

In [92]:
mat1 = np.array([[1, 2, 3],
                 [4, 5, 6]])

This has 2 dimensions:

In [93]:
mat1.ndim

2

2 rows by 3 columns:

In [94]:
mat1.shape

(2, 3)

And 6 elements:

In [95]:
mat1.size

6

The other factory functions ```empty_like```, ```ones_like```, ```zeros_like``` and ```full_like``` can be used on this prototype array.

The ```empty_like``` function uses the input argument ```prototype``` and leaves the values uninitialised, so whatever happens to be in memory (junk values) is shown:

In [96]:
mat2 = np.empty_like(prototype=mat1)
mat2

array([[         0, 1072693248,          0],
       [1073741824,          0, 1074266112]])

The ```ones_like```, ```zeros_like``` and ```full_like``` functions use the input argument ```a``` (for array) and ```full_like``` requires an additional input argument ```fill_value```:

In [97]:
mat3 = np.zeros_like(a=mat1)
mat3

array([[0, 0, 0],
       [0, 0, 0]])

In [98]:
mat4 = np.ones_like(a=mat1)
mat4

array([[1, 1, 1],
       [1, 1, 1]])

In [99]:
mat5 = np.full_like(a=mat1, fill_value=2)
mat5

array([[2, 2, 2],
       [2, 2, 2]])

```empty```, ```zeros```, ```ones``` and ```full``` can be used to instead create an equivalent matrix from a ```shape```, which is a ```tuple``` of dimensions:

In [100]:
mat6 = np.empty(shape=(4, 3))
mat6

array([[5.00000000e+000, 0.00000000e+000, 5.00000000e+000],
       [0.00000000e+000, 5.00000000e+000, 0.00000000e+000],
       [5.58431942e-091, 2.70237043e-056, 8.26661559e-072],
       [3.85778377e-057, 3.99910963e+252, 5.99367144e-038]])

In [101]:
mat7 = np.zeros(shape=(2, 1))
mat7

array([[0.],
       [0.]])

In [102]:
mat7 = np.ones(shape=(2, 2))
mat7

array([[1., 1.],
       [1., 1.]])

In [103]:
mat8 = np.full(shape=(5, 1), fill_value=4)
mat8

array([[4],
       [4],
       [4],
       [4],
       [4]])

Notice that the datatype ```dtype``` for ```empty```, ```zeros``` and ```ones``` is ```np.float64``` by default but can be overridden with the ```dtype``` keyword input argument:

In [104]:
mat9 = np.ones(shape=(4, 3), dtype=np.int32)
mat9

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

The ```dtype``` for the array constructed using the ```full``` function is inferred from the ```fill_value``` but can be overridden using the ```dtype``` keyword input argument.
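A short sketch of this inference, using illustrative names:

```python
import numpy as np  # third-party; assumed to be installed

from_int = np.full(shape=(2, 2), fill_value=4)      # integer dtype inferred
from_float = np.full(shape=(2, 2), fill_value=4.0)  # float dtype inferred
overridden = np.full(shape=(2, 2), fill_value=4, dtype=np.float64)

print(from_int.dtype, from_float.dtype, overridden.dtype)
```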

It is possible to construct higher dimension arrays, however it becomes difficult to visualise arrays that have a higher dimension than the computer screen. For this reason it is worthwhile conceptualising some physical objects:

|ndim|description|shape|
|---|---|---|
|1|line vector|(c, )|
|2|page consisting of rows of equal length line vectors|(r, c)|
|3|book of equally sized pages|(b, r, c)|
|4|shelf of equally sized books|(s, b, r, c)|
|5|wardrobe of equally sized shelves|(w, s, b, r, c)|
|6|library of equally sized wardrobes|(l, w, s, b, r, c)|
|7|group of equally sized libraries|(g, l, w, s, b, r, c)|
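As a sketch of the analogy in the table, a hypothetical shelf of 2 books, each with 3 pages of 4 rows by 5 columns, can be created from its ```shape```. The names ```shelf```, ```first_book``` and ```first_page``` are illustrative:

```python
import numpy as np  # third-party; assumed to be installed

# hypothetical shelf of 2 books, each with 3 pages of 4 rows by 5 columns
shelf = np.zeros(shape=(2, 3, 4, 5), dtype=np.int32)

print(shelf.ndim)
print(shelf.size)

first_book = shelf[0]         # indexing the outer dimension gives a book
first_page = shelf[0][0]      # then a page
print(first_book.shape, first_page.shape)
```

Each index removes the leftmost dimension, so a book has the ```shape``` ```(3, 4, 5)``` and a page has the ```shape``` ```(4, 5)```.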

A book can therefore be constructed from a ```list``` of nested ```list``` instances, each of which in turn contains nested ```list``` instances. Spacing is normally used to separate the matrices corresponding to each page:

In [105]:
book = np.array([[[ 1,  2], 
                  [ 3,  4]],
                  
                 [[ 5,  6], 
                  [ 7,  8]],
                  
                 [[ 9, 10], 
                  [11, 12]]])

In [106]:
book

array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])

This book has 3 dimensions:

In [107]:
book.ndim

3

With a total of 12 elements:

In [108]:
book.size

12

And a shape of 3 pages, by 2 rows by 2 columns:

In [109]:
book.shape

(3, 2, 2)

Each new dimension is always left appended to the ```shape``` ```tuple```, which can be seen by unpacking it:

In [110]:
(ncols,) = vec.shape

In [111]:
nrows, ncols = mat.shape

In [112]:
npages, nrows, ncols = book.shape

Therefore to get ```ncols``` from ```shape``` using a positive index, the index needs to increase with the number of dimensions:

In [113]:
ncols = vec.shape[0]

In [114]:
ncols = mat.shape[1]

In [115]:
ncols = book.shape[2]

The last index is ```-1``` and corresponds to ```ncols```:

In [116]:
ncols = vec.shape[-1]

In [117]:
ncols = mat.shape[-1]

In [118]:
ncols = book.shape[-1]

The second last index ```-2``` corresponds to ```nrows```:

In [119]:
nrows = mat.shape[-2]

In [120]:
nrows = book.shape[-2]
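
The difference between the positive and negative index can be summarised in a short sketch. The arrays ```vec_demo```, ```mat_demo``` and ```book_demo``` are hypothetical and are constructed so that each has 3 columns:

```python
import numpy as np

# Hypothetical arrays that all share 3 columns.
vec_demo = np.zeros(shape=(3,))
mat_demo = np.zeros(shape=(4, 3))
book_demo = np.zeros(shape=(2, 4, 3))

for arr in (vec_demo, mat_demo, book_demo):
    # The positive index of the columns is ndim - 1, the negative index is always -1.
    print(arr.ndim, arr.shape, arr.shape[arr.ndim - 1], arr.shape[-1])
```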

## Ravel and Reshape

An array has the attributes ```size```, ```ndim``` and ```shape```. The dimensionality of an array is ignored when the array is cast into an iterator using the ```flat``` attribute. Recall an iterator has no dimensionality because it only displays a single value at a time:

In [121]:
it = book.flat
it

<numpy.flatiter at 0x241c0429bf0>

The value of the iterator can be advanced using ```next```:

In [122]:
next(it)

1

In [123]:
next(it)

2

Alternatively it can be cast into a ```tuple``` to consume all remaining elements:

In [124]:
tuple(it) # remaining elements

(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

Notice that the array is deconstructed in row order, meaning each consecutive row is essentially concatenated.

In [125]:
book

array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])

In [126]:
tuple(book.flat)

(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

The function ```np.ravel``` and the method ```ndarray.ravel```, which does not mutate the original array, will instead flatten all the elements into a 1d ```ndarray```:

In [127]:
np.ravel(book)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [128]:
book.ravel()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

The ```numpy``` function and ```ndarray``` method ```ravel``` have the keyword input argument ```order```. The ```order``` is assigned to a string that is ```'C'``` (default) or ```'F'``` which stand for the C (row-major) and Fortran (column-major) programming languages respectively. **Do not confuse C with column.**

In [129]:
book.ravel(order='C')

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [130]:
book.ravel(order='F')

array([ 1,  5,  9,  3,  7, 11,  2,  6, 10,  4,  8, 12])

Comparison of the 2 unravelled arrays shows that it is far more logical to use the default ```'C'``` (row-order).
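
One way to reason about ```order='F'``` is that it gives the same result as transposing the array and then raveling in the default ```'C'``` order. This is a minimal sketch using a hypothetical array ```book_demo``` with the same values as the book above:

```python
import numpy as np

book_demo = np.arange(1, 13).reshape((3, 2, 2))

f_order = book_demo.ravel(order='F')
# Reversing the axes first and then raveling row by row visits the same elements.
transposed_c_order = book_demo.T.ravel(order='C')

print(f_order)
print(np.array_equal(f_order, transposed_c_order))
```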

The ```builtins``` class ```list``` can be used to cast the outer layer of an ```ndarray``` to a ```list```. Notice that this is a ```list``` of nested ```ndarray``` instances:

In [131]:
list(book)

[array([[1, 2],
        [3, 4]]),
 array([[5, 6],
        [7, 8]]),
 array([[ 9, 10],
        [11, 12]])]

The method ```ndarray.tolist```, which does not mutate the original array, will instead cast the ```ndarray``` into a ```list``` of nested ```list``` of nested ```list``` instances:

In [132]:
book.tolist()

[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
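
The difference between the two casts can be checked by inspecting the types at each level. This is a small sketch with the hypothetical names ```book_demo```, ```outer_only``` and ```fully_nested```:

```python
import numpy as np

book_demo = np.arange(1, 13).reshape((3, 2, 2))

outer_only = list(book_demo)         # list whose items are still 2d ndarray pages
fully_nested = book_demo.tolist()    # nested lists all the way down

print(type(outer_only[0]))
print(type(fully_nested[0]))
print(type(fully_nested[0][0][0]))   # the scalars become builtins int instances
```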

An ```ndarray``` can also be reshaped to new dimensions, providing that the ```size``` of the new dimensions matches the ```size``` of the original dimensions. The original ```ndarray``` is essentially raveled and each element is then used to populate the new dimensions. 

Once again there is the function ```np.reshape``` and the complementary method ```ndarray.reshape```, neither of which mutates the original array:

In [133]:
mat10 = np.reshape(book, (4, 3))

In [134]:
mat10

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [135]:
mat11 = book.reshape((4, 3))

In [136]:
mat11

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
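
The requirement that the ```size``` must match can be sketched as below. The name ```flat_demo``` and the candidate shapes are hypothetical. A shape whose dimensions do not multiply to the original ```size``` gives a ```ValueError```:

```python
import math
import numpy as np

flat_demo = np.arange(1, 13)   # 12 elements

for new_shape in [(4, 3), (2, 3, 2), (5, 3)]:
    # A reshape is only valid when the product of the new dimensions equals size.
    compatible = math.prod(new_shape) == flat_demo.size
    print(new_shape, compatible)

try:
    flat_demo.reshape((5, 3))   # 5 * 3 = 15 elements requested, only 12 available
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
```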

## Indexing and Slicing

If the following matrix:

$$ \begin{bmatrix} 
   1 & 2 & 3 \\
   4 & 5 & 6 \\
   7 & 8 & 9 \\
   10 & 11 & 12 \\
   \end{bmatrix} $$

Is represented as a ```list``` of ```tuple``` instances:

In [137]:
list_of_tuples = [( 1,  2,  3),
                  ( 4,  5,  6),
                  ( 7,  8,  9),
                  (10, 11, 12)]

Then a row can be selected by indexing into the ```list```. For example:

$$ 
\begin{bmatrix} 
\textbf{1} & \textbf{2} & \textbf{3} \\
4 & 5 & 6 \\
7 & 8 & 9 \\
10 & 11 & 12 \\
\end{bmatrix} 
$$

In [138]:
row0 = list_of_tuples[0]

In [139]:
row0

(1, 2, 3)

Then the ```tuple``` instance ```row0``` can be indexed into to select an element. For example:

$$ 
\begin{bmatrix} 
1 & \textbf{2} & 3\\
\end{bmatrix} 
$$

In [140]:
row0[1]

2

And this same element can be selected using two sets of square brackets which index into the ```list``` of ```tuple``` rows and the ```tuple``` row of elements respectively:

$$ 
\begin{bmatrix} 
1 & \textbf{2} & 3 \\
4 & 5 & 6 \\
7 & 8 & 9 \\
10 & 11 & 12 \\
\end{bmatrix} 
$$

In [141]:
list_of_tuples[0][1]

2

For an equivalent numpy array:

$$ 
\begin{bmatrix} 
1 & \textbf{2} & 3 \\
4 & 5 & 6 \\
7 & 8 & 9 \\
10 & 11 & 12 \\
\end{bmatrix} 
$$

In [142]:
mat1 = np.array([[ 1,  2,  3],
                 [ 4,  5,  6],
                 [ 7,  8,  9],
                 [10, 11, 12]])

Indexing can be carried out in a similar manner:

In [143]:
mat1[0][1]

2

However as the ```ndarray``` is set up to be an n-dimensional structure, there is a simpler syntax, which uses a single set of square brackets with ```,``` as a delimiter:

In [144]:
mat1[0, 1]

2

Notice that the order of the indexes is consistent with the order of the elements in ```shape```, which recall is a ```tuple``` of dimensions:

In [145]:
mat1.shape

(4, 3)

|ndim|description|indexing|
|---|---|---|
|1|line vector|array1d[c, ]|
|2|page consisting of rows of equal length line vectors|array2d[r, c]|
|3|book of equally sized pages|array3d[b, r, c]|
|4|shelf of equally sized books|array4d[s, b, r, c]|
|5|wardrobe of equally sized shelves|array5d[w, s, b, r, c]|
|6|library of equally sized wardrobes|array6d[l, w, s, b, r, c]|
|7|group of equally sized libraries|array7d[g, l, w, s, b, r, c]|

Multiple values can be selected from the array by indexing using a list. This outputs an ```ndarray```:

$$ \begin{bmatrix} 
    1 & \textbf{2} & 3 \\
    \textbf{4} & 5 & 6 \\
    7 & 8 & 9 \\
    10 & 11 & 12 \\
    \end{bmatrix} $$

The scalar ```2``` is on row ```0``` and column ```1```.

The scalar ```4``` is on row ```1``` and column ```0```.

In [146]:
mat1[[0, 1], [1, 0]]

array([2, 4])
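
Indexing with one ```list``` per axis pairs the indexes up element by element. The sketch below, which uses the hypothetical names ```mat_demo```, ```rows``` and ```cols```, shows that this is equivalent to selecting each scalar individually:

```python
import numpy as np

mat_demo = np.arange(1, 13).reshape((4, 3))
rows = [0, 1]
cols = [1, 0]

picked = mat_demo[rows, cols]
# The same selection made one (row, column) pair at a time.
paired = np.array([mat_demo[r, c] for r, c in zip(rows, cols)])

print(picked, paired)
```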

Supposing all of the items are along an axis, for example row 0, all columns:

$$ \begin{bmatrix} 
   \textbf{1} & \textbf{2} & \textbf{3} \\
   4 & 5 & 6 \\
   7 & 8 & 9 \\
   10 & 11 & 12 \\
   \end{bmatrix} $$

This can be indexed explicitly using a ```list``` of indexes for each axis:

In [147]:
mat1[[0, 0, 0], [0, 1, 2]]

array([1, 2, 3])

As shorthand, a scalar value can be used for the row:

In [148]:
mat1[0, [0, 1, 2]]

array([1, 2, 3])

To select all columns slicing can be used:

In [149]:
mat1[0, 0:mat1.shape[-1]:1]

array([1, 2, 3])

Recall that the slice has the general form ```start:stop:step```:

* if the ```start``` is not specified it is assumed to be ```0```.
* if the ```stop``` is not specified it is assumed to be the length of the array along that axis.
* if the ```step``` is not specified it is assumed to be ```1```.

Therefore the following is equivalent:

In [150]:
mat1[0, :]

array([1, 2, 3])

Slicing is also possible using a negative ```step```:

In [151]:
mat1[0, -1:-mat1.shape[-1]-1:-1]

array([3, 2, 1])

Use of a negative ```step``` will change the default for ```start``` and ```stop``` if not specified:

* if the ```start``` is not specified it is assumed to be ```-1```.
* if the ```stop``` is not specified it is assumed to be the negative length of the array along that axis minus 1.

Therefore the following is equivalent:

In [152]:
mat1[0, ::-1]

array([3, 2, 1])
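
The equivalence of the explicit and abbreviated forms can be verified in a short sketch. The name ```mat_demo``` is hypothetical and the ```builtins``` class ```slice``` is used to spell out what ```::-1``` means:

```python
import numpy as np

mat_demo = np.arange(1, 13).reshape((4, 3))
ncols = mat_demo.shape[-1]

explicit = mat_demo[0, -1:-ncols-1:-1]             # start, stop and step supplied
abbreviated = mat_demo[0, ::-1]                    # start and stop left to their defaults
with_slice = mat_demo[0, slice(None, None, -1)]    # the slice instance behind ::-1

print(explicit, abbreviated, with_slice)
```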

Since it is easier to conceptualise a matrix on a computer screen, higher-order arrays are often constructed using the ```np.zeros``` function and then populated by slicing and assigning a matrix to each slice:

In [153]:
book = np.zeros(shape=(3, 4, 3))
book

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

In [154]:
book.size

36

In [155]:
book.ndim

3

In [156]:
book.shape

(3, 4, 3)

The data for a single page can be constructed as a matrix:

In [157]:
page0 = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9],
                  [10, 11, 12]])

In [158]:
page0

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [159]:
page0.size

12

In [160]:
page0.ndim

2

In [161]:
page0.shape

(4, 3)

Notice that the following match:

In [162]:
book.shape[-1] == page0.shape[-1]

True

In [163]:
book.shape[-2] == page0.shape[-2]

True

And therefore the page can be added to the book using:

In [164]:
book[0, :, :] = page0

In [165]:
book

array([[[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.],
        [10., 11., 12.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

Another two pages can be created:

In [166]:
page1 = np.array([[13, 14, 15],
                  [16, 17, 18],
                  [19, 20, 21],
                  [22, 23, 24]])

In [167]:
book[1, :, :] = page1

In [168]:
page2 = np.array([[25, 26, 27],
                  [28, 29, 30],
                  [31, 32, 33],
                  [34, 35, 36]])

In [169]:
book[2, :, :] = page2

The ```book``` can be viewed:

In [170]:
book

array([[[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.],
        [10., 11., 12.]],

       [[13., 14., 15.],
        [16., 17., 18.],
        [19., 20., 21.],
        [22., 23., 24.]],

       [[25., 26., 27.],
        [28., 29., 30.],
        [31., 32., 33.],
        [34., 35., 36.]]])
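
The page by page assignment above can also be written as a loop. This is a sketch under the assumption that all pages have the same ```shape```, and the names ```pages``` and ```book_demo``` are hypothetical. Passing ```dtype``` to ```np.zeros``` keeps the integer datatype of the pages instead of the default ```np.float64```:

```python
import numpy as np

pages = [np.arange(1, 13).reshape((4, 3)),
         np.arange(13, 25).reshape((4, 3)),
         np.arange(25, 37).reshape((4, 3))]

# One page slot per matrix, with the rows and columns taken from the first page.
nrows, ncols = pages[0].shape
book_demo = np.zeros(shape=(len(pages), nrows, ncols), dtype=pages[0].dtype)

for idx, page in enumerate(pages):
    book_demo[idx, :, :] = page

print(book_demo.shape, book_demo.dtype.kind)
```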

## Axis, Flip, FlipLR, FlipUD

The following matrix:

$$ \text{mat} = \begin{bmatrix} 
                1 & 2 & 3 \\
                4 & 5 & 6 \\
                7 & 8 & 9 \\
                10 & 11 & 12 \\
                \end{bmatrix} $$

Can be instantiated:

In [171]:
mat = np.array([[ 1,  2,  3],
                [ 4,  5,  6],
                [ 7,  8,  9],
                [10, 11, 12]])

It can be flipped left to right:

$$ \text{matLR} = \begin{bmatrix} 
                  3 & 2 & 1 \\
                  6 & 5 & 4 \\
                  9 & 8 & 7 \\
                  12 & 11 & 10 \\
                  \end{bmatrix} $$


By indexing with rows=```:``` and columns=```::-1```

In [172]:
mat_lr = mat[:, ::-1]

In [173]:
mat_lr

array([[ 3,  2,  1],
       [ 6,  5,  4],
       [ 9,  8,  7],
       [12, 11, 10]])

This can also be performed using the ```np.fliplr``` function:

In [174]:
mat_lr = np.fliplr(m=mat)

In [175]:
mat_lr

array([[ 3,  2,  1],
       [ 6,  5,  4],
       [ 9,  8,  7],
       [12, 11, 10]])

There is an associated function ```np.flip```. This function has the keyword input argument ```axis```, which is a common parameter for ```numpy``` functions.

When flipping left right using indexing, all rows were selected (rows=```:```) and the columns were reversed (columns=```::-1```). Therefore the flip operation is carried out along the column axis.

Since a matrix is a 2d array, the rows are ```axis=0``` and the columns are ```axis=1```.

In [176]:
mat_lr = np.flip(m=mat, axis=1)

In [177]:
mat_lr

array([[ 3,  2,  1],
       [ 6,  5,  4],
       [ 9,  8,  7],
       [12, 11, 10]])

Recall that a new dimension in ```shape```, which is a ```tuple``` of dimensions, is left appended. This means the positive ```axis``` value of the columns will increase by ```1``` for each additional dimension.

On the other hand the negative value will always be ```-1``` for columns, ```-2``` for rows and so on... It is therefore recommended to use the negative value for the sake of consistency:

In [178]:
mat_lr = np.flip(m=mat, axis=-1)

In [179]:
mat_lr

array([[ 3,  2,  1],
       [ 6,  5,  4],
       [ 9,  8,  7],
       [12, 11, 10]])
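
The benefit of the negative value can be sketched by flipping arrays with a different number of dimensions using the same ```axis=-1```. The names ```mat_demo``` and ```book_demo``` are hypothetical, and the ellipsis ```...``` is used in the comparison to select all leading axes:

```python
import numpy as np

mat_demo = np.arange(1, 13).reshape((4, 3))
book_demo = np.arange(1, 25).reshape((2, 4, 3))

for arr in (mat_demo, book_demo):
    flipped = np.flip(arr, axis=-1)
    # The ellipsis selects every leading axis, so only the columns are reversed.
    print(arr.ndim, np.array_equal(flipped, arr[..., ::-1]))
```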

A matrix can be up down flipped:

$$ \text{mat} = \begin{bmatrix} 
                1 & 2 & 3 \\
                4 & 5 & 6 \\
                7 & 8 & 9 \\
                10 & 11 & 12 \\
                \end{bmatrix} $$


$$ \text{matUD} = \begin{bmatrix} 
                10 & 11 & 12 \\
                7 & 8 & 9 \\
                4 & 5 & 6 \\
                1 & 2 & 3 \\
                \end{bmatrix} $$


This can be done by indexing with rows=```::-1``` and columns=```:```

In [180]:
mat_ud = mat[::-1, :]

In [181]:
mat_ud

array([[10, 11, 12],
       [ 7,  8,  9],
       [ 4,  5,  6],
       [ 1,  2,  3]])

Using the function ```np.flipud```:

In [182]:
mat_ud = np.flipud(m=mat)

In [183]:
mat_ud

array([[10, 11, 12],
       [ 7,  8,  9],
       [ 4,  5,  6],
       [ 1,  2,  3]])

Using the function ```np.flip``` with ```axis=-2```:

In [184]:
mat_ud = np.flip(m=mat, axis=-2)

In [185]:
mat_ud

array([[10, 11, 12],
       [ 7,  8,  9],
       [ 4,  5,  6],
       [ 1,  2,  3]])

## Transposing

The ```T``` attribute and complementary function ```np.transpose``` can be used to transpose an array. This is most commonly used for 2d arrays. For example, a 2d row vector can be transposed into a 2d column vector or vice versa:

$$\text{r}=\left[\begin{matrix}1&2&3&4\end{matrix}\right]$$


$$\text{c}=\left[\begin{matrix}
                 1\\
                 2\\
                 3\\
                 4\\
                 \end{matrix}\right]$$

A row vector can be transposed into a column, which can be transposed back into a row:

In [186]:
r = np.array([1, 2, 3, 4], ndmin=2)

In [187]:
r

array([[1, 2, 3, 4]])

In [188]:
c = r.T

In [189]:
c

array([[1],
       [2],
       [3],
       [4]])

In [190]:
r2 = c.T

In [191]:
r2

array([[1, 2, 3, 4]])

Transposing the 2d array 2 times, returns the original array.

Recall that a 1d ```ndarray``` is neither a row nor a column. This is seen when an attempt is made to transpose it, as the result is unchanged:

In [192]:
v = np.array([1, 2, 3, 4], ndmin=1)

In [193]:
v

array([1, 2, 3, 4])

In [194]:
v.T

array([1, 2, 3, 4])

A 1d ```ndarray``` can explicitly be converted into a 2d row or column by indexing using all elements ```:``` and ```np.newaxis```.

For a column, the ```np.newaxis``` should be placed at position ```-1```:

In [195]:
c = v[:, np.newaxis]

In [196]:
c

array([[1],
       [2],
       [3],
       [4]])

For a row, the ```np.newaxis``` should be placed at position ```-2```:

In [197]:
r = v[np.newaxis, :]

In [198]:
r

array([[1, 2, 3, 4]])
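
The effect of the position of ```np.newaxis``` is easiest to see by comparing the ```shape``` of each result. This is a minimal sketch with the hypothetical names ```v_demo```, ```c_demo``` and ```r_demo```:

```python
import numpy as np

v_demo = np.array([1, 2, 3, 4])

c_demo = v_demo[:, np.newaxis]   # new axis of length 1 at position -1, a column
r_demo = v_demo[np.newaxis, :]   # new axis of length 1 at position -2, a row

print(v_demo.shape, c_demo.shape, r_demo.shape)
# Unlike the 1d array, the 2d row and column are transposes of each other.
print(np.array_equal(c_demo.T, r_demo))
```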

Transposing also works with matrices:

$$ \text{mat} = \begin{bmatrix} 
                1 & 2 & 3 \\
                4 & 5 & 6 \\
                7 & 8 & 9 \\
                10 & 11 & 12 \\
                \end{bmatrix} $$

$$ \text{matT} = \begin{bmatrix} 
                 1 & 4 & 7 & 10 \\
                 2 & 5 & 8 & 11 \\
                 3 & 6 & 9 & 12\\
                 \end{bmatrix} $$

In [199]:
mat = np.array([[ 1,  2,  3],
                [ 4,  5,  6],
                [ 7,  8,  9],
                [10, 11, 12]])

In [200]:
mat.T

array([[ 1,  4,  7, 10],
       [ 2,  5,  8, 11],
       [ 3,  6,  9, 12]])

In [201]:
np.transpose(mat)

array([[ 1,  4,  7, 10],
       [ 2,  5,  8, 11],
       [ 3,  6,  9, 12]])

It is possible to transpose higher dimensional arrays, although a bit less common:

In [202]:
book = np.array([[[ 1,  2,  3,  4], 
                  [ 5,  6,  7,  8]],
                  
                 [[ 9, 10, 11, 12], 
                  [13, 14, 15, 16]],
                  
                 [[17, 18, 19, 20], 
                  [21, 22, 23, 24]]])

In [203]:
book.T

array([[[ 1,  9, 17],
        [ 5, 13, 21]],

       [[ 2, 10, 18],
        [ 6, 14, 22]],

       [[ 3, 11, 19],
        [ 7, 15, 23]],

       [[ 4, 12, 20],
        [ 8, 16, 24]]])

The ```shape``` attributes can be compared:

In [204]:
book.shape

(3, 2, 4)

In [205]:
book.T.shape

(4, 2, 3)
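
The comparison of the ```shape``` attributes suggests that transposing reverses the order of the axes. A small sketch, with the hypothetical name ```book_demo```, checks this for the ```shape``` and for a single element:

```python
import numpy as np

book_demo = np.arange(1, 25).reshape((3, 2, 4))
transposed = book_demo.T

# The shape tuple is reversed.
print(transposed.shape == book_demo.shape[::-1])
# The indexes of an element are reversed as well.
print(transposed[3, 1, 2] == book_demo[2, 1, 3])
```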

## Arange and Linspace

A numpy array can be created from a ```range``` instance. Recall that the ```range``` instance has a ```start```, ```stop``` and ```step``` value and uses integer-based zero-order indexing:


In [206]:
v = np.array(object=range(0, 10, 1))

This has a ```size``` attribute of ```10``` as the ```range``` instance contains ```10``` values:

In [207]:
v.size

10

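It has an integer datatype ```dtype``` attribute as the ```range``` class only supports ```int``` instances. On the Windows installation used here this is ```np.int32```; on most other platforms, and on Windows from NumPy 2.0 onwards, the default integer datatype is ```np.int64```: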

In [208]:
v.dtype

dtype('int32')

In [209]:
v.ndim

1

In [210]:
v.shape

(10,)

Recall that the ```range``` class can be instantiated with a varying number of positional input arguments:

|n|supplied|default|
|---|---|---|
|3|start, stop, step||
|2|start, stop|step=1|
|1|stop|start=0, step=1|

Zero-order indexing is exclusive of the stop value. To get an array that is inclusive of the stop value ```10```, the following would have to be used:

In [211]:
v = np.array(object=range(0, 10+1, 1))

In [212]:
v

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [213]:
v.size

11

The ```numpy``` library includes the more powerful array range function ```np.arange```. This function behaves similarly to the ```range``` class when an integer datatype is used but can handle additional datatypes such as ```np.float64```, ```np.complex128```, ```np.datetime64``` and ```np.timedelta64```.

In [214]:
np.arange?

Docstring:
arange([start,] stop[, step,], dtype=None, *, like=None)

Return evenly spaced values within a given interval.

``arange`` can be called with a varying number of positional arguments:

* ``arange(stop)``: Values are generated within the half-open interval
  ``[0, stop)`` (in other words, the interval including `start` but
  excluding `stop`).
* ``arange(start, stop)``: Values are generated within the half-open
  interval ``[start, stop)``.
* ``arange(start, stop, step)`` Values are generated within the half-open
  interval ``[start, stop)``, with spacing between values given by
  ``step``.

For integer arguments the function is roughly equivalent to the Python
built-in :py:class:`range`, but returns an ndarray rather than a ``range``
instance.

When using a non-integer step, such as 0.1, it is often better to use
`numpy.linspace`.


Parameters
----------
start : integer or real, optional
    Start of interval.  The interval includes this value.  The default
    st

```np.arange``` has positional parameters ```start```, ```stop``` and ```step``` that are consistent with ```range```, and the datatype is inferred from them:

In [215]:
v = np.arange(0, 10+1, 1)

In [216]:
v

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [217]:
v.dtype

dtype('int32')

In [218]:
v.size

11

However the datatype can also be explicitly specified using the input argument ```dtype```:

In [219]:
v = np.arange(0, 10+1, 1, dtype=np.float64)

In [220]:
v

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [221]:
v.dtype

dtype('float64')

In [222]:
v.size

11

```np.arange``` like ```range``` can take 3 positional parameters (```start```, ```stop``` and ```step```):

In [223]:
np.arange(0, 10, 1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

2 positional parameters (```start``` and ```stop``` with ```step``` assumed to be ```1```):

In [224]:
np.arange(0, 10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

and 1 positional parameter (```stop``` with ```start``` assumed to be ```0``` and ```step``` assumed to be ```1```):

In [225]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
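
For integer arguments the three call forms can be compared directly against ```range```. This is a short sketch where the ```tuple``` instances in the hypothetical ```call_forms``` are unpacked into both ```np.arange``` and ```range```:

```python
import numpy as np

call_forms = [(10,), (0, 10), (0, 10, 1), (2, 11, 3)]

for args in call_forms:
    from_arange = np.arange(*args)
    from_range = np.array(range(*args))
    # Both constructions give the same integer values.
    print(args, np.array_equal(from_arange, from_range))
```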

The datatype is usually inferred by the input arguments. If the input arguments are ```float``` instances, the datatype of the array will be ```np.float64```. For example:

In [226]:
v = np.arange(start=0.0, stop=10.0+0.1, step=0.1)

In [227]:
v

array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  1. ,
        1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ,  2.1,
        2.2,  2.3,  2.4,  2.5,  2.6,  2.7,  2.8,  2.9,  3. ,  3.1,  3.2,
        3.3,  3.4,  3.5,  3.6,  3.7,  3.8,  3.9,  4. ,  4.1,  4.2,  4.3,
        4.4,  4.5,  4.6,  4.7,  4.8,  4.9,  5. ,  5.1,  5.2,  5.3,  5.4,
        5.5,  5.6,  5.7,  5.8,  5.9,  6. ,  6.1,  6.2,  6.3,  6.4,  6.5,
        6.6,  6.7,  6.8,  6.9,  7. ,  7.1,  7.2,  7.3,  7.4,  7.5,  7.6,
        7.7,  7.8,  7.9,  8. ,  8.1,  8.2,  8.3,  8.4,  8.5,  8.6,  8.7,
        8.8,  8.9,  9. ,  9.1,  9.2,  9.3,  9.4,  9.5,  9.6,  9.7,  9.8,
        9.9, 10. ])

In [228]:
v.size

101

In [229]:
v.shape

(101,)

In [230]:
v.dtype

dtype('float64')

The array range function can also be used with ```datetime``` and ```timedelta``` instances. These classes can be imported from the ```datetime``` module:

In [231]:
from datetime import datetime, timedelta

To construct a ```datetime``` array using ```np.arange```:

* ```start``` is normally a ```datetime``` instance
* ```stop``` is normally a ```datetime``` instance
* ```step``` is normally a ```timedelta``` instance

For example:

In [232]:
start_day = datetime(year=2023, month=1, day=1, 
                     hour=12, minute=0, second=0, microsecond=0)

In [233]:
stop_day = datetime(year=2024, month=1, day=1, 
                    hour=12, minute=0, second=0, microsecond=0)

In [234]:
step_week = timedelta(days=7)
step_week

datetime.timedelta(days=7)

In [235]:
dt_v = np.arange(start_day, stop_day, step_week)

In [236]:
dt_v

array(['2023-01-01T12:00:00.000000', '2023-01-08T12:00:00.000000',
       '2023-01-15T12:00:00.000000', '2023-01-22T12:00:00.000000',
       '2023-01-29T12:00:00.000000', '2023-02-05T12:00:00.000000',
       '2023-02-12T12:00:00.000000', '2023-02-19T12:00:00.000000',
       '2023-02-26T12:00:00.000000', '2023-03-05T12:00:00.000000',
       '2023-03-12T12:00:00.000000', '2023-03-19T12:00:00.000000',
       '2023-03-26T12:00:00.000000', '2023-04-02T12:00:00.000000',
       '2023-04-09T12:00:00.000000', '2023-04-16T12:00:00.000000',
       '2023-04-23T12:00:00.000000', '2023-04-30T12:00:00.000000',
       '2023-05-07T12:00:00.000000', '2023-05-14T12:00:00.000000',
       '2023-05-21T12:00:00.000000', '2023-05-28T12:00:00.000000',
       '2023-06-04T12:00:00.000000', '2023-06-11T12:00:00.000000',
       '2023-06-18T12:00:00.000000', '2023-06-25T12:00:00.000000',
       '2023-07-02T12:00:00.000000', '2023-07-09T12:00:00.000000',
       '2023-07-16T12:00:00.000000', '2023-07-23T12:00:00.0000

In [237]:
dt_v.size

53

In [238]:
dt_v.shape

(53,)

In [239]:
dt_v.dtype

dtype('<M8[us]')

To construct a ```timedelta``` array using ```np.arange```:

* ```start``` is normally a ```timedelta``` instance
* ```stop``` is normally a ```timedelta``` instance
* ```step``` is normally a ```timedelta``` instance

For example:

In [240]:
start_time = timedelta()

In [241]:
stop_time = timedelta(hours=1, minutes=5)

In [242]:
step_time = timedelta(minutes=5)

In [243]:
dt_v = np.arange(start_time, stop_time, step_time)

In [244]:
dt_v

array([         0,  300000000,  600000000,  900000000, 1200000000,
       1500000000, 1800000000, 2100000000, 2400000000, 2700000000,
       3000000000, 3300000000, 3600000000], dtype='timedelta64[us]')

In [245]:
dt_v.size

13

In [246]:
dt_v.shape

(13,)

In [247]:
dt_v.dtype

dtype('<m8[us]')

Note that ```numpy``` does not support timezone information and will raise a ```UserWarning``` if an attempt is made to parse ```datetime.datetime``` instances that carry timezone information.

The ```numpy``` library uses the datatypes ```np.datetime64``` and ```np.timedelta64```, which can have nanosecond precision, as opposed to ```datetime.datetime``` and ```datetime.timedelta```, which have microsecond precision.

These classes have a slightly different initialisation signature. The ```np.datetime64``` class uses a timestamp:

In [248]:
np.datetime64?

Init signature: np.datetime64(self, /, *args, **kwargs)
Docstring:     
If created from a 64-bit integer, it represents an offset from
``1970-01-01T00:00:00``.
If created from string, the string can be in ISO 8601 date
or datetime format.

>>> np.datetime64(10, 'Y')
numpy.datetime64('1980')
>>> np.datetime64('1980', 'Y')
numpy.datetime64('1980')
>>> np.datetime64(10, 'D')
numpy.datetime64('1970-01-11')

See :ref:`arrays.datetime` for more information.

:Character code: ``'M'``
File:           c:\users\phili\anaconda3\envs\vscode-env\lib\site-packages\numpy\__init__.py
Type:           type
Subclasses:     

In [249]:
start_day = np.datetime64('2023-01-01T12:00:00')
stop_day = np.datetime64('2024-01-01T12:00:00')

The ```np.timedelta64``` uses a numeric value followed by a unit:

In [250]:
np.timedelta64?

Init signature: np.timedelta64(self, /, *args, **kwargs)
Docstring:     
A timedelta stored as a 64-bit integer.

See :ref:`arrays.datetime` for more information.

:Character code: ``'m'``
File:           c:\users\phili\anaconda3\envs\vscode-env\lib\site-packages\numpy\__init__.py
Type:           type
Subclasses:     

In [251]:
step_week = np.timedelta64(7, 'D')

In [252]:
step_week

numpy.timedelta64(7,'D')

Arithmetic is often used to create a time with accuracy of the smallest unit:

In [253]:
step_week = np.timedelta64(6, 'D') \
            + np.timedelta64(23, 'h') \
            + np.timedelta64(59, 'm') \
            + np.timedelta64(59, 's') \
            + np.timedelta64(999, 'ms') \
            + np.timedelta64(999, 'us') \
            + np.timedelta64(1000, 'ns')

In [254]:
step_week

numpy.timedelta64(604800000000000,'ns')

In [255]:
step_week = timedelta(days=7)
step_week

datetime.timedelta(days=7)
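
The ```np.datetime64``` and ```np.timedelta64``` instances can be supplied to ```np.arange``` directly. The following is a minimal sketch using a day unit; the names ```start_demo```, ```stop_demo```, ```step_demo``` and ```weekly``` are hypothetical:

```python
import numpy as np

start_demo = np.datetime64('2023-01-01')
stop_demo = np.datetime64('2023-02-01')
step_demo = np.timedelta64(7, 'D')

# As with integers, the stop value is excluded.
weekly = np.arange(start_demo, stop_demo, step_demo)

print(weekly)
print(weekly.size, weekly.dtype)

# Durations expressed in different units can still be compared.
print(np.timedelta64(7, 'D') == np.timedelta64(168, 'h'))
```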

The array range function ```np.arange``` is complemented by the linearly spaced array function ```np.linspace```:

In [256]:
np.linspace?

Signature:      
np.linspace(
    start,
    stop,
    num=50,
    endpoint=True,
    retstep=False,
    dtype=None,
    axis=0,
)
Call signature:  np.linspace(*args, **kwargs)
Type:            _ArrayFunctionDispatcher
String form:     <function linspace at 0x00000241BF645F80>
File:            c:\users\phili\anaconda3\envs\vscode-env\lib\site-packages\numpy\core\function_base.py
Docstring:      
Retur

The ```np.linspace``` function also has ```start``` and ```stop``` positional parameters but is inclusive of both boundaries by default. Instead of a ```step``` positional parameter, it uses a ```num``` positional parameter to specify the number of evenly spaced datapoints:

In [257]:
v = np.linspace(0, 10, 11)

In [258]:
v

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [259]:
v.size

11

In [260]:
v.dtype

dtype('float64')

Notice that the datatype is ```np.float64``` by default, as the linear spaced function uses ```float``` division to calculate each datapoint. The ```np.linspace``` function has the keyword input argument ```dtype``` which can be used to set a custom datatype such as ```np.int32```:

In [261]:
v = np.linspace(0, 10, 11, dtype=np.int32)

In [262]:
v

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [263]:
v.size

11

In [264]:
v.dtype

dtype('int32')
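
The keyword input arguments ```endpoint``` and ```retstep``` from the signature above can be sketched as follows, together with the case of a non-integer step where the ```np.arange``` docstring suggests ```np.linspace```. The names ```half_open```, ```values```, ```step``` and ```tenths``` are hypothetical:

```python
import numpy as np

# endpoint=False excludes stop, which mirrors the half-open interval of np.arange.
half_open = np.linspace(0, 10, 10, endpoint=False)
print(np.array_equal(half_open, np.arange(10)))

# retstep=True also returns the spacing between the datapoints.
values, step = np.linspace(0, 10, 11, retstep=True)
print(step)

# For a non-integer step, num fixes the number of datapoints and both endpoints,
# whereas the length of a float np.arange depends on floating point rounding.
tenths = np.linspace(0.0, 10.0, 101)
print(tenths.size, tenths[0], tenths[-1])
```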

The functions ```np.arange``` and ```np.linspace``` have no shape parameter and will by default create a 1d ```ndarray```. 

To create a row vector or column vector, indexing with ```np.newaxis``` is often used as seen previously:

In [265]:
c = np.linspace(0, 10, 11, dtype=np.int32)[:, np.newaxis]

In [266]:
c

array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10]])

In [267]:
r = np.linspace(0, 10, 11, dtype=np.int32)[np.newaxis, :]

In [268]:
r

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])

A matrix can also be created by using the array method ```reshape```:

In [269]:
v = np.linspace(0, 15, 16, dtype=np.int32)

In [270]:
v

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [271]:
mat = v.reshape((4, 4))

In [272]:
mat

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

Notice the ```size``` of the vector ```v``` and the matrix ```mat``` match:

In [273]:
v.size

16

In [274]:
mat.size

16

This can be done on a single line using:

In [275]:
mat = np.linspace(0, 15, 16, dtype=np.int32).reshape((4, 4))

In [276]:
mat

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

## Object Based Datamodel Methods

The ```ndarray``` has a number of ```object``` based datamodel identifiers:

|datamodel method|use|description|
|---|---|---|
|ndarray.\_\_dir\_\_|dir(ndarray)|list directory for identifiers|
|ndarray.\_\_hash\_\_|None|mutable therefore not hashable|
|ndarray.\_\_str\_\_|str(ndarray)|informal representation|
|ndarray.\_\_repr\_\_|repr(ndarray)|formal representation|
|ndarray.\_\_doc\_\_|ndarray?|lookup docstring|
|ndarray.\_\_len\_\_|len(ndarray)|len of ndarray|
|ndarray.\_\_copy\_\_|copy.copy(ndarray)|shallow copy of ndarray|

The ```__len__``` datamodel method maps to the builtin ```len``` function. This treats the ndarray as a list of lists and returns only the length of the outer dimension:

In [277]:
len(mat.tolist())

4

In [278]:
mat.tolist()

[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

In [279]:
len(mat)

4

It may be more appropriate to use the attributes ```size```, ```ndim```, and ```shape```:

In [280]:
mat.ndim

2

In [281]:
np.ndim(mat)

2

In [282]:
mat.shape

(4, 4)

In [283]:
np.shape(mat)

(4, 4)

In [284]:
mat.size

16

In [285]:
np.size(mat)

16
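
For a higher dimensional array the difference between ```len``` and the attributes becomes more pronounced. This is a short sketch with a hypothetical array ```book_demo```:

```python
import numpy as np

book_demo = np.zeros(shape=(3, 4, 2))

# len only reports the length of the outer dimension, which is shape[0].
print(len(book_demo), book_demo.shape[0])
# size counts every element and ndim counts the dimensions.
print(book_demo.size, book_demo.ndim)
```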

The ```ndarray``` is mutable and therefore not hashable:

In [286]:
np.ndarray.__hash__ is None

True

If ```mat``` is assigned to ```matrix``` then ```matrix``` becomes an alias for ```mat```:

In [287]:
mat

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [288]:
matrix = mat

Mutating ```matrix```:

In [289]:
matrix[3, 3] = 25

In [290]:
matrix

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 25]])

Also mutates ```mat```, as both names are references to the same ```ndarray``` instance:

In [291]:
mat

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 25]])

In [292]:
mat is matrix

True

Reassigning ```mat``` to a new ```ndarray``` instance:

In [293]:
mat = np.arange(start=1, stop=17, step=1).reshape((4, 4))

In [294]:
mat

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

The function ```np.copy``` or the method ```ndarray.copy``` can be used instead to make a shallow copy (a new ```ndarray``` instance with its own copy of the numeric data). These are the ```np``` counterparts of ```copy.copy```, which can also be used directly because the ```ndarray``` class defines the datamodel method ```__copy__``` (*dunder copy*):

In [295]:
matrix = mat.copy()

When the copy is mutated:

In [296]:
matrix[3, 3] = 25

In [297]:
matrix

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 25]])

The original remains unchanged:

In [298]:
mat

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [299]:
mat is matrix

False
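The distinction between an alias and a copy can also be checked programmatically. The following is a small sketch, assuming hypothetical names ```original```, ```alias```, ```duplicate``` and ```duplicate2```, that uses ```np.shares_memory``` to test whether two instances share the same underlying data:

```python
import copy
import numpy as np

original = np.arange(1, 17).reshape((4, 4))

alias = original                    # same instance, no new object is created
duplicate = original.copy()         # new instance with its own data
duplicate2 = copy.copy(original)    # uses the datamodel method __copy__

alias_shares = np.shares_memory(original, alias)
duplicate_shares = np.shares_memory(original, duplicate)
duplicate2_shares = np.shares_memory(original, duplicate2)

print(alias is original, alias_shares)
print(duplicate is original, duplicate_shares)
print(duplicate2 is original, duplicate2_shares)
```

Only the alias should report shared memory, which is consistent with the mutation behaviour shown in the cells above.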

## Unary Datamodel Element by Element Methods

The unary (operate on a single array) datamodel methods are set up for numeric operations:

|unary datamodel method|use|description|
|---|---|---|
|ndarray.\_\_pos\_\_|+ndarray|unary positive operator|
|ndarray.\_\_neg\_\_|-ndarray|unary negative operator|
|ndarray.\_\_abs\_\_|abs(ndarray)|absolute values of ndarray|

Supposing the following ```ndarray``` is created:

In [300]:
mat = np.array([[1, 2, 3, 4],
                [-5, -6, -7, -8],
                [9, 10, 11, 12],
                [-13, -14, -15, -16]])

Use of the unary ```+``` operator leaves the ```ndarray``` unchanged:

In [301]:
+mat

array([[  1,   2,   3,   4],
       [ -5,  -6,  -7,  -8],
       [  9,  10,  11,  12],
       [-13, -14, -15, -16]])

Use of the unary ```-``` operator negates each element, reversing its sign:

In [302]:
-mat

array([[ -1,  -2,  -3,  -4],
       [  5,   6,   7,   8],
       [ -9, -10, -11, -12],
       [ 13,  14,  15,  16]])

Use of the unary absolute function ```abs``` strips the sign of each element:

In [303]:
abs(mat)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [304]:
abs(-mat)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])
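Each of these operators also has a complementary ```numpy``` function. As a brief sketch, assuming a hypothetical 1d ```ndarray``` named ```vec```, the functions ```np.positive```, ```np.negative``` and ```np.absolute``` should give the same results as the operators:

```python
import numpy as np

# vec is a hypothetical array used only for illustration
vec = np.array([1, -2, 3, -4])

pos_matches = np.array_equal(+vec, np.positive(vec))
neg_matches = np.array_equal(-vec, np.negative(vec))
abs_matches = np.array_equal(abs(vec), np.absolute(vec))

print(pos_matches, neg_matches, abs_matches)
```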

## Binary Datamodel Element by Element Methods

The binary (operate on a pair of arrays) datamodel methods are set up for mathematical operations:

|binary datamodel method|use|description|
|---|---|---|
|ndarray1.\_\_add\_\_(ndarray2)|ndarray1 + ndarray2|addition operator|
|ndarray1.\_\_radd\_\_(ndarray2)||reverse addition operator|
|ndarray1.\_\_iadd\_\_(ndarray2)|ndarray1 += ndarray2|inplace addition operator|
|ndarray1.\_\_sub\_\_(ndarray2)|ndarray1 - ndarray2|subtraction operator|
|ndarray1.\_\_rsub\_\_(ndarray2)||reverse subtraction operator|
|ndarray1.\_\_isub\_\_(ndarray2)|ndarray1 -= ndarray2|inplace subtraction operator|
|ndarray1.\_\_mul\_\_(ndarray2)|ndarray1 \* ndarray2|multiplication operator|
|ndarray1.\_\_rmul\_\_(ndarray2)||reverse multiplication operator|
|ndarray1.\_\_imul\_\_(ndarray2)|ndarray1 \*= ndarray2|inplace multiplication operator|
|ndarray1.\_\_pow\_\_(ndarray2)|ndarray1 \*\* ndarray2|power operator|
|ndarray1.\_\_rpow\_\_(ndarray2)||reverse power operator|
|ndarray1.\_\_ipow\_\_(ndarray2)|ndarray1 \*\*= ndarray2|inplace power operator|
|ndarray1.\_\_floordiv\_\_(ndarray2)|ndarray1 // ndarray2|integer division operator|
|ndarray1.\_\_rfloordiv\_\_(ndarray2)||reverse integer division operator|
|ndarray1.\_\_ifloordiv\_\_(ndarray2)|ndarray1 //= ndarray2|inplace integer division operator|
|ndarray1.\_\_mod\_\_(ndarray2)|ndarray1 % ndarray2|modulus operator|
|ndarray1.\_\_rmod\_\_(ndarray2)||reverse modulus operator|
|ndarray1.\_\_imod\_\_(ndarray2)|ndarray1 %= ndarray2|inplace modulus operator|
|ndarray1.\_\_truediv\_\_(ndarray2)|ndarray1 / ndarray2|float division operator|
|ndarray1.\_\_rtruediv\_\_(ndarray2)||reverse float division operator|
|ndarray1.\_\_itruediv\_\_(ndarray2)|ndarray1 /= ndarray2|inplace float division operator|
|ndarray1.\_\_and\_\_(ndarray2)|ndarray1 & ndarray2|and operator|
|ndarray1.\_\_rand\_\_(ndarray2)||reverse and operator|
|ndarray1.\_\_iand\_\_(ndarray2)|ndarray1 &= ndarray2|inplace and operator|
|ndarray1.\_\_or\_\_(ndarray2)|ndarray1 \| ndarray2|or operator|
|ndarray1.\_\_ror\_\_(ndarray2)||reverse or operator|
|ndarray1.\_\_ior\_\_(ndarray2)|ndarray1 \|= ndarray2|inplace or operator|
|ndarray1.\_\_xor\_\_(ndarray2)|ndarray1 ^ ndarray2|xor operator|
|ndarray1.\_\_rxor\_\_(ndarray2)||reverse xor operator|
|ndarray1.\_\_ixor\_\_(ndarray2)|ndarray1 ^= ndarray2|inplace xor operator|
|ndarray1.\_\_eq\_\_(ndarray2)|ndarray1 == ndarray2|is equal to operator|
|ndarray1.\_\_ne\_\_(ndarray2)|ndarray1 != ndarray2|not equal to operator|
|ndarray1.\_\_lt\_\_(ndarray2)|ndarray1 < ndarray2|less than operator|
|ndarray1.\_\_gt\_\_(ndarray2)|ndarray1 > ndarray2|greater than operator|
|ndarray1.\_\_le\_\_(ndarray2)|ndarray1 <= ndarray2|less than or equal to operator|
|ndarray1.\_\_ge\_\_(ndarray2)|ndarray1 >= ndarray2|greater than or equal to operator|
|ndarray1.\_\_lshift\_\_(ndarray2)|ndarray1 << ndarray2|leftshift operator|
|ndarray1.\_\_rlshift\_\_(ndarray2)||reverse leftshift operator|
|ndarray1.\_\_ilshift\_\_(ndarray2)|ndarray1 <<= ndarray2|inplace leftshift operator|
|ndarray1.\_\_rshift\_\_(ndarray2)|ndarray1 >> ndarray2|rightshift operator|
|ndarray1.\_\_rrshift\_\_(ndarray2)||reverse rightshift operator|
|ndarray1.\_\_irshift\_\_(ndarray2)|ndarray1 >>= ndarray2|inplace rightshift operator|

The instance ```ndarray1``` is referred to as the instance ```self``` and the instance ```ndarray2``` is referred to as ```other``` when the datamodel method is called from the instance ```ndarray1```. 

There are a large number of binary datamodel methods, however most of these will be familiar as they behave the same way as their counterparts in the builtins ```int```, ```float``` and ```bool``` classes.
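The relationship between an operator, its datamodel method, the reverse method and the inplace method can be sketched as follows. The names ```arr1``` and ```arr2``` are hypothetical and used only for illustration:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])

# The operator and the datamodel method give the same result
same_add = np.array_equal(arr1 + arr2, arr1.__add__(arr2))

# int.__sub__ cannot handle an ndarray, so Python falls back to the
# reverse method of the right-hand operand: arr1.__rsub__(10)
reverse_sub = 10 - arr1

# The inplace operator mutates the existing instance instead of creating a new one
id_before = id(arr1)
arr1 += arr2
same_instance = id(arr1) == id_before

print(same_add, reverse_sub, same_instance, arr1)
```

The reverse methods are what allow a builtin scalar to appear on the left-hand side of an operator with an ```ndarray``` on the right.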

When ```mat``` is a matrix:

In [305]:
mat = np.array([[1, 2, 3, 4],
                [-5, -6, -7, -8],
                [9, 10, 11, 12],
                [-13, -14, -15, -16]])

In [306]:
mat

array([[  1,   2,   3,   4],
       [ -5,  -6,  -7,  -8],
       [  9,  10,  11,  12],
       [-13, -14, -15, -16]])

In [307]:
mat.shape

(4, 4)

```mat2``` can be added to ```mat``` provided the shapes of both ```ndarray``` instances are the same:

In [308]:
mat2 = abs(mat)

In [309]:
mat2

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [310]:
mat2.shape

(4, 4)

In [311]:
mat3 = mat + mat2

In [312]:
mat3

array([[ 2,  4,  6,  8],
       [ 0,  0,  0,  0],
       [18, 20, 22, 24],
       [ 0,  0,  0,  0]])

Alternatively scalar expansion of a scalar such as an ```int``` or ```float``` can be used. The scalar can be broadcast manually to the dimensions of ```mat```:

In [313]:
mat

array([[  1,   2,   3,   4],
       [ -5,  -6,  -7,  -8],
       [  9,  10,  11,  12],
       [-13, -14, -15, -16]])

In [314]:
mat.shape

(4, 4)

In [315]:
scalar = 4

In [316]:
scalar_broadcast = np.full_like(mat, fill_value=scalar)

In [317]:
scalar_broadcast

array([[4, 4, 4, 4],
       [4, 4, 4, 4],
       [4, 4, 4, 4],
       [4, 4, 4, 4]])

In [318]:
scalar_broadcast.shape

(4, 4)

In [319]:
mat3 = mat + scalar_broadcast

In [320]:
mat3

array([[  5,   6,   7,   8],
       [ -1,  -2,  -3,  -4],
       [ 13,  14,  15,  16],
       [ -9, -10, -11, -12]])

Scalar expansion is automatically broadcast across the dimensions of the ```ndarray```:

In [321]:
mat3 = mat + scalar

In [322]:
mat3

array([[  5,   6,   7,   8],
       [ -1,  -2,  -3,  -4],
       [ 13,  14,  15,  16],
       [ -9, -10, -11, -12]])

Vector expansion can also be used. However the vector needs to be explicitly specified as either a row or a column. For example the following row can be examined:

In [323]:
mat

array([[  1,   2,   3,   4],
       [ -5,  -6,  -7,  -8],
       [  9,  10,  11,  12],
       [-13, -14, -15, -16]])

In [324]:
mat.shape

(4, 4)

In [325]:
row = np.arange(4)[np.newaxis, :]

In [326]:
row

array([[0, 1, 2, 3]])

In [327]:
row.shape

(1, 4)

Notice for row expansion, the number of columns matches:

In [328]:
mat.shape[-1] == row.shape[-1]

True

In [329]:
mat3 = mat + row

In [330]:
mat3

array([[  1,   3,   5,   7],
       [ -5,  -5,  -5,  -5],
       [  9,  11,  13,  15],
       [-13, -13, -13, -13]])

Alternatively for a column:

In [331]:
col = np.arange(4)[:, np.newaxis]

In [332]:
col

array([[0],
       [1],
       [2],
       [3]])

In [333]:
col.shape

(4, 1)

Notice for column expansion, the number of rows matches:

In [334]:
mat.shape[-2] == col.shape[-2]

True

In [335]:
mat3 = mat + col

In [336]:
mat3

array([[  1,   2,   3,   4],
       [ -4,  -5,  -6,  -7],
       [ 11,  12,  13,  14],
       [-10, -11, -12, -13]])
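Row and column expansion can be made visible with ```np.broadcast_to```, which returns the expanded array that the addition conceptually uses. The sketch below reuses the values of ```mat```, ```row``` and ```col``` from the cells above and also shows that a vector whose length matches neither dimension cannot be broadcast. The names ```row_expanded```, ```col_expanded``` and ```broadcast_failed``` are hypothetical:

```python
import numpy as np

mat = np.array([[1, 2, 3, 4],
                [-5, -6, -7, -8],
                [9, 10, 11, 12],
                [-13, -14, -15, -16]])
row = np.arange(4)[np.newaxis, :]   # shape (1, 4)
col = np.arange(4)[:, np.newaxis]   # shape (4, 1)

# Expand the row down the rows and the column across the columns
row_expanded = np.broadcast_to(row, mat.shape)
col_expanded = np.broadcast_to(col, mat.shape)

row_matches = np.array_equal(mat + row, mat + row_expanded)
col_matches = np.array_equal(mat + col, mat + col_expanded)

# A vector of length 3 matches neither dimension of the (4, 4) matrix
try:
    mat + np.arange(3)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(row_matches, col_matches, broadcast_failed)
```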

## Binary Datamodel Array Methods

All the binary operators mentioned above are broadcast across an ```ndarray``` element by element. For the case of multiplication, it is possible to carry out element by element multiplication or array multiplication:

|datamodel method|use|description|
|---|---|---|
|ndarray1.\_\_matmul\_\_(ndarray2)|ndarray1 @ ndarray2|ndarray multiplication operator|
|ndarray1.\_\_rmatmul\_\_(ndarray2)||reverse ndarray multiplication operator|
|ndarray1.\_\_imatmul\_\_(ndarray2)|ndarray1 @= ndarray2|inplace ndarray multiplication operator|

For array multiplication, dimensionality is important and the inner dimensions must match for array multiplication to take place:

$$\left[\begin{matrix}5&6\end{matrix}\right]@\left[\begin{matrix}
                                                   7\\
                                                   8\\
                                                   \end{matrix}\right]=\left[5\ast7+6\ast8\right]=\left[83\right]$$


In [337]:
row = np.array([5, 6])[np.newaxis, :]

In [338]:
row

array([[5, 6]])

In [339]:
col = np.array([7, 8])[:, np.newaxis]

In [340]:
col

array([[7],
       [8]])

Notice that the inner dimension of the two ```ndarray``` instances surrounding the ```@``` operator match:

In [341]:
row.shape[-1] == col.shape[0]

True

And the returned ```ndarray``` instance has dimensions which match the outer dimensions of the two ```ndarray``` instances surrounding the ```@``` operator:

In [342]:
mat3 = row @ col

In [343]:
mat3

array([[83]])

In [344]:
mat3.shape

(1, 1)

In [345]:
(row.shape[0], col.shape[-1]) == mat3.shape

True

Giving the following:

$$\left(1,\ \textbf{2}\right)@\left(\textbf{2},1\right)=\left(1,1\right)$$
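The same shape rule applies to larger arrays. As a minimal sketch, assuming two hypothetical arrays ```left``` with shape ```(2, 3)``` and ```right``` with shape ```(3, 4)```, the product has shape ```(2, 4)``` and reversing the order fails because the inner dimensions no longer match:

```python
import numpy as np

# left and right are hypothetical arrays used only to illustrate the shapes
left = np.ones((2, 3))
right = np.ones((3, 4))

product = left @ right          # inner dimensions 3 and 3 match
product_shape = product.shape   # outer dimensions (2, 4)

try:
    right @ left                # inner dimensions 4 and 2 do not match
    matmul_failed = False
except ValueError:
    matmul_failed = True

print(product_shape, matmul_failed)
```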

In the above example, the largest dimension of each vector was placed on the inside, around the ```@``` operator, resulting in a 2d ```ndarray``` with a single element. This is known as the inner product (or dot product) of two vectors. The ```np.inner``` function can perform this operation on vectors that aren't explicitly specified as rows or columns:

In [346]:
np.inner([5, 6], [7, 8])

83

Notice that the value returned using ```np.inner``` is a scalar, whereas ```ndarray``` multiplication returns a 2d ```ndarray``` with only 1 element:

In [347]:
row @ col

array([[83]])

In the above example, ```row``` can be conceptualised as the quantity of each item purchased and ```col``` can be conceptualised as the price of each unit item. The array returned, ```mat3```, is therefore the total price.

In contrast it is possible to place the largest dimension of each vector on the outside. This is known as the outer product and will result in a matrix output:

$$\left[\begin{matrix} 7 \\ 
                       8 \\ 
                       \end{matrix}\right]@ \left[\begin{matrix} 5 & 6 \end{matrix}\right] = \left[\begin{matrix} 7 \ast 5 & 7 \ast 6 \\ 
                       8 \ast 5 & 8 \ast 6 \\ 
                       \end{matrix}\right] = \left[\begin{matrix} 35 & 42 \\ 
                                                                  40 & 48 \\ 
                                                                  \end{matrix}\right]$$

In [348]:
mat3 = col @ row

In [349]:
mat3

array([[35, 42],
       [40, 48]])

The ```shape``` ```tuple``` of dimensions are:

$$\left(2,\ \textbf{1}\right)@\left(\textbf{1},2\right)=\left(2,2\right)$$

Notice the inner dimensions which are ```1``` match:

In [350]:
col.shape[-1] == row.shape[0]

True

And the returned ```ndarray``` instance once again has dimensions which match the outer dimensions of the two ```ndarray``` instances surrounding the ```@``` operator:

In [351]:
(col.shape[0], row.shape[1]) == mat3.shape 

True

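The ```np.outer``` function can perform this operation on vectors that aren't explicitly specified as columns or rows. The first input argument is treated as the column and the second as the row, so here the column holds ```5, 6``` and the row holds ```7, 8```, giving the transpose of ```mat3```: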

In [352]:
np.outer([5, 6], [7, 8])

array([[35, 40],
       [42, 48]])
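The outer product can be reproduced in more than one way. The following sketch reuses the values of ```col``` and ```row``` from the cells above and compares array multiplication, element by element multiplication with broadcasting, and ```np.outer``` with the column supplied as the first input argument. The names ```via_matmul```, ```via_broadcast``` and ```via_outer``` are hypothetical:

```python
import numpy as np

col = np.array([7, 8])[:, np.newaxis]   # shape (2, 1)
row = np.array([5, 6])[np.newaxis, :]   # shape (1, 2)

via_matmul = col @ row                  # array multiplication
via_broadcast = col * row               # element by element with broadcasting
via_outer = np.outer([7, 8], [5, 6])    # first argument acts as the column

print(via_matmul)
print(np.array_equal(via_matmul, via_broadcast))
print(np.array_equal(via_matmul, via_outer))
```

All three approaches should agree because the inner dimension is ```1```, so each element of the result is a single product.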

Array division is not as straight-forward as array multiplication. Let's look at the example used earlier when the dot product of a row vector and a column vector was calculated:

$$\left[\begin{matrix} 5 & 6 \end{matrix}\right] @ \left[\begin{matrix} 7 \\ 
                                                                        8 \\ 
                                                                        \end{matrix}\right] = \left[ 5 \ast 7 + 6 \ast 8 \right] = \left[ 83 \right]$$

$$\left(1,\ \textbf{2}\right)@\left(\textbf{2},1\right)=\left(1,1\right)$$

Now supposing the values in the column vector are unknowns:

$$\left[\begin{matrix} 5 & 6 \end{matrix}\right] @ \left[\begin{matrix} x \\ 
                                                                        y \\ 
                                                                        \end{matrix}\right] = \left[ 5 \ast x + 6 \ast y \right] = \left[83\right]$$


This gives a single equation, with two unknowns and therefore there is not enough information to calculate these unknowns:

$$5x+6y=83$$

To find a unique solution for n unknowns, n independent equations are required:

$$5x+6y=83$$

$$3x+3y=42$$

In matrix form this is:

$$\left[\begin{matrix} 5x + 6y \\ 
                       3x + 3y \\
                       \end{matrix}\right] = \left[\begin{matrix} 83 \\
                                                                  42 \\
                                                                  \end{matrix}\right]$$


$$\left[\begin{matrix} 5 & 6 \\ 
                       3 & 3 \\ 
                       \end{matrix}\right] @ \left[\begin{matrix} x \\ 
                                                                  y \\ 
                                                                  \end{matrix}\right] = \left[\begin{matrix} 83 \\
                                                                                                             42 \\
                                                                                                             \end{matrix}\right]$$

Where the known values are:

$$\text{equations}=\left[\begin{matrix} 5 & 6 \\ 
                                        3 & 3 \\ 
                                        \end{matrix}\right]$$

In [353]:
equations = np.array([[5, 6],
                      [3, 3]])

In [354]:
equations

array([[5, 6],
       [3, 3]])

$$\text{results}=\left[\begin{matrix}83 \\ 
                                     42 \\ 
                                     \end{matrix}\right]$$

In [355]:
results = np.array([83, 42], ndmin=2).T

In [356]:
results

array([[83],
       [42]])

And the unknown values are:

$$\text{coefficients}=\left[\begin{matrix} x \\ 
                                           y \\ 
                                           \end{matrix}\right]$$

Notice ```equations``` is a square matrix. Square matrices typically arise from a system of linear equations due to the requirement of n equations for n unknown coefficients. A square matrix has an inverse matrix provided its determinant is non-zero. The inverse matrix for ```equations``` is:

$$\text{equationsInv}=\left[\begin{matrix} -1 & 2 \\ 
                                           1 & -1.6667 \\ 
                                           \end{matrix}\right]$$

In [357]:
equations_inv = np.linalg.inv(equations)

In [358]:
equations_inv

array([[-1.        ,  2.        ],
       [ 1.        , -1.66666667]])

Array multiplication between a square matrix and its inverse square matrix gives the identity matrix:

$$\left[\begin{matrix} -1 & 2 \\
                       1 & -1.6667 \\ 
                       \end{matrix}\right] @ \left[\begin{matrix} 5 & 6 \\ 
                                                                  3 & 3 \\ 
                                                                  \end{matrix}\right]=\left[\begin{matrix} -1 \ast 5 + 2 \ast 3 & -1 \ast 6 + 2 \ast3 \\ 
                                                                  1 \ast5 - 1.6667 \ast3 &1 \ast 6 - 1.6667 \ast3 \\
                                                                  \end{matrix}\right] = \left[\begin{matrix} 1 & 0 \\
                                                                  0 & 1 \\
                                                                  \end{matrix}\right]$$

In [359]:
equations_inv @ equations

array([[ 1.00000000e+00, -1.33226763e-15],
       [ 0.00000000e+00,  1.00000000e+00]])

In [360]:
equations @ equations_inv

array([[ 1.00000000e+00,  0.00000000e+00],
       [-6.66133815e-16,  1.00000000e+00]])
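The very small non-zero values in the outputs above are floating point rounding errors rather than genuine off-diagonal elements. A minimal sketch, assuming the hypothetical names ```product``` and ```close_to_identity```, compares the product to the identity matrix within a tolerance using ```np.allclose``` instead of testing for exact equality:

```python
import numpy as np

equations = np.array([[5, 6],
                      [3, 3]])
equations_inv = np.linalg.inv(equations)

product = equations_inv @ equations

# Exact equality may fail because of rounding, so compare within a tolerance
close_to_identity = np.allclose(product, np.identity(2))

print(close_to_identity)
```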

Notice that the identity matrix is a square matrix that has the value of ```1``` for each element along the main diagonal and ```0``` elsewhere. This can be created using the function ```np.diag```:

In [361]:
np.diag([1, 1])

array([[1, 0],
       [0, 1]])

The function ```np.diag``` can be used to create diagonals of different values:

In [362]:
np.diag([1, 2, 3, 4])

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

The ```ndarray``` method ```diagonal``` can be used to retrieve the diagonal from a square matrix:

In [363]:
equations

array([[5, 6],
       [3, 3]])

In [364]:
equations.diagonal()

array([5, 3])

The identity matrix can also be instantiated using the functions ```np.identity``` and ```np.eye```:

In [365]:
np.identity(2)

array([[1., 0.],
       [0., 1.]])

In [366]:
np.eye(2)

array([[1., 0.],
       [0., 1.]])

Most applications for an identity matrix involve a square matrix however ```np.eye``` has some support for non-square matrices:

In [367]:
np.eye(3, 2)

array([[1., 0.],
       [0., 1.],
       [0., 0.]])

Multiplication of an ```ndarray``` by the identity matrix leaves it unchanged:

$$\left[\begin{matrix} 1 & 0 \\ 
                       0 & 1 \\ 
                       \end{matrix}\right] @ \left[\begin{matrix} x \\
                                                                  y \\ 
                                                                  \end{matrix}\right] = \left[\begin{matrix} 1 \ast x + 0 \ast y \\
                                                                  0 \ast x + 1 \ast y \\
                                                                  \end{matrix}\right]=\left[\begin{matrix}x \\
                                                                  y \\
                                                                  \end{matrix}\right]$$


In [368]:
results

array([[83],
       [42]])

In [369]:
np.identity(2) @ results

array([[83.],
       [42.]])

In [370]:
equations

array([[5, 6],
       [3, 3]])

In [371]:
np.identity(2) @ equations

array([[5., 6.],
       [3., 3.]])

In [372]:
equations @ np.identity(2)

array([[5., 6.],
       [3., 3.]])

This means array multiplication of both sides by the inverse of the equations matrix, placed on the left, gives:

$$\left[\begin{matrix} x \\ 
                       y \\ 
                       \end{matrix}\right] = \left[\begin{matrix} -1 & 2 \\
                                                                  1 & -1.6667 \\ 
                                                                  \end{matrix}\right] @ \left[\begin{matrix} 83 \\
                                                                                                             42 \\
                                                                                                             \end{matrix}\right]$$

Which can be solved:


$$\left[\begin{matrix} -1 & 2 \\
                       1 & -1.6667 \\ 
                       \end{matrix}\right] @ \left[\begin{matrix} 83 \\
                                                                  42 \\ \end{matrix}\right] = \left[\begin{matrix} -1 \ast 83 + 2 \ast42 \\ 
                                                                  1 \ast 83 - 1.6667 \ast 42 \\ 
                                                                  \end{matrix}\right] = \left[\begin{matrix} 1 \\
                                                                                                             13 \\
                                                                                                             \end{matrix}\right]$$

In [373]:
coefficients = equations_inv @ results

In [374]:
coefficients

array([[ 1.],
       [13.]])

Therefore:

$$\left[\begin{matrix} x \\ 
                       y \\ 
                       \end{matrix}\right] = \left[\begin{matrix} 1 \\ 
                                                                  13 \\ 
                                                                  \end{matrix}\right]$$

These can also be calculated directly using the function ```np.linalg.solve```:

In [375]:
coefficients = np.linalg.solve(equations, results)

In [376]:
coefficients

array([[ 1.],
       [13.]])
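A solution can be checked by substituting it back into the original system. The sketch below, using the hypothetical names ```determinant```, ```substituted``` and ```solution_checks```, also computes the determinant with ```np.linalg.det``` as a quick indication that the system has a unique solution:

```python
import numpy as np

equations = np.array([[5, 6],
                      [3, 3]])
results = np.array([[83],
                    [42]])

# A non-zero determinant indicates the equations are independent
determinant = np.linalg.det(equations)

coefficients = np.linalg.solve(equations, results)

# Substitute the coefficients back in and compare within a tolerance
substituted = equations @ coefficients
solution_checks = np.allclose(substituted, results)

print(determinant, solution_checks)
```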

Notice that the ```numpy``` library compartmentalises linear algebra related functions in the ```np.linalg``` module. The linear algebra functions are commonly used and included in ```np```. ```scipy``` is the scientific python library and has a ```numpy``` base with a greater number of scientific modules that are compartmentalised for more niche scientific purposes.

## Statistics

The ```ndarray``` class has a number of statistical methods and complementary ```numpy``` functions. These are essentially the functions in ```builtins``` and the ```statistics``` module broadcast across an ```ndarray```. To recap:

In [377]:
max([1, 2, 3, 4])

4

In [378]:
import statistics

In [379]:
statistics.mean([1, 2, 3, 4])

2.5

In [380]:
statistics.stdev([1, 2, 3, 4])

1.2909944487358056

If the following ```ndarray``` instance is instantiated:

$$ \text{mat} = \begin{bmatrix} 
                    1 & -2 & 3 & -4 \\
                    -5 & 6 & -7 & 8 \\
                    9 & -10 & 11 & -12 \\
                    \end{bmatrix} $$


In [381]:
mat = np.array([[1, -2, 3, -4], 
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

In [382]:
mat

array([[  1,  -2,   3,  -4],
       [ -5,   6,  -7,   8],
       [  9, -10,  11, -12]])

The ```shape``` ```tuple``` is:

In [383]:
mat.shape

(3, 4)

That is ```(3 rows, 4 columns)``` and recall the columns are at index ```-1``` and the rows are at index ```-2```.

When ```axis=None``` the matrix is flattened, so the maximum value of this flattened array is returned as the scalar ```11``` and the associated index of this value is ```10```:

$$ \text{matAxisNone} = \begin{bmatrix} 1 & -2 & 3 & -4 & -5 & 6 & -7 & 8 & 9 & -10 & \textbf{11} & -12 \end{bmatrix} $$

$$ \text{matAxisNoneIndex} = \begin{bmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & \textbf{10} & 11 \end{bmatrix} $$

In [384]:
mat.max(axis=None)

11

In [385]:
mat.argmax(axis=None)

10

This can be mimicked using the ```ndarray.ravel``` method:

In [386]:
mat.ravel()

array([  1,  -2,   3,  -4,  -5,   6,  -7,   8,   9, -10,  11, -12])

In [387]:
mat.ravel()[10]

11
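The index into the flattened array can be converted back into a row index and a column index. As a brief sketch, assuming the hypothetical names ```flat_index```, ```row_idx``` and ```col_idx```, the function ```np.unravel_index``` performs this conversion using the ```shape``` ```tuple```:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

flat_index = mat.argmax(axis=None)    # index into the flattened array

# Convert the flat index into a (row, column) pair
row_idx, col_idx = np.unravel_index(flat_index, mat.shape)

print(int(flat_index), int(row_idx), int(col_idx), int(mat[row_idx, col_idx]))
```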

When ```axis=-1```, the method operates along the columns. For example, examining the four columns in the first row, the maximum value is ```3``` and it has a corresponding index of ```2```:

$$ \text{matCol} = \begin{bmatrix} 
                     1 & -2 & \textbf{3} & -4 \\
                     x & x & x & x \\
                     y & y & y & y \\
                     \end{bmatrix} $$

$$ \text{matColIndex} = \begin{bmatrix} 
                            0 & 1 & \textbf{2} & 3 \\
                            x & x & x & x \\
                            y & y & y & y \\
                            \end{bmatrix} $$

The maximum value across the columns is found for each row:

$$ \text{mat} = \begin{bmatrix} 
                    1 & -2 & \textbf{3} & -4 \\
                    -5 & 6 & -7 & \textbf{8} \\
                    9 & -10 & \textbf{11} & -12 \\
                    \end{bmatrix} $$

In [388]:
mat.max(axis=-1)

array([ 3,  8, 11])

$$ \text{matColIndex} = \begin{bmatrix} 
                            0 & 1 & \textbf{2} & 3 \\
                            0 & 1 & 2 & \textbf{3} \\
                            0 & 1 & \textbf{2} & 3 \\
                            \end{bmatrix} $$

In [389]:
mat.argmax(axis=-1)

array([2, 3, 2], dtype=int64)

Notice that these are returned as 1d ```ndarray``` instances:

$$ \text{ColMax} = \begin{bmatrix} 3 & 8 & 11 \end{bmatrix} $$

$$ \text{ColMaxIndex} = \begin{bmatrix} 2 & 3 & 2 \end{bmatrix} $$

However the dimensions can be kept by assigning the ```keepdims``` input argument to ```True```:

$$ \text{ColMax} = \begin{bmatrix} 3 \\
                                   8 \\ 
                                   11 \\ 
                                   \end{bmatrix} $$

In [390]:
mat.max(axis=-1, keepdims=True)

array([[ 3],
       [ 8],
       [11]])

$$ \text{ColMaxIndex} = \begin{bmatrix} 2 \\ 
                                        3 \\ 
                                        2 \\ 
                                        \end{bmatrix} $$

In [391]:
mat.argmax(axis=-1, keepdims=True)

array([[2],
       [3],
       [2]], dtype=int64)

When ```axis=-2```, the method operates along rows: 

$$ \text{mat} = \begin{bmatrix} 
                    1 & -2 & 3 & -4 \\
                    -5 & \textbf{6} & -7 & \textbf{8} \\
                    \textbf{9} & -10 & \textbf{11} & -12 \\
                    \end{bmatrix} $$

$$ \text{RowMax} = \begin{bmatrix} 9 & 6 & 11 & 8 \end{bmatrix} $$

In [392]:
mat.max(axis=-2, keepdims=True)

array([[ 9,  6, 11,  8]])

$$ \text{matRowIndex} = \begin{bmatrix} 
                            0 & 0 & 0 & 0 \\
                            1 & \textbf{1} & 1 & \textbf{1} \\
                            \textbf{2} & 2 & \textbf{2} & 2 \\
                            \end{bmatrix} $$

$$ \text{RowMaxIndex} = \begin{bmatrix} 2 & 1 & 2 & 1 \end{bmatrix} $$

In [393]:
mat.argmax(axis=-2, keepdims=True)

array([[2, 1, 2, 1]], dtype=int64)

The complementary functions ```np.max``` (has an alias ```np.amax```) and ```np.argmax``` operate in a similar manner.

The opposite ```ndarray``` methods ```min``` and ```argmin``` operate in a similar manner, returning the value and index of the minimum value along an axis respectively.

The ```ndarray``` statistical methods ```sum```, ```prod```, ```mean```, ```var``` and ```std``` and their complementary ```numpy``` functions all take ```axis``` and ```keepdims``` as input arguments and behave similarly to their equivalents in the ```statistics``` module. The methods ```var``` and ```std``` have a keyword input argument delta degrees of freedom ```ddof```, which has a default value of ```0``` and calculates the population variance or population standard deviation. This can be changed to ```1``` to calculate the sample variance or sample standard deviation. The principles behind these calculations were covered in the previous notebook on the ```statistics``` module:

In [394]:
mat.sum(axis=-1, keepdims=True)

array([[-2],
       [ 2],
       [-2]])

In [395]:
mat.prod(axis=-1, keepdims=True)

array([[   24],
       [ 1680],
       [11880]])

In [396]:
mat.mean(axis=-1, keepdims=True)

array([[-0.5],
       [ 0.5],
       [-0.5]])

In [397]:
mat.var(axis=-1, keepdims=True, ddof=1)

array([[  9.66666667],
       [ 57.66666667],
       [148.33333333]])

In [398]:
mat.std(axis=-1, keepdims=True, ddof=1)

array([[ 3.10912635],
       [ 7.59385717],
       [12.17921727]])
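The effect of ```ddof``` can be checked against the ```statistics``` module. The following is a small sketch, assuming a hypothetical ```list``` named ```data```, in which the default ```ddof=0``` should match ```statistics.pstdev``` and ```ddof=1``` should match ```statistics.stdev```:

```python
import statistics
import numpy as np

# data is a hypothetical list used only for illustration
data = [1, 2, 3, 4]

population_std = np.std(data)         # ddof=0 by default
sample_std = np.std(data, ddof=1)     # divides by n - 1 instead of n

matches_pstdev = np.isclose(population_std, statistics.pstdev(data))
matches_stdev = np.isclose(sample_std, statistics.stdev(data))

print(matches_pstdev, matches_stdev)
```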

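These have the complementary functions ```np.sum```, ```np.prod```, ```np.mean```, ```np.var``` and ```np.std``` which behave analogously. Some of the less common statistical functions, such as ```np.median``` and ```np.average```, are only available as functions. ```np.median``` returns the middle value of the sorted data along an ```axis``` (the mean of the two middle values when the number of elements is even). ```np.average``` with its default equal weights is equivalent to ```np.mean```. For this ```mat``` the median of each row happens to coincide with the mean of each row: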

$$ \text{mat} = \begin{bmatrix} 
                    1 & -2 & 3 & -4 \\
                    -5 & 6 & -7 & 8 \\
                    9 & -10 & 11 & -12 \\
                    \end{bmatrix} $$

$$ \text{weights} = \begin{bmatrix} 
                    1 & 1 & 1 & 1 \\
                    1 & 1 & 1 & 1 \\
                    1 & 1 & 1 & 1 \\
                    \end{bmatrix} $$

The average can be calculated, for the first row using:

$$\frac{1\ast1+1\ast-2+1\ast3+1\ast-4}{1+1+1+1}=\frac{-2}{4}=-0.5$$

In [399]:
np.median(a=mat, axis=-1, keepdims=True)

array([[-0.5],
       [ 0.5],
       [-0.5]])

In [400]:
np.average(a=mat, axis=-1, keepdims=True)

array([[-0.5],
       [ 0.5],
       [-0.5]])

For non-uniform weights an array of weights has to be provided. This can have a shape matching ```mat``` or be a 1d array whose length matches the size of ```mat``` along ```axis```:

$$ \text{weights} = \begin{bmatrix} 
                    1 & 2 & 3 & 4 \\
                    1 & 2 & 3 & 4 \\
                    1 & 2 & 3 & 4 \\
                    \end{bmatrix} $$


In [401]:
weights = np.array([[1, 2, 3, 4],
                    [1, 2, 3, 4],
                    [1, 2, 3, 4]])

The weighted average can be calculated, for the first row using:

$$\frac{1\ast1+2\ast-2+3\ast3+4\ast-4}{1+2+3+4}=\frac{-10}{10}=-1$$

In [402]:
np.average(a=mat, axis=-1, keepdims=True, weights=weights)

array([[-1. ],
       [ 1.8],
       [-2.6]])
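The weighted average can be reproduced manually, which makes the formula above explicit. This is a minimal sketch, assuming the hypothetical names ```manual``` and ```builtin```, that multiplies each element by its weight, sums along the columns and divides by the sum of the weights:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])
weights = np.array([[1, 2, 3, 4],
                    [1, 2, 3, 4],
                    [1, 2, 3, 4]])

# Sum of (value * weight) divided by the sum of the weights, for each row
manual = (mat * weights).sum(axis=-1) / weights.sum(axis=-1)
builtin = np.average(mat, axis=-1, weights=weights)

print(manual)
print(np.allclose(manual, builtin))
```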

The ```ndarray``` methods ```any``` and ```all``` work with an array of boolean values and also operate along an ```axis```. The value returned for ```any``` will be ```True``` if any of the elements are ```True```, whereas the value returned for ```all``` will be ```True``` only if all the elements are ```True```:

In [403]:
mat = np.array([[True, False, False, False],
                [True, True, True, True],
                [False, False, False, False]])

In [404]:
mat.any(axis=-1, keepdims=True)

array([[ True],
       [ True],
       [False]])

In [405]:
mat.all(axis=-1, keepdims=True)

array([[False],
       [ True],
       [False]])

The complementary functions ```np.any``` and ```np.all``` behave equivalently.
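In practice the boolean array is often the result of a comparison operator. As a brief sketch, assuming the hypothetical names ```is_positive```, ```any_positive```, ```all_positive``` and ```any_large```, a comparison can be combined with ```any``` and ```all``` to test a condition for each row:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

is_positive = mat > 0                      # boolean ndarray with the same shape

any_positive = is_positive.any(axis=-1)    # does each row contain a positive value
all_positive = is_positive.all(axis=-1)    # is every value in each row positive
any_large = (abs(mat) > 10).any(axis=-1)   # does each row contain a magnitude above 10

print(any_positive, all_positive, any_large)
```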

The ```ndarray.sort``` and ```ndarray.argsort``` methods also operate along an ```axis```:

In [406]:
mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

```ndarray.sort``` is a mutating method that sorts the array in place along an ```axis```:

In [407]:
mat.sort(axis=-1)  # mutating method, returns None

In [408]:
mat # mutated in place

array([[ -4,  -2,   1,   3],
       [ -7,  -5,   6,   8],
       [-12, -10,   9,  11]])

Returning to the original ```mat```:

In [409]:
mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

```ndarray.argsort``` is a non-mutating method that returns the indexes required to sort the array along an ```axis```:

In [410]:
mat.argsort(axis=-1)

array([[3, 1, 0, 2],
       [2, 0, 1, 3],
       [3, 1, 0, 2]], dtype=int64)

For simplicity, examine only the first row. To sort it, the lowest value ```-4``` is selected first and has index ```3```. The next lowest value is ```-2``` at index ```1```, followed by ```1``` at index ```0``` and finally the highest value ```3``` at index ```2```:

$$ \text{mat} = \begin{bmatrix} 
                1 & -2 & 3 & -4 \\
                -5 & 6 & -7 & 8 \\
                9 & -10 & 11 & -12 \\
                \end{bmatrix} $$

The argsort is the reordering of these indexes:

$$ \text{matColArgSort} = \begin{bmatrix} 
                          3 & 1 & 0 & 2 \\
                          2 & 0 & 1 & 3 \\
                          3 & 1 & 0 & 2 \\
                          \end{bmatrix} $$

These indexes can be used with the ```numpy``` function ```take_along_axis``` to sort the matrix indirectly:

In [411]:
np.take_along_axis(mat, mat.argsort(axis=-1), axis=-1)

array([[ -4,  -2,   1,   3],
       [ -7,  -5,   6,   8],
       [-12, -10,   9,  11]])

The equivalent functions ```np.sort``` and ```np.argsort``` behave similarly. However, ```np.sort``` does not mutate its input and instead returns a sorted copy:

In [412]:
np.sort(mat) # returns a sorted copy

array([[ -4,  -2,   1,   3],
       [ -7,  -5,   6,   8],
       [-12, -10,   9,  11]])

In [413]:
mat # unchanged

array([[  1,  -2,   3,  -4],
       [ -5,   6,  -7,   8],
       [  9, -10,  11, -12]])
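
The difference between the function and the method can be checked directly. This is a short sketch, here sorting each column with ```axis=0``` for variety; ```backup```, ```sorted_copy``` and ```result``` are illustrative names only:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])
backup = mat.copy()

# The function returns a new sorted array and leaves mat untouched
sorted_copy = np.sort(mat, axis=0)
unchanged = np.array_equal(mat, backup)

# The method sorts mat itself and returns None
result = mat.sort(axis=0)

print(sorted_copy)
print(unchanged, result)
```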

The method ```ndarray.cumsum``` also takes the keyword input argument ```axis``` and propagates the cumulative sum along the specified axis. For example, the cumulative sum propagated across the columns of each row with ```axis=-1``` can be calculated:

$$ \text{mat} = \begin{bmatrix} 
                1 & -2 & 3 & -4 \\
                x & x & x & x \\
                y & y & y & y \\
                \end{bmatrix} $$

$$ \text{matCumSum} = \begin{bmatrix} 
                      1 & 1 + (-2) & 1 + (-2) + 3 & 1 + (-2) + 3 + (-4) \\
                      x & x & x & x \\
                      y & y & y & y \\
                      \end{bmatrix} $$

$$ \text{matCumSum} = \begin{bmatrix} 
                      1 & -1 & 2 & -2 \\
                      -5 & 1 & -6 & 2 \\
                      9 & -1 & 10 & -2 \\
                      \end{bmatrix} $$

In [414]:
mat.cumsum(axis=-1)

array([[ 1, -1,  2, -2],
       [-5,  1, -6,  2],
       [ 9, -1, 10, -2]])

The corresponding ```np.cumsum``` function behaves analogously.
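
As a quick check of the ```axis``` behaviour, the sketch below (with illustrative variable names) propagates the cumulative sum down the rows with ```axis=0``` and confirms that the last entry along the propagated axis equals the ordinary sum along that axis:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

# Cumulative sum down each column
down_rows = mat.cumsum(axis=0)

# The final running total matches the plain sum along the same axis
row_totals = mat.cumsum(axis=-1)[:, -1]
col_totals = down_rows[-1, :]

print(down_rows)
print(row_totals, mat.sum(axis=-1))
print(col_totals, mat.sum(axis=0))
```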

The method ```ndarray.cumprod``` also takes the keyword input argument ```axis``` and propagates the cumulative product along that axis. The cumulative product propagated across the columns of each row with ```axis=-1``` can be calculated:

$$ \text{mat} = \begin{bmatrix} 
                1 & -2 & 3 & -4 \\
                x & x & x & x \\
                y & y & y & y \\
                \end{bmatrix} $$

$$ \text{matCumProduct} = \begin{bmatrix} 
                          1 & 1 * (-2) & 1 * (-2) * 3 & 1 * (-2) * 3 * (-4) \\
                          x & x & x & x \\
                          y & y & y & y \\
                          \end{bmatrix} $$

$$ \text{matCumProduct} = \begin{bmatrix} 
                          1 & -2 & -6 & 24 \\
                          -5 & -30 & 210 & 1680 \\
                          9 & -90 & -990 & 11880 \\
                          \end{bmatrix} $$

In [415]:
mat.cumprod(axis=-1)

array([[    1,    -2,    -6,    24],
       [   -5,   -30,   210,  1680],
       [    9,   -90,  -990, 11880]])

The corresponding ```np.cumprod``` function behaves analogously. The alias ```np.cumproduct``` was deprecated in NumPy 1.25 and removed in NumPy 2.0, so ```np.cumprod``` is preferred.

The ```np.diff``` function also takes the keyword input argument ```axis``` and propagates the difference along that axis. The number of recursions ```n``` is ```1``` by default. Because two adjacent values are required to compute each difference, the length of the axis the difference was propagated along is reduced by ```1``` in the return value for each recursion:

$$ \text{mat} = \begin{bmatrix} 
                1 & -2 & 3 & -4 \\
                x & x & x & x \\
                y & y & y & y \\
                \end{bmatrix} $$

For ```n=1``` (1 recursion):

$$ \text{matDiff1} = \begin{bmatrix} 
                    (-2) - 1 & 3 - (-2) & -4 - 3 \\
                    x & x & x \\
                    y & y & y \\
                    \end{bmatrix} $$

$$ \text{matDiff1} = \begin{bmatrix} 
                    -3 & 5 & -7 \\
                    11 & -13 & 15 \\
                    -19 & 21 & -23 \\
                    \end{bmatrix} $$

In [416]:
np.diff(mat, axis=-1)

array([[ -3,   5,  -7],
       [ 11, -13,  15],
       [-19,  21, -23]])

For ```n=2``` (2 recursions), the difference of the ```n=1``` matrix is computed:

$$ \text{matDiff2} = \begin{bmatrix} 
                     5 - (-3) & -7 - 5 \\
                     x & x \\
                     y & y \\
                    \end{bmatrix} $$

$$ \text{matDiff2} = \begin{bmatrix} 
                     8 & -12 \\
                     -24 & 28 \\
                     40 & -44 \\
                    \end{bmatrix} $$

In [417]:
np.diff(mat, axis=-1, n=2)

array([[  8, -12],
       [-24,  28],
       [ 40, -44]])
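
The recursion can be verified by applying ```np.diff``` twice by hand. In this sketch the names ```once``` and ```twice``` are illustrative:

```python
import numpy as np

mat = np.array([[1, -2, 3, -4],
                [-5, 6, -7, 8],
                [9, -10, 11, -12]])

once = np.diff(mat, axis=-1)    # shape (3, 3)
twice = np.diff(once, axis=-1)  # shape (3, 2)

# n=2 gives the same result as differencing the n=1 result
same = np.array_equal(twice, np.diff(mat, axis=-1, n=2))

print(mat.shape, once.shape, twice.shape)
print(same)
```

Each recursion shortens the last axis by one element, from ```4``` to ```3``` to ```2```.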

The method ```ndarray.round``` broadcasts ```builtins.round``` along an ```ndarray```. By default ```decimals=0``` so each floating point number is rounded to ```0``` decimal places:

$$ \text{mat} = \begin{bmatrix} 
                1.12 & -2.12 & 3.12 & -4.12 \\
                -5.12 & 6.12 & -7.12 & 8.12 \\
                9.12 & -10.12 & 11.12 & -12.12 \\
                \end{bmatrix} $$

In [418]:
mat = np.array([[1.12, -2.12, 3.12, -4.12],
                [-5.12, 6.12, -7.12, 8.12],
                [9.12, -10.12, 11.12, -12.12]])

Notice the trailing decimal point ```.``` after each number, indicating that the datatype is still a floating point number:

$$ \text{matRound0} = \begin{bmatrix} 
                      1. & -2. & 3. & -4. \\
                      -5. & 6. & -7. & 8. \\
                      9. & -10. & 11. & -12. \\
                      \end{bmatrix} $$

In [419]:
mat.round()

array([[  1.,  -2.,   3.,  -4.],
       [ -5.,   6.,  -7.,   8.],
       [  9., -10.,  11., -12.]])

If ```decimals=1```, each value is instead rounded to 1 decimal place:

$$ \text{matRound1} = \begin{bmatrix} 
                      1.1 & -2.1 & 3.1 & -4.1 \\
                      -5.1 & 6.1 & -7.1 & 8.1 \\
                      9.1 & -10.1 & 11.1 & -12.1 \\
                      \end{bmatrix} $$

In [420]:
mat.round(decimals=1)

array([[  1.1,  -2.1,   3.1,  -4.1],
       [ -5.1,   6.1,  -7.1,   8.1],
       [  9.1, -10.1,  11.1, -12.1]])
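
Values that lie exactly halfway between two integers are worth a closer look. The sketch below uses a made-up array ```halves``` to show that both ```ndarray.round``` and ```builtins.round``` round halves to the nearest even integer, while only the ```ndarray``` method keeps a floating point datatype:

```python
import numpy as np

# Hypothetical values chosen to sit exactly halfway between integers
halves = np.array([0.5, 1.5, 2.5, 3.5])

rounded = halves.round()                               # stays float64
builtin_rounded = [round(x) for x in halves.tolist()]  # Python ints

print(rounded, rounded.dtype)
print(builtin_rounded)
```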

## Mathematics

The constants ```e```, ```pi```, ```inf``` and ```nan``` in the ```math``` module are also available from the ```np``` library:

In [421]:
np.e

2.718281828459045

In [422]:
np.pi

3.141592653589793

In [423]:
np.inf

inf

In [424]:
np.nan

nan

These can be made into arrays using the ```np.full``` function, or by scalar expansion with an ```ndarray```:

In [425]:
np.full(shape=(3, 2), fill_value=np.inf)

array([[inf, inf],
       [inf, inf],
       [inf, inf]])

In [426]:
np.e * np.ones(shape=(4, 2))

array([[2.71828183, 2.71828183],
       [2.71828183, 2.71828183],
       [2.71828183, 2.71828183],
       [2.71828183, 2.71828183]])
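
The two approaches give the same values for a floating point constant, but they can differ in datatype when the fill value is an integer. This is a small sketch with illustrative names, where the integer ```7``` is chosen arbitrarily:

```python
import numpy as np

by_full = np.full(shape=(4, 2), fill_value=np.e)
by_expansion = np.e * np.ones(shape=(4, 2))
same_values = np.array_equal(by_full, by_expansion)

# np.full infers the datatype from the fill value,
# whereas np.ones is floating point by default
int_full = np.full(shape=(2, 2), fill_value=7)
float_expansion = 7 * np.ones(shape=(2, 2))

print(same_values)
print(int_full.dtype, float_expansion.dtype)
```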

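The ```np.pi``` constant is commonly used with ```np.linspace``` to create an array of angles in radians (```numpy``` does not provide a ```tau``` constant, although ```math.tau``` is available in the standard library):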

In [427]:
angles = np.linspace(-np.pi/2, np.pi/2, 11)

In [428]:
angles

array([-1.57079633, -1.25663706, -0.9424778 , -0.62831853, -0.31415927,
        0.        ,  0.31415927,  0.62831853,  0.9424778 ,  1.25663706,
        1.57079633])

The trigonometric functions in the ```math``` module are also broadcast to ```ndarrays```:

In [429]:
np.sin(angles)

array([-1.        , -0.95105652, -0.80901699, -0.58778525, -0.30901699,
        0.        ,  0.30901699,  0.58778525,  0.80901699,  0.95105652,
        1.        ])

In [430]:
np.cos(angles)

array([6.12323400e-17, 3.09016994e-01, 5.87785252e-01, 8.09016994e-01,
       9.51056516e-01, 1.00000000e+00, 9.51056516e-01, 8.09016994e-01,
       5.87785252e-01, 3.09016994e-01, 6.12323400e-17])

In [431]:
np.tan(angles)

array([-1.63312394e+16, -3.07768354e+00, -1.37638192e+00, -7.26542528e-01,
       -3.24919696e-01,  0.00000000e+00,  3.24919696e-01,  7.26542528e-01,
        1.37638192e+00,  3.07768354e+00,  1.63312394e+16])
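
Notice the cosine at the end points is about ```6e-17``` rather than exactly ```0```, and the tangent is very large rather than infinite. This happens because $\pi/2$ cannot be represented exactly as a floating point number. A minimal sketch of comparing with a tolerance, using ```np.isclose``` and ```np.allclose``` with their default tolerances:

```python
import numpy as np

angles = np.linspace(-np.pi/2, np.pi/2, 11)
cos_vals = np.cos(angles)

# An exact comparison fails at the end point
exactly_zero = bool(cos_vals[0] == 0.0)

# A comparison with a tolerance succeeds
close_to_zero = bool(np.isclose(cos_vals[0], 0.0))

# sin^2 + cos^2 = 1 holds to within floating point tolerance
identity_holds = np.allclose(np.sin(angles)**2 + cos_vals**2, 1.0)

print(exactly_zero, close_to_zero, identity_holds)
```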

These functions, alongside some others, were covered in more detail in the notebook on the ```math``` module.

## Random Module

The ```numpy``` library has its own ```random``` module, which is essentially Python's standard module ```random``` broadcast to ```ndarrays```. The workflow of the two modules is similar. Notice the standard module ```random``` normally returns a scalar:

In [432]:
import random

In [433]:
random.seed(0)

In [434]:
random.randint(0, 11) # scalar

6

Whereas ```np.random``` normally returns an ```ndarray``` according to a specified ```size```:

In [435]:
np.random.seed(0)

In [436]:
np.random.randint(low=0, high=11, size=4) # ndarray 

array([5, 0, 3, 3])

```size``` can also be supplied as a ```shape``` ```tuple``` of dimensions:

In [437]:
np.random.randint(low=0, high=11, size=(6, 5)) # ndarray 

array([[ 7,  9,  3,  5,  2],
       [ 4,  7,  6,  8,  8],
       [10,  1,  6,  7,  7],
       [ 8,  1,  5,  9,  8],
       [ 9,  4,  3,  0,  3],
       [ 5,  0,  2,  3,  8]])
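
One difference between the two modules is the upper bound: ```random.randint``` includes it, whereas ```np.random.randint``` excludes it. The sketch below illustrates this with ```1000``` draws (an arbitrary number chosen for the illustration) and also shows the newer ```np.random.default_rng``` generator interface as an alternative to the seeded functions used above:

```python
import random
import numpy as np

random.seed(0)
std_draws = [random.randint(0, 11) for _ in range(1000)]  # 11 is inclusive

np.random.seed(0)
np_draws = np.random.randint(low=0, high=11, size=1000)   # 11 is exclusive

# Alternative generator interface, high is also exclusive by default
rng = np.random.default_rng(0)
rng_draws = rng.integers(low=0, high=11, size=1000)

print(max(std_draws), np_draws.max(), rng_draws.max())
```

Seeding each generator makes the draws reproducible between runs.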

More details about the other functions in the module such as the random distributions were covered in the notebook on the standard ```random``` module.

## Meshgrid

Supposing $x$ is a row and $y$ is a column:

In [438]:
xrow = np.array([5, 6, 7, 8, 9])[np.newaxis, :]

In [439]:
xrow

array([[5, 6, 7, 8, 9]])

In [440]:
ycol = np.array([1, 2, 3, 4])[:, np.newaxis]

In [441]:
ycol

array([[1],
       [2],
       [3],
       [4]])

The ```np.meshgrid``` function can be used to broadcast an ```x``` ```ndarray``` using the dimensionality of a ```y``` ```ndarray``` instance and vice versa:

In [442]:
xmat, ymat = np.meshgrid(xrow, ycol)

In [443]:
xmat

array([[5, 6, 7, 8, 9],
       [5, 6, 7, 8, 9],
       [5, 6, 7, 8, 9],
       [5, 6, 7, 8, 9]])

In [444]:
ymat

array([[1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])
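
For elementwise arithmetic, the matrices from ```np.meshgrid``` give the same result as broadcasting a row against a column directly. The sketch below uses 1d inputs and illustrative names to compare the two:

```python
import numpy as np

x = np.array([5, 6, 7, 8, 9])
y = np.array([1, 2, 3, 4])

# Explicit grids, each with shape (len(y), len(x))
xmat, ymat = np.meshgrid(x, y)
via_meshgrid = xmat * ymat

# Implicit broadcasting of a row against a column
via_broadcast = x[np.newaxis, :] * y[:, np.newaxis]

print(via_meshgrid.shape)
print(np.array_equal(via_meshgrid, via_broadcast))
```

The explicit grids are still convenient when a function requires full coordinate matrices.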

This is a useful function for plotting data, which will be discussed in the next notebook on ```matplotlib```.

[Return to Python Tutorials](../readme.md)