# Numpy demo
A demo of the core functionality of the Python numpy module.

Victor Kitov, <https://victorkitov.github.io>

[For jupyter notebook - make full screen preview](https://stackoverflow.com/questions/21971449/how-do-i-increase-the-cell-width-of-the-jupyter-ipython-notebook-in-my-browser)

* [Numpy demo](#Numpy-demo)
* [Dynamic types of python](#Dynamic-types-of-python)
* [Numpy](#Numpy)
* [Array - fundamental data type](#Array---fundamental-data-type)
* [Creating standard arrays](#Creating-standard-arrays)
* [Arrays filled with random numbers](#Arrays-filled-with-random-numbers)
* [Mathematical operations](#Mathematical-operations)
   * [sum](#sum)
* [Iteration over rows/columns](#Iteration-over-rows/columns)
* [Extending dimensions](#Extending-dimensions)
* [Reshaping an array](#Reshaping-an-array)
* [Concatenation of arrays](#Concatenation-of-arrays)
   * [hstack - stacks along horizontal dimension (add columns)](#hstack---stacks-along-horizontal-dimension-(add-columns))
   * [vstack - stacks along vertical direction (add rows)](#vstack---stacks-along-vertical-direction-(add-rows))
* [Gotcha: assignment of arrays](#Gotcha:-assignment-of-arrays)
* [Slicing of array](#Slicing-of-array)
* [Boolean operations](#Boolean-operations)
   * [any / all](#any-/-all)
* [Creating subarrays by indexes](#Creating-subarrays-by-indexes)
 * [Integer indexing](#Integer-indexing)
  * [Random reorder of array](#Random-reorder-of-array)
  * [Alternative way to create a random subsample](#Alternative-way-to-create-a-random-subsample)
 * [Boolean indexing](#Boolean-indexing)
 * [Convert boolean selection to integer indexes](#Convert-boolean-selection-to-integer-indexes)
 * [Convert integer indexes to boolean selection](#Convert-integer-indexes-to-boolean-selection)
 * [Boolean selection to integer indexing for a matrix](#Boolean-selection-to-integer-indexing-for-a-matrix)
* [Broadcasting](#Broadcasting)
 * [Demo: Increase all elements by 100.](#Demo:-Increase-all-elements-by-100.)
 * [Demo: add 100 to first column, 200 to second column, etc.](#Demo:-add-100-to-first-column,-200-to-second-column,-etc.)
 * [Demo: add 100 to first row, 200 to second row, etc.](#Demo:-add-100-to-first-row,-200-to-second-row,-etc.)
* [Sorting](#Sorting)
 * [Sorting vectors](#Sorting-vectors)
 * [Sorting matrices](#Sorting-matrices)
 * [Get indexes which make the array sorted](#Get-indexes-which-make-the-array-sorted)
* [Obtaining unique elements](#Obtaining-unique-elements)
* [Saving memory tricks](#Saving-memory-tricks)
 * [float64 to float32](#float64-to-float32)
 * [Sparse matrices](#Sparse-matrices)
* [Special values](#Special-values)
  * [inf, -inf  ](#inf,--inf--)
  * [nan ](#nan-)

# Dynamic types of python

Python is dynamically typed: a variable's type does not need to be declared in advance, and it may change over time.

In [1]:
a = True      # now boolean    
a = 123       # now integer 
a = 'Hello'   # now string

In [2]:
a=1
b=2
a+b  # on each such operation python needs to determine the data types and look up the appropriate methods, which is slow.

3

In [3]:
from sys import getsizeof
getsizeof(a)    # 28 bytes to store int!

28

In [4]:
A=[[1,2,3],[4,5,6],['text',8,True]]
A[2][1]

8

# Numpy

Numpy is a library for storing arrays of uniform (same type) data.
The data type is the same for all array elements, so vector operations are very fast.
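
As a rough illustration of the statement above, the sketch below computes the same squares with a Python list and with a numpy array. It uses the explicit `import numpy as np` form instead of `%pylab`, and the variable names are made up for this example.

```python
import numpy as np

values = list(range(1000))

# pure Python: the interpreter checks the type of every element inside the loop
squares_list = [v * v for v in values]

# numpy: one typed array, the elementwise loop runs in compiled code
arr = np.array(values)
squares_arr = arr * arr

print(arr.dtype)                              # a single integer dtype shared by all elements
print(squares_arr.tolist() == squares_list)   # True
```

Both versions give the same numbers; the numpy version simply avoids per-element type dispatch.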

In [5]:
# import numpy and matplotlib into current variable namespace
%pylab inline

%pylab is deprecated, use %matplotlib inline and import the required libraries.
Populating the interactive namespace from numpy and matplotlib
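
Following the deprecation message above, a possible replacement for `%pylab` is an explicit import. The sketch below assumes the common `np` alias, so functions used later in this demo without a prefix (`array`, `zeros`, ...) would be written as `np.array`, `np.zeros`, and so on. In a notebook, plotting would additionally need `%matplotlib inline`.

```python
import numpy as np

# with an explicit import, numpy functions are called through the np prefix
A = np.array([[1, 1], [2, 2], [3, 3]], dtype=float)
Z = np.zeros(4, dtype=bool)

print(A.shape)   # (3, 2)
print(Z.dtype)   # bool
```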


In [6]:
# show in all print statements only 3 digits precision for simplicity
%precision 3

'%.3f'

# Array - fundamental data type

In [7]:
A=array([[1,1],[2,2],[3,3]],dtype=float)
A

array([[1., 1.],
       [2., 2.],
       [3., 3.]])

In [8]:
A.dtype

dtype('float64')

In [9]:
A.shape

(3, 2)

In [10]:
A[0,0]=10

In [11]:
A

array([[10.,  1.],
       [ 2.,  2.],
       [ 3.,  3.]])

In [12]:
A+10 # operations are for all elements simultaneously

array([[20., 11.],
       [12., 12.],
       [13., 13.]])

In [13]:
len(A) # number of elements along 0-th dimension (num of rows in matrix)

3

In [14]:
B=zeros(4, dtype=bool)
B.dtype, B

(dtype('bool'), array([False, False, False, False]))

In [15]:
C=array([1,2,3], dtype=float32) # C is single precision (32 bit), takes half the memory of the default float64

In [16]:
C.dtype, C

(dtype('float32'), array([1., 2., 3.], dtype=float32))

In [17]:
print("C: %d bytes" % (C.size * C.itemsize))

C: 12 bytes


In [18]:
A=array([1,2,3])
print("A: %d bytes" % (A.size * A.itemsize))

A: 24 bytes


In [19]:
A=arange(10,14)   # python range is a lazy sequence (values are produced on demand), arange is numpy array (fully stored in memory)
A

array([10, 11, 12, 13])

In [20]:
Y=array([11,12,13])
Y

array([11, 12, 13])

# Creating standard arrays

In [187]:
zeros([3,4])

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [188]:
ones([3,4])

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [23]:
eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [24]:
arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]:
arange(11, 20)  # [begin,end) - last element excluded!

array([11, 12, 13, 14, 15, 16, 17, 18, 19])

In [26]:
arange(11, 15, 0.5)  # last element excluded!

array([11. , 11.5, 12. , 12.5, 13. , 13.5, 14. , 14.5])

linspace(start, stop, count) - make uniform grid, consisting of [count] points between [start] and [stop]

In [27]:
linspace(-2,3,6)   # last element is included!

array([-2., -1.,  0.,  1.,  2.,  3.])

In [191]:
linspace(0,2,5)

array([0. , 0.5, 1. , 1.5, 2. ])
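
The difference between `arange` and `linspace` can be summarized in a small sketch (explicit `np` import assumed): `arange` takes a step and excludes the stop value, while `linspace` takes the number of points and includes it.

```python
import numpy as np

a = np.arange(0, 2, 0.5)     # step is given, stop is excluded
g = np.linspace(0, 2, 5)     # number of points is given, stop is included

print(len(a), len(g))        # 4 5
print(a[-1], g[-1])          # 1.5 2.0

# spacing of the linspace grid is (stop - start) / (count - 1)
step = (2 - 0) / (5 - 1)
print(np.allclose(np.diff(g), step))   # True
```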

# Arrays filled with random numbers

In [29]:
rand(2,3) # 2x3 array of uniform U[0,1] numbers

array([[0.87 , 0.138, 0.876],
       [0.972, 0.397, 0.895]])

In [30]:
randn(2,3) # 2x3 array of normally distributed N(0,1) numbers

array([[ 0.999,  0.149, -0.729],
       [-0.953, -0.013,  0.327]])

# Mathematical operations

In [232]:
A=array([[1,2],[3,4]])

In [233]:
A

array([[1, 2],
       [3, 4]])

In [234]:
2*A+100

array([[102, 104],
       [106, 108]])

In [235]:
A**3  # cube each element

array([[ 1,  8],
       [27, 64]])

In [35]:
A%2  # remainder after division by 2

array([[1, 0],
       [1, 0]])

In [36]:
log(A)

array([[0.   , 0.693],
       [1.099, 1.386]])

In [37]:
exp(A)

array([[ 2.718,  7.389],
       [20.086, 54.598]])

In [38]:
sqrt(A)

array([[1.   , 1.414],
       [1.732, 2.   ]])

In [39]:
sin(A)

array([[ 0.841,  0.909],
       [ 0.141, -0.757]])

In [236]:
A

array([[1, 2],
       [3, 4]])

In [237]:
B=array([[10,0],[0,10]])

In [238]:
B

array([[10,  0],
       [ 0, 10]])

In [42]:
A+B

array([[11,  2],
       [ 3, 14]])

In [239]:
A*B  # elementwise!

array([[10,  0],
       [ 0, 40]])

In [240]:
dot(A,B) # matrix multiplication

array([[10, 20],
       [30, 40]])

In [241]:
A@B  # matrix multiplication

array([[10, 20],
       [30, 40]])

#### sum

In [242]:
A=array([[1,2],[3,4]])
print(A)

[[1 2]
 [3 4]]


In [243]:
A.sum()  # sum of all elements

10

In [244]:
A.sum(axis=0)

array([4, 6])

In [246]:
A.sum(axis=1)

array([3, 7])

In [245]:
sum(A)   # sum of all elements

10

In [48]:
A.sum(0)   # sum along 0-th axis  (sum column values)

array([4, 6])

In [49]:
sum(A,0)   # sum along 0-th axis  (sum column values)

array([4, 6])

In [50]:
A.sum(1)   # sum along 1-st axis (sum row values) 

array([3, 7])

In [51]:
sum(A, 1)   # sum along 1-st axis (sum row values) 

array([3, 7])

In [52]:
sum(A, axis=1)   # sum along 1-st axis (sum row values) 

array([3, 7])

**mean** - calculates the average value

**max** - calculates maximum

**min** - calculates minimum


They work analogously to **sum**.
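
A short sketch of these functions on the same 2x2 matrix that was used for **sum** (explicit `np` import assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

print(A.mean())         # 2.5, over all elements
print(A.mean(axis=0))   # [2. 3.], mean of each column
print(A.max(axis=1))    # [2 4], maximum of each row
print(A.min(axis=0))    # [1 2], minimum of each column
```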

Take argmax

In [247]:
a=array([11,12,13,13,12,11])
a

array([11, 12, 13, 13, 12, 11])

In [248]:
ind = argmax(a)  # returns index of the 1st position of the maximum value
ind, a[ind]

(2, 13)

In [None]:
a==max(a)  # where max

In [250]:
a[a==max(a)]   # max values

array([13, 13])

Take elementwise maximum

In [252]:
A=[10,20,3,40]
B=[10,2,30,4]

In [253]:
maximum(A,B)   # elementwise maximum

array([10, 20, 30, 40])

In [57]:
minimum(A,B)   # elementwise minimum

array([10,  2,  3,  4])

# Iteration over rows/columns

In [254]:
A=array([[1,2,3],[4,5,6],[7,8,9]])
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [255]:
for e in A:    # iteration over rows
    print(e)

[1 2 3]
[4 5 6]
[7 8 9]


In [256]:
for i,elem in enumerate(A,start=1):  # iterate over rows
    print(f'row {i}: {elem}')

row 1: [1 2 3]
row 2: [4 5 6]
row 3: [7 8 9]


In [195]:
A.T   # matrix transposition

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In [257]:
for e in A.T:    # iteration over columns
    print(e)

[1 4 7]
[2 5 8]
[3 6 9]


In [260]:
for i,elem in enumerate(A.T, start=1):  # transpose matrix to iterate over columns
    print(f'column {i}: {elem}')

column 1: [1 4 7]
column 2: [2 5 8]
column 3: [3 6 9]


# Extending dimensions

In [261]:
A=array([1,2,3,4])
A

array([1, 2, 3, 4])

In [262]:
A.shape # 1D array

(4,)

In [263]:
B=A[:,None]  # 2D array

In [264]:
B.shape, B

((4, 1),
 array([[1],
        [2],
        [3],
        [4]]))

In [266]:
B=A[None,:]  # add minibatch dimension
B

array([[1, 2, 3, 4]])

# Reshaping an array

In [267]:
A=arange(12)
A

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [268]:
A.reshape([3,4])

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [269]:
B=A.reshape([4,3])
B

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [270]:
C=B.ravel()   # linearize to 1D vector
C

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [271]:
C.reshape(B.shape)   # back to B dimensions

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

# Concatenation of arrays

#### hstack - stacks along horizontal dimension (add columns)

#### vstack - stacks along vertical direction (add rows)

In [272]:
A=zeros([3,4])
B=ones([3,4])
print(A)
print()
print(B)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


In [273]:
hstack([A,B]), vstack([A,B])

(array([[0., 0., 0., 0., 1., 1., 1., 1.],
        [0., 0., 0., 0., 1., 1., 1., 1.],
        [0., 0., 0., 0., 1., 1., 1., 1.]]),
 array([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]))

In [274]:
hstack([[1,2,3],[4,5,6],7,[8,9,10,11,12]])  # stack arrays of different length

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [76]:
vstack([[1,2,3],[4,5,6]])

array([[1, 2, 3],
       [4, 5, 6]])

In [77]:
hstack([eye(3),zeros((3,3))])

array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.]])

# Gotcha: assignment of arrays

In [275]:
A=eye(3)
B=A          # assignment by pointer
A

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [276]:
B[1,1]=100   # affects both A and B
A

array([[  1.,   0.,   0.],
       [  0., 100.,   0.],
       [  0.,   0.,   1.]])

In [277]:
A=eye(3)
B=array(A)   # now assign copy, instead of copying a pointer
B[1,1]=100   # affects only B
A

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
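
To check whether two arrays share the same data, `np.shares_memory` can be used. The sketch below (explicit `np` import assumed, variable names chosen for illustration) compares plain assignment, a slice and an explicit copy.

```python
import numpy as np

A = np.eye(3)
alias = A              # same object under another name
view = A[:]            # new array object that shares data with A
dup = A.copy()         # independent data

print(alias is A)                   # True
print(np.shares_memory(view, A))    # True
print(np.shares_memory(dup, A))     # False

view[0, 0] = 5         # visible through A
dup[1, 1] = 7          # not visible through A
print(A[0, 0], A[1, 1])             # 5.0 1.0
```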

# Slicing of array

In [278]:
A=arange(16).reshape(4,4)
A

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [279]:
A[:]  # take all elements (returns a VIEW that shares data with the original array, unlike array(A), which copies)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [280]:
A[2:] # all rows starting from row 2 (rows 0 and 1 are skipped)

array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [281]:
A[:-2] # all rows before the last and second last

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [282]:
A[:,2:]  

array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15]])

In [87]:
A[:,2:]  # columns, except first and second

array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15]])

In [283]:
A[:,:-2]   # columns, except last and second last

array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13]])

In [89]:
A[2:,2:]  # simultaneous slicing of columns and rows

array([[10, 11],
       [14, 15]])

In [90]:
A[::2,::2]  # submatrix, consisting of every second column and row

array([[ 0,  2],
       [ 8, 10]])

In [91]:
A[1::2,1::2]  # submatrix, consisting of every second column and row with offset

array([[ 5,  7],
       [13, 15]])

# Boolean operations

In [284]:
A=arange(6) # 0,1,2,3,4,5 (6 itself is excluded)
A

array([0, 1, 2, 3, 4, 5])

In [285]:
A>3

array([False, False, False, False,  True,  True])

In [94]:
A

array([0, 1, 2, 3, 4, 5])

In [95]:
A==3

array([False, False, False,  True, False, False])

In [96]:
A!=3

array([ True,  True,  True, False,  True,  True])

In [97]:
(A>1) & (A<3)  # elementwise AND

array([False, False,  True, False, False, False])

In [98]:
(A<2) | (A>4)  # elementwise OR

array([ True,  True, False, False, False,  True])

#### any / all

In [291]:
A=(A>3)
A

array([False, False, False, False,  True,  True])

In [292]:
any(A)

True

In [293]:
all(A)

False

In [288]:
any(A==4)

False

In [100]:
all(A>2)

False

In [294]:
A=array([0, 1, 2, 3, 4, 5])

In [296]:
all(A<3)

False

In [297]:
A=arange(9).reshape(3,3)
A

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [298]:
A>4

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])

In [300]:
any(A>4,axis=0)  # any along 0 dimension (any along columns)

array([ True,  True,  True])

In [301]:
any(A>4,1)  # any along 1 dimension (any along rows)

array([False,  True,  True])

In [105]:
all(A>4,1)  # all along 1 dimension (all along rows)

array([False, False,  True])

# Creating subarrays by indexes

## Integer indexing

In [311]:
A=arange(100,110)
A

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

In [312]:
A[2]

102

In [313]:
A[[2,4,7]]

array([102, 104, 107])

### Random reorder of array

In [324]:
inds=arange(len(A))
inds

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [325]:
np.random.seed(0)   # fixes randomization

In [326]:
shuffle(inds)  # random reorder in-place

In [327]:
inds

array([2, 8, 4, 9, 1, 6, 7, 3, 0, 5])

In [328]:
A[inds] # randomly reordered A

array([102, 108, 104, 109, 101, 106, 107, 103, 100, 105])

In [329]:
train_inds=inds[:4]
val_inds=inds[4:]

In [330]:
A[train_inds]  # random subsample of 4 elements of A

array([102, 108, 104, 109])

In [331]:
A[val_inds]

array([101, 106, 107, 103, 100, 105])

### Alternative way to create a random subsample

In [333]:
L=list(range(len(A)))
random.shuffle(L)  # random reordering of sequence, in-place
L

[8, 3, 5, 1, 2, 4, 9, 0, 6, 7]

In [119]:
inds = L[:3]   # create subsample of 3 random elements
inds

[8, 3, 5]
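
Yet another option, shown here only as a sketch, is numpy's `Generator` interface (`np.random.default_rng`), which is not used elsewhere in this demo. The seed value 0 and the variable names are arbitrary choices for the example.

```python
import numpy as np

A = np.arange(100, 110)
rng = np.random.default_rng(0)    # seeded generator, the seed value is arbitrary

perm = rng.permutation(len(A))    # shuffled copy of indexes 0..len(A)-1
train_inds, val_inds = perm[:4], perm[4:]

sample = rng.choice(A, size=3, replace=False)   # 3 distinct elements of A

print(sorted(perm.tolist()) == list(range(len(A))))   # True
print(len(set(sample.tolist())))                      # 3
```

Unlike `shuffle`, `permutation` returns a new array and leaves its argument unchanged.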

## Boolean indexing

In [334]:
A

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

In [335]:
A[[True,False,True,False,True,False,True,False,True,False]]

array([100, 102, 104, 106, 108])

In [336]:
A[A>105]   # create boolean condition and filter by it

array([106, 107, 108, 109])

In [337]:
A[(A>103) & (A<107)]

array([104, 105, 106])

In [338]:
A[(A<103) | (A>107)]

array([100, 101, 102, 108, 109])

## python sets

In [339]:
A=set([1,1,2,2,3,3])
B=set([3,3,4,4,5,5])
print(f'A={A}')
print(f'B={B}')

A={1, 2, 3}
B={3, 4, 5}


In [340]:
A&B

{3}

In [341]:
A|B

{1, 2, 3, 4, 5}

## Convert boolean selection to integer indexes

In [342]:
A=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [343]:
sels = (A>5)   # boolean selection

In [344]:
sels

array([False, False, False, False, False, False,  True,  True,  True,
        True])

In [345]:
inds = where(sels)[0]  # indexes of True positions (integer indexing)
inds

array([6, 7, 8, 9])

In [346]:
A[sels], A[inds]

(array([6, 7, 8, 9]), array([6, 7, 8, 9]))

## Convert integer indexes to boolean selection

* Boolean indexing: for selecting based on value conditions
* Integer indexing: for random reordering, subindexing (index(index)) for random subsamples

In [350]:
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [351]:
inds = [6,7,8,9]
A[inds]

array([6, 7, 8, 9])

In [352]:
sels=zeros(A.shape, dtype=bool)  # create empty boolean selection
sels

array([False, False, False, False, False, False, False, False, False,
       False])

In [353]:
sels[inds] = True  # select at positions from inds
sels

array([False, False, False, False, False, False,  True,  True,  True,
        True])

In [354]:
A[inds], A[sels]

(array([6, 7, 8, 9]), array([6, 7, 8, 9]))

## Boolean selection to integer indexing for a matrix

In [355]:
B=arange(9).reshape((3,3))
B

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [357]:
inds = np.where(B>4)
inds

(array([1, 2, 2, 2]), array([2, 0, 1, 2]))

In [358]:
print(B[inds[0],inds[1]])  # elements at the (row, column) pairs stored in inds; same as B[inds]

[5 6 7 8]
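
If the (row, column) pairs themselves are needed, they can be obtained by zipping the two index arrays or with `np.argwhere`. A minimal sketch, assuming the explicit `np` import:

```python
import numpy as np

B = np.arange(9).reshape(3, 3)
rows, cols = np.where(B > 4)

# walk over (row, column) pairs of the selected elements
for i, j in zip(rows, cols):
    print(i, j, B[i, j])

# the same pairs as an array with one (row, column) pair per line
pairs = np.argwhere(B > 4)
print(pairs.tolist())   # [[1, 2], [2, 0], [2, 1], [2, 2]]
```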


# Broadcasting

Broadcasting is a numpy technique for performing operations on arrays of different, but compatible, shapes.

In [359]:
A=arange(12).reshape(3,4)
A

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

## Demo: Increase all elements by 100.

Can do it like this:

In [140]:
B=100*ones((3,4))
B

array([[100., 100., 100., 100.],
       [100., 100., 100., 100.],
       [100., 100., 100., 100.]])

In [142]:
A+B

array([[100., 101., 102., 103.],
       [104., 105., 106., 107.],
       [108., 109., 110., 111.]])

In [362]:
A+array(100)  # does the same! 
# during "+" operation 100 is converted to array([100]), then cloned to get to shape of A

array([[100, 101, 102, 103],
       [104, 105, 106, 107],
       [108, 109, 110, 111]])

## Demo: add 100 to first column, 200 to second column, etc.

In [363]:
A

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [364]:
A.shape

(3, 4)

In [365]:
B=arange(1,5)*100
B

array([100, 200, 300, 400])

In [366]:
B.shape

(4,)

In [367]:
A+B  # shape of B (4,) extended to (1,4) then cloned to get shape (3,4)

array([[100, 201, 302, 403],
       [104, 205, 306, 407],
       [108, 209, 310, 411]])

## Demo: add 100 to first row, 200 to second row, etc.

In [368]:
B=arange(1,4)*100
B, B.shape

(array([100, 200, 300]), (3,))

In [369]:
B=B[:,None]
B

array([[100],
       [200],
       [300]])

In [370]:
B.shape

(3, 1)

In [231]:
A+B

array([[100, 101, 102, 103],
       [204, 205, 206, 207],
       [308, 309, 310, 311]])

General rule:

1. If both arrays have the same shape, perform the operation elementwise.
    * shapes (3,4,5) and (3,4,5) match => do operation elementwise
2. If arrays have the same number of dimensions:
    * each dimension should either have the same length in both arrays or have length 1 in one of them;
    * in the latter case the length-1 dimension is extended to the size of the other array by cloning data;
    * example:
        * shapes before broadcasting: (3,4,5,1) and (1,4,1,7)
        * shapes after broadcasting: (3,4,5,7) and (3,4,5,7)
    * at this stage shapes of both arrays match, so do operation elementwise (step 1).
3. If one array has fewer dimensions than the other, missing dimensions of length 1 are prepended (added from the beginning).
    * example:
        * shapes before prepending: (3,4,5,7) and (5,7)
        * shapes after prepending: (3,4,5,7) and (1,1,5,7)
    * now the numbers of dimensions match, and the rest is processed by step 2.
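
The shapes from the rule above can be checked directly. The sketch below assumes the explicit `np` import and uses arrays of ones only to inspect the resulting shapes.

```python
import numpy as np

# step 2: same number of dimensions, length-1 dimensions are stretched
X = np.ones((3, 4, 5, 1))
Y = np.ones((1, 4, 1, 7))
print((X + Y).shape)        # (3, 4, 5, 7)

# step 3: missing leading dimensions are prepended, (5, 7) -> (1, 1, 5, 7)
W = np.ones((3, 4, 5, 7))
Z = np.ones((5, 7))
print((W + Z).shape)        # (3, 4, 5, 7)

# shapes that do not satisfy the rule raise an error:
# (3,) becomes (1, 3), and 3 does not match 4
try:
    np.ones((3, 4)) + np.ones(3)
except ValueError as err:
    print("cannot broadcast:", err)
```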

# Sorting

## Sorting vectors

In [371]:
a = array([1,2,3,4,3,2,1])
a

array([1, 2, 3, 4, 3, 2, 1])

In [372]:
sort(a)  # returns sorted array, not in-place

array([1, 1, 2, 2, 3, 3, 4])

In [373]:
inds = argsort(a)  # return indexes which sort the array
a[inds]

array([1, 1, 2, 2, 3, 3, 4])

## Sorting matrices

In [156]:
A=array(arange(5))[::-1]
A

array([4, 3, 2, 1, 0])

In [157]:
B=10**array(arange(4))[::-1]
B[-1]=0
B

array([1000,  100,   10,    0])

In [158]:
B=B[:,None]
B

array([[1000],
       [ 100],
       [  10],
       [   0]])

In [159]:
C=A+B
C

array([[1004, 1003, 1002, 1001, 1000],
       [ 104,  103,  102,  101,  100],
       [  14,   13,   12,   11,   10],
       [   4,    3,    2,    1,    0]])

In [160]:
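D=C[:]  # D is a view of C, not a copy, so the in-place sort below also reorders C (use C.copy() for an independent copy)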
D.sort(0)  # sorts along 0-th dimension (sort each column), in-place
D

array([[   4,    3,    2,    1,    0],
       [  14,   13,   12,   11,   10],
       [ 104,  103,  102,  101,  100],
       [1004, 1003, 1002, 1001, 1000]])

In [161]:
D=C[:]  # again a view of C, not a copy
D.sort(1)  # sorts along 1-st dimension (sort each row), in-place; C changes too
D

array([[   0,    1,    2,    3,    4],
       [  10,   11,   12,   13,   14],
       [ 100,  101,  102,  103,  104],
       [1000, 1001, 1002, 1003, 1004]])

## Get indexes which make the array sorted

In [162]:
C

array([[   0,    1,    2,    3,    4],
       [  10,   11,   12,   13,   14],
       [ 100,  101,  102,  103,  104],
       [1000, 1001, 1002, 1003, 1004]])

In [163]:
inds = argsort(C,0) 
# sort along 0-th dimension (each column), return indexes of sorted elements for each column 
inds

array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3]])
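
For matrices, indexes returned by `argsort` cannot be applied with plain `A[inds]`. One way to apply them is `np.take_along_axis`, which is not used elsewhere in this demo. A sketch with a small made-up matrix `M`:

```python
import numpy as np

M = np.array([[3, 1, 2],
              [0, 5, 4]])

inds = np.argsort(M, axis=1)                     # sorting indexes within each row
by_rows = np.take_along_axis(M, inds, axis=1)    # apply them row by row

print(inds.tolist())      # [[1, 2, 0], [0, 2, 1]]
print(by_rows.tolist())   # [[1, 2, 3], [0, 4, 5]]
print(np.array_equal(by_rows, np.sort(M, axis=1)))   # True
```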

# Obtaining unique elements

In [375]:
Y = array([0,1,1,2,2,2,3,3,3,3,3,3,3])  # may apply to python lists, tuples
Y

array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3])

In [376]:
set(Y)

{0, 1, 2, 3}

In [377]:
u = unique(Y)
u

array([0, 1, 2, 3])

In [379]:
unique(Y, return_counts=True)

(array([0, 1, 2, 3]), array([1, 2, 3, 7]))
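
The counts can be combined with `argmax` to find the most frequent value. A small sketch on the same data (explicit `np` import assumed):

```python
import numpy as np

Y = np.array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3])
values, counts = np.unique(Y, return_counts=True)

most_frequent = values[np.argmax(counts)]
print(most_frequent)                                  # 3
print(dict(zip(values.tolist(), counts.tolist())))    # {0: 1, 1: 2, 2: 3, 3: 7}
```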

# Saving memory tricks

## float64 to float32

In [380]:
A=eye(100) 

In [381]:
A

array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])

In [382]:
A.dtype  # each element is 64 bit = 8bytes

dtype('float64')

In [384]:
print(f'Memory size of A={A.nbytes}')

Memory size of A=80000


In [387]:
B=A.astype(float32)  

In [388]:
A.dtype, A.nbytes  

(dtype('float64'), 80000)

In [389]:
B.dtype   # each element of B is 32bit = 4 bytes

dtype('float32')

In [390]:
print(f'Memory size of B={B.nbytes}')   # two times less total memory usage!

Memory size of B=40000


## Sparse matrices

When a matrix contains many zeros, it is more memory efficient to store only the non-zero entries instead of the whole matrix. 

This can be done with sparse matrix type.

In [391]:
A=eye(100)
A

array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])

In [393]:
import scipy
from scipy import sparse
B =  scipy.sparse.dok_matrix(A)  # Different sparse matrix formats exist. Sparse matrix stores only non-zero elements

In [394]:
B.dtype # each element is 64bit = 8 bytes   

dtype('float64')

In [395]:
print(f'Size of A: {A.nbytes}, size of its sparse representation: {B.nnz*8}')

Size of A: 80000, size of its sparse representation: 800
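
The estimate above counts only the stored values. A sparse matrix also has to store the positions of its non-zero elements, so the real footprint is somewhat larger. The sketch below assumes the CSR format (`scipy.sparse.csr_matrix`) instead of DOK, because its internal arrays are easy to inspect.

```python
import numpy as np
from scipy import sparse

A = np.eye(100)
S = sparse.csr_matrix(A)    # CSR is one of several sparse formats

# CSR keeps three arrays: non-zero values, their column indexes, row pointers
sparse_bytes = S.data.nbytes + S.indices.nbytes + S.indptr.nbytes

print(S.nnz)                      # 100 stored non-zero elements
print(sparse_bytes < A.nbytes)    # True
```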


# Special values

### inf, -inf  
Stand for +infinity, -infinity

In [396]:
a = float64(1)/float64(0)
a

inf

In [397]:
b = float64(-1)/float64(0)
b

-inf

In [182]:
isinf(a), isinf(b)  # check infinity

(True, True)

In [398]:
isfinite(a), isfinite(b)

(False, False)

### nan 
Stands for "not a number"

In [399]:
c = float64(0)/float64(0)
c

nan

In [400]:
isnan(c)  # check if nan

True

In [186]:
~isinf(c), isfinite(c)

(True, False)
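
Two properties of nan are easy to overlook: it is not equal to itself, and it propagates through arithmetic. The sketch below (explicit `np` import assumed) also shows the nan-aware reductions `np.nansum` and `np.nanmean`.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

print(np.nan == np.nan)    # False, nan is not equal to anything, itself included
print(np.isnan(x))         # [False  True False]
print(x.sum())             # nan, nan propagates through arithmetic
print(np.nansum(x))        # 4.0, nan entries are skipped
print(np.nanmean(x))       # 2.0
```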

Learn more:
* [Tutorial on numpy/scipy](http://www.scipy-lectures.org/)
* [Function optimization in scipy](http://www.scipy-lectures.org/advanced/mathematical_optimization/index.html)