# Basic Usage

In [1]:
import numpy as np

M = 5
N = 6

## 1. Initialize array

### 1.1 Convert a Python list into np.ndarray

In [2]:
# create a 1-dimensional array from a Python list:

arr = np.array([e for e in range(M)], dtype='i4')
print(arr); print('shape: {} ({} dimension)\ndata type: {}'.format(arr.shape, arr.ndim, arr.dtype))
arr.shape

[0 1 2 3 4]
shape: (5,) (1 dimension)
data type: int32


(5,)

### 1.2 Using np.arange() and np.linspace()

Akin to `range` in Python, `arange` in NumPy returns a sequence of numbers as an ndarray. Use the `dtype` parameter to set the element type, or use the `astype()` method to cast an existing array to another type.

In [3]:
# create a 5-element vector with a step of 0.2

arr = np.arange(0., 1., .2)
print(arr); print('shape: {} ({} dimension)'.format(arr.shape[0], arr.ndim))


[ 0.   0.2  0.4  0.6  0.8]
shape: 5 (1 dimension)


However, due to the finite precision of floating-point numbers, it is better to use `linspace` when creating a sequence of floating-point numbers: it takes the number of elements we want instead of the step.

In [4]:
# create a 3-element vector evenly spaced over the half-open interval [0, 1)

arr = np.linspace(0., 1., 3, endpoint=False)
print(arr); print('shape: {} ({} dimension)'.format(arr.shape[0], arr.ndim))


[ 0.          0.33333333  0.66666667]
shape: 3 (1 dimension)
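
The precision issue mentioned above can be seen in a small sketch. The endpoints 1.0 and 1.3 are arbitrary values chosen for illustration. With a floating-point step, the number of elements `arange` returns depends on rounding, whereas `linspace` always returns exactly the requested number.

```python
import numpy as np

# With a float step, arange derives its length from ceil((stop - start) / step),
# so rounding error can change the number of elements returned.
by_step = np.arange(1.0, 1.3, 0.1)

# linspace takes the number of samples directly, so the length is exact.
by_count = np.linspace(1.0, 1.3, 3, endpoint=False)

print(len(by_step), by_step)
print(len(by_count), by_count)
```

Comparing the two printed lengths on your own machine shows whether `arange` included an extra element close to the stop value.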


### 1.3 Other methods of initializing an array

In [5]:
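# create 2 × 2 matrices filled with 0s, 1s or any other number
# (np.empty leaves the memory uninitialized, so its contents are arbitrary; they happen to be 0s below)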

arr = np.zeros((2, 2))
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1])); print('data type: {}'.format(arr.dtype), end='\n\n')

arr = np.ones((2, 2))
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1])); print('data type: {}'.format(arr.dtype), end='\n\n')

arr = np.empty((2, 2))
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1])); print('data type: {}'.format(arr.dtype), end='\n\n')

arr = np.full((2, 2), 666)
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1])); print('data type: {}'.format(arr.dtype), end='\n\n')


[[ 0.  0.]
 [ 0.  0.]]
shape: 2 × 2
data type: float64

[[ 1.  1.]
 [ 1.  1.]]
shape: 2 × 2
data type: float64

[[ 0.  0.]
 [ 0.  0.]]
shape: 2 × 2
data type: float64

[[666 666]
 [666 666]]
shape: 2 × 2
data type: int32



Besides `arr.dtype` and `arr.shape`, there are other useful attributes:
* `arr.ndim`: the number of dimensions (axes) of the ndarray, e.g. 1 for a vector and 2 for a matrix.
* `arr.size`: the total number of elements in the ndarray.

**Note**: The default data type is `float64` for floating-point numbers. The default integer type is platform-dependent: it is `int32` in the output above, but `int64` on most 64-bit Linux and macOS systems.
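
As a quick illustration of these attributes, here is a minimal sketch. The array shapes and values are arbitrary, and the integer dtype is printed rather than assumed because it varies by platform.

```python
import numpy as np

arr = np.zeros((2, 3))
print(arr.ndim)    # number of axes: 2
print(arr.size)    # total number of elements: 6
print(arr.dtype)   # float64, the default for floating-point data

# The default integer type is platform-dependent, so print it instead of assuming.
ints = np.array([1, 2, 3])
print(ints.dtype)

# astype() returns a converted copy; the original array keeps its dtype.
as_float32 = ints.astype(np.float32)
print(as_float32.dtype, ints.dtype)
```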

In [6]:
# create a 3 × 3 identity matrix

arr = np.eye(3)
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1]), end='\n\n')


[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
shape: 3 × 3



In [7]:
# create a matrix of random floats drawn from the half-open interval [0.0, 1.0):

np.random.random((3, 2))

array([[ 0.13753289,  0.84871315],
       [ 0.84389689,  0.5721049 ],
       [ 0.23855836,  0.25691341]])

To find out more about **array creation**, see the [official NumPy documentation on *array creation*](<http://docs.scipy.org/doc/numpy/user/basics.creation.html#arrays-creation>).

## 2. Reshape an ndarray

In [8]:
# 2-dimensional array:

arr = np.array([ [e+r for e in range(N)] for r in range(0, M*N, N)])
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1]), end='\n\n')

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]]
shape: 5 × 6



The `arr.reshape()` method returns a new `ndarray` object with the requested shape (a view of the same data when possible) and leaves the shape of the original unchanged.

In [9]:
# reshape arr to 2 × 15 and save it to a new_arr

new_arr = arr.reshape(2, 15)
print(new_arr); print('shape: {} × {}'.format(new_arr.shape[0], new_arr.shape[1]), end='\n\n')

[[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
 [15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]]
shape: 2 × 15
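
The behaviour of `reshape` can be checked with a short sketch. The array `a` below is a stand-in built with `np.arange` for illustration; it has the same values as `arr` above.

```python
import numpy as np

a = np.arange(30).reshape(5, 6)

# -1 lets NumPy infer that dimension from the total size (30 / 2 = 15).
b = a.reshape(2, -1)
print(b.shape)                 # (2, 15)
print(a.shape)                 # (5, 6): the original shape is unchanged

# For a contiguous array, reshape returns a view, so the data buffer is shared.
print(np.shares_memory(a, b))  # True
b[0, 0] = 100
print(a[0, 0])                 # 100, because a and b share the same data
```

Call `.copy()` on the result if an independent array is needed.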



To change the shape of the original `ndarray` in place, assign to `arr.shape` directly:

In [10]:
# reshape the original arr to 6 × 5:

arr.shape = (N, M)
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1]), end='\n\n')

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]]
shape: 6 × 5



In [11]:
# specify the minimum number of dimensions of the array:

arr = np.array([e for e in range(N)], ndmin=3)
print(arr); print('shape: {} × {} × {}'.format(arr.shape[0], arr.shape[1], arr.shape[2]))

[[[0 1 2 3 4 5]]]
shape: 1 × 1 × 6


## 3. Data type object

In [12]:
# specify dtype as complex number:

arr = np.array([e for e in range(N)], dtype=complex)
print(arr)

[ 0.+0.j  1.+0.j  2.+0.j  3.+0.j  4.+0.j  5.+0.j]


In addition, numpy supports a great variety of numerical types:

`bool_` (boolean type stored as 1 byte)

`int_`  (long type in C)

`intc` (int type in C)

`intp`  (ssize_t in C)

`int8`  (int8_t in C)

`int16` (int16_t in C)

`int32` (int32_t in C)

`int64` (int64_t in C)

`uint16` (uint16_t in C)

...

`uint64` (uint64_t in C)

`float16` (half precision float: 1 sign bit, 5-bit exponent, 10-bit mantissa)

`float32` (single precision float: 1 sign bit, 8-bit exponent, 23-bit mantissa)

`float64` (double precision float: 1 sign bit, 11-bit exponent, 52-bit mantissa)

`float_` (alias of float64; removed in NumPy 2.0)

`complex64`  (complex number, represented by two 32-bit floats)

`complex128` (complex number, represented by two 64-bit floats)

In [13]:
# array scalar type:

dt = np.dtype(np.int32)
dt

dtype('int32')

In [14]:
# "int8, int16, int32, int64" can be replaced 
# by equivalent string 'i1', 'i2', 'i4', 'i8', etc. number means #bytes

dt0 = np.dtype('b')
dt1 = np.dtype('i1'); dt2 = np.dtype('i2'); dt3 = np.dtype('i4'); dt4 = np.dtype('i8') # integer
dt5 = np.dtype('f2'); dt6 = np.dtype('f4')    # float number
dt7 = np.dtype('c8'); dt8 = np.dtype('c16')   # complex number
dt9 = np.dtype('u1'); dt10= np.dtype('U')     # unsigned integer / unicode
dt11= np.dtype('a10');dt12= np.dtype('S20')   # fix-sized byte string
dt13= np.dtype('m');  dt14= np.dtype('M')     # timedelta / datetime
dt15= np.dtype('O');  dt16= np.dtype('V')     # python object / void
dt0, dt1, dt2, dt3, dt4, dt5, dt6, dt7, dt8, dt9, dt10, dt11, dt12, dt13, dt14, dt15, dt16

(dtype('int8'),
 dtype('int8'),
 dtype('int16'),
 dtype('int32'),
 dtype('int64'),
 dtype('float16'),
 dtype('float32'),
 dtype('complex64'),
 dtype('complex128'),
 dtype('uint8'),
 dtype('<U'),
 dtype('S10'),
 dtype('S20'),
 dtype('<m8'),
 dtype('<M8'),
 dtype('O'),
 dtype('V'))

In [15]:
# '>' means big-endian; '<' means little-endian:

dt = np.dtype('>i4')
dt

dtype('>i4')
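
To make the byte-order prefix concrete, the following sketch stores the same value with both byte orders and inspects the raw bytes. The variable names are chosen for illustration only.

```python
import numpy as np

big = np.array([1], dtype='>i4')     # big-endian 32-bit integer
little = np.array([1], dtype='<i4')  # little-endian 32-bit integer

# Same value, different byte layout in memory.
print(big.tobytes())     # b'\x00\x00\x00\x01'
print(little.tobytes())  # b'\x01\x00\x00\x00'

# The numeric value and the item size are identical.
print(big[0] == little[0], big.dtype.itemsize)
```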

In [16]:
# define a structured data type with a single field "age" of type int8:

dt_age = np.dtype([('age', np.int8)])
dt_age

dtype([('age', 'i1')])

In [17]:
dt_age = np.dtype([('age', np.int8)])
arr = np.array([(10,), (20,), (30,)], dtype = dt_age)
arr

array([(10,), (20,), (30,)], 
      dtype=[('age', 'i1')])

In [18]:
# the field name can be used to access the content of the age column

dt_age = np.dtype([('age',np.int8)]) 
arr = np.array([(10,),(20,),(30,)], dtype = dt_age) 
arr['age']

array([10, 20, 30], dtype=int8)

In [19]:
studentType = np.dtype([('name', 'S20'), ('age', 'i1'), ('marks', 'f4')])
students = np.array([('Lex', 24, 78.0), ('Scarlet', 22, 80.5)], dtype = studentType)
print(students, '\n')
for i in range(len(students)):
    print('ID:\t{}'.format(i))
    print('Names:\t{}'.format(students['name'][i].decode('utf-8')))
    print('Ages:\t{}\n'.format(students['age'][i]))

[(b'Lex', 24,  78. ) (b'Scarlet', 22,  80.5)] 

ID:	0
Names:	Lex
Ages:	24

ID:	1
Names:	Scarlet
Ages:	22
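
Field access also combines with the usual array operations. The sketch below reuses the same two records; the threshold of 22 and the choice of sort key are arbitrary and only serve as an illustration.

```python
import numpy as np

student_type = np.dtype([('name', 'S20'), ('age', 'i1'), ('marks', 'f4')])
students = np.array([('Lex', 24, 78.0), ('Scarlet', 22, 80.5)], dtype=student_type)

# A boolean mask built from one field selects whole records.
older = students[students['age'] > 22]
print(older['name'])             # [b'Lex']

# Sort records by a field; np.sort returns a sorted copy.
by_marks = np.sort(students, order='marks')
print(by_marks['name'])          # [b'Lex' b'Scarlet']

# A field behaves like an ordinary array of its own dtype.
print(students['marks'].mean())  # 79.25
```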



To find out more about **data types** in numpy, see the [official documentation for numpy *dtype*](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).

## Array Indexing

In [20]:
# make a 2D array:

arr = np.array([[e+r for e in range(N)] for r in range(0, M*N, N)])
print(arr); print('shape: {} × {}'.format(arr.shape[0], arr.shape[1]), end='\n\n')

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]]
shape: 5 × 6



In [21]:
# slicing:

print(arr[4:5, :]);  print('shape: {} × {}'.format(arr[4:5, :].shape[0], arr[4:5, :].shape[1]), end='\n\n')
print(arr[:, 5:6]);  print('shape: {} × {}'.format(arr[:, 5:6].shape[0], arr[:, 5:6].shape[1]), end='\n\n')
print(arr[:2, 0:5:2]); print('shape: {} × {}'.format(arr[:2, 0:5:2].shape[0], arr[:2, 0:5:2].shape[1]), end='\n\n')

[[24 25 26 27 28 29]]
shape: 1 × 6

[[ 5]
 [11]
 [17]
 [23]
 [29]]
shape: 5 × 1

[[ 0  2  4]
 [ 6  8 10]]
shape: 2 × 3



In [22]:
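# slicing with ellipsis ("..." stands for as many ":" as needed);
# compare the contents, since id() of temporary objects is not a reliable test:

print(np.array_equal(arr[..., 0], arr[:, 0]))
print(np.array_equal(arr[0, ...], arr[0, :]))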

True
True
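
The ellipsis is more useful with more than two axes. As a minimal sketch, using a hypothetical 3-dimensional array `cube` that is not part of the notebook above:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

# The ellipsis expands to as many full slices as needed.
print(np.array_equal(cube[..., 0], cube[:, :, 0]))  # True
print(np.array_equal(cube[0, ...], cube[0, :, :]))  # True
print(cube[..., 0].shape, cube[0, ...].shape)       # (2, 3) (3, 4)
```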


In [23]:
# add axis

print(arr[:, np.newaxis, :].shape)
print(arr[np.newaxis, :, :].shape)
print(arr[:, :, np.newaxis].shape)

(5, 1, 6)
(1, 5, 6)
(5, 6, 1)


In [24]:
# integer array indexing:

print('arr[3, 3] == arr[3][3] ?', 
      arr[3, 3] == arr[3][3])

print('[ arr[0, 0]  arr[4][4]  arr[2][3] ]', 
      '==', 
      arr[[0, 4, 2], [0, 4, 3]])

arr[3, 3] == arr[3][3] ? True
[ arr[0, 0]  arr[4][4]  arr[2][3] ] == [ 0 28 15]


In [25]:
# mixing of slice indexing and integer indexing:

# for instance, we need all elements in row 4:
#   there are 2 different ways (but with slightly different outcome):

#   1. slicing purely:
print('Rank 2 view:', arr[4:5, :]); print('shape: {}'.format( arr[4:5, :].shape ), end='\n\n')

#   2. integer indexing (with lower rank):
print('Rank 1 view:', arr[4  , :]); print('shape: {}'.format( arr[4  , :].shape ), end='\n\n')

Rank 2 view: [[24 25 26 27 28 29]]
shape: (1, 6)

Rank 1 view: [24 25 26 27 28 29]
shape: (6,)
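
Both results above are views into `arr`. Indexing with an integer list, by contrast, returns a copy. The sketch below illustrates the difference on a fresh array with the same values; the names `row_view` and `row_copy` are illustrative.

```python
import numpy as np

arr = np.arange(30).reshape(5, 6)

row_view = arr[4, :]    # basic indexing: a view into arr
row_copy = arr[[4], :]  # integer-array indexing: an independent copy

row_view[0] = -1        # writes through to arr
row_copy[0, 1] = -2     # does not affect arr

print(arr[4, :2])                       # [-1 25]
print(np.shares_memory(arr, row_view))  # True
print(np.shares_memory(arr, row_copy))  # False
```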



In [26]:
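# one trick of using integer indexing:

# note: index_x and index_y are two names for the SAME array,
#   so shuffling index_y shuffles index_x as well
index_x = index_y = np.arange(min(M, N))
np.random.shuffle(index_y)

# this picks arr[i, i] for each shuffled index i, i.e. the diagonal elements in random order
#   (create the two index arrays separately to pick a random column per row instead):
arr[index_y, index_x]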

array([28, 21, 14,  0,  7])

In [27]:
# boolean indexing:

bool_arr = (arr > 5)
print(bool_arr)

# indexing with the boolean array returns a rank-1 array
# containing the elements at the positions where bool_arr is True:

print(arr[bool_arr]); del bool_arr


# in practice, just type:
print(arr[arr > 5])

[[False False False False False False]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]
 [ True  True  True  True  True  True]]
[ 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]
[ 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]


To find out more about numpy **indexing**, see [official document for numpy indexing](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

## Array math

In [28]:
# create matrices:

A = np.array([[e+r for e in range(N)] for r in range(0, M*N, N)])
B = np.ones((M, N), dtype='i4')
C = np.eye(M, N, dtype='i4')

X = np.array([2 for _ in range(N)], ndmin=2)
Y = np.array([e for e in range(N)], ndmin=2)

print('A =\n{}'.format(A), end='\n\n')
print('B =\n{}'.format(B), end='\n\n')
print('C =\n{}'.format(C), end='\n\n')

print('X = {}'.format(X), end='\n\n')
print('Y = {}'.format(Y), end='\n\n')

A =
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]]

B =
[[1 1 1 1 1 1]
 [1 1 1 1 1 1]
 [1 1 1 1 1 1]
 [1 1 1 1 1 1]
 [1 1 1 1 1 1]]

C =
[[1 0 0 0 0 0]
 [0 1 0 0 0 0]
 [0 0 1 0 0 0]
 [0 0 0 1 0 0]
 [0 0 0 0 1 0]]

X = [[2 2 2 2 2 2]]

Y = [[0 1 2 3 4 5]]



In [29]:
# elementwise sum:

print('A + C =\n{}'.format(A + C), end='\n\n')
print('B + C =\n{}'.format(np.add(B, C)), end='\n\n')

A + C =
[[ 1  1  2  3  4  5]
 [ 6  8  8  9 10 11]
 [12 13 15 15 16 17]
 [18 19 20 22 22 23]
 [24 25 26 27 29 29]]

B + C =
[[2 1 1 1 1 1]
 [1 2 1 1 1 1]
 [1 1 2 1 1 1]
 [1 1 1 2 1 1]
 [1 1 1 1 2 1]]



In [30]:
# elementwise difference:

print('A - C =\n{}'.format(A - C), end='\n\n')
print('C - B =\n{}'.format(np.subtract(C, B)), end='\n\n')

A - C =
[[-1  1  2  3  4  5]
 [ 6  6  8  9 10 11]
 [12 13 13 15 16 17]
 [18 19 20 20 22 23]
 [24 25 26 27 27 29]]

C - B =
[[ 0 -1 -1 -1 -1 -1]
 [-1  0 -1 -1 -1 -1]
 [-1 -1  0 -1 -1 -1]
 [-1 -1 -1  0 -1 -1]
 [-1 -1 -1 -1  0 -1]]



In [31]:
# elementwise product:

print('A * C =\n{}\n'.format(A * C))
print('B * C =\n{}\n'.format(np.multiply(B, C)))

A * C =
[[ 0  0  0  0  0  0]
 [ 0  7  0  0  0  0]
 [ 0  0 14  0  0  0]
 [ 0  0  0 21  0  0]
 [ 0  0  0  0 28  0]]

B * C =
[[1 0 0 0 0 0]
 [0 1 0 0 0 0]
 [0 0 1 0 0 0]
 [0 0 0 1 0 0]
 [0 0 0 0 1 0]]



In [32]:
# elementwise division (integer):

print('A // B =\n{}\n'.format(A // B))
print('C // B =\n{}\n'.format(np.floor_divide(C, B)))


A // B =
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]]

C // B =
[[1 0 0 0 0 0]
 [0 1 0 0 0 0]
 [0 0 1 0 0 0]
 [0 0 0 1 0 0]
 [0 0 0 0 1 0]]



In [33]:
# inner (dot) product of vectors by arr.dot(another_arr):

print('X · Y = {}'.format(X.dot(Y.transpose())))

# or by np.dot(arr, another_arr):

print('X · Y = {}'.format(np.dot(X, Y.T)))


X · Y = [[30]]
X · Y = [[30]]


In [34]:
# Note: to multiply arrays with more than 2 dimensions,
# use np.matmul(arr, another_arr) instead:
# (or you can always use np.matmul())

print('X · Y = {}'.format(np.matmul(X, Y.T)))

# and there's a syntactic sugar:

print('X · Y = {}'.format(X @ Y.T))


X · Y = [[30]]
X · Y = [[30]]
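
The difference between `np.matmul` and `np.dot` only shows up with more than two dimensions. A minimal sketch, using arbitrary shapes chosen for illustration:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats the leading axis as a batch: two (3, 4) @ (4, 5) products.
print(np.matmul(a, b).shape)  # (2, 3, 5)

# dot instead combines the last axis of a with the second-to-last axis of b
# for every pair of leading indices.
print(np.dot(a, b).shape)     # (2, 3, 2, 5)
```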


In [35]:
# matrix / vector product:

print(u'A × Xᵀ =\n{}'.format(A @ X.T), end='\n\n')

A × Xᵀ =
[[ 30]
 [102]
 [174]
 [246]
 [318]]



In [36]:
# matrix / matrix product:

print('A × Bᵀ =\n{}'.format(A @ B.T), end='\n\n')
print('A × Cᵀ =\n{}'.format(A @ C.T), end='\n\n')

A × Bᵀ =
[[ 15  15  15  15  15]
 [ 51  51  51  51  51]
 [ 87  87  87  87  87]
 [123 123 123 123 123]
 [159 159 159 159 159]]

A × Cᵀ =
[[ 0  1  2  3  4]
 [ 6  7  8  9 10]
 [12 13 14 15 16]
 [18 19 20 21 22]
 [24 25 26 27 28]]



In [37]:
# sum of elements:

print('Sum of all elements in A is {}'.format(A.sum()))

Sum of all elements in A is 435


In [38]:
# sum of each column & row:

print('Sums of all elements in each column in A are {}'.format(A.sum(axis=0)))
print('Sums of all elements in each row    in A are {}'.format(A.sum(axis=1)))

Sums of all elements in each column in A are [60 65 70 75 80 85]
Sums of all elements in each row    in A are [ 15  51  87 123 159]
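
The `axis` argument names the axis that is collapsed. As a hedged sketch on a stand-in for `A` with the same values, `keepdims=True` keeps the collapsed axis with length 1, which is convenient when the result has to be combined with the original array again:

```python
import numpy as np

A = np.arange(30).reshape(5, 6)

col_sums = A.sum(axis=0)                    # collapses the rows: shape (6,)
row_sums = A.sum(axis=1)                    # collapses the columns: shape (5,)
row_sums_2d = A.sum(axis=1, keepdims=True)  # keeps the axis: shape (5, 1)
print(col_sums.shape, row_sums.shape, row_sums_2d.shape)

# With keepdims, the per-row mean lines up with the rows of A,
# so each row can be centered around its own mean.
centered = A - A.mean(axis=1, keepdims=True)
print(centered.sum(axis=1))
```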


In [39]:
# norms of a vector (Y is 2-D, so flatten it first; otherwise ord=1 and ord=np.inf give matrix norms):

y = Y.ravel()
np.linalg.norm(y), np.linalg.norm(y, ord=1), np.linalg.norm(y, ord=np.inf)

(7.416198487095663, 15.0, 5.0)
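
For a 1-D input, `np.linalg.norm` computes vector norms, which can be written out by hand. For a 2-D input, `ord=1` and `ord=np.inf` are matrix norms instead (maximum absolute column sum and maximum absolute row sum). The sketch below checks the vector definitions on a 1-D array `y`, a name introduced here for illustration.

```python
import numpy as np

y = np.arange(6)               # 1-D vector [0 1 2 3 4 5]

l2 = np.sqrt((y ** 2).sum())   # Euclidean norm
l1 = np.abs(y).sum()           # sum of absolute values
linf = np.abs(y).max()         # largest absolute value

print(np.isclose(l2, np.linalg.norm(y)))      # True
print(l1 == np.linalg.norm(y, ord=1))         # True
print(linf == np.linalg.norm(y, ord=np.inf))  # True
```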

In [40]:
# inverse of a matrix:

np.linalg.inv(np.random.rand(4, 4))

array([[ 0.67212657,  1.80743355, -2.05872402,  1.68952964],
       [ 0.02611449, -1.57194381,  4.70404612, -5.72377589],
       [ 3.24066188, -1.46651728, -2.89140064,  4.46079966],
       [-2.43978822,  1.05362254,  1.98158063, -1.59098871]])
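
A computed inverse can be sanity-checked by multiplying it with the original matrix. This is a sketch under one assumption that is not in the cell above: `4 * np.eye(4)` is added to the random matrix so that it is strictly diagonally dominant and therefore guaranteed to be invertible.

```python
import numpy as np

# A strictly diagonally dominant matrix is guaranteed to be invertible.
mat = np.random.rand(4, 4) + 4 * np.eye(4)
inv = np.linalg.inv(mat)

# The product with the inverse should be the identity up to rounding error.
print(np.allclose(mat @ inv, np.eye(4)))  # True

# To solve mat @ x = b, solve() is usually preferred over forming the inverse.
b = np.ones(4)
x = np.linalg.solve(mat, b)
print(np.allclose(mat @ x, b))            # True
```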

For more about manipulating arrays, see the [official documentation for mathematical functions in numpy](https://docs.scipy.org/doc/numpy/reference/routines.math.html).

## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

In [41]:
# extend vector Y to M rows:

print('Y_ext =\n{}'.format( 
    np.tile(Y, (M, 1)) 
                            ), end='\n\n')

# then, we can add extended vector Y and matrix A elementwise:

print('Y_ext + A =\n{}'.format( 
    np.tile(Y, (M, 1)) + A 
                            ), end='\n\n')

Y_ext =
[[0 1 2 3 4 5]
 [0 1 2 3 4 5]
 [0 1 2 3 4 5]
 [0 1 2 3 4 5]
 [0 1 2 3 4 5]]

Y_ext + A =
[[ 0  2  4  6  8 10]
 [ 6  8 10 12 14 16]
 [12 14 16 18 20 22]
 [18 20 22 24 26 28]
 [24 26 28 30 32 34]]



In [42]:
# alternatively, numpy allows us to do the addition 
# between vector and matrix elementwise
# without actually extending the vector:

print('Y + A =\n{}'.format(
    Y + A
                        ), end='\n\n')

Y + A =
[[ 0  2  4  6  8 10]
 [ 6  8 10 12 14 16]
 [12 14 16 18 20 22]
 [18 20 22 24 26 28]
 [24 26 28 30 32 34]]
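
Whether two shapes can be broadcast together follows a simple rule, sketched below with zero-filled placeholder arrays (`row` and `col` are illustrative names): shapes are compared from the last axis backwards, and two lengths are compatible when they are equal or one of them is 1.

```python
import numpy as np

A = np.zeros((5, 6))
row = np.zeros((1, 6))
col = np.zeros((5, 1))

# An axis of length 1 is stretched to match the other operand.
print((A + row).shape)    # (5, 6)
print((A + col).shape)    # (5, 6)
print((col + row).shape)  # (5, 6)

# Lengths that differ and are not 1 cannot be broadcast:
# the last axis of A has length 6, the vector has length 5.
try:
    A + np.zeros(5)
except ValueError as err:
    print('cannot broadcast:', err)
```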



### Some applications of broadcasting

In [43]:
# compute outer product of vectors X and Y:

print('X × Y =\n{}'.format(
    X.reshape((6, 1)) * Y
                        ), end='\n\n')

X × Y =
[[ 0  2  4  6  8 10]
 [ 0  2  4  6  8 10]
 [ 0  2  4  6  8 10]
 [ 0  2  4  6  8 10]
 [ 0  2  4  6  8 10]
 [ 0  2  4  6  8 10]]



In [44]:
# add a vector Y to each row of a matrix A:

print('Y + A =\n{}'.format(
    Y + A
                        ), end='\n\n')


Y + A =
[[ 0  2  4  6  8 10]
 [ 6  8 10 12 14 16]
 [12 14 16 18 20 22]
 [18 20 22 24 26 28]
 [24 26 28 30 32 34]]



In [45]:
# add a vector Y[:5] to each column of a matrix A:

print('Yᵀ + A =\n{}'.format(
    (Y[0, :5] + A.T).T
                        ), end='\n\n')

# or:

print('Yᵀ + A =\n{}'.format(
    (Y.T[:5] + A)
                        ), end='\n\n')

Yᵀ + A =
[[ 0  1  2  3  4  5]
 [ 7  8  9 10 11 12]
 [14 15 16 17 18 19]
 [21 22 23 24 25 26]
 [28 29 30 31 32 33]]

Yᵀ + A =
[[ 0  1  2  3  4  5]
 [ 7  8  9 10 11 12]
 [14 15 16 17 18 19]
 [21 22 23 24 25 26]
 [28 29 30 31 32 33]]



**Note:** broadcasting typically makes code **more concise and faster**, so use it whenever possible!

For more information about broadcasting, see [this document](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) and [this explanation](http://wiki.scipy.org/EricsBroadcastingDoc).

## End

This brief notebook has touched on many of the important things to know about numpy for deep learning. However, it is far from complete. It is often worth checking the [numpy reference](http://docs.scipy.org/doc/numpy/reference/) for other features we need.