# F.1 N-dimensional Arrays

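- The basic object is the *ndarray*, a multi-dimensional array.
- An *ndarray* can have up to 32 dimensions (NumPy 2.0 and later raise this limit to 64).
- A 1-D ndarray is a vector, while a 2-D ndarray can be thought of as a matrix.
- All elements are homogeneous, i.e. they share a single data type.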
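The homogeneity rule can be seen directly when a list mixes integers and floats. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

# A list of integers yields an integer array
ints = np.array([1, 2, 3])

# Mixing integers and floats upcasts every element to a common float type
mixed = np.array([1, 2.5, 3])

print(ints.dtype.kind)  # 'i' (some integer type; the exact width is platform dependent)
print(mixed.dtype)      # float64
print(mixed)            # [1.  2.5 3. ]
```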

In [1]:
import numpy as np
lst = [[1,2,3],[4,5,6]] #a list of lists
myArr1 = np.array(lst) #create an ndarray

In [2]:
myArr1

array([[1, 2, 3],
       [4, 5, 6]])

- NumPy infers the type upon construction. In our case it's int64, the default integer type on most 64-bit platforms

In [3]:
myArr1.dtype

dtype('int64')

- We can specify the type when defining an array using the *dtype* parameter
- Once an array is defined, we can recast/downcast it using astype()

In [4]:
myArr2 = np.array(lst,dtype = np.float32)

In [5]:
myArr2

array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)

In [6]:
myArr2.astype(np.int64) #recast myArr2

array([[1, 2, 3],
       [4, 5, 6]])

In [7]:
myArr2 # astype() returns a new array, so myArr2 itself is unchanged; reassign the result to myArr2 to keep the change

array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)
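As a small sketch of the point above, the cell below (with made-up values) shows that `astype()` leaves the source array alone and that a float-to-integer cast drops the fractional part.

```python
import numpy as np

float_arr = np.array([[1.7, 2.2, 3.9], [4.5, 5.1, 6.8]], dtype=np.float32)

# astype() builds a new array; float_arr keeps its dtype and values
int_arr = float_arr.astype(np.int64)

print(float_arr.dtype)  # float32
print(int_arr.dtype)    # int64
print(int_arr)          # fractional parts are truncated, not rounded
```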

- NumPy's ndarray has many handy attributes

In [8]:
# 'itemsize' returns size of each element in bytes
myArr1.itemsize

8

In [9]:
# 'size' attribute to return number of elements
myArr1.size

6

In [10]:
#'ndim' attribute to return number of dimensions. Dimension is like rank of a tensor
myArr1.ndim

2

In [11]:
# we can obtain the number of elements along each dimension/axis using 'shape'
myArr1.shape # returns a tuple: 2 rows along axis 0 and 3 columns along axis 1

(2, 3)

In [12]:
np.array([1,2,3]).shape #one dimensional array 
#also note that the trailing ',' tells Python that it's a one-element tuple

(3,)

- Passing a single scalar generates a zero-dimensional array

In [14]:
scalar = np.array(5)
print(scalar.ndim)
print(scalar.shape)

0
()
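The attributes shown in this section are related to one another. As a minimal sketch, the total memory used by the elements is `size * itemsize`, which NumPy also exposes as `nbytes`.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# 6 elements of 8 bytes each
total_bytes = arr.size * arr.itemsize

print(total_bytes)  # 48
print(arr.nbytes)   # 48, the same quantity computed by NumPy
```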


# F.2 Array Construction Routines

- *np.array()* cannot parse generators. Use *fromiter* to build an array from a generator

In [18]:
def generator():
    for i in range(10):
        if i%2 != 0:
            yield i
            
gen = generator()
np.fromiter(gen,dtype = np.int64) #necessary to provide 'dtype' parameter

array([1, 3, 5, 7, 9])

In [20]:
generator_expression = (i for i in range(10) if i%2 != 0) # a generator expression (uses () instead of []) equivalent to the function above
np.fromiter(generator_expression,dtype = np.int64)

array([1, 3, 5, 7, 9])
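To see why `fromiter` is needed, the sketch below compares it with `np.array()`, which does not consume the generator and instead wraps the generator object itself. The optional `count` argument of `fromiter` is used here on the assumption that the number of items is known in advance.

```python
import numpy as np

# np.array() stores the generator object in a 0-d object array
wrapped = np.array(i * i for i in range(5))
print(wrapped.ndim)   # 0
print(wrapped.dtype)  # object

# fromiter() consumes the generator; count tells NumPy how many items to read
squares = np.fromiter((i * i for i in range(5)), dtype=np.int64, count=5)
print(squares)        # [ 0  1  4  9 16]
```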

- NumPy provides the functions *zeros()* and *ones()* to create ndarrays filled with 0. and 1. for a given shape
- These are useful as placeholders.
- We can also use *empty()*, which allocates an array without initializing it, so it holds arbitrary values until we fill it.
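A minimal sketch of the placeholder pattern with `empty()`: allocate first, then overwrite every slot before reading anything back. The array name and the fill rule are illustrative only.

```python
import numpy as np

# empty() allocates memory without initializing it, so the contents are arbitrary
squares = np.empty(5, dtype=np.int64)

# Fill every slot before reading from the array
for i in range(squares.size):
    squares[i] = i ** 2

print(squares)  # [ 0  1  4  9 16]
```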

In [23]:
np.zeros((3,2,3)) # a stack of three 2x3 matrices

array([[[0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.]]])

In [24]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

- NumPy also provides the functions *diag()* and *eye()* to build diagonal and identity matrices, which are handy in linear algebra

In [25]:
np.diag([1,2,3,4])

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

In [26]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

- NumPy comes with useful functions like *arange()* and *linspace()* to generate sequences of numbers

In [27]:
np.arange(1.,11.,2) # [start, stop) and step
# notice the type inference: float arguments produce a float array

array([1., 3., 5., 7., 9.])

In [28]:
np.linspace(1.,10.,20) # create evenly spaced values in an interval

array([ 1.        ,  1.47368421,  1.94736842,  2.42105263,  2.89473684,
        3.36842105,  3.84210526,  4.31578947,  4.78947368,  5.26315789,
        5.73684211,  6.21052632,  6.68421053,  7.15789474,  7.63157895,
        8.10526316,  8.57894737,  9.05263158,  9.52631579, 10.        ])
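The two functions treat the upper bound differently, which is a common source of off-by-one surprises. The sketch below contrasts them on a made-up interval.

```python
import numpy as np

# arange(): the stop value is excluded, and we choose the step
stepped = np.arange(0, 10, 2)
print(stepped)       # [0 2 4 6 8]

# linspace(): the stop value is included by default, and we choose the count
spaced = np.linspace(0, 10, 5)
print(spaced)        # [ 0.   2.5  5.   7.5 10. ]

# endpoint=False makes linspace() exclude the stop value as well
spaced_open = np.linspace(0, 10, 5, endpoint=False)
print(spaced_open)   # [0. 2. 4. 6. 8.]
```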

# F.3 Array Indexing

- Simple NumPy indexing and slicing work similarly to Python *lists*

In [29]:
ary = np.array([1,2,3])
ary[0]

1

In [30]:
ary[:2] # fetches first two elements

array([1, 2])

In [31]:
ary = np.array([[1,2,3],[4,5,6],[7,8,9]])

- For multi-dimensional arrays, we specify the index/slice along each axis, separated by commas

In [33]:
print(ary[0,0]) #top left
print(ary[-1,-1]) #bottom right
print(ary[0,1]) #first row, second column

1
9
2


In [34]:
# entire first row
ary[0]

array([1, 2, 3])

In [35]:
# entire first column
ary[:,0]

array([1, 4, 7])

In [36]:
# all rows with first two columns
ary[:,:2]

array([[1, 2],
       [4, 5],
       [7, 8]])

In [37]:
# all rows, first and third columns
ary[:,[0,2]]

array([[1, 3],
       [4, 6],
       [7, 9]])
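Slices along each axis also accept a step and negative values, just like Python lists. A short sketch on the same 3x3 array:

```python
import numpy as np

ary = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# every second row (rows 0 and 2)
print(ary[::2])
# [[1 2 3]
#  [7 8 9]]

# all rows, columns in reverse order
print(ary[:, ::-1])
# [[3 2 1]
#  [6 5 4]
#  [9 8 7]]

# bottom-right 2x2 block
print(ary[1:, 1:])
# [[5 6]
#  [8 9]]
```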

# F.4 Array Math and Universal Functions

- NumPy provides vectorized wrappers for element-wise operations on arrays and other sequence-like objects through *ufuncs* (universal functions).
- There are more than 60 ufuncs.
- ufuncs are implemented in compiled C code and are therefore fast.
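As a minimal sketch of what "vectorized" means here, the cell below computes the same result with a pure-Python list comprehension and with the `np.add` ufunc. The ufunc version pushes the loop into compiled code.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Pure Python: an explicit loop over the elements
loop_result = [v + 1 for v in values]

# ufunc: one call, applied element-wise (the list is converted to an array)
ufunc_result = np.add(values, 1)

print(loop_result)                   # [2, 3, 4, 5, 6]
print(ufunc_result)                  # [2 3 4 5 6]
print(isinstance(np.add, np.ufunc))  # True
```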

In [40]:
# add scalar 1, element-wise
ary = np.array([[1,2,3],[4,5,6]])
ary

array([[1, 2, 3],
       [4, 5, 6]])

In [41]:
ary + 1 # notice how elegant this is compared to a pure-Python loop
#can also do np.add(ary,1)

array([[2, 3, 4],
       [5, 6, 7]])

In [42]:
# np.subtract(ary,1)
ary - 1

array([[0, 1, 2],
       [3, 4, 5]])

In [43]:
# np.multiply(ary,2)
ary*2

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [44]:
# np.divide(ary,2);
ary/2

array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ]])

In [45]:
# np.power(ary,2)
ary**2

array([[ 1,  4,  9],
       [16, 25, 36]])

- unary *ufuncs* perform operations on a single argument

In [46]:
np.log(ary)

array([[0.        , 0.69314718, 1.09861229],
       [1.38629436, 1.60943791, 1.79175947]])

In [47]:
np.log10(ary)

array([[0.        , 0.30103   , 0.47712125],
       [0.60205999, 0.69897   , 0.77815125]])

In [48]:
np.sqrt(ary)

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

- We can use a ufunc's *reduce()* method to compute the sum/product of elements along a particular axis. The default is axis 0

In [49]:
ary = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
ary

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [50]:
np.add.reduce(ary)

array([[ 8, 10, 12],
       [14, 16, 18]])

In [51]:
ary = np.array([[1,2,3],[4,5,6]])
np.add.reduce(ary,axis = 1) # row sums

array([ 6, 15])

- *np.sum()* and *np.prod()* (older releases also offered the alias *np.product()*) match np.add.reduce and np.multiply.reduce when given the same axis

In [52]:
np.multiply.reduce(ary)

array([ 4, 10, 18])

In [53]:
np.prod(ary,axis = 0) # np.product was an alias of np.prod and was removed in NumPy 2.0

array([ 4, 10, 18])

- *np.sum(ary...)* and *ary.sum()* are equivalent operations
- Unlike *reduce()*, which defaults to axis 0, *sum* and *prod* return the sum/product of the entire array if no axis is specified
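The effect of the `axis` argument is easiest to see side by side. A minimal sketch on a 2x3 array:

```python
import numpy as np

ary = np.array([[1, 2, 3], [4, 5, 6]])

total = ary.sum()              # no axis: sum over the entire array
col_sums = np.sum(ary, axis=0) # collapse axis 0 (rows), one value per column
row_sums = ary.sum(axis=1)     # collapse axis 1 (columns), one value per row
product = np.prod(ary)         # no axis: product over the entire array

print(total)     # 21
print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
print(product)   # 720
```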

- Some other useful functions and methods (strictly speaking, these are not ufuncs):-
    1. *mean*
    2. *std*
    3. *var*
    4. *np.sort*
    5. *np.argsort*
    6. *np.min*
    7. *np.max*
    8. *np.argmin*
    9. *np.argmax*
    10. *array_equal*
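A brief sketch of several of the functions listed above, applied to a small made-up array:

```python
import numpy as np

data = np.array([3, 1, 4, 2, 5])

print(data.mean())       # 3.0
print(data.var())        # 2.0 (population variance)
print(np.sort(data))     # [1 2 3 4 5], returns a sorted copy
print(np.argsort(data))  # [1 3 0 2 4], indices that would sort the array
print(np.argmin(data))   # 1, index of the smallest element
print(np.argmax(data))   # 4, index of the largest element

# Indexing with the argsort result reproduces the sorted array
same = np.array_equal(data[np.argsort(data)], np.sort(data))
print(same)              # True
```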

# F.5 Broadcasting

- NumPy creates implicit grids for element-wise operations when the shapes of two arrays don't match. This is called broadcasting.
- Broadcasting works only if the shapes are compatible: comparing them from the trailing axis backwards, each pair of dimensions must be equal or one of them must be 1 (a missing axis counts as 1).
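The compatibility rule can be written out in a few lines. The function `broadcast_shape` below is a hypothetical helper written for illustration, not part of NumPy; it is checked against the shape NumPy itself reports through `np.broadcast`.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape or raise ValueError."""
    result = []
    # Walk both shapes from the trailing axis backwards; a missing axis counts as 1
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A dimension of 1 stretches to match the other one
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))

print(broadcast_shape((2, 3), (3,)))    # (2, 3)
print(broadcast_shape((2, 4), (2, 1)))  # (2, 4)

# NumPy reports the same shape for real arrays
print(np.broadcast(np.empty((2, 3)), np.empty(3)).shape)  # (2, 3)
```

Calling `broadcast_shape((2, 4), (3,))` raises `ValueError`, because the trailing dimensions 4 and 3 differ and neither is 1.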


In [55]:
ary1 = np.array([[1,2,3],[4,5,6]])
ary2 = np.array([1,2,3])
ary1 + ary2

array([[2, 4, 6],
       [5, 7, 9]])

In [57]:
ary3 = np.array([[1,2,3,4],[5,6,7,8]])
ary3 + ary2 # raises ValueError: the trailing dimensions 4 and 3 don't match

ValueError: operands could not be broadcast together with shapes (2,4) (3,) 

In [60]:
ary3 + np.array([[1],[2]])

array([[ 2,  3,  4,  5],
       [ 7,  8,  9, 10]])

# F.6 Advanced Indexing: Memory Views and Copies

- Slicing and basic indexing of ndarrays create *views* to save memory. Think of a view as zooming in on a portion of the original array rather than copying it.

In [72]:
ary1 = np.array([[1,2,3],[4,5,6]])

In [73]:
ary1

array([[1, 2, 3],
       [4, 5, 6]])

In [75]:
b = ary1[0]
b += 99
ary1

array([[100, 101, 102],
       [  4,   5,   6]])

In [76]:
# even slicing creates views
center_col = ary1[:,1]
center_col += 99
ary1

array([[100, 200, 102],
       [  4, 104,   6]])

In [77]:
#notice this closely
ary1 = np.arange(5)
ary2 = ary1
ary2[0] = 99
ary1

array([99,  1,  2,  3,  4])

In [78]:
# plain assignment creates neither a view nor a copy: ary1 and ary2 are two names for the same array object
np.shares_memory(ary1,ary2)

True

In [79]:
# However, indexing a single element returns a NumPy scalar, which is a copy and not a view
ary1 = np.arange(5)
b = ary1[0]
np.shares_memory(b,ary1[0])

False

- To keep changes from propagating through a view (even though views are often useful), make an explicit copy:

In [80]:
ary2 = ary1.copy()
np.shares_memory(ary1,ary2)

False
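Besides `np.shares_memory`, the `base` attribute gives a quick way to tell views from copies: a view records the array it was sliced from, while an array that owns its data has `base` set to `None`. A minimal sketch:

```python
import numpy as np

ary = np.arange(6)
view = ary[1:4]          # basic slicing: a view
copied = ary[1:4].copy() # explicit copy: owns its own data

print(view.base is ary)     # True
print(copied.base is None)  # True

# Writing through the view changes the original; the copy is unaffected
view[:] = 0
print(ary)     # [0 0 0 0 4 5]
print(copied)  # [1 2 3]
```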

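- NumPy also supports *fancy* indexing.
- In *fancy* indexing, we can use a list or an integer array of (possibly non-contiguous) indexes to fetch values.
- Fancy indexing always returns a copy and not a view, since the selected elements generally cannot be described by a regular stride over the original memory.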


In [81]:
ary1 = np.array([[1,2,3],[4,5,6]])
ary1[:,[0,2]] #first and last column

array([[1, 3],
       [4, 6]])

In [83]:
this_is_a_copy = ary1[:,[0,2]]
this_is_a_copy += 99
ary1

array([[1, 2, 3],
       [4, 5, 6]])

In [84]:
ary1[:,[2,0]] #last and first column

array([[3, 1],
       [6, 4]])

- We can also create Boolean masks.
- These come under *fancy* indexing

In [87]:
greater3_mask = ary1 > 3
greater3_mask

array([[False, False, False],
       [ True,  True,  True]])

In [88]:
ary1[greater3_mask] # returns the elements where the mask is True, as a 1-D array

array([4, 5, 6])

In [91]:
# combine multiple conditions using the element-wise & and | operators (each condition must be parenthesized)
mask = (ary1 > 2) & (ary1%2 != 0)
mask

array([[False, False,  True],
       [False,  True, False]])

In [92]:
ary1[mask]

array([3, 5])

In [93]:
# we can also do this by
ary1[(ary1 > 2) & (ary1%2 != 0)]

array([3, 5])

# F.7 Comparison Operators and Masks

- Masks are useful to find the elements that match a given condition.
- They let us count the number of elements satisfying the condition.
- They let us compute the indexes of the elements satisfying the condition.

In [95]:
mask = ary1 > 3
mask

array([[False, False, False],
       [ True,  True,  True]])

In [96]:
ary1[mask] # select the elements satisfying the condition

array([4, 5, 6])

In [98]:
mask.sum() # count the elements satisfying the condition (each True counts as 1)

3

In [99]:
mask.nonzero() # indices that satisfy the condition: the first array holds positions along axis 0, the second along axis 1

(array([1, 1, 1]), array([0, 1, 2]))

In [100]:
(ary1>3).nonzero()

(array([1, 1, 1]), array([0, 1, 2]))

- The above is a two step method. Instead, we can simply use *np.where()*

In [101]:
np.where(ary1 > 3)

(array([1, 1, 1]), array([0, 1, 2]))

- We can also call np.where with three arguments: np.where(condition, x, y), which is interpreted as: if condition is True, yield x, otherwise yield y.

In [102]:
np.where(ary1 > 3,1,0)

array([[0, 0, 0],
       [1, 1, 1]])

- This basically assigns 1 to elements greater than 3 and 0 to all others, returning a new array
- We can also do this manually using a boolean mask, which modifies the array in place

In [104]:
mask = ary1 > 3
ary1[mask] = 1
ary1[~mask] = 0
ary1

array([[0, 0, 0],
       [1, 1, 1]])
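The `x` and `y` arguments of `np.where` need not be scalars; either can be an array, in which case values are picked element by element. The sketch below uses this to replace negative entries with 0 while keeping the rest, and shows that the input is left untouched, unlike the in-place mask assignment above.

```python
import numpy as np

readings = np.array([-2, 3, -1, 5])

# Where the condition holds take 0, otherwise take the original value
clipped = np.where(readings < 0, 0, readings)

print(clipped)   # [0 3 0 5]
print(readings)  # [-2  3 -1  5], np.where returned a new array
```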

- NumPy has the following element-wise operators, which act as logical operators on boolean masks and let us build complex masks:
    1. & or np.bitwise_and
    2. | or np.bitwise_or
    3. ^ or np.bitwise_xor
    4. ~ or np.bitwise_not
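A short sketch of the four operators applied to two boolean masks over a made-up 1-D array:

```python
import numpy as np

ary = np.array([1, 2, 3, 4, 5, 6])
is_even = ary % 2 == 0
is_large = ary > 3

print(ary[is_even & is_large])  # [4 6]      both conditions hold
print(ary[is_even | is_large])  # [2 4 5 6]  at least one holds
print(ary[is_even ^ is_large])  # [2 5]      exactly one holds
print(ary[~is_even])            # [1 3 5]    condition negated
```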

# F.8 Random Number Generation

- Random number generation is of practical importance in ML and Deep Learning
- The *np.random* module has several functions for random number generation and sampling

In [107]:
# seed pseudo-random no. generator
np.random.seed(123)
np.random.rand(3)

array([0.69646919, 0.28613933, 0.22685145])

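- However, in sequential as well as non-sequential execution environments such as the Jupyter Notebook, it is useful to create separate RandomState objects instead of seeding the global generator, so that results stay reproducible regardless of which cells ran before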

In [108]:
rng1 = np.random.RandomState(123)
rng1.rand(3)

array([0.69646919, 0.28613933, 0.22685145])

In [109]:
# Gaussian distribution
rng2 = np.random.RandomState(123)
rng2.randn(3,2)

array([[-1.0856306 ,  0.99734545],
       [ 0.2829785 , -1.50629471],
       [-0.57860025,  1.65143654]])
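The benefit of separate `RandomState` objects is that each one carries its own state. In the sketch below (the seed value is arbitrary), two generators created with the same seed produce the same numbers, and drawing further from one does not disturb the other.

```python
import numpy as np

rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

draws_a = rng_a.rand(3)
draws_b = rng_b.rand(3)

# Same seed, same stream
print(np.array_equal(draws_a, draws_b))  # True

# Advancing rng_a leaves rng_b where it was
rng_a.rand(100)
rng_c = np.random.RandomState(42)
rng_c.rand(3)  # bring a fresh generator to the same position as rng_b
print(np.array_equal(rng_b.rand(3), rng_c.rand(3)))  # True
```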

# F.9 Reshaping

- ndarrays have a fixed size (total number of elements), so we can't alter that
- However, we can alter the shape of an existing array using the *reshape()* method of an ndarray, as long as the total number of elements stays the same

In [113]:
ary1 = np.array([[1,2,3,6],[4,5,6,9],[7,8,9,4]])
ary1

array([[1, 2, 3, 6],
       [4, 5, 6, 9],
       [7, 8, 9, 4]])

In [114]:
ary1.reshape(2,-1,3) # -1 lets NumPy infer that dimension from the total size (here it becomes 2)

array([[[1, 2, 3],
        [6, 4, 5]],

       [[6, 9, 7],
        [8, 9, 4]]])

In [115]:
# we can even unroll
ary1.reshape(-1)

array([1, 2, 3, 6, 4, 5, 6, 9, 7, 8, 9, 4])

In [116]:
ary1 # notice that unrolling doesn't affect the original array

array([[1, 2, 3, 6],
       [4, 5, 6, 9],
       [7, 8, 9, 4]])

In [117]:
# we can also use 'ravel()', which returns a view whenever possible
ary1.ravel()

array([1, 2, 3, 6, 4, 5, 6, 9, 7, 8, 9, 4])

In [118]:
np.shares_memory(ary1,ary1.ravel())

True

In [120]:
np.shares_memory(ary1,ary1.flatten())

False

In [121]:
ary1.flatten() # like ravel() but creates a copy

array([1, 2, 3, 6, 4, 5, 6, 9, 7, 8, 9, 4])
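The practical difference between `ravel()` and `flatten()` shows up when the flattened result is modified. A minimal sketch, which also shows that `ravel()` falls back to a copy when a view is not possible, for example on a transposed array:

```python
import numpy as np

ary = np.array([[1, 2], [3, 4]])

flat_copy = ary.flatten()  # always a copy
flat_copy[0] = 99
print(ary[0, 0])           # 1, the original is untouched

flat_view = ary.ravel()    # a view for a contiguous array
flat_view[0] = -1
print(ary[0, 0])           # -1, the change is visible in the original

# The transpose is not contiguous in row-major order, so ravel() must copy
print(np.shares_memory(ary, ary.T.ravel()))  # False
```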

- We can also concatenate two or more arrays. Though expensive (a new array is allocated and the data copied), this is unavoidable at times
- We use the *np.concatenate()* function for this

In [122]:
ary1 = np.array([[1,2,3,4],[5,6,7,8]])
ary1

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [123]:
ary2 = np.array([[9,10,11,12],[13,14,15,16]])
ary2

array([[ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [125]:
np.concatenate((ary1,ary2))

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [126]:
np.concatenate((ary1,ary2),axis = 1)

array([[ 1,  2,  3,  4,  9, 10, 11, 12],
       [ 5,  6,  7,  8, 13, 14, 15, 16]])

In [128]:
np.concatenate((ary1,np.array([1,2,3,4]))) # raises an error: a 2-D array cannot be joined with a 1-D array

ValueError: all the input arrays must have same number of dimensions
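One way around the error above is to give the 1-D array a second axis before concatenating, so that both inputs are 2-D. The sketch below does this with `reshape` and, as an alternative, with `np.vstack`, which promotes 1-D inputs to rows itself.

```python
import numpy as np

ary1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
new_row = np.array([1, 2, 3, 4])

# reshape(1, -1) turns shape (4,) into shape (1, 4)
stacked = np.concatenate((ary1, new_row.reshape(1, -1)))
print(stacked.shape)  # (3, 4)

# vstack treats the 1-D array as a single row
stacked_alt = np.vstack((ary1, new_row))
print(np.array_equal(stacked, stacked_alt))  # True
```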

# F.10 Linear Algebra

- Linear algebra is important in neural nets and deep learning.
- NumPy comes with a built-in *matrix* object, but it is restrictive and hardly used since it's less general than the ndarray

- 1-D arrays can be thought of as row vectors

In [131]:
row_vector = np.array([1,2,3])
row_vector

array([1, 2, 3])

- we can use 2-D arrays to create column-vectors

In [133]:
column_vector = np.array([[1,2,3]]).reshape(-1,1)
column_vector

array([[1],
       [2],
       [3]])

- Instead of reshaping 1-D array into 2-D column vector, we can simply add a new dimension

In [135]:
row_vector[:,np.newaxis]

array([[1],
       [2],
       [3]])

- *np.newaxis* is simply an alias for *None*, so the two are interchangeable here

In [136]:
row_vector[:,None]

array([[1],
       [2],
       [3]])
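With row and column vectors in hand, products can be formed with the `@` operator (equivalent to `np.matmul`). The following is a minimal sketch; the 2x3 `matrix` is made up for illustration.

```python
import numpy as np

row_vector = np.array([1, 2, 3])
column_vector = row_vector[:, np.newaxis]  # shape (3, 1)
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# Inner product of two 1-D arrays gives a scalar
print(row_vector @ row_vector)  # 14

# (2, 3) @ (3, 1) gives a (2, 1) column
print(matrix @ column_vector)
# [[14]
#  [32]]

# (2, 3) @ (3,) gives a 1-D result of shape (2,)
print(matrix @ row_vector)      # [14 32]

# (3, 1) @ (1, 3) gives the 3x3 outer product
outer = column_vector @ row_vector[np.newaxis, :]
print(outer.shape)              # (3, 3)
```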