# NumPy

#### NumPy is the core library for scientific computing in Python.

First, we import numpy as follows.

In [1]:
import numpy as np

### Basic Operations

In [2]:
a = np.array([1, 2, 3])
print(type(a))
print(a.shape)           # .shape returns a tuple with one entry per dimension; (3,) for this 1-D array
print(a[0], a[1], a[2])  
print(a)

<class 'numpy.ndarray'>
(3,)
1 2 3
[1 2 3]


In [3]:
b = np.array([[1, 2, 3], [4, 5, 6]])
print(type(b))
print(b.shape)
print(b[0,1], b[1,1])  # accessing elements in multidimensional arrays
print(b)

<class 'numpy.ndarray'>
(2, 3)
2 5
[[1 2 3]
 [4 5 6]]
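
As a small supplementary sketch (the names `vec`, `mat` and `cube` are illustrative and not part of the cells above), `.shape` holds one entry per dimension, so its length equals `.ndim` and the product of its entries equals `.size`:

```python
import numpy as np

vec = np.array([1, 2, 3])                 # 1-D: shape has a single entry
mat = np.array([[1, 2, 3], [4, 5, 6]])    # 2-D: (rows, columns)
cube = np.zeros((2, 3, 4))                # 3-D: one entry per axis

for arr in (vec, mat, cube):
    # number of dimensions, size along each dimension, total element count
    print(arr.ndim, arr.shape, arr.size)
```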


__Creating Arrays__

In [4]:
a = np.array([[4,5],[7,3]])
print(a)
b = np.zeros((3,2))      # 3 x 2 array of zeros
print(b)
c = np.ones((3,2))       # 3 x 2 array of ones
print(c)
d = np.eye(2)            # 2 x 2 identity matrix
print(d)
e = np.random.random((2,2))  # 2 x 2 array of uniform random values in [0, 1)
print(e)
print(e)

[[4 5]
 [7 3]]
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]
[[ 1.  1.]
 [ 1.  1.]
 [ 1.  1.]]
[[ 1.  0.]
 [ 0.  1.]]
[[ 0.57753741  0.15741501]
 [ 0.1355942   0.21554478]]
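
A few more constructors may be handy alongside the ones above. This is only a sketch: the variable names are illustrative, and the seed value 0 is an arbitrary assumption used to make the random numbers repeatable.

```python
import numpy as np

f = np.full((2, 2), 7)            # constant array filled with 7
g = np.arange(0, 10, 2)           # evenly spaced integers: 0, 2, 4, 6, 8
h = np.linspace(0.0, 1.0, 5)      # 5 evenly spaced points from 0 to 1 inclusive
print(f)
print(g)
print(h)

# A seeded generator gives the same "random" values on every run.
rng = np.random.default_rng(seed=0)
r1 = rng.random((2, 2))
r2 = np.random.default_rng(seed=0).random((2, 2))
print(np.array_equal(r1, r2))     # True: same seed, same values
```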


### Slicing and Indexing

Let's try slicing on the following np.array

In [5]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

Slicing the first 2 rows and columns 1 and 2 (zero indexed)

In [6]:
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


Slicing the first 2 rows and taking all columns

In [7]:
c = a[:2,:] # a[:2,] also works
print(c)

[[1 2 3 4]
 [5 6 7 8]]


Taking the whole array a

In [8]:
d = a[:, :] # full slice; plain d = a would only bind another name to a, not slice it
print(d)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


__Note__: Slicing does not copy the data. It returns a view onto the original array, so writing through the view also changes the original.

In [9]:
# when d is changed, a is also changed
print(d[0,0], a[0,0])
d[0,0] = 2
print(d[0,0], a[0,0])

1 1
2 2


In [10]:
# b (from In [6]) covers the first two rows and columns 1 and 2 of a,
# so b[0,0] and a[0,1] use different indices but refer to the same element
print(b[0,0], a[0,1])
b[0,0] = 42
print(b[0,0], a[0,1])
a[0,1] = 1
print(b[0,0], a[0,1])

2 2
42 42
1 1
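
When an independent array is needed, an explicit copy avoids this write-through behaviour. The following is a minimal sketch; the names `view` and `independent` are illustrative, while `.base` and `.copy()` are standard ndarray members.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]                 # slice: shares memory with a
independent = a[:2, 1:3].copy()   # explicit copy: owns its own data

print(view.base is a)             # True: the view is backed by a
print(independent.base is None)   # True: the copy has no base array

independent[0, 0] = 42            # does not touch a
print(a[0, 1])                    # still 2
view[0, 0] = 42                   # writes through to a
print(a[0, 1])                    # now 42
```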


__Integer Indexing with Slicing__

Consider the following 2D array

In [11]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

One might expect the following two expressions to yield the same result

In [12]:
b = a[1:2,:]
c = a[1,:]

But it turns out that __b__ and __c__ are different: __b__ is built from slices only, while __c__ uses an integer index for the row.

In [13]:
print(b, b.shape)
print(c, c.shape)

[[5 6 7 8]] (1, 4)
[5 6 7 8] (4,)


Mixing integer indexing with slices yields an array of lower rank, while using only slices always yields an array of the same rank as the original array.

Similarly we can try with column vectors.

In [14]:
e = a[:,1:2]
f = a[:,1]
print(e, e.shape)
print(f, f.shape)

[[ 2]
 [ 6]
 [10]] (3, 1)
[ 2  6 10] (3,)
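
If the axis dropped by integer indexing is needed again, it can be put back. The sketch below shows two common ways; the names `col`, `col_2d` and `col_2d_alt` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

col = a[:, 1]                      # integer index drops the column axis -> shape (3,)
col_2d = col[:, np.newaxis]        # add the axis back -> shape (3, 1)
col_2d_alt = col.reshape(-1, 1)    # equivalent reshape; -1 lets NumPy infer the length

print(col.shape, col_2d.shape, col_2d_alt.shape)
print(np.array_equal(col_2d, a[:, 1:2]))   # True: same as the slice-only result
```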


__Using integer array indexing to create arbitrary arrays__

When we index numpy arrays using slicing, the resulting array view will always be a subarray of the original array. Integer array indexing allows us to construct arbitrary arrays using data from another array.

Consider the following np.array

In [15]:
a = np.array([[1,2], [3,4], [5,6]])
b = a[[0, 1, 2], [0, 1, 0]]
# this takes out elements a[0,0], a[1,1] and a[2,0]
# same as np.array([a[0,0], a[1,1], a[2,0]])
print(b)
b[0] = 12
print(a[0,0], b[0])

[1 4 5]
1 12


When integer array indexing is used, a new array (a copy) is created. Thus modifying one doesn't affect the other.
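
One way to confirm the difference between the two kinds of indexing is `np.shares_memory`. This is a small sketch; the names `sliced` and `picked` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

sliced = a[0:2, :]                     # basic slicing -> view
picked = a[[0, 1, 2], [0, 1, 0]]       # integer array indexing -> copy

print(np.shares_memory(a, sliced))     # True
print(np.shares_memory(a, picked))     # False
```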

__Useful tricks with Integer array indexing__

Selecting one element from each row

In [16]:
b = np.array([0, 1, 0])
print(a[np.arange(3), b])

[1 4 5]


It prints __a[0,0]__, __a[1,1]__, __a[2,0]__

A list of indices can also be used.

In [17]:
b = [0, 1, 0]
print(a[np.arange(3), b])

[1 4 5]


We can also mutate one element from each row. Let's add 10 to the selected elements.

In [18]:
a[np.arange(3),b] += 10
print(a)

[[11  2]
 [ 3 14]
 [15  6]]
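
One caveat with in-place updates through integer array indexing: if the same position is selected more than once, `+=` applies the update only once for that position. The sketch below contrasts this with `np.add.at`, which accumulates every occurrence; the names `counts`, `counts_at` and `idx` are illustrative.

```python
import numpy as np

idx = np.array([0, 0, 2])      # index 0 appears twice

counts = np.zeros(3, dtype=int)
counts[idx] += 1               # buffered: index 0 is incremented only once
print(counts)                  # [1 0 1]

counts_at = np.zeros(3, dtype=int)
np.add.at(counts_at, idx, 1)   # unbuffered: every occurrence is applied
print(counts_at)               # [2 0 1]
```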


__Boolean array indexing__

Let's try boolean indexing on the following np.array

In [19]:
a = np.array([[1,2,12],[6,8,11]])

Now we create a boolean np.array of the same shape as __a__, in which each entry records whether the corresponding element of __a__ satisfies a boolean expression.

In [20]:
is_even = (a%2 == 0)
print(is_even)

[[False  True  True]
 [ True  True False]]


In [21]:
is_small = (a <= 6)
print(is_small)

[[ True  True False]
 [ True False False]]


These boolean arrays can be used as masks to select the elements that have a desired property.

Let's take out all the even elements from __a__ using the boolean array __is_even__.

In [22]:
print(a[is_even])

[ 2 12  6  8]


Selecting all the elements that are greater than 6:

In [23]:
is_big = ~is_small  # elementwise NOT of the boolean mask
print(a[is_big])

[12  8 11]


More examples:

In [24]:
print(a[a > 2]) # all elements greater than 2
print(a[a%2 == 1]) # all odd elements

[12  6  8 11]
[ 1 11]
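
Masks can also be combined. The following sketch uses the elementwise operators `&` (and) and `|` (or), plus `np.where` to keep the original shape; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 12], [6, 8, 11]])

# Parentheses are required: & and | bind more tightly than comparisons.
small_and_even = a[(a <= 6) & (a % 2 == 0)]   # selects 2 and 6
big_or_odd = a[(a > 6) | (a % 2 == 1)]        # selects 1, 12, 8 and 11
print(small_and_even)
print(big_or_odd)

# np.where keeps the shape of a and substitutes 0 where the mask is False.
masked = np.where(a <= 6, a, 0)
print(masked)
```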


### Datatypes in NumPy

A data type object (numpy.dtype) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the type, size, byte order etc. To describe the type of scalar data, NumPy has several built-in scalar types for various precisions of integers, floating point numbers etc. An item extracted from an array will be a Python object whose type is the scalar type associated with the data type of the array.

NumPy also allows _structured_ data types (user-defined aggregates of scalar types). In this case the data type object also defines the names of the fields of the structure, by which they can be accessed, the data type of each field, and the part of the memory block each field occupies.

Note: Scalar types are not _dtype_ objects, even though they can be used in place of one whenever a data type specification is needed.

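The default floating point data type for NumPy arrays is __float64__; functions such as np.zeros and np.ones use it when no data type is given. The name np.float_ was an alias for it and was removed in NumPy 2.0.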

In [25]:
print(np.dtype(np.int32))
print(np.dtype(np.complex128))
print(np.dtype(np.float64))  # np.float_ was an alias for float64, removed in NumPy 2.0

int32
complex128
float64
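
The properties mentioned above can be inspected directly on a dtype object. This is a minimal sketch; the loop variable names are illustrative, while `name`, `kind` and `itemsize` are standard dtype attributes.

```python
import numpy as np

for scalar_type in (np.int32, np.float64, np.complex128):
    d = np.dtype(scalar_type)
    # name: readable label, kind: one-character type code, itemsize: bytes per element
    print(d.name, d.kind, d.itemsize)

# A scalar type can be passed wherever a dtype is expected, but it is not a dtype instance.
print(isinstance(np.float64, np.dtype))            # False
print(isinstance(np.dtype(np.float64), np.dtype))  # True
print(np.zeros(2).dtype)                           # float64, the default for np.zeros
```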


__Specifying structured dtypes__

In [26]:
dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])  # np.float_ was removed in NumPy 2.0
print(dt['name'])
print(dt['grades'])

<U16
('<f8', (2,))


_U_ denotes a Unicode string, _f_ stands for float.

__'<'__ denotes little endian, __'>'__ denotes big endian and __'='__ denotes hardware-native (default) byte order.

<U16 : little-endian Unicode string of 16 characters

<f8 : little-endian floating point number of 8 bytes

(2,) is the shape of the field: each record holds two grades

__Using structured data types__

In [27]:
x = np.array([('Dhruv', (9.4, 9.73)), ('Mohan', (8.2, 7.12))], dtype = dt)

In [28]:
print(x[0])
print(x[0]['grades'])
print(type(x[0]))
print(type(x[0]['grades']))

('Dhruv', [9.4, 9.73])
[ 9.4   9.73]
<class 'numpy.void'>
<class 'numpy.ndarray'>
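
Fields can also be read for all records at once by indexing the whole array with a field name. The sketch below rebuilds the same array so that it runs on its own (using `np.float64` for the grades); the names `names`, `grades`, `averages` and `best` are illustrative.

```python
import numpy as np

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
x = np.array([('Dhruv', (9.4, 9.73)), ('Mohan', (8.2, 7.12))], dtype=dt)

names = x['name']                    # 1-D array of strings, one per record
grades = x['grades']                 # 2-D array of shape (2, 2)
averages = grades.mean(axis=1)       # average grade per record
print(names)
print(averages)

best = x[np.argmax(averages)]['name']   # name of the record with the highest average
print(best)
```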


In [29]:
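q = np.array([1,2]) # NumPy infers the data type; the default integer width is platform dependent
print(q.dtype)
w = np.array([1., 2.])
print(w.dtype)
r = np.array([1,2], dtype=np.float64) # force another data type (np.float_ was removed in NumPy 2.0)
print(r.dtype)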
print(r.dtype)

int32
float64
float64
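
The data type of an existing array can be changed with `astype`, and mixing types in arithmetic promotes the result. This is a small sketch with illustrative names; note that converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

w = np.array([1.7, -2.3])
as_int = w.astype(np.int64)        # conversion truncates toward zero: 1 and -2
print(as_int, as_int.dtype)

q = np.array([1, 2], dtype=np.int32)
mixed = q + w                      # int32 combined with float64 promotes to float64
print(mixed.dtype)
print(np.result_type(np.int32, np.float64))
```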


### Array Math in NumPy

A detailed description can be found here: https://docs.scipy.org/doc/numpy/reference/routines.math.html

In [30]:
x = np.array([[1,2],[3,4]], dtype = np.float64)
y = np.array([[5,6],[7,8]], dtype = np.float64)

__Elementwise Addition__

In [31]:
print(x+y) # available as overloaded + operator
print(np.add(x,y)) # also available as function

[[  6.   8.]
 [ 10.  12.]]
[[  6.   8.]
 [ 10.  12.]]


__Elementwise Difference__

In [32]:
print(x-y)
print(np.subtract(x,y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


__Elementwise Division__

In [33]:
print(x/y)
print(np.divide(x,y))

[[ 0.2         0.33333333]
 [ 0.42857143  0.5       ]]
[[ 0.2         0.33333333]
 [ 0.42857143  0.5       ]]


__Elementwise Multiplication__

In [34]:
print(x * y)
print(np.multiply(x,y))

[[  5.  12.]
 [ 21.  32.]]
[[  5.  12.]
 [ 21.  32.]]


__Vector Inner Product__

In [35]:
v = np.array([9,10])
w = np.array([11, 12])
print(v.dot(w))
print(np.dot(v,w))

219
219


__Matrix Multiplication__

In [36]:
print(x.dot(y))
print(x.dot(v))
print(np.dot(x,y))

[[ 19.  22.]
 [ 43.  50.]]
[ 29.  67.]
[[ 19.  22.]
 [ 43.  50.]]
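
The same products can also be written with the `@` operator or `np.matmul`. The sketch below redefines the arrays from the cells above so that it runs on its own.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)
v = np.array([9, 10])

print(x @ y)              # same as x.dot(y) for 2-D arrays
print(x @ v)              # matrix-vector product
print(np.matmul(x, y))    # function form of @
```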


__Computations on arrays__

In [37]:
x = np.array([[1,2], [3,4]])

In [38]:
print(np.sum(x))  # sum of all elements in x
print(np.sum(x, axis = 0)) # sum of each column
print(np.sum(x, axis = 1)) # sum of each row
print(np.prod(x)) # product of all elements in x
print(np.prod(x, axis = 0)) # product of each column
print(np.prod(x, axis = 1)) # product of each row

10
[4 6]
[3 7]
24
[3 8]
[ 2 12]
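
Reductions drop the axis they operate on. Passing `keepdims=True` keeps it with length 1, which is convenient when the result is combined with the original array. This is a sketch with illustrative names, and the division relies on NumPy broadcasting the (2, 1) column across each row.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

row_sums = np.sum(x, axis=1, keepdims=True)   # shape (2, 1) instead of (2,)
print(row_sums.shape)

normalized = x / row_sums                     # each row now sums to 1
print(normalized)
print(normalized.sum(axis=1))
```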


In [39]:
print(np.cumprod(x, axis = 0)) # cumulative product down each column (along axis 0)
print(np.cumprod(x, axis = 1)) # cumulative product across each row (along axis 1)

[[1 2]
 [3 8]]
[[ 1  2]
 [ 3 12]]


In [40]:
print(np.log(x)) # elementwise natural logarithm
print(np.negative(x)) # elementwise numerical negative
print(np.sqrt(x)) # elementwise square root

[[ 0.          0.69314718]
 [ 1.09861229  1.38629436]]
[[-1 -2]
 [-3 -4]]
[[ 1.          1.41421356]
 [ 1.73205081  2.        ]]


__Transpose of an Array__

In [41]:
x = np.array([[1,2],[3,4]])
print(x)
print(x.T) 
y = x.T # x.T is a view of x, not a copy
y[0,1] = 12 # writes through to x[1,0]
print(x)

[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
[[ 1  2]
 [12  4]]


Taking the transpose of a rank 1 array does nothing, because there is only one axis to permute.
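
A short sketch of this behaviour, and of one way to get an actual column vector by first adding an axis. The names `v`, `row` and `col` are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])
print(v.T.shape)                 # (3,): unchanged

row = v[np.newaxis, :]           # shape (1, 3)
col = row.T                      # shape (3, 1): transposing a 2-D array does swap the axes
print(row.shape, col.shape)
```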