[**Enthought Numpy(beginner) 2018**](https://www.youtube.com/watch?v=V0D2mhVt7NE)

In [1]:
import numpy as np

**Note**: NumPy applies all the arithmetic operators (`*`, `+`, `-`, `/`, `**`, etc.) element by element.

In [7]:
a= np.array([0,2,3,4])
b= np.array([44,23,5,7])
print(a+b)
print(a*b)
print(a/b)
print(a**b)
print(a-b)
print(a>b)
print(np.logical_and(a,b))

[44 25  8 11]
[ 0 46 15 28]
[0.         0.08695652 0.6        0.57142857]
[      0 8388608     243   16384]
[-44 -21  -2  -3]
[False False False False]
[False  True  True  True]


In [13]:
# in python
output=[]
for item1,item2 in zip(a,b):  # note: this loop works the same way on plain Python lists
    output.append(item1+item2)
    
print(output)    

# in numpy
print(a+b)

[44, 25, 8, 11]
[44 25  8 11]


A Python list is a container of references to objects that can live at different addresses in memory, so every operation has to follow each reference and check each object's type. A NumPy array keeps elements of a single type together in one contiguous block of memory, which makes operations on it much faster.

All elements of a NumPy array are homogeneous, that is, of the same type; that type is accessible as 'dtype'.
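
As a rough illustration of the difference described above, the sketch below compares a mixed-type list with a fixed-type array. The names `mixed_list` and `arr` are made up for this example, and the dtype is set explicitly so the sizes do not depend on the platform default.

```python
import numpy as np

# A Python list can hold objects of different types; each slot is a reference.
mixed_list = [1, 2.5, "three"]
list_types = [type(item).__name__ for item in mixed_list]
print(list_types)    # ['int', 'float', 'str']

# A NumPy array stores one fixed-size type in a single buffer.
arr = np.array([1, 2, 3, 4], dtype=np.int64)
print(arr.dtype)     # int64
print(arr.itemsize)  # 8 (bytes per element)
print(arr.nbytes)    # 32 (4 elements * 8 bytes)
```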


In [14]:
a= np.arange(1,5)

In [17]:
print(a.ndim)
print(a.shape)  # gives the number of elements along each dimension

1
(4,)


In [19]:
type(np.log) # these are ufuncs, all implemented in C. They are very fast.

numpy.ufunc

**Note**: These functions also work element by element. There is still a for loop, but it runs in compiled C code in the background rather than in Python.

In [23]:
np.log(a)  # not displayed: a notebook cell only shows the value of its last expression
np.sin(a)

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 ])
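
To make the "loop in the background" idea concrete, here is a small sketch that computes the same sines with an explicit Python loop and with the ufunc. The variable names `looped` and `vectorized` are only for illustration.

```python
import math
import numpy as np

a = np.arange(1, 5)

# Explicit Python loop: one interpreted function call per element.
looped = np.array([math.sin(x) for x in a])

# ufunc: the equivalent loop runs inside compiled code.
vectorized = np.sin(a)

print(np.allclose(looped, vectorized))  # True
```

Both give the same values; the ufunc simply avoids the per-element Python overhead.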

**Note**: NumPy arrays are mutable. We can change their values in place, in memory.

In [26]:
a= np.arange(12)
a[0]= 222
print(a)

[222   1   2   3   4   5   6   7   8   9  10  11]


#### **Note**: Beware of type coercion. If we assign a value whose type differs from the array's 'dtype', NumPy converts it (truncating if necessary) so that all elements keep the same type.

In [37]:
a= np.arange(12)
a[2]= 22.3
print(a)
# a[3]= 'a'
# print(a)    # would raise a 'ValueError'
a.dtype
a.fill(33.3)
print(a)     # fill behaves the same way; the floating-point value is truncated here.

[ 0  1 22  3  4  5  6  7  8  9 10 11]
[33 33 33 33 33 33 33 33 33 33 33 33]
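
One way to avoid the silent truncation shown above is to convert the array to a floating-point type before assigning. This is a minimal sketch, not the only option; `ints` and `floats` are illustrative names.

```python
import numpy as np

ints = np.arange(4)                # integer dtype
ints[2] = 22.9                     # silently truncated to 22
print(ints[2])                     # 22

floats = ints.astype(np.float64)   # astype returns a converted copy
floats[2] = 22.9                   # now the fractional part is kept
print(floats[2])                   # 22.9

# can_cast reports whether a conversion is "safe" (no loss of information)
print(np.can_cast(np.float64, np.int64))  # False
print(np.can_cast(np.int64, np.float64))  # True
```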


**Note**: NumPy decides the array's type by following a hierarchy. If all elements are ints, it's an int array; if even one is a float, it's a float array; if even one element is complex, it's a complex array; if even one element is a string, everything becomes a string. E.g.:

In [47]:
a=np.array([1,2,3,4])
print(a,a.dtype)
a=np.array([1,2,3,4.9])
print(a,a.dtype)
a=np.array([1,2,3,4.9+1j])
print(a,a.dtype)
a=np.array([1,2,3,'s'])
print(a,a.dtype)

[1 2 3 4] int64
[1.  2.  3.  4.9] float64
[1. +0.j 2. +0.j 3. +0.j 4.9+1.j] complex128
['1' '2' '3' 's'] <U21
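
The same hierarchy can be queried directly with `np.result_type`, as in the sketch below. The last part uses `dtype=object`, which is an addition to the notes above: it keeps the original Python objects instead of converting everything to strings.

```python
import numpy as np

# Ask NumPy which type two dtypes promote to.
print(np.result_type(np.int64, np.float64))       # float64
print(np.result_type(np.float64, np.complex128))  # complex128

mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64

# With dtype=object the elements stay as separate Python objects.
objs = np.array([1, 2.5, 's'], dtype=object)
obj_types = [type(x).__name__ for x in objs]
print(obj_types)    # ['int', 'float', 'str']
```

Object arrays lose most of the speed advantage, so they are usually a last resort.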


**Note**: We can control the type by specifying it:

In [51]:
a= np.array([1,2,3,4.4], dtype='int64')
print(a.dtype)

int64


In [54]:
b= np.array([[1,2,3,4],[3,4,5,6]])
print(b.ndim,b.shape) # the first dimension is rows, the second dimension is columns.



2 (2, 4)


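**Note**: A one-dimensional array is treated like a row vector (for example, when it is broadcast against a 2D array). C-like languages, and NumPy by default, are row major (each row is stored contiguously in memory), in contrast to Fortran-type languages (including MATLAB), which are column major.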
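
The memory layout can be inspected through an array's flags and strides. The sketch below assumes 8-byte integers (set explicitly with `dtype=np.int64`); `c_arr` and `f_arr` are illustrative names.

```python
import numpy as np

c_arr = np.arange(6, dtype=np.int64).reshape(2, 3)  # default: C (row-major) order
f_arr = np.asfortranarray(c_arr)                     # same values, column-major layout

print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # True True

# strides = bytes to step to reach the next element along each dimension
print(c_arr.strides)  # (24, 8): the next row is 3 elements away
print(f_arr.strides)  # (8, 16): the next row is 1 element away

# Reading the same array out in row-major and column-major order
print(c_arr.ravel(order='C'))  # [0 1 2 3 4 5]
print(c_arr.ravel(order='F'))  # [0 3 1 4 2 5]
```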

**Note**: In NumPy we deal with arrays, not matrices, unlike MATLAB, where everything is a matrix. The basic object is a one-dimensional array, and arrays extend to two, three, four, or more dimensions.

**Note**: Don't use the NumPy matrix class ('np.matrix'); anything we can do with a matrix we can also do with an array.
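
For instance, matrix multiplication on plain arrays is available through the `@` operator, while `*` stays element-wise. A short sketch with made-up 2x2 arrays `A` and `B`:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B   # multiplies matching elements
matrix_prod = A @ B   # true matrix product (same as np.matmul(A, B))

print(elementwise)    # [[0 2]
                      #  [3 0]]
print(matrix_prod)    # [[2 1]
                      #  [4 3]]
```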

In [67]:
print(a)
print(a.T) # since it's a 1D array, transposing it does nothing.

print(b.size) #tells how many elements it has.
print(b.nbytes, a.nbytes)  # Total bytes consumed by the elements of the array.


[1 2 3 4]
[1 2 3 4]
8
64 32


**Note**: When accessing values in an array, we should **always** put all the indices inside the **same pair of square brackets**. This is in contrast to the regular Python way of indexing nested lists.

In [72]:
# accessing the 2nd row and 3rd column value in b:
print(b)
print(b[1,2])

# don't use:
print(b[1][2]) # gives the same result here, but builds an intermediate row first; always keep the indices in the same brackets.

[[1 2 3 4]
 [3 4 5 6]]
5
5
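
The two forms only agree for plain integer indices. The sketch below shows two cases where chained brackets give a different result; it reuses the same `b` as above.

```python
import numpy as np

b = np.array([[1, 2, 3, 4],
              [3, 4, 5, 6]])

# With a slice, the two spellings mean different things.
same_brackets = b[0:2, 1]   # column 1 of rows 0 and 1
chained = b[0:2][1]         # b[0:2] is both rows, then [1] picks row 1
print(same_brackets)        # [2 4]
print(chained)              # [3 4 5 6]

# Chained assignment after fancy indexing writes into a temporary copy.
b[[0, 1]][0, 0] = 99
print(b[0, 0])              # 1 (unchanged)

b[0, 0] = 99                # single brackets modify b itself
print(b[0, 0])              # 99
```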


In [73]:
# If we index only the first dimension, we get the whole row (this is different from MATLAB)

print(b[0]) #gives the entire first row (MATLAB would give the first element)

[1 2 3 4]


#### **Slicing**

In [81]:
a= np.arange(10,15)
print(a[1:3])
# everything that works for slicing in regular Python is available here.
## omitted boundaries default to the beginning or end of the array.
print(a[:3])
print(a[2:])

print(a[::2]) #every other element of the array

[11 12]
[10 11 12]
[12 13 14]
[10 12 14]


In [90]:
### slicing in two dimensional array:

a = np.arange(36).reshape(6,6)
print(a)
print(a[2:3,2:4])  #selects the elements 14 and 15
print(a[3:,3:])  #selecting last 3 rows and columns
print(a[:,2])  # selecting the 3rd column

print(a[2::2,::2])

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]
[[14 15]]
[[21 22 23]
 [27 28 29]
 [33 34 35]]
[ 2  8 14 20 26 32]
[[12 14 16]
 [24 26 28]]


### **Note**: Every time we index, we drop a dimension, and every time we slice, we keep the dimension.

In [96]:
print(a)
print( a.ndim)
print(a[0].ndim)
print(a[:1,:1].ndim) #even though we have only one element here, we have two dimensions.

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]
2
1
2
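
Looking at shapes rather than only `ndim` makes the rule easier to see. A small sketch on the same 6x6 array:

```python
import numpy as np

a = np.arange(36).reshape(6, 6)

print(a[2].shape)       # (6,)   index on rows: that dimension is dropped
print(a[2:3].shape)     # (1, 6) slice on rows: the dimension is kept
print(a[:, 2].shape)    # (6,)   index on columns
print(a[:, 2:3].shape)  # (6, 1) slice on columns
print(a[2, 3].ndim)     # 0      two indices: both dimensions dropped
```

Slicing with a length-one range is therefore a simple way to keep a row or column two-dimensional.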


**example exercise** @42:46

In [101]:
a= np.arange(25).reshape(5,5)
print(a[:,1::2])
print(a[4,:])                   ## or simply a[4] / a[-1]
print(a[1::2,:3:2])

[[ 1  3]
 [ 6  8]
 [11 13]
 [16 18]
 [21 23]]
[20 21 22 23 24]
[[ 5  7]
 [15 17]]


**Another way of indexing:** Fancy indexing and boolean indexing

In [105]:
##Fancy indexing
a=np.arange(0,80,10)
print(a)
indices=[1,2,-3]
print(a[indices])
## fancy indexing always returns a copy, never a view

[ 0 10 20 30 40 50 60 70]
[10 20 50]
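
The copy behaviour can be checked with `np.shares_memory`. The sketch below contrasts it with a basic slice, which returns a view onto the same data; `sliced` and `fancy` are illustrative names.

```python
import numpy as np

a = np.arange(0, 80, 10)

sliced = a[1:4]        # basic slice: a view onto a's memory
fancy = a[[1, 2, 3]]   # fancy indexing: an independent copy

print(np.shares_memory(a, sliced))  # True
print(np.shares_memory(a, fancy))   # False

sliced[0] = -1         # writes through to a[1]
fancy[1] = -2          # only changes the copy
print(a)               # [ 0 -1 20 30 40 50 60 70]
```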


In [117]:
## Indexing with Boolean

# manual creation of mask:
mask= np.array([0,1,1,0,0,1,0,0],dtype=bool)
print(a[mask])

# conditional creation of mask eg:
mask=  a<30
print(mask)

b= np.arange(1,11)
b[1::2]= -b[1::2]  #negates all even values
print(b)

print(b[b<0]) #selects all the negative values

# put a zero for all the negative values
b[b<0] = 0
print(b)

[10 20 50]
[ True  True  True False False False False False]
[  1  -2   3  -4   5  -6   7  -8   9 -10]
[ -2  -4  -6  -8 -10]
[1 0 3 0 5 0 7 0 9 0]


#### Selecting value in range:

In [125]:
# Python has logical operators and bitwise operators.
# logical operators: 'and', 'or', 'not'  ----> a single True or False for the whole operand
# bitwise operators: & (and), | (or), ~ (not), ^ (xor) ----> element by element

b= np.array([1,2,3,4,-3,8,72,-12])
print(b[(b>3) & (b< 8)])  # note: the () are needed here

[4]
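
The sketch below shows what happens when `and` is used on arrays by mistake, next to the element-wise alternatives. `np.logical_and` is the function form of `&` for boolean masks.

```python
import numpy as np

b = np.array([1, 2, 3, 4, -3, 8, 72, -12])

error_message = None
try:
    b[(b > 3) and (b < 8)]      # 'and' needs one True/False per operand
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

with_operator = b[(b > 3) & (b < 8)]
with_function = b[np.logical_and(b > 3, b < 8)]
print(with_operator, with_function)  # [4] [4]

# ~ inverts a mask: everything outside the range
outside = b[~((b > 3) & (b < 8))]
print(outside)                       # [  1   2   3  -3   8  72 -12]
```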


In [130]:
# to find the positions of the Trues (the non-zero values), we can use 'nonzero'

b[b<0]= 0
print(b)
np.nonzero(b)  #always returns a tuple of index arrays (one per dimension) giving the positions of the non-zero values.

[ 1  2  3  4  0  8 72  0]


(array([0, 1, 2, 3, 5, 6]),)
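
The tuple makes more sense in two dimensions, where `nonzero` returns one index array per dimension. A minimal sketch with a made-up 2x3 array `grid`; `np.argwhere` is shown as an alternative that returns one (row, column) pair per match.

```python
import numpy as np

grid = np.array([[0, 5, 0],
                 [7, 0, 9]])

rows, cols = np.nonzero(grid)
print(rows)               # [0 1 1]
print(cols)               # [1 0 2]

# The tuple can be used directly as a fancy index to get the values back.
print(grid[rows, cols])   # [5 7 9]

pairs = np.argwhere(grid > 6)
print(pairs)              # [[1 0]
                          #  [1 2]]
```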

In [133]:
# we can also build a mask from an element-wise comparison of two arrays:

a= np.array([10,12,15])
b= np.array([22,5,16])
mask= a>b
print(mask)

[False  True False]


**Exercise example @1:25:22**

In [148]:
a= np.arange(25).reshape(5,5)
mask= a[[0,1,2,3],[1,2,3,4]]  # fancy indexing with paired (row, column) indices; this holds values, not a boolean mask
print(mask)

# selecting all numbers divisible by 3:
mask2= a[a%3==0]
print(mask2)



[ 1  7 13 19]
[ 0  3  6  9 12 15 18 21 24]


In [149]:
# suppose I want to put 'nan' wherever a number is not divisible by 3:
print(a)
np.where(a%3==0,a, np.nan)   # nan is a valid floating-point value that represents a missing value, so the result is a float array

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


array([[ 0., nan, nan,  3., nan],
       [nan,  6., nan, nan,  9.],
       [nan, nan, 12., nan, nan],
       [15., nan, nan, 18., nan],
       [nan, 21., nan, nan, 24.]])
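
Two follow-up details are worth a quick sketch: the result is promoted to float because `nan` is a float, and `nan` never compares equal to itself, so `np.isnan` is the way to find it. `np.nansum` is one of several nan-aware reductions.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
masked = np.where(a % 3 == 0, a, np.nan)

print(masked.dtype)            # float64
print(np.nan == np.nan)        # False: equality cannot be used to find nan

nan_count = np.isnan(masked).sum()
print(nan_count)               # 16 (25 elements minus 9 multiples of 3)

print(np.nansum(masked))       # 108.0, the sum ignoring nan
```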