**Numpy**

- Considered the fundamental package for scientific computing in Python
- Serves as the base for many other major data science libraries in Python, such as Pandas


**Background on how Numpy works**

**What is numpy**
- Multi-dimensional array library
- We can use NumPy to store data in one-dimensional, two-dimensional, three-dimensional, or higher-dimensional arrays

**Why use numpy over lists**
- The main difference is speed: lists are slow for numerical work, NumPy is much faster
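
A rough way to see the speed gap is to time the same computation both ways. The sketch below squares and sums one million integers with a plain list and with a NumPy array. The array size and the use of `time.perf_counter` are choices made for this illustration, and the timings will differ from machine to machine.

```python
import time
import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Pure Python: one interpreted loop iteration per element
start = time.perf_counter()
list_total = sum(v * v for v in py_list)
list_seconds = time.perf_counter() - start

# NumPy: the loop runs in compiled code over one contiguous buffer
start = time.perf_counter()
arr_total = int(np.sum(np_arr * np_arr))
numpy_seconds = time.perf_counter() - start

print(f"list:  {list_seconds:.4f} s")
print(f"numpy: {numpy_seconds:.4f} s")
```

Both versions produce the same total; only the time taken should differ.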

**Why are lists slow and numpy fast**
- NumPy uses a fixed type for every element
- Elements are faster to read because each one takes fewer bytes of memory
- No per-element type checking is needed when iterating over a NumPy array
- NumPy uses contiguous memory, whereas list elements are scattered across memory (benefits: SIMD vector processing, effective cache utilization)

  **Numpy**
  
- Example: the value 5 is 00000101 in binary (8 bits, or 1 byte); by default NumPy stores it in a platform-dependent integer type (int32 in this notebook, 4 bytes)
- In NumPy we can choose a smaller type (int16, int8) when the values fit
 
  **Lists**
- A list stores much more than the integer value itself
- Each element is a built-in Python int object, which holds four pieces of information:
- object value (8 bytes)
- object type (8 bytes)
- reference count (8 bytes)
- size (4 bytes)
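
The per-element overhead described above can be checked directly. This is a small sketch that assumes CPython; `sys.getsizeof` reports the size of one Python int object, and `itemsize` reports the bytes used per element for a given NumPy type.

```python
import sys
import numpy as np

# Size of one Python int object (value, type pointer, reference count, size field)
python_int_bytes = sys.getsizeof(5)
print(python_int_bytes)  # typically 28 on 64-bit CPython

# Bytes used per element by fixed-width NumPy integer types
item_sizes = {name: np.dtype(name).itemsize
              for name in ("int8", "int16", "int32", "int64")}
print(item_sizes)
```

Even the widest type here, int64, uses fewer bytes per value than a single Python int object.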

**What can we do in numpy**

- Insertion, deletion, appending, concatenation, and much more

In [35]:
#lists

a=[1,3,5]
b=[2,4,6]

In [45]:
a+b

[1, 3, 5, 2, 4, 6]

In [43]:
a*b # Lists do not support element-wise multiplication

TypeError: can't multiply sequence by non-int of type 'list'

In [39]:
#numpy

a_arr=np.array([1,3,5])
b_arr=np.array([2,4,6])

In [47]:
a_arr+b_arr

array([ 3,  7, 11])

In [41]:
a_arr*b_arr

array([ 2, 12, 30])
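
For comparison, getting the same element-wise product from plain lists takes an explicit loop. The sketch below uses only built-in Python and the same values as the list cells above.

```python
a = [1, 3, 5]
b = [2, 4, 6]

# Element-wise product of two lists needs an explicit loop over pairs
product = [x * y for x, y in zip(a, b)]
print(product)   # [2, 12, 30]

# Multiplying a list by an int repeats the list instead of scaling it
repeated = a * 2
print(repeated)  # [1, 3, 5, 1, 3, 5]
```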

**Applications of numpy**

- Mathematics (MATLAB replacement), SciPy library
- Plotting (Matplotlib)
- Backend of many applications (Pandas, digital photography)
- Machine learning (very important; tensors are similar to NumPy arrays)

**Install numpy library**

In [59]:
!pip install numpy



In [79]:
import numpy as np

**Basics**

**How to initialize an array**

In [89]:
x= np.array([10,20,30])
x

array([10, 20, 30])

In [91]:
print(x)

[10 20 30]


In [197]:
#2D array of floats
y=np.array([[2.0,5.0,3.0],[8.0,10.0,4.0]])

In [99]:
y

array([[ 2.,  5.,  3.],
       [ 8., 10.,  4.]])

In [107]:
#Nest lists within lists to get a 3D array
z=np.array([[[1,2,3]]])

In [113]:
z

array([[[1, 2, 3]]])

In [115]:
print(z)

[[[1 2 3]]]


In [117]:
#Get the dimensions of numpy arrays

In [119]:
x.ndim

1

In [121]:
y.ndim

2

In [123]:
z.ndim

3

In [125]:
#Get Shape

In [127]:
x.shape

(3,)

In [129]:
y.shape

(2, 3)

In [131]:
z.shape

(1, 1, 3)

In [133]:
#How much memory numpy array takes up

In [135]:
#Get the Type
x.dtype

dtype('int32')

In [145]:
x1= np.array([10,20,30],dtype='int16')
x1

array([10, 20, 30], dtype=int16)

In [147]:
x1.dtype

dtype('int16')

In [149]:
#Get the size

In [151]:
x.itemsize

4

In [153]:
x1.itemsize

2

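In [159]:
#x2 is not defined in the original notebook; an int64 array is assumed here, consistent with the 8-byte itemsize
x2= np.array([10,20,30],dtype='int64')
x2.itemsize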

8

In [163]:
#Get total size

In [167]:
x.size * x.itemsize

12

In [169]:
x.nbytes

12

In [171]:
x1.size * x1.itemsize

6

In [173]:
x1.nbytes

6

In [175]:
x2.size * x2.itemsize

24

In [177]:
x2.nbytes

24

In [199]:
y.itemsize # larger because each float64 value takes 8 bytes

8

In [201]:
y.dtype

dtype('float64')

In [203]:
#In most cases the default data type is fine

**Accessing / Changing specific elements, rows, columns**

In [348]:
a=np.array([[1,2,3,4,5,6,7],[11,12,13,14,15,16,17]])
a

array([[ 1,  2,  3,  4,  5,  6,  7],
       [11, 12, 13, 14, 15, 16, 17]])

In [216]:
a.shape

(2, 7)

In [222]:
#Get a specific element [row, col]
#to get 15
a[1,4]

15

In [224]:
a[1,-3]

15

In [228]:
#Get a specific row
a[0,:]

array([1, 2, 3, 4, 5, 6, 7])

In [230]:
#Get a specific col
a[:,2]

array([ 3, 13])

In [232]:
#Slicing
#[start:stop:step], where the stop index is exclusive

In [240]:
a[0,1:6:2]

array([2, 4, 6])
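
A few more slice forms follow the same start:stop:step rule. The sketch below recreates the 2x7 array from this section so that it runs on its own; the variable names are chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [11, 12, 13, 14, 15, 16, 17]])

# Omitted start and stop default to the ends of the axis
every_other = a[0, ::2]
print(every_other)    # [1 3 5 7]

# Negative indices count from the end
last_three = a[1, -3:]
print(last_three)     # [15 16 17]

# A negative step walks the axis backwards
reversed_row = a[0, ::-1]
print(reversed_row)   # [7 6 5 4 3 2 1]
```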

In [248]:
#To change a value
a[1,3]=100

In [250]:
a

array([[  1,   2,   3,   4,   5,   6,   7],
       [ 11,  12,  13, 100,  15,  16,  17]])

In [252]:
#Change entire column

In [254]:
a[:,2]=55

In [256]:
a

array([[  1,   2,  55,   4,   5,   6,   7],
       [ 11,  12,  55, 100,  15,  16,  17]])

In [258]:
a[:,4]=[33,44]

In [260]:
a

array([[  1,   2,  55,   4,  33,   6,   7],
       [ 11,  12,  55, 100,  44,  16,  17]])

In [262]:
##Three Dimensional array

In [290]:
a=np.array([[[1,2],[3,4]],[[5,6],[7,8]]])

In [292]:
a

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

In [294]:
a.ndim

3

In [296]:
#to get a value from a 3D

In [298]:
a[0,1,1]

4

In [300]:
#Replace values in a 3D array
a

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

In [302]:
a[:,1,:]

array([[3, 4],
       [7, 8]])

In [305]:
a[:,1,:] = [[9,9],[8,8]]

In [307]:
a

array([[[1, 2],
        [9, 9]],

       [[5, 6],
        [8, 8]]])

**Initialize Different Types of Arrays**

In [310]:
#All 0s matrix

In [312]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [314]:
a=np.zeros(5)
a

array([0., 0., 0., 0., 0.])

In [316]:
a.ndim

1

In [322]:
np.zeros([2,3])

array([[0., 0., 0.],
       [0., 0., 0.]])

In [324]:
np.zeros([2,3,3])

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

In [326]:
np.zeros([2,3,3,4])

array([[[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]],


       [[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]]])

In [328]:
#All ones matrix

In [330]:
np.ones(2)

array([1., 1.])

In [334]:
np.ones([3,4])

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [336]:
np.ones([4,3,2],dtype='int32')

array([[[1, 1],
        [1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1],
        [1, 1]]])

In [338]:
#Any other number

In [340]:
np.full((2,2),8)

array([[8, 8],
       [8, 8]])

In [342]:
np.full((2,2),22,dtype='float32')

array([[22., 22.],
       [22., 22.]], dtype=float32)

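In [356]:
#Fill with another number, reusing the shape of an existing array
#(a is re-created here as the 2x7 array from the indexing section)
a=np.array([[1,2,3,4,5,6,7],[11,12,13,14,15,16,17]])
np.full_like(a,7)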

array([[7, 7, 7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7, 7, 7]])

In [358]:
a

array([[ 1,  2,  3,  4,  5,  6,  7],
       [11, 12, 13, 14, 15, 16, 17]])

In [360]:
np.full(a.shape,7)

array([[7, 7, 7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7, 7, 7]])

In [362]:
#Matrix of random numbers (decimal)

In [366]:
np.random.rand(4,2)

array([[0.954794  , 0.35217893],
       [0.77015946, 0.5194333 ],
       [0.72375346, 0.24639621],
       [0.3763648 , 0.9094211 ]])

In [370]:
np.random.rand(4,2,3)

array([[[0.02992057, 0.25692092, 0.90861653],
        [0.45631808, 0.75338433, 0.86041059]],

       [[0.34373879, 0.89062707, 0.47000998],
        [0.7274619 , 0.16404356, 0.24732661]],

       [[0.46194184, 0.93088633, 0.86599789],
        [0.55241286, 0.33044344, 0.61982089]],

       [[0.2041296 , 0.50587643, 0.43105876],
        [0.55121638, 0.83303666, 0.93884022]]])

In [None]:
#To pass a shape, use random_sample

In [372]:
np.random.random_sample(a.shape)

array([[0.37220056, 0.05476066, 0.02739874, 0.06341032, 0.20766552,
        0.59821216, 0.90234218],
       [0.22846287, 0.77472806, 0.62988219, 0.1143743 , 0.80802449,
        0.25630692, 0.19959912]])

In [374]:
#For random integer values

In [398]:
np.random.randint(3,7,size=(3,3))

array([[5, 6, 4],
       [4, 3, 3],
       [4, 6, 5]])

In [400]:
np.random.randint(-3,7,size=(3,3))

array([[ 6,  5,  1],
       [ 4,  0,  4],
       [ 4, -2,  4]])
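
The random values above change on every run. When repeatable results are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a minimal sketch; the seed value 0 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=0)
floats = rng.random((4, 2))               # uniform floats in [0, 1)
ints = rng.integers(3, 7, size=(3, 3))    # low is inclusive, high is exclusive

# A second generator with the same seed reproduces the first draw
rng_again = np.random.default_rng(seed=0)
same_floats = rng_again.random((4, 2))

print(np.array_equal(floats, same_floats))  # True
```

As with `np.random.randint(3,7,...)` above, the upper bound 7 is never drawn.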

In [402]:
#Identity matrix

In [404]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [416]:
#Repeat each element of a 1D array
a=np.array([1,2,3])*3
r=np.repeat(a,4)
r

array([3, 3, 3, 3, 6, 6, 6, 6, 9, 9, 9, 9])

In [422]:
#Repeat the row of a 2D array along axis 0
a=np.array([[1,2,3]])*3
r=np.repeat(a,4,axis=0)
r

array([[3, 6, 9],
       [3, 6, 9],
       [3, 6, 9],
       [3, 6, 9]])

#Try this

1  1  1  1  1

1  0  0  0  1

1  0  9  0  1

1  0  0  0  1

1  1  1  1  1

In [476]:
#make all 1s, print
#fill middle part with 0s
#fill middle one with 9 in zeros


In [478]:
a=np.ones((5,5))
a

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [480]:
b=np.zeros((3,3))
b

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [482]:
a[1:4,1:4]

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [486]:
b[1,1]=9
b

array([[0., 0., 0.],
       [0., 9., 0.],
       [0., 0., 0.]])

In [488]:
a[1:4,1:4] = b

In [490]:
a

array([[1., 1., 1., 1., 1.],
       [1., 0., 0., 0., 1.],
       [1., 0., 9., 0., 1.],
       [1., 0., 0., 0., 1.],
       [1., 1., 1., 1., 1.]])
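
The same pattern can also be built without the helper array `b`, by assigning to slices of the ones array directly. This is one alternative sketch, not the only way to solve the exercise; the name `grid` is chosen for illustration.

```python
import numpy as np

grid = np.ones((5, 5))
grid[1:-1, 1:-1] = 0   # zero out the inner 3x3 block
grid[2, 2] = 9         # set the centre element
print(grid)
```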

In [492]:
# Important: b=a does not copy the array; both names refer to the same data

In [494]:
a=np.array([1,2,3,4])

In [498]:
b=a
b

array([1, 2, 3, 4])

In [500]:
b[0]=100

In [502]:
b

array([100,   2,   3,   4])

In [504]:
a

array([100,   2,   3,   4])

In [506]:
#Use copy() to get an independent array
a=np.array([1,2,3,4])

In [510]:
b=a.copy()
b

array([1, 2, 3, 4])

In [514]:
b[0]=100
b

array([100,   2,   3,   4])

In [516]:
a

array([1, 2, 3, 4])
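
The same caution applies to slices: a basic slice is a view onto the original data, so writing through it changes the original array. The sketch below uses `np.shares_memory` to check this; the variable names are chosen for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# A basic slice is a view: it shares memory with a
view = a[1:3]
view[0] = 100
print(a)   # [  1 100   3   4]

# Copying the slice gives independent data
independent = a[1:3].copy()
independent[0] = -1
print(a)   # [  1 100   3   4]  (unchanged by the copy)

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```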

**Mathematics**

In [521]:
a=np.array([1,2,3,4])
print(a)

[1 2 3 4]


In [523]:
a+2

array([3, 4, 5, 6])

In [525]:
a-2

array([-1,  0,  1,  2])

In [527]:
a*2

array([2, 4, 6, 8])

In [529]:
a/2

array([0.5, 1. , 1.5, 2. ])

In [531]:
a%2

array([1, 0, 1, 0], dtype=int32)

In [533]:
a//2

array([0, 1, 1, 2], dtype=int32)

In [537]:
a+=2

In [539]:
a

array([3, 4, 5, 6])

In [541]:
b=np.array([3,0,3,0])

In [545]:
a

array([3, 4, 5, 6])

In [547]:
b

array([3, 0, 3, 0])

In [543]:
a+b

array([6, 4, 8, 6])

In [549]:
a**2

array([ 9, 16, 25, 36])

In [551]:
#sin
np.sin(a)

array([ 0.14112001, -0.7568025 , -0.95892427, -0.2794155 ])

In [555]:
np.cos(a)

array([-0.9899925 , -0.65364362,  0.28366219,  0.96017029])

**Linear Algebra**

- Matrix multiplication is not an element-wise computation
- The number of columns of the first matrix must equal the number of rows of the second

In [567]:
a=np.full((2,3),1)
a

array([[1, 1, 1],
       [1, 1, 1]])

In [569]:
b=np.full((3,2),4)
b

array([[4, 4],
       [4, 4],
       [4, 4]])

In [572]:
#matrix multiplication

In [576]:
a*b # element-wise multiplication, fails because the shapes do not match

ValueError: operands could not be broadcast together with shapes (2,3) (3,2) 

In [578]:
np.matmul(a,b)

array([[12, 12],
       [12, 12]])

In [580]:
a @ b

array([[12, 12],
       [12, 12]])

In [582]:
a.dot(b)

array([[12, 12],
       [12, 12]])
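
Because the arrays above are filled with a single value, every entry of the product looks the same. A sketch with distinct values makes the row-by-column rule easier to follow; the matrices `m` and `n` are made up for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)
n = np.array([[1, 0],
              [0, 1],
              [2, 2]])      # shape (3, 2)

# (2, 3) @ (3, 2) -> (2, 2): each entry is a row of m dotted with a column of n
product = m @ n
# row 0: [1*1 + 2*0 + 3*2, 1*0 + 2*1 + 3*2] = [7, 8]
# row 1: [4*1 + 5*0 + 6*2, 4*0 + 5*1 + 6*2] = [16, 17]
print(product)

# Element-wise multiplication needs matching (or broadcastable) shapes
elementwise = m * m
print(elementwise)
```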

In [586]:
#Determinant of matrix
c=np.identity(5)
c

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [590]:
np.linalg.det(c)

1.0

**Statistics**

In [593]:
st=np.array([[1,2,3,4,5],[6,7,8,9,10]])
st

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [601]:
np.min(st)

1

In [603]:
np.max(st)

10

In [605]:
np.min(st, axis=0)

array([1, 2, 3, 4, 5])

In [607]:
np.min(st, axis=1)

array([1, 6])

In [609]:
np.max(st, axis=1)

array([ 5, 10])

In [611]:
np.max(st, axis=0)

array([ 6,  7,  8,  9, 10])

In [613]:
np.sum(st)

55

In [615]:
np.sum(st, axis=0)

array([ 7,  9, 11, 13, 15])

In [617]:
np.sum(st, axis=1)

array([15, 40])
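
The `axis` argument works the same way for other summary functions: `axis=0` collapses the rows (one result per column) and `axis=1` collapses the columns (one result per row). The sketch below applies `np.mean`, `np.median`, and `np.std` to the same `st` array.

```python
import numpy as np

st = np.array([[1, 2, 3, 4, 5],
               [6, 7, 8, 9, 10]])

col_mean = np.mean(st, axis=0)    # one mean per column
row_mean = np.mean(st, axis=1)    # one mean per row
overall_median = np.median(st)    # median over all 10 values
row_std = np.std(st, axis=1)      # population standard deviation per row

print(col_mean)        # [3.5 4.5 5.5 6.5 7.5]
print(row_mean)        # [3. 8.]
print(overall_median)  # 5.5
print(row_std)
```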

**Reorganizing Arrays**

In [620]:
a=np.array([[1,2,3,4],[5,6,7,8]])
a

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [622]:
a.shape

(2, 4)

In [630]:
#reshape
b=a.reshape(8,1)
b

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])

In [632]:
c=a.reshape(4,2)
c

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [634]:
d=a.reshape(2,2,2)
d

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

In [636]:
d=a.reshape(2,2,3)
d

ValueError: cannot reshape array of size 8 into shape (2,2,3)
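
The error above occurs because the new shape must hold exactly the same number of elements (2*2*3 is 12, not 8). To avoid counting by hand, one dimension can be given as -1 and NumPy infers it from the total size. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# -1 lets NumPy work out the missing dimension: 8 elements / 4 rows = 2 columns
b = a.reshape(4, -1)
print(b.shape)   # (4, 2)

# Flatten to one dimension
flat = a.reshape(-1)
print(flat)      # [1 2 3 4 5 6 7 8]

# The inferred dimension must still divide the size evenly
try:
    a.reshape(3, -1)
    failed = False
except ValueError:
    failed = True
print(failed)    # True
```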

In [638]:
#Stacking
#Vertical stacks

In [640]:
v1=np.array([1,2,3,4])
v2=np.array([5,6,7,8])

In [644]:
np.vstack([v1,v2])

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [646]:
np.vstack([v1,v2,v2,v2,v2])

array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [5, 6, 7, 8],
       [5, 6, 7, 8],
       [5, 6, 7, 8]])

In [648]:
#Horizontal stack

In [654]:
h1=np.ones((2,4))
h2=np.zeros((2,2))
print(h1)
print(h2)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[0. 0.]
 [0. 0.]]


In [656]:
np.hstack([h1,h2])

array([[1., 1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0., 0.]])
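
The stacking helpers are closely related to the more general `np.concatenate` and `np.stack`, which take an explicit `axis`. The sketch below reuses the arrays from this section and checks that the results match.

```python
import numpy as np

v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])
h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

# For 2D arrays, hstack joins along the column axis (axis=1)
side_by_side = np.concatenate([h1, h2], axis=1)
print(np.array_equal(side_by_side, np.hstack([h1, h2])))  # True

# For 1D arrays, vstack creates a new row axis, like stack with axis=0
stacked_rows = np.stack([v1, v2], axis=0)
print(np.array_equal(stacked_rows, np.vstack([v1, v2])))  # True
```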

**Load data from file into numpy**

In [None]:
np.genfromtxt('file_name.txt',delimiter=',')
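
To try `np.genfromtxt` without a file on disk, a file-like object can stand in for it. In the sketch below the text content is made-up sample data, and `io.StringIO` replaces the file name used above.

```python
import io
import numpy as np

# Hypothetical contents of a comma-separated text file
text = "1,2,3\n4,5,6\n"

data = np.genfromtxt(io.StringIO(text), delimiter=",")
print(data.dtype)   # float64
print(data)

# Values are read as floats by default; convert when integers are needed
ints = data.astype("int32")
print(ints)
```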

In [1]:
import numpy as np

In [3]:
l1=[1,3,5,4.3,2.1,4.6,'Yes',True]

In [7]:
a=np.array([l1])
a

array([['1', '3', '5', '4.3', '2.1', '4.6', 'Yes', 'True']], dtype='<U32')

In [9]:
a.reshape(2,4)

array([['1', '3', '5', '4.3'],
       ['2.1', '4.6', 'Yes', 'True']], dtype='<U32')
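
Every value became a string above because a NumPy array has a single type for all elements, so mixed input is converted to a common type that can represent everything. The sketch below shows this with smaller, made-up lists, and also shows `dtype=object` as one way to keep the original Python values (at the cost of the speed benefits of fixed types).

```python
import numpy as np

# Ints mixed with floats are promoted to float64
ints_and_floats = np.array([1, 3, 5, 4.3])
print(ints_and_floats.dtype)   # float64

# Adding a string forces everything to a unicode string type
with_string = np.array([1, 4.3, "Yes"])
print(with_string.dtype.kind)  # U

# dtype=object stores references to the original Python objects
as_objects = np.array([1, 4.3, "Yes", True], dtype=object)
print(type(as_objects[0]), type(as_objects[2]))
```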

In [659]:
##Advanced indexing, Boolean masking

In [699]:
a=np.array([[1,2,3,56,56,4,4,34,54],[4543534,324,25,77,8,3332,399,776,900],[1,343,777,444,7777,908,4,3,2]])
a

array([[      1,       2,       3,      56,      56,       4,       4,
             34,      54],
       [4543534,     324,      25,      77,       8,    3332,     399,
            776,     900],
       [      1,     343,     777,     444,    7777,     908,       4,
              3,       2]])

In [683]:
a>50

array([[False, False, False,  True,  True, False, False, False,  True],
       [ True,  True, False,  True, False,  True,  True,  True,  True],
       [False,  True,  True,  True,  True,  True, False, False, False]])

In [687]:
a[a>50]

array([     56,      56,      54, 4543534,     324,      77,    3332,
           399,     776,     900,     343,     777,     444,    7777,
           908])
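
A Boolean mask can also be used to count or replace the matching elements, not only to select them. The sketch below uses a smaller made-up array; `np.where` returns a new array, while assignment through the mask modifies the array in place.

```python
import numpy as np

small = np.array([[1, 56, 4],
                  [324, 25, 900]])
mask = small > 50

# True counts as 1, so summing the mask counts the matches
count = int(mask.sum())
print(count)    # 3

# Build a new array with every value above 50 capped at 50
capped = np.where(mask, 50, small)
print(capped)

# Or overwrite the matching elements in place (on a copy here)
zeroed = small.copy()
zeroed[mask] = 0
print(zeroed)
```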

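In [693]:
#Index with a list of positions; a separate name keeps the 2D array a intact for the cells that follow
a1=np.array([1,2,3,4,5,6,7,8,9])
a1

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [695]:
a1[[1,2,6]]

array([2, 3, 7])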

In [701]:
#any
np.any(a>50, axis=0)

array([ True,  True,  True,  True,  True,  True,  True,  True,  True])

In [703]:
#all
np.all(a>50, axis=0)

array([False, False, False,  True, False, False, False, False, False])

In [705]:
np.all(a>50, axis=1)

array([False, False, False])

In [715]:
(a>50) & (a<500)

array([[False, False, False,  True,  True, False, False, False,  True],
       [False,  True, False,  True, False, False,  True, False, False],
       [False,  True, False,  True, False, False, False, False, False]])

In [717]:
~((a>50) & (a<500))

array([[ True,  True,  True, False, False,  True,  True,  True, False],
       [ True, False,  True, False,  True,  True, False,  True,  True],
       [ True, False,  True, False,  True,  True,  True,  True,  True]])

In [729]:
b=np.arange(1,31).reshape(6,5)
b

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25],
       [26, 27, 28, 29, 30]])

In [731]:
b[2:4,0:2]

array([[11, 12],
       [16, 17]])

In [737]:
b[[0,1,2,3],[1,2,3,4]]

array([ 2,  8, 14, 20])

In [739]:
b[[0,3,4],3:]

array([[ 4,  5],
       [19, 20],
       [24, 25]])
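
Note the difference between the last two cells: two index lists are paired element by element, whereas a list combined with a slice selects a block. To select a block with two lists, `np.ix_` can be used. A short sketch on the same 6x5 array:

```python
import numpy as np

b = np.arange(1, 31).reshape(6, 5)

# Two lists are paired: picks b[0, 1] and b[3, 2]
paired = b[[0, 3], [1, 2]]
print(paired)   # [ 2 18]

# np.ix_ selects every combination of the given rows and columns
block = b[np.ix_([0, 3, 4], [3, 4])]
print(block)
```

Here `block` matches the result of `b[[0,3,4],3:]` shown above.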

In [741]:
#Indexing:  https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html

In [743]:
#Array : https://numpy.org/doc/stable/reference/routines.array-creation.html

In [745]:
#Math: https://numpy.org/doc/stable/reference/routines.math.html

In [747]:
#linalg: https://numpy.org/doc/stable/reference/routines.linalg.html