
# **NumPy (Week 8)**

## **Working with arrays**
* Scientific Computing
* Financial Analysis
* Relational Data
* Multimedia Data
* Deep Learning

All of these require storing and processing high-dimensional arrays efficiently.

We have already learnt about lists, sets, tuples and dictionaries.

A list (or a list of nested lists) can store a high-dimensional collection of numbers, and we can operate on it by iterating.
But this is very inefficient: typically 10x to 100x slower than the performance we would expect.

**Why?**
1. Lists are designed to store heterogeneous data.
2. There is no low-level hardware mechanism to accelerate operations on lists.
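
To make the first point concrete, here is a minimal sketch contrasting a list that mixes types with NumPy arrays that hold a single dtype. The variable names are illustrative only.

```python
import numpy as np

# A list may mix types; every element is a separate Python object.
mixed_list = [1, 2.5, "three"]
element_types = [type(item).__name__ for item in mixed_list]
print(element_types)  # ['int', 'float', 'str']

# A NumPy array stores all of its elements with one dtype.
int_arr = np.array([1, 2, 3])
promoted_arr = np.array([1, 2.5, 3])  # the integers are promoted to float

print(int_arr.dtype.kind)  # i  (an integer type; the exact width depends on the platform)
print(promoted_arr.dtype)  # float64
```

Because the dtype is fixed for the whole array, NumPy does not need to inspect each element's type while computing.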


**NumPy**
* Intended to bring performance and functionality improvements for numerical computing
* Started only in 2006!
* Now a standard package used in many real-world applications and by other packages.

***Programming Level***
* Provide implementations of many functions across linear algebra, statistics...
* Efficiently broadcast operations across dimensions

***Functionality Level***
* Enable other packages to use numpy arrays as an efficient data interface 
* Efficiently process data without type-checking overhead.

***Hardware Level***
* Enable easy file save and load of n-d arrays.
* Efficiently store n-d arrays in vectorised form to benefit from DRAM locality.
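
The storage point can be inspected directly. The sketch below assumes a small `int64` array named `grid` (a hypothetical name) and looks at its memory layout through `flags`, `strides` and `nbytes`.

```python
import numpy as np

grid = np.arange(12, dtype=np.int64).reshape(3, 4)

# Row-major (C order) layout: the elements of a row sit next to each other in memory.
print(grid.flags['C_CONTIGUOUS'])  # True

# Strides: how many bytes to step to move one position along each axis.
print(grid.strides)  # (32, 8)

# Total size of the data buffer: 12 elements * 8 bytes each.
print(grid.nbytes)  # 96

# A column is not contiguous: consecutive elements are 32 bytes apart.
column = grid[:, 0]
print(column.flags['C_CONTIGUOUS'])  # False
print(column.strides)                # (32,)
```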

**What we will focus on**
* What are n-d arrays?
* What is broadcasting?
* How to load and save n-d arrays
* How to use statistical functions
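
As a small preview of the last two topics, the sketch below saves an array with `np.save`, reads it back with `np.load`, and computes a few statistics. The file name `data.npy` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

data = np.arange(6, dtype=np.float64).reshape(2, 3)

# Save to a temporary .npy file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.npy")  # hypothetical file name
    np.save(path, data)
    restored = np.load(path)

print(np.array_equal(data, restored))  # True

# A few statistical functions, over the whole array and along an axis.
print(data.mean())       # 2.5
print(data.sum(axis=0))  # [3. 5. 7.]
print(data.max(axis=1))  # [2. 5.]
```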


## Comparing performance with lists, etc.

In [1]:
import numpy as np

In [2]:
N = 10000000

In [3]:
%%time
list_ = list(range(N))
for i in range(N):
    list_[i] = list_[i] * list_[i]

CPU times: user 3.63 s, sys: 244 ms, total: 3.87 s
Wall time: 3.92 s


In [4]:
%%time
list_ = list(range(N))
list_ = [item * item for item in list_]

CPU times: user 1.02 s, sys: 549 ms, total: 1.57 s
Wall time: 1.57 s


In [5]:
%%time
list_ = list(range(N))
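# Note: in Python 3, map() is lazy and returns an iterator, so no squaring
# happens here; the time below mostly reflects building list(range(N)).
list_ = map(lambda x: x*x , list_)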

CPU times: user 375 ms, sys: 254 ms, total: 629 ms
Wall time: 631 ms


In [6]:
%%time
list_ = list(range(N))
list_sum = 0
for item in list_:
    list_sum+=item

CPU times: user 1.48 s, sys: 182 ms, total: 1.66 s
Wall time: 1.67 s


In [7]:
%%time
list_ = list(range(N))
list_sum = sum(list_)

CPU times: user 331 ms, sys: 186 ms, total: 517 ms
Wall time: 517 ms


In [8]:
%%time
arr = np.arange(N)
arr = arr * arr


CPU times: user 47 ms, sys: 1.21 ms, total: 48.3 ms
Wall time: 52.9 ms


In [9]:
%%time
arr = np.arange(N)
arr_sum = np.sum(arr)

CPU times: user 31.2 ms, sys: 1.05 ms, total: 32.2 ms
Wall time: 33.2 ms
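
`%%time` measures a single run, so the numbers above can vary between runs. A hedged alternative is to repeat the measurement with the standard `timeit` module and keep the best time. The sketch below uses a smaller `n` than the `N` above so that it finishes quickly, and the function names are illustrative.

```python
import timeit

import numpy as np

n = 100_000  # smaller than N above so that the repeats finish quickly


def square_with_list():
    # Pure-Python version: one multiplication per loop iteration.
    return [item * item for item in range(n)]


def square_with_numpy():
    # Vectorised version: a single array multiplication.
    arr = np.arange(n, dtype=np.int64)
    return arr * arr


# Run each function 5 times per measurement, repeat 3 times, keep the best.
list_seconds = min(timeit.repeat(square_with_list, number=5, repeat=3))
numpy_seconds = min(timeit.repeat(square_with_numpy, number=5, repeat=3))

print(f"list comprehension: {list_seconds:.4f} s, NumPy: {numpy_seconds:.4f} s")
```

The actual timings depend on the machine, so no expected output is shown.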


## **High-Dimensional Arrays & Creating NumPy Arrays**

In [10]:
arr = np.arange(5)

In [11]:
print(arr, type(arr))

[0 1 2 3 4] <class 'numpy.ndarray'>


In [12]:
arr = np.array([0.0,2,4,6,8])

In [13]:
print(arr, type(arr))

[0. 2. 4. 6. 8.] <class 'numpy.ndarray'>


In [14]:
arr

array([0., 2., 4., 6., 8.])

In [15]:
arr.dtype

dtype('float64')

In [16]:
arr.ndim

1

In [17]:
arr.shape

(5,)

In [18]:
arr.size

5

In [19]:
arr.itemsize

8

In [20]:
arr2d = np.array([
                  [1,2,3],
                  [4,5,6]
])

In [21]:
arr2d

array([[1, 2, 3],
       [4, 5, 6]])

In [22]:
arr2d.ndim

2

In [23]:
arr2d.shape

(2, 3)

In [24]:
arr2d.size

6

In [25]:
arr3d = np.array([
                  [
                   [1,2,3],
                   [4,5,6]
                  ],
                  [
                   [7,8,9],
                   [10,11,12]
                  ]
])

In [26]:
arr3d.shape

(2, 2, 3)

In [27]:
arr3d.ndim

3

In [28]:
arr3d.size

12

In [29]:
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [30]:
1729*np.ones((2,3,4))

array([[[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]],

       [[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]]])

In [31]:
np.zeros((2,3,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [32]:
np.random.randn(2,3)

array([[ 0.12335659,  0.28081514, -0.47021374],
       [-0.10979971,  0.93599146, -0.21383982]])

In [33]:
np.random.rand(2,3)

array([[0.8792471 , 0.263905  , 0.426518  ],
       [0.65101014, 0.71780975, 0.93895392]])

In [34]:
np.random.rand(10,1)

array([[0.92583908],
       [0.50721999],
       [0.56053915],
       [0.90214265],
       [0.53305515],
       [0.38009755],
       [0.62816872],
       [0.4680259 ],
       [0.42604201],
       [0.24331606]])

In [35]:
np.random.randint(0, 100,(2,3))

array([[68, 68,  5],
       [93, 89, 53]])

In [36]:
np.arange(7, 71, 7)

array([ 7, 14, 21, 28, 35, 42, 49, 56, 63, 70])

In [37]:
np.linspace(7,70,10)

array([ 7., 14., 21., 28., 35., 42., 49., 56., 63., 70.])

In [38]:
np.array([True, False, True, True])

array([ True, False,  True,  True])

In [39]:
str_arr = np.array(['1.4','2.1','1.1'])

In [40]:
arr = np.array(str_arr, dtype = 'float')

In [41]:
arr

array([1.4, 2.1, 1.1])
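
An existing array can also be converted with its `astype` method, which returns a new array and leaves the original untouched. A minimal sketch, with illustrative variable names:

```python
import numpy as np

str_values = np.array(['1.4', '2.1', '1.1'])

float_values = str_values.astype(np.float64)  # parse the strings as floats
int_values = float_values.astype(np.int64)    # truncates toward zero

print(float_values)           # [1.4 2.1 1.1]
print(int_values)             # [1 2 1]
print(str_values.dtype.kind)  # U  (still a unicode string array)
```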

## Creating np Arrays

In [42]:
import numpy as np

In [43]:
arr = np.arange(5)

In [44]:
print(arr, type(arr))

[0 1 2 3 4] <class 'numpy.ndarray'>


In [45]:
arr.dtype

dtype('int64')

In [46]:
arr.ndim

1

In [47]:
arr.shape

(5,)

In [48]:
arr.size

5

In [49]:
arr.itemsize

8

In [50]:
arr2d = np.array([
    [1,2,3],
    [4,5,6]
])

In [51]:
arr2d

array([[1, 2, 3],
       [4, 5, 6]])

In [52]:
arr2d.ndim

2

In [53]:
arr2d.shape

(2, 3)

In [54]:
arr3d = np.array([
                  [
                   [1,2,3],
                   [4,5,6]
                  ],
                  [
                   [7,8,9],
                   [10,11,12]
                  ]
])

In [55]:
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [56]:
arr3d.shape

(2, 2, 3)

In [57]:
arr3d.size

12

In [58]:
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [59]:
1729*np.ones((2,3,4))

array([[[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]],

       [[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]]])

In [60]:
np.zeros((2,3,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [61]:
np.random.randn(2,3)

array([[-1.87071882,  0.39207096, -0.0153899 ],
       [ 0.34929235,  2.17690438, -0.48915203]])

In [62]:
np.random.rand(2,3)

array([[0.70011858, 0.10350166, 0.40812385],
       [0.36908371, 0.68166539, 0.12124356]])

In [63]:
np.random.rand(10,1)

array([[0.14255806],
       [0.99158405],
       [0.97821606],
       [0.05665871],
       [0.82688525],
       [0.07510126],
       [0.87198274],
       [0.46459521],
       [0.92227277],
       [0.23443406]])

In [64]:
np.random.randint(0,100,(2,3))

array([[93, 27, 15],
       [ 2, 99, 13]])

In [65]:
np.arange(7,71,7)

array([ 7, 14, 21, 28, 35, 42, 49, 56, 63, 70])

In [66]:
np.linspace(7, 70,10)

array([ 7., 14., 21., 28., 35., 42., 49., 56., 63., 70.])

In [67]:
np.array([True, False, True, True])

array([ True, False,  True,  True])

In [68]:
str_arr = np.array(['1.4','2.1','3.1'])

In [69]:
arr = np.array(str_arr, dtype = 'float')

In [70]:
arr

array([1.4, 2.1, 3.1])

## Indexing

In [71]:
print(arr3d)

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


In [72]:
arr3d[0,0,0]

1

In [73]:
arr3d[1,0,2]

9

In [74]:
i = 1
j = 0
k = 2
arr3d[i,j,k]

9

In [75]:
arr3d[1,:,:]

array([[ 7,  8,  9],
       [10, 11, 12]])

In [76]:
arr3d[:,1,:]

array([[ 4,  5,  6],
       [10, 11, 12]])

In [77]:
arr3d % 2 ==0

array([[[False,  True, False],
        [ True, False,  True]],

       [[False,  True, False],
        [ True, False,  True]]])

In [78]:
arr3d[arr3d % 2 ==1]

array([ 1,  3,  5,  7,  9, 11])

In [79]:
arr3d[(arr3d % 2 ==1) & (arr3d > 3)]

array([ 5,  7,  9, 11])
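
Boolean masks can be stored in variables and combined with `&` (and), `|` (or) and `~` (not). The sketch below rebuilds an array with the values 1 to 12 under the assumed name `values`, so it does not depend on the state of `arr3d`.

```python
import numpy as np

values = np.arange(1, 13).reshape(2, 2, 3)

odd = values % 2 == 1
large = values > 3

print(values[odd | large])  # odd OR greater than 3: [ 1  3  4  5  6  7  8  9 10 11 12]
print(values[~odd])         # NOT odd: [ 2  4  6  8 10 12]

# Count the matches without extracting them.
print(np.count_nonzero(odd & large))  # 4

# Keep the original shape: odd entries are kept, the rest become 0.
evens_zeroed = np.where(odd, values, 0)
print(evens_zeroed.shape)  # (2, 2, 3)
```

Indexing with a mask always returns a flattened 1-d result, whereas `np.where` preserves the shape.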

In [80]:
arr_slice = arr3d[:,:,0:2]

In [81]:
print(type(arr_slice))

<class 'numpy.ndarray'>


In [82]:
arr_slice.ndim

3

In [83]:
arr_slice.shape

(2, 2, 2)

In [84]:
arr_slice[0,0,1]

2

In [85]:
arr_slice[0,0,0] =1729

In [86]:
arr_slice

array([[[1729,    2],
        [   4,    5]],

       [[   7,    8],
        [  10,   11]]])

In [87]:
arr3d

array([[[1729,    2,    3],
        [   4,    5,    6]],

       [[   7,    8,    9],
        [  10,   11,   12]]])

In [88]:
arr_slice = np.copy(arr3d[:,:,0:2])

In [89]:
arr_slice[0,0,0] =1

In [90]:
arr_slice

array([[[ 1,  2],
        [ 4,  5]],

       [[ 7,  8],
        [10, 11]]])

In [91]:
arr3d

array([[[1729,    2,    3],
        [   4,    5,    6]],

       [[   7,    8,    9],
        [  10,   11,   12]]])
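
The cells above show that a slice is a view onto the original data, while `np.copy` produces an independent array. One way to check this without modifying anything is `np.shares_memory`. A minimal sketch with illustrative names:

```python
import numpy as np

original = np.arange(1, 13).reshape(2, 2, 3)

view = original[:, :, 0:2]                # a slice: shares the data buffer
independent = original[:, :, 0:2].copy()  # a copy: has its own buffer

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False

# Writing through the view changes the original; writing to the copy does not.
view[0, 0, 0] = 1729
independent[0, 0, 0] = -1
print(original[0, 0, 0])  # 1729
```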

In [92]:
arr = np.random.randint(0,10, (5))

In [93]:
arr

array([7, 5, 3, 3, 9])

In [94]:
my_indices = [1,3,4]

In [95]:
arr[my_indices]

array([5, 3, 9])
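
Unlike a slice, indexing with a list of integers returns a copy. The sketch below fixes the array contents to the values printed above (the original cell used random numbers) and shows both reading and assigning through an index list.

```python
import numpy as np

arr = np.array([7, 5, 3, 3, 9])
my_indices = [1, 3, 4]

picked = arr[my_indices]  # integer-list indexing returns a copy
picked[0] = -1
print(arr)     # [7 5 3 3 9]   (unchanged)
print(picked)  # [-1  3  9]

# Assigning through the index list does modify the array in place.
arr[my_indices] = 0
print(arr)     # [7 0 3 0 0]
```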

## NumPy Operations

In [96]:
arr1 = np.random.rand(3,4)
arr2 = np.random.rand(3,4)

In [97]:
arr1

array([[0.83535881, 0.08464225, 0.34906169, 0.73612548],
       [0.9071185 , 0.57957974, 0.08159811, 0.743952  ],
       [0.23714779, 0.12834388, 0.79613043, 0.32433883]])

In [98]:
arr2

array([[0.93182915, 0.30987232, 0.28340046, 0.59678953],
       [0.01291882, 0.16561504, 0.18325294, 0.17671213],
       [0.39817521, 0.71281582, 0.43828649, 0.24824885]])

In [99]:
arr1+arr2

array([[1.76718796, 0.39451457, 0.63246215, 1.33291501],
       [0.92003731, 0.74519477, 0.26485105, 0.92066413],
       [0.635323  , 0.84115971, 1.23441692, 0.57258767]])

In [100]:
arr1-arr2

array([[-0.09647034, -0.22523007,  0.06566123,  0.13933594],
       [ 0.89419968,  0.4139647 , -0.10165483,  0.56723987],
       [-0.16102742, -0.58447194,  0.35784394,  0.07608998]])

In [101]:
arr1 * arr2

array([[0.77841169, 0.02622829, 0.09892424, 0.43931198],
       [0.0117189 , 0.09598712, 0.01495309, 0.13146534],
       [0.09442637, 0.09148555, 0.34893321, 0.08051674]])

In [102]:
arr1/ arr2

array([[ 0.89647207,  0.27315202,  1.23169063,  1.23347585],
       [70.21683542,  3.49955985,  0.44527586,  4.20996564],
       [ 0.59558652,  0.18005196,  1.81646125,  1.30650689]])

In [103]:
np.exp(arr1)

array([[2.30564118, 1.08832765, 1.41773665, 2.08783047],
       [2.47717425, 1.78528798, 1.08501966, 2.10423505],
       [1.26762845, 1.13694391, 2.21694568, 1.38311586]])

In [104]:
np.log(np.exp(arr1))

array([[0.83535881, 0.08464225, 0.34906169, 0.73612548],
       [0.9071185 , 0.57957974, 0.08159811, 0.743952  ],
       [0.23714779, 0.12834388, 0.79613043, 0.32433883]])

In [105]:
np.sin(arr1)

array([[0.74153729, 0.08454122, 0.34201624, 0.67142164],
       [0.78773195, 0.54767235, 0.08150759, 0.67720107],
       [0.2349312 , 0.12799182, 0.71465477, 0.31868215]])

In [106]:
np.cos(arr1)

array([[0.67091166, 0.99641998, 0.93969404, 0.74107555],
       [0.61601816, 0.83669289, 0.99667272, 0.73579801],
       [0.972012  , 0.99177522, 0.69947735, 0.94786164]])

In [107]:
np.sqrt(arr1)

array([[0.91397965, 0.29093341, 0.59081443, 0.85797755],
       [0.95242769, 0.76130134, 0.28565383, 0.86252652],
       [0.48697822, 0.35825115, 0.89226141, 0.56950753]])

In [108]:
arr_inv = 1 / arr1

In [109]:
arr_inv

array([[ 1.19709039, 11.81443073,  2.86482311,  1.35846406],
       [ 1.10239181,  1.72538814, 12.255186  ,  1.34417274],
       [ 4.21677976,  7.79156721,  1.25607559,  3.08319547]])

In [110]:
arr1 = np.zeros((3,5))

In [111]:
arr_inv = 1/arr1  # division by zero: NumPy emits a RuntimeWarning and returns inf


In [112]:
print(arr_inv)

[[inf inf inf inf inf]
 [inf inf inf inf inf]
 [inf inf inf inf inf]]


In [113]:
np.isinf(arr_inv[0,0])

True

In [115]:
np.isinf(arr_inv)

array([[ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])
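
Two common ways of handling division by zero are sketched below, assuming a small array named `denominators`. `np.errstate` silences the warning locally when `inf` is an acceptable result, and `np.divide` with `where=` skips the zero entries and leaves a chosen default in their place.

```python
import numpy as np

denominators = np.array([2.0, 0.0, 4.0])

# Option 1: accept inf, but silence the warning inside this block only.
with np.errstate(divide='ignore'):
    with_inf = 1 / denominators

# Option 2: divide only where the denominator is non-zero;
# the other positions keep the value already stored in `out` (0 here).
safe = np.divide(1.0, denominators,
                 out=np.zeros_like(denominators),
                 where=denominators != 0)

print(np.isinf(with_inf).tolist())  # [False, True, False]
print(safe.tolist())                # [0.5, 0.0, 0.25]
```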

## Problem Solution

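* Exercise: estimate the fraction of uniformly random points in the unit hypercube that lie outside the unit n-dimensional sphere (distance from the origin greater than 1).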

In [116]:
ndim = 2

In [117]:
npoints = 100000

In [118]:
points = np.random.rand(npoints, ndim)

In [119]:
points[0:2, :]

array([[0.13029478, 0.29981583],
       [0.31134425, 0.5012857 ]])

In [120]:
dfo = np.zeros((npoints,1))  # dfo: distance from origin of each point
outside_points = 0

In [121]:
%%time
for i in range(npoints):
    for j in range(ndim):
        dfo[i] += points[i,j] **2
    dfo[i] = np.sqrt(dfo[i])
    if dfo[i]> 1:
        outside_points +=1

CPU times: user 2 s, sys: 374 ms, total: 2.38 s
Wall time: 2.14 s


In [122]:
print(outside_points/npoints)

0.21349


In [123]:
# 1-(pi/4)
1-3.14/4

0.21499999999999997

In [124]:
%%time
sq_points = points * points
dfo = np.sum(sq_points, axis = 1)  # squared distance; the sqrt is not needed because the threshold is 1
outside_points = np.sum(dfo >1)



CPU times: user 3.27 ms, sys: 0 ns, total: 3.27 ms
Wall time: 3.29 ms
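
The vectorised version can be wrapped in a function that works for any number of dimensions. This is a minimal sketch: the function name, the `seed` parameter and the use of `np.random.default_rng` (instead of `np.random.rand`) are choices made here for reproducibility, not part of the original exercise.

```python
import numpy as np


def fraction_outside_unit_sphere(npoints, ndim, seed=0):
    """Estimate the fraction of uniform points in [0, 1)^ndim farther than 1 from the origin."""
    rng = np.random.default_rng(seed)
    points = rng.random((npoints, ndim))
    # Squared distance from the origin; comparing with 1 avoids the sqrt.
    squared_distance = np.sum(points * points, axis=1)
    return np.mean(squared_distance > 1)


estimate_2d = fraction_outside_unit_sphere(100_000, 2)
exact_2d = 1 - np.pi / 4  # area of the unit square outside the quarter circle

print(estimate_2d, exact_2d)
```

In 2 dimensions the estimate should land close to 1 - pi/4 (about 0.2146). In 1 dimension every point lies inside, so the fraction is 0.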


## Broadcasting

In [125]:
arr1 = np.arange(6)
arr1.shape

(6,)

In [126]:
arr1 = arr1.reshape((3,2))

In [127]:
arr1.shape

(3, 2)

In [128]:
arr1

array([[0, 1],
       [2, 3],
       [4, 5]])

In [129]:
arr2 = np.arange(6).reshape((3,2))

In [130]:
arr1 + arr2

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

In [131]:
arr2[0].reshape((1,2))

array([[0, 1]])

In [132]:
arr1 + arr2[0].reshape((1,2)) # (3,2) + (1,2)

array([[0, 2],
       [2, 4],
       [4, 6]])

In [133]:
arr2[:,0].reshape((3,1))

array([[0],
       [2],
       [4]])

In [134]:
arr1 + arr2[:,0].reshape((3,1)) # (3,2) + (3,1)

array([[0, 1],
       [4, 5],
       [8, 9]])

In [135]:
arr1 +1

array([[1, 2],
       [3, 4],
       [5, 6]])

In [137]:
arr1 = np.arange(24).reshape((2,3,4))

In [138]:
arr1

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [139]:
arr2 = np.ones((1,4))

In [141]:
arr1 + arr2  # (2,3,4) + (1,4)

array([[[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.]],

       [[13., 14., 15., 16.],
        [17., 18., 19., 20.],
        [21., 22., 23., 24.]]])
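
The examples above follow NumPy's broadcasting rule: shapes are aligned from the rightmost axis, and two axis lengths are compatible when they are equal or when one of them is 1. The helper below (a hypothetical function written for this note) asks NumPy for the resulting shape using `np.broadcast`, and returns `None` when the shapes cannot be broadcast.

```python
import numpy as np


def broadcast_result_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or None if they are incompatible."""
    try:
        return np.broadcast(np.empty(shape_a), np.empty(shape_b)).shape
    except ValueError:
        return None


print(broadcast_result_shape((3, 2), (1, 2)))     # (3, 2)
print(broadcast_result_shape((3, 2), (3, 1)))     # (3, 2)
print(broadcast_result_shape((2, 3, 4), (1, 4)))  # (2, 3, 4)
print(broadcast_result_shape((2, 3, 4), (3, 1)))  # (2, 3, 4)
print(broadcast_result_shape((2, 3, 4), (2, 4)))  # None  (3 vs 2 on the middle axis)
```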

In [142]:
arr1 = np.arange(4)

In [144]:
arr2 = np.arange(5)

In [145]:
print(arr1.shape, arr2.shape)

(4,) (5,)
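
Shapes `(4,)` and `(5,)` are not compatible for broadcasting, because the only axis has lengths 4 and 5 and neither is 1. The sketch below shows the resulting `ValueError`, and one possible way to combine the two arrays anyway: giving `arr1` an extra axis with `np.newaxis` so that every pair of elements is added.

```python
import numpy as np

arr1 = np.arange(4)
arr2 = np.arange(5)

try:
    arr1 + arr2
    shapes_compatible = True
except ValueError:
    # NumPy refuses to broadcast (4,) with (5,).
    shapes_compatible = False
print(shapes_compatible)  # False

# (4, 1) + (5,) broadcasts to (4, 5): entry [i, j] is arr1[i] + arr2[j].
pairwise_sum = arr1[:, np.newaxis] + arr2
print(pairwise_sum.shape)  # (4, 5)
print(pairwise_sum[3, 4])  # 7
```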
