
# **NumPy (Week 8)**

## **Working with arrays**
* Scientific Computing
* Financial Analysis
* Relational Data
* Multimedia Data
* Deep Learning

All of these require storing and processing high-dimensional arrays efficiently.

We have already learnt about lists, sets, tuples and dictionaries.

A list can store a collection of numbers (with nested lists for higher dimensions), and we can operate on it by iterating.
But this is very inefficient: often 10x to 100x slower than the same operation on a NumPy array, as the timings below show.

**Why?**
1. Lists are designed to store heterogeneous data, so the type of every element has to be checked at run time.
2. There is no low-level hardware mechanism to accelerate operations on lists.
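
As a small illustration of the first point, the sketch below contrasts a list that mixes types with an array that holds a single type. The variable names are illustrative and not part of the course material.

```python
import numpy as np

# A list may mix types, so every element is a separate Python object
# whose type has to be inspected whenever it is used.
mixed = [1, 2.5, "three"]
element_types = [type(x).__name__ for x in mixed]
print(element_types)  # ['int', 'float', 'str']

# An ndarray stores one fixed-size type for all of its elements.
values = np.array([1, 2.5, 3])  # the integers are upcast to float64
print(values.dtype, values.itemsize)  # float64 8
```

Because every element has the same type and size, NumPy can loop over the data in compiled code without per-element type checks.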


**NumPy**
* Intended to bring performance and functionality improvements to numerical computing
* Started only in 2006!
* Now a standard package used in many real-world applications and by other packages.

***Programming Level***
* Provide implementations of many functions across linear algebra, statistics, ...
* Efficiently broadcast operations across dimensions

***Functionality Level***
* Enable other packages to use NumPy arrays as an efficient data interface
* Efficiently process data without type-checking overhead.

***Hardware Level***
* Enable easy saving and loading of n-d arrays to and from files.
* Efficiently store n-d arrays in vectorised form to benefit from DRAM locality.
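
To make the broadcasting and memory-layout points above a little more concrete, here is a minimal sketch. The array names are hypothetical, and the layout check only shows that a freshly created array sits in one contiguous block of memory.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# Broadcasting: the smaller operand is virtually repeated to match the
# larger one, without writing an explicit loop.
shifted_by_row = matrix + row       # row is applied to each of the 2 rows
shifted_by_col = matrix + col       # col is applied to each of the 3 columns
print(shifted_by_row)
print(shifted_by_col)

# Memory layout: elements of a row sit next to each other in memory.
# strides gives the number of bytes to step along each dimension.
print(matrix.flags['C_CONTIGUOUS'], matrix.strides)
```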

**What we will focus on**
* What are n-d arrays?
* What is broadcasting?
* How to load and save n-d arrays?
* How to use statistical functions?
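
As a brief preview of the last two items, the sketch below saves an array, loads it back, and computes a few statistics. The file name `data.npy` and the temporary directory are assumptions made only for this example.

```python
import os
import tempfile

import numpy as np

data = np.arange(12, dtype=np.float64).reshape(3, 4)

# Save to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.npy")  # hypothetical file name
    np.save(path, data)        # writes NumPy's binary .npy format
    restored = np.load(path)   # reads the whole array back into memory

print(np.array_equal(data, restored))  # True
print(restored.mean(), restored.max())  # 5.5 11.0
print(restored.sum(axis=0))  # column sums, one value per column
```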


## **Comparing performance with lists**

In [8]:
import numpy as np

In [1]:
N = 10000000

In [2]:
%%time
list_ = list(range(N))
for i in range(N):
    list_[i] = list_[i] * list_[i]

CPU times: user 2.73 s, sys: 191 ms, total: 2.92 s
Wall time: 2.92 s


In [3]:
%%time
list_ = list(range(N))
list_ = [item * item for item in list_]

CPU times: user 1.01 s, sys: 366 ms, total: 1.37 s
Wall time: 1.38 s


In [4]:
%%time
list_ = list(range(N))
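# Note: in Python 3, map() is lazy and only returns an iterator, so no
# squaring happens here; the time below mostly measures building the list.
list_ = map(lambda x: x*x , list_)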

CPU times: user 143 ms, sys: 158 ms, total: 301 ms
Wall time: 300 ms


In [5]:
%%time
list_ = list(range(N))
list_sum = 0
for item in list_:
    list_sum+=item

CPU times: user 1.55 s, sys: 223 ms, total: 1.78 s
Wall time: 1.78 s


In [6]:
%%time
list_ = list(range(N))
list_sum = sum(list_)

CPU times: user 247 ms, sys: 432 ms, total: 679 ms
Wall time: 677 ms


In [9]:
%%time
arr = np.arange(N)
arr = arr * arr


CPU times: user 68.2 ms, sys: 2.65 ms, total: 70.8 ms
Wall time: 78.7 ms


In [10]:
%%time
arr = np.arange(N)
arr_sum = np.sum(arr)

CPU times: user 37.9 ms, sys: 0 ns, total: 37.9 ms
Wall time: 42.7 ms
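
The `%%time` magic only works inside a notebook. Outside one, a comparison along the same lines can be sketched with the standard `timeit` module. The function names and the smaller problem size are choices made for this example, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

n = 1_000_000  # smaller than N above so the sketch finishes quickly

def square_with_list():
    # Pure-Python squaring, one element at a time.
    return [x * x for x in range(n)]

def square_with_numpy():
    # Vectorised squaring over the whole array at once.
    arr = np.arange(n, dtype=np.int64)
    return arr * arr

# Take the best of three runs to reduce noise from other processes.
list_time = min(timeit.repeat(square_with_list, number=1, repeat=3))
numpy_time = min(timeit.repeat(square_with_numpy, number=1, repeat=3))

print(f"list comprehension: {list_time:.4f} s")
print(f"numpy:              {numpy_time:.4f} s")

# Both approaches should compute the same values.
same_result = square_with_list() == square_with_numpy().tolist()
print(same_result)  # True
```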


## **High Dimensional Array & Creating NumPy Array**

In [11]:
arr = np.arange(5)

In [12]:
print(arr, type(arr))

[0 1 2 3 4] <class 'numpy.ndarray'>


In [21]:
arr = np.array([0.0,2,4,6,8])

In [22]:
print(arr, type(arr))

[0. 2. 4. 6. 8.] <class 'numpy.ndarray'>


In [23]:
arr

array([0., 2., 4., 6., 8.])

In [24]:
arr.dtype

dtype('float64')

In [25]:
arr.ndim

1

In [26]:
arr.shape

(5,)

In [27]:
arr.size

5

In [28]:
arr.itemsize

8
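
The attributes above are related: `size` is the product of the entries in `shape`, and the memory taken by the data is `size * itemsize` bytes, which NumPy also exposes as `nbytes`. A minimal sketch (the name `small` is illustrative):

```python
import numpy as np

arr = np.array([0.0, 2, 4, 6, 8])

# 5 elements of 8 bytes each (float64).
total_bytes = arr.size * arr.itemsize
print(total_bytes, arr.nbytes)  # 40 40

# Converting to a smaller dtype halves the storage per element.
small = arr.astype(np.float32)
print(small.itemsize, small.nbytes)  # 4 20
```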

In [29]:
arr2d = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

In [30]:
arr2d

array([[1, 2, 3],
       [4, 5, 6]])

In [31]:
arr2d.ndim

2

In [32]:
arr2d.shape

(2, 3)

In [33]:
arr2d.size

6

In [34]:
arr3d = np.array([
    [
        [1, 2, 3],
        [4, 5, 6],
    ],
    [
        [7, 8, 9],
        [10, 11, 12],
    ],
])

In [35]:
arr3d.shape

(2, 2, 3)

In [36]:
arr3d.ndim

3

In [37]:
arr3d.size

12
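
To see how the shape `(2, 2, 3)` maps onto the nested brackets, the following sketch indexes into the same array: the first index picks a block, the second a row, and the third a column.

```python
import numpy as np

arr3d = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

# Block 1, row 0, column 2.
print(arr3d[1, 0, 2])  # 9

# Fixing the first index leaves a 2-d array.
print(arr3d[0].shape)  # (2, 3)

# size is the product of the shape entries: 2 * 2 * 3.
print(arr3d.size == np.prod(arr3d.shape))  # True
```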

In [38]:
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [39]:
1729*np.ones((2,3,4))

array([[[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]],

       [[1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.],
        [1729., 1729., 1729., 1729.]]])

In [40]:
np.zeros((2,3,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [41]:
np.random.randn(2, 3)  # 2x3 samples from the standard normal distribution

array([[ 1.25468433, -1.45592176, -0.3939183 ],
       [ 0.34970484,  0.27248797, -0.63126578]])

In [42]:
np.random.rand(2, 3)  # 2x3 samples drawn uniformly from [0, 1)

array([[0.88709998, 0.51298019, 0.32951752],
       [0.84473094, 0.27708629, 0.08807862]])

In [43]:
np.random.rand(10,1)

array([[0.81447144],
       [0.88417104],
       [0.21962908],
       [0.18017848],
       [0.22540267],
       [0.22316137],
       [0.53973028],
       [0.17954051],
       [0.08722958],
       [0.01064278]])

In [44]:
np.random.randint(0, 100, (2, 3))  # 2x3 integers from 0 (inclusive) to 100 (exclusive)

array([[37, 49, 80],
       [ 4, 93, 22]])
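
The random values above change on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch of an alternative to the `np.random.*` calls used above, not what the notebook itself ran, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for the example

normal = rng.standard_normal((2, 3))      # counterpart of np.random.randn(2, 3)
uniform = rng.random((2, 3))              # counterpart of np.random.rand(2, 3)
ints = rng.integers(0, 100, size=(2, 3))  # counterpart of np.random.randint

print(normal)
print(uniform)
print(ints)

# A second generator with the same seed reproduces the same first draw.
repeat = np.random.default_rng(seed=42).standard_normal((2, 3))
print(np.array_equal(normal, repeat))  # True
```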

In [45]:
np.arange(7, 71, 7)  # from 7 up to (but excluding) 71, in steps of 7

array([ 7, 14, 21, 28, 35, 42, 49, 56, 63, 70])

In [46]:
np.linspace(7, 70, 10)  # 10 evenly spaced values from 7 to 70, both included

array([ 7., 14., 21., 28., 35., 42., 49., 56., 63., 70.])
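
The two calls above produce the same values but are specified differently: `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a number of points and includes the stop value by default. A short sketch of the difference (variable names are illustrative):

```python
import numpy as np

by_step = np.arange(7, 71, 7)       # step given; stop (71) is excluded
by_count = np.linspace(7, 70, 10)   # count given; stop (70) is included

# linspace spacing is (stop - start) / (count - 1).
spacing = (70 - 7) / (10 - 1)
print(spacing)  # 7.0

# arange keeps integer inputs as integers; linspace returns floats.
print(by_step.dtype.kind, by_count.dtype.kind)  # i f

# The values themselves agree.
print(np.allclose(by_step, by_count))  # True
```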

In [47]:
np.array([True, False, True, True])

array([ True, False,  True,  True])

In [49]:
str_arr = np.array(['1.4','2.1','1.1'])

In [50]:
arr = np.array(str_arr, dtype='float')  # parse each string as a float

In [51]:
arr

array([1.4, 2.1, 1.1])
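
The same conversion can also be written with `astype`, which returns a new array of the requested type. The sketch below additionally shows what happens when a string cannot be parsed; the `bad` array is a made-up example.

```python
import numpy as np

str_arr = np.array(['1.4', '2.1', '1.1'])
print(str_arr.dtype.kind)  # U (a fixed-width unicode string type)

# astype returns a converted copy and leaves str_arr unchanged.
float_arr = str_arr.astype(np.float64)
print(float_arr.dtype, float_arr.sum())

# A string that is not a number makes the conversion fail.
bad = np.array(['1.4', 'abc'])  # hypothetical invalid input
try:
    bad.astype(np.float64)
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)  # True
```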