### SESSION 14 - Advanced NumPy

### NumPy arrays vs Python lists :
- **Factors compared :**
    - **Speed**
    - **Memory**
    - **Convenience**
 

**Speed :**

In [1]:
# python list
a = [i for i in range(10000000)]
b = [i for i in range(10000000,20000000)]

import time
c = []
start = time.time()
for i in range(len(a)):
    c.append(a[i]+b[i])
print(time.time()-start)


4.512556076049805


In [3]:
# python numpy
import numpy as np
a = np.arange(10000000)
b = np.arange(10000000,20000000)

start = time.time()
c = a + b
print(time.time()-start)

0.3487734794616699


In [8]:
4.512556076049805/0.3487734794616699

12.938357821858794
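
A single `time.time()` measurement can be noisy, so the ratio above will vary from run to run and machine to machine. The sketch below repeats the comparison with the standard-library `timeit` module and takes the best of three runs. The smaller array size and the helper names `add_lists` and `add_arrays` are assumptions made for this illustration, not part of the original cells.

```python
import timeit
import numpy as np

n = 1_000_000  # smaller than the 10 million used above, to keep the run short

list_a = list(range(n))
list_b = list(range(n, 2 * n))
arr_a = np.arange(n)
arr_b = np.arange(n, 2 * n)

def add_lists():
    # element-by-element addition in pure Python
    return [x + y for x, y in zip(list_a, list_b)]

def add_arrays():
    # one vectorized addition executed in compiled code
    return arr_a + arr_b

# best of 3 runs for each approach
list_time = min(timeit.repeat(add_lists, number=1, repeat=3))
array_time = min(timeit.repeat(add_arrays, number=1, repeat=3))

print(f"list:  {list_time:.4f} s")
print(f"numpy: {array_time:.4f} s")
```

Taking the minimum of several runs reduces the influence of background activity on the measurement.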

**Memory :**

In [9]:
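# python list
# note: sys.getsizeof counts only the list's own array of references,
# not the int objects those references point to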
import sys
a = [i for i in range(10000000)]
sys.getsizeof(a)

89095160

In [17]:
# python numpy
import sys
a = np.arange(10000000,dtype=np.int32) # without dtype, integer arguments give a platform integer (usually int64)
sys.getsizeof(a)

40000112
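
The two numbers above are not measured the same way: for a list, `sys.getsizeof` leaves out the integer objects themselves, while for an array it includes the data buffer. A rough sketch of a fairer comparison follows. It uses a smaller size, and the variable names are chosen for this illustration. The list total is only an estimate, since CPython shares some small integer objects.

```python
import sys
import numpy as np

n = 1_000_000
py_list = list(range(n))
arr32 = np.arange(n, dtype=np.int32)
arr64 = np.arange(n, dtype=np.int64)

# size of the list object itself (its array of references), excluding the ints
list_container = sys.getsizeof(py_list)
# rough total: add the size of every int object the list refers to
list_total = list_container + sum(sys.getsizeof(x) for x in py_list)

print(list_container, list_total)
# nbytes is the size of the raw data buffer: 4 and 8 bytes per element
print(arr32.nbytes, arr64.nbytes)
```

Choosing a smaller `dtype` such as `np.int32` halves the buffer size compared with `np.int64`, as long as the values fit in the smaller type.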

**Convenience :**
- Element-wise operations are written as a single expression (for example `c = a + b`), with no explicit loop.
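
As a small illustration of the convenience point, the sketch below computes the same result with a list and with an array. The values and variable names are made up for the example.

```python
import numpy as np

values = [1, 2, 3, 4, 5]
arr = np.array(values)

# list: the arithmetic has to be applied element by element
squares_list = [v * v + 1 for v in values]

# array: the same arithmetic is written once for the whole array
squares_arr = arr * arr + 1

print(squares_list)  # [2, 5, 10, 17, 26]
print(squares_arr)   # [ 2  5 10 17 26]
```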

### Advanced Indexing :

In [20]:
# Normal Indexing and slicing
a = np.arange(12).reshape(4,3)
print(a)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]


In [30]:
print(a[1:3,1:3])

[[4 5]
 [7 8]]


#### Fancy Indexing :
- Fancy indexing in NumPy indexes an array **using an array or a list of indices rather than a slice or a single integer index**.
- This lets you **select specific elements, rows, or columns in any order**, including selections that a regular slice cannot express.
- **The same style of indexing is widely used in pandas.**

In [38]:
arr1 = np.arange(24).reshape(6,4)
print(arr1)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


In [57]:
# row-wise
print(arr1[[0,3,5]])

[[ 0  1  2  3]
 [12 13 14 15]
 [20 21 22 23]]


In [58]:
# column-wise
print(arr1[:,[0,2,3]])

[[ 0  2  3]
 [ 4  6  7]
 [ 8 10 11]
 [12 14 15]
 [16 18 19]
 [20 22 23]]
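
The row-wise and column-wise selections above can also be combined. The sketch below, which rebuilds the same `arr1`, shows two ways to do this: passing both index lists pairs them up element by element, while `np.ix_` selects every combination of the listed rows and columns. The names `rows`, `cols`, `picked`, and `block` are chosen for this illustration.

```python
import numpy as np

arr1 = np.arange(24).reshape(6, 4)

rows = [0, 3, 5]
cols = [0, 2, 3]

# paired index lists pick one element per (row, col) pair:
# (0, 0), (3, 2), (5, 3)
picked = arr1[rows, cols]
print(picked)  # [ 0 14 23]

# np.ix_ builds an open mesh, selecting the full 3x3 block of rows x cols
block = arr1[np.ix_(rows, cols)]
print(block)
```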


#### Boolean Indexing:
- Boolean indexing in NumPy is a **way of selecting elements from an array based on a Boolean condition.**
- The condition is specified as a Boolean array of the same shape as the array being indexed, where each element of the Boolean array indicates whether the corresponding element in the indexed array should be included in the result.
- **Widely used in pandas and in data analysis generally.**
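
To make the Boolean array described above visible, the following sketch prints the mask before using it as an index. The small `data` array and the threshold are made up for the example.

```python
import numpy as np

data = np.array([[5, 12, 7],
                 [20, 3, 15]])

# the comparison produces a Boolean array with the same shape as data
mask = data > 10
print(mask)

# indexing with the mask returns a 1-D array of the elements where mask is True
selected = data[mask]
print(selected)  # [12 20 15]

# a mask can also be used on the left-hand side to assign in place
capped = data.copy()
capped[capped > 10] = 10
print(capped)
```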

**np.random.randint():**
- `np.random.randint()` is a NumPy function that generates random integers within a specified range.
- **syntax : np.random.randint(low, high=None, size=None, dtype=int)**
    - **low:** Lowest integer that can be drawn. It is inclusive.
    - **high:** If high is not None, integers are drawn from the range [low, high), so high itself is never returned. If high is None, integers are drawn from the range [0, low).
    - **size:** Output shape, given as an int or a tuple of ints. If None (the default), a single integer is returned.
    - **dtype:** Data type of the output array.
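
The parameters can be sketched as follows. The seed is there only to make the run repeatable and is an assumption of this example, as are the variable names.

```python
import numpy as np

np.random.seed(0)  # seed chosen only to make the sketch repeatable

single = np.random.randint(5)                 # one integer from [0, 5)
row = np.random.randint(1, 70, size=6)        # six integers from [1, 70)
grid = np.random.randint(1, 70, size=(6, 4))  # 6x4 array, no reshape needed

print(single, row.shape, grid.shape)
```

Passing a tuple as `size` produces the 2-D array directly, which is an alternative to generating a flat array and calling `reshape`.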

In [63]:
arr2 = np.random.randint(1,70,24).reshape(6,4)
print(arr2)

[[48 60  8 17]
 [31 36 53  7]
 [39 12 43 32]
 [32 13 54 13]
 [55 65 55  9]
 [10 43 27 27]]


In [75]:
# find all numbers greater than 50
print(arr2[arr2>50])

[60 53 54 55 65 55]


In [69]:
# find out even numbers
print(arr2[arr2%2 == 0])

[48 60  8 36 12 32 32 54 10]


In [78]:
# find all numbers greater than 50 and are even
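# Use the element-wise '&' operator, not the logical 'and': 'and' needs a single
# True/False value, which a Boolean array with several elements does not have.
# Each condition is parenthesized because '&' binds more tightly than comparisons.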
print(arr2[(arr2%2 == 0) & (arr2>50)])

[60 54]
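
The other element-wise operators follow the same pattern: `|` for "or" and `~` for "not". The sketch below uses a small made-up array and also shows the error raised when `and` is applied to Boolean arrays.

```python
import numpy as np

arr = np.array([48, 60, 8, 17, 55, 54])

even = arr % 2 == 0
big = arr > 50

print(arr[even & big])  # even and greater than 50: [60 54]
print(arr[even | big])  # even or greater than 50:  [48 60  8 55 54]
print(arr[~even])       # not even:                 [17 55]

# 'and' asks for a single truth value of a whole array, which is ambiguous
error_message = None
try:
    arr[even and big]
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```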


In [82]:
# find all numbers not divisible by 7
print(arr2[(arr2%7 != 0)])

[48 60  8 17 31 36 53 39 12 43 32 32 13 54 13 55 65 55  9 10 43 27 27]
