### Why NumPy instead of conventional Python lists?

`!pip install numpy`

The cell below compares the computation time of an explicit Python `for` loop with the equivalent vectorized NumPy call.

https://stackoverflow.com/questions/51549363/for-loop-vs-numpy-vectorization-computation-time

In [1]:
import time
import math
import numpy as np

n = 10000000  # number of elements (avoids shadowing the built-in `iter`)

x = np.zeros((n, 1))
v = np.random.randn(n, 1)

# Explicit Python loop: one math.exp call per element.
before = time.perf_counter()
for i in range(n):
    x[i] = math.exp(v[i, 0])  # pass a scalar; math.exp expects a number, not an array
after = time.perf_counter()
print(x)
print("Regular for loop= " + str((after-before)*1000) + "ms")
print('\n')
time1 = (after-before)*1000

# Vectorized version: a single call that processes the whole array.
before = time.perf_counter()
x = np.exp(v)
after = time.perf_counter()
print(x)
print("Numpy operation= " + str((after-before)*1000) + "ms")
time2 = (after-before)*1000
print('\n')
print("Numpy is "+ str(round(time1/time2,2)) + " times faster than for loop in Python.")

[[0.85065458]
 [2.39523154]
 [0.06652518]
 ...
 [0.82251624]
 [1.41840514]
 [0.17977613]]
Regular for loop= 4935.1630210876465ms


[[0.85065458]
 [2.39523154]
 [0.06652518]
 ...
 [0.82251624]
 [1.41840514]
 [0.17977613]]
Numpy operation= 49.15475845336914ms


Numpy is 100.4 times faster than for loop in Python.
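
A single `time.time()` measurement can be noisy. As a rough sketch (not part of the original notebook), the same comparison can be repeated with the standard-library `timeit` module, taking the best of a few runs. The array size `n` and the helper `loop_exp` are assumptions chosen here so the cell finishes quickly; the exact speed-up will differ from machine to machine.

```python
import math
import timeit

import numpy as np

n = 100_000  # much smaller than the 10,000,000 used above, so this runs quickly
rng = np.random.default_rng(0)
v = rng.standard_normal(n)


def loop_exp(values):
    """Compute exp element by element with an explicit Python loop."""
    out = np.empty_like(values)
    for i in range(values.size):
        out[i] = math.exp(values[i])
    return out


# Best of three runs for each approach.
loop_time = min(timeit.repeat(lambda: loop_exp(v), number=1, repeat=3))
numpy_time = min(timeit.repeat(lambda: np.exp(v), number=1, repeat=3))

# Both approaches should produce the same numbers.
same = np.allclose(loop_exp(v), np.exp(v))

print(f"loop:  {loop_time * 1000:.2f} ms")
print(f"numpy: {numpy_time * 1000:.2f} ms")
print(f"same result: {same}")
```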


### Let's explore breast cancer data

Install scikit-learn  
`!pip install -U scikit-learn`

In [2]:
from sklearn.datasets import load_breast_cancer

cancer_data = load_breast_cancer()
cancer_data

{'data': array([[1.799e+01, 1.038e+01, 1.228e+02, ..., 2.654e-01, 4.601e-01,
         1.189e-01],
        [2.057e+01, 1.777e+01, 1.329e+02, ..., 1.860e-01, 2.750e-01,
         8.902e-02],
        [1.969e+01, 2.125e+01, 1.300e+02, ..., 2.430e-01, 3.613e-01,
         8.758e-02],
        ...,
        [1.660e+01, 2.808e+01, 1.083e+02, ..., 1.418e-01, 2.218e-01,
         7.820e-02],
        [2.060e+01, 2.933e+01, 1.401e+02, ..., 2.650e-01, 4.087e-01,
         1.240e-01],
        [7.760e+00, 2.454e+01, 4.792e+01, ..., 0.000e+00, 2.871e-01,
         7.039e-02]]),
 'target': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
        0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0,
        1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0,
        1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,
        1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0

### Focus on
- `data`
- `target`
- `target_names`
- `feature_names`

### Extract only the `data` value

In [3]:
cancer_data['data']

array([[1.799e+01, 1.038e+01, 1.228e+02, ..., 2.654e-01, 4.601e-01,
        1.189e-01],
       [2.057e+01, 1.777e+01, 1.329e+02, ..., 1.860e-01, 2.750e-01,
        8.902e-02],
       [1.969e+01, 2.125e+01, 1.300e+02, ..., 2.430e-01, 3.613e-01,
        8.758e-02],
       ...,
       [1.660e+01, 2.808e+01, 1.083e+02, ..., 1.418e-01, 2.218e-01,
        7.820e-02],
       [2.060e+01, 2.933e+01, 1.401e+02, ..., 2.650e-01, 4.087e-01,
        1.240e-01],
       [7.760e+00, 2.454e+01, 4.792e+01, ..., 0.000e+00, 2.871e-01,
        7.039e-02]])

### Extract its datatype

In [4]:
type(cancer_data['data'])

numpy.ndarray

### Extract the shape of `data` 

In [5]:
cancer_data['data'].shape

(569, 30)

### Extract the first row (index `0`) of `data`

In [6]:
cancer_data['data'][0]

array([1.799e+01, 1.038e+01, 1.228e+02, 1.001e+03, 1.184e-01, 2.776e-01,
       3.001e-01, 1.471e-01, 2.419e-01, 7.871e-02, 1.095e+00, 9.053e-01,
       8.589e+00, 1.534e+02, 6.399e-03, 4.904e-02, 5.373e-02, 1.587e-02,
       3.003e-02, 6.193e-03, 2.538e+01, 1.733e+01, 1.846e+02, 2.019e+03,
       1.622e-01, 6.656e-01, 7.119e-01, 2.654e-01, 4.601e-01, 1.189e-01])

### Do the same for `feature_names`
Discuss what it means

In [7]:
cancer_data['feature_names']

array(['mean radius', 'mean texture', 'mean perimeter', 'mean area',
       'mean smoothness', 'mean compactness', 'mean concavity',
       'mean concave points', 'mean symmetry', 'mean fractal dimension',
       'radius error', 'texture error', 'perimeter error', 'area error',
       'smoothness error', 'compactness error', 'concavity error',
       'concave points error', 'symmetry error',
       'fractal dimension error', 'worst radius', 'worst texture',
       'worst perimeter', 'worst area', 'worst smoothness',
       'worst compactness', 'worst concavity', 'worst concave points',
       'worst symmetry', 'worst fractal dimension'], dtype='<U23')

### Do the same for `target`
Discuss what it means

### Do the same for `target_names`
Discuss what it means
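
One possible way to start the discussion of `target` and `target_names` is sketched below. It assumes scikit-learn is installed and uses `np.bincount` to count how many samples carry each label; the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer_data = load_breast_cancer()

target = cancer_data["target"]
target_names = cancer_data["target_names"]

print(target.shape)   # one label per row of `data`
print(target_names)   # the class name behind label 0 and label 1

# Count how many samples carry each label.
counts = np.bincount(target)
for label, name in enumerate(target_names):
    print(f"{label} -> {name}: {counts[label]} samples")
```

Each entry of `target` is an integer label for the matching row of `data`, and `target_names[label]` gives the readable class name for that integer.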

Now, can you create a NumPy array with
- 1 dimension
- 2 dimensions

### The Basics

**Why use NumPy?**

NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems.

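- NumPy arrays are faster and more compact than Python lists.
- An array stores elements of a single data type, so it uses much less memory than a list holding the same values.
- Arrays are convenient to use: one expression can operate on every element.
- Specifying the data type of an array allows the code to be optimized further.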
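
The memory point can be checked with a rough estimate. The sketch below is an illustration rather than a precise benchmark: it assumes a 64-bit CPython build, uses `sys.getsizeof` for the list and its integer objects, and uses the array's `nbytes` attribute for the raw data buffer.

```python
import sys

import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values in one contiguous buffer.
array_bytes = np_array.nbytes

print(f"list (approx.): {list_bytes} bytes")
print(f"array data:     {array_bytes} bytes")
```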

In [8]:
import numpy as np

In [9]:
a = np.array([1,2,3,4,5])

In [10]:
type(a)

numpy.ndarray

In [11]:
a.ndim

1

In [12]:
a.shape

(5,)

In [14]:
a = np.array([[1,2,3,4,5],[1,2,3,4,5]])

In [15]:
a.shape

(2, 5)

In [16]:
a.ndim

2

In [17]:
a = np.array([[[1,2,3,4,5],[1,2,3,4,5]],[[1,2,3,4,5],[1,2,3,4,5]]])

In [18]:
a.shape

(2, 2, 5)

In [19]:
a.ndim

3

In [20]:
a = np.array([[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
              [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]])

In [21]:
a.ndim

4

In [22]:
a.shape

(2, 2, 2, 2)

In [23]:
a = np.array(
    [[[[[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]],
    [[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]]],
    [[[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]],
    [[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9999], [9, 10, 11, 12]]]]],
    [[[[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]],
    [[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]]],
    [[[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]],
    [[[1, 2, 3, 4], [5, 6, 8, 9], [9, 10, 11, 12]],
    [[1, 9, 3, 4], [5, 7, 8, 9], [9, 10, 11, 12]]]]]]
)

**Check the shape and the number of dimensions, then find the index at which `9999` is located.**

In [31]:
a.ndim

6

In [32]:
a.shape

(2, 2, 2, 2, 3, 4)

In [33]:
a[0][1][1][1][1][3]

9999
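
Rather than searching by eye, the index can also be found programmatically. The following sketch rebuilds a stand-in array of the same shape (filled with zeros, which is an assumption made to keep the example short) and uses `np.argwhere` to locate the value.

```python
import numpy as np

# Stand-in for the 6-dimensional array above: same shape, zeros everywhere
# except for the single value we want to find.
a = np.zeros((2, 2, 2, 2, 3, 4), dtype=int)
a[0, 1, 1, 1, 1, 3] = 9999

# argwhere returns one row per match and one column per axis.
location = np.argwhere(a == 9999)
print(location)

# Convert the first match to a tuple to use it as an index.
print(a[tuple(location[0])])
```

Note that `a[0, 1, 1, 1, 1, 3]` selects the same element as `a[0][1][1][1][1][3]` in a single indexing step.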

### `np.zeros`, `np.ones` and `np.random`

In [34]:
import numpy as np

In [46]:
np.zeros((10,2,5))

array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]])

In [47]:
np.ones((10,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [48]:
np.random.randn(10)

array([-0.97515897, -1.42446283,  0.24111614, -2.50724075,  0.42534697,
       -0.38879296,  0.40837191, -1.46537859,  1.41594555, -0.82015422])

In [49]:
np.random.randn(3,3,2)

array([[[-0.87092996, -0.24424998],
        [ 0.24299086,  0.33394608],
        [-0.62738185,  0.66798505]],

       [[ 0.41916911, -0.68809454],
        [-0.8731273 , -1.46392841],
        [-0.50870512,  1.32689836]],

       [[ 0.91783012,  1.44485091],
        [ 1.7467951 , -0.18792263],
        [ 1.34038924,  0.02606806]]])
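
`np.random.randn` belongs to NumPy's legacy random interface. As an optional alternative, the sketch below uses the newer `Generator` interface via `np.random.default_rng`; the seed value and variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator gives the same numbers on every run.
rng = np.random.default_rng(seed=42)

# Standard normal samples, like np.random.randn(3, 3, 2),
# but the shape is passed as a single tuple.
samples = rng.standard_normal((3, 3, 2))

# Random integers from 1 to 6 inclusive (the upper bound 7 is excluded).
dice = rng.integers(1, 7, size=10)

print(samples.shape)
print(dice)
```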

### `np.arange`

How would you create a Python list of the numbers from 1 to 100?

In [51]:
# [1,2,3,4...100]

How would you create a Python list of the even numbers up to 100?

In [54]:
# [2,4,6,8,...100]

In [55]:
np.arange(101)

array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100])

In [56]:
np.arange(0,101,2)

array([  0,   2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24,
        26,  28,  30,  32,  34,  36,  38,  40,  42,  44,  46,  48,  50,
        52,  54,  56,  58,  60,  62,  64,  66,  68,  70,  72,  74,  76,
        78,  80,  82,  84,  86,  88,  90,  92,  94,  96,  98, 100])
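
Both `range` and `np.arange` include the start value and exclude the stop value, and both start at `0` unless told otherwise. One possible answer to the two questions above, in plain Python and in NumPy, is sketched here:

```python
import numpy as np

# Plain Python: range(start, stop, step) excludes `stop`.
one_to_hundred = list(range(1, 101))
evens = list(range(2, 101, 2))

# NumPy: np.arange follows the same start/stop/step convention.
np_one_to_hundred = np.arange(1, 101)
np_evens = np.arange(2, 101, 2)

print(one_to_hundred[:5], one_to_hundred[-1])
print(evens[:5], evens[-1])
print(np_one_to_hundred[:5], np_one_to_hundred[-1])
print(np_evens[:5], np_evens[-1])
```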

### Time comparison between a normal list and `arange`

In [60]:
import time

iterations = 100000000
before = time.time()
list(range(iterations))
after = time.time()
python_time = after - before
print(f"Python time: {python_time}")

before = time.time()
np.arange(iterations)
after = time.time()
numpy_time = after - before
print(f"Numpy time: {numpy_time}")

print(f"Numpy is {python_time/numpy_time} times faster than regular Python.")

Python time: 3.287424087524414
Numpy time: 0.1764364242553711
Numpy is 18.63234364105142 times faster than regular Python.


### Adding, removing and sorting

**Sorting**

In [67]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

In [68]:
np.sort(arr) # default kind is 'quicksort'; O(n log n) on average

array([1, 2, 3, 4, 5, 6, 7, 8])
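
`np.sort` returns a sorted copy and leaves the original array untouched. The short sketch below illustrates that, along with two related operations: reversing the result for descending order and `np.argsort` for the indices that would sort the array.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

ascending = np.sort(arr)     # new array; `arr` keeps its order
descending = ascending[::-1] # reverse the sorted copy
order = np.argsort(arr)      # indices that would sort `arr`

print(arr)
print(ascending)
print(descending)
print(order)
```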

**Concatenate**

In [10]:
import numpy as np
a = np.array([1, 2, 3, 4, 10])
b = np.array([5, 6, 7, 8])

In [11]:
np.concatenate((a,b))

array([ 1,  2,  3,  4, 10,  5,  6,  7,  8])
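
For arrays with more than one dimension, `np.concatenate` takes an `axis` argument that decides the direction in which the arrays are joined. A small sketch with made-up 2-D arrays `x` and `y`:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

# axis=0 stacks rows: shapes (2, 2) and (1, 2) give (3, 2).
rows = np.concatenate((x, y), axis=0)

# axis=1 joins columns: y.T has shape (2, 1), so the result is (2, 3).
cols = np.concatenate((x, y.T), axis=1)

print(rows)
print(cols)
```

The arrays must agree in every dimension except the one being joined.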

**Shape**

In [13]:
array_example = np.array([[0, 1, 2, 3],[4, 5, 6, 7],[4, 5, 6, 7]])

In [14]:
array_example

array([[0, 1, 2, 3],
       [4, 5, 6, 7],
       [4, 5, 6, 7]])

In [15]:
array_example.shape

(3, 4)

In [16]:
array_example.ndim

2

In [17]:
array_example.size

12

**Reshape**

In [18]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
a.reshape(5,2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

Create an array of `64` elements with shape `(64,)` and try to reshape it into each of the following shapes. Which of them work?
- `(16,4)`
- `(4,4,4)`
- `(8,2,4)`
- `(10,2)`
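
A reshape only succeeds when the new shape holds exactly as many elements as the original array. One way to try the exercise is sketched below; the `results` dictionary is just a helper introduced here to record which shapes succeed.

```python
import numpy as np

a = np.arange(64)

results = {}
for shape in [(16, 4), (4, 4, 4), (8, 2, 4), (10, 2)]:
    try:
        reshaped = a.reshape(shape)
        results[shape] = True
        print(shape, "-> ok, shape", reshaped.shape)
    except ValueError as err:
        # 10 * 2 = 20 elements, which does not match 64.
        results[shape] = False
        print(shape, "-> failed:", err)

# Passing -1 lets NumPy infer one dimension from the others.
auto = a.reshape(-1, 8)
print(auto.shape)
```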


### Indexing and slicing

![Numpy](https://numpy.org/doc/stable/_images/np_indexing.png)

In [None]:
data = np.array([1, 2, 3])

In [None]:
data[1]

In [None]:
data[0:2]

In [None]:
data = np.array([[3,6,5],[7,8,4]])

In [None]:
data[1][1]
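
With a 2-D array, the row and column indices can be given together inside one pair of brackets, and slices can be used on either axis. A brief sketch using the same `data` array:

```python
import numpy as np

data = np.array([[3, 6, 5], [7, 8, 4]])

element = data[1, 1]        # same element as data[1][1]
first_row = data[0]         # the whole first row
first_column = data[:, 0]   # every row, column 0
right_block = data[:, 1:]   # every row, columns 1 onward
last = data[-1, -1]         # negative indices count from the end

print(element)
print(first_row)
print(first_column)
print(right_block)
print(last)
```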

**Conditions**

In [30]:
a = np.array([1,2,3,4,5,6,7,8,9])

In [31]:
a[a<5]

array([1, 2, 3, 4])

In [32]:
a = np.array([[1 , 2, 3, 4, 5], [5, 3, 7, 8, 2], [9, 10, 11, 12,4]])

In [33]:
a[a < 5]

array([1, 2, 3, 4, 3, 2, 4])

In [34]:
a[a>=6]

array([ 7,  8,  9, 10, 11, 12])

In [35]:
five_up = (a >= 5)
a[five_up]

array([ 5,  5,  7,  8,  9, 10, 11, 12])

**Divisible by two**
- Create a NumPy array of the integers from 0 to 49.
- Keep only the even elements by passing the condition as the index.

In [36]:
a = np.arange(50)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])

In [39]:
a[a%2==0]

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
       34, 36, 38, 40, 42, 44, 46, 48])

**Combining conditions: greater than and less than**

In [40]:
a = np.arange(15)

In [41]:
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [42]:
a[(a > 5) & (a < 11)]

array([ 6,  7,  8,  9, 10])
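
Conditions are combined element by element with `&` (and), `|` (or) and `~` (not), and each condition needs its own parentheses. The sketch below adds two more examples on the same array, plus `np.where`, which picks between two values based on a condition.

```python
import numpy as np

a = np.arange(15)

# `|` keeps elements that satisfy either condition.
outside = a[(a < 3) | (a > 11)]

# `~` negates a condition: keep everything that is NOT even.
odd = a[~(a % 2 == 0)]

# np.where(condition, x, y) takes x where the condition holds, else y.
capped = np.where(a > 10, 10, a)

print(outside)
print(odd)
print(capped)
```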

**Broadcasting**

In [43]:
a = [1,2,3,4,5]

In [44]:
a*5

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [49]:
data = np.array([1, 2, 3, 4 ,5])
data * 20

array([ 20,  40,  60,  80, 100])
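
Multiplying a Python list by an integer repeats the list, whereas multiplying a NumPy array applies the operation to every element. Broadcasting also works between arrays of different but compatible shapes. The sketch below uses made-up arrays `matrix`, `row` and `column` to illustrate this.

```python
import numpy as np

# With a plain list, element-wise scaling needs a loop or comprehension.
scaled_list = [x * 20 for x in [1, 2, 3, 4, 5]]

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)
column = np.array([[100], [200]])           # shape (2, 1)

# `row` is stretched across both rows of `matrix`.
plus_row = matrix + row

# `column` is stretched across all three columns of `matrix`.
plus_column = matrix + column

print(scaled_list)
print(plus_row)
print(plus_column)
```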

**`max`, `min`, `sum`**

In [50]:
data

array([1, 2, 3, 4, 5])

In [51]:
data.max()

5

In [52]:
data.min()

1

In [53]:
data.sum()

15
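
On arrays with more than one dimension, these methods also accept an `axis` argument, so the result can be computed per column or per row instead of over the whole array. A short sketch with an illustrative 2-D array `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

total = grid.sum()              # over every element
column_sums = grid.sum(axis=0)  # collapse the rows: one value per column
row_sums = grid.sum(axis=1)     # collapse the columns: one value per row
column_max = grid.max(axis=0)
row_min = grid.min(axis=1)

print(total)
print(column_sums)
print(row_sums)
print(column_max)
print(row_min)
```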

**Create a NumPy array covering 1 to 100 and compute the sum.** (`np.arange(101)` also includes `0`, which does not change the sum.)

In [54]:
x = np.arange(101)
x

array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100])

In [55]:
x.sum()

5050