### NumPy
* NumPy stands for Numerical Python.
* It is Python's core library for numerical computing and linear algebra.
* It provides a multi-dimensional array type, the ndarray.

### Why do we need NumPy?
 * Memory: NumPy arrays need less memory than Python lists holding the same values (see the example below).
 * Speed: operations on NumPy arrays are faster than the equivalent operations on Python lists.
 * Convenience: NumPy offers a wider range of built-in numerical functionality.

In [1]:
# Memory requirement: NumPy arrays require less memory than Python lists, see the example below.

import numpy as np

li_arr = [i for i in range(0, 100)]
np_arr = np.arange(100)

print(li_arr)
print(np_arr)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]


In [2]:
## Size in bytes of one element of the NumPy array
print(np_arr.itemsize)

## Size in bytes of all 100 elements of the NumPy array
print(np_arr.itemsize * np_arr.size)

8
800


In [3]:
import sys

a = 10

## Size in bytes of one Python int object, as referenced from a list
print(sys.getsizeof(a))

## Approximate size of 100 int objects (excludes the list's own per-element pointers)
print(sys.getsizeof(1) * len(li_arr))

28
2800
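
The estimate above counts only the int objects. A slightly fuller sketch also counts the list container itself, which stores one pointer per element, and compares the total with the array's nbytes attribute. The variable names list_bytes and array_bytes are illustrative, and because CPython may share small integer objects between containers, the list figure is best read as a rough upper bound rather than an exact measurement.

```python
import sys
import numpy as np

n = 100
li_arr = list(range(n))
np_arr = np.arange(n)

# List cost: the container (header + one pointer per element)
# plus every int object the pointers refer to.
list_bytes = sys.getsizeof(li_arr) + sum(sys.getsizeof(x) for x in li_arr)

# Array cost of the raw data buffer: itemsize * number of elements.
array_bytes = np_arr.nbytes

print("list  :", list_bytes, "bytes (approximate)")
print("array :", array_bytes, "bytes of element data")
```

The array object also carries a small fixed header that nbytes does not include, so the comparison matters most for large arrays.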


In [4]:
import numpy as np
import time

size = 100000

## Add the elements of a and b

def addition_using_list():
    t1 = time.time() # Epoch time
    a = range(size)
    # Note: b is a NumPy array, so b[i] yields NumPy scalars and adds per-element
    # overhead; a pure-list comparison would use list(range(size)) for both operands
    b = np.arange(size)
    c = [a[i] + b[i] for i in range(size)]
    t2 = time.time()
    return (t2 - t1)

def addition_using_numpy():
    t1 = time.time()
    a = np.arange(size)
    b = np.arange(size)
    c = a + b
    t2 = time.time()
    return (t2 - t1)

t_list = addition_using_list()
t_numpy = addition_using_numpy()

print('List = ', t_list * 1000) ## Time in Milliseconds
print('Numpy = ', t_numpy * 1000) ## Time in Milliseconds

List =  88.17481994628906
Numpy =  1.1360645294189453
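
A single time.time() measurement is noisy and also includes the cost of creating the inputs. As a hedged alternative, the sketch below uses the standard-library timeit module with pure Python lists on one side and NumPy arrays on the other. The function names add_lists and add_arrays are illustrative, and the timings depend on the machine, so no expected numbers are shown.

```python
import timeit
import numpy as np

size = 100_000

# Build the inputs once, outside the timed functions.
list_a = list(range(size))
list_b = list(range(size))
arr_a = np.arange(size)
arr_b = np.arange(size)

def add_lists():
    # Element-by-element addition in a Python-level loop.
    return [x + y for x, y in zip(list_a, list_b)]

def add_arrays():
    # One vectorized addition over the whole array.
    return arr_a + arr_b

runs = 10
# Take the best of 3 repeats and divide by the number of runs per repeat.
t_list = min(timeit.repeat(add_lists, number=runs, repeat=3)) / runs
t_numpy = min(timeit.repeat(add_arrays, number=runs, repeat=3)) / runs

print('List  =', t_list * 1000, 'ms per run')
print('NumPy =', t_numpy * 1000, 'ms per run')
```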


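### Why are NumPy arrays fast?
NumPy stores the elements of an array with a single type, typically in one contiguous block of memory, and applies operations to the whole array in compiled code rather than interpreting a Python-level loop one element at a time.

For example, when adding the corresponding elements of two arrays, the addition runs over chunks of elements inside a compiled loop instead of handling one Python object at a time.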

**What is vectorization?**

"Vectorization" (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times.

Source: https://stackoverflow.com/questions/1422149/what-is-vectorization
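
As a small illustration of the idea at the NumPy level, the sketch below computes the same result twice: once with an explicit Python loop and once with a single array expression. Whether the compiled loop actually uses SIMD instructions depends on the NumPy build and the hardware, so this only shows the change in programming style, not a guarantee about how the work is scheduled.

```python
import numpy as np

x = np.arange(8, dtype=float)

# Explicit Python loop: one element is processed per iteration.
looped = np.empty_like(x)
for i in range(x.size):
    looped[i] = 2.0 * x[i] + 1.0

# Vectorized form: one expression applied to the whole array.
vectorized = 2.0 * x + 1.0

print(np.array_equal(looped, vectorized))
```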

### How to create NumPy arrays?

In [5]:
import numpy as np # np becomes an alias for numpy
# Syntax: np.array(object, dtype=None, ...), where object is any array-like such as a list

a = [1, 2, 3]
b = np.array(a)

print(b)
print(type(b))

[1 2 3]
<class 'numpy.ndarray'>


In [6]:
a = [1, 2, 3, '5', 'a']
b = np.array(a) # A NumPy array holds homogeneous elements, i.e. elements of the same type

# Hence every element is converted to a string here

print(b)

['1' '2' '3' '5' 'a']


In [7]:
a = [1, 2, 3, '5']
b = np.array(a, dtype = int) # We can specify the data-type for the numpy array elements

print(b)

[1 2 3 5]
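
To see which common type NumPy picked, inspect the dtype attribute of the array. The sketch below is illustrative (the variable names are not from the notebook) and also shows astype, which returns a converted copy of an existing array.

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers -> integer dtype
mixed_num = np.array([1, 2.5, 3])   # int + float  -> float dtype
mixed_str = np.array([1, 2, 'a'])   # any string   -> Unicode string dtype

print(ints.dtype, mixed_num.dtype, mixed_str.dtype)

# astype returns a new array converted to the requested dtype.
as_float = ints.astype(float)
print(as_float, as_float.dtype)
```

The exact integer width (for example int64 or int32) depends on the platform and NumPy version, which is why it is not listed here.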


In [8]:
a = [1, 2, 3, '5']

b = np.array(a*3) # a*3 repeats the list three times before it is converted to an array

print(b)

['1' '2' '3' '5' '1' '2' '3' '5' '1' '2' '3' '5']
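
The repetition above happens on the Python list, before NumPy is involved. On an array, the * operator multiplies element-wise instead. The sketch below contrasts the two and shows np.tile and np.repeat as the array-level ways to repeat values; the variable names are illustrative.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

list_times_3 = values * 3       # list repetition: the list is concatenated 3 times
arr_times_3 = arr * 3           # element-wise multiplication on the array
tiled = np.tile(arr, 3)         # the whole array repeated 3 times
repeated = np.repeat(arr, 3)    # each element repeated 3 times

print(list_times_3)
print(arr_times_3)
print(tiled)
print(repeated)
```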


In [9]:
b = np.ones(3)
b

array([1., 1., 1.])

In [10]:
b = np.ones(3, dtype = int)
b

array([1, 1, 1])

In [11]:
b = np.zeros(3)
b

array([0., 0., 0.])

In [12]:
b = np.zeros(3, dtype = int)
b

array([0, 0, 0])

In [13]:
b = np.empty(2) # Uninitialized: the values are whatever was already in memory
b

array([-5.73021895e-300,  6.92998926e-310])

### 2-D Arrays

In [14]:
b = np.ones((2, 4), dtype = int)
b

array([[1, 1, 1, 1],
       [1, 1, 1, 1]])

In [15]:
b = np.full(3, 5) # dtype is inferred from the fill value, here integer
b

array([5, 5, 5])

In [16]:
b = np.full(3, 5, dtype = float)
b

array([5., 5., 5.])

In [17]:
b = np.full((3, 4), 5)
b

array([[5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5]])
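
Passing a tuple as the shape is what makes these arrays two-dimensional. A quick way to confirm the layout is to inspect the shape, ndim and size attributes, as in this small sketch, which also uses np.zeros_like to build a new array with the same shape and dtype.

```python
import numpy as np

b = np.full((3, 4), 5)

print(b.shape)  # (rows, columns)
print(b.ndim)   # number of dimensions
print(b.size)   # total number of elements

# A new array of zeros with the same shape and dtype as b.
z = np.zeros_like(b)
print(z.shape, z.dtype == b.dtype)
```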

### numpy.arange

**numpy.arange([start, ]stop, [step, ]dtype=None)** 

Return evenly spaced values within a given interval. Values are generated within the half-open interval [start, stop).

For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list. When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use numpy.linspace for these cases.

**Parameters**

***start : number, optional***

Start of interval. The interval includes this value. **The default start value is 0.**

***stop : number***

End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.

***step : number, optional***

Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified as a position argument, start must also be given.

***dtype : dtype***

The type of the output array. If dtype is not given, infer the data type from the other input arguments.

**Returns**

***arange : ndarray***

Array of evenly spaced values.

For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of out being greater than stop.

In [18]:
b = np.arange(10)
b

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
b = np.arange(2, 10)
b

array([2, 3, 4, 5, 6, 7, 8, 9])

In [20]:
b = np.arange(2, 20, 2)
b

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])
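
The warning about non-integer steps can be checked directly. In the sketch below, the integer call matches Python's range, while the floating-point call may return one element more than expected because of round-off in the length calculation. The exact length is printed rather than assumed, and np.linspace is shown as the more predictable alternative when the number of points is known.

```python
import numpy as np

# Integer arguments behave like range().
int_steps = np.arange(2, 20, 2)
print(int_steps.tolist() == list(range(2, 20, 2)))

# Float step: the length is ceil((stop - start) / step), computed in
# floating point, so stop itself can sneak into the result.
float_steps = np.arange(1.0, 1.3, 0.1)
print(len(float_steps), float_steps)

# With linspace the number of samples is given explicitly.
safe = np.linspace(1.0, 1.2, 3)
print(len(safe), safe)
```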

### numpy.linspace

**numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)**

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced samples, calculated over the interval [start, stop].

The endpoint of the interval can optionally be excluded.

**Parameters**

***start : array_like***

The starting value of the sequence.

***stop : array_like***

The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.

***num : int, optional***

Number of samples to generate. Default is 50. Must be non-negative.

***endpoint : bool, optional***

If True, stop is the last sample. Otherwise, it is not included. Default is True.

***retstep : bool, optional***

If True, return (samples, step), where step is the spacing between samples.

***dtype : dtype, optional***

The type of the output array. If dtype is not given, infer the data type from the other input arguments.

***axis : int, optional***

The axis in the result to store the samples. Relevant only if start or stop are array-like. By default (0), the samples will be along a new axis inserted at the beginning. Use -1 to get an axis at the end.


**Returns:**

***samples : ndarray***

There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).

***step : float, optional***

Only returned if retstep is True. Size of spacing between samples.

In [21]:
b = np.linspace(2, 10)
b

array([ 2.        ,  2.16326531,  2.32653061,  2.48979592,  2.65306122,
        2.81632653,  2.97959184,  3.14285714,  3.30612245,  3.46938776,
        3.63265306,  3.79591837,  3.95918367,  4.12244898,  4.28571429,
        4.44897959,  4.6122449 ,  4.7755102 ,  4.93877551,  5.10204082,
        5.26530612,  5.42857143,  5.59183673,  5.75510204,  5.91836735,
        6.08163265,  6.24489796,  6.40816327,  6.57142857,  6.73469388,
        6.89795918,  7.06122449,  7.2244898 ,  7.3877551 ,  7.55102041,
        7.71428571,  7.87755102,  8.04081633,  8.20408163,  8.36734694,
        8.53061224,  8.69387755,  8.85714286,  9.02040816,  9.18367347,
        9.34693878,  9.51020408,  9.67346939,  9.83673469, 10.        ])

In [22]:
print(b[2] - b[1]) # Checking step
print(b[4] - b[3])

0.16326530612244872
0.16326530612244872


In [23]:
b = np.linspace(2, 10, 5, dtype = int) # endpoint by default included
b

array([ 2,  4,  6,  8, 10])

In [24]:
b = np.linspace(2, 10, 5, dtype = int, endpoint = False) # float samples 2, 3.6, 5.2, 6.8, 8.4 are rounded down to int
b

array([2, 3, 5, 6, 8])
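
The effect of endpoint on the step size is easier to see with retstep=True, which returns the spacing alongside the samples. This sketch keeps the default float dtype so the samples are not rounded; the variable names are illustrative.

```python
import numpy as np

# endpoint=True (default): 5 samples span [2, 10], so step = (10 - 2) / 4
with_end, step_with = np.linspace(2, 10, 5, retstep=True)

# endpoint=False: 5 samples span [2, 10), so step = (10 - 2) / 5
without_end, step_without = np.linspace(2, 10, 5, endpoint=False, retstep=True)

print(with_end, step_with)
print(without_end, step_without)
```

Casting the second result to int gives the values 2, 3, 5, 6, 8 seen in the cell above.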

In [25]:
b = np.identity(3) # Only get square matrices (n*n)
b

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [26]:
b = np.eye(3, 4) # Can get n*m matrices
b

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])
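
For square matrices, np.identity(n) and np.eye(n) give the same result. np.eye additionally accepts a diagonal offset k and a dtype, as this short sketch shows.

```python
import numpy as np

# identity(n) and eye(n) agree for square matrices.
same = np.array_equal(np.identity(3), np.eye(3))
print(same)

# k=1 places the ones on the first diagonal above the main one.
upper = np.eye(3, k=1)
print(upper)

# Rectangular integer matrix with ones on the main diagonal.
int_eye = np.eye(2, 3, dtype=int)
print(int_eye)
```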

In [27]:
b = np.random.rand(3) # Uniform samples from the half-open interval [0, 1)
b

array([0.41027713, 0.93679443, 0.37259974])

In [28]:
b = np.random.rand(3)
b

array([0.79838007, 0.89557487, 0.18949808])

In [29]:
b = np.random.rand(2,3)
b

array([[0.10604869, 0.08761069, 0.25925387],
       [0.1693007 , 0.26361257, 0.16913813]])

In [30]:
b = np.random.rand(10)*10
b

array([3.97257597, 4.39982424, 3.97121746, 0.08536256, 9.56271873,
       9.8538392 , 1.543617  , 4.53054071, 9.12957621, 5.7124102 ])

In [31]:
np.random.randint(6, size=10) # Integers from 0 (inclusive) to 6 (exclusive)

array([4, 1, 2, 5, 4, 5, 5, 3, 0, 3])

In [32]:
np.random.randint(6, size=(4,3))

array([[1, 4, 4],
       [5, 3, 0],
       [2, 4, 0],
       [1, 0, 2]])
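
The random cells above produce different values on every run. When reproducible output is needed, one option is a seeded generator. The sketch below uses np.random.default_rng, the Generator interface available in NumPy 1.17 and later, instead of the np.random.rand and np.random.randint functions used in this notebook; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

floats_a = rng_a.random(3)   # uniform floats in [0, 1)
floats_b = rng_b.random(3)
print(np.array_equal(floats_a, floats_b))

# Integers from 0 (inclusive) to 6 (exclusive), arranged as a 4x3 array.
ints = rng_a.integers(6, size=(4, 3))
print(ints)
```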