# NumPy

## Import the package

In [1]:
import numpy as np

## Creating a numpy array

### Using Python `array_like` objects

In [166]:
a_list = [1, 2, 3, 4, 5]

In [167]:
a_list

[1, 2, 3, 4, 5]

In [168]:
an_array = np.array (a_list)

In [169]:
an_array 

array([1, 2, 3, 4, 5])

In [170]:
type (an_array)

<class 'numpy.ndarray'>

- Unlike a `list`, an `ndarray` stores, in addition to the element values, a header containing bookkeeping information such as the shape and dtype
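
To make the bookkeeping concrete, here is a small sketch that inspects a few header fields. The explicit `dtype=np.int64` is an assumption added here so the byte counts are the same on every platform; the cells in this notebook rely on the platform default instead.

```python
import numpy as np

# Fix the dtype so the sizes below do not depend on the platform default
an_array = np.array([1, 2, 3, 4, 5], dtype=np.int64)

print(an_array.dtype)     # element type: int64
print(an_array.itemsize)  # bytes per element: 8
print(an_array.nbytes)    # bytes used by the data buffer: 40
print(an_array.strides)   # bytes to step to the next element: (8,)
```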

In [7]:
an_array.shape # The dimension of the ndarray

(5,)

In [8]:
an_array.size # Total number of elements

5

### Using list of lists

In [18]:
ndarray = np.array ([
    [1, 2, 3],
    [11, 12, 13],
    [21, 22, 23],
])

In [10]:
ndarray

array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]])

### Properties

In [11]:
ndarray.shape

(3, 3)

In [12]:
ndarray.size

9

In [13]:
ndarray.ndim

2

In [16]:
ndarray.dtype

dtype('int32')

### Using tuples

In [19]:
ndarray = np.array ((1, 2, 3))

In [20]:
ndarray.shape

(3,)

In [21]:
ndarray

array([1, 2, 3])

### Mixture of tuples and lists

In [23]:
ndarray = np.array ([[1, 2, 3], (4, 5, 6), [7, 8]])

In [24]:
ndarray

array([list([1, 2, 3]), (4, 5, 6), list([7, 8])], dtype=object)
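
The output above comes from an older NumPy release. Recent releases (1.24 and later, as far as the release notes indicate) refuse to build such a ragged array implicitly and raise a `ValueError` instead. Passing `dtype=object` explicitly should work across versions; a minimal sketch, where the name `ragged` is only illustrative:

```python
import numpy as np

# Rows of unequal length: request an object array explicitly
ragged = np.array([[1, 2, 3], (4, 5, 6), [7, 8]], dtype=object)

print(ragged.shape)     # (3,) - a 1-D array holding three Python objects
print(ragged.dtype)     # object
print(type(ragged[1]))  # <class 'tuple'> - elements keep their original type
```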

### Intrinsic creation
- NumPy provides built-in functions that create an ndarray directly

#### Zeros

In [407]:
np.zeros ((3, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [27]:
np.zeros ((3, 3), dtype=np.int32)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [28]:
np.zeros ((4, 4), dtype=str)

array([['', '', '', ''],
       ['', '', '', ''],
       ['', '', '', ''],
       ['', '', '', '']], dtype='<U1')

#### Ones

Similar to `np.zeros`

In [31]:
np.ones ((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

Question: How would you create a matrix of 1's that has the same dtype as `10j`?

In [33]:
np.ones ((3, 3), dtype=complex)

array([[1.+0.j, 1.+0.j, 1.+0.j],
       [1.+0.j, 1.+0.j, 1.+0.j],
       [1.+0.j, 1.+0.j, 1.+0.j]])

In [409]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6]
])

In [410]:
a.dtype

dtype('int32')

In [418]:
a [0] [0]= '1'

In [416]:
a

array([[1, 2, 3],
       [4, 5, 6]])

#### Arange

In [35]:
np.arange (10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [38]:
np.arange (10, 20, 2)

array([10, 12, 14, 16, 18])

#### Linspace

It's used to get `n` evenly spaced points over an interval

In [419]:
np.linspace (0, 1, 1000)

array([0.        , 0.001001  , 0.002002  , 0.003003  , 0.004004  ,
       0.00500501, 0.00600601, 0.00700701, 0.00800801, 0.00900901,
       0.01001001, 0.01101101, 0.01201201, 0.01301301, 0.01401401,
       0.01501502, 0.01601602, 0.01701702, 0.01801802, 0.01901902,
       0.02002002, 0.02102102, 0.02202202, 0.02302302, 0.02402402,
       0.02502503, 0.02602603, 0.02702703, 0.02802803, 0.02902903,
       0.03003003, 0.03103103, 0.03203203, 0.03303303, 0.03403403,
       0.03503504, 0.03603604, 0.03703704, 0.03803804, 0.03903904,
       0.04004004, 0.04104104, 0.04204204, 0.04304304, 0.04404404,
       0.04504505, 0.04604605, 0.04704705, 0.04804805, 0.04904905,
       0.05005005, 0.05105105, 0.05205205, 0.05305305, 0.05405405,
       0.05505506, 0.05605606, 0.05705706, 0.05805806, 0.05905906,
       0.06006006, 0.06106106, 0.06206206, 0.06306306, 0.06406406,
       0.06506507, 0.06606607, 0.06706707, 0.06806807, 0.06906907,
       0.07007007, 0.07107107, 0.07207207, 0.07307307, ...,
       1.        ])

In [42]:
np.linspace (1, 10, 15)

array([ 1.        ,  1.64285714,  2.28571429,  2.92857143,  3.57142857,
        4.21428571,  4.85714286,  5.5       ,  6.14285714,  6.78571429,
        7.42857143,  8.07142857,  8.71428571,  9.35714286, 10.        ])
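
A point worth checking when choosing between `np.linspace` and `np.arange` is whether the end of the interval is included. The sketch below compares the two; the variable names are illustrative and `endpoint` is a documented keyword of `np.linspace`.

```python
import numpy as np

closed = np.linspace(0, 1, 5)                     # stop is included by default
half_open = np.linspace(0, 1, 5, endpoint=False)  # stop is excluded
stepped = np.arange(0, 1, 0.25)                   # fixed step, stop excluded

print(closed)     # 5 values: 0, 0.25, 0.5, 0.75, 1
print(half_open)  # 5 values: 0, 0.2, 0.4, 0.6, 0.8
print(stepped)    # 4 values: 0, 0.25, 0.5, 0.75
```

With `np.linspace` you fix the number of points; with `np.arange` you fix the step.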

Question: Could you return the step size, i.e. the difference between two successive elements, along with the samples? \
Hint: Check the docstring of the function

In [43]:
np.linspace (1, 10, 15, retstep=True)

(array([ 1.        ,  1.64285714,  2.28571429,  2.92857143,  3.57142857,
         4.21428571,  4.85714286,  5.5       ,  6.14285714,  6.78571429,
         7.42857143,  8.07142857,  8.71428571,  9.35714286, 10.        ]),
 0.6428571428571429)

What if you'd like to create a matrix?

In [67]:
a = np.linspace ([1, 2, 3], [10, 20, 30], num=10, axis=0) # <--- Note the keyword argument `axis`

In [68]:
a

array([[ 1.,  2.,  3.],
       [ 2.,  4.,  6.],
       [ 3.,  6.,  9.],
       [ 4.,  8., 12.],
       [ 5., 10., 15.],
       [ 6., 12., 18.],
       [ 7., 14., 21.],
       [ 8., 16., 24.],
       [ 9., 18., 27.],
       [10., 20., 30.]])

In [69]:
a.shape

(10, 3)

Question: Can you create a 2D array with the shape `(4, 10)`? \
(Without changing `num`, i.e. keep `num=10`; you could use different starting and ending points)

In [73]:
a = np.linspace ([1, 2, 3, 4], [10, 20, 30, 40], num=10, axis=1)
a

array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 2.,  4.,  6.,  8., 10., 12., 14., 16., 18., 20.],
       [ 3.,  6.,  9., 12., 15., 18., 21., 24., 27., 30.],
       [ 4.,  8., 12., 16., 20., 24., 28., 32., 36., 40.]])

In [74]:
a.shape

(4, 10)

Another way of doing it

In [75]:
np.linspace ([1, 2, 3, 4], [10, 20, 30, 40], num=10).T

array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 2.,  4.,  6.,  8., 10., 12., 14., 16., 18., 20.],
       [ 3.,  6.,  9., 12., 15., 18., 21., 24., 27., 30.],
       [ 4.,  8., 12., 16., 20., 24., 28., 32., 36., 40.]])

#### Indices

In [78]:
np.indices ((2, 2))

array([[[0, 0],
        [1, 1]],

       [[0, 1],
        [0, 1]]])

In [77]:
row_idx, col_idx = np.indices ((2, 2))

In [82]:
row_idx

array([[0, 0],
       [1, 1]])

In [83]:
col_idx

array([[0, 1],
       [0, 1]])

In [81]:
5*row_idx + 4*col_idx

array([[0, 4],
       [5, 9]])
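
The same grid can also be built in one call with `np.fromfunction`, which passes index arrays like the ones from `np.indices` to a callable. A small sketch of the equivalence (the names `via_indices` and `via_function` are illustrative):

```python
import numpy as np

row_idx, col_idx = np.indices((2, 2))
via_indices = 5 * row_idx + 4 * col_idx

# np.fromfunction calls the lambda once with the full index arrays
via_function = np.fromfunction(lambda r, c: 5 * r + 4 * c, (2, 2), dtype=int)

print(via_indices)
print(np.array_equal(via_indices, via_function))  # True
```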

## Dtypes
- You can read more about dtypes [here](https://docs.scipy.org/doc/numpy-1.10.4/reference/arrays.dtypes.html)

Some of the most commonly used numpy dtypes are: 
- 'float'
- 'int'
- 'bool'
- 'str'
- 'object'

Question: What's the dtype of `ndarray`? (using mixed type)
```python
ndarray = np.array ([[1, 2, 3], (4, 5, 6), [7, 8]])
```

In [2]:
np.array ([[1, 2, 3], (4, 5, 6), [7, 8]])

array([list([1, 2, 3]), (4, 5, 6), list([7, 8])], dtype=object)

Question: What will be the type of each of the elements?

In [14]:
np.array (['string', 1, 10.2])

array(['string', '1', '10.2'], dtype='<U6')
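
An array has a single dtype, so mixed inputs are promoted to a common type that can hold every element. A short sketch with numeric inputs (variable names are illustrative):

```python
import numpy as np

ints_and_floats = np.array([1, 2.5])          # int + float -> float64
floats_and_complex = np.array([1.5, 2 + 3j])  # float + complex -> complex128
with_string = np.array(['string', 1, 10.2])   # anything + str -> unicode string

print(ints_and_floats.dtype)     # float64
print(floats_and_complex.dtype)  # complex128
print(with_string.dtype.kind)    # U (unicode string; the width varies by NumPy version)
```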

What if you want to convert it from one type to another?

In [156]:
a = np.array ((range (10), range (10, 20)))

In [157]:
a

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [160]:
a.astype ('str')

array([['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
       ['10', '11', '12', '13', '14', '15', '16', '17', '18', '19']],
      dtype='<U11')

In [159]:
a.astype ('float')

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14., 15., 16., 17., 18., 19.]])
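
Two details of `astype` are easy to miss: it returns a new array and leaves the original untouched, and converting floats to integers drops the fractional part rather than rounding. A minimal sketch with made-up values:

```python
import numpy as np

values = np.array([1.7, -1.7, 2.0])
as_int = values.astype(int)  # fractional part is dropped (truncation toward zero)

print(as_int)  # [ 1 -1  2]
print(values)  # the original array still holds the floats
```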

## Operations

### Array and Scalar

Let's say we'd like to square every element

In [95]:
a = list (range (10))

In [96]:
a

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Question: Can you try to do this with a list? (Don't use explicit for loops)

In [None]:
def fn (x):
    return x**2

In [97]:
list (map (lambda x: x**2, a))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Now using numpy

In [98]:
np.array (a) ** 2

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)

Similarly you can perform other arithmetic operations
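
For instance, the other arithmetic and comparison operators are applied element by element in the same way. A quick sketch:

```python
import numpy as np

a = np.arange(5)  # [0 1 2 3 4]

print(a + 10)  # [10 11 12 13 14]
print(a * 2)   # [0 2 4 6 8]
print(a / 2)   # true division returns floats
print(a % 2)   # [0 1 0 1 0]
print(a > 2)   # comparisons are element-wise too
```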

### Array and Array

Element-wise operations are not limited to an array and a scalar; they also work between two arrays.

In [117]:
a = range (1000)
b = range (1000, 2000)

In [118]:
a, b

(range(0, 1000), range(1000, 2000))

Question: Can you write a piece of code to add the corresponding elements? \
Hint: Could use `zip`

In [138]:
c = []
for v1, v2 in zip (a, b):
    c.append (v1 + v2)

c [:100]

[1000, 1002, 1004, 1006, 1008, 1010, 1012, 1014, 1016, 1018, 1020, 1022, 1024, 1026, 1028, 1030, 1032, 1034, 1036, 1038, 1040, 1042, 1044, 1046, 1048, 1050, 1052, 1054, 1056, 1058, 1060, 1062, 1064, 1066, 1068, 1070, 1072, 1074, 1076, 1078, 1080, 1082, 1084, 1086, 1088, 1090, 1092, 1094, 1096, 1098, 1100, 1102, 1104, 1106, 1108, 1110, 1112, 1114, 1116, 1118, 1120, 1122, 1124, 1126, 1128, 1130, 1132, 1134, 1136, 1138, 1140, 1142, 1144, 1146, 1148, 1150, 1152, 1154, 1156, 1158, 1160, 1162, 1164, 1166, 1168, 1170, 1172, 1174, 1176, 1178, 1180, 1182, 1184, 1186, 1188, 1190, 1192, 1194, 1196, 1198]

In [141]:
c1 = np.array (a) + np.array (b)
c1 [:100]

array([1000, 1002, 1004, 1006, 1008, 1010, 1012, 1014, 1016, 1018, 1020,
       1022, 1024, 1026, 1028, 1030, 1032, 1034, 1036, 1038, 1040, 1042,
       1044, 1046, 1048, 1050, 1052, 1054, 1056, 1058, 1060, 1062, 1064,
       1066, 1068, 1070, 1072, 1074, 1076, 1078, 1080, 1082, 1084, 1086,
       1088, 1090, 1092, 1094, 1096, 1098, 1100, 1102, 1104, 1106, 1108,
       1110, 1112, 1114, 1116, 1118, 1120, 1122, 1124, 1126, 1128, 1130,
       1132, 1134, 1136, 1138, 1140, 1142, 1144, 1146, 1148, 1150, 1152,
       1154, 1156, 1158, 1160, 1162, 1164, 1166, 1168, 1170, 1172, 1174,
       1176, 1178, 1180, 1182, 1184, 1186, 1188, 1190, 1192, 1194, 1196,
       1198])

Let's see how fast numpy is

In [142]:
%%timeit
c = []
for v1, v2 in zip (a, b):
    c.append (v1 + v2)

356 µs ± 32.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [143]:
%%timeit
c1 = np.array (a) + np.array (b)

317 µs ± 1.62 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


We see that operations on arrays are faster, although only slightly here because the timed cell also converts both ranges into arrays. But what about the space occupied?

In [126]:
from sys import getsizeof

In [144]:
getsizeof (c), getsizeof (c1)

(9024, 4096)

Question: Can you get the size of an int?

In [152]:
# This is wrong: it measures the `int` type object, not an integer value
getsizeof (int)

400

In [153]:
getsizeof (int ())

24
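
Note that `getsizeof` on a list counts only the list object and its array of pointers, not the integer objects those pointers refer to, so the comparison above understates the list's real footprint. A rough sketch that adds the element sizes; exact numbers depend on the Python build and platform, so none are shown, and `list_total` is just an illustrative name:

```python
import numpy as np
from sys import getsizeof

c = [v1 + v2 for v1, v2 in zip(range(1000), range(1000, 2000))]
c1 = np.array(c)

# List: the container plus every int object it points to
list_total = getsizeof(c) + sum(getsizeof(v) for v in c)

# Array: nbytes is the raw data buffer; getsizeof also includes the header
print(list_total, c1.nbytes, getsizeof(c1))
```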

### Matrix multiplication
- Recall the previous session

In [175]:
# 3x3 matrix
x = [[12,7,3],
    [4 ,5,6],
    [7 ,8,9]]

# 3x4 matrix
y = [[5,8,1,2],
    [6,7,3,0],
    [4,5,9,1]]

What if you'd like to multiply these?

In [181]:
# result is 3x4
[[sum (a*b for a, b in zip (x_row, y_col)) for y_col in zip (*y)] for x_row in x]

[[114, 160, 60, 27], [74, 97, 73, 14], [119, 157, 112, 23]]

In [177]:
np.array (x).dot (np.array (y))

array([[114, 160,  60,  27],
       [ 74,  97,  73,  14],
       [119, 157, 112,  23]])

It's not just easier, but also faster

In [187]:
%%timeit
[[sum (a*b for a, b in zip (x_row, y_col)) for y_col in zip (*y)] for x_row in x]

32.6 µs ± 6.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [188]:
%%timeit
np.array (x).dot (np.array (y))

7.04 µs ± 902 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
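
For 2-D arrays the `@` operator and `np.matmul` give the same product as `.dot`, and many people find `@` easier to read. A short sketch using the same matrices:

```python
import numpy as np

x = np.array([[12, 7, 3], [4, 5, 6], [7, 8, 9]])          # 3x3
y = np.array([[5, 8, 1, 2], [6, 7, 3, 0], [4, 5, 9, 1]])  # 3x4

product = x @ y  # same result as x.dot(y) for 2-D arrays

print(product.shape)                             # (3, 4)
print(np.array_equal(product, np.matmul(x, y)))  # True
print(np.array_equal(product, x.dot(y)))         # True
```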


## Indexing numpy arrays

In [200]:
x = [[12,7,3],
    [4 ,5,6],
    [7 ,8,9]]

In [201]:
X = np.array (x)

In [202]:
X [2, 0] # Can also use x [2] [0]

7

In [217]:
X [:2, :2]

array([[12,  7],
       [ 4,  5]])

If there are n dimensions, you can use up to n comma-separated indices or slices
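
As an illustration with three dimensions, here is a sketch on a made-up array named `cube`:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)  # 3 dimensions -> up to 3 indices

print(cube[1, 2, 3])  # 23, the same element as cube[1][2][3]
print(cube[0, :, 1])  # [1 5 9] - second column of the first 3x4 block
print(cube[1].shape)  # (3, 4) - fewer indices return a sub-array
```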

Question: How would you reverse this list without using `reverse`? \
Hint: Recollect `slices`

In [222]:
a = [1, 2, 3, 4, 5]

In [223]:
a [::-1]

[5, 4, 3, 2, 1]

Question: Can you reverse only the order of the columns in the following array?

In [224]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [225]:
a [:, ::-1]

array([[3, 2, 1],
       [6, 5, 4],
       [9, 8, 7]])

In [228]:
a [::-1]

array([[7, 8, 9],
       [4, 5, 6],
       [1, 2, 3]])

In [229]:
a [::-1, ::-1]

array([[9, 8, 7],
       [6, 5, 4],
       [3, 2, 1]])

Using another array or a sequence-like object (list, tuple, ...) to index

In [332]:
a = np.arange (100, 120)

In [333]:
a

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
       113, 114, 115, 116, 117, 118, 119])

In [334]:
idx = np.arange (0, 20, 3)

In [335]:
idx

array([ 0,  3,  6,  9, 12, 15, 18])

In [336]:
a [idx]

array([100, 103, 106, 109, 112, 115, 118])
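
Two further properties of indexing with integer sequences are sketched below: the result is a copy rather than a view, and in 2-D you can pass one index sequence per axis to pick individual elements. The arrays `picked` and `m` are illustrative.

```python
import numpy as np

a = np.arange(100, 120)
picked = a[[0, 3, 6]]  # a plain list works as an index too
picked[0] = -1         # integer-array indexing returns a copy ...
print(a[0])            # ... so the original still holds 100

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Paired row and column indices select m[0, 1] and m[2, 0]
print(m[[0, 2], [1, 0]])  # [2 7]
```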

### Boolean Array

In [203]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [212]:
bool_arr = a > 5
bool_arr

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])

In [213]:
a [bool_arr] # Keeps only the values where the mask is True

array([6, 7, 8, 9])

This could act as a mask to filter values

Question: Can we use `bool_arr` to filter values in `b`?

In [207]:
b = np.array ([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

In [214]:
b [bool_arr]

array([60, 70, 80, 90])

**Note:** The boolean array should have the same shape as the array being indexed
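
Masks can be combined and can also appear on the left-hand side of an assignment. A minimal sketch; the names `middle` and `clipped` are illustrative:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Use & and | (not `and`/`or`) and parenthesise each comparison
middle = a[(a > 2) & (a < 8)]
print(middle)  # [3 4 5 6 7]

# Assign through a mask: every value greater than 5 becomes 0
clipped = a.copy()
clipped[clipped > 5] = 0
print(clipped)
```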

In [273]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

Question: Can you do this?

```python
a [0, 0] = '1'
```

What about this?
```python
a [0, 0] = 'string'
```

In [284]:
a [0, 0] = '1'

In [285]:
a [0, 0] = 'string'

ValueError: invalid literal for int() with base 10: 'string'
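
The behaviour follows from the array having a fixed dtype: an assigned value is converted to that dtype, and the assignment fails only when the conversion does. A small sketch of both outcomes, assuming an integer array as above:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

a[0, 0] = '42'  # the string parses as an integer, so this works
a[0, 1] = 9.9   # a float is truncated to fit the integer dtype
print(a[0])     # [42  9  3]

try:
    a[0, 2] = 'string'  # cannot be parsed as an integer
except ValueError as err:
    print('rejected:', err)
```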

## Getting the list back

In [196]:
a = np.array ([1, 2, 3, 4, 5])

In [197]:
a

array([1, 2, 3, 4, 5])

In [199]:
a.tolist (), type (a.tolist ())

([1, 2, 3, 4, 5], <class 'list'>)

## Missing and Infinite values

In [231]:
np.nan, np.inf

(nan, inf)

In [251]:
a = np.array ([
    [1, 2, 3],
    [4, 5, np.inf],
    [7,  np.nan, 9]
], dtype=float)

In [253]:
a

array([[ 1.,  2.,  3.],
       [ 4.,  5., inf],
       [ 7., nan,  9.]])

In [254]:
np.isnan (a)

array([[False, False, False],
       [False, False, False],
       [False,  True, False]])

In [255]:
np.isinf (a)

array([[False, False, False],
       [False, False,  True],
       [False, False, False]])
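
`np.isnan` is needed because `nan` never compares equal to anything, including itself. To drop both kinds of special values at once, `np.isfinite` can serve as a mask. A minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, np.inf], [7, np.nan, 9]], dtype=float)

print(np.nan == np.nan)     # False - NaN never compares equal
finite = a[np.isfinite(a)]  # keep values that are neither nan nor inf
print(finite)               # [1. 2. 3. 4. 5. 7. 9.]
print(finite.mean())        # mean over the finite values only
```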

## Reshaping an array

In [286]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [337]:
bool_arr = np.arange (1, 10) > 5

In [338]:
bool_arr

array([False, False, False, False, False,  True,  True,  True,  True])

Question: Will this work?

In [289]:
a [bool_arr]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 3 but corresponding boolean dimension is 9

It should have the same shape

In [420]:
a.shape

(3, 3)

In [294]:
a [bool_arr.reshape (a.shape)]

array([6, 7, 8, 9])
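
When reshaping, one dimension may be given as `-1` and NumPy infers it from the total number of elements. The reshape fails if the element count does not fit. A short sketch on a made-up array:

```python
import numpy as np

flat = np.arange(12)

grid = flat.reshape(3, -1)  # -1 lets NumPy infer the missing dimension
print(grid.shape)           # (3, 4)

try:
    flat.reshape(5, -1)     # 12 elements cannot be split into 5 rows
except ValueError as err:
    print('cannot reshape:', err)
```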

## Flatten

In [295]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [300]:
flat_a = a.flatten ()
flat_a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Flattens an n-dimensional array to 1d

## Ravel

In [302]:
ravel_a = a.ravel ()
ravel_a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

The difference between the two is that `flatten` always returns a copy, whereas `ravel` returns a view of the original whenever possible

In [303]:
flat_a [0] = 100

In [304]:
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [305]:
ravel_a [0] = 100

In [306]:
a

array([[100,   2,   3],
       [  4,   5,   6],
       [  7,   8,   9]])
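
Instead of modifying an element and inspecting the original, `np.shares_memory` can check directly whether two arrays use the same buffer. The sketch below also shows a case where `ravel` has to copy, namely a transposed array whose memory layout is not row-major:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(np.shares_memory(a, a.flatten()))  # False - always a copy
print(np.shares_memory(a, a.ravel()))    # True  - view of contiguous data
print(np.shares_memory(a, a.T.ravel()))  # False - transposed layout forces a copy
```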

## Unique elements

In [308]:
a = np.array ([1, 2, 3, 4, 1, 2, 4, 5, 6])

In [312]:
np.unique (a)

array([1, 2, 3, 4, 5, 6])
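
`np.unique` can also report how often each value occurs through its `return_counts` keyword. A quick sketch on the same data:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 1, 2, 4, 5, 6])

values, counts = np.unique(a, return_counts=True)
print(values)  # [1 2 3 4 5 6]
print(counts)  # [2 2 1 2 1 1]
```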

## Random functions

In [426]:
np.random.seed (1)
np.random.rand (2, 3)

array([[4.17022005e-01, 7.20324493e-01, 1.14374817e-04],
       [3.02332573e-01, 1.46755891e-01, 9.23385948e-02]])

In [None]:
np.random.seed (1) # Use this if you want to reproduce the same random numbers

In [374]:
np.random.randint (1, 10, 5)

array([3, 5, 8, 8, 2])

In [375]:
np.random.random () # A random number between 0 and 1

0.14038693859523377

In [428]:
np.random.choice (['apple', 'ball', 'cat' ,'dog']) # Picks randomly from the given list

'dog'

In [429]:
np.random.choice (['apple', 'ball', 'cat' ,'dog'], size=10, p=[1, 0., 0., 0.]) # Specifying the probability of each item

array(['apple', 'apple', 'apple', 'apple', 'apple', 'apple', 'apple',
       'apple', 'apple', 'apple'], dtype='<U5')
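
The functions above use NumPy's global random state. Newer code often uses a `Generator` created with `np.random.default_rng`, which keeps the seed local to one object. This is an alternative to the cells above, not what the notebook itself uses; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded generator, independent of np.random.seed

floats = rng.random((2, 3))            # floats in [0, 1)
ints = rng.integers(1, 10, size=5)     # integers in [1, 10)
picks = rng.choice(['apple', 'ball', 'cat', 'dog'], size=3)
print(floats, ints, picks)

# Same seed, same stream: useful for reproducible experiments
same = np.array_equal(np.random.default_rng(1).random(4),
                      np.random.default_rng(1).random(4))
print(same)  # True
```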

## Broadcasting

In [360]:
a = np.array ([
    [1, 2, 3],
    [4, 5, 6],
])

In [349]:
b = np.array ([10, 20, 30])

In [None]:
np.array ([2, 2, 2 ])

In [430]:
b + 2

array([12, 22, 32])

In [350]:
a.shape, b.shape

((2, 3), (3,))

In [351]:
a + b

array([[11, 22, 33],
       [14, 25, 36]])

How did this work? 
- Internally the array b is reshaped to (1, 3)
- a - (2, 3)
- b - (1, 3)
- b is then replicated (twice in this case) - (2, 3)
- Result (2, 3)

In [358]:
np.array ((b, b))

array([[10, 20, 30],
       [10, 20, 30]])

In [361]:
a + np.array ((b, b))

array([[11, 22, 33],
       [14, 25, 36]])

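**Broadcasting Rule**: shapes are compared from the trailing (rightmost) dimension backwards, and each pair of dimensions must satisfy one of the following
- The dimensions are equal
- At least one of the dimensions is 1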
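
The rule can be sketched as a small function. `broadcast_shape` is a hypothetical helper written for illustration, not a NumPy API; the last line cross-checks it against the shape NumPy itself computes via `np.broadcast`.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: shape after broadcasting, or ValueError."""
    result = []
    # Walk both shapes from the trailing dimension; pad the shorter one with 1s
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == 1:
            result.append(dim_b)  # a is stretched to match b
        elif dim_b == 1 or dim_a == dim_b:
            result.append(dim_a)  # b is stretched, or the sizes already agree
        else:
            raise ValueError(f'shapes {shape_a} and {shape_b} are not compatible')
    return tuple(reversed(result))

print(broadcast_shape((2, 3), (3,)))  # (2, 3)
print(broadcast_shape((4, 1), (5,)))  # (4, 5)
print(np.broadcast(np.empty((4, 1)), np.empty((5,))).shape)  # (4, 5)
```

Calling `broadcast_shape((4,), (5,))` raises `ValueError`, since 4 and 5 differ and neither is 1.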

In [362]:
x = np.arange (4)
xx = x.reshape ((4,1))

y = np.arange (10, 15)

z = np.ones ((3,4))

In [363]:
x.shape, y.shape, xx.shape, z.shape

((4,), (5,), (4, 1), (3, 4))

In [346]:
x

array([0, 1, 2, 3])

In [347]:
y

array([10, 11, 12, 13, 14])

**Question:** Will this work?

In [344]:
x + y

ValueError: operands could not be broadcast together with shapes (4,) (5,) 

In [345]:
xx + y

array([[10, 11, 12, 13, 14],
       [11, 12, 13, 14, 15],
       [12, 13, 14, 15, 16],
       [13, 14, 15, 16, 17]])

(4, 1) - Broadcast along the column\
(1, 5) - Broadcast along the row\
(4, 5) - Result after broadcasting
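
Instead of `reshape`, the extra axis can be added inline with `np.newaxis`, which is a common idiom for building such tables. A minimal sketch; the name `table` is illustrative:

```python
import numpy as np

x = np.arange(4)
y = np.arange(10, 15)

table = x[:, np.newaxis] + y  # (4, 1) + (5,) -> (4, 5)
print(table.shape)            # (4, 5)
print(np.array_equal(table, x.reshape((4, 1)) + y))  # True
```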

## Exercise 1

The program will first randomly generate a number unknown to the user. The user needs to guess what that number is. (In other words, the user needs to be able to input information.) If the user’s guess is wrong, the program should return some sort of indication as to how wrong (e.g. The number is too high or too low). If the user guesses correctly, a positive indication should appear. You’ll need functions to check if the user input is an actual number, to see the difference between the inputted number and the randomly generated numbers, and to then compare the numbers.

Concepts to keep in mind:

- Random function
- Variables
- Integers
- Input/Output
- Print
- While loops
- If/Else statements


In [398]:
# Generate a random integer in the range [0, 100)
rand_no = int (np.random.randint (100))

# Info string
info = 'Your guess is {0} than the actual number'

# Run this until the user guesses the number correctly or exits
while True:
    # Ask the user to make a guess
    user_input = input ('Make a guess')

    # Terminating condition: check before converting, since int ('exit') would raise
    if user_input == 'exit':
        break

    # Convert the input to an integer, rejecting anything that is not a number
    try:
        guessed_value = int (user_input)
    except ValueError:
        print ('Please enter a whole number, or exit to quit')
        continue

    if guessed_value == rand_no:
        print ('Got it!!!')
        break # Stop once the guess is correct
    elif guessed_value > rand_no:
        print (info.format ('higher'))
    else:
        print (info.format ('lower'))

Make a guess 10


Your guess is lower than the actual number


Make a guess 100


Your guess is higher than the actual number


Make a guess 50


Your guess is lower than the actual number


Make a guess 60


Your guess is lower than the actual number


Make a guess 70


Your guess is lower than the actual number


Make a guess 80


Your guess is higher than the actual number


Make a guess 75


Your guess is higher than the actual number


Make a guess 73


Your guess is lower than the actual number


Make a guess 74
