# Numpy II

## 1. Generating arrays with random numbers

NumPy has a sub-module called 'random', which generates random numbers drawn from a given distribution. It is especially useful for randomly sampling data for experiments.

We will look at two functions, rand() and randint(), which are the most common and the most intuitive to use.
* **rand()** function takes the number of values to generate as an integer argument and returns that many random floating point values drawn uniformly from the interval [0, 1), i.e. 0 is included and 1 is not.
* **randint()** function takes 3 parameters - lower limit, upper limit and number of values to be generated. As the name suggests, it generates the given number of random integers in the specified range. The lower limit is included and the upper limit is excluded.

Examples:

```python
np.random.rand(5)
>>> array([ 0.93371582,  0.82386466,  0.34771991,  0.59338646,  0.41190981])

np.random.randint(1000,5000,10)
>>> array([4825, 1466, 4025, 2931, 1693, 2385, 2857, 1767, 2902, 1759])
```
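
As a minimal sketch of how these two functions behave, the snippet below fixes a seed so the values repeat from run to run and then checks the ranges described above. The seed value 42 and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, chosen only so reruns give the same values

floats = np.random.rand(5)           # 5 floats drawn from [0, 1)
ints = np.random.randint(1, 7, 10)   # 10 integers from 1 to 6 (7 is excluded)

print(floats.min() >= 0.0, floats.max() < 1.0)
print(ints.min() >= 1, ints.max() <= 6)

# Re-seeding with the same value reproduces the same sequence
np.random.seed(42)
repeated = np.random.rand(5)
print(np.array_equal(floats, repeated))
```

Because the generator is re-seeded before the second call, the last line should report that both draws are identical.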


### Exercise

Generate an array of 5 floating point values using NumPy's random functions. Each value in the array should be greater than 1. Please set a seed value so that the result is reproducible.

In [1]:
import numpy as np

# Modify the code below
mixed_array = []

# hint

Use both the rand() and randint() functions and add their results to create an array of random floating point values greater than 1. Make sure the lower limit passed to the randint() function is at least 1. Use np.random.seed(0) to fix the seed value.

In [2]:
# solution
np.random.seed(0)
floats = np.random.rand(5)
ints = np.random.randint(1,5,5)

mixed_array = ints + floats
print(mixed_array)

[2.5488135  3.71518937 1.60276338 4.54488318 3.4236548 ]


In [3]:
from refactored import unittest
np.random.seed(0)

ref_tmp_var = False
value_check = True

for i in mixed_array:
    if i<=1:
        value_check = False

ref_tmp_var = unittest.test_value(np.random.rand(5)) and value_check
assert ref_tmp_var

## 2. Re-shaping an array

The reshape() function in NumPy returns the contents of a given array arranged in a specified new shape. The new shape must hold exactly as many elements as the original array. For example,

<img src="../images/numpy_2-reshaping_array.png" width="500">

```python
shape_shifter = np.random.rand(12)
shape_shifter
>>> array([ 0.906423  ,  0.55807204,  0.28928162,  0.47020116,  0.27403332,
>>>         0.94178672,  0.81342077,  0.5859645 ,  0.63569185,  0.84614272,
>>>         0.36454835,  0.63664789])

shape_shifter.shape
>>> (12,)

shape_shifter.reshape(3,4)
>>> array([[ 0.906423  ,  0.55807204,  0.28928162,  0.47020116],
>>>        [ 0.27403332,  0.94178672,  0.81342077,  0.5859645 ],
>>>        [ 0.63569185,  0.84614272,  0.36454835,  0.63664789]])

shape_shifter.reshape(4,3)
>>> array([[ 0.906423  ,  0.55807204,  0.28928162],
>>>        [ 0.47020116,  0.27403332,  0.94178672],
>>>        [ 0.81342077,  0.5859645 ,  0.63569185],
>>>        [ 0.84614272,  0.36454835,  0.63664789]])
```
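
The following sketch illustrates the constraint behind reshape(): the new shape has to account for every element. It uses np.arange(12) rather than random values so the result is easy to follow. The variable names are illustrative only.

```python
import numpy as np

flat = np.arange(12)          # 12 elements: 0 .. 11

grid = flat.reshape(3, 4)     # 3 * 4 == 12, so this works
print(grid.shape)             # (3, 4)
print(flat.shape)             # (12,) - the original array keeps its shape

# -1 asks NumPy to infer that dimension from the total size
inferred = flat.reshape(-1, 6)
print(inferred.shape)         # (2, 6)

# A shape whose element count differs from 12 is rejected
try:
    flat.reshape(5, 3)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)      # True
```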

### Exercise

Change the shape of the given array to 2 rows and 5 columns.

In [4]:
# Modify the code below

twor_fivec = np.arange(10)

# hint

Use the **reshape()** function and refer to the usage above

In [5]:
twor_fivec = twor_fivec.reshape(2,5)
twor_fivec

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [6]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.arange(10).reshape(2,5))

assert ref_tmp_var

## 3. Indexing and Selection

A NumPy array can be indexed like a Python list: elements are accessed by their indices. The first element has index '0' and subsequent elements have indices 1, 2, 3 and so on, so the nth element of the array has index 'n-1'.

<img src="../images/numpy_2-indexing_array.png" width="400">

``` python
n_arr = np.array([1, 7, 4, 3, 3])

n_arr[3:5]
# Selects elements from index '3' to '4' (i.e. up to, but not including, the specified end value)
>>> array([3, 3])

n_arr[:3]
# Absence of start value defaults to index '0' (i.e. the first element)
>>> array([1, 7, 4])

n_arr[2:]
# Absence of end value defaults to the end of the array (i.e. through the last element)
>>> array([4, 3, 3])

n_arr[:]
>>> array([1, 7, 4, 3, 3])

n_arr[-1]
# Negative indices count backwards from the last element
>>> 3
```
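
One detail worth seeing in code is how indices and slices treat positions outside the array. The sketch below reuses the n_arr values from the example above. The flag name index_rejected is introduced here only for illustration.

```python
import numpy as np

n_arr = np.array([1, 7, 4, 3, 3])

print(n_arr[-2:])      # last two elements: [3 3]
print(n_arr[3:100])    # a slice end past the array is clipped: [3 3]

# A single index outside the array is an error, unlike a slice
try:
    n_arr[10]
    index_rejected = False
except IndexError:
    index_rejected = True
print(index_rejected)  # True
```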

Elements of a NumPy array can also be selected (or conditionally retrieved) by using a condition in place of an index. When a condition is applied to an array (as shown below), each element is tested against it and a boolean array is generated, with 'True' wherever the element satisfies the condition. When such an 'array condition' is used in place of an index, the boolean array is passed to the outer array, and all elements in the 'True' positions are retrieved. The examples below illustrate this.

```python
array_one = np.array([16,  1,  8,  1, 17, 10,  8, 15,  6, 14])
array_one > 10
>>> array([ True, False, False, False, True, False, False, True, False, True])

array_one[array_one>10]
>>> array([16, 17, 15, 14])

array_two = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
array_two < 5
>>> array([ True,  True,  True,  True,  True, False, False, False, False, False])

array_two[array_two < 5]
>>> array([0, 1, 2, 3, 4])
```
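
Conditions can also be combined. The sketch below, which reuses the array_one values from above, stores the boolean array in a variable (named mask here for illustration) and joins two conditions with the element-wise & operator.

```python
import numpy as np

array_one = np.array([16, 1, 8, 1, 17, 10, 8, 15, 6, 14])

# Each comparison needs its own parentheses; & means "and", | means "or"
mask = (array_one > 5) & (array_one < 15)

selected = array_one[mask]
print(selected)      # [ 8 10  8  6 14]
print(mask.sum())    # 5, since each True counts as 1
```

Python's own `and` and `or` keywords do not work element-wise on arrays, which is why the & and | operators are used instead.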

### Exercise

Retrieve all elements in the given array that are greater than or equal to 25.63.

In [7]:
array_three = np.array([46.56311588, 49.66285409, 28.01145694, 15.4632352, 16.36194605, 23.26915095, 36.77562698, 41.97868793, 35.6520983, 24.85098496])

# hint

Refer to the examples above

In [8]:
array_three[array_three >= 25.63]

array([46.56311588, 49.66285409, 28.01145694, 36.77562698, 41.97868793,
       35.6520983 ])

In [9]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(array_three[array_three >= 25.63])

assert ref_tmp_var

## 4. Re-casting, Broadcasting and Duplicating arrays

Re-casting and broadcasting are the two terms this tutorial uses for changing the values of an array in place. If one or more values (but not all) of an array are changed, it is called **re-casting**. If all values of the array are changed, it is called **broadcasting**. In both cases the single assigned value is stretched to fill every selected position, which is the behaviour that NumPy's own documentation refers to as broadcasting. The conditional selection shown in the previous section can also be used to conditionally re-cast certain elements of an array. Refer to the examples below:

<img src="../images/numpy_2-recasting_array.png" width="600">

* <b>Re-casting:</b>
```python
array_rec = np.array([16,  1,  8,  1, 17, 10,  8, 15,  6, 14])
array_rec[3:6] = 100
array_rec
>>> array([ 16,   1,   8, 100, 100, 100,   8,  15,   6,  14])
```

* <b>Broadcasting:</b>
```python
array_rec = np.array([16,  1,  8,  1, 17, 10,  8, 15,  6, 14])
array_rec[:] = 100
array_rec
>>> array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100])
```

* <b>Conditional re-casting:</b>
```python
array_rec = np.array([16,  1,  8,  1, 17, 10,  8, 15,  6, 14])
array_rec[array_rec>10] = 100
array_rec
>>> array([100,   1,   8,   1, 100,  10,   8, 100,   6, 100])
```
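
All three assignments above modify the array in place. When the original values should be kept, one possible alternative is np.where(), which builds a new array instead. The sketch below repeats the conditional example that way. The name replaced is chosen only for illustration.

```python
import numpy as np

array_rec = np.array([16, 1, 8, 1, 17, 10, 8, 15, 6, 14])

# np.where(condition, value_if_true, value_if_false) builds a new array
replaced = np.where(array_rec > 10, 100, array_rec)

print(replaced)     # [100   1   8   1 100  10   8 100   6 100]
print(array_rec)    # [16  1  8  1 17 10  8 15  6 14]
```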

**Duplicating** a NumPy array needs some care. With ordinary values such as numbers, assigning one variable to another effectively gives you an independent copy. See the example below:
```python
a = 10
b = a
print(b)
>>> 10
```
However, when the same logic is used to assign arrays, the values are not copied. Both names refer to the same array object in memory, so any change made through the second name is also visible through the first.
```python
arr_1 = np.array([1,2,3,4,5,6,7,8,9])
arr_2 = arr_1
arr_2[3:6] = 4444
print(arr_1,arr_2)
>>> [1, 2, 3, 4444, 4444, 4444, 7, 8, 9] [1, 2, 3, 4444, 4444, 4444, 7, 8, 9]

# or, overwriting every element:

arr_1 = np.array([1,2,3,4,5,6,7,8,9])
arr_2 = arr_1
arr_2[:] = [1,22,333,4444,55555,666666,7777777,88888888,999999999]
print(arr_1,arr_2)
>>> [1, 22, 333, 4444, 55555, 666666, 7777777, 88888888, 999999999] [1, 22, 333, 4444, 55555, 666666, 7777777, 88888888,
>>>  999999999]
```

Hence, when a separate copy of an array is needed, use the .copy() function. It creates a new array that can be changed without affecting the original array.

```python
arr_1 = np.array([1,2,3,4,5,6,7,8,9])
arr_2 = arr_1.copy()
arr_2[3:6] = 4444
print(arr_1,arr_2)
>>> [1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3, 4444, 4444, 4444, 7, 8, 9]
```
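
To check whether two names share the same data, the identity operator `is` and np.shares_memory() can help. The sketch below also shows that a slice behaves like the plain assignment rather than like .copy(). The names alias, duplicate and window are illustrative only.

```python
import numpy as np

arr_1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

alias = arr_1              # a second name for the same array
duplicate = arr_1.copy()   # an independent array with the same values
window = arr_1[3:6]        # a slice is a view onto the original data

print(alias is arr_1)                       # True
print(np.shares_memory(arr_1, duplicate))   # False
print(np.shares_memory(arr_1, window))      # True

window[:] = 0              # writing through the view changes arr_1 too
print(arr_1)               # [1 2 3 0 0 0 7 8 9]
print(duplicate)           # [1 2 3 4 5 6 7 8 9]
```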

### Exercise

Given the array 'tran_arr', create two copies of it: one that only references the original array ('tran_arr'), and another that duplicates its values using the .copy() function. Set the 5th element of the first copy to 25 and the 5th element of the second copy to 50. Print all 3 arrays and observe the changes.

In [10]:
tran_arr = np.arange(1,21)
copy_1 = []
copy_2 = []

# hint

In [11]:
copy_1 = tran_arr
copy_2 = tran_arr.copy()
copy_1[4] = 25
copy_2[4] = 50
print(tran_arr,"\n",copy_1,"\n",copy_2)

[ 1  2  3  4 25  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20] 
 [ 1  2  3  4 25  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20] 
 [ 1  2  3  4 50  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]


In [12]:
ref_tmp_var = False

a1 = [1, 2, 3, 4, 25, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
a2 = [1, 2, 3, 4, 50, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

pass_val = False
if np.array_equal(a1,copy_1) and np.array_equal(a2,copy_2):
    pass_val = True

ref_tmp_var = pass_val

assert ref_tmp_var

## 5. Max, Min, ArgMax, ArgMin

Four simple functions that help a great deal when performing numerical computations on a large array of data are max(), min(), argmax() and argmin().

* max() - returns the maximum value in a given array
* min() - returns the minimum value in a given array
* argmax() - returns the index position of the maximum value in the given array
* argmin() - returns the index position of the minimum value in the given array

```python
shape_shifter
>>> array([ 0.906423  ,  0.55807204,  0.28928162,  0.47020116,  0.27403332,
>>>         0.94178672,  0.81342077,  0.5859645 ,  0.63569185,  0.84614272,
>>>         0.36454835,  0.63664789])

shape_shifter.max()
>>> 0.94178671566784411

shape_shifter.min()
>>> 0.27403331882439208

shape_shifter.argmax()
>>> 5

shape_shifter.argmin()
>>> 4
```
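
When the maximum or minimum occurs more than once, argmax() and argmin() return the index of the first occurrence. The sketch below uses a small made-up array, scores, to show this and to confirm that the returned index points back at the max() and min() values.

```python
import numpy as np

scores = np.array([3, 9, 4, 9, 1, 1])

best = scores.argmax()    # 9 appears at indices 1 and 3; the first one wins
worst = scores.argmin()   # 1 appears at indices 4 and 5; the first one wins
print(best, worst)                      # 1 4

print(scores[best] == scores.max())     # True
print(scores[worst] == scores.min())    # True
```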

### Exercise

An array is created below. Use the max, min, argmax and argmin functions on the given array and print out the results.

In [13]:
# Edit the code below

X = np.array([70, 81, 80, 55, 48, 17, 60, 80, 20, 46])
# max_X = 
# min_X = 
# argmax_X = 
# argmin_X = 

# hint

In [14]:
max_X = X.max()
min_X = X.min()
argmax_X = X.argmax()
argmin_X = X.argmin()
print("Max value is %d,\nMin value is %d,\nMax value index is %d,\nMin value index is %d"
      %(max_X,min_X,argmax_X,argmin_X))

Max value is 81,
Min value is 17,
Max value index is 1,
Min value index is 5


In [15]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(X.max()) and unittest.test_value(X.min()) and unittest.test_value(X.argmax()) and unittest.test_value(X.argmin())

assert ref_tmp_var