# Advanced NumPy

In [1]:
import numpy as np

## 1. Universal functions

- Python is a dynamically typed language: the data type of a variable is not declared explicitly when the variable is created.

```C
/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}
```

```python
# Python code
result = 0
for i in range(100):
    result += i
```

- Because of this, a Python variable can be bound to a value of any type. A Python object therefore carries more information than the raw value stored at a memory location, such as its type and reference count.

![cint_vs_pyint](img/cint_vs_pyint.png)

- A Python list can contain elements of different types, so each element must carry its own type and other information. The elements are therefore separate objects that are not stored in one contiguous block of memory.

![array_vs_list](img/array_vs_list.png)
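The contrast between a list and an array can be inspected directly. The sketch below is only illustrative: the variable names are made up for this example, and the exact `itemsize` depends on the platform's default integer type.

```python
import numpy as np

# A list may mix types because every element is a full Python object.
mixed_list = [1, "two", 3.0]
element_types = [type(item).__name__ for item in mixed_list]
print(element_types)

# A NumPy array stores a single dtype in one buffer.
int_array = np.array([1, 2, 3])
print(int_array.dtype, int_array.itemsize, int_array.flags["C_CONTIGUOUS"])

# The buffer size is the element count times the fixed per-element size.
print(int_array.nbytes == int_array.size * int_array.itemsize)
```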

- A **vectorized operation** is an operation performed on a whole array, which NumPy then applies to each element.
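As a minimal sketch of the idea, the same reciprocal computation is written below once as an explicit loop and once as a single array expression. The names `loop_result` and `vectorized_result` are illustrative only.

```python
import numpy as np

values = np.arange(1, 6)

# Explicit Python loop: one interpreter-level operation per element.
loop_result = np.empty(len(values))
for i in range(len(values)):
    loop_result[i] = 1.0 / values[i]

# Vectorized: the same computation as a single array expression.
vectorized_result = 1.0 / values

print(np.allclose(loop_result, vectorized_result))
```

Both versions give the same numbers; the vectorized form moves the loop out of the Python interpreter and into NumPy's compiled code.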

### What are universal functions?

- **Universal Functions (ufuncs)** in NumPy are functions that operate element-wise on arrays, meaning they perform an operation on each element of an array, producing an output array with the same shape. Ufuncs are designed for efficient and vectorized operations, which are much faster than using Python loops to perform the same computations.
- **unary ufuncs** operate on a single input, and **binary ufuncs** operate on two inputs.
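One way to check whether a ufunc is unary or binary is its `nin` attribute, which reports the number of inputs. A short sketch, with `same_result` as an illustrative name:

```python
import numpy as np

print(np.negative.nin)  # number of inputs of a unary ufunc
print(np.add.nin)       # number of inputs of a binary ufunc

# The arithmetic operators are shorthand for the corresponding ufuncs.
x = np.arange(4)
same_result = np.array_equal(x + 5, np.add(x, 5))
print(same_result)
```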

In [None]:
# Binary ufuncs

x = np.arange(4)
print("x     =", x)
print("x + 5 =", x + 5)    # np.add(x, 5)
print("x - 5 =", x - 5)    # np.subtract(x, 5)
print("x * 2 =", x * 2)    # np.multiply(x, 2)
print("x / 2 =", x / 2)    # np.divide(x, 2)
print("x // 2 =", x // 2)  # np.floor_divide(x, 2)

In [None]:
# A unary ufunc, plus two more binary ufuncs

print("-x     = ", -x)     # np.negative(x)
print("x ** 2 = ", x ** 2) # np.power(x, 2)
print("x % 2  = ", x % 2)  # np.mod(x, 2)

In [None]:
# Absolute
print(np.absolute(x))
print(np.abs(x))

In [None]:
# Trigonometric functions

theta = np.linspace(0, np.pi, 3)

print("theta      = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

x = [-1, 0, 1]
print("x         = ", x)
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))

In [None]:
# Exponential
x = [1, 2, 3]
print("x     =", x)
print("e^x   =", np.exp(x))
print("2^x   =", np.exp2(x))
print("3^x   =", np.power(3, x))

In [None]:
# Logarithm

x = [1, 2, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))

## 2. Broadcasting

**Broadcasting** is a set of rules for applying binary ufuncs (e.g., addition, subtraction, multiplication) to arrays of different shapes.

In [None]:
a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b

In [None]:
print(np.arange(3) + 5)
print(np.ones((3,3)) + np.arange(3))
print(np.arange(3).reshape((3,1)) + np.arange(3))

![broadcasting](img/02.05-broadcasting.png)

Rules of broadcasting

1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its left side.

2. If the shapes of the two arrays do not match in some dimension, the array whose size is 1 in that dimension is stretched to match the size of the other array.

3. If the sizes differ in any dimension and neither size is equal to 1, an error is raised.
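The three rules can be sketched as a small function on shape tuples. `broadcast_shape` is a hypothetical helper written for this illustration, not part of NumPy; the last line cross-checks it against the shape NumPy itself computes.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: combine two shapes using the three rules above."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on its left side.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)  # sizes agree, or rule 2 stretches b
        elif size_a == 1:
            result.append(size_b)  # rule 2 stretches a
        else:
            # Rule 3: sizes differ and neither is 1.
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((2, 3), (3,)))
print(broadcast_shape((3, 1), (3,)))
# Cross-check against NumPy itself.
print(np.broadcast(np.empty((3, 1)), np.empty(3)).shape)
```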

Example 1:

In [None]:
matrix = np.ones((2, 3))
array = np.arange(3)

print(matrix.shape)
print(array.shape)

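Rule 1 pads the shape of `array` with a one on the left: <br>
matrix.shape -> (2, 3) <br>
array.shape -> (1, 3)

Rule 2 stretches the first dimension of `array` to match: <br>
matrix.shape -> (2, 3) <br>
array.shape -> (2, 3)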

In [None]:
print(matrix, end="\n\n")
print(array,end="\n\n")
print(matrix + array)

Example 2:

In [None]:
matrix = np.arange(3).reshape((3, 1))
array = np.arange(3)

print(matrix.shape)
print(array.shape)

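Rule 1 pads the shape of `array` with a one on the left: <br>
matrix.shape -> (3, 1) <br>
array.shape -> (1, 3)

Rule 2 stretches each size-1 dimension to match the other array: <br>
matrix.shape -> (3, 3) <br>
array.shape -> (3, 3)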

In [None]:
print(matrix, end="\n\n")
print(array,end="\n\n")
print(matrix + array)

Example 3:

In [None]:
matrix = np.ones((3, 2))
array = np.arange(3)

print(matrix.shape)
print(array.shape)

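Rule 1 pads the shape of `array` with a one on the left: <br>
matrix.shape -> (3, 2) <br>
array.shape -> (1, 3)

Rule 2 stretches the first dimension of `array` to match: <br>
matrix.shape -> (3, 2) <br>
array.shape -> (3, 3)

The last dimensions (2 and 3) still differ and neither is 1, so by rule 3 the addition below raises an error.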

In [None]:
print(matrix, end="\n\n")
print(array,end="\n\n")
print(matrix + array)
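The failure in Example 3 can also be observed without interrupting the notebook by catching the exception. A minimal sketch; `error_message` is an illustrative name.

```python
import numpy as np

matrix = np.ones((3, 2))
array = np.arange(3)

try:
    matrix + array  # shapes (3, 2) and (3,) are incompatible
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```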

In [None]:
# solution: reshape array
matrix = np.ones((3, 2))
array = np.arange(3)
array = array[:, np.newaxis]

print(matrix.shape)
print(array.shape)

matrix + array

**EXERCISE**

Use NumPy to implement the following metrics.

1. Mean Square Error (MSE)

![mse](img/mse.png)

In [None]:
def mse(target, prediction):
    # YOUR CODE
    pass

target = np.random.randint(low=0, high=2, size=(3,))
prediction = np.random.random(size=target.shape)
print(f"Target:\n{target}")
print(f"Prediction:\n{prediction}")
print(f"MSE: {mse(target, prediction)}")

2. Root Mean Square Error (RMSE)

![rmse](img/rmse.png)

In [None]:
def rmse(target, prediction):
    # YOUR CODE
    pass

target = np.random.randint(low=0, high=2, size=(3,))
prediction = np.random.random(size=target.shape)
print(f"Target:\n{target}")
print(f"Prediction:\n{prediction}")
print(f"RMSE: {rmse(target, prediction)}")

3. Mean Absolute Error (MAE)

![mae](img/mae.png)

In [None]:
def mae(target, prediction):
    # YOUR CODE
    pass

target = np.random.randint(low=0, high=2, size=(3,))
prediction = np.random.random(size=target.shape)
print(f"Target:\n{target}")
print(f"Prediction:\n{prediction}")
print(f"MAE: {mae(target, prediction)}")

4. Update the `mse()` function so that it can return the MSE along a given axis.

In [None]:
def mse(target, prediction, axis=None):
    return ((target - prediction)**2).mean(axis=axis)

target = np.random.randint(low=0, high=2, size=(3,5))
prediction = np.random.random(size=target.shape)
print(f"Target:\n{target}")
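print(f"Prediction:\n{prediction}")
print(f"MSE along axis 0: {mse(target, prediction, axis=0)}")
print(f"MSE along axis 1: {mse(target, prediction, axis=1)}")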

## 3. Boolean Masking

### 3.1. Boolean Array

NumPy also implements comparison operators such as `<` (less than) and `>` (greater than) as element-wise ufuncs.

In [None]:
my_array = np.array((1,2,3,4,5,6))
print(my_array < 3)
print(np.less(my_array, 3))

| Operator | Equivalent ufunc |
| -------- | -----------------|
| ==       | np.equal         |
| <        | np.less          |
| >        | np.greater       |
| !=       | np.not_equal     |
| <=       | np.less_equal    |
| >=       | np.greater_equal |


#### Practical uses of Boolean Array

- Count elements based on a condition

In [None]:
# 1D array
my_array = np.arange(10)
print(f"my_array: {my_array}")

n_less_than_4 = (my_array < 4)
print(f"Boolean array of elements less than 4: {n_less_than_4}")
print(f"How many elements less than 4? {n_less_than_4.sum()}") # False is interpreted as 0, and True is interpreted as 1

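# Multiplying by the Boolean array zeroes out the elements that fail the condition
sum_less_than_4 = (my_array * n_less_than_4).sum()
print(f"Sum of elements less than 4: {sum_less_than_4}")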

In [None]:
# 2D array
my_array = np.arange(12).reshape((3,4))
print(f"my_array:\n{my_array}")

bool_even = (my_array % 2 == 0)
print(f"Boolean array of even elements:\n{bool_even}")

n_even = bool_even.sum(axis=1)
print(f"Number of even elements in each row: {n_even}")

- Check whether all or any of the elements in an array match a condition with `np.all()` and `np.any()`

In [None]:
my_array = np.arange(10)
print(f"my_array:\n{my_array}")

print("Are all elements non-negative?", np.all(my_array >= 0))
print("Is any element larger than 9?", np.any(my_array > 9))
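Both functions also accept an `axis` argument, so the check can be made per row or per column of a 2D array. A brief sketch with illustrative names:

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))

# One answer per row: are all elements of the row below 8?
row_all_below_8 = np.all(grid < 8, axis=1)

# One answer per column: is any element of the column above 9?
col_any_above_9 = np.any(grid > 9, axis=0)

print(row_all_below_8)
print(col_any_above_9)
```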

### 3.2. Boolean operators

You can use bitwise logic operators to combine conditions.

| Operator | Equivalent ufunc |
| -------- | -----------------|
| &        | np.bitwise_and   |
| \|        | np.bitwise_or    |
| ^        | np.bitwise_xor   |
| ~        | np.bitwise_not   |

In [None]:
my_array = np.arange(20)
(my_array > 15) & (my_array % 3 == 0)

In [None]:
my_array = np.arange(20)
(my_array % 2 == 0) | (my_array % 3 == 0)

#### Using keywords (`and`, `or`, `not`) vs operators (`&`, `|`, `~`)

**Keywords** (`and`, `or`, `not`) evaluate the truth or falsity of an entire object, whereas **operators** (`&`, `|`, `~`) act on the individual bits of each object, which for Boolean arrays means element by element.
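The difference shows up even on plain integers, and it explains why the keywords fail on arrays. A minimal sketch; all variable names are illustrative.

```python
import numpy as np

# On plain integers: `and` looks at truthiness, `&` works bit by bit.
keyword_result = 42 and 59   # both are truthy, so `and` returns the last operand
bitwise_result = 42 & 59     # 0b101010 & 0b111011 == 0b101010
print(keyword_result, bitwise_result)

# On Boolean arrays: `&` combines the arrays element by element.
flags_a = np.array([True, False, True])
flags_b = np.array([True, True, False])
elementwise = flags_a & flags_b
print(elementwise)

# `and` needs a single truth value for the whole array, which is ambiguous.
try:
    flags_a and flags_b
    keyword_error = None
except ValueError as exc:
    keyword_error = str(exc)
print(keyword_error)
```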

In [None]:
my_array = np.arange(10)
(my_array > 4) & (my_array < 8)
# (my_array > 4) and (my_array < 8)  # raises ValueError: the truth value of an array is ambiguous

### 3.3. Boolean Masking

Boolean masking is a technique for filtering and manipulating arrays based on Boolean (True/False) conditions. Indexing an array with a Boolean mask lets you extract, modify, or operate on only the elements that satisfy the condition.

In [None]:
my_array = np.arange(10)
mask = my_array > 5
filtered = my_array[mask]
#filtered = my_array[my_array > 5]

print(f"my_array\n{my_array}")
print(f"mask\n{mask}")
print(f"filtered\n{filtered}")
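Reading through a mask and assigning through a mask behave differently: the first produces a copy, the second changes the original array. The sketch below illustrates this with made-up variable names.

```python
import numpy as np

data = np.arange(10)
mask = data > 5

selected = data[mask]   # Boolean indexing returns a copy
selected[:] = -1        # changing the copy leaves `data` untouched
unchanged = data.copy()

data[mask] = 0          # assignment through the mask modifies `data` in place

print(selected)
print(unchanged)
print(data)
```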

**EXERCISE**

1. Create a binary image as a NumPy array with the values shown in `img/smile1.png`.

In [None]:
# In-place modification
img = (np.random.random(size=(8, 8)) * 255).astype(np.uint8)
print(f"original img\n{img}")

img[img < 100] = 255
print(f"new img\n{img}")

In [None]:
# YOUR CODE

2. Reverse the colors of the image (black -> white, white -> black).
    - Create a mask from a copy of the image
    - Using the mask, change the 0 values of the image to 1 and the 1 values to 0.

In [None]:
# YOUR CODE

# REFERENCES

1. [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/02.06-boolean-arrays-and-masks.html)