# Part 1: NumPy Basics

In Machine Learning and Deep Learning, we use various mathematical operations. It is important to have a set of high-performance and numerically stable implementations of those operations. This is where scientific computational libraries and packages such as NumPy play an important role. Throughout the first parts of the course, we will use NumPy to build fundamental algorithms and layers in Deep Learning.

We will walk through the basics of NumPy to gain some familiarity with scientific computation libraries.

### Array Construction

NumPy represents data in multidimensional arrays. An array object has the following attributes:
- ```dtype```: Data type such as float32, float64, int32, int64, bool, etc...
- ```shape```: A tuple that contains the length of each dimension
- ```size```: Total number of elements
- ```ndim```: Number of dimensions of the array

In [13]:
from typing import Tuple, Callable
import numpy as np

scalar = np.float32(1/3)
vector = np.array([0, 1, 2.])
matrix = np.array([[0, 1, 2], [0, 2, 3]], dtype=np.float64)
tensor = np.array([[[0, 1], [0, 2]], [[0, 3.0], [0, 4]]], dtype=np.int32)
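
As a quick check, the attributes listed above can be read directly from the arrays. This is a minimal sketch; the arrays are re-created here only so that the block runs on its own.

```python
import numpy as np

scalar = np.float32(1 / 3)
vector = np.array([0, 1, 2.])
matrix = np.array([[0, 1, 2], [0, 2, 3]], dtype=np.float64)
tensor = np.array([[[0, 1], [0, 2]], [[0, 3.0], [0, 4]]], dtype=np.int32)

# Print dtype, shape, size, and ndim for each object.
for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, value.dtype, value.shape, value.size, value.ndim)
```

Note that ```vector``` becomes float64 because one of its elements is a float, and that the explicit ```dtype=np.int32``` casts the ```3.0``` in ```tensor``` to an integer.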


Although you can create an array using Python lists, there are other convenient NumPy functions to fill and create arrays of different shapes and values.


In [14]:
ones_vector = np.ones(shape=5, dtype=np.int32, order="C")
zeros_matrix = np.zeros(shape=(6, 6))
empty_tensor = np.empty(shape=(2, 2, 3), dtype=np.float32)
zeros_tensor = np.zeros_like(empty_tensor)


Similarly, you can create ranged arrays using [```np.arange```](https://numpy.org/doc/stable/reference/generated/numpy.arange.html) and [```np.linspace```](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html). Please read the documentation of these functions for further details.

> Implement ```alternative_linspace``` using ```np.arange```, without calling the ```np.linspace``` function.

In [15]:
def alternative_linspace(start: float, stop: float, num: int) -> np.ndarray:
    """ Linspace

    Args:
        start (float): starting value
        stop (float): end value
        num (int): number of elements

    Returns:
        np.ndarray: ranged array between <start> and <stop> values of size <num>
    """

#assert(np.allclose(alternative_linspace(-1, 1, 50), np.linspace(-1, 1, 50)))
#assert(np.allclose(alternative_linspace(-5, 5, 1), np.linspace(-5, 5, 1)))
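
One possible solution is sketched below; it is not the only valid answer. It assumes the endpoint is included, as in the default behavior of ```np.linspace```, and it scales an integer range instead of passing a float step to ```np.arange```, since a float step can produce an off-by-one length due to rounding.

```python
import numpy as np


def alternative_linspace(start: float, stop: float, num: int) -> np.ndarray:
    """Evenly spaced values from <start> to <stop> (inclusive), built on np.arange."""
    if num == 1:
        # A single sample is just the start value (this avoids dividing by zero).
        return np.array([start], dtype=np.float64)
    step = (stop - start) / (num - 1)
    # np.arange(num) yields 0, 1, ..., num - 1; scale and shift it into the range.
    return start + step * np.arange(num)


print(alternative_linspace(0, 1, 5))
```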


### Merge and Split

We can merge arrays using [np.concatenate](https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html) and [np.stack](https://numpy.org/doc/stable/reference/generated/numpy.stack.html). Similarly, we can divide an array into multiple sub-arrays with [np.split](https://numpy.org/doc/stable/reference/generated/numpy.split.html) and [np.array_split](https://numpy.org/doc/stable/reference/generated/numpy.array_split.html).

> Implement ```combine_feautures``` and ```data_split``` functions using the NumPy functions mentioned above.

In [16]:
def combine_feautures(first_features: np.ndarray, second_features: np.ndarray) -> np.ndarray:
    """Combine two features into a single array.

    Args:
        first_features (np.ndarray): shape: (B, D)
        second_features (np.ndarray): shape: (B, F)

    Returns:
        np.ndarray: shape: (B, D+F)
    """


#assert(np.allclose(combine_feautures(np.array([[1, 1], [2, 2]]), np.array([[3, 3], [4, 4]])), np.array([[1, 1, 3, 3], [2, 2, 4, 4]])))
    
def data_split(data: np.ndarray, train_split_ratio: float) -> Tuple[np.ndarray, np.ndarray]:
    """ Split <data> into train and test data based on the value of <train_split_ratio>.

    Args:
        data (np.ndarray): data array of shape : (B, D)
        train_split_ratio (float): Ratio of train samples to whole data

    Returns:
        Tuple[np.ndarray, np.ndarray]: Tuple of train and test samples. Shapes: (K, D), (L, D) where K/(K+L) = <train_split_ratio>
    """


#part_1, part_2 = data_split(np.concatenate([np.zeros((6, 4)), np.ones((6, 4))], axis=0), 0.5)
#assert(np.allclose(part_1, np.zeros((6, 4))))
#assert(np.allclose(part_2, np.ones((6, 4))))
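
The two exercises above could be solved along the following lines. This is a sketch under stated assumptions: features are joined along the last axis, and the number of training rows is taken to be ```B * train_split_ratio``` rounded to the nearest integer, which the exercise does not specify.

```python
from typing import Tuple

import numpy as np


def combine_feautures(first_features: np.ndarray, second_features: np.ndarray) -> np.ndarray:
    # Join (B, D) and (B, F) along the feature axis to get (B, D + F).
    return np.concatenate([first_features, second_features], axis=1)


def data_split(data: np.ndarray, train_split_ratio: float) -> Tuple[np.ndarray, np.ndarray]:
    # Assumption: the split point is rounded to the nearest whole row.
    split_index = int(round(len(data) * train_split_ratio))
    # np.split with a list of indices cuts the array at those positions.
    train, test = np.split(data, [split_index], axis=0)
    return train, test


data = np.concatenate([np.zeros((6, 4)), np.ones((6, 4))], axis=0)
part_1, part_2 = data_split(data, 0.5)
print(part_1.shape, part_2.shape)
```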


### Random Arrays

NumPy provides random array construction for a variety of distributions.
- ```np.random.rand``` Uniform random between 0 and 1
- ```np.random.randn``` Standard Normal
- ```np.random.randint``` Discrete Uniform

In [17]:
random_labels = np.random.randint(0, 2, size=(12,))
random_weights = np.random.randn(16, 16)


### Shape and Order

NumPy stores a newly created array as a contiguous block of memory. The shape of an array is only metadata that describes how this block is indexed, so it does not change how the data is laid out in memory. For such contiguous arrays, we can change the shape or add and remove dimensions of size 1 without reallocating or modifying the array memory (O(1) complexity); the result is a view of the same data.

In [18]:
array = np.arange(6)  # Shape: (6,)
matrix = array.reshape(2, 3)  # Shape: (2, 3)
column_vector = np.expand_dims(array, axis=1)  # Shape: (6, 1)
row_vector = np.expand_dims(array, axis=0)  # Shape: (1, 6)

#assert(np.allclose(array, np.squeeze(row_vector, axis=0)))
column_vector, row_vector, matrix


(array([[0],
        [1],
        [2],
        [3],
        [4],
        [5]]),
 array([[0, 1, 2, 3, 4, 5]]),
 array([[0, 1, 2],
        [3, 4, 5]]))
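
The claim that reshaping does not touch the underlying memory can be checked with ```np.shares_memory```. The sketch below also shows a case where NumPy has to copy: reshaping a transposed (non-contiguous) view.

```python
import numpy as np

array = np.arange(6)
matrix = array.reshape(2, 3)

# The reshaped matrix is a view: writing to it changes the original array.
shares_before_write = np.shares_memory(array, matrix)
matrix[0, 0] = 100
first_element = array[0]

# Flattening a transposed view cannot be expressed with strides alone,
# so NumPy silently returns a copy here.
flattened_transpose = matrix.T.reshape(-1)
shares_after_transpose = np.shares_memory(matrix, flattened_transpose)

print(shares_before_write, first_element, shares_after_transpose)
```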

By default, NumPy stores arrays in row-major order (the "C" layout, as opposed to the column-major "Fortran" layout). Operations that reorder the axes, such as ```transpose```, take constant time: NumPy returns a view with permuted strides and does not modify the array memory.

In [19]:
array = np.arange(6)
row_first_matrix = array.reshape(2, 3)  # By default C order
column_first_matrix = array.reshape(2, 3, order="F")  # F for Fortran

row_first_matrix, column_first_matrix


(array([[0, 1, 2],
        [3, 4, 5]]),
 array([[0, 2, 4],
        [1, 3, 5]]))

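As shown above, the ```order``` argument changes how the elements are arranged in the new shape. In this example, the Fortran-order result equals the transpose of ```array.reshape(3, 2)```. For matrices, ```transpose``` simply swaps the two axes. For multidimensional arrays (tensors), we can pass the new axis order to the ```transpose``` function; by default, it reverses the axes.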

In [20]:
array = np.arange(24)
tensor = array.reshape(2, 3, 4)  # Shape: (2, 3, 4)
transposed_tensor = np.transpose(tensor, (2, 1, 0))  # Shape: (4, 3, 2)

tensor, transposed_tensor


(array([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],
 
        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]]),
 array([[[ 0, 12],
         [ 4, 16],
         [ 8, 20]],
 
        [[ 1, 13],
         [ 5, 17],
         [ 9, 21]],
 
        [[ 2, 14],
         [ 6, 18],
         [10, 22]],
 
        [[ 3, 15],
         [ 7, 19],
         [11, 23]]]))
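
To see why transposing is cheap, we can inspect the strides, that is, the number of bytes to step along each axis. The sketch below expresses them in multiples of ```itemsize``` because the default integer size depends on the platform.

```python
import numpy as np

tensor = np.arange(24).reshape(2, 3, 4)
transposed_tensor = np.transpose(tensor, (2, 1, 0))

itemsize = tensor.itemsize
# C order: the last axis is the fastest-changing one.
print(tensor.strides)             # (12, 4, 1) times itemsize
# The transpose only permutes the strides; no data is moved.
print(transposed_tensor.strides)  # (1, 4, 12) times itemsize

same_memory = np.shares_memory(tensor, transposed_tensor)

# Element [k, j, i] of the transposed tensor is element [i, j, k] of the original.
swapped_element_matches = transposed_tensor[3, 2, 1] == tensor[1, 2, 3]

# Relation between Fortran-order reshape and transpose for the 2D example.
flat = np.arange(6)
fortran_equals_transpose = np.array_equal(flat.reshape(2, 3, order="F"),
                                          flat.reshape(3, 2).T)
```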

### Indexing

NumPy supports Python indexing and slicing. That is, we can use slicing like ```array[::2]``` and ```array[:-2]```. Since NumPy arrays can be multidimensional, we can index or slice each dimension separately (```array[::2, :-3, 0]```).

In [21]:
array = np.arange(4 * 3 * 2).reshape(3, 2, 4)
array[:-1, 0, ::2], array 

(array([[ 0,  2],
        [ 8, 10]]),
 array([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7]],
 
        [[ 8,  9, 10, 11],
         [12, 13, 14, 15]],
 
        [[16, 17, 18, 19],
         [20, 21, 22, 23]]]))

### Advanced Indexing

Simple slices may not be enough for some tasks. To take the values at specific indices, we can index with an integer array (```array[np.array([2, 3, 6])]```). We can also select the values that satisfy a condition by indexing with a boolean mask; for example, ```array[array > 0]``` returns all positive values. Please read the [documentation](https://docs.scipy.org/doc/numpy-1.10.0/reference/arrays.indexing.html) for further details.
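
The following sketch illustrates both forms of advanced indexing on small, made-up arrays. Unlike basic slicing, advanced indexing returns a copy, so modifying the result leaves the original array unchanged.

```python
import numpy as np

array = np.array([3, -1, 4, -1, 5, -9, 2])

# Integer-array indexing: pick the elements at positions 2, 3, and 6.
picked = array[np.array([2, 3, 6])]  # [4, -1, 2]

# Boolean-mask indexing: keep only the positive elements.
positives = array[array > 0]  # [3, 4, 5, 2]

# With one index array per axis, NumPy pairs them up element by element:
# here it reads grid[0, 1], grid[1, 2], and grid[2, 3].
grid = np.arange(12).reshape(3, 4)
paired = grid[np.array([0, 1, 2]), np.array([1, 2, 3])]  # [1, 6, 11]

# The result is a copy, not a view.
picked[0] = 0
original_unchanged = array[2] == 4
```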

In multiclass classification tasks, we often need a function that converts a label vector into a sparse matrix, called the one-hot representation, in which each row is a vector that contains a single "1" at the index specified by the corresponding label. For example:

- number of classes: 5

- label:
\begin{bmatrix}
0 & 4 & 2 & 1
\end{bmatrix}

- onehot_matrix: 
\begin{bmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 0\\
\end{bmatrix}


> Implement the ```one_hot``` function using advanced indexing. 


In [22]:
def one_hot(labels: np.ndarray, n_labels: int) -> np.ndarray:
    """ Convert labels to one-hot matrix

    Args:
        labels (np.ndarray): 1D integer vector
        n_labels (int): Number of classes/(unique labels)

    Returns:
        np.ndarray: One-hot matrix of the given <labels> vector
    """



    one_hot_matrix = np.zeros((len(labels), n_labels))
    
    one_hot_matrix[np.arange(len(labels)), labels] = 1
    
    return one_hot_matrix
assert(np.allclose(one_hot(np.array([1, 0, 2, 2]), 3),
                   np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]])))


### Broadcasting

Broadcasting is probably one of the most taken-for-granted features of NumPy. Before an element-wise operation, it automatically matches the shapes of the two arrays when possible. For example, let two arrays have the shapes (K, L) and (K, 1). Whenever an element-wise operation is applied to these two arrays, NumPy does not raise a shape-mismatch exception; instead, it broadcasts them by adding leading axes of size 1 where needed and virtually repeating an array along its axes of size 1.

In [23]:
first_array = np.ones((5, 4))
second_array = np.ones((5, 1))

auto_broadcasted_array = first_array + second_array  # Shape: (5, 4)
manual_broadcasted_array = first_array + np.repeat(second_array, 4, axis=1)  # Shape: (5, 4)

assert(np.allclose(auto_broadcasted_array, manual_broadcasted_array))


In the above example, NumPy automatically repeats ```second_array``` along the second axis before applying the addition.
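
The repetition is only virtual. As a minimal sketch, ```np.broadcast_to``` exposes the broadcast view that NumPy uses internally: the repeated axis has a stride of 0, so no data is copied. The last part shows that shapes which cannot be matched still raise an error.

```python
import numpy as np

first_array = np.ones((5, 4))
second_array = np.arange(5.0).reshape(5, 1)

# A read-only (5, 4) view of the (5, 1) array; the new axis has stride 0.
virtual_repeat = np.broadcast_to(second_array, (5, 4))
stride_along_repeated_axis = virtual_repeat.strides[1]

result = first_array + second_array  # Shape: (5, 4)

# Axes must be equal or of size 1 to be broadcast: 5 vs. 3 does not match.
try:
    np.ones((5, 4)) + np.ones((3, 1))
    shapes_compatible = True
except ValueError:
    shapes_compatible = False

print(stride_along_repeated_axis, result.shape, shapes_compatible)
```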

#### One-hot with broadcasting example

We can use broadcasting to create a one-hot matrix as shown below.

In [24]:
labels = np.random.randint(0, 3, size=6)  # Shape: (6)
range_row_vector = np.arange(3).reshape(1, 3)  # Shape: (1, 3)
label_column_vector = labels.reshape(-1, 1)  # Shape: (6, 1)
onehot_matrix = (range_row_vector == label_column_vector).astype(np.float32)  # Shape: (6, 3)

onehot_matrix, labels


(array([[0., 0., 1.],
        [0., 1., 0.],
        [0., 1., 0.],
        [1., 0., 0.],
        [0., 0., 1.],
        [0., 1., 0.]], dtype=float32),
 array([2, 1, 1, 0, 2, 1]))


#### Pitfalls of auto-broadcasting

Automatic broadcasting may lead to unexpected results. What happens if we add an array of shape (K, 1) to an array of shape (K)? The resulting array has the shape (K, K), not (K, 1) or (K).




In [25]:
first_array = np.ones((5, 1))
second_array = np.ones(5)

first_array + second_array  # Shape: (5, 5)


array([[2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.]])

In regression tasks, the prediction array of our models generally has the shape (B, 1), where "B" denotes the batch axis, while the ground truth values may have the shape (B). Combining the two element-wise then silently broadcasts to the shape (B, B), which is exactly the pitfall shown above.

> Implement ```sum_square_loss``` function. 

In [26]:
def sum_square_loss(prediction: np.ndarray, ground_truth: np.ndarray) -> np.float32:
    """ Calculate the square distance over the elements of the given arrays and return the sum of errors.

    Args:
        prediction (np.ndarray): Prediction array of shape (B, 1)
        ground_truth (np.ndarray): Array of true values which has the shape (B)

    Returns:
        np.float32: Scalar loss summed along the batch axis "B"
    """

    
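    # One possible implementation: flatten both arrays so that the shapes (B, 1)
    # and (B) line up element-wise instead of broadcasting to (B, B).
    difference = prediction.reshape(-1) - ground_truth.reshape(-1)
    return np.float32(np.sum(difference ** 2))


assert(sum_square_loss(np.ones((5, 1)), np.zeros(5)) == 5.0)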



Functions like ```np.sum```, ```np.max```, and ```np.mean``` remove the axis along which they operate. If you want to keep that axis as a dummy dimension of size 1, you can use the ```keepdims``` parameter of those functions.
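
As a small illustration with made-up data, the sketch below divides each row by its sum. With ```keepdims=True``` the reduced array has the shape (2, 1) and broadcasts against (2, 3) as intended; without it, the shape (2) cannot be broadcast against (2, 3) and NumPy raises an error.

```python
import numpy as np

data = np.array([[1., 2., 3.],
                 [4., 5., 6.]])

row_sums = data.sum(axis=1)                      # Shape: (2,)
row_sums_kept = data.sum(axis=1, keepdims=True)  # Shape: (2, 1)

# (2, 3) / (2, 1) broadcasts along the second axis, so each row sums to 1.
row_proportions = data / row_sums_kept

# (2, 3) / (2,) aligns the trailing axes (3 vs. 2) and therefore fails.
try:
    data / row_sums
    division_worked = True
except ValueError:
    division_worked = False

print(row_sums.shape, row_sums_kept.shape, division_worked)
```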


> Implement ```normalization``` function 


In [None]:
def normalization(data: np.ndarray) -> np.ndarray:
    """ Normalize the data array by subtracting its mean and dividing by its standard deviation (both computed over the batch axis)

    Args:
        data (np.ndarray): Data array of shape (B, D)

    Returns:
        np.ndarray: Normalized array of shape (B, D)
    """
    

    
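    # One possible implementation: normalize each feature (column) over the batch axis.
    mean = data.mean(axis=0, keepdims=True)  # Shape: (1, D)
    std = data.std(axis=0, keepdims=True)  # Shape: (1, D)
    normal_data = (data - mean) / std  # Broadcasts (B, D) with (1, D)
    return normal_data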

assert(np.allclose(normalization(np.arange(6, dtype=np.float32).reshape(3, 2)),
                   np.array([[-1.2247448, -1.2247448],
                             [0.,  0.],
                             [1.2247448,  1.2247448]], dtype=np.float32)))


We can use broadcasting, element-wise math (summation, multiplication, ...) and reduce operations (sum, mean, max, std, ...) to implement "slow" matrix multiplication and many other layers in Deep Learning.

> Implement ```slow_matrix_multiplication``` using the aforementioned operations. **Do not** use the matrix multiplication functions of NumPy.

In [None]:

def slow_matrix_multiplication(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """ Matrix multiplication with element-wise multiplication, sum, and reshape/expand_dims
    
    Args:
        a (np.ndarray): 2D array of shape (K, L)
        b (np.ndarray): 2D array of shape (L, T)

    Returns:
        np.ndarray: resultant matrix of shape (K, T) 
    """

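    # One possible implementation: broadcast (K, L, 1) * (1, L, T) to (K, L, T),
    # then sum over the shared axis L.
    matrix_multiplication = np.sum(np.expand_dims(a, axis=2) * np.expand_dims(b, axis=0), axis=1)
    return matrix_multiplication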

    
    

first_matrix = np.random.randn(6, 4)
second_matrix = np.random.randn(4, 8)
assert(np.allclose(slow_matrix_multiplication(first_matrix, second_matrix), first_matrix @ second_matrix))


### Einsum
Alternatively, NumPy provides a very convenient function: many of these mathematical operations can be written as a single ```np.einsum``` call. Although it is not mandatory, we highly suggest that you take a look at the [documentation](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html) of the einsum function and try to use it in upcoming homework.
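
As a brief, non-exhaustive sketch, here are a few common operations written with ```np.einsum```. Each subscript letter names an axis; letters that are missing from the output (after ```->```) are summed over.

```python
import numpy as np

first_matrix = np.random.randn(6, 4)
second_matrix = np.random.randn(4, 8)
square_matrix = np.random.randn(5, 5)

# Matrix multiplication: sum over the shared axis "l".
product = np.einsum("kl,lt->kt", first_matrix, second_matrix)

# Transpose: only the order of the output axes changes.
transposed = np.einsum("kl->lk", first_matrix)

# Trace: a repeated index selects the diagonal, and an empty output sums it.
trace = np.einsum("ii->", square_matrix)

# Row-wise dot product: multiply element-wise and sum over "l" only.
row_wise_dot = np.einsum("kl,kl->k", first_matrix, first_matrix)

print(product.shape, transposed.shape, row_wise_dot.shape)
```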

