# Working with Python and Numpy


## Introduction

**Why Python?** 

Python is a popular choice in various fields, including computer vision, due to its simplicity and readability, making it great for beginners. It also has strong community support and a wealth of libraries, which makes complex tasks more manageable. This combination of ease of use and powerful features makes Python a go-to language for both learning and implementing advanced concepts.


**Jupyter Notebook**

Jupyter Notebooks are interactive coding environments in a webpage format. Write your code between the ### START CODE HERE ### and ### END CODE HERE ### comments. To run your code, press "SHIFT"+"ENTER" or click the "Run Cell" button with a play symbol. Don't worry about the exact number of lines of code; "(≈ X lines of code)" is just a guide. 




Now, let's get started!

## 1 - Getting started with Numpy
**What is Numpy?**

Numpy is a fundamental package in Python used for scientific computing. It allows us to work with high-performance arrays and matrices, making operations that involve large amounts of data much faster and easier. Numpy is not just useful in computer vision; it's a versatile tool that's essential in various areas of data analysis, scientific research, and engineering. Its efficiency and wide range of functionalities make it a cornerstone for anyone working with data in Python.
If you want to learn more about Numpy, visit https://numpy.org/.

### 1.1 - Applying functions to arrays

We will start with the sigmoid function. $sigmoid(x) = \frac{1}{1+e^{-x}}$ is sometimes also known as the logistic function. It is a non-linear function used not only in Machine Learning (Logistic Regression), but also in Deep Learning.

<img src="images/Sigmoid.png" style="width:500px;height:228px;">


To apply the sigmoid function, we need the exponential function. Numpy lets you apply an arithmetic operation or a function to every element of an array $x = (x_1, x_2, ..., x_n)$ (row vector). Try it out yourself:

In [2]:
import numpy as np # this means you can access Numpy functions by writing np.function() instead of numpy.function()

# Create a one-dimensional Numpy array
x = np.array([1, 2, 3, 4, 5])
print("1D Array (row vector):", x)

# Basic arithmetic operations, the output is the same shape as the input
x_plus_10 = x + 10
print("Array plus 10:", x_plus_10)

1D Array (row vector): [1 2 3 4 5]
Array plus 10: [11 12 13 14 15]


Now apply the exponential function np.exp() to every element in the original array $x$.

In [3]:
x = np.array([1, 2, 3, 4, 5])

### START CODE HERE ### (≈ 1 line of code)
x = np.exp(x)
### END CODE HERE ###
print("e^x:", x)

e^x: [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]


**Expected Output**:

e^x: [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]

**Exercise**: Implement the sigmoid function using Numpy. 

**Instructions**: x could now be either a real number, a vector, or a matrix. The data structures we use in Numpy to represent these shapes (vectors, matrices...) are called Numpy arrays. You don't need to know more for now.
$$ \text{For } x \in \mathbb{R}^n \text{,     } sigmoid(x) = sigmoid\begin{pmatrix}
    x_1  \\
    x_2  \\
    ...  \\
    x_n  \\
\end{pmatrix} = \begin{pmatrix}
    \frac{1}{1+e^{-x_1}}  \\
    \frac{1}{1+e^{-x_2}}  \\
    ...  \\
    \frac{1}{1+e^{-x_n}}  \\
\end{pmatrix}\tag{1} $$

In [4]:
def sigmoid(x):
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or Numpy array of any size

    Return:
    s -- sigmoid(x)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + np.exp(-x))
    ### END CODE HERE ###

    return s

In [5]:
x = np.array([1, 2, 3])
sigmoid(x)

array([0.73105858, 0.88079708, 0.95257413])

**Expected Output**: 
array([0.73105858, 0.88079708, 0.95257413])
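
The instructions above say that x may be a real number, a vector, or a matrix. The following is a minimal sketch of that claim, assuming the same one-line sigmoid definition as in the exercise; the matrix `m` is a made-up example input.

```python
import numpy as np

def sigmoid(x):
    """Element-wise sigmoid, same definition as in the exercise."""
    return 1 / (1 + np.exp(-x))

# Scalar input: e^0 = 1, so sigmoid(0) = 1 / (1 + 1)
print(sigmoid(0.0))  # 0.5

# Matrix input (hypothetical values): the output keeps the input's shape
m = np.array([[-1.0, 0.0], [1.0, 2.0]])
s = sigmoid(m)
print(s.shape)  # (2, 2)
```

The same function body handles all three cases because `np.exp` and the arithmetic operators work element by element.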

### 1.2 - Sigmoid gradient

As you've seen in lecture, you will need to compute gradients to optimize loss functions using backpropagation. Let's code your first gradient function.

**Exercise**: Implement the function `sigmoid_derivative()` to compute the gradient of the sigmoid function with respect to its input x. Use the formula from the lecture (it is very simple): $sigmoid'(x) = sigmoid(x)(1 - sigmoid(x))$.

In [6]:
# GRADED FUNCTION: sigmoid_derivative

def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or Numpy array

    Return:
    ds -- Your computed gradient.
    """

    ### START CODE HERE ### (≈ 2 lines of code)
    s = sigmoid(x)
    ds = s * (1 - s)
    ### END CODE HERE ###

    return ds

In [7]:
x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))

sigmoid_derivative(x) = [0.19661193 0.10499359 0.04517666]


**Expected Output**: 

[0.19661193 0.10499359 0.04517666]
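
One way to gain confidence in a gradient function is to compare it with a numerical estimate. The sketch below is not part of the graded exercise; it assumes a central finite difference with a hypothetical step size `eps` and checks that it agrees with the analytic formula.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([1.0, 2.0, 3.0])
eps = 1e-6  # hypothetical step size for the finite difference

# Central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative(x)

print(np.allclose(numeric, analytic))  # True
```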



### 1.3 - Reshaping arrays

Two common Numpy functions used in deep learning are [np.shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html) and [np.reshape()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html).
- X.shape is used to get the shape (dimension) of a matrix/vector X. 
- X.reshape(...) is used to reshape X into some other dimension. 

For example, in computer science, an image is represented by a 3D array of shape $(length, height, depth = 3)$. However, when you read an image as the input of an algorithm, you convert it to a vector of shape $(length*height*3, 1)$. In other words, you "unroll", or reshape, the 3D array into a single column vector.

<img src="images/image2vector_kiank.png" style="width:500px;height:300px;">
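
The unrolling described above can be sketched on a tiny made-up array. The array contents and the use of `-1` (which asks Numpy to infer that dimension) are illustrative choices, not the required solution to the exercise.

```python
import numpy as np

# Hypothetical tiny "image": 2 x 3 pixels with 3 channels, filled with 0..17
image = np.arange(18).reshape(2, 3, 3)
print(image.shape)  # (2, 3, 3)

# Unroll into a column vector; -1 lets Numpy infer the length (2*3*3 = 18)
v = image.reshape(-1, 1)
print(v.shape)  # (18, 1)

# reshape keeps the elements in row-major order, it only changes the shape
print(v[:4, 0])  # [0 1 2 3]
```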

**Exercise**: Implement `image2vector()` that takes an input of shape (length, height, 3) and returns a vector of shape (length\*height\*3, 1). Use the reshape method `ndarray.reshape(shape)`.

- Please don't hardcode the dimensions of image as a constant. Instead look up the quantities you need with `image.shape[0]`, etc. 

In [8]:
def image2vector(image):
    """
    Argument:
    image -- a Numpy array of shape (length, height, depth)

    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    v = np.reshape(image, (image.shape[0] * image.shape[1] * image.shape[2], 1))
    ### END CODE HERE ###

    return v

In [9]:
# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y,3) where 3 represents the RGB values
image = np.array(
         [[[ 0.67826139,  0.29380381], [ 0.90714982,  0.52835647], [ 0.4215251 ,  0.45017551]],
         [[ 0.92814219,  0.96677647], [ 0.85304703,  0.52351845], [ 0.19981397,  0.27417313]],
         [[ 0.60659855,  0.00533165], [ 0.10820313,  0.49978937], [ 0.34144279,  0.94630077]]])

print(image2vector(image))

[[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]


**Expected Output**: 

[[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]

### 1.4 Centering Input Examples Around Feature Mean
In machine learning and deep learning, normalizing data is a common practice, often leading to better performance as it aids in faster convergence of gradient descent algorithms. Normalization in this context refers to adjusting data such that the mean of each feature across the dataset is zero. This is accomplished by subtracting the mean of each feature from the corresponding values.

Consider the following example, where $x$ is a dataset:
$$x = 
\begin{bmatrix}
    1 & 5 & 9 \\
    4 & 7 & 10 \\
    3 & 2 & 6 \\
\end{bmatrix}\tag{3}$$
In this matrix, the dimensions (n, m) represent three features (n) and three training examples (m). To center the data around the feature mean, we calculate the mean of each row (feature) using Numpy's mean function with axis=1 and keepdims=True:
$$\bar{x} = x_{mean} = np.mean(x, axis = 1, keepdims=True) = \begin{bmatrix}
    5 \\
    7 \\
    3.6667 \\
\end{bmatrix}\tag{4}$$
Then, we shift the original matrix x by this mean:
$$ x_{shifted} = x - x_{mean} = \begin{bmatrix}
   -4 &  0 &  4 \\
   -3 & 0 & 3\\
 -0.6667  & -1.6667  & 2.3333 \\
\end{bmatrix}\tag{5}$$ 

Notice how we can subtract matrices of different sizes, thanks to broadcasting in Numpy.
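
To see why `keepdims=True` matters for this broadcast, here is a small sketch using the example matrix from above. The variable names are illustrative only.

```python
import numpy as np

x = np.array([[1, 5, 9], [4, 7, 10], [3, 2, 6]])

col_mean = np.mean(x, axis=1, keepdims=True)  # shape (3, 1)
flat_mean = np.mean(x, axis=1)                # shape (3,)
print(col_mean.shape, flat_mean.shape)

# (3, 3) - (3, 1): the column is stretched across the columns,
# so row i has the mean of row i subtracted (what we want here)
by_row = x - col_mean

# (3, 3) - (3,): the vector is aligned with the last axis,
# so column j has the mean of row j subtracted (not what we want)
by_col = x - flat_mean

print(by_row[0])  # [-4.  0.  4.]
print(by_col[0])
```

Both subtractions run without an error, which is why it is worth checking shapes explicitly when relying on broadcasting.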

**Exercise**: Implement the `shiftRowsAroundMean()` function to normalize the data by centering each feature (row) of a matrix around its mean.

In [10]:
def shiftRowsAroundMean(x):
    """
    Normalize the data by shifting each row of the matrix x around the mean of that row.

    Arguments:
    x -- A Numpy matrix of shape (n, m), where n is the number of features, and m is the number of examples

    Returns:
    x_shifted -- The matrix with normalized data.
    """

    ### START CODE HERE ### (≈ 2 lines of code)
    x_shifted = x - np.mean(x, axis=1, keepdims=True)
    ### END CODE HERE ###

    return x_shifted

In [11]:
# Test the function with an example matrix
example_matrix = np.array([[1, 5, 9], [4, 7, 10], [3, 2, 6]])
shifted_matrix = shiftRowsAroundMean(example_matrix)
print("Shifted Matrix:\n", shifted_matrix)

Shifted Matrix:
 [[-4.          0.          4.        ]
 [-3.          0.          3.        ]
 [-0.66666667 -1.66666667  2.33333333]]


**Expected Output**: 


 [[-4.          0.          4.        ]
 [-3.          0.          3.        ]
 [-0.66666667 -1.66666667  2.33333333]]


<font color='red'>

**What you need to remember:**

* The sigmoid function and its derivative
* How to use Numpy to efficiently manipulate arrays
* Data normalization and array transformations, which are frequently used in machine learning

</font>


## 2 - Vectorization
### 2.1 Definition and importance
Vectorization is a pivotal concept in the field of data science and machine learning, particularly when it comes to optimizing computational efficiency. At its core, vectorization is about expressing operations as occurring on entire arrays or matrices, rather than through explicit iteration over individual elements.

In the context of programming, especially with languages like Python, vectorization refers to the practice of using operations that act on whole arrays or datasets at once, rather than using slower, explicit loops. This not only results in **cleaner, more readable code** but also moves the looping into Numpy's precompiled, optimized routines, leading to **significant speed improvements**.

For instance, in Python, libraries such as Numpy are built with vectorization in mind, allowing for operations on entire arrays with a single line of code. 

Let's take an example where we will sum the elements of a large array. We'll compare a traditional loop-based approach with a vectorized approach using Numpy, and measure the execution time for both. To create a large array, we will use `np.random.rand`, which generates an array of random floating-point numbers uniformly distributed over the interval [0, 1).

In [12]:
import time

# Creating a large array
large_array = np.random.rand(1000000)

# Start measuring time
start_time = time.time()

# Summing using a loop
sum_loop = 0
for i in large_array:
    sum_loop += i

# End measuring time
end_time = time.time()
time_loop = end_time - start_time

print(f"Sum using loop: {sum_loop:.3f}")
print(f"Time taken by loop: {time_loop:.5f} seconds")


Sum using loop: 499791.979
Time taken by loop: 0.14855 seconds


In [13]:
# Start measuring time
start_time = time.time()

# Summing using Numpy's sum function
sum_vectorized = np.sum(large_array)

# End measuring time
end_time = time.time()
time_vectorized = end_time - start_time

print(f"Sum using np.sum: {sum_vectorized:.3f}")
print(f"Time taken by np.sum: {time_vectorized:.5f} seconds")

print(f"Speed-up: ~{time_loop/time_vectorized:.0f}x")

Sum using np.sum: 499791.979
Time taken by np.sum: 0.00200 seconds
Speed-up: ~74x
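
A single `time.time()` measurement can be noisy. As a hedged alternative sketch, the standard-library `timeit` module can repeat each measurement and keep the best run. The seeded generator, the smaller array size, and the helper name `loop_sum` are assumptions made for this illustration.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (assumption, for reproducibility)
data = rng.random(100_000)      # smaller array so the loop version stays quick

def loop_sum(values):
    """Sum the values with an explicit Python loop."""
    total = 0.0
    for v in values:
        total += v
    return total

# Run each version 5 times and keep the fastest run
t_loop = min(timeit.repeat(lambda: loop_sum(data), number=1, repeat=5))
t_vec = min(timeit.repeat(lambda: np.sum(data), number=1, repeat=5))

print(f"loop: {t_loop:.5f} s, np.sum: {t_vec:.5f} s")
print("same result:", np.isclose(loop_sum(data), np.sum(data)))
```

The exact timings depend on the machine, so no particular speed-up factor should be expected.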


### 2.2 Implementing L2 Loss Function
**Reminder**: The L2 loss function, also known as the quadratic loss, is commonly used in regression problems. It averages the squared differences between the actual and predicted values; with the $\frac{1}{2m}$ scaling used here, it equals half the mean squared error. The formula for L2 loss is:
$$\begin{align*} & L_2(\hat{y},y) = \frac{1}{2 m}\sum_{i=1}^m(y^{(i)} - \hat{y}^{(i)})^2 \end{align*}\tag{7}$$
where $y$ is the vector of true values, $\hat{y}$ is the vector of predicted values, and $m$ is the number of examples.
The L2 loss is a method to quantify the difference between the predicted and true values. A lower L2 loss value indicates better model performance.

**Exercise**: Implement the `calculate_l2_loss` function to compute the L2 loss for given true values and predictions. You should use `np.dot` for this task.

**Note**:
* The `np.dot` function computes the dot product of two arrays. For 1-D arrays, it computes the inner product, and for 2-D arrays, it's equivalent to matrix multiplication. In the context of the L2 loss function, np.dot is used to calculate the sum of the squared differences. \
Example: np.dot(a, b) where a and b are 1-D arrays.

* In Numpy, the `*` operator is used for element-wise multiplication of arrays. If you multiply two arrays of the same shape using `*`, each pair of elements at corresponding positions in the arrays is multiplied together. \
Example: `a * b` multiplies each element of array a with the corresponding element of array b.

* `np.multiply` is another way to perform element-wise multiplication. It's functionally identical to using the `*` operator but is explicitly named, which can be more readable. \
Example: np.multiply(a, b) is equivalent to a * b.
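
The three operations in the notes above can be compared on two small made-up vectors. This is only an illustration of the difference between the dot product and element-wise multiplication.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(np.dot(a, b))       # 32.0, a single number: 1*4 + 2*5 + 3*6
print(a * b)              # [ 4. 10. 18.], same shape as a and b
print(np.multiply(a, b))  # [ 4. 10. 18.], identical to a * b

# The dot product of two 1-D arrays is the sum of the element-wise products
print(np.dot(a, b) == np.sum(a * b))  # True
```

In particular, `np.dot(d, d)` for a 1-D array `d` is the sum of its squared entries, which is the sum that appears in the L2 loss.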

In [26]:
def calculate_l2_loss(y_true, y_pred):
    """
    Calculates the L2 loss between y_true and y_pred.

    Arguments:
    - y_true: Numpy array of true values
    - y_pred: Numpy array of predicted values

    Returns:
    - l2_loss: Calculated L2 loss as a float
    """
    ### START CODE HERE ### (≈ 2-3 lines of code)
    diff = y_pred - y_true
    l2_loss = np.multiply(1/(2*y_true.shape[0]), np.dot(diff, diff))

    # or, without np.dot:
    # l2_loss = np.sum(np.square(y_pred - y_true)) / (2 * y_true.shape[0])
    ### END CODE HERE ###

    return l2_loss

In [27]:
# Test your function
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])

l2_loss = calculate_l2_loss(y_true, y_pred)
print("L2 Loss:", l2_loss)


L2 Loss: 0.010000000000000018


**Expected Output**: 
<table style="width:20%">
     <tr> 
       <td> **L2** </td> 
       <td> 0.01 </td> 
     </tr>
</table>
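
As a cross-check, the same loss can be computed in a second way. The sketch below assumes the test vectors from above and compares the `np.dot` form with half the mean of the squared differences.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])

diff = y_pred - y_true
m = y_true.shape[0]  # number of examples

# Dot-product form: sum of squared differences divided by 2m
loss_dot = np.dot(diff, diff) / (2 * m)

# Equivalent form: half of the mean squared difference
loss_mean = 0.5 * np.mean(np.square(diff))

print(np.isclose(loss_dot, loss_mean))  # True
print(round(float(loss_dot), 6))        # 0.01
```

The trailing digits in the raw result (for example 0.010000000000000018) come from floating-point rounding, which is why the comparison uses `np.isclose` rather than `==`.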

Congratulations on completing this assignment. We hope that this little warm-up exercise helps you in future assignments, which will be more exciting and interesting!

<font color='red'>

What you need to remember about Vectorization:

* Vectorization is key for speed and efficiency in Python, particularly with Numpy.
* It allows for element-wise operations on arrays without explicit loops.
* Utilize Numpy's vectorized functions like np.sum, np.dot for performance gains.

</font>

Resources used for this exercise:
- Respective problem sets in the Deep Learning Specialization from deeplearning.ai (https://www.coursera.org/specializations/deep-learning).
- DHBW Student input from Gregori Daiger and Lenz Blattner
