# ReLU Activation Function in Machine Learning

The Rectified Linear Unit (ReLU) is a widely used activation function in neural networks, especially in deep learning. Simply put, it's a function that outputs the input directly if the input is positive, but outputs zero if the input is zero or negative. Mathematically, it's defined as:

$$
  f(x) = \max(0, x)
$$

This creates a simple "switch" where only positive information is passed through, while negative information is effectively turned off. Here is how the function looks:

<p align="center">
    <img src="./images/relu.png" height="300px" alt="Graph showing the shape of the ReLU activation function" />
</p>

ReLU's relevance in machine learning stems from its ability to introduce non-linearity into neural networks, which is crucial for learning complex patterns in data. Without non-linear activation functions, a deep neural network would essentially behave like a single linear model, severely limiting its capacity to understand intricate relationships. Furthermore, ReLU is computationally very efficient compared to older activation functions like sigmoid or tanh, as it only involves a simple comparison operation, leading to faster training times for large networks.
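
The claim that stacked linear layers behave like a single linear model can be checked by hand. The following is a minimal sketch, assuming scalar "layers" of the form `w * x + b`; the `linear` helper and the weight values are made up for illustration and are not part of the notebook.

```python
def linear(w, b):
    """Return a scalar linear layer x -> w * x + b (hypothetical helper)."""
    return lambda x: w * x + b


# Two linear layers with arbitrary, illustrative parameters.
layer1 = linear(2.0, 1.0)
layer2 = linear(-3.0, 0.5)


def stacked(x):
    # No activation function between the layers.
    return layer2(layer1(x))


# Algebraically: w2 * (w1 * x + b1) + b2 = (w2 * w1) * x + (w2 * b1 + b2)
# which here gives w = -6.0 and b = -2.5.
collapsed = linear(-6.0, -2.5)

for x in (-2.0, 0.0, 1.5):
    # The two-layer stack and the single layer agree on every input.
    assert stacked(x) == collapsed(x)
```

However many purely linear layers are stacked, the same algebra reduces them to one. A non-linear activation between the layers is what prevents this reduction.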

One of the most significant advantages of ReLU is its role in mitigating the "vanishing gradient problem." In deep networks using sigmoid or tanh, gradients can become extremely small as they are propagated backward through many layers, effectively stopping earlier layers from learning. ReLU's gradient is a constant 1 for positive inputs, which allows a more stable and effective flow of gradients, enabling deeper networks to be trained successfully and achieve better performance. <sup>[1](#source)</sup>
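
To get a feel for the numbers, here is a rough sketch that multiplies the local derivative of the activation across 20 layers. It is a deliberate simplification: it ignores the weights entirely and assumes every layer sees the same pre-activation value. The function names and the depth of 20 are illustrative assumptions.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid function at x."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU at x (taken as 0 at x = 0, a common convention)."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(local_grad, pre_activation, depth):
    """Product of the same local derivative repeated over `depth` layers."""
    product = 1.0
    for _ in range(depth):
        product *= local_grad(pre_activation)
    return product


depth = 20

# Best case for sigmoid: its derivative peaks at x = 0 with a value of 0.25,
# so the product is 0.25 ** 20, which is below 1e-11.
sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth)

# Any positive input gives ReLU a derivative of exactly 1, so the product stays 1.0.
relu_chain = chained_gradient(relu_grad, 1.0, depth)
```

Even in the most favorable case for sigmoid the product shrinks by a factor of four per layer, while the ReLU product is unchanged for positive inputs.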

## Exercise

In the following exercise you will implement the ReLU activation function. You can use the mathematical definition described above or a conditional that produces the same result. Once you are done, run the unit tests in the cells below to validate your solution. Have fun!


<a name="source">1</a>: Description generated by Gemini 2.5 Flash

In [None]:
def relu(x):
    # Remember you can leverage Python's built-in max function to implement ReLU,
    # or use an if statement.
    # The goal is to return 0 when the input x is negative or zero, and x itself otherwise.

    return 5 # This will fail... unless x is equal to 5!

In [None]:
# This is an alternative version guiding the student away from using the max function.
# It might help the student to understand the purpose of ReLU better than simply implementing the math definition.

def alternative_relu(x):
    # Replace the return values according to the definition of ReLU.
    # Remember that the goal is to return 0 when the input x is negative or zero and the value of x in any other case.
    if x > 0:
        return 10

    return 5

## Test cases

For the test cases I use pytest together with the ipytest library, which supports running the tests inside the notebook.

The tests cover examples from both ranges: inputs where the function must return 0 and inputs where it must return the value of x. I also test the value 0 explicitly, as it is the main edge case of the function.

In [None]:
# Install the required modules to execute pytest inside the notebook
# The following line is not needed if the notebook is executed using the included Dockerfile
# %pip install pytest ipytest

In [None]:
import ipytest

ipytest.autoconfig()
verbosity = "-vv"

def test_relu_should_return_0_for_negative_values():
    message = "The result of the ReLU function for any negative value should be zero (0)"
    assert solution(-0.001) == 0, message
    assert solution(-0.1) == 0, message
    assert solution(-5) == 0, message
    assert solution(-20) == 0, message

def test_relu_should_return_0_for_zero_value():
    assert solution(0) == 0, "The result of the ReLU function for zero (0) value should be zero (0)"

def test_relu_should_return_x_for_positive_values():
    message = "The result of the ReLU function for any positive value should be the same value"
    assert solution(0.001) == 0.001, message
    assert solution(0.5) == 0.5, message
    assert solution(1) == 1, message
    assert solution(15) == 15, message
    assert solution(15.001) == 15.001, message

solution = relu

ipytest.run(verbosity)

## Solutions

Below you can find two valid solutions, one using the max function and one with an if statement. Both versions pass the proposed unit tests.

In [None]:
# Solution using the max function

def relu_solution_1(x):
    # max(0, x) returns x when x is positive and 0 when x is zero or negative,
    # which is exactly the definition of ReLU.

    return max(0, x)

solution = relu_solution_1

ipytest.run(verbosity)

In [None]:
# Solution with an if

def relu_solution_2(x):
    # Positive inputs pass through unchanged; zero and negative inputs
    # fall through to the final return and produce 0.

    if x > 0:
        return x
    return 0

solution = relu_solution_2

ipytest.run(verbosity)
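
Both solutions work on a single number, whereas in a network ReLU is applied to every value in a layer's output. As a small sketch of that idea in plain Python, the helper below applies the same rule to each element of a list. The name `relu_elementwise` is hypothetical and not part of the exercise or its tests.

```python
def relu_elementwise(values):
    """Apply ReLU to every number in a list (hypothetical helper)."""
    return [max(0, v) for v in values]


# Negative entries and zero become 0, positive entries are kept as they are.
activations = relu_elementwise([-2.0, -0.5, 0, 0.5, 3])
```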

### LLM usage

While writing this notebook I used the Gemini 2.5 Flash model to generate the description in the introduction and the code that renders the graph of the ReLU function. Here are the prompts:

1. Write a simple explanation of the ReLU activation function and its relevance for machine learning in less than 3 paragraphs (formatted reply in the first cell)
2. Generate a piece of python code that will render a graph of the ReLU activation function in a jupyter notebook

Here is the response to the second prompt:

```python
import numpy as np
import matplotlib.pyplot as plt

def relu(x):
    """
    The ReLU activation function.
    """
    return np.maximum(0, x)

# Generate a range of x values
x = np.linspace(-5, 5, 100) # From -5 to 5 with 100 points

# Calculate the corresponding y values using the ReLU function
y = relu(x)

# Plot the ReLU function
plt.figure(figsize=(8, 6)) # Set the figure size
plt.plot(x, y, label='ReLU(x) = max(0, x)', color='blue')

# Add title and labels
plt.title('ReLU Activation Function')
plt.xlabel('x')
plt.ylabel('ReLU(x)')

# Add a grid for better readability
plt.grid(True)

# Add a legend
plt.legend()

# Highlight the axes at y=0 and x=0
plt.axhline(0, color='black', linewidth=0.5)
plt.axvline(0, color='black', linewidth=0.5)

# Show the plot
plt.show()
```