# ReLU Activation Function

Welcome to this hands-on exercise! 🎉  
In this notebook, you’ll implement one of the most fundamental components of neural networks: the **Rectified Linear Unit (ReLU)** activation function.  

## What are Activation Functions?

An activation function transforms a neuron's weighted input into the output that gets passed to the next layer.  
Without activation functions, any stack of layers collapses into a single linear transformation, so the network could only capture **linear** relationships, which isn't enough for modeling real-world complexity.  

Think of it this way:  
👉 Going from **0 → 1 child** is likely to change a customer's banking behavior far more than going from **3 → 4**. The effect of one extra child is not constant, and that's non-linearity in action!  
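
To make the "only linear" claim concrete, here is a minimal sketch with assumed toy weights (`W1`, `W2`, and `x` are random placeholders, not part of the exercise). It shows that two layers without an activation behave exactly like one layer with the combined weight matrix. `np.tanh` stands in as the non-linearity so the exercise below is not given away.

```python
import numpy as np

rng = np.random.default_rng(0)   # local generator, illustrative values only
W1 = rng.normal(size=(4, 3))     # hypothetical first-layer weights
W2 = rng.normal(size=(2, 4))     # hypothetical second-layer weights
x = rng.normal(size=3)           # hypothetical input vector

# Two linear layers applied one after the other...
two_layers = W2 @ (W1 @ x)
# ...equal a single linear layer whose weights are W2 @ W1.
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True

# With a non-linearity between the layers, the collapse no longer holds.
with_activation = W2 @ np.tanh(W1 @ x)
print(np.allclose(with_activation, one_layer))
```

However many linear layers you stack, the first comparison keeps holding, which is why an activation function between layers is needed.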

---

### What is ReLU?

The **ReLU** activation function is defined as:

$$f(x) = \max(0, x)$$  

- If input > 0 → output = input  
- If input ≤ 0 → output = 0  

It’s simple, fast, and extremely common in deep learning.  
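
As a quick illustration of the two cases above, here is a plain-Python sketch for a single number. The name `relu_scalar` is a hypothetical helper used only for this illustration; it is not the vectorized NumPy function you will write in the exercise.

```python
def relu_scalar(value: float) -> float:
    """Piecewise ReLU for one number (illustrative helper, not the exercise solution)."""
    # Positive inputs pass through unchanged; everything else becomes 0.
    return value if value > 0 else 0.0


for value in (-2.0, 0.0, 3.5):
    print(value, "->", relu_scalar(value))
# -2.0 -> 0.0
# 0.0 -> 0.0
# 3.5 -> 3.5
```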


### Your Task

Implement the `relu` function so that it accepts either a scalar or a NumPy array (or an array-like such as a list) and returns the element-wise ReLU of the input.

### Requirements

- Your function should handle both positive and negative numbers correctly
- It should work with numpy arrays of any shape
- The function should be vectorized (no loops needed!)
- Make sure to handle edge cases like zero values
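
If "vectorized" is a new term, the sketch below may help. It deliberately uses a different operation (absolute value) so the exercise is not spoiled, and the array `values` is an assumed example. It compares an explicit element-by-element loop with a single NumPy call that handles the whole array, whatever its shape.

```python
import numpy as np

values = np.array([[-1.5, 2.0], [0.0, -3.0]])  # illustrative input

# Loop version: visit every element explicitly.
looped = np.empty_like(values)
for index, value in np.ndenumerate(values):
    looped[index] = abs(value)

# Vectorized version: one call operates on the entire array at once.
vectorized = np.abs(values)

print(np.array_equal(looped, vectorized))  # True
```

Your `relu` should follow the second pattern: one NumPy call, no Python loop.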

Let’s get started by setting up our environment 👇

In [None]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt

# Set up plotting
%matplotlib inline
plt.style.use('default')
np.random.seed(42)  # For reproducible results

## Exercise: Implement the ReLU Function

Complete the function below. You can use numpy functions to make your implementation efficient and vectorized.

In [None]:
def relu(x):
    """
    Implement the Rectified Linear Unit (ReLU) activation function.

    Args:
        x: Input array or scalar (numpy array or scalar)

    Returns:
        numpy array: ReLU transformation of the input

    Examples:
        >>> print(relu(3))
        3
        >>> print(relu(-2))
        0
        >>> relu([1, -1, 0, 2])
        array([1, 0, 0, 2])
    """
    # Convert input to numpy array if it's not already
    x = np.array(x)

    # TODO: Implement the ReLU function
    # Hint: You can use np.maximum() or np.where()

    # Your code goes here: replace None with your implementation
    result = None

    return result

## Test Your Implementation

Let's test your function with some simple examples first:

In [None]:
# Test with simple values
print("Testing with simple values:")
print(f"relu(5) = {relu(5)}")
print(f"relu(-3) = {relu(-3)}")
print(f"relu(0) = {relu(0)}")

# Test with arrays
test_array = np.array([1, -1, 0, 2, -5, 3])
print(f"\nInput array: {test_array}")
print(f"ReLU output: {relu(test_array)}")

# Test with 2D array
test_2d = np.array([[-1, 2], [0, -3]])
print(f"\nInput 2D array:\n{test_2d}")
print(f"ReLU output:\n{relu(test_2d)}")

## Unit Tests

Great! Now that we've confirmed the basics, let's run a more rigorous set of unit tests to gain confidence that the implementation is solid.  

In [None]:
def run_unit_tests():
    """Run comprehensive unit tests for the relu function"""

    print("🧪 Running Unit Tests for ReLU Function\n")

    # Test 1: Basic functionality
    print("Test 1: Basic functionality")
    try:
        assert relu(5) == 5, f"Expected 5, got {relu(5)}"
        assert relu(-3) == 0, f"Expected 0, got {relu(-3)}"
        assert relu(0) == 0, f"Expected 0, got {relu(0)}"
        print("✅ Passed: Basic functionality works correctly")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    # Test 2: Array handling
    print("\nTest 2: Array handling")
    try:
        test_array = np.array([1, -1, 0, 2, -5, 3])
        expected = np.array([1, 0, 0, 2, 0, 3])
        result = relu(test_array)
        assert np.array_equal(result, expected), f"Expected {expected}, got {result}"
        print("✅ Passed: Array handling works correctly")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    # Test 3: 2D array handling
    print("\nTest 3: 2D array handling")
    try:
        test_2d = np.array([[-1, 2], [0, -3]])
        expected_2d = np.array([[0, 2], [0, 0]])
        result_2d = relu(test_2d)
        assert np.array_equal(result_2d, expected_2d), f"Expected {expected_2d}, got {result_2d}"
        print("✅ Passed: 2D array handling works correctly")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    # Test 4: Edge cases
    print("\nTest 4: Edge cases")
    try:
        # Test with very large numbers
        assert relu(1e10) == 1e10, f"Expected 1e10, got {relu(1e10)}"
        assert relu(-1e10) == 0, f"Expected 0, got {relu(-1e10)}"

        # Test with very small numbers
        assert relu(1e-10) == 1e-10, f"Expected 1e-10, got {relu(1e-10)}"
        assert relu(-1e-10) == 0, f"Expected 0, got {relu(-1e-10)}"

        # Test with empty array
        empty_result = relu(np.array([]))
        assert empty_result.size == 0, f"Expected empty array, got {empty_result}"

        print("✅ Passed: Edge cases handled correctly")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    # Test 5: Data type preservation
    print("\nTest 5: Data type preservation")
    try:
        # Test with float array
        float_array = np.array([1.5, -2.7, 0.0])
        float_result = relu(float_array)
        assert float_result.dtype == float_array.dtype, f"Expected dtype {float_array.dtype}, got {float_result.dtype}"

        # Test with int array
        int_array = np.array([1, -2, 0])
        int_result = relu(int_array)
        assert int_result.dtype == int_array.dtype, f"Expected dtype {int_array.dtype}, got {int_result.dtype}"

        print("✅ Passed: Data types preserved correctly")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    # Test 6: Mathematical properties
    print("\nTest 6: Mathematical properties")
    try:
        # Test that ReLU is idempotent (applying it twice gives the same result)
        test_values = np.array([-2, -1, 0, 1, 2])
        first_apply = relu(test_values)
        second_apply = relu(first_apply)
        assert np.array_equal(first_apply, second_apply), "ReLU should be idempotent"

        # Test that ReLU preserves non-negative values
        non_neg = np.array([0, 1, 2, 3])
        non_neg_result = relu(non_neg)
        assert np.array_equal(non_neg, non_neg_result), "ReLU should preserve non-negative values"

        print("✅ Passed: Mathematical properties verified")
    except AssertionError as e:
        print(f"❌ Failed: {e}")
        return False

    print("\n🎉 All tests passed! Your ReLU implementation is correct!")
    return True

# Run the tests
run_unit_tests()

## Visualize the ReLU Function

Let's create a visualization to see how your ReLU function behaves:

In [None]:
# Create a range of values to plot
x_values = np.linspace(-5, 5, 1000)
y_values = relu(x_values)

# Create the plot
plt.figure(figsize=(10, 6))
plt.plot(x_values, y_values, 'b-', linewidth=2, label='ReLU(x)')
plt.plot(x_values, x_values, 'r--', alpha=0.5, label='y = x')
plt.plot(x_values, np.zeros_like(x_values), 'g--', alpha=0.5, label='y = 0')

plt.grid(True, alpha=0.3)
plt.xlabel('x')
plt.ylabel('ReLU(x)')
plt.title('ReLU Activation Function')
plt.legend()
plt.xlim(-5, 5)
plt.ylim(-1, 5)
plt.show()

print("If your implementation is correct, you should see:")
print("- A line that follows y=x for positive values (blue line)")
print("- A line that stays at y=0 for negative values (blue line)")
print("- The function should be continuous at x=0")

🎉 **Great job!**  
You’ve not only implemented ReLU in NumPy, but you’re now ready to bring that knowledge into **PyTorch**.  
This is a big step forward: from coding things manually to using the same tools that power real-world deep learning models.  


## Next Steps with PyTorch 🐍⚡

With the NumPy version working, the next step is to use ReLU in **PyTorch**, where it appears inside real neural networks:

- **Use `nn.ReLU`**: practice applying ReLU as a standalone activation layer.  

🎯 This bridges the gap between theory (NumPy) and practice (PyTorch): the activation function you coded by hand is the same building block that deep learning models use.

---

## ReLU Function with PyTorch

PyTorch provides ReLU in two flavors:
- **Module API:** `nn.ReLU(inplace=False)` — use it like a layer (e.g., inside `nn.Sequential`).  
- **Functional API:** `torch.nn.functional.relu(x, inplace=False)` — call it directly on tensors in `forward`.  

[`nn.ReLU` documentation](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)
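
For comparison, here is a minimal sketch of the **Functional API**, assuming PyTorch is installed. The names `sample` and `out_fn` are illustrative only. With the default `inplace=False`, the call returns a new tensor and leaves the input untouched.

```python
import torch
import torch.nn.functional as F

sample = torch.tensor([[-2.0, -0.5, 0.0, 1.0, 3.0]])  # illustrative input

# Functional API: call relu directly on the tensor, no module object needed.
out_fn = F.relu(sample)

print("Output (F.relu):", out_fn)
print("Input unchanged:", sample)
```

The functional form is handy inside a custom `forward` method, while the module form fits naturally into containers such as `nn.Sequential`.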

👉 In this exercise, we’ll start with the **Module API**.  
**Task:** Apply `nn.ReLU()` to a sample tensor.  


In [None]:
import torch
import torch.nn as nn

# === Part A: Module API ===
# TODO 1: Create a ReLU module (do NOT use inplace for now)
relu_layer = None  # your code goes here: replace None

# Example input tensor
x = torch.tensor([[-2.0, -0.5, 0.0, 1.0, 3.0]])
print("Input:", x)

# TODO 2: Apply ReLU to x using the module
out_mod = None  # your code goes here: replace None
print("Output (nn.ReLU):", out_mod)


## Unit Tests for PyTorch ReLU

In [None]:
# 1) Correct output on the sample tensor
expected = torch.tensor([[0., 0., 0., 1., 3.]])
assert torch.equal(out_mod, expected), "❌ ReLU output is incorrect on the sample input."

# 2) Non-negativity check
assert torch.all(out_mod >= 0), "❌ ReLU produced negative values."

# 3) Shape preservation
assert out_mod.shape == x.shape, "❌ ReLU changed the shape of the tensor."

# 4) Random tensor test (reproducible)
torch.manual_seed(0)
rand_x = torch.randn(5, 5)
rand_out = relu_layer(rand_x)
assert torch.all(rand_out >= 0), "❌ ReLU produced negatives on random input."
assert rand_out.shape == rand_x.shape, "❌ Output shape mismatch on random test."

print("✅ All ReLU PyTorch tests passed!")


🎉 Fantastic work!  
You’ve now implemented and validated ReLU both in **NumPy** and **PyTorch**.  
This means you’re not only coding activation functions from scratch, but also using them inside one of the most popular deep learning frameworks today. 🚀


---

## SOLUTION (NOT FOR LEARNERS)

**Note: This section contains the complete solution and should not be shown to learners.**

Here's the complete implementation of the ReLU function:

In [None]:
# SOLUTION: Complete ReLU implementation
def relu_solution(x):
    """
    Complete solution for the ReLU activation function.

    Args:
        x: Input array or scalar (numpy array or scalar)

    Returns:
        numpy array: ReLU transformation of the input
    """
    # Convert input to numpy array if it's not already
    x = np.array(x)

    # Method 1: Using np.maximum (most common and efficient)
    result = np.maximum(0, x)

    # Alternative methods:
    # Method 2: Using np.where
    # result = np.where(x > 0, x, 0)

    # Method 3: Using boolean indexing
    # result = x.copy()
    # result[x < 0] = 0

    return result

print("Solution implementation:")
print("def relu(x):")
print("    x = np.array(x)")
print("    return np.maximum(0, x)")

print("\nKey points about this solution:")
print("1. np.maximum(0, x) is concise, readable, and efficient")
print("2. It broadcasts the scalar 0 against x, so it works for arrays of any shape")
print("3. It preserves the integer or float dtype of the input")
print("4. It's vectorized, so it's fast for large arrays")
print("5. It correctly handles all edge cases (zeros, negative numbers, etc.)")

Here's the complete solution for the PyTorch exercise:

In [None]:
import torch
import torch.nn as nn

# === Part A: Module API ===
# Solution 1: Create a ReLU module (inplace left at its default, False)
relu_layer = nn.ReLU()

# Example input tensor
x = torch.tensor([[-2.0, -0.5, 0.0, 1.0, 3.0]])
print("Input:", x)

# Solution 2: Apply ReLU to x using the module
out_mod = relu_layer(x)
print("Output (nn.ReLU):", out_mod)
