<a href="https://colab.research.google.com/github/tobypullan/tobypullan.github.io/blob/main/MicrogradPart1Exercises.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Part 1 Exercises:

# Exercise 1: Derivative of a Simple Function with One Input

# Let's explore how to numerically approximate the derivative of a function.


def f(x):
    """A simple quadratic function."""
    return x**2 + 3*x + 1

def plot_function(f, x_range):
    """Plot the function over the interval given by x_range."""
    x = np.linspace(x_range[0], x_range[1], 100)
    y = f(x)

    plt.figure(figsize=(10, 6))
    plt.plot(x, y, label='f(x)')
    plt.title('Function')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.legend()
    plt.grid(True)
    plt.show()

# Plot the function
plot_function(f, (-5, 5))

In [None]:
# TODO: Implement a function to numerically approximate the derivative
def approximate_derivative(f, x, h=1e-5):
    """
    Approximate the derivative of function f at point x.

    Parameters:
    f (function): The function to differentiate
    x (float): The point at which to calculate the derivative
    h (float): The small step size for approximation

    Returns:
    float: The approximated derivative
    """
    # Your code here
    pass

# Test your implementation
x_test = 2
approx_deriv = approximate_derivative(f, x_test)
print(f"Approximated derivative of f(x) at x = {x_test}: {approx_deriv}")




In [None]:
# Bonus: Can you think of a way to verify if your approximation is correct?
# Hint: For this simple function, you can calculate the actual derivative analytically.

# Exercise 1 (continued): Exploring the Effect of Step Size

def plot_approximation_error(f, x, true_derivative):
    """Plot the approximation error for different step sizes."""
    step_sizes = np.logspace(-12, 0, 100)
    errors = [abs(approximate_derivative(f, x, h) - true_derivative) for h in step_sizes]

    plt.figure(figsize=(10, 6))
    plt.loglog(step_sizes, errors)
    plt.title('Approximation Error vs Step Size')
    plt.xlabel('Step Size (h)')
    plt.ylabel('Absolute Error')
    plt.grid(True)
    plt.show()

# TODO: Calculate the true derivative of f(x) at x = 2
true_derivative = None  # Replace None with the correct value

# Plot the approximation error
plot_approximation_error(f, 2, true_derivative)


In [None]:
# Questions to consider:
# 1. What happens to the approximation error as the step size becomes very small?
# 2. What happens as the step size becomes very large?
# 3. Is there an optimal step size? Why or why not?

In [2]:
# Exercise 2: Derivatives of Functions with Multiple Inputs

def f(x, y, z):
    """A function with multiple inputs."""
    return x**2 + y*z + np.sin(x*y)

In [5]:
def partial_derivative(f, var, point, h=1e-5):
    """
    Calculate the partial derivative of a function with respect to a specific variable.

    Parameters:
    f (function): The function to differentiate
    var (int): The index of the variable to differentiate with respect to
    point (tuple): The point at which to calculate the derivative
    h (float): The small step size for approximation

    Returns:
    float: The approximated partial derivative
    """
    # Forward difference: nudge only the coordinate at index `var` by h.
    point_list = list(point)  # tuples are immutable, so copy into a list
    point_list[var] += h
    return (f(*point_list) - f(*point)) / h

# Test the partial derivative function
test_point = (1, 2, 3)
print(f"Partial derivative with respect to x at {test_point}: {partial_derivative(f, 0, test_point)}")
print(f"Partial derivative with respect to y at {test_point}: {partial_derivative(f, 1, test_point)}")
print(f"Partial derivative with respect to z at {test_point}: {partial_derivative(f, 2, test_point)}")


Partial derivative with respect to x at (1, 2, 3): 1.1676981410246867
Partial derivative with respect to y at (1, 2, 3): 2.5838486170215447
Partial derivative with respect to z at (1, 2, 3): 2.0000000000131024


In [None]:
# My partial derivative function:
# point_list = list(point)
# point_list[var] += h
# return (f(*point_list) - f(*point)) / h

# Explanation:
# Line 1: copies the point tuple into a list, because tuples are immutable and
# one coordinate needs to be changed
# Line 2: adds h to the coordinate at index var, leaving the others unchanged
# Line 3: unpacks point_list and point into f, then divides the change in f by h.
# This forward difference approximates the partial derivative with respect to
# the variable at index var.

In [None]:
# Collecting the partial derivatives
def gradient(f, point, h=1e-5):
    """
    Calculate the gradient of a function at a given point.

    Parameters:
    f (function): The function to differentiate
    point (tuple): The point at which to calculate the gradient
    h (float): The small step size for approximation

    Returns:
    numpy.array: The gradient vector
    """
    return np.array([partial_derivative(f, i, point, h) for i in range(len(point))])

# Test the gradient function
grad = gradient(f, test_point)
print(f"Gradient at {test_point}: {grad}")

# Questions to consider:
# 1. How does the concept of partial derivatives relate to the gradient of the loss function?
# 2. Can you think of a real-world scenario where calculating gradients of multi-input functions is useful?
# 3. How might you verify the correctness of your gradient calculation?


# Answers to Questions to Consider (Exercise 1):

## 1. What happens to the approximation error as the step size becomes very small?
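As the step size shrinks, the approximation error first decreases, because the truncation error of the forward difference is proportional to h.
Below a certain point (around h ≈ 1e-8 in double precision) the error grows again: f(x + h) and f(x) become nearly equal, their difference loses significant digits, and dividing by a tiny h amplifies that loss.
This is known as round-off error.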

## 2. What happens as the step size becomes very large?
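As the step size becomes very large, the approximation error increases.
The forward difference measures the slope of a secant line between x and x + h, which drifts away from the tangent slope as h grows.
For the quadratic f from Exercise 1, (f(x + h) - f(x)) / h = 2x + 3 + h, so the error equals h (up to round-off).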

## 3. Is there an optimal step size? Why or why not?
Yes, there is typically an optimal step size. It balances two competing sources of error:
- Too small: round-off error dominates because of limited floating-point precision
- Too large: truncation error dominates because the secant slope is a poor approximation of the tangent slope

The optimal step size minimizes the combined effect of these two sources of error.
On the log-log plot of error vs. step size, it is the bottom of the V-shaped curve.
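
The trade-off described above can be seen without a plot. The following is a minimal sketch, assuming the quadratic from Exercise 1 and a forward-difference approximation; the names `quadratic`, `forward_difference`, `errors`, and `best_h` are illustrative and not part of the original exercise code.

```python
def quadratic(x):
    """Same quadratic as f in Exercise 1: x^2 + 3x + 1."""
    return x**2 + 3*x + 1

def forward_difference(func, x, h):
    """Forward-difference approximation of func'(x) with step size h."""
    return (func(x + h) - func(x)) / h

x0 = 2.0
true_value = 2 * x0 + 3  # analytic derivative 2x + 3 evaluated at x0

# Step sizes from 1 down to 1e-12, one per decade.
step_sizes = [10.0 ** (-k) for k in range(13)]
errors = [abs(forward_difference(quadratic, x0, h) - true_value)
          for h in step_sizes]

for h, err in zip(step_sizes, errors):
    print(f"h = {h:.0e}   error = {err:.3e}")

# Step size with the smallest observed error.
best_h = step_sizes[errors.index(min(errors))]
print(f"Smallest error at h = {best_h:.0e}")
```

For large h the printed error tracks h itself, and for very small h it rises again as round-off takes over. The exact location of the minimum depends on the function and the floating-point format.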


# Answers to Questions to Consider (Exercise 2):

## 1. How does the concept of partial derivatives relate to the gradient of the loss function?

The gradient is the vector of all partial derivatives of a function with respect to its input variables. Each component describes how the function changes when one input varies while the others are held constant. For a loss function, the inputs are the model parameters, so each partial derivative tells us how the loss responds to a small change in one parameter, and the gradient collects these into a single vector.
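
As a worked example, the gradient of the three-input function from Exercise 2, f(x, y, z) = x² + yz + sin(xy), can be written out by hand:

$$
\nabla f(x, y, z) =
\left( \frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y},\ \frac{\partial f}{\partial z} \right)
= \bigl( 2x + y\cos(xy),\ \ z + x\cos(xy),\ \ y \bigr)
$$

At the test point (1, 2, 3) this gives approximately (1.1677, 2.5839, 2), in line with the numerical partial derivatives printed in the notebook.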

## 2. Can you think of a real-world scenario where calculating gradients of multi-input functions is useful?

One practical application is in machine learning, specifically in training neural networks through gradient descent. The gradient of the loss function with respect to the model parameters indicates the direction of steepest increase in the loss. By moving in the opposite direction of this gradient, we can adjust the model parameters to minimize the loss and improve the model's performance.

Another example is in optimization problems in economics or engineering, where we might want to maximize or minimize a function that depends on multiple variables.
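
The gradient descent idea can be sketched on a toy problem. This is a minimal illustration, not a neural network: the `loss` function, its minimum at (3, -1), the learning rate, and the number of steps are all assumptions chosen for the example, and `numerical_gradient` is a hypothetical helper that uses the same forward-difference idea as the notebook.

```python
import numpy as np

def loss(w, b):
    """Hypothetical two-parameter loss with its minimum at w = 3, b = -1."""
    return (w - 3.0) ** 2 + (b + 1.0) ** 2

def numerical_gradient(func, point, h=1e-5):
    """Forward-difference gradient of func at point."""
    point = np.asarray(point, dtype=float)
    grad = np.zeros_like(point)
    base = func(*point)
    for i in range(point.size):
        shifted = point.copy()
        shifted[i] += h  # nudge one parameter at a time
        grad[i] = (func(*shifted) - base) / h
    return grad

params = np.array([0.0, 0.0])  # starting guess for (w, b)
learning_rate = 0.1

for _ in range(200):
    # Step against the gradient to reduce the loss.
    params = params - learning_rate * numerical_gradient(loss, params)

print(f"Parameters after descent: {params}")
print(f"Loss after descent: {loss(*params)}")
```

The parameters settle very close to (3, -1). The small remaining offset comes from the bias of the forward difference, which is one reason frameworks compute gradients analytically through backpropagation rather than numerically.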

## 3. How might you verify the correctness of your gradient calculation?

There are several ways to verify the gradient calculation:

a) Compare with analytical derivatives: For simple functions, we can calculate the gradient analytically and compare it with our numerical approximation.

b) Finite difference method: We can use a more accurate finite difference method, such as the central difference (f(x + h) - f(x - h)) / (2h), and compare it with our current implementation.

c) Gradient checking: We can slightly perturb each input variable and observe if the function's change matches what we'd expect based on our calculated gradient.

d) Symmetry tests: For some functions, we might know that certain partial derivatives should be equal due to the function's symmetry.

e) Unit tests: We can create test cases with known gradients and ensure our function produces the expected results.
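
Checks (a) and (b) can be combined in a few lines. The sketch below assumes the three-input function from Exercise 2; `f3`, `analytic_gradient`, and `central_gradient` are illustrative names that do not appear in the original notebook, and the analytic gradient was derived by hand.

```python
import numpy as np

def f3(x, y, z):
    """Same function as the three-input f in Exercise 2."""
    return x**2 + y*z + np.sin(x*y)

def analytic_gradient(x, y, z):
    """Hand-derived gradient of f3."""
    return np.array([
        2*x + y*np.cos(x*y),  # df/dx
        z + x*np.cos(x*y),    # df/dy
        y,                    # df/dz
    ])

def central_gradient(func, point, h=1e-5):
    """Central-difference gradient: (f(p + h) - f(p - h)) / (2h) per coordinate."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2 * h))
    return np.array(grad)

point = (1.0, 2.0, 3.0)
exact = analytic_gradient(*point)
approx = central_gradient(f3, point)
max_error = np.max(np.abs(approx - exact))

print(f"Analytic gradient:           {exact}")
print(f"Central-difference gradient: {approx}")
print(f"Largest absolute difference: {max_error:.2e}")
```

If the two gradients disagree by much more than the expected finite-difference error, either the numerical routine or the hand derivation has a mistake, which makes this a useful basis for a unit test as in (e).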
