# Symbolic Differentiation (SD)
### vs.
# Automatic Differentiation (AD)
### vs.
# Finite Differences (FD)

Symbolic Differentiation, Automatic Differentiation, and Finite Differences are three approaches to computing derivatives of mathematical functions.
Each has its own advantages and disadvantages, and the right choice depends on the requirements of the problem at hand.

1. **Symbolic Differentiation**

Symbolic Differentiation involves manipulating mathematical expressions symbolically to obtain their derivatives.
Instead of numerical values, it works with algebraic expressions. Symbolic differentiation provides exact derivatives as symbolic expressions.

For example, if you have an algebraic expression like $f(x) = x^2 + 3x + 5$, symbolic differentiation will yield $f'(x) = 2x + 3$.

Advantages of symbolic differentiation include precision and exactness, which is crucial in certain mathematical and scientific computations.
However, it can be computationally expensive and may lead to complex expressions, especially for functions with many variables or intricate forms.
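The growth in expression size mentioned above can be observed directly. The following is a minimal sketch, assuming `SymPy` is installed; the nested test function is an arbitrary choice made for illustration. It uses `sp.count_ops` to compare the operation count of an expression with that of its derivative as the nesting gets deeper.

```python
import sympy as sp

x = sp.symbols('x')

expr = x
sizes = []
for depth in range(1, 5):
    # Nest the expression one level deeper (arbitrary illustrative choice)
    expr = sp.sin(expr) * expr
    deriv = sp.diff(expr, x)
    # Record the number of operations in the expression and in its derivative
    sizes.append((depth, sp.count_ops(expr), sp.count_ops(deriv)))

for depth, n_expr, n_deriv in sizes:
    print(f"depth {depth}: expression has {n_expr} ops, derivative has {n_deriv} ops")
```

The derivative needs more operations than the expression it came from at every depth, which is why symbolic differentiation can become costly for intricate functions.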

2. **Automatic Differentiation (AutoDiff)**

Automatic Differentiation, also known as autodiff or AD, is a computational technique for efficiently and accurately computing derivatives of functions.
It is particularly useful for functions that are defined by computer programs or algorithms.
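AutoDiff decomposes the program into elementary operations (additions, multiplications, elementary functions) and applies the chain rule from calculus to them.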

There are two main modes of AutoDiff:

- *Forward-mode AutoDiff:* This method propagates derivatives from the inputs to the outputs, computing the derivatives of all outputs with respect to one input variable per pass. It is efficient when the number of input variables is small compared to the number of outputs.
- *Reverse-mode AutoDiff (Backpropagation):* This method propagates derivatives from the outputs back to the inputs, computing the derivatives of one output variable with respect to all input variables in a single pass. It is particularly efficient when the number of input variables is much larger than the number of outputs, which is common in machine learning and neural network training, where a scalar loss depends on many parameters.
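To make forward mode concrete, here is a minimal sketch based on dual numbers. The class name `Dual` and its attributes are hypothetical and chosen for illustration only; this is not how any particular library implements it. Each number carries its value and its derivative with respect to the chosen input, and every operation updates both.

```python
class Dual:
    """Hypothetical dual number: a value together with its derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def f(x):
    u = 2 * x + 1
    return 3 * u * u  # same as 3 * (2x + 1)**2


# Seed the input with derivative 1, i.e. dx/dx = 1
x = Dual(2.0, 1.0)
y = f(x)
print(y.value, y.deriv)  # prints 75.0 60.0
```

One evaluation of `f` yields both the function value and the derivative with respect to the seeded input. A function with several inputs would need one such pass per input.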

AutoDiff is widely used in optimization problems and machine learning, where gradients (derivatives) are essential for training algorithms like gradient descent.
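Reverse mode can be sketched in a similar spirit. The class `Var` below is hypothetical and only supports addition and multiplication; real libraries are far more elaborate. Each node records its parents and the local derivatives, and a single backward sweep accumulates the gradient with respect to every input.

```python
class Var:
    """Hypothetical scalar node that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def backward(self):
        # Build a topological order (parents before children)
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


x, y = Var(2.0), Var(1.0)
u = 2 * x + y
f = 3 * u * u  # f(x, y) = 3 * (2x + y)**2
f.backward()
print(x.grad, y.grad)  # prints 60.0 30.0
```

A single call to `backward()` fills in the derivative with respect to both `x` and `y`, which is the property that makes reverse mode attractive when a scalar output depends on many inputs.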

3. **Finite Differences**

Finite Differences is a numerical method for approximating derivatives by using the values of a function at discrete points.
It is a simple and intuitive approach but is generally less accurate than AutoDiff or Symbolic Differentiation.

The basic idea is to approximate the derivative by dividing the change in the function's value ($\delta y$) by a small change in the input variable ($\delta x$).
There are several finite difference schemes, including forward differences, backward differences, and central differences, depending on how the neighboring points are chosen.

For example, the forward difference approximation for the derivative of a function $f$ at a point $x$ is given by:

$$f'(x) \approx \frac{f(x + \delta x) - f(x)}{\delta x}$$

Finite Differences are straightforward to implement and only require the function's values, not its analytical expression, so they also work when the function of interest is a "black box".
They are commonly used in numerical analysis and scientific computing for problems where symbolic differentiation is impractical or too costly.
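The three schemes mentioned above can be compared with a short sketch. The helper names `forward_diff`, `backward_diff` and `central_diff` are hypothetical, `h` plays the role of $\delta x$, and the test function $e^x$ is chosen because its derivative is known exactly.

```python
import math


def forward_diff(f, x, h):
    # Uses the point to the right of x
    return (f(x + h) - f(x)) / h


def backward_diff(f, x, h):
    # Uses the point to the left of x
    return (f(x) - f(x - h)) / h


def central_diff(f, x, h):
    # Uses both neighbors, symmetric around x
    return (f(x + h) - f(x - h)) / (2 * h)


x0, h = 1.0, 1e-3
exact = math.exp(x0)  # the derivative of exp is exp itself

for name, scheme in [("forward", forward_diff),
                     ("backward", backward_diff),
                     ("central", central_diff)]:
    error = abs(scheme(math.exp, x0, h) - exact)
    print(f"{name:8s} error = {error:.2e}")
```

For the same step size, the central difference is noticeably more accurate here, because its truncation error shrinks with $\delta x^2$ rather than $\delta x$, at the cost of needing function values on both sides of $x$.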

---

Let's use the function $f(x) = 3(2x+1)^2$ as an example for each technique in Python. Its exact derivative is $f'(x) = 12(2x+1) = 24x + 12$, so $f'(2) = 60$.

1. For symbolic differentiation, you can use a symbolic math library like `SymPy`:

In [1]:
import sympy as sp

# Define the symbolic variable and function
x = sp.symbols('x')
f = 3 * (2 * x + 1)**2

# Compute the derivative symbolically
derivative_symbolic = sp.diff(f, x)

print("Derivative using Symbolic Differentiation:", derivative_symbolic)

Derivative using Symbolic Differentiation: 24*x + 12


2. For AutoDiff in Python, you can use libraries like `TensorFlow` or `PyTorch`.
Here's an example using `PyTorch` (we will use `PyTorch` in this course):

In [2]:
import torch

# Define the input variable and function
x = torch.tensor([2.0], requires_grad=True)
f = 3 * (2 * x + 1)**2

# Compute the derivative using AutoDiff
f.backward()

# The derivative is stored in x.grad
derivative_auto_diff = x.grad.item()

print("Derivative using AutoDiff:", derivative_auto_diff)


Derivative using AutoDiff: 60.0


3. For finite differences, you can approximate the derivative numerically:

In [3]:
# Define the function
def f(x):
    return 3 * (2 * x + 1)**2

# Choose a small delta x
delta_x = 0.001

# Calculate the derivative using finite differences (forward differences)
x_value = 2.0
derivative_finite_diff = (f(x_value + delta_x) - f(x_value)) / delta_x

print("Derivative using Finite Differences:", derivative_finite_diff)


Derivative using Finite Differences: 60.01199999998619


We can see that the output is very different for the 3 techniques:
- SD gives us an expression, valid for any input
- AD gives us the exact value (up to floating-point precision) for a given input
- FD gives us an approximate value for a given input
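The symbolic result can still be turned into a number by substituting a value for `x`. The sketch below assumes `SymPy` is installed and uses its `subs` method, together with `sp.lambdify` to obtain an ordinary Python function. The variable names are chosen for illustration.

```python
import sympy as sp

x = sp.symbols('x')
derivative_symbolic = sp.diff(3 * (2 * x + 1)**2, x)

# Substitute x = 2 into the symbolic derivative
value_at_2 = derivative_symbolic.subs(x, 2)
print(value_at_2)  # prints 60

# Turn the expression into a plain Python function for repeated evaluation
derivative_fn = sp.lambdify(x, derivative_symbolic)
print(derivative_fn(2.0))
```

The value matches the AD result at the same input, while the FD result differs slightly because of its approximation error.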

---

Let's compare the computation times of the three techniques for the same example function ($f(x) = 3(2x+1)^2$) using Python's `time` module.

In [5]:
import torch
import sympy as sp
import time

# Define the function
def f(x):
    return 3 * (2 * x + 1)**2

# Method 1: Symbolic Differentiation
x_symbolic = sp.symbols('x')
f_symbolic = f(x_symbolic)
start_time = time.time()
derivative_symbolic = sp.diff(f_symbolic, x_symbolic)
symbolic_time = time.time() - start_time

# Method 2: Automatic Differentiation
x_auto_diff = torch.tensor([2.0], requires_grad=True)
start_time = time.time()
f_auto_diff = f(x_auto_diff)
f_auto_diff.backward()
derivative_auto_diff = x_auto_diff.grad.item()
auto_diff_time = time.time() - start_time

# Method 3: Finite Differences
x_value = 2.0
delta_x = 0.001
start_time = time.time()
derivative_finite_diff = (f(x_value + delta_x) - f(x_value)) / delta_x
finite_diff_time = time.time() - start_time

print("Derivative using Symbolic Differentiation:", derivative_symbolic)
print("Derivative using Automatic Differentiation:", derivative_auto_diff)
print("Derivative using Finite Differences:", derivative_finite_diff)

print("Computation time for Symbolic Differentiation:", symbolic_time)
print("Computation time for Automatic Differentiation:", auto_diff_time)
print("Computation time for Finite Differences:", finite_diff_time)


Derivative using Symbolic Differentiation: 24*x + 12
Derivative using Automatic Differentiation: 60.0
Derivative using Finite Differences: 60.01199999998619
Computation time for Symbolic Differentiation: 0.001008749008178711
Computation time for Automatic Differentiation: 0.0009920597076416016
Computation time for Finite Differences: 0.0
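Single-shot timings of such a small computation are noisy. A minimal sketch of a more robust measurement is shown below, using the standard library's `timeit` module: the call is repeated many times, and the best of several runs is reported. Only the finite-difference case is timed here, and the helper `finite_diff` is a hypothetical name. The same pattern should apply to the other two methods.

```python
import timeit


def f(x):
    return 3 * (2 * x + 1)**2


def finite_diff(x, delta_x=0.001):
    # Forward difference, as in the cell above
    return (f(x + delta_x) - f(x)) / delta_x


n_calls = 100_000
# Repeat the whole measurement a few times and keep the fastest run
best = min(timeit.repeat(lambda: finite_diff(2.0), number=n_calls, repeat=5))
print(f"Finite differences: {best / n_calls:.3e} s per call")
```

Averaging over many calls avoids readings of exactly zero and makes the numbers less sensitive to whatever else the machine is doing.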


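- In this run, SD is the slowest, closely followed by AD, and FD is the fastest (its 0.0 means the run was shorter than the resolution of `time.time()`).
- For more complex functions, SD tends to be much slower than AD (for a simple example like this one, they are about the same).
- FD is very fast here, but it only gives an approximation, and it may not be stable on more complex examples: a step that is too large gives a poor approximation, while a step that is too small amplifies round-off errors. Its cost also grows with the number of inputs, since each partial derivative needs its own extra function evaluation.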
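The step-size sensitivity of finite differences can be illustrated with a short experiment. This is a sketch under simple assumptions: the test function is $e^x$ at $x = 1$, whose exact derivative is known, and `forward_diff` is a hypothetical helper name.

```python
import math


def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h


x0 = 1.0
exact = math.exp(x0)  # the derivative of exp is exp itself

errors = []
for k in range(1, 16):
    h = 10.0 ** -k
    err = abs(forward_diff(math.exp, x0, h) - exact)
    errors.append((h, err))
    print(f"h = 1e-{k:02d}   error = {err:.2e}")
```

The error first decreases as the step shrinks, then increases again for very small steps, because the subtraction `f(x + h) - f(x)` loses most of its significant digits in floating-point arithmetic.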

---

Let's look at a multivariable example:
$$f(x, y) = x^2y^3 + 3(x-1)^2y + (y+1)^2/x$$
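For reference, the partial derivatives of this function can be derived by hand and checked numerically. The sketch below is an illustration, not part of the original notebook: `grad_f` holds the hand-derived formulas $\partial f/\partial x = 2xy^3 + 6(x-1)y - (y+1)^2/x^2$ and $\partial f/\partial y = 3x^2y^2 + 3(x-1)^2 + 2(y+1)/x$, and `central_grad` is a hypothetical helper that approximates them with central differences. The evaluation point $(2, 1)$ is an arbitrary choice.

```python
def f(x, y):
    return x**2 * y**3 + 3 * (x - 1)**2 * y + (y + 1)**2 / x


def grad_f(x, y):
    # Hand-derived partial derivatives of f
    df_dx = 2 * x * y**3 + 6 * (x - 1) * y - (y + 1)**2 / x**2
    df_dy = 3 * x**2 * y**2 + 3 * (x - 1)**2 + 2 * (y + 1) / x
    return df_dx, df_dy


def central_grad(func, x, y, h=1e-5):
    # One pair of function evaluations per input variable
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return df_dx, df_dy


x0, y0 = 2.0, 1.0
exact = grad_f(x0, y0)
approx = central_grad(f, x0, y0)
print("analytic gradient:          ", exact)  # (9.0, 17.0)
print("central-difference gradient:", approx)
```

With two inputs, the finite-difference gradient already needs four function evaluations, one pair per variable.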