# Purpose of derivatives in Gradient Descent
- The purpose of derivatives in gradient descent is to determine the rate of change of the cost function with respect to the model's parameters.
- The cost function represents the difference between the model's predictions and the actual values, and the goal of gradient descent is to minimize this cost.

- In gradient descent, the derivatives of the cost function with respect to the model's parameters are used to compute the gradient, which is a vector of the partial derivatives.
- The gradient points in the direction of steepest increase of the cost function, and its magnitude gives the rate of change of the cost in that direction.

- During each iteration of gradient descent, the parameters are updated by a small step in the direction of the negative gradient, i.e., the direction of steepest decrease of the cost function.
- With a suitably small step size (the learning rate), each update tends to lower the cost, so the cost is reduced gradually over many iterations.

- By computing the derivatives, gradient descent can navigate the parameter space toward parameters that minimize the cost function (in general a local minimum), which in turn tends to improve model performance.
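
The update rule described above can be sketched on a toy problem. The cost function `J(w) = (w - 3)**2`, the starting value, the learning rate, and the number of steps are all illustrative assumptions, not values taken from this notebook.

```python
# Toy gradient descent on an assumed cost J(w) = (w - 3)**2, whose minimum is at w = 3

def cost(w):
    return (w - 3) ** 2

def dcost(w):
    # derivative of the cost with respect to the single parameter w
    return 2 * (w - 3)

w = 0.0              # assumed starting guess
learning_rate = 0.1  # assumed step size
n_steps = 100        # assumed number of iterations

for _ in range(n_steps):
    # step against the derivative, i.e. downhill on the cost
    w = w - learning_rate * dcost(w)

print(w, cost(w))
```

With one parameter the gradient is just the ordinary derivative. With several parameters, the same update would be applied to each parameter using its own partial derivative.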

#### Derivatives of "interacting" functions (multiplication and embedding, i.e. composition) are unintuitive
#### In practice, libraries like `pytorch`, `tensorflow`, etc. have routines that compute complicated derivatives very efficiently and accurately

In [None]:
import numpy as np
import sympy as sym

# make the equation look nicer
from IPython.display import display

In [None]:
# create symbolic variables in sympy
x = sym.symbols('x')

# create two functions
fx = 2*x**2
gx = 4*x**3 - 3*x**4

# compute their individual derivatives
df = sym.diff(fx)
dg = sym.diff(gx)

# apply the product rule manually: (f*g)' = f'*g + f*g'
manual = df*gx + fx*dg

# multiplying the two derivatives is NOT the derivative of the product
thewrongway = df*dg

# via sympy
viasympy = sym.diff(fx*gx)

# print it all
print('The function is: ')
display(fx) # display command is part of ipython library
display(gx)
print(' ')

print('The derivatives: ')
display(df)
display(dg)
print(' ')

print('Manual Product rule: ')
display(manual)
print(' ')

print('Via sympy: ')
display(viasympy)
print(' ')

The function is: 


2*x**2

-3*x**4 + 4*x**3

 
The derivatives: 


4*x

-12*x**3 + 12*x**2

 
Manual Product rule: 


2*x**2*(-12*x**3 + 12*x**2) + 4*x*(-3*x**4 + 4*x**3)

 
Via sympy: 


2*x**2*(-12*x**3 + 12*x**2) + 4*x*(-3*x**4 + 4*x**3)
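
The cell above computes the product of the two derivatives but never compares it with the true derivative. A minimal sketch of that comparison follows, reusing the same two functions. The names `product_rule`, `wrong_way`, and `via_sympy` are chosen here for illustration, and `sym.expand` is used only to put the polynomials in a form that can be compared directly.

```python
import sympy as sym

x = sym.symbols('x')
fx = 2*x**2
gx = 4*x**3 - 3*x**4

df = sym.diff(fx, x)
dg = sym.diff(gx, x)

product_rule = df*gx + fx*dg      # f'*g + f*g'
wrong_way = df*dg                 # f'*g', not the derivative of f*g
via_sympy = sym.diff(fx*gx, x)    # sympy differentiates the product directly

# expand() multiplies everything out so the results can be read side by side
print(sym.expand(product_rule))
print(sym.expand(via_sympy))
print(sym.expand(wrong_way))

# the product rule agrees with sympy; the product of derivatives does not
print(sym.expand(product_rule - via_sympy) == 0)  # True
print(sym.expand(wrong_way - via_sympy) == 0)     # False
```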

 


In [None]:
# repeat with the chain rule: fx is a polynomial nested inside a 5th power
fx = (x**2 + 4*x**3)**5

print('The function: ')
display(fx)
print(' ')

print('Its derivative')
display(sym.diff(fx))

The function: 


(4*x**3 + x**2)**5

 
Its derivative


(60*x**2 + 10*x)*(4*x**3 + x**2)**4
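
The same derivative can be rebuilt by hand to see where the two factors come from. The sketch below splits the function into an inner and an outer part. The helper symbol `u` and the names `inner`, `outer`, and `manual_chain` are introduced here for illustration and are not part of the original notebook.

```python
import sympy as sym

x, u = sym.symbols('x u')

inner = x**2 + 4*x**3   # g(x), the embedded function
outer = u**5            # f(u), the outer function

# chain rule: d/dx f(g(x)) = f'(g(x)) * g'(x)
manual_chain = sym.diff(outer, u).subs(u, inner) * sym.diff(inner, x)

# sympy applied directly to the composed function
via_sympy = sym.diff(inner**5, x)

print(manual_chain)
print(sym.expand(manual_chain - via_sympy) == 0)  # True
```

The factor `(60*x**2 + 10*x)` in the output above is the outer exponent 5 multiplied by the derivative of the inner polynomial, `12*x**2 + 2*x`.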