# Rules for the Derivative

### Introduction

In the last lesson we were introduced to the derivative.  The derivative is the instantaneous rate of change of the function.  Or another way to think about it is that it is the slope of a function at a given point.  We defined the derivative by the mathematical formula:

$\frac{\delta y}{\delta x} = \lim_{\delta x\to0}\frac{y_1 - y_0}{x_1 - x_0}$.

> We can read the above as the change in $y$ per change in $x$, as the change in $x$ approaches zero.

We saw an example of calculating the derivative for a function, $f(x) = 3x^2$, with the following code:

> First, we defined our function $f(x)$.

In [1]:
def f(x):
    return 3*(x**2)

> And then we defined our `rate_of_change` formula. 

In [3]:
def rate_of_change(x_0, x_1):
    return (f(x_1) - f(x_0))/(x_1 - x_0)

> And then we found the rate of change as delta x approached zero.

In [9]:
rate_of_change(-3, -3.01)

-18.02999999999993
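To make "as delta x approaches zero" concrete, here is a minimal sketch that repeats the calculation with smaller and smaller steps. The helper name `rate_of_change_at` and the step sizes are assumptions chosen for illustration; `f` is the same function as above, restated so the block runs on its own.

```python
def f(x):
    return 3 * (x ** 2)

def rate_of_change_at(x_0, delta_x):
    # Average rate of change of f between x_0 and x_0 + delta_x
    return (f(x_0 + delta_x) - f(x_0)) / delta_x

# Shrink the step and watch the estimates settle toward one value
steps = [0.1, 0.01, 0.001, 0.0001]
estimates = [rate_of_change_at(-3, step) for step in steps]
for step, estimate in zip(steps, estimates):
    print(step, estimate)
```

The estimates move closer and closer to $-18$ as the step shrinks, which is the slope of $f$ at $x = -3$.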

In this lesson we'll see that there are certain shortcuts we can use for calculating the derivative at a given point.  These rules are essential to how a program like PyTorch calculates the slope of the cost function so quickly -- it just applies these rules.  And lucky for us, there are only a few rules we need to know.

### 1. The power rule 

The main rule about derivatives we need to know is the power rule.  This is the power rule:

If $f(x) = x^z$, then the derivative, $f'(x)$ is

$f'(x) = z*x ^ {z-1}$

It's easier to see this with a couple examples.  Say that we have the following function:

$g(x) = x^2$ then $g'(x) = 2x^ 1 = 2x$ 

Or if $j(x) = 3x$ then $j'(x) = 3x^{1 - 1} = 3x^0 = 3$

And if $h(x) = x^4$ then $h'(x) = 4x^3$ 
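The power rule can also be sketched in code. The function name `power_rule` is an assumption made for illustration: given the exponent $z$, it returns a new function that computes $z*x^{z-1}$.

```python
def power_rule(z):
    # For f(x) = x**z, return the derivative f'(x) = z * x**(z - 1) as a function
    def derivative(x):
        return z * x ** (z - 1)
    return derivative

g_prime = power_rule(2)   # g(x) = x**2, so g'(x) = 2x
h_prime = power_rule(4)   # h(x) = x**4, so h'(x) = 4x**3

print(g_prime(5))  # 2 * 5 = 10
print(h_prime(2))  # 4 * 2**3 = 32
```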

Now it's your turn, if $z(x) = x^3$, use the power rule to find $z'(x)$.  
> Answer is at the end of this lesson.

Before moving on, let's take a moment to appreciate how nice this rule is.  It means that if we want to see the instantaneous rate of change of a function of the form $f(x) = x^z$, we just use the power rule.  

Earlier we found that if $h(x) = x^4$, then $h'(x) = 4x^3$.  This means that to see the instantaneous rate of change of $h(x)$ when $x = 2$, we can calculate:

* $h'(x) = 4x^3 $ and

* $h'(2) = 4*2^3 = 4*8 = 32$

This is very close to what we get with our previous formula for calculating the derivative, which only approximates it because the step in $x$ is small but not zero.

In [13]:
def rate_of_change_h(x_0, x_1):
    return (h(x_1) - h(x_0))/(x_1 - x_0)

def h(x): 
    return x**4

rate_of_change_h(2, 2.001)

32.024008000998776

And we found this just by applying the power rule.

Before moving on, let's see one more common example.  If $f(x) = 3x$, then $f'(x) = 3$.  Do you see why? It's because $f(x) = 3x = 3x^1$, and applying the power rule gives:

* $f'(x) = 3x^{1 - 1} = 3*x^0 = 3*1 = 3$

So what's $f'(x)$ if $f(x) = 5x$?  

> Well the answer, by the same logic, is $f'(x) = 5x^{1 - 1} = 5x^0 = 5*1 = 5$.

### 2. Constant Multiple Rule

Ok, now let's find the derivative of a few more functions: 

$h(x) = 3x^2$

$h'(x) = 3*2x = 6x$

Or here's another example $g(x) = 4x^3$

$g'(x) = 4*3x^2 = 12x^2$

Both of these were solved with the constant multiple rule.

> The constant multiple rule states that the derivative of a constant times a function is just the constant times the derivative of that function.

In other words, if we have a function of the form $f(x) = c*x^z$, then the derivative of our function is:

$f'(x) = c*z*x^{z - 1}$

Let's see one more example.

When $z(x) = 5x^4$

$z'(x) = 5*4x^3 = 20x^3$
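As a rough sketch, the constant multiple rule and the power rule together can be written as one small function. Representing a term $c*x^z$ as the pair `(c, z)`, and the names `differentiate_term` and `evaluate_term`, are assumptions made for this illustration.

```python
def differentiate_term(c, z):
    # A term c * x**z is represented by the pair (c, z).
    # Constant multiple rule plus power rule: the derivative is (c * z) * x**(z - 1).
    return (c * z, z - 1)

def evaluate_term(c, z, x):
    # Plug a value of x into the term c * x**z
    return c * x ** z

print(differentiate_term(3, 2))  # 3x^2 -> (6, 1), i.e. 6x
print(differentiate_term(4, 3))  # 4x^3 -> (12, 2), i.e. 12x^2
print(differentiate_term(5, 4))  # 5x^4 -> (20, 3), i.e. 20x^3

# Slope of 3x^2 at x = 2: the derivative 6x evaluated at 2
print(evaluate_term(*differentiate_term(3, 2), 2))  # 6 * 2 = 12
```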

> Now it's your turn.  Find the derivative of the function $f(x)$ as defined below.

$f(x) = 5x^6$

$f'(x) = ?$

> The answer is at the end of the lesson.

### 3. Changing Variables

Now remember that we have two different notations for the derivative.  For example, instead of writing the derivative of $f(x) = 5x^6$ as $f'(x) = 30x^5$ we can also write it as $\frac{\delta f}{\delta x} = 30x^5$.  This latter way is a little more explicit.  We can read this as the change in our function $f$ per a nudge in $x$.

Now what if we have a function that looks like the following: $z(w, x) = w^2*x$.  Let's calculate $\frac{\delta z}{\delta w}$.  This is the answer:

*  $\frac{\delta z}{\delta w} = (2w)*x$

So the point from the above is that, if we are not taking the derivative with respect to a variable, we just treat that variable as a constant.  Above, we are taking the derivative with respect to $w$ -- as expressed by $\frac{\delta z}{\delta w}$ -- so we apply the derivative rules to the $w$ variable, and treat everything else like a constant. 

Let's see another example.  Say we have the following function:

* $j(a, b) = 2*b^3*a$

And we calculate $\frac{\delta j}{\delta b}$.  Then this time we treat $a$ like a constant and get:

$\frac{\delta j}{\delta b} = 2*3b^2*a  = 6b^2a$

So notice above that it's important to understand the notation to the left when making the calculation.  If we see $\frac{\delta j}{\delta b}$, this means to find the change in the output of a function $j$ if we just nudge the variable $b$.  And we read this as the derivative of $j$ *with respect to b*.  If we see $\frac{\delta z}{\delta w}$, this means to find the derivative of a function $z$ with respect to $w$.  Or in other words, it means to find the change in the output of $z$ as we nudge a parameter $w$.

> When we have a function that has multiple variables (like $a$ and $b$ above), and we want to take the derivative with respect to one of the variables, this is called taking the **partial derivative**.  If you'd like to learn more about the partial derivative, we have an entire lesson on it in the math supplements.
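One way to check the result for $j(a, b) = 2*b^3*a$ is to nudge only $b$ in code while holding $a$ fixed. This is a minimal sketch: the helper names, the step size `delta`, and the sample point $a = 5$, $b = 2$ are assumptions chosen for illustration.

```python
def j(a, b):
    return 2 * b ** 3 * a

def partial_j_wrt_b(a, b):
    # Treat a as a constant and apply the power rule to b: 2 * 3b^2 * a
    return 6 * b ** 2 * a

def nudge_b(a, b, delta=1e-6):
    # Numerical estimate: nudge only b and hold a fixed
    return (j(a, b + delta) - j(a, b)) / delta

a, b = 5, 2
exact = partial_j_wrt_b(a, b)   # 6 * 4 * 5 = 120
estimate = nudge_b(a, b)
print(exact, estimate)
```

The numerical estimate should land very close to the value from the formula, since only $b$ was changed between the two evaluations of $j$.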

### 4. Multiple Terms

Now, let's discuss how we find the derivative of a function that has multiple terms.  For example, the function below has three terms:

$g(x) = 3x^3  + 2x^2 + 100$

To find the derivative of the function $g(x)$ above, we apply our derivative rule to each term in the function.  So we get the following:

$g'(x) = 9x^2  + 4x + 0 $

> This is called the *sum rule* for the derivative.  

Notice that the derivative of the last term, 100, equals 0 because the rate of change of a constant term is always 0.
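The term-by-term approach can be sketched in code as well. Here a function is represented as a list of `(c, z)` pairs, one per term $c*x^z$; that representation and the names `differentiate` and `evaluate` are assumptions made for illustration.

```python
def differentiate(terms):
    # terms is a list of (c, z) pairs, each standing for c * x**z.
    # Sum rule: differentiate each term on its own; constant terms (z == 0) drop out.
    return [(c * z, z - 1) for c, z in terms if z != 0]

def evaluate(terms, x):
    # Add up every term c * x**z at the given x
    return sum(c * x ** z for c, z in terms)

g = [(3, 3), (2, 2), (100, 0)]   # 3x^3 + 2x^2 + 100
g_prime = differentiate(g)        # [(9, 2), (4, 1)], i.e. 9x^2 + 4x

print(g_prime)
print(evaluate(g_prime, 2))       # 9*4 + 4*2 = 44
```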

> Your turn.  Now find the derivative of the function $h(x)$ below:

$h(x) = 2x^4  + x^2 + 200x$

> $h'(x) = ?$

### Summary

Ok, that's it for our math review!  There is really just one main rule, and that's our power rule:

* If $f(x) = x^z$, then the derivative, $f'(x)$ is $f'(x) = z*x^{z - 1}$.

Remember why this is so powerful.  If we have a function like:

$g(x) = x^2$ then $g'(x) = 2x^ {2- 1} = 2x$ 

And once we calculate the derivative, if we want to find the instantaneous rate of change at any given value, we just need to plug in that value for $x$.  

So $g'(3) = 2*3 = 6$ and $g'(4) = 2*4 = 8$.  That's it.


### Answers

$z(x) = x^3$, $z'(x) = 3x^2$


$f(x) = 5x^6$, $f'(x) = 5*6x^5 = 30x^5$

$h(x) = 2x^4  + x^2 + 200x$, $h'(x) = 8x^3 + 2x + 200$