# Derivatives of straight lines

### Learning objectives

* Understand that derivatives are the instantaneous rate of change of a function
* Understand how to calculate a derivative of a straight line

### Introduction

In the lesson on step sizes for our gradient descent algorithm, we filled in more detail on how to find the "best fit" regression line using gradient descent.  Namely, we learned how to more carefully change the y-intercept of the regression line to minimize the residual sum of squares.  

We did this by calibrating the size and direction of our change to a regression line parameter -- let's say $b$, our y-intercept -- to the slope of the line tangent to the cost curve at that value of $b$. By tangent line, we mean a line that "just touches" our curve at a given point.  

So below is a curve that shows the RSS of a regression line with different values of $b$.  Our orange, green, and red lines are each tangent to the curve at their respective points. 

![](./tangent-lines.png)

With our gradient descent algorithm, the larger the absolute value of the slope, the larger the change in our regression line parameter -- that is, the larger our step size.  So we take a much larger step when our slope is -146.17 at $b = 70$ than we do when our slope equals -58.51 at $b = 85$.
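
As a minimal sketch of this relationship, the snippet below turns the two slopes from the plot into step sizes. The `learning_rate` value and the `step_size` helper are assumptions made only for illustration; they are not taken from the earlier lesson.

```python
# Slopes of the tangent lines at b = 70 and b = 85, as given in the text.
slope_at_b_70 = -146.17
slope_at_b_85 = -58.51

# Hypothetical learning rate, chosen only for illustration.
learning_rate = 0.01


def step_size(slope, learning_rate):
    """Return the change applied to b: proportional to the slope, opposite in sign."""
    return -learning_rate * slope


step_at_70 = step_size(slope_at_b_70, learning_rate)
step_at_85 = step_size(slope_at_b_85, learning_rate)

# The steeper slope at b = 70 produces the larger step.
print(step_at_70 > step_at_85)
```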

So here is what **we know so far:** 
* Our gradient descent technique depends on changing our values according to the slope of our cost curve

Here is **what we do not know:**
* How to find that slope or rate of change of a function at a given point.  

The instantaneous rate of change of a function at a given point is called the function's derivative.  

> The instantaneous rate of change of a function is called the **derivative**.  

Derivatives are so important because they tell us how a function is changing at any given point.  If we see how something is changing there is a good chance we can see what is coming next.  Derivatives allow us to see what is coming next.  

But all a derivative means is the instantaneous rate of change of a function.  And if our functions are straight lines, you likely know how to calculate that rate of change: rise over run.  We'll focus on calculating the derivatives of straight line functions, or linear functions, and will move on to calculating the derivative of curved lines (like our cost curve) in a future lesson.   

### Understanding the rate of change

Let's say that we want a function that  represents a person taking a jog.  We'll represent this by drawing a straight line.

![](./running-miles.png)

The graph above helps us see how distance changes in relation to time, or in other words speed.  So here, when we ask about rate of change, we're asking how fast is our jogger travelling? 

### Calculating the rate of change

To calculate the miles per hour, we can simply see where a person is at a given time, then wait an hour and see how far they travelled.  Or we can wait two hours and divide the distance travelled by two.  Generically, our technique is to divide the number of miles travelled by the number of hours passed.  In this specific example, we'll imagine doing the following to calculate the speed at hour 1.

> * Start a stopwatch after one hour and see the distance at that hour
> * Then, let time elapse one hour and see the distance at that next hour.  
> * Then divide the difference in the distances by the elapsed time.  

In the below graph, we begin to calculate the speed at hour number one.

![](./deltaxdeltay.png)

So we calculate the speed at hour one by seeing where our jogger starts and ends in that hour.  Our jogger went from mile three to mile six -- indicated by the orange line.  And so miles per hour is:

$$ \frac{\text{miles}}{\text{hour}} = \frac{\text{end distance} - \text{start distance}}{\text{end time} - \text{start time}} = \frac{6 - 3}{2 - 1} = 3 $$

Moving from miles per hour to **calculating the rate of change generically**, the rate of change is the change in y divided by the change in x. 

* And another way of expressing **change in y** is:  
   * $y_2 - y_1$ or $\Delta y$, read delta y 
* And another way of expressing **change in x** is:  
   * $x_2 - x_1$ or $\Delta x$, read delta x

And generically we can say that: 

* rate of change $= \frac{rise}{run} = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$

Just like in our example, between hours four and six we see: 

* miles per hour $= \frac{distance_2 - distance_1}{time_2 - time_1} = \frac{18 - 12}{6 - 4} = \frac{6}{2} = 3$ mph
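
The rise over run formula can be sketched in Python as follows. The function name `rate_of_change` is a hypothetical helper introduced here; the points come from the jogging example (mile 3 at hour 1, mile 6 at hour 2, mile 12 at hour 4, mile 18 at hour 6).

```python
def rate_of_change(x1, y1, x2, y2):
    """Rise over run between the points (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


# Hour 1 to hour 2: the jogger moves from mile 3 to mile 6.
speed_first_interval = rate_of_change(1, 3, 2, 6)

# Hour 4 to hour 6: the jogger moves from mile 12 to mile 18.
speed_second_interval = rate_of_change(4, 12, 6, 18)

print(speed_first_interval, speed_second_interval)  # 3.0 3.0
```

Both intervals give the same answer because the line is straight, so any two points on it produce the same slope.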

We're doing great.  Derivatives are the rate of change of a function at a given point.  For a linear function like the one we work with here, we calculate them through rise over run, or the change in y divided by the change in x, expressed $\frac{\Delta y}{\Delta x}$.  The rest of this lesson will simply introduce more math terms and symbols for expressing this same concept.  

> Stick with us. Fully understanding these will pay off when we take the derivative of more complex functions.

### Derivatives with *even more symbols* 

So when we calculated that the rate of change of our jogger is 3 miles per hour, we calculated the derivative.  The derivative is the rate of change.  Of course, we know that in math we express our functions as $f(x)$.  Let's do that here.

![](./fxderivative.png)

Now if we are given a function $f(x)$, we say the derivative of that function is $f'(x)$ -- read f prime of x. 

So we can already express the derivative of a linear function $f(x)$ many different ways: 

* $f'(x) = \frac{rise}{run} = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$

Take a look at the expression on the far right:
    
$$f'(x) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} $$ 

You see that we replaced $y_2 - y_1$ with $f(x_2) - f(x_1)$.  This makes sense, because really when we say $y_2$ and $y_1$, we mean the function's output at the second x value and the function's output at the first x value.  

And to indicate that we are calculating the derivative of $f(x)$ at a specific point, say hour 1, we write $f'(1)$.  That's the rate of change at hour 1.  So now we can plug in our values to calculate the derivative.  

* $x_1 = 1$ as hour 1 is our starting point
* $x_2 = 2$ as hour 2 is our ending point

giving us: 

$$f'(1) = \frac{f(2) - f(1)}{2 - 1} = \frac{6 - 3}{2 - 1} = 3 $$ 

So $f(x)$ equals the output at a given point.  And $f'(x)$ is the rate of change at a given point.  So then:
* $f(1)$ 
    * means the output at $x = 1$, or in our example, *the distance* at hour one, and 
* $f'(1)$ 
    * means the rate of change at $x = 1$, or in our example, *the speed* at hour one
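
To make the difference between $f(1)$ and $f'(1)$ concrete, here is a small sketch. It assumes the jogger's line is $f(x) = 3x$, which agrees with the values $f(1) = 3$ and $f(2) = 6$ used above; `derivative_between` is a hypothetical helper name.

```python
def f(x):
    """Distance in miles after x hours, assuming the jogger's line is f(x) = 3x."""
    return 3 * x


def derivative_between(f, x1, x2):
    """Rate of change of f between x1 and x2: (f(x2) - f(x1)) / (x2 - x1)."""
    return (f(x2) - f(x1)) / (x2 - x1)


distance_at_hour_one = f(1)                      # the output at x = 1
speed_at_hour_one = derivative_between(f, 1, 2)  # the rate of change at x = 1

print(distance_at_hour_one)  # 3
print(speed_at_hour_one)     # 3.0
```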

And because the jogger's speed never changes throughout, and the derivative is the rate of change, the derivative also never changes.  We can see this visually by plotting the distance for hours zero through five on the left and the speed for hours zero through five on the right.

![](./side-derivative.png)

> * To the left is a graph of $f(x) = 3x$ for different values of x.  
> * And to the right is a plot of the rate of change of that function, $f'(x)$, for different values of $x$.
> * So while *the distance* changes through time, *the speed*, or rate of change, stays the same.
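
The same comparison can be sketched numerically rather than graphically. The snippet below is only an illustration: it tabulates the distance $f(x) = 3x$ and the speed over the following hour for hours zero through five.

```python
def f(x):
    """Distance in miles after x hours for the line f(x) = 3x."""
    return 3 * x


hours = [0, 1, 2, 3, 4, 5]

# Distance at each hour: this list grows as time passes.
distances = [f(hour) for hour in hours]

# Speed over the hour that starts at each time: rise over run with a run of one hour.
speeds = [(f(hour + 1) - f(hour)) / ((hour + 1) - hour) for hour in hours]

print(distances)  # [0, 3, 6, 9, 12, 15]
print(speeds)     # [3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
```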

### Expressing the derivative in terms of change

Now our above formulas for calculating the derivative do the job, but they don't exactly express our technique in the example of our jogger.  Remember that our technique for calculating the jogger's speed is the following: 

> * Start a stopwatch after one hour and see the distance at that hour
> * Then, let time elapse one hour and see the distance at that hour.  
> * Then divide the difference in the distances by the elapsed time.  

This is what this looks like in terms of math: 

$$ f'(x) = \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} $$

Let's take a second to fully understand this new formula.  It's not going away.  

* $f'(x)$ is the rate of change at a given value, or here the speed at a given time


* $f(x)$ is the distance at a given time, and $f(x_1)$ is the distance at the starting time, $x_1$


* The elapsed time is $\Delta x$, the change in x.


* $f(x_1 + \Delta x)$ is the distance at the starting time plus the elapsed time 

This is the definition that we will often see.  It expresses our technique of calculating the derivative.  
* Subtract the output at one input, x, from the output at that initial input plus a change in x.  
* Then divide that difference by the change in x.  
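
These two steps translate almost word for word into code. This is a sketch for linear functions only; the names `derivative_of`, `x1`, and `delta_x` are illustrative choices, and `f` is again assumed to be the jogger's line $f(x) = 3x$.

```python
def f(x):
    """Distance in miles after x hours for the line f(x) = 3x."""
    return 3 * x


def derivative_of(f, x1, delta_x):
    """Rate of change of f starting at x1 over an elapsed change delta_x."""
    # Subtract the output at x1 from the output at x1 plus the change in x.
    difference = f(x1 + delta_x) - f(x1)
    # Then divide that difference by the change in x.
    return difference / delta_x


# Speed at hour 1, measured over half an hour, one hour, and two hours.
rates = [derivative_of(f, 1, delta_x) for delta_x in (0.5, 1, 2)]
print(rates)  # [3.0, 3.0, 3.0]
```

For a straight line, the result does not depend on how much time we let elapse, which is why every choice of `delta_x` here gives 3.0.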

So that is the derivative, or the rate of change of a linear function.  The rate of change answers how much our output is changing at a given point.

### Summary 

In this lesson, we saw that the derivative is the change in output per change in input.  So in the case of our jogger, with our input being time, we see that the derivative is the change in the runner's location (distance travelled) divided by the amount of time passed.

Much of what is tricky about derivatives is the notation used to express them.  Graphically, we see that the derivative is simply the rise over run, or the change in y divided by the change in x:

$$ f'(x) = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1} $$

Then we saw that we can express the derivative in terms of $f(x)$ instead of $y_1$ and $y_2$: the output at the second x minus the output at the first x, divided by the difference between the two x values.  Or, in an equation:

$$ f'(x) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} $$

And finally we saw how we can express the derivative in terms of $\Delta x$: subtract the output at an initial x value from the output at that initial x value plus some change in x, then divide by that change in x.  Or, in an equation:

$$ f'(x) = \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} $$
