###### Content under Creative Commons Attribution license CC-BY 4.0, code under BSD 3-Clause License © 2021 Lorena A. Barba

# Linear regression by gradient descent

This module of _Engineering Computations_ takes a step-by-step approach to introduce you to the essential ideas of deep learning, an algorithmic technology that is taking the world by storm. 
It is at the core of the artificial intelligence boom, and we think every scientist and engineer should understand at least the basics. 

Another term for deep learning is deep neural networks. 
In this module, you will learn how neural-network models are built, computationally. 
The inspiration for deep learning may have been how the brain works, but in practice what we have is a method to build models, using mostly linear algebra and a little bit of calculus. 
These models are not magical, or even "intelligent"—they are just about _optimization_, which every engineer knows about!

We start with the very basics of model-building: linear regression. 
In Module 1 of the _Engineering Computations_ series, we discuss [linear regression with real data](http://go.gwu.edu/engcomp1lesson5), and find the model parameters (slope and $y$-intercept) analytically. 
Let's forget about that for this lesson. 
The key concept here will be _gradient descent_. Start your ride here!

## Gradient descent

This lesson is partly based on a tutorial at the 2019 SciPy Conference by Eric Ma [1]. He begins his tutorial by presenting the idea of _gradient descent_ with a simple quadratic function, and asks: how do we find this function's minimum?

$$f(w) = w^2 +3w -5$$

We know from calculus that at the minimum, the derivative of the function is zero (the tangent to the function is horizontal), and the second derivative is positive (the function curves _upward_ on each side of the minimum). 
The analytical derivative of the function above is $f^\prime(w) = 2w + 3$ and the second derivative is $f^{\prime\prime}(w)=2>0$. Thus, we set $2w+3=0$ and solve for $w$ to find the location of the minimum.
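
Worked out by hand, the zero-derivative condition gives the location of the minimum, and substituting back into $f$ gives the value of the function there:

$$2w+3=0 \;\Rightarrow\; w = -\frac{3}{2}, \qquad f\left(-\tfrac{3}{2}\right) = \frac{9}{4} - \frac{9}{2} - 5 = -\frac{29}{4}$$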

Let's play with this function using SymPy. We'll later use NumPy, and make plots with Matplotlib, so we load all the libraries in one place.

In [None]:
import sympy
import numpy

from matplotlib import pyplot
%matplotlib inline

We run this SymPy function to get beautifully typeset symbols and equations (in the Jupyter notebook, it will use MathJax by default): 

In [None]:
sympy.init_printing()

Now we'll define the Python variable `w` to be a SymPy symbol, create the expression `f` to match the mathematical function above, and plot it.

In [None]:
w = sympy.Symbol('w', real=True)

f = w**2 + 3*w - 5
f

In [None]:
sympy.plotting.plot(f);

A neat parabola. We can see from the plot that the minimum of $f(w)$ is reached somewhere between $w=-2.5$ and $w=0$. SymPy can tell us the derivative, and the value where it is zero:

In [None]:
fprime = f.diff(w)
fprime

In [None]:
sympy.solve(fprime, w)

That looks about right: $-3/2$.
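
The symbolic answer also gives us something to check a numerical search against. Below is a minimal sketch of the gradient-descent idea this section is named after: start from a guess and repeatedly take a small step against the derivative. The starting guess, the step size `rate`, the number of iterations, and the helper name `fprime_num` are all assumptions chosen for illustration, not values taken from the tutorial.

```python
import sympy

# Rebuild the function and its derivative so this cell runs on its own.
w = sympy.Symbol('w', real=True)
f = w**2 + 3*w - 5
fprime = f.diff(w)

# Convert the symbolic derivative into a plain numerical function
# (fprime_num is a name introduced here for illustration).
fprime_num = sympy.lambdify(w, fprime)

w_guess = 5.0   # assumed starting guess
rate = 0.1      # assumed step size
for _ in range(200):
    # Step in the direction opposite to the slope.
    w_guess = w_guess - rate * fprime_num(w_guess)

# Compare with the exact root of the derivative found by SymPy.
w_exact = sympy.solve(fprime, w)[0]
print(w_guess, float(w_exact))
```

With these choices, each step shrinks the distance to the minimum by a constant factor, so the guess settles onto the analytical value. A step size that is too large would make the iterates overshoot instead, so treat `rate` as something to experiment with.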

## References

1. Eric Ma, "Deep Learning Fundamentals: Forward Model, Differentiable Loss Function & Optimization," SciPy 2019 tutorial. [video on YouTube](https://youtu.be/JPBz7-UCqRo) and [archive on GitHub](https://github.com/ericmjl/dl-workshop/releases/tag/scipy2019).

In [None]:
# Execute this cell to load the notebook's style sheet, then ignore it
from IPython.display import HTML
css_file = '../style/custom.css'
# Read the style sheet and close the file before rendering it
with open(css_file, "r") as css:
    styles = css.read()
HTML(styles)