## Mathematical Optimization Series

In this brief post we motivate the study of **mathematical optimization**, the collection of methods built on basic calculus by which we determine proper parameters for machine learning / deep learning models. Viewed geometrically, the pursuit of proper parameters is also the search for the lowest point, or minimum, of a machine learning model's associated cost function.

# Part 1: Motivation

Every machine learning / deep learning problem has parameters that must be tuned properly to ensure optimal learning. For example, simple linear regression, where we fit a line to a scatter of data, has two parameters that must be properly tuned: the slope and intercept of the linear model.

These two parameters are tuned by forming what is called a *cost function* or *loss function*. This is a continuous function of both parameters that measures how well the linear model fits a dataset given a value for its slope and intercept. The proper tuning of these parameters via the cost function corresponds geometrically to finding the values for the parameters that make the cost function as small as possible or, in other words, *minimize* the cost function. The image below, taken from [[1]](#references), illustrates how choosing a set of parameters higher on the cost function results in a corresponding line fit that is poorer than the one corresponding to parameters at the lowest point on the cost surface.

<img src="../../mlrefined_images/math_optimization_images/bigpicture_regression_optimization.png" width=500 height=250/>
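
To make the idea of a cost function concrete, here is a minimal sketch in Python. It assumes a least squares cost (the average squared error of the line's predictions), which is one common choice and is not specified in the text above. The toy dataset and the two parameter settings are hypothetical, chosen only for illustration.

```python
# Hypothetical toy dataset: points lying close to the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

def least_squares_cost(slope, intercept, xs, ys):
    """Average squared error between the line's predictions and the data.

    Least squares is an assumed choice of cost; other costs are possible.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = slope * x + intercept
        total += (prediction - y) ** 2
    return total / len(xs)

# A setting far from the data's trend versus one close to it.
poor_fit = least_squares_cost(0.5, 3.0, xs, ys)
good_fit = least_squares_cost(2.0, 1.0, xs, ys)

print(f"cost of poor fit: {poor_fit:.3f}")
print(f"cost of good fit: {good_fit:.3f}")
```

The parameters that sit lower on the cost function give the line that follows the data more closely, which is the relationship the figure depicts.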

This same idea holds true for regression with higher dimensional input, as well as for classification, where we must properly tune parameters to *separate* classes of data.
Again, the parameters minimizing an associated cost function provide the best classification result. This is illustrated for classification below, in a figure also taken from [[1]](#references).

<img src="../../mlrefined_images/math_optimization_images/bigpicture_classification_optimization.png" width=500 height=250/>

The tuning of these parameters, or the *minimization of a cost function*, is accomplished by a set of tools known collectively as **mathematical optimization**. These algorithms are built from the basic components of vector calculus described in our *Vital Elements of Calculus* series.

> The tools of mathematical optimization are designed to minimize cost functions.  When applied to learning problems this corresponds to properly tuning the parameters of a learning model.
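
As a rough illustration of what such a tool can look like, the sketch below uses gradient descent to tune the slope and intercept of a line. Gradient descent is one well-known optimization algorithm, swapped in here as an assumption; the text above does not single out any particular method. The dataset, the least squares cost, the step size, and the number of steps are all hypothetical choices made for this example.

```python
# Hypothetical toy dataset: points lying close to the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

def cost(slope, intercept):
    """Least squares cost (an assumed choice): average squared error."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(slope, intercept):
    """Partial derivatives of the cost with respect to slope and intercept."""
    n = len(xs)
    d_slope = sum(2 * (slope * x + intercept - y) * x for x, y in zip(xs, ys)) / n
    d_intercept = sum(2 * (slope * x + intercept - y) for x, y in zip(xs, ys)) / n
    return d_slope, d_intercept

def gradient_descent(slope, intercept, step_size=0.05, num_steps=2000):
    """Repeatedly step downhill on the cost surface (illustrative settings)."""
    for _ in range(num_steps):
        d_slope, d_intercept = gradient(slope, intercept)
        slope -= step_size * d_slope
        intercept -= step_size * d_intercept
    return slope, intercept

start = (0.0, 0.0)
tuned = gradient_descent(*start)

print(f"cost at start: {cost(*start):.3f}")
print(f"cost after tuning: {cost(*tuned):.3f}")
print(f"tuned slope and intercept: {tuned[0]:.3f}, {tuned[1]:.3f}")
```

Each step moves the parameters in the direction that decreases the cost, so the tuned parameters sit lower on the cost surface than the starting point.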


Mathematical optimization is the workhorse of machine learning / deep learning, playing a role in virtually every learning problem.  In this series of posts we describe the major concepts and algorithms of mathematical optimization used in practice today for machine learning / deep learning problems.

<a id='references'></a>
## References

[1]  Jeremy Watt, Reza Borhani, and Aggelos Katsaggelos. Machine Learning Refined. Cambridge University Press, 2016.