In [6]:
import numpy as np
import descent
import matplotlib.pyplot as plt
%matplotlib inline



# Introduction to the descent package

Let's say we want to optimize the following:

$$ \min_x f(x) $$

given computable expressions for $f(x)$ and its gradient $\nabla f(x)$.

For example, here's a toy function (a quadratic):

In [7]:
def f_df(x):
    # Objective f(x) = x^2 / 2 and its gradient f'(x) = x, for scalar x
    objective = 0.5 * x ** 2
    gradient = float(x)
    return objective, gradient

Functions like this one, which return both the objective and its gradient, are at the core of the descent package.

## Organization

Descent contains a number of algorithms for unconstrained and constrained optimization. They fall into two groups: _first-order gradient-based algorithms_ and _proximal algorithms_.

First-order gradient-based algorithms:
- Gradient descent (can be stochastic, with or without momentum)
- [Stochastic average gradient (SAG)](http://arxiv.org/abs/1309.2388)
- RMSProp
- [Adam](http://arxiv.org/abs/1412.6980)

Proximal algorithms:
- Proximal gradient descent
- Alternating direction method of multipliers (ADMM)
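
To illustrate how the two families differ, here is a minimal NumPy sketch that does not use the descent API. A proximal gradient step takes an ordinary gradient step on the smooth part of the objective, then applies the proximal operator of the non-smooth part. The example objective, the penalty weight `lam`, and the helper names `soft_threshold` and `proximal_gradient` are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(v, threshold):
    """Proximal operator of threshold * |x| (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)


def proximal_gradient(grad_smooth, x0, lam, learning_rate=0.5, num_steps=100):
    """Minimize smooth(x) + lam * |x| for scalar x (illustrative sketch)."""
    x = float(x0)
    for _ in range(num_steps):
        # Gradient step on the smooth term only
        x = x - learning_rate * grad_smooth(x)
        # Proximal step accounts for the non-smooth L1 penalty
        x = float(soft_threshold(x, learning_rate * lam))
    return x


# Hypothetical example: smooth(x) = 0.5 * (x - 3)^2, penalty = 1.0 * |x|
x_star = proximal_gradient(lambda x: x - 3.0, x0=0.0, lam=1.0)
```

For this objective the minimizer is $x = 2$ (soft-thresholding 3 by 1), and the iterates approach it geometrically.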

In [8]:
x0 = 5.

In [17]:
opt = descent.sgd(f_df, x0, learning_rate=1e-1)

In [18]:
descent.check_grad(f_df, 24)

------------------------------------
Numerical  | Analytic   | Error          
------------------------------------
24.0000    | 24.0000    | 0.000000 | ✔
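
The table compares a numerical estimate of the gradient with the analytic value returned by `f_df`. How `descent.check_grad` computes its numerical column is not shown in this notebook; the sketch below assumes a central finite difference, and the helper name `numerical_gradient` and step size `eps` are hypothetical.

```python
def f_df(x):
    # Same toy quadratic as above: f(x) = x^2 / 2, f'(x) = x
    objective = 0.5 * x ** 2
    gradient = float(x)
    return objective, gradient


def numerical_gradient(f_df, x, eps=1e-6):
    """Central-difference estimate of the derivative of a scalar objective."""
    f_plus, _ = f_df(x + eps)
    f_minus, _ = f_df(x - eps)
    return (f_plus - f_minus) / (2.0 * eps)


x = 24.0
numeric = numerical_gradient(f_df, x)   # finite-difference estimate
_, analytic = f_df(x)                   # gradient reported by f_df
error = abs(numeric - analytic)         # should be close to zero
```

A large value of `error` would suggest that the gradient returned by `f_df` does not match its objective.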


In [19]:
opt.display = None
opt.run()

In [21]:
opt.theta

8.739356258613211e-46
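
For this quadratic, each plain gradient step multiplies $x$ by $(1 - \text{learning rate}) = 0.9$, so after $k$ steps $x_k = 5 \cdot 0.9^k$. The value above is consistent with $k = 1000$, since $5 \cdot 0.9^{1000} \approx 8.74 \times 10^{-46}$. The step count is inferred from that number and is an assumption, not a documented default of descent. The function `gradient_descent` below is a hypothetical stand-in, not the library's implementation.

```python
def f_df(x):
    # Same toy quadratic as above: f(x) = x^2 / 2, f'(x) = x
    objective = 0.5 * x ** 2
    gradient = float(x)
    return objective, gradient


def gradient_descent(f_df, x0, learning_rate=1e-1, num_steps=1000):
    """Plain gradient descent on a scalar objective (illustrative sketch)."""
    x = x0
    for _ in range(num_steps):
        _, gradient = f_df(x)
        # Move against the gradient, scaled by the learning rate
        x = x - learning_rate * gradient
    return x


theta = gradient_descent(f_df, 5.0)
closed_form = 5.0 * 0.9 ** 1000  # x_k = x_0 * (1 - learning_rate)^k
```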