In [3]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


In [5]:
pip install autograd

Note: you may need to restart the kernel to use updated packages.


<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/3c/AutomaticDifferentiationNutshell.png/1920px-AutomaticDifferentiationNutshell.png">

#### In mathematics and computer algebra, automatic differentiation (AD), also called algorithmic differentiation, computational differentiation,[1][2] auto-differentiation, or simply autodiff, is a set of techniques to numerically evaluate the derivative of a function specified by a computer program. AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.) and elementary functions (exp, log, sin, cos, etc.). By applying the chain rule repeatedly to these operations, derivatives of arbitrary order can be computed automatically, accurately to working precision, and using at most a small constant factor more arithmetic operations than the original program.
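To make the chain-rule idea above concrete, here is a minimal sketch of forward-mode AD using dual numbers, written in plain Python. The `Dual` class and the `sin`, `derivative`, and `f` helpers are hypothetical names introduced only for this illustration; this is not how Autograd is implemented (Autograd uses reverse mode), just one small instance of the general technique.

```python
import math

class Dual:
    """Hypothetical dual number: `val` is the value, `der` the derivative."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def sin(x):
    # Chain rule for an elementary function: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def derivative(f, x):
    # Seed the input with derivative 1 and read the derivative of the output
    return f(Dual(x, 1.0)).der

def f(x):
    return x * sin(x) + 3 * x

d = derivative(f, 2.0)
exact = math.sin(2.0) + 2.0 * math.cos(2.0) + 3.0  # hand-derived f'(2)
print(d, exact)
```

Each overloaded operation propagates a value and a derivative together, so the derivative of the whole program falls out of the derivatives of its elementary steps.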

In [6]:
import autograd.numpy as np
from autograd import grad

In [7]:
# Tester
print("Hello Autograd")

Hello Autograd


In [8]:
def taylor_sine(x):
    ans = currterm = x
    i = 0
    while np.abs(currterm) > 0.001:
        currterm = -currterm * x**2 /((2 * i + 3) * (2 * i + 2))
        ans = ans + currterm
        i += 1
    return ans

In [10]:
grad_sine = grad(taylor_sine)
print("Gradient of sin(pi) is ",grad_sine(np.pi))

Gradient of sin(pi) is  -0.9998995297042174
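One way to sanity-check the value above is a central finite difference, sketched here in plain Python without Autograd. `taylor_sine_plain` repeats the series from the earlier cell using the built-in `abs`, and `central_difference` with step `h=1e-5` is a hypothetical helper chosen for this illustration. The result should sit close to both the Autograd output and `cos(pi) = -1`; the small gap from -1 comes from truncating the series once a term drops below 0.001.

```python
import math

def taylor_sine_plain(x):
    # Same truncated Taylor series as taylor_sine, using built-in abs
    ans = currterm = x
    i = 0
    while abs(currterm) > 0.001:
        currterm = -currterm * x**2 / ((2 * i + 3) * (2 * i + 2))
        ans = ans + currterm
        i += 1
    return ans

def central_difference(f, x, h=1e-5):
    # Symmetric difference quotient; h is an assumed step size
    return (f(x + h) - f(x - h)) / (2 * h)

fd = central_difference(taylor_sine_plain, math.pi)
print("Finite-difference slope at pi:", fd)
print("cos(pi):", math.cos(math.pi))
```

Unlike the finite difference, automatic differentiation differentiates the operations that were actually executed, so it has no step-size trade-off.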


## <font color='red'> A common use case for automatic differentiation is training a probabilistic model. Here we have a very simple (but complete) example of specifying and training a logistic regression model for binary classification.</font>

In [11]:
def sigmoid(x):
    return 0.5 * (np.tanh(x/2.)+1)
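The `tanh` form above is algebraically the same as the usual logistic function 1 / (1 + exp(-x)). The following sketch checks that numerically in plain Python; `sigmoid_tanh`, `sigmoid_exp`, and the sample points are names and values assumed for this illustration.

```python
import math

def sigmoid_tanh(x):
    # Form used in this notebook
    return 0.5 * (math.tanh(x / 2.0) + 1)

def sigmoid_exp(x):
    # Textbook logistic function
    return 1.0 / (1.0 + math.exp(-x))

sample_points = (-5.0, -1.0, 0.0, 0.5, 3.0)
max_diff = max(abs(sigmoid_tanh(x) - sigmoid_exp(x)) for x in sample_points)
print("Largest difference between the two forms:", max_diff)
```

The `tanh` version avoids computing `exp(-x)` directly, which can overflow for large negative `x`.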

In [12]:
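# Outputs probability of a label being true according to logistic model.
# The parameter must be named `weights` to match the body; otherwise the
# function reads the global `weights` and the gradient comes out as zero.
def logistic_predictions(weights, inputs):
    return sigmoid(np.dot(inputs, weights))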

In [13]:
# Training loss is the negative log-likelihood of the training labels
def training_loss(weights):
    preds = logistic_predictions(weights,inputs)
    label_probabilities = preds * targets + (1 - preds) * (1 - targets)
    return -np.sum(np.log(label_probabilities))
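
Written out, the loss computed by `training_loss` can be sketched as follows. The symbols $x_i$ (row $i$ of `inputs`), $t_i$ (entry $i$ of `targets`), $w$ (`weights`), and $\sigma$ (`sigmoid`) are notation assumed here, not taken from the notebook.

```latex
p_i = \sigma(x_i \cdot w), \qquad
L(w) = -\sum_i \log\bigl(p_i t_i + (1 - p_i)(1 - t_i)\bigr)
```

For binary targets $t_i \in \{0, 1\}$, each term reduces to $-\log p_i$ when $t_i = 1$ and to $-\log(1 - p_i)$ when $t_i = 0$.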

In [14]:
# Building a toy dataset
inputs = np.array([[0.52, 1.12, 0.77],
                   [0.88, -1.08, 0.15],
                   [0.52, 0.06, -1.30],
                   [0.74, -2.49, 1.39]])
targets = np.array([True, True, False, True])

In [15]:
# Define a function that returns gradients of training loss using Autograd
training_gradient_fun = grad(training_loss)


In [16]:
# Optimize weights using gradient descent
weights = np.array([0.0,0.0,0.0])
print("Initial loss: ",training_loss(weights))

Initial loss:  2.772588722239781


In [17]:
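for i in range(100):
    # Step against the gradient with a learning rate of 0.01
    weights -= training_gradient_fun(weights) * 0.01

# The trained loss should be lower than the initial loss; if it is identical,
# the loss does not depend on the `weights` argument and the gradient is zero.
print("Trained loss: ", training_loss(weights))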
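
As a cross-check on the training loop, the sketch below repeats the same 100 gradient-descent steps in plain Python, with a hand-derived gradient in place of Autograd. For this loss the gradient with respect to the weights is the sum over examples of `(p - t) * x`. The names `toy_inputs`, `toy_targets`, `toy_loss`, `analytic_gradient`, and `w` are hypothetical and chosen to avoid clobbering the notebook's variables.

```python
import math

toy_inputs = [[0.52, 1.12, 0.77],
              [0.88, -1.08, 0.15],
              [0.52, 0.06, -1.30],
              [0.74, -2.49, 1.39]]
toy_targets = [1.0, 1.0, 0.0, 1.0]  # True/False encoded as 1.0/0.0

def toy_sigmoid(z):
    return 0.5 * (math.tanh(z / 2.0) + 1)

def toy_predictions(w):
    # Probability of the label being true for each row
    return [toy_sigmoid(sum(wj * xj for wj, xj in zip(w, row)))
            for row in toy_inputs]

def toy_loss(w):
    # Negative log-likelihood of the labels
    total = 0.0
    for p, t in zip(toy_predictions(w), toy_targets):
        total -= math.log(p * t + (1 - p) * (1 - t))
    return total

def analytic_gradient(w):
    # Hand-derived gradient: sum over examples of (p - t) * x
    g = [0.0] * len(w)
    for p, t, row in zip(toy_predictions(w), toy_targets, toy_inputs):
        for j, xj in enumerate(row):
            g[j] += (p - t) * xj
    return g

w = [0.0, 0.0, 0.0]
initial_loss = toy_loss(w)
for _ in range(100):
    g = analytic_gradient(w)
    w = [wj - 0.01 * gj for wj, gj in zip(w, g)]
trained_loss = toy_loss(w)

print("Initial loss:", initial_loss)
print("Trained loss:", trained_loss)
```

If the Autograd version is wired up correctly, its trained loss should agree with this one up to floating-point rounding.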


