Logistic Regression Part 2
--------------------------

In this example, we extend the code from Part 1 with several important features:
- Instead of just updating the weight matrix ``W``, we add a bias ``b`` and use the ``.variables()`` method to compactly update both variables.
- We attach an additional computation to the transformer to compute the loss on a held-out validation dataset.
- We switch from a flat ``C``-dimensional feature space to a ``W x H`` feature space to demonstrate multi-dimensional logistic regression.

The corresponding Jupyter notebook can be found [here](https://github.com/NervanaSystems/ngraph/blob/master/examples/walk_through/Logistic_Regression_Part_2.ipynb).

In [None]:
import ngraph as ng
import ngraph.transformers as ngt
import gendata

The axes creation is the same as before, except we now add a new axis ``H`` to represent the new feature space.

In [None]:
ax = ng.make_name_scope(name="ax")

ax.W = ng.make_axis(length=4)
ax.H = ng.make_axis(length=1)  # new axis added
ax.N = ng.make_axis(length=128, batch=True)

### Building the graph
Our model has three placeholders: ``X``, ``Y``, and ``alpha``. The input ``X`` now has shape ``(W, H, N)``:

In [None]:
alpha = ng.placeholder(())
X = ng.placeholder([ax.W, ax.H, ax.N])  # now has shape (W, H, N)
Y = ng.placeholder([ax.N])

Similarly, the weight matrix is now multi-dimensional, with shape ``(W, H)``, and we add a new scalar bias variable.

In [None]:
W = ng.variable([ax.W - 1, ax.H - 1], initial_value=0).named('W')  # now has shape (W, H)
b = ng.variable((), initial_value=0).named('b')

Our predicted output now includes the bias ``b``:

In [None]:
Y_hat = ng.sigmoid(ng.dot(W, X) + b)
L = ng.cross_entropy_binary(Y_hat, Y, out_axes=()) / ng.batch_size(Y_hat)
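
To make the contraction in ``ng.dot(W, X)`` concrete, the following NumPy sketch reproduces the same forward pass outside of ngraph. It assumes that the dot product sums over both the ``W`` and ``H`` axes, leaving one logit per batch element. The helper names (``forward``, ``mean_binary_cross_entropy``) and the random inputs are illustrative and not part of the ngraph API.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(W, b, X):
    """Predict probabilities for a batch.

    W has shape (W, H) and X has shape (W, H, N). Contracting over the two
    feature axes leaves one logit per example, shape (N,).
    """
    logits = np.tensordot(W, X, axes=([0, 1], [0, 1])) + b
    return sigmoid(logits)


def mean_binary_cross_entropy(y_hat, y, eps=1e-12):
    """Binary cross-entropy summed over the batch, divided by batch size."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid log(0)
    total = -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
    return total / y.shape[0]


# Illustrative shapes matching the axes above: W=4, H=1, N=128.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 1, 128))
Y = rng.integers(0, 2, size=128).astype(float)
W = np.zeros((4, 1))
b = 0.0

Y_hat = forward(W, b, X)
loss = mean_binary_cross_entropy(Y_hat, Y)
print(Y_hat.shape, loss)
```

With all parameters at zero, every prediction is 0.5, so the mean loss equals ln 2 (about 0.693) regardless of the labels. This is a handy sanity check for an untrained model.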

For the parameter updates, instead of explicitly specifying the variables ``W`` and ``b``, we can call ``L.variables()`` to retrieve all the variables that the loss function depends on:

In [None]:
print([var.name for var in L.variables()])

For complicated graphs, the ``variables()`` method makes it easy to iterate over all the variables that an op depends on. Our new parameter update is then:

In [None]:
updates = [ng.assign(v, v - alpha * ng.deriv(L, v) / ng.batch_size(Y_hat))
           for v in L.variables()]

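The ``ng.deriv`` function computes the gradient of ``L`` with respect to each variable by backpropagation (autodiff). We are almost done. We combine the individual updates into a single op with ``ng.doall``, so that one computation updates both ``W`` and ``b``: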

In [None]:
all_updates = ng.doall(updates)
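
For reference, the gradients that autodiff produces for this kind of model have a simple closed form. The NumPy sketch below is an illustration, not ngraph code: it assumes a mean cross-entropy loss and the ``(W, H, N)`` input layout, and the names ``loss_fn``, ``gradients``, and ``sgd_step`` are hypothetical. It checks one analytic gradient entry against a finite difference and then applies a single gradient-descent step to both parameters.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def loss_fn(W, b, X, Y):
    """Mean binary cross-entropy of sigmoid(<W, X> + b)."""
    y_hat = sigmoid(np.tensordot(W, X, axes=([0, 1], [0, 1])) + b)
    return -np.mean(Y * np.log(y_hat) + (1.0 - Y) * np.log(1.0 - y_hat))


def gradients(W, b, X, Y):
    """Closed-form gradients of loss_fn with respect to W and b."""
    y_hat = sigmoid(np.tensordot(W, X, axes=([0, 1], [0, 1])) + b)
    err = (y_hat - Y) / Y.shape[0]                   # shape (N,)
    grad_W = np.tensordot(X, err, axes=([2], [0]))   # shape (W, H)
    grad_b = np.sum(err)                             # scalar
    return grad_W, grad_b


def sgd_step(W, b, X, Y, alpha):
    """Return updated copies of W and b after one gradient-descent step."""
    grad_W, grad_b = gradients(W, b, X, Y)
    return W - alpha * grad_W, b - alpha * grad_b


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 1, 128))
Y = rng.integers(0, 2, size=128).astype(float)
W = rng.normal(scale=0.1, size=(4, 1))
b = 0.0

# Central finite-difference check of one entry of grad_W.
grad_W, grad_b = gradients(W, b, X, Y)
h = 1e-6
W_plus = W.copy()
W_plus[0, 0] += h
W_minus = W.copy()
W_minus[0, 0] -= h
numeric = (loss_fn(W_plus, b, X, Y) - loss_fn(W_minus, b, X, Y)) / (2 * h)
print(grad_W[0, 0], numeric)

# One update of both parameters with a small, illustrative step size.
loss_before = loss_fn(W, b, X, Y)
W_new, b_new = sgd_step(W, b, X, Y, alpha=0.1)
loss_after = loss_fn(W_new, b_new, X, Y)
print(loss_before, loss_after)
```

The list comprehension over ``L.variables()`` plays the role of ``sgd_step`` here: every variable receives the same update rule, so adding a parameter to the model does not require touching the update code.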

### Computation

We have our update computation as before, but we also add an evaluation computation that computes the loss on a separate dataset without performing the updates:


In [None]:
transformer = ngt.make_transformer()

update_fun = transformer.computation([L, W, b, all_updates], alpha, X, Y)
eval_fun = transformer.computation(L, X, Y)

For convenience, we define a function that computes the average cost across the validation set.

In [None]:
def avg_loss(xs, ys):
    # L is already averaged over each batch, so average over the batches
    total_loss = 0
    for x, y in zip(xs, ys):
        loss_val = eval_fun(x, y)
        total_loss += loss_val
    return total_loss / len(xs)

We then generate our training and evaluation sets and perform the updates. We emit the average loss on the validation set during training.

In [None]:
g = gendata.MixtureGenerator([.5, .5], (ax.W.length, ax.H.length))
XS, YS = g.gen_data(ax.N.length, 10)
EVAL_XS, EVAL_YS = g.gen_data(ax.N.length, 4)

print("Starting avg loss: {}".format(avg_loss(EVAL_XS, EVAL_YS)))
for i in range(10):
    for xs, ys in zip(XS, YS):
        loss_val, w_val, b_val, _ = update_fun(5.0 / (1 + i), xs, ys)
    print("After epoch %d: W: %s, b: %s, avg loss %s" % (i, w_val.T, b_val, avg_loss(EVAL_XS, EVAL_YS)))
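
The same train-and-validate pattern can be sketched end to end in plain NumPy. This is only an illustration under stated assumptions: ``make_batches`` is a hypothetical stand-in for ``gendata`` (two Gaussian blobs centered at -1 and +1, which may differ from what ``MixtureGenerator`` produces), and the step size ``0.5 / (1 + epoch)`` is chosen for this sketch rather than taken from the example above.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def predict(W, b, X):
    """Contract W (W, H) with X (W, H, N) and apply the sigmoid."""
    return sigmoid(np.tensordot(W, X, axes=([0, 1], [0, 1])) + b)


def batch_loss(W, b, X, Y):
    """Mean binary cross-entropy over one batch."""
    y_hat = np.clip(predict(W, b, X), 1e-12, 1.0 - 1e-12)
    return -np.mean(Y * np.log(y_hat) + (1.0 - Y) * np.log(1.0 - y_hat))


def make_batches(rng, shape, batch_size, n_batches):
    """Hypothetical data source: two Gaussian blobs at -1 and +1."""
    xs, ys = [], []
    for _ in range(n_batches):
        y = rng.integers(0, 2, size=batch_size).astype(float)
        centers = 2.0 * y - 1.0  # -1 or +1 per example
        x = rng.normal(size=shape + (batch_size,)) + centers
        xs.append(x)
        ys.append(y)
    return xs, ys


def avg_loss(W, b, xs, ys):
    """Average of the per-batch mean losses (batches have equal size)."""
    return sum(batch_loss(W, b, x, y) for x, y in zip(xs, ys)) / len(xs)


rng = np.random.default_rng(0)
shape = (4, 1)
train_xs, train_ys = make_batches(rng, shape, 128, 10)
eval_xs, eval_ys = make_batches(rng, shape, 128, 4)

W = np.zeros(shape)
b = 0.0
start_loss = avg_loss(W, b, eval_xs, eval_ys)
print("Starting avg loss: {}".format(start_loss))

for epoch in range(10):
    alpha = 0.5 / (1 + epoch)  # decaying step size (assumed constant)
    for x, y in zip(train_xs, train_ys):
        err = (predict(W, b, x) - y) / y.shape[0]
        W = W - alpha * np.tensordot(x, err, axes=([2], [0]))
        b = b - alpha * np.sum(err)
    print("After epoch %d: avg loss %s" % (epoch, avg_loss(W, b, eval_xs, eval_ys)))

final_loss = avg_loss(W, b, eval_xs, eval_ys)
```

The validation batches are never used for updates, so a falling validation loss indicates that the model generalizes to unseen samples from the same distribution rather than memorizing the training batches.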