# BIDMach: parameter tuning

In this notebook we'll explore automated parameter exploration by grid search. 

In [None]:
import $exec.^.lib.bidmach_notebook_init
if (Mat.hasCUDA > 0) GPUmem

## Dataset: Reuters RCV1 V2

The dataset is the widely used Reuters news article dataset RCV1 V2. This dataset and several others are loaded by running the script <code>getdata.sh</code> from the BIDMach/scripts directory. The data include both train and test subsets, and train and test labels (cats). 

In [None]:
var dir = "../data/rcv1/"             // adjust to point to the BIDMach/data/rcv1 directory
tic
val train = loadSMat(dir+"docs.smat.lz4")
val cats = loadFMat(dir+"cats.fmat.lz4")
val test = loadSMat(dir+"testdocs.smat.lz4")
val tcats = loadFMat(dir+"testcats.fmat.lz4")
toc

First, let's enumerate some parameter combinations for the learning rate (lrate) and the time exponent of the optimizer (texp):

In [None]:
val lrates = col(0.03f, 0.1f, 0.3f, 1f)        // 4 values
val texps = col(0.3f, 0.4f, 0.5f, 0.6f, 0.7f)  // 5 values

The next step is to enumerate all pairs of parameters. For now we can do this with the Kronecker product operator (⊗); eventually this will be a custom function:

In [None]:
val lrateparams = ones(texps.nrows, 1) ⊗ lrates
val texpparams = texps ⊗ ones(lrates.nrows,1)
lrateparams \ texpparams
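To make the ordering of the pairs explicit, here is a minimal plain-Scala sketch of the same enumeration. It does not use BIDMat types; the object name `GridSketch` is hypothetical, and it assumes ⊗ follows the standard Kronecker ordering, so that the learning rate varies fastest.

```scala
// Plain-Scala illustration of the grid; not BIDMat code.
object GridSketch {
  val lrates = Vector(0.03f, 0.1f, 0.3f, 1f)          // 4 values
  val texps  = Vector(0.3f, 0.4f, 0.5f, 0.6f, 0.7f)   // 5 values

  // Outer loop over texps, inner loop over lrates, so lrate varies fastest.
  // Under the standard Kronecker ordering this matches row k of
  // (lrateparams \ texpparams): lrates(k % 4) paired with texps(k / 4).
  val pairs: Vector[(Float, Float)] =
    for (t <- texps; l <- lrates) yield (l, t)
}
```

`GridSketch.pairs` has 4 × 5 = 20 entries, one per model trained later in the notebook.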

Here's the learner again:

In [None]:
val (mm, opts) = GLM.learner(train, cats, GLM.logistic)

To keep things simple, we'll focus on just one category and train many models for it, one per parameter pair. The "targmap" option specifies a mapping from the actual base categories to the model categories. We'll map the category at index 6 to all our models:

In [None]:
val nparams = lrateparams.length
val targmap = zeros(nparams, 103)
targmap(?,6) = 1
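The effect of such a mapping can be seen on a toy example. The sketch below is plain Scala rather than BIDMat code; the sizes (3 base categories, 4 documents, 2 models), the label values and the names `TargmapSketch` and `matmul` are all made up for illustration. Multiplying the 0/1 map by the label matrix copies the target category's labels into every model row.

```scala
// Toy illustration of a target map; all data here is hypothetical.
object TargmapSketch {
  // Base labels: ncats x ndocs (3 categories, 4 documents).
  val labels: Vector[Vector[Float]] = Vector(
    Vector(1f, 0f, 0f, 1f),
    Vector(0f, 1f, 0f, 0f),
    Vector(0f, 0f, 1f, 1f))

  val nmodels = 2   // number of models (parameter pairs) to train
  val target  = 2   // zero-based index of the one base category we care about

  // nmodels x ncats map with a 1 in the target column of every row.
  val targmap: Vector[Vector[Float]] =
    Vector.tabulate(nmodels, labels.length)((_, c) => if (c == target) 1f else 0f)

  // Dense matrix product a * b.
  def matmul(a: Vector[Vector[Float]], b: Vector[Vector[Float]]): Vector[Vector[Float]] =
    a.map(row => b.head.indices.toVector.map(j => row.indices.map(k => row(k) * b(k)(j)).sum))

  // Every model row now holds the labels of the target category.
  val virtualLabels: Vector[Vector[Float]] = matmul(targmap, labels)
}
```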

In [None]:
opts.targmap = targmap
opts.lrate = lrateparams
opts.texp = texpparams

In [None]:
mm.train

In [None]:
val (pp, popts) = GLM.predictor(mm.model, test)

And invoke the predict method on the predictor:

In [None]:
pp.predict
val preds = FMat(pp.preds(0))

In [None]:
pp.model.asInstanceOf[GLM].mats.length

Although ll (log likelihood) values are printed above, they are not meaningful, because the predictor has no targets to compare its predictions with.

We can now compare the accuracy of predictions (preds matrix) with ground truth (the tcats matrix). 

In [None]:
val vcats = targmap * tcats                                          // create some virtual cats
val lls = mean(ln(1e-7f + vcats ∘ preds + (1-vcats) ∘ (1-preds)),2)  // actual logistic likelihood
mean(lls)
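For a single model, the quantity computed in the cell above reduces to a short scalar formula. The following plain-Scala sketch spells it out; the object and function names are hypothetical, and `eps` mirrors the `1e-7f` guard against taking the log of zero.

```scala
// Plain-Scala version of the per-model mean log likelihood; not BIDMat code.
object LogLikSketch {
  // y: 0/1 ground-truth labels, p: predicted probabilities of the positive class.
  // Each term is ln(p) when y = 1 and ln(1 - p) when y = 0 (plus the eps guard).
  def meanLogLik(y: Seq[Float], p: Seq[Float], eps: Float = 1e-7f): Double = {
    require(y.nonEmpty && y.length == p.length, "inputs must be non-empty and equal length")
    val terms = y.zip(p).map { case (yi, pi) =>
      math.log(eps + yi * pi + (1 - yi) * (1 - pi))
    }
    terms.sum / terms.length
  }
}
```

Values closer to zero indicate better-calibrated predictions; confident wrong predictions are penalized heavily.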

A more thorough measure is ROC area:

In [None]:
val rocs = roc2(preds, vcats, 1-vcats, 100)   // Compute ROC curves for all models (one virtual category per parameter pair)

In [None]:
plot(rocs)

In [None]:
val aucs = mean(rocs)
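The cell above estimates the area from the sampled ROC curves. As an independent sanity check on small samples, the area under the ROC curve can also be computed as the fraction of (positive, negative) pairs that the scores rank correctly (the Mann-Whitney statistic). The sketch below is plain Scala with hypothetical names; it is a different technique from `roc2` and makes no claim about how `roc2` is implemented.

```scala
// Pairwise (Mann-Whitney) AUC; a cross-check, not the roc2 algorithm.
object AucSketch {
  // scores: predicted probabilities; labels: 1 for positive, 0 for negative.
  // Quadratic in the number of examples, so only suitable for small samples.
  def auc(scores: Seq[Float], labels: Seq[Int]): Double = {
    val pos = scores.zip(labels).collect { case (s, 1) => s }
    val neg = scores.zip(labels).collect { case (s, 0) => s }
    require(pos.nonEmpty && neg.nonEmpty, "need at least one positive and one negative")
    // A correctly ordered pair scores 1, a tie 1/2, a misordered pair 0.
    val wins = for (p <- pos; n <- neg) yield
      if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.length.toLong * neg.length)
  }
}
```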

The maxi2 function will find the max value and its index.

In [None]:
val (bestv, besti) = maxi2(aucs)

And using the best index we can find the optimal parameters:

In [None]:
texpparams(besti) \ lrateparams(besti)
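Because the grid was flattened with the learning rate varying fastest, a flat model index can also be decoded back into its parameter pair with integer division and remainder. This is a plain-Scala sketch with hypothetical names (`BestParamSketch`, `decode`, `argmax`), again assuming the standard Kronecker ordering for the enumeration above.

```scala
// Decode a flat grid index into (lrate, texp); not BIDMat code.
object BestParamSketch {
  val lrates = Vector(0.03f, 0.1f, 0.3f, 1f)
  val texps  = Vector(0.3f, 0.4f, 0.5f, 0.6f, 0.7f)

  // lrate varies fastest, so the remainder selects lrate and the quotient texp.
  def decode(i: Int): (Float, Float) = {
    require(i >= 0 && i < lrates.length * texps.length, "index out of range")
    (lrates(i % lrates.length), texps(i / lrates.length))
  }

  // Index of the largest value; returns the first one on ties.
  def argmax(xs: Seq[Float]): Int = {
    require(xs.nonEmpty, "need at least one value")
    xs.indices.reduceLeft((a, b) => if (xs(b) > xs(a)) b else a)
  }
}
```

For example, `BestParamSketch.decode(7)` yields the pair `(1f, 0.4f)`, that is, the fourth learning rate with the second time exponent.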

> Write the optimal values in the cell below:

<b>Note:</b> although our parameters lie on a rectangular grid (4 learning rates × 5 time exponents), we could have enumerated any sequence of pairs, and we could have searched over more parameters. The learner infrastructure supports more intelligent model optimization (e.g. Bayesian methods).