This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fine-tune with R API #4817

Closed
statist-bhfz opened this issue Jan 27, 2017 · 9 comments
@statist-bhfz (Contributor) commented Jan 27, 2017

Hi,

Many thanks for this library, which provides almost the only way to do deep learning in R. The current R documentation is not very comprehensive, but the examples and the discussions in issues help a lot.

Now I am stuck looking for the R equivalent of the get_internals() method in Python (http://mxnet.io/how_to/finetune.html).
Unfortunately, model$symbol$get.internals() doesn't return a usable R object, so modifying the symbol of a pre-trained model seems to be impossible in the R API without directly editing the symbol.json file.

Is it possible to add to the R package the capability to write code like this?

all_layers = sym.get_internals()
net = all_layers[layer_name + '_output']
net = mx.symbol.FullyConnected(data=net, num_hidden=num_classes, name='fc1')
net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
new_args = dict({k: arg_params[k] for k in arg_params if 'fc1' not in k})

@thirdwing self-assigned this Jan 27, 2017

@thirdwing (Contributor) commented:

I think the last line may not be possible. Others should be OK.

@jeremiedb (Contributor) commented:

Hello, I think you should actually be able to do it by using get.output:

resnet101 <- mx.model.load("Models/Resnet101/resnet-101", iteration = 0)

symbol    <- resnet101$symbol
internals <- symbol$get.internals()
outputs   <- internals$outputs

flatten <- internals$get.output(which(outputs == "flatten0_output"))

The tricky part is that get.output seems to accept only a numeric index, not a name.
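If it helps, the numeric lookup can be wrapped in a small helper so the output is picked by name. This is just a sketch built on the same get.internals()/get.output() calls used above; get.output.by.name is not part of the mxnet R API:

get.output.by.name <- function(symbol, name) {
  # List the names of all internal outputs and find the one we want
  internals <- symbol$get.internals()
  idx <- which(internals$outputs == name)
  if (length(idx) != 1) stop("output '", name, "' not found")
  internals$get.output(idx)
}

flatten <- get.output.by.name(resnet101$symbol, "flatten0_output")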

Afterwards, it should be relatively straightforward using the initializers:

new_fc   <- mx.symbol.FullyConnected(data = flatten, num_hidden = 24, name = "new_fc_24")
new_soft <- mx.symbol.SoftmaxOutput(data = new_fc, name = "new_softmax")

arg_params_ori  <- resnet101$arg.params
fc1_weights_ori <- arg_params_ori[["fc1_weight"]]

devices <- mx.ctx.default()

# Initialize a complete parameter list for the new symbol, then inspect the new FC weights
arg_params_new  <- mxnet:::mx.model.init.params(symbol = new_soft, input.shape = c(224, 224, 3, 32), initializer = mxnet:::mx.init.uniform(0.1), ctx = devices)$arg.params
fc1_weights_new <- arg_params_new[["new_fc_24_weight"]]

### Alternative: build the parameter list directly from the inferred shapes
arg_params_new2  <- mx.init.create(initializer = mx.init.uniform(0.1), shape.array = mx.symbol.infer.shape(new_soft, data = c(224, 224, 3, 32))$arg.shapes, ctx = devices)
fc1_weights_new2 <- arg_params_new2[["new_fc_24_weight"]]

And finally, reassign the original weights into the newly initialized parameter list for all but the new FC layer:

arg_params_new_patch <- arg_params_new
arg_params_new_patch[setdiff(names(arg_params_new_patch), "new_fc_24_weight")] <- arg_params_ori[setdiff(names(arg_params_new_patch), "new_fc_24_weight")]
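A quick sanity check (just a sketch, using the objects defined above): after patching, the only parameters that do not come from the pre-trained model should be the weight and bias of the new FC layer.

# The new FC weight and bias are the only entries absent from the original params
stopifnot(setequal(
  setdiff(names(arg_params_new_patch), names(arg_params_ori)),
  c("new_fc_24_weight", "new_fc_24_bias")
))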

@statist-bhfz (Contributor, Author) commented Jan 28, 2017

Thank you for the response.

Is it possible to replace the line new_args = dict({k:arg_params[k] for k in arg_params if 'fc1' not in k}) in Python with something like model$arg.params[c("fc1_weight", "fc1_bias")] <- NULL in R? Or should we always use initializers?
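For the list-filtering part alone, a direct R equivalent of the Python dict comprehension would be something like this sketch (using the resnet101 object from above; whether the training wrapper accepts such an incomplete list is discussed below):

# Keep every pre-trained parameter whose name does not contain "fc1"
arg_params <- resnet101$arg.params
new_args   <- arg_params[!grepl("fc1", names(arg_params))]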

@OwlECoyote commented:

I'm not an expert, but I am pretty sure that if you set all the weights to zero, the network is not able to learn anything (if that is what you meant to do by setting the params to NULL).

@statist-bhfz (Contributor, Author) commented:

I hope that setting a subset of parameters to NULL will force some kind of default initialization for these parameters during training (similar to mxnet:::mx.init.uniform(0.1) in jeremiedb's answer). Possibly I'm wrong.

@OwlECoyote commented:

Well, I used jeremiedb's answer for fine-tuning a network and had to add some lines, because after calling the FeedForward function I got an error that the bias for new_fc_24 was NULL; so in that case at least there was no such default initialization.
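A sketch of that extra step, reusing the variable names from jeremiedb's snippet: the patch above copies original values for every name except "new_fc_24_weight", so "new_fc_24_bias", which has no counterpart in arg_params_ori, gets overwritten with NULL. Restricting the copy to names present in both lists keeps the freshly initialized bias as well:

# Copy pre-trained values only for parameters that exist in the original model;
# the new layer's weight and bias keep their fresh initialization
shared <- intersect(names(arg_params_new), names(arg_params_ori))
arg_params_new_patch <- arg_params_new
arg_params_new_patch[shared] <- arg_params_ori[shared]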

@jeremiedb (Contributor) commented:

Thanks OwlECoyote for pointing out the bias argument.
In effect, the R FeedForward module is designed to either initialize all the arguments itself or take the full, complete list of arguments in arg.params.

Lines 416-417 in /model.R should make the behavior of the model clearer:

params <- mx.model.init.params(symbol, input.shape, initializer, mx.cpu())
if (!is.null(arg.params)) params$arg.params <- arg.params

To make the picture clearer, the FeedForward wrapper calls the model.train function, which initializes the arg.params to 0 before assigning the pre-trained arg.params. Therefore, if there are missing arguments, as statist-bhfz mentioned, their weights will be "initialized" to 0. In that case the model will run, but won't learn, so it's not a good idea!

Bottom line: arg.params, if provided, should simply be a list containing the NDArrays for each of the model arguments (symbol$arguments, excluding data and label), and the names of that list must match the model argument names.
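As an illustration of that requirement (a sketch only; train_iter is an assumed mx.io data iterator of 224x224x3 images and is not defined in this thread):

# Every argument of the new symbol, except the data and label inputs,
# must have a matching entry in arg.params
required <- new_soft$arguments
required <- required[!grepl("data$|label$", required)]
stopifnot(setequal(names(arg_params_new_patch), required))

model <- mx.model.FeedForward.create(
  symbol      = new_soft,
  X           = train_iter,
  ctx         = devices,
  num.round   = 5,
  arg.params  = arg_params_new_patch,
  aux.params  = resnet101$aux.params,
  eval.metric = mx.metric.accuracy
)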

@lichen11 commented Jan 5, 2018

Hi, I recently attempted fine-tuning ResNet101 based on your comments.

However, when I start training (on a GPU), R outputs the following message:

Start training with 1 devices
[19:35:16] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the             best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

Then it crashes. I updated mxnet to 1.0.0. Fine-tuning with Inception-BN at 126 does not crash. I am wondering whether there is an internal bug in the mxnet R package causing this crash.

@jeremiedb (Contributor) commented:

@lichen11, please see answer in #7968
