Various questions #13

Closed
abagshaw opened this issue Apr 13, 2018 · 8 comments

Comments

@abagshaw

abagshaw commented Apr 13, 2018

First off, thanks very much for all your work on this - a really clean and understandable implementation of YOLO 😄 I have a few questions (probably very basic) about getting started with training and inference with tensornets, specifically the YOLOv2 model:

  1. I want to train YOLOv2 on the Open Images dataset. How can the tensornets implementation of YOLOv2 be used with a custom number of classes (545 in my case)?
  2. After training, how would one save the weights to a file and load them back in later for inference?
  3. Can layers be frozen during training (i.e. only train the last 4 layers) similar to what one can do in darknet with stopbackward=1?
  4. Is there support for training the YOLOv2 model with a TPU? If not, is there a somewhat straightforward path to adding that functionality?
  5. Can YOLOv2 be trained and used for inference with FP16 precision to speed things up - and can the existing pretrained weights be converted to FP16 so as to not necessitate training from scratch?

Thanks very much!

@taehoonlee
Owner

Thank you for your attention, @abagshaw! Basically, the model functions always return a tf.Tensor, so you can manipulate the models with any regular TensorFlow APIs. My answers are:

  1. I am planning to release the loss function and training codes for YOLO in a week. The following is an example of defining your own YOLOv2 (I think I should write a simpler API 😂); a sketch tying answers 1-3 together follows after this list:
import tensornets as nets
from tensornets.utils import set_args
from tensornets.utils import var_scope

# Start from the reference YOLOv2-VOC options and override the class count.
opts = nets.references.yolo_utils.opts('yolov2voc')
opts['classes'] = 545

@var_scope('yolov2')
@set_args(nets.references.yolos.__args__)
def yolov2_for_you(x, is_training=False, scope=None, reuse=None):
    def _get_boxes(*args, **kwargs):
        return nets.references.yolo_utils.get_v2_boxes(opts, *args, **kwargs)
    # 125 = (20 classes + 5) * 5 anchors for VOC; see the follow-up below for 545 classes.
    x = nets.references.yolos.yolo(x, [1, 1, 3, 3, 5, 5], 125, is_training, scope, reuse)
    x.get_boxes = _get_boxes
    return x
  2. You can use a regular tf.train.Saver. Note that the saving and restoring mechanism used in TensorNets looks like this:
np.save('your_weights', sess.run(model.get_weights()))  # save (writes your_weights.npy)
sess.run([w.assign(v) for (w, v) in zip(model.get_weights(), np.load('your_weights.npy'))])  # load
  3. There are many ways to freeze weights, and I recommend:
train_list = model.get_weights()[6:]  # The first six weights are not going to be updated.
train = tf.train.AdamOptimizer().minimize(loss, var_list=train_list)
  4. I haven't tried it, but it seems easy to couple the models with a TPU (see the official example). If you don't mind, can you give it a try? 😄

  5. Very good point. I will add FP16 support. Thank you!
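
As a rough sketch of how answers 1-3 could fit together (this is not from the original answer; the 416x416 input size, the learning rate, and the dummy loss are illustrative assumptions, since the real YOLO loss is not released yet):

import numpy as np
import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 416, 416, 3])
model = yolov2_for_you(inputs, is_training=True)  # the custom model from answer 1

# Placeholder loss: replace with the YOLO loss once it is released.
loss = tf.reduce_mean(tf.square(model))

# Freeze the first six weights as in answer 3.
train_list = model.get_weights()[6:]
train = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=train_list)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... run training steps feeding `inputs` (and your ground-truth tensors) ...
    np.save('your_weights', sess.run(model.get_weights()))  # save as in answer 2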

@abagshaw
Author

@taehoonlee Thanks for the super quick response!

  1. Thanks for the example - makes sense. One other thing: shouldn't the number of filters be changed to 2750 given the number of classes?

I'll certainly see if I can get training up and running on a TPU when I have a chance after the loss function is implemented. Thanks again for all your work on this!

@taehoonlee
Owner

You are right @abagshaw, (# classes + 5) * # anchors = (545 + 5) * 5 = 2750. Thank you so much!
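
For reference, the corrected filter count can be computed and plugged into the custom model sketched above (only the variable names here are illustrative):

num_classes = 545
num_anchors = 5
filters = (num_classes + 5) * num_anchors  # (545 + 5) * 5 = 2750

# Inside yolov2_for_you, pass `filters` instead of the hard-coded 125:
# x = nets.references.yolos.yolo(x, [1, 1, 3, 3, 5, 5], filters, is_training, scope, reuse)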

@taehoonlee
Owner

@abagshaw, please see #14 for the FP16 support. The native operations have supported FP16 since TF >= 1.1.0, except that tf.nn.depthwise_conv2d, tf.nn.separable_conv2d, and tf.contrib.layers.batch_norm require TF >= 1.5.0. Therefore, with TF >= 1.5.0 you can simply define a tf.placeholder(tf.float16) and then call any of the network functions.
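
For example, a minimal sketch under that assumption (TF >= 1.5.0; the input shape is illustrative) only changes the placeholder dtype:

import tensorflow as tf

inputs = tf.placeholder(tf.float16, [None, 416, 416, 3])
outputs = yolov2_for_you(inputs)  # the custom model from above, now built in FP16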

@abagshaw
Author

@taehoonlee Awesome, thanks! Also, no pressure - but any progress on YOLO training? 😃

@taehoonlee
Owner

@abagshaw, Almost finished 🔥.

@taehoonlee
Owner

@abagshaw, I wrote and tested training codes based on darknet and darkflow, but learning did not progress beyond 5% mAP. Thus, I added a draft version first and will revise it by referring to a PyTorch implementation.

Would you have the bandwidth to review the draft, @abagshaw? I would appreciate it if you could find any reasons for the poor convergence 😭.

@abagshaw
Author

@taehoonlee Awesome, thanks so much! Right now I'm swamped with work so I don't see myself getting time to tinker around with it - but if a break comes up I'll certainly give it a shot.
