Unpooling indices stored as tf.Variables #23

Closed
rodamn opened this issue Jul 11, 2017 · 6 comments


rodamn commented Jul 11, 2017

In network.py:

  def prepare_indices(self, before, row, col, after, dims):
    # x0..x3 are index grids produced earlier in this function via np.meshgrid
    x_0 = tf.Variable(x0.reshape([-1]), name='x_0')
    x_1 = tf.Variable(x1.reshape([-1]), name='x_1')
    x_2 = tf.Variable(x2.reshape([-1]), name='x_2')
    x_3 = tf.Variable(x3.reshape([-1]), name='x_3')

This is the wrong use of tf.Variable (variable nodes are meant to hold values the network is expected to change during training, typically weights). In this case, the code is just reshaping the indices from np.meshgrid, so these values aren't weights or anything learnable. There could be a specific reason these are made tf.Variables that I'm unaware of, but it seems these lines should be:

  x_0 = x0.reshape([-1])
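Spelled out for all four index arrays, the proposed prepare_indices would look like this (a sketch, with x0..x3 still being the np.meshgrid outputs as before):

  def prepare_indices(self, before, row, col, after, dims):
    # Plain NumPy reshapes: the indices never change, so no tf.Variable
    # (and no checkpoint traffic) is needed.
    x_0 = x0.reshape([-1])
    x_1 = x1.reshape([-1])
    x_2 = x2.reshape([-1])
    x_3 = x3.reshape([-1])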

Why this matters:
tf.Variable nodes typically hold "trainable" values, which must be stored in checkpoints and loaded from weight files. Since these are four 4D-flattened-to-1D arrays, and there is one set of them for each up-conversion, this is a lot of data being written to disk when creating checkpoints and read back at load time. These are just indices, so no change (learning) is expected; I propose that all of this saving and loading is needless.
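To make the distinction concrete, a small TF 1.x sketch (the array size here is illustrative, not taken from the repo):

  import numpy as np
  import tensorflow as tf

  indices = np.arange(128 * 160 * 64)  # illustrative size; the real arrays are per-layer

  idx_var = tf.Variable(indices, name='idx_var')      # collected by tf.train.Saver and
                                                      # written into every checkpoint
  idx_const = tf.constant(indices, name='idx_const')  # baked into the graph; never
                                                      # touched by the Saver

  saver = tf.train.Saver()  # with no arguments, saves all variables, idx_var included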

Case in point: these variables seem to be the prime contributor to the long load time of the weights file. In predict.py:

  net.load(model_data_path, sess)

takes several minutes on my computer in the current state. Changing prepare_indices() as indicated above reduces the load time by orders of magnitude; however, making this change might make the new model incompatible with the current weights file, NYU_ResNet-UpProj.npy. (I am having trouble making the net work with this change, so more investigation is needed on my end, but I figured I would raise this issue in case others are able to work on resolving it.)

Since this is a non-functional change, I propose the authors try the following:

  1. Remove the tf.Variable nodes as shown above (making them simple operations)
  2. Retrain using identical meta-parameters as in the paper (if the starting weight values are still available)
  3. Compare results pre- and post-change to ensure they generate the same output

If the starting weights aren't available, I suppose a full retraining would just need to generate acceptable results.

@chrirupp (Collaborator) commented:

Thanks, rodamn! You are completely right. We will fix this ASAP.


rodamn commented Jul 28, 2017

Actually, I merged the pull request submitted by @jtatusko, which uses a faster interleave method and makes prepare_indices() unnecessary. I recommend merging his pull request to close this issue.
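For reference, the general idea behind a stack/reshape interleave — this is my own sketch of the technique, not @jtatusko's actual patch:

  import tensorflow as tf

  def unpool_2x(x):
    # x: [batch, h, w, c] -> [batch, 2h, 2w, c], new rows/columns zero-filled.
    # Built entirely from stacks and reshapes, so no index arrays (and no
    # tf.Variables) are needed.
    s = tf.shape(x)
    b, h, w, c = s[0], s[1], s[2], s[3]
    # Interleave zero columns: stack to [b, h, w, 2, c], reshape to [b, h, 2w, c]
    x = tf.reshape(tf.stack([x, tf.zeros_like(x)], axis=3), [b, h, 2 * w, c])
    # Interleave zero rows: stack to [b, h, 2, 2w, c], reshape to [b, 2h, 2w, c]
    x = tf.reshape(tf.stack([x, tf.zeros_like(x)], axis=2), [b, 2 * h, 2 * w, c])
    return x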

@NikolausDemmel commented:

@rodamn, does that mean that with @jtatusko's patch you can use the same trained models, or do you need to retrain from scratch? What about the model size?


rodamn commented Aug 10, 2017

@NikolausDemmel Yeah, I was able to load the model trained on the NYU depth dataset and get it to perform inference. Even though the .npy file stores data for the tf.Variables I mentioned above, I presume TensorFlow just ignores that data when loading the params, since those nodes are no longer present.
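For context, the caffe-tensorflow-style loader this repo uses works roughly like the sketch below (reconstructed from memory, so treat the details as assumptions): it assigns stored arrays to graph variables by name and can simply skip entries with no matching variable.

  import numpy as np
  import tensorflow as tf

  def load(data_path, session, ignore_missing=True):
    # The .npy file holds a dict of {layer_name: {param_name: ndarray}}.
    data_dict = np.load(data_path, encoding='latin1').item()
    for op_name in data_dict:
      with tf.variable_scope(op_name, reuse=True):
        for param_name, data in data_dict[op_name].items():
          try:
            session.run(tf.get_variable(param_name).assign(data))
          except ValueError:
            # No matching variable in the graph (e.g. the removed
            # index variables): skip the stored data.
            if not ignore_missing:
              raise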


rodamn commented Aug 10, 2017

FWIW, I added TF code to save a checkpoint after loading the dataset .npy file; in subsequent runs I load from that checkpoint instead. Removing these variable nodes only reduces the model size from ~250 MB to ~200 MB, if I recall correctly, so you don't save that much load time or model data that way. However, loading the model from a checkpoint seemed faster than the current framework's loading from a .npy file (I didn't time them for comparison, but I recall it being quite a bit faster). Hope this helps.
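Roughly what I did (the checkpoint path is hypothetical; net and model_data_path are as in predict.py):

  import os
  import tensorflow as tf

  CKPT = 'checkpoints/fcrn.ckpt'  # hypothetical location

  saver = tf.train.Saver()
  with tf.Session() as sess:
    if os.path.exists(CKPT + '.index'):
      saver.restore(sess, CKPT)        # fast path: restore the cached checkpoint
    else:
      net.load(model_data_path, sess)  # slow path: parse the .npy weights file
      saver.save(sess, CKPT)           # cache as a native checkpoint for next time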

@irolaina (Owner) commented:

We have now fixed this issue. Based on @jtatusko's suggestion, the interleaving layer has been made faster and does not use tf.Variable for the unpooling indices.
We also added the corresponding checkpoint files for faster loading of the weights.
