
Make it optional to subtract mean pixel or mean image #321

Merged: jmancewicz merged 1 commit into NVIDIA:master from the mean branch on Oct 30, 2015

Conversation

@jmancewicz
Contributor

Closes #169

    t.set_mean('data', mean_pixel)
elif self.use_mean == 'image':
    mean_image = self.get_mean_image(mean_file)
    t.set_mean('data', mean_image)
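For context, caffe.io.Transformer.set_mean accepts either a 1-D per-channel pixel (broadcast across the image) or a full mean array whose shape must match the input exactly; that distinction is what this PR exposes as a user-facing option. A hedged usage sketch (the shapes are illustrative, and the second call simply replaces the first):

import numpy as np
import caffe

t = caffe.io.Transformer(inputs={'data': (1, 3, 256, 256)})

# 'pixel': a 1-D array with one value per channel, broadcast over H x W
t.set_mean('data', np.array([104.0, 117.0, 123.0]))

# 'image': a full (C, H, W) array whose shape must match the input exactly
t.set_mean('data', np.zeros((3, 256, 256), dtype=np.float32))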
Member

I think you need to crop the image here.

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/lyeager/digits/digits/model/images/classification/views.py", line 276, in image_classification_model_classify_one
    predictions, visualizations = job.train_task().infer_one(image, snapshot_epoch=epoch, layers=layers)
  File "/home/lyeager/digits/digits/model/tasks/caffe_train.py", line 972, in infer_one
    layers=layers,
  File "/home/lyeager/digits/digits/model/tasks/caffe_train.py", line 1002, in classify_one
    preprocessed = self.get_transformer().preprocess(
  File "/home/lyeager/digits/digits/model/tasks/caffe_train.py", line 1473, in get_transformer
    t.set_mean('data', mean_image)
  File "/home/lyeager/caffe/nv-0.13.2/python/caffe/io.py", line 255, in set_mean
    raise ValueError('Mean shape incompatible with input shape.')
ValueError: Mean shape incompatible with input shape.
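A minimal sketch of the center-crop fix suggested above, assuming a (C, H, W) mean array; the helper name crop_mean is hypothetical, not the code that was merged:

import numpy as np

def crop_mean(mean_image, crop_height, crop_width):
    # center-crop a (C, H, W) mean image to the network's input size,
    # so that set_mean's shape check passes
    _, height, width = mean_image.shape
    top = (height - crop_height) // 2
    left = (width - crop_width) // 2
    return mean_image[:, top:top + crop_height, left:left + crop_width]

# example: crop a 256x256 mean down to a 227x227 network input
mean = np.zeros((3, 256, 256), dtype=np.float32)
print(crop_mean(mean, 227, 227).shape)  # (3, 227, 227)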

@jmancewicz
Contributor Author

get_mean_image does an imresize. I'll look into what's going wrong.

@jmancewicz
Contributor Author

@gheinrich, please look at the changes to torch_train.py and see if they are correct.

@gheinrich
Contributor

@jmancewicz the parameter mappings look OK to me. I have trained LeNet on MNIST using all three methods (subtractMeanImage, subtractMeanPixel, none) and they all converged in approximately the same time. Maybe MNIST is too trivial to actually observe the effect of mean subtraction... anyway, the Torch part looks OK to me, thanks!

@lukeyeager
Member

You're getting a bunch of test failures.

https://travis-ci.org/NVIDIA/DIGITS/builds/83967654
https://s3.amazonaws.com/archive.travis-ci.org/jobs/83967655/log.txt

There are a bunch of weird PIL messages too - any idea what that's about?

PIL.PngImagePlugin: DEBUG: STREAM IHDR 16 13

@gheinrich
Contributor

@jmancewicz the support for generic inference with Torch is now on the master branch. There might be some rebasing to do for this PR. Let me know when you have rebased as I now have a network for which mean subtraction is critical so I will be able to test how the network behaves depending on the mean subtraction method.

@lukeyeager
Member

@jmancewicz can you make sure that #365 is fixed in this PR too?

@jmancewicz
Contributor Author

To handle #365, I'm now setting mean_pixel (rather than appending to it) and deleting mean_file when using mean_pixel. Using mean_file will similarly delete mean_pixel and set mean_file.
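A minimal sketch of what the mean-pixel side of that could look like, mirroring the set_mean_file snippet quoted later in this thread; the function name and body here are assumptions, not the merged code (mean_value and mean_file are the actual fields in Caffe's TransformationParameter, and Caffe rejects networks that set both):

def set_mean_pixel(self, layer, mean_pixel):
    # replace, rather than append to, any per-channel values already set
    layer.transform_param.ClearField('mean_value')
    layer.transform_param.mean_value.extend(mean_pixel)
    # mean_file and mean_value are mutually exclusive in Caffe
    layer.transform_param.ClearField('mean_file')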

@@ -136,6 +135,8 @@ def task_arguments(self, resources, env):
        '--save=%s' % self.job_dir,
        '--snapshotPrefix=%s' % self.snapshot_prefix,
        '--snapshotInterval=%s' % self.snapshot_interval,
        '--mean=%s' % self.dataset.path(constants.MEAN_FILE_IMAGE),
        '--labels=%s' % self.dataset.path(self.dataset.labels_file),
Member

Datasets don't necessarily have a labels_file. That's why you're getting these TravisCI errors for generic Torch tests:

======================================================================
FAIL: digits.model.images.generic.test_views.TestTorchCreation.test_create_wait_delete
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/NVIDIA/DIGITS/digits/model/images/generic/test_views.py", line 245, in test_create_wait_delete
    assert self.model_wait_completion(job_id) == 'Done', 'create failed'
AssertionError: create failed
-------------------- >> begin captured logging << --------------------
digits: ERROR: AttributeError: 'GenericImageDatasetJob' object has no attribute 'labels_file'
--------------------- >> end captured logging << ---------------------

Member

Why are you adding labels at all? It's already being done below.

Contributor

These two lines were moved to lines 150-151.
Line 192 addresses the generic dataset case (where we need to create an image file out of the .protobuf from the dataset).

Contributor

line 174:

if self.use_mean:

Maybe this should be:

if self.use_mean != 'none':
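The suggestion matters because use_mean is now a tri-state string: any non-empty string, including 'none', is truthy in Python, so the original test would enable mean subtraction even when the user selected 'none'. A quick illustration:

use_mean = 'none'

if use_mean:            # True: 'none' is a non-empty string
    print('mean subtraction enabled (wrong)')

if use_mean != 'none':  # False: correctly skips subtraction
    print('mean subtraction enabled')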

@lukeyeager
Member

Looks like I added a bug way back in #41. At line 1427, this:

t = caffe.io.Transformer(
    inputs = {'data':  data_shape}
    )

needs to change to this:

t = caffe.io.Transformer(
    inputs = {'data':  tuple(data_shape)}
    )

It was originally me who added the overly strict check to Caffe with BVLC/caffe#2031. Now I wish I had done it in such a way that it worked with either lists or tuples.

We didn't notice the issue in DIGITS because we haven't done mean file subtraction for so long.
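The failure mode is easy to reproduce: the shape check compares the mean's shape (a tuple) against a slice of the registered input shape, and in Python a tuple never compares equal to a list, even with identical elements:

data_shape = [1, 3, 256, 256]   # a list, as DIGITS was passing it
mean_shape = (3, 256, 256)      # numpy shapes are tuples

print(mean_shape == data_shape[1:])          # False: tuple vs. list
print(mean_shape == tuple(data_shape[1:]))   # True after the tuple() fix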

def set_mean_file(self, layer, mean_file):
    # remove any values that may already be in the network
    layer.transform_param.ClearField('mean_value')
    layer.transform_param.mean_file = mean_file
Member

These two functions do the correct thing from a functional point of view (at least I think they do - haven't tested it), but I'd like to print some WARNINGs to the log if any of these ClearField() calls are actually overwriting any information.
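A minimal sketch of what that logging could look like, assuming a module-level logger; the check and message wording are assumptions, not the merged code:

import logging

logger = logging.getLogger('digits')

def set_mean_file(self, layer, mean_file):
    # warn before clearing mean_value if it would discard values already set
    if len(layer.transform_param.mean_value) > 0:
        logger.warning('Discarding mean_value %s in favor of mean_file %s',
                       list(layer.transform_param.mean_value), mean_file)
    layer.transform_param.ClearField('mean_value')
    layer.transform_param.mean_file = mean_file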

jmancewicz added a commit to jmancewicz/DIGITS that referenced this pull request Oct 17, 2015
@jmancewicz
Contributor Author

@lukeyeager I've added tests for mean image and mean pixel, for both training and inference. Also, the dependency on deploy.prototxt was avoidable, because I didn't need to imresize the mean file at that point.

@@ -317,6 +317,20 @@ def check_select_gpus(self, gpu_list):
        job_id = self.create_model(select_gpus_list=','.join(gpu_list), batch_size=len(gpu_list))
        assert self.model_wait_completion(job_id) == 'Done', 'create failed'

    def test_mean_image(self):
        options = {
            'use_mean': 'image',
Contributor

If my understanding is correct, the default value for use_mean is 'image'. Therefore, isn't this test already covered by the default case?
Either way, it is worth considering adding a test case for use_mean='none'.

Member

Either way it is worth considering adding a test case for use_mean='none'

👍

Contributor Author

Any objection to leaving the image test in there?

Contributor

None from me; I think it's good to explicitly test all three cases.

@gheinrich
Contributor

My auto-encoder network in Torch is quite sensitive to initialization, so I thought this was a good candidate to try the various mean subtraction methods on. Along the way I found another issue in the mean pixel subtraction code. Besides, I wasn't entirely happy about the Lua wrapper command-line flags (some were mutually exclusive), so I pushed another commit gheinrich@559f6a5.
So, using mean image subtraction:
[screenshot: mean-image loss curve]
Using mean pixel subtraction (note the loss decreases less rapidly):
[screenshot: mean-pixel loss curve]
And using no mean subtraction, the network diverges:
[screenshot: mean-none loss curve]

    loss = nn.ClassNLLCriterion(),
    trainBatchSize = 64,
    validationBatchSize = 100,
    loss = nn.ClassNLLCriterion()
Contributor

you need to keep the version from master here too

@jmancewicz
Contributor Author

@gheinrich, I have made the changes from your comments. Now, it's failing on digits.model.images.classification.test_views.TestTorchCreation.test_classify_one_mean_pixel 8 out of 10 times with AssertionError: image misclassified.

The command that is being issued is:


So --subtractMean=pixel is being passed. If that failure is expected, I can ignore the classification error.

@@ -216,9 +212,9 @@ function DBSource:new (backend, db_path, labels_db_path, mirror, crop, croplen,
  self.dbs = {}
  self.total = 0
  for line in file:lines() do
    local fileName = paths.concat(db_path, paths.basename(line))
    db_path = paths.concat(db_path, paths.basename(line))
Contributor

I think you have overwritten the master version here and on line 217.

@gheinrich
Contributor

Joe, if you wish I can try to integrate my patch again over 84653db?

@jmancewicz
Contributor Author

@gheinrich, if this doesn't look correct, then maybe it would be best for you to integrate your patch. At some point my branch got out of sync, and it's become difficult to know which changes to use from master.

@gheinrich
Contributor

@jmancewicz I have pushed this change.
The Travis test would have passed if it weren't for this strange (unrelated?) error (another code -11!) on:

ERROR: test suite for <class 'digits.model.images.classification.test_views.TestCaffeCreatedCropInForm'>
...
digits.webapp: ERROR: Create DB (val) task failed with error code -11

Errno 11 is EAGAIN ("try again") in C, though a negative return code from a subprocess usually means the child was killed by that signal, so -11 is more likely SIGSEGV.
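For what it's worth, Python's subprocess module encodes death-by-signal as a negative return code, which is easy to check directly (a quick demonstration, assuming a POSIX system where dereferencing NULL segfaults):

import signal
import subprocess

# a child process that dereferences NULL and is killed by SIGSEGV
proc = subprocess.Popen(
    ['python', '-c', 'import ctypes; ctypes.string_at(0)'])
proc.wait()

# a negative returncode means the child was killed by that signal number
print(proc.returncode)                      # -11 on Linux
print(proc.returncode == -signal.SIGSEGV)   # True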

if self.trained_on_cpu:
    args.append('--type=float')

# input image has been resized to network input dimensions by caller
args.append('--crop=no')
Contributor

That --crop=no has disappeared, but it isn't an issue because that flag defaults to no.

@gheinrich
Contributor

The changes look OK to me.

jmancewicz added a commit that referenced this pull request Oct 30, 2015
Make it optional to subtract mean pixel or mean image
@jmancewicz merged commit dd9927b into NVIDIA:master on Oct 30, 2015
@jmancewicz deleted the mean branch on November 4, 2015