
Make unnecessary computations optional #368

Merged — 1 commit merged on Dec 8, 2015

Conversation

gheinrich
Contributor

Original (20 epochs LeNet on MNIST): 187s
Now: 142s

Helps with bug #339

@lukeyeager
Member

Dang your machine is slow. My results:

Original (20 epochs LeNet on MNIST): 30s
Now: 30s

And now I'm seeing extra output in the log. I thought this PR turned OFF confusion matrices?

2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: ConfusionMatrix:
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [[     242       0       0       0       2       0       1       0       0       0]   98.776%   [class: 0]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       0     280       0       0       1       0       0       1       2       0]   98.592%    [class: 1]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       1       0     254       0       1       0       0       2       0       0]   98.450%    [class: 2]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       0       1       2     243       0       3       0       2       0       1]   96.429%    [class: 3]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       0       1       0       0     236       0       1       1       0       6]   96.327%    [class: 4]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       1       2       0       2       0     213       4       1       0       0]   95.516%    [class: 5]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       1       1       0       0       0       0     237       0       0       0]   99.163%    [class: 6]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       0       1       4       1       2       0       0     247       0       2]   96.109%    [class: 7]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       4       0       1       1       1       4       1       2     226       3]   93.004%    [class: 8]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: [       0       1       0       1       4       2       1       1       0     242]]  96.032%    [class: 9]
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: + average row correct: 96.839545369148%
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: + average rowUcol correct (VOC measure): 93.902345299721%
2015-10-15 11:17:51 [20151015-111721-72f4] [WARNING] Train Torch Model unrecognized output: + global correct: 96.877502001601%

@gheinrich
Contributor Author

Perhaps our datasets aren't the same size? My MNIST dataset has 45k training samples and 15k validation samples. So you're saying the patch doesn't provide any speedup? I'll double-check on my end.
Before the patch, accuracy and the confusion matrix were computed during both training and validation. This patch makes both optional in the Lua wrapper, and the caller in torch_train.py enables them for validation only (so we can draw the validation accuracy curve, and we get the validation confusion matrix for free).
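The mechanism described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual DIGITS/Torch code: the function name `run_epoch` and the `compute_stats` flag are hypothetical stand-ins for the optional-computation flag added to the Lua wrapper.

```python
# Hypothetical sketch: accuracy and confusion-matrix bookkeeping are
# gated behind a flag that the caller enables only for validation
# passes, so training epochs skip the extra work entirely.
from collections import defaultdict

def run_epoch(batches, compute_stats=False):
    """Process (predictions, labels) batches; gather stats only when asked."""
    confusion = defaultdict(int)  # (true_label, predicted_label) -> count
    correct = total = 0
    for preds, labels in batches:
        if not compute_stats:
            continue  # training pass: no accuracy/confusion bookkeeping
        for p, y in zip(preds, labels):
            confusion[(y, p)] += 1
            correct += int(p == y)
            total += 1
    accuracy = correct / total if total else None
    return accuracy, dict(confusion)

batches = [([0, 1, 1], [0, 1, 0])]
train_acc, _ = run_epoch(batches, compute_stats=False)  # stats skipped
val_acc, conf = run_epoch(batches, compute_stats=True)  # stats gathered
```

The caller (here, the role torch_train.py plays) decides per-pass whether the flag is on, which is why validation still produces the accuracy curve and confusion matrix while training avoids the cost.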

@lukeyeager
Member

Oh, that was careless of me. I misnamed my dataset and didn't notice. Whoops!

         | 20 epochs | 5 epochs
---------|-----------|---------
Original | 202s      | 56s
Now      | 204s      | 52s

I'm definitely picking up the new code because I'm seeing the confusion matrix in my log.

@gheinrich
Contributor Author

It looks like I need to review my patch!

@gheinrich
Contributor Author

Closing, as the patch needs to be revisited. May re-open later.

@gheinrich gheinrich closed this Oct 16, 2015
Training accuracy is not displayed in DIGITS (not for Caffe
either) so it is not necessary to compute training accuracy
and confusion matrix.

Disabling those computations speeds training up:

LeNet (MNIST, 30 epochs): 1m54s -> 1m40s
Alexnet (CIFAR10, 2 epochs): 5m14s -> 4m38s
GoogLeNet (reduced CIFAR10, 1 epoch): 2m4s -> 2m2s
@gheinrich gheinrich reopened this Dec 4, 2015
@gheinrich
Contributor Author

Re-opening with a new patch. These are the numbers I get:

         | LeNet (MNIST, 30 epochs) | Alexnet (CIFAR10, 2 epochs) | GoogLeNet (reduced CIFAR10, 1 epoch)
---------|--------------------------|-----------------------------|-------------------------------------
original | 1m54s                    | 5m14s                       | 2m4s
now      | 1m40s                    | 4m38s                       | 2m2s

@lukeyeager
Member

I've verified similar results on my machine. LGTM!

lukeyeager added a commit that referenced this pull request Dec 8, 2015
@lukeyeager lukeyeager merged commit 240eb85 into NVIDIA:master Dec 8, 2015
@gheinrich gheinrich deleted the dev/torch-optional-confusion branch April 14, 2016 13:24