Review Request: Muellerklein #20

Closed · wants to merge 9 commits

Conversation

8 participants

FlorianMuellerklein commented Jul 1, 2016


Dear @ReScience/editors,

I request a review for the reproduction of the following paper:

  • He, Kaiming, et al. "Identity mappings in deep residual networks." arXiv preprint arXiv:1603.05027 (2016)

and one result from

  • Sergey Zagoruyko, Nikos Komodakis, "Wide Residual Networks." arXiv preprint arXiv:1605.07146 (2016)

I believe the original results have been faithfully reproduced as explained in the accompanying article.

Repository lives at https://github.com/FlorianMuellerklein/ReScience-submission/tree/muellerklein


EDITOR

  • Editor acknowledgment (@emmanuelle 03 July 2016)
  • Reviewer 1 (@ogrisel 20 July 2016)
  • Reviewer 2 (@ozancaglayan 24 August 2016)
  • Review 1 decision [accept/reject]
  • Review 2 decision [accept] 15 March 2017
  • Editor decision [accept/reject]

@rougier rougier changed the title from Review Request to Review Request: Muellerklein Jul 1, 2016

@rougier rougier added the 01 - Request label Jul 1, 2016


FlorianMuellerklein commented Jul 1, 2016

AUTHOR

There are some formatting issues that I will resolve if this gets accepted.


Member

rougier commented Jul 18, 2016

EDITOR-IN-CHIEF

@emmanuelle Any update on this ?


ogrisel commented Jul 20, 2016

@emmanuelle asked me to review this submission. I need to reinstall my GTX 980 Ti to be able to run the code, but hopefully I will do it in the coming week. Meanwhile, @FlorianMuellerklein, is there any change you want to include in the submission? What are the formatting changes you plan to make?


FlorianMuellerklein commented Jul 20, 2016

@ogrisel I had some issues with the image sizes and some captions in the pdf that I need to fix.

But after submitting this pull request, someone commented on my GitHub repo that I might have a mistake in the projection shortcut. They said that the shortcut should take the preactivation as its input. I think they may be correct; should I update my pull request or wait for reviewer input before changing anything? I linked to the issue on my repo below.

FlorianMuellerklein/Identity-Mapping-ResNet-Lasagne#2



Member

rougier commented Jul 20, 2016

@FlorianMuellerklein You can push your changes; they should be tracked here. As for the formatting issues in the paper, I will handle them in the editing stage if the submission is accepted.

@ogrisel Thank you!


ogrisel commented Jul 21, 2016

Please @FlorianMuellerklein feel free to push the changes. If you launch a series of new runs please report the new numbers in a new column in your tables so that we can track the impact of that change on the final validation error of the models.


FlorianMuellerklein commented Jul 21, 2016

@ogrisel I pushed the changes so everything should be ready to go. I won't have time to check the new numbers for a little while (time to get some more GPUs) but I definitely will.



ogrisel commented Jul 22, 2016

It seems that the submission is missing a file:

Traceback (most recent call last):
  File "train_nn.py", line 92, in <module>
    train_X, test_X, train_y, test_y = load_pickle_data_cv()
  File "/home/ogrisel/code/ReScience-submission/code/utils.py", line 58, in load_pickle_data_cv
    np.save('data/pixel_mean.npy', pixel_mean)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 477, in save
    fid = open(file, "wb")
IOError: [Errno 2] No such file or directory: 'data/pixel_mean.npy'

FlorianMuellerklein commented Jul 22, 2016

@ogrisel That line should be saving the per-pixel mean so that it can be used to preprocess the testing data later on. I don't understand why it would give an error.



ogrisel commented Jul 22, 2016

Ok, I was in the wrong folder.

Edit: Actually no, when I run the script from the top level folder I get this other error:

$ python code/train_nn.py wide 16 4
Using gpu device 0: GeForce GTX 980 Ti (CNMeM is disabled, cuDNN 4007)
Using wide ResNet with depth 16 and width 4.
Traceback (most recent call last):
  File "code/train_nn.py", line 92, in <module>
    train_X, test_X, train_y, test_y = load_pickle_data_cv()
  File "/home/ogrisel/code/ReScience-submission/code/utils.py", line 19, in load_pickle_data_cv
    fo_1 = open('../data/cifar-10-batches-py/data_batch_1', 'rb')
IOError: [Errno 2] No such file or directory: '../data/cifar-10-batches-py/data_batch_1'

I believe the expectations about the folder structure were broken when converting from the original repo to this submission PR.
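A common way to make the script robust to the working directory is to resolve the data paths relative to the script file itself rather than the current directory. A hedged sketch (the path mirrors the traceback above; this helper is not part of the submission):

```python
import os

# Resolve the data directory relative to this script's location instead of
# the current working directory, so the command works from any folder.
# (The globals() guard only covers contexts where __file__ is undefined.)
here = os.path.dirname(os.path.abspath(__file__)) if "__file__" in globals() else os.getcwd()
data_batch = os.path.join(here, "..", "data", "cifar-10-batches-py", "data_batch_1")
```
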


Also for the sake of reproducibility I think you should include the version number of the libraries you used (and the sha1 digest of the git repo for lasagne).

ogrisel commented Jul 22, 2016

Could you please extend the README to give the command-line parameters that reproduce each of the scores reported in the table? I guess those are:

  • python code/train_nn.py wide 16 4
  • python code/train_nn.py normal 110
  • python code/train_nn.py normal 164

but it would be great if you could confirm this.

Also, for the sake of reproducibility, I think you should include the version numbers of the libraries you used (and the SHA-1 digest of the git checkout for Lasagne).


FlorianMuellerklein commented Jul 22, 2016

@ogrisel Ok I'll update the README with the command lines and versions as soon as I can.

The depth argument is actually the multiplier for how many residual blocks are inserted between each downsampling layer. So for ResNet-110 and ResNet-164, depth=18, and for the wide residual network, depth=2.

python code/train_nn.py wide 2 4
python code/train_nn.py normal 18
python code/train_nn.py bottleneck 18

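For reference, the mapping from this per-stage block count n to the conventional network depths can be sketched with a small helper (hypothetical, not part of the submission; the formulas follow the standard three-stage CIFAR ResNet/WRN layer-counting conventions):

```python
def total_depth(n, variant):
    """Total weighted-layer count for three-stage CIFAR ResNets.

    'normal'     : basic pre-activation blocks, depth = 6n + 2
    'bottleneck' : bottleneck blocks,           depth = 9n + 2
    'wide'       : wide residual network,       depth = 6n + 4
    """
    if variant == "normal":
        return 6 * n + 2
    elif variant == "bottleneck":
        return 9 * n + 2
    elif variant == "wide":
        return 6 * n + 4
    raise ValueError("unknown variant: %s" % variant)

assert total_depth(18, "normal") == 110      # ResNet-110
assert total_depth(18, "bottleneck") == 164  # ResNet-164
assert total_depth(2, "wide") == 16          # WRN-16-4
```
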


ogrisel commented Jul 22, 2016

Thanks!


ogrisel commented Jul 23, 2016

Ok, I trained the "wide 2 4" configuration and reached a final validation accuracy of 0.957476. However, the weights folder did not exist (probably because git does not check out empty folders), so saving the weights failed. You might want to make sure the folders are created if they don't exist before writing files in this script :)

iter: 198 | TL: 0.192 | VL: 0.202 | Vacc: 0.956 | Ratio: 0.95 | Time: 157.3
iter: 199 | TL: 0.192 | VL: 0.2 | Vacc: 0.955 | Ratio: 0.96 | Time: 157.3
Final Acc: 0.957476
Traceback (most recent call last):
  File "train_nn.py", line 133, in <module>
    f = gzip.open('data/weights/%s%d_resnet.pklz'%(variant,depth), 'wb')
  File "/usr/lib/python2.7/gzip.py", line 34, in open
    return GzipFile(filename, mode, compresslevel)
  File "/usr/lib/python2.7/gzip.py", line 94, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
IOError: [Errno 2] No such file or directory: 'data/weights/wide2_resnet.pklz'
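A minimal sketch of the suggested fix, i.e. ensuring the output directory exists before opening the weights file (demonstrated in a temporary directory here; the 'data/weights' path is taken from the traceback above):

```python
import os
import tempfile

# Create the weights directory before writing, mirroring the fix suggested
# above; a temporary base directory is used so this sketch has no side
# effects on the repository layout.
base = tempfile.mkdtemp()
out_dir = os.path.join(base, "data", "weights")
if not os.path.isdir(out_dir):  # Python 2/3 compatible guard
    os.makedirs(out_dir)
```
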

FlorianMuellerklein commented Jul 23, 2016

@ogrisel Oh no, I'm really sorry about that. I'll add that to my next update.

EDIT: Updated to include the Python package versions and the commands to reproduce. train_nn.py now creates the necessary folders if they don't already exist.


ogrisel commented Jul 24, 2016

I trained again a "wide 2 4" model and I got:

Accuracy on the testing set,  0.9513

So a 4.86% test error. This is significantly better than what you write in your report. Maybe because of d42627a?

I don't have the plot because of the matplotlib backend issue that I fixed in the meantime. You should probably store the train/val loss data in a file to decouple training from convergence monitoring, but this is not a big deal. Here is the text log of that run if you are interested.

Also, it seems that 2 runs don't yield the same results despite the numpy rng seeding at the top of the script. I suspect that the init is done by the theano rng, which is probably not seeded.
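To illustrate the suspicion: seeding NumPy's global RNG does not propagate to independently constructed RandomState objects (which is what Theano/Lasagne initializers typically draw from), so those need to be seeded separately. A pure-NumPy sketch:

```python
import numpy as np

np.random.seed(42)                  # seeds only the module-level global RNG
a = np.random.RandomState().rand()  # fresh RNG, seeded from OS entropy
np.random.seed(42)                  # re-seed the global RNG identically
b = np.random.RandomState().rand()
# a != b (with overwhelming probability): re-seeding the global RNG did
# not make the independent RandomState instances reproducible.
```
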


ogrisel commented Jul 25, 2016

Another missing folder for the script to work by default: plots

Traceback (most recent call last):
  File "train_nn.py", line 150, in <module>
    pyplot.savefig('plots/%s%d_resnet.png'%(variant,depth))
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 695, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/figure.py", line 1533, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 2233, in print_figure
    **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 521, in print_png
    filename_or_obj = open(filename_or_obj, 'wb')
IOError: [Errno 2] No such file or directory: 'plots/normal18_resnet.png'

Edit: I had launched the run prior to your last change in 0134558. This specific issue should be fixed now. Please ignore this comment.


ogrisel commented Jul 25, 2016

Here is the result of another run (this time ResNet-110):

$ python train_nn.py normal 16  # ResNet-110
[...]
iter: 197 | TL: 0.159 | VL: 0.276 | Vacc: 0.943 | Ratio: 0.58 | Time: 347.6
iter: 198 | TL: 0.158 | VL: 0.28 | Vacc: 0.942 | Ratio: 0.56 | Time: 347.0
iter: 199 | TL: 0.157 | VL: 0.271 | Vacc: 0.943 | Ratio: 0.58 | Time: 346.8
Final Acc: 0.948774
$ python test_model.py normal 16
[...]
Accuracy on the testing set,  0.9385

So the test error for this run of ResNet-110 is 6.15% which is slightly better than the reported 6.37% in the Identity Mappings paper (median of 5 runs).


FlorianMuellerklein commented Jul 25, 2016

@ogrisel Wow, that is interesting that the wide ResNet did so much better. How close were the two runs with the wide net?

Your ResNet-110 run is from after my latest commit as well? As soon as I get some spare GPU time I will re-run these and update everything on my end.



ogrisel commented Jul 25, 2016

My ResNet-110 run was run on the code as of d42627a. I don't have the weights of the first WRN run so I cannot compute its test accuracy.


ogrisel commented Jul 25, 2016

@FlorianMuellerklein also: contrary to what the commit message says, the README file was not updated in 0134558.


FlorianMuellerklein commented Jul 26, 2016

@ogrisel Whoops, I guess I forgot to add it to the commit. I'll fix that.



ogrisel commented Jul 26, 2016

Here is the result for ResNet-164 (trained with python train_nn.py bottleneck 18 on d42627a as well) and interrupted at iteration 183 because the training loss was not moving much anymore:

Accuracy on the testing set,  0.9482

=> 5.180% test error which is better than the reported 5.46% error in the Identity Mappings paper.

Note that each iteration took 639 s on my GTX 980 Ti. This is about 4 times slower than the 16-deep wide resnet, and GPU utilization is in the 40-50% range rather than 80%+ for the WRN, which confirms what is stated in the WRN paper.


ogrisel commented Jul 26, 2016

I have very different speeds per-epochs than what you write in your report (for the preactivation resnets, the WRN speed is similar). I used Cuda 7.5 and CuDNN v4. Here is my .theanorc:

[global]
device=gpu
floatX=float32
allow_gc=True
optimizer_including="local_ultra_fast_sigmoid"

[nvcc]
use_fast_math=True
fastmath=True

What is yours?


ogrisel commented Jul 26, 2016

I upgraded to CuDNN 5.1 RC and get the same speed.


ogrisel commented Jul 26, 2016

I went back to a more streamlined theano configuration and still get the same speed per epoch:

[global]
device=gpu
floatX=float32
#allow_gc=True
#optimizer_including="local_ultra_fast_sigmoid"

#[nvcc]
#use_fast_math=True
#fastmath=True

FlorianMuellerklein commented Jul 26, 2016

I'm using:

[global]
mode=FAST_RUN
floatX=float32
device=gpu0
allow_gc=True
optimizer_including=cudnn

[dnn.conv]
algo_fwd=time_once
algo_bwd_data=time_once
algo_bwd_filter=time_once

[nvcc]
fastmath=True

[cuda]
root=/usr/local/cuda

[lib]
cnmem=0.45

ogrisel commented Jul 26, 2016

Thanks, that was it. I now get a 90s epoch on the small WRN model as you reported. I should have read the theano doc. I think it's a good idea to include that info along with your CuDNN version in the report.


ozancaglayan commented Aug 25, 2016

Fine for me, but I suppose that after the reviewing process the code folder should run out-of-the-box :)



FlorianMuellerklein commented Aug 27, 2016

@ozancaglayan No problem, I'll make sure that I update it to run out of the box once the reviewing is over.



emmanuelle commented Aug 28, 2016

Thank you very much @ogrisel and @ozancaglayan!



ozancaglayan commented Aug 29, 2016

So I finally finished training the 3 models mentioned. Below are the results on the test set using test_model.py. I believe the results in your ReScience paper are also for the test set, right?

ResNet Type   | Original Paper | ReScience Article | Reviewer 1 | Reviewer 2
ResNet-110    | 6.37           | 6.06              | 6.150      | 6.189
ResNet-164    | 5.46           | 5.64              | 5.180      | 5.820
WResNet-n2-k4 | 5.55           | 5.11              | 4.86       | 4.73

ogrisel commented Aug 29, 2016

@ozancaglayan I went on and consolidated my own results that were scattered in the previous comments into your summary table.

I think the variation is entirely attributable to the random init of the weights. The original pre-activation resnet paper reported the median of 5 runs, which makes me think that the reproduced test scores are in line. The standard deviation is quite high though: on the order of half a percent, I would say.

The reproduced WResNet-n2-k4 test scores seem slightly better than in the original paper.


ozancaglayan commented Aug 29, 2016

Ah, I actually fixed the seed to 1234, used deterministic cuDNN ops in .theanorc, and my Theano is also patched to be more deterministic on GPU.

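For reference, the deterministic cuDNN convolution algorithms can be selected in .theanorc roughly as follows (a sketch based on the Theano configuration flags, not copied from the reviewer's actual setup):

[dnn.conv]
algo_bwd_data=deterministic
algo_bwd_filter=deterministic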


Member

rougier commented Sep 13, 2016

@ogrisel @ozancaglayan Any update on the review ?


Member

rougier commented Sep 27, 2016

@ogrisel @ozancaglayan 🔔 Any update on the review ?


ozancaglayan commented Sep 27, 2016

Should we comment on the layout or typography issues in the article?



Member

rougier commented Sep 27, 2016

Yes, you can signal them. Note that @FlorianMuellerklein signaled at the top that there were some formatting issues, but it would be great if you could report them here. Hopefully @FlorianMuellerklein can correct most of them, and I will correct the others during publication.

Member

rougier commented Oct 7, 2016


Member

rougier commented Oct 19, 2016

Dear @emmanuelle @ogrisel @ozancaglayan,

Is there anything we can do to help this review move forward?


Member

rougier commented Nov 2, 2016

Dear @emmanuelle @ogrisel @ozancaglayan,

Any update on this review ?


ozancaglayan commented Nov 3, 2016

Sorry for the delay, I am really busy these days.

Code

  • There were some issues, previously mentioned by @ogrisel and me, with the folder layout. It would be nice if the code, once cloned, worked out-of-the-box for maximum reproducibility.
  • Fixing the numpy rng seed can be good for improved reproducibility but this may change your reported results.

Article

  • In the first paragraph, the sentences starting with "main idea", "basic idea" and "the residual blocks are basically" should be reformulated/merged for ease of reading.
  • The first figure can be enhanced with a higher resolution one. The caption of the figure is not positioned correctly.
  • "Whenever the number of filters are increased [comma here] the first convolution layer..."
  • First sentence after the first table should be the caption of the table or it should be a bold title, etc.
  • The next sentence there is problematic. What is parallel to what? By saying "are added after each block", do you mean "are summed"?
  • The next paragraph, you can say "[1] also introduces a bottleneck architecture" instead of mentioning the full article name
  • When you cite the papers inside brackets in the whole article, I think they should be before the final dot of the sentence, e.g. "foo bar [1]." instead of "foo bar. [1]"
  • Preactivation Residual Networks [2] and Wide Residual Networks [3] may be better than "... [2][3]"
  • Figure title for preactivation resnet is not positioned correctly.
  • The tables need captions for clarity.
  • Paragraph is wrongly wrapped with the figure showing basic-wide and wide-dropout blocks.
  • I think there's no need to put a quotation from an article inside quotes, since you cite it just after the sentence. An example from the Methods section: "32x32 images, with the per-pixel mean subtracted" [1]
  • Data Augmentation last sentence suggestion: "Batch size of 64 is used instead of 128 because of hardware constraints."
  • Training and Regularization: The learning rate schedule between the curly braces can be explained textually for clarity.
  • Hardware and software last sentence -> "but with more memory"
  • Results last paragraph: "...in the cross-entropy during training and also much less of a quadratic trend than seen in other neural networks." is not very clear to me.
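On the learning-rate point above: a step schedule written as {epoch: rate} pairs reads as "use rate r from epoch e onwards", which could be stated in a sentence or spelled out in code. A sketch with placeholder values (not the article's actual schedule):

```python
def lr_at_epoch(epoch, schedule=None):
    """Learning rate in effect at `epoch` for a step schedule written as
    {start_epoch: rate}. The default values are placeholders, not the
    article's actual schedule."""
    if schedule is None:
        schedule = {0: 0.1, 80: 0.01, 120: 0.001}
    rate = None
    # Walk the thresholds in order; the last one at or below `epoch` wins.
    for start in sorted(schedule):
        if epoch >= start:
            rate = schedule[start]
    return rate
```

With the placeholder schedule, `lr_at_epoch(90)` returns 0.01.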


@rougier rougier self-assigned this Nov 14, 2016

@rougier


rougier Nov 14, 2016

Member

@FlorianMuellerklein Can you make the suggested modifications?
@ogrisel @ozancaglayan What is your decision? Does the paper need more review, or can you decide about acceptance/rejection?


@ozancaglayan


ozancaglayan Nov 17, 2016

Should I wait for the modifications to decide?


@rougier


rougier Nov 17, 2016

Member

No, but you can condition your decision (e.g. provided the author makes the suggested changes...)


@FlorianMuellerklein


FlorianMuellerklein Nov 17, 2016

@rougier Sorry, I've been pretty busy lately as well. I'll have these edits soon.


@ozancaglayan


ozancaglayan Nov 25, 2016

OK, provided the author makes the suggested changes, my decision is to accept the paper.


@rougier


rougier Nov 29, 2016

Member

@ogrisel Can you tell us your decision about the paper, or do you need more time?


@rougier rougier added the Stalled label Dec 18, 2016

@rougier


rougier Jan 11, 2017

Member

Dear @FlorianMuellerklein @ogrisel @emmanuelle @ozancaglayan,

This submission is now 6 months old and we need to try to move things forward.
Could you have a look at past messages and possibly answer questions? Thanks.


@FlorianMuellerklein

FlorianMuellerklein Mar 15, 2017

I just pushed an update with most of the changes that Ozan requested.


@rougier


rougier Mar 15, 2017

Member

@FlorianMuellerklein Thanks for the update!
@ozancaglayan Can you make your decision based on the new update?


@rougier rougier referenced this pull request in ReScience/ReScience Apr 20, 2017

Open

Replication of preprints #50

@rougier


rougier May 3, 2017

Member

@ReScience/reviewers We need some new reviewers here, if anyone is interested!


@ozancaglayan


ozancaglayan May 3, 2017

As I mentioned before, I accept the paper.


@rougier


rougier May 3, 2017

Member

@ozancaglayan Sorry, I didn't see it.


@rougier


rougier May 29, 2017

Member

🛎 @grisel Can you tell me if you're able to handle this review or if we need to look for another reviewer? No problem if you can't, but I need to know to try to move this review forward.


@rth


rth Jun 1, 2017

🛎 @grisel Can you tell me if you're able to handle this review or if we need to look for another reviewer? No problem if you can't, but I need to know to try to move this review forward.

With an "o" @ogrisel ..


@ogrisel


ogrisel Jun 9, 2017

Sorry for the lack of reply. I will try to find some time next week to review the updated paper / code.


@remram44


remram44 Jul 21, 2017

It seems that this uses an unreleased version of Lasagne (which requires a development version of Theano as well). It would be great if the precise Git hashes used to run this could be included in the README. I applaud your including the precise versions of numpy and Theano, but sklearn and matplotlib were left out.

The pip freeze command might help!
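For example (the file name is just an illustration, and pip freeze output is environment-specific):

```shell
# Record the exact versions of all installed packages in one go.
python3 -m pip freeze > requirements.txt

# Packages installed from git (e.g. a development Lasagne) appear with their
# commit hash, which can then be pinned exactly, along the lines of:
#   pip install git+https://github.com/Lasagne/Lasagne.git@<commit-hash>
cat requirements.txt
```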


@rougier


rougier Oct 31, 2017

Member

@FlorianMuellerklein Are you still following this submission? We might need to assign a new reviewer to complete the process, but first I wanted to know if you're still interested.


@FedericoV


FedericoV Oct 31, 2017

If @FlorianMuellerklein is not available, I can jump in - but I'm not sure if it's ideal to have an additional pair of eyes at this point in the process.


@rougier


rougier Oct 31, 2017

Member

Thanks @FedericoV. @FlorianMuellerklein is the author of the replication. @ozancaglayan already accepted the paper and @ogrisel has not responded yet. At this point, if you can review it, that would be great.


@rougier rougier removed the Stalled label Oct 31, 2017

@rougier rougier closed this Feb 14, 2018
