reset_weights is true by default #136

Closed
jpdoyle opened this issue Apr 30, 2016 · 5 comments

Comments

jpdoyle commented Apr 30, 2016

I realize this may be the intended behavior for the use case the authors have in mind, but I just wasted ~30 CPU-days because I didn't realize it was the default. I was training on a dataset that couldn't fit on disk uncompressed, so I did what seemed reasonable -- extract a batch, read the NN from a file, train on the batch, write the NN back to the file, and repeat for the next batch. But because this option is true by default, each new round threw away everything learned in the earlier rounds. I have fixed the problem on my end, but I think this is an unreasonable default.
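
For anyone landing here with the same workflow, here is a minimal sketch of the load-train-save loop with the reset flag passed explicitly. It assumes the v0.0.1-era API (a `network<mse, gradient_descent>`, a `train()` overload that takes `reset_weights` after the two progress callbacks, and stream-based weight serialization); `extract_next_batch()` is a hypothetical helper, not part of tiny-cnn, and the exact argument order should be checked against your version of `network.h`:

```cpp
#include <fstream>
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Hypothetical helper: fills X/y with the next chunk of the on-disk dataset.
// Not part of tiny-cnn; declared here only to make the loop self-contained.
void extract_next_batch(std::vector<vec_t>& X, std::vector<label_t>& y);

void train_in_chunks(network<mse, gradient_descent>& nn, int num_chunks) {
    for (int chunk = 0; chunk < num_chunks; ++chunk) {
        // Restore the weights written by the previous chunk (if any).
        std::ifstream ifs("weights.bin");
        if (ifs) ifs >> nn;

        std::vector<vec_t>   X;
        std::vector<label_t> y;
        extract_next_batch(X, y);

        // reset_weights = false keeps the restored weights instead of
        // re-initializing them at the start of train(). The parameter
        // position is an assumption based on the v0.0.1-era signature.
        nn.train(X, y, /*batch_size=*/32, /*epochs=*/1,
                 [] {}, [] {}, /*reset_weights=*/false);

        // Persist progress before moving on to the next chunk.
        std::ofstream ofs("weights.bin");
        ofs << nn;
    }
}
```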

DozerTheCat commented May 1, 2016

Also, be careful that the solvers are not cleared of their persistent optimization data; otherwise they will not train efficiently. You could also have worked around the data-size issue by swapping the data vectors in the callback after each epoch.
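
To make that alternative concrete, below is a rough sketch of the swap-in-the-callback approach: call `train()` once for the whole run and overwrite the contents of the data vectors from the epoch-end callback, so both the weights and the solver's per-parameter state survive. It assumes `train()` iterates over the same vectors on every epoch; `load_next_chunk()` is a hypothetical helper, and the `adagrad` optimizer and argument order are illustrative only:

```cpp
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Hypothetical helper that overwrites X/y with the next chunk read from disk.
void load_next_chunk(std::vector<vec_t>& X, std::vector<label_t>& y);

void train_streaming(network<mse, adagrad>& nn, int total_epochs) {
    std::vector<vec_t>   X;
    std::vector<label_t> y;
    load_next_chunk(X, y);

    // A single train() call for the entire run: the weights and the solver's
    // accumulated state (e.g. adagrad's per-weight sums) are never reset,
    // and each epoch-end callback swaps in the next chunk of data before
    // the following epoch starts iterating.
    nn.train(X, y, /*batch_size=*/32, total_epochs,
             [] { /* per-minibatch callback (unused) */ },
             [&] { load_next_chunk(X, y); });
}
```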

jpdoyle commented May 1, 2016

Luckily, I am only doing vanilla gradient descent for this -- I am attempting to replicate the same type of training that was done for AlphaGo, and they didn't do anything very stateful. But that's good to keep in mind.

nyanp commented May 5, 2016

@Ginto8
Thank you for your feedback :) Actually, it is intended behavior (mainly for backward compatibility), but now I think your opinion is reasonable. I'll change this behavior in the next release.

jpdoyle commented May 8, 2016

@nyanp Awesome, thanks. Thanks for making this library btw, it's really nice to have easy-to-use, fairly fast CNNs without having to go through driver hell.

nyanp added a commit that referenced this issue Jun 6, 2016
- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Jun 12, 2016
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Jun 13, 2016
wangyida pushed a commit to wangyida/tiny-cnn that referenced this issue Jun 14, 2016
nyanp pushed a commit that referenced this issue Jun 15, 2016
* v0.0.1 -> v0.1.0

- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example

* Create DeviceAbstraction.puml

Create test.md

Update test.md

Update and rename test.md to device-abstraction-uml.md

Create device-abstraction-uml.puml

Update readme.md

reorganize cmake

reorganize cmake

remove test warnings

remove mock files

Update device-abstraction-uml.puml

* add tests&comments

* cleanup codes

currently we don't need hand-written ctor/dtors in dropout

* Remove OpenCV dependency closes #2

* fix comments

* update build instruction file

* support CNN_USE_OPENCV build config

* fix build configuration file

* change the location of stb_image header files

* follows the comment #167 (comment)
relocate stb_image files

* Update layer.h

* Implement on deconvolution and unpooling layer.

* core module skeletons

* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.

* Fix linker error due to duplicate symbols.

* Update README.md

- Update layer catalogue
- Upgrade example codes to v0.1.0

* update first convolutional layer abstraction version with NNPACK support

base methods for new API

update find_package NNPACK

update tiny_backend

fix segfault by adding move constructor to conv layer

refactor nnp_backend

update UML with new design

fix required libs in cmake, set float_t as float and set activation vector as input in nnpack

fix padding and add assertion after nnp_convolution_inference

fix CMake warnings and reorganize modules

fix data race

fix broken tests

Reorganize CMake modules

add epsilon for broken tests

fix broken tests. float_t was missing in some layers.

fix clang errors

nyanp commented Jun 20, 2016

From v0.1.0, the default value of reset_weights has changed. See: https://github.com/nyanp/tiny-cnn/blob/master/doc/Changing-from-v0_0_1.md
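
For code that has to behave the same on both sides of this change, the least surprising option is to pass the flag explicitly rather than rely on the version-dependent default. A minimal sketch, assuming a train() overload that takes reset_weights after the two enumeration callbacks (see the linked migration notes for the exact v0.1.0 signature):

```cpp
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Pin reset_weights explicitly so the behavior does not depend on whether
// the installed library uses the pre- or post-v0.1.0 default.
template <typename Net>
void train_one_epoch_keeping_weights(Net& nn,
                                     const std::vector<vec_t>& X,
                                     const std::vector<label_t>& y) {
    nn.train(X, y, /*batch_size=*/32, /*epochs=*/1,
             [] {}, [] {}, /*reset_weights=*/false);
}
```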

nyanp closed this as completed Jun 20, 2016
nyanp added a commit that referenced this issue Jul 21, 2016
* v0.0.1 -> v0.1.0

- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example

* Update layer.h

* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.

* Fix linker error due to duplicate symbols.

* Update README.md

- Update layer catalogue
- Upgrade example codes to v0.1.0

* Removed a repeated include.

* Bug Fix: network::test was broken in multi-thread. closes #185

* fix test_with_stb_image.cpp and typos (tiny_cnn_hrds => tiny_cnn_hdrs) in CMakeLists.txt (#189)

* fix a compile error and warnings when the type float_t is a typedef of float (#191)

* fix README.md (#194)

* Refactor layer to handle minibatch at once (#181)

* get rid of unnecessary compiler warnings (C4018: '>=': signed/unsigned mismatch)

* refactor layer to handle a minibatch at once (previously individual input samples only)

* move new functions apply_cost_if_defined and add_sample_gradient to an anonymous namespace in order to hopefully make AppVeyor happy

* revert the changes to test_network.h

* minor type fixes to get rid of compiler warnings

* make quick changes to deconvolutional_layer so that the examples at least compile

* fix backward_activation to use all samples

* remove unused variables

* update picotest to fix operator precedence

* fix input data ordering

* fix overriding prev_delta

* change gradients to be per-sample as well

* remove unused variables

* fix indexing in convolutional_layer

* also in dropout_layer, have a different mask vector for each sample of a minibatch

* deconvolution_layer: fix allocating space for prev_out_ and cur_out_padded_

* add gradient check for minibatch

* minor: change types to fix some compiler warnings

* Add application links to doc, #158

* Fixed typo (#205)

* Add a contribution document

* fixed typo (#216)

* Add batch normalization #147

* Add batch normalization prototype & remove batch-level parallelsim #147

* add backward pass & formatting

* Add unit test for forward pass

* Add numerical check for batchnorm

* Fix convolutional::brop for pointing correct storage

* Fix bprop in batchnorm

* Change an order of arguments in ctor

to keep consistency to other layers

* add batch normalization layer

* fix compiler error on deconvolutional-layer

* Implement caffe importer

* Revert changes around calc_delta

* Fix backprop of bias factor in conv layer

* Fix compilation error in MSVC2013, close #218 #231

* Add slice layer (#233)

* Bug Fix #234

* Add BSD-3 license file #228

* Fix handling non-square input data in caffemodel #227

* Add power layer #227

* Generalization of loss functions + Correcting MSE (#232)

* generalization of loss function to vectors (solves wrong MSE)

* removed unnecessary function due to loss function generalization

* loss function df operating on vec_t

* correct df of mse

* missin brackets

* fix compile errors in conv-layer

* remove sample_count
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016
edgarriba added a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016