reset_weights is true by default #136

Closed
jpdoyle opened this issue Apr 30, 2016 · 5 comments

Comments

jpdoyle commented Apr 30, 2016

I realize this may be the intended behavior for the use case the authors have in mind, but I just wasted ~30 CPU-days because I didn't realize it was the default. I was training on a dataset that couldn't fit on disk uncompressed, so I did what seemed reasonable -- extract a batch, read the NN from a file, train on the batch, write the NN back to the file, and repeat for the next batch. But because this option is true by default, each new round threw away everything learned in the earlier rounds. I have fixed the problem on my end, but I think this is an unreasonable default.
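
For anyone landing here with the same workflow, here is a minimal sketch of the load-train-save loop with the reset flag passed explicitly. It assumes the v0.0.1-era API (a `network<mse, gradient_descent>`, a `train()` overload that takes `reset_weights` after the two progress callbacks, and stream-based weight serialization); `extract_next_batch()` is a hypothetical helper, not part of tiny-cnn, and the exact argument order should be checked against your version of `network.h`:

```cpp
#include <fstream>
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Hypothetical helper: fills X/y with the next chunk of the on-disk dataset.
// Not part of tiny-cnn; declared here only to make the loop self-contained.
void extract_next_batch(std::vector<vec_t>& X, std::vector<label_t>& y);

void train_in_chunks(network<mse, gradient_descent>& nn, int num_chunks) {
    for (int chunk = 0; chunk < num_chunks; ++chunk) {
        // Restore the weights written by the previous chunk (if any).
        std::ifstream ifs("weights.bin");
        if (ifs) ifs >> nn;

        std::vector<vec_t>   X;
        std::vector<label_t> y;
        extract_next_batch(X, y);

        // reset_weights = false keeps the restored weights instead of
        // re-initializing them at the start of train(). The parameter
        // position is an assumption based on the v0.0.1-era signature.
        nn.train(X, y, /*batch_size=*/32, /*epochs=*/1,
                 [] {}, [] {}, /*reset_weights=*/false);

        // Persist progress before moving on to the next chunk.
        std::ofstream ofs("weights.bin");
        ofs << nn;
    }
}
```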

DozerTheCat commented May 1, 2016

Also, be careful that the solvers are not cleared of their persistent optimization data; otherwise they will not train efficiently. You could also have worked around the data-size issue by swapping the data vectors in the callback after each epoch.
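
To make that alternative concrete, below is a rough sketch of the swap-in-the-callback approach: call `train()` once for the whole run and overwrite the contents of the data vectors from the epoch-end callback, so both the weights and the solver's per-parameter state survive. It assumes `train()` iterates over the same vectors on every epoch; `load_next_chunk()` is a hypothetical helper, and the `adagrad` optimizer and argument order are illustrative only:

```cpp
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Hypothetical helper that overwrites X/y with the next chunk read from disk.
void load_next_chunk(std::vector<vec_t>& X, std::vector<label_t>& y);

void train_streaming(network<mse, adagrad>& nn, int total_epochs) {
    std::vector<vec_t>   X;
    std::vector<label_t> y;
    load_next_chunk(X, y);

    // A single train() call for the entire run: the weights and the solver's
    // accumulated state (e.g. adagrad's per-weight sums) are never reset,
    // and each epoch-end callback swaps in the next chunk of data before
    // the following epoch starts iterating.
    nn.train(X, y, /*batch_size=*/32, total_epochs,
             [] { /* per-minibatch callback (unused) */ },
             [&] { load_next_chunk(X, y); });
}
```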

jpdoyle commented May 1, 2016

Luckily, I am only doing vanilla gradient descent for this -- I am attempting to replicate the same type of training that was done for AlphaGo, and they didn't do anything very stateful. But that's good to keep in mind.

nyanp commented May 5, 2016

@Ginto8
Thank you for your feedback :) Actually, it is intended behavior (mainly for backward compatibility), but now I think your opinion is reasonable. I'll change this behavior in the next release.

jpdoyle commented May 8, 2016

@nyanp Awesome, thanks. Thanks for making this library btw, it's really nice to have easy-to-use, fairly fast CNNs without having to go through driver hell.

nyanp added a commit that referenced this issue Jun 6, 2016
- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Jun 12, 2016
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Jun 13, 2016
wangyida pushed a commit to wangyida/tiny-cnn that referenced this issue Jun 14, 2016
nyanp pushed a commit that referenced this issue Jun 15, 2016
* v0.0.1 -> v0.1.0

- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example

* Create DeviceAbstraction.puml

Create test.md

Update test.md

Update and rename test.md to device-abstraction-uml.md

Create device-abstraction-uml.puml

Update readme.md

reorganize cmake

reorganize cmake

remove test warnings

remove mock files

Update device-abstraction-uml.puml

* add tests&comments

* cleanup codes

currently we don't need hand-written ctor/dtors in dropout

* Remove OpenCV dependency closes #2

* fix comments

* update build instruction file

* support CNN_USE_OPENCV build config

* fix build configuration file

* change the location of stb_image header files

* follows the comment #167 (comment)
relocate stb_image files

* Update layer.h

* Implement on deconvolution and unpooling layer.

* core module skeletons

* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.

* Fix linker error due to duplicate symbols.

* Update README.md

- Update layer catalogue
- Upgrade example codes to v0.1.0

* update first convolutional layer abstraction version with NNPACK support

base methods for new API

update find_package NNPACK

update tiny_backend

fix segfault by adding move constructor to conv layer

refactor nnp_backend

update UML with new design

fix required libs in cmake, set float_t as float and set activation vector as input in nnpack

fix padding and add assertion after nnp_convolution_inference

fix CMake warnings and reorganize modules

fix data race

fix broken tests

Reorganize CMake modules

add epsilon for broken tests

fix broken tests. float_t was missing in some layers.

fix clang errors

nyanp commented Jun 20, 2016

From v0.1.0, the default value of reset_weights has changed. See: https://github.com/nyanp/tiny-cnn/blob/master/doc/Changing-from-v0_0_1.md
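
For code that has to behave the same on both sides of this change, the least surprising option is to pass the flag explicitly rather than rely on the version-dependent default. A minimal sketch, assuming a train() overload that takes reset_weights after the two enumeration callbacks (see the linked migration notes for the exact v0.1.0 signature):

```cpp
#include <vector>
#include "tiny_cnn/tiny_cnn.h"

using namespace tiny_cnn;

// Pin reset_weights explicitly so the behavior does not depend on whether
// the installed library uses the pre- or post-v0.1.0 default.
template <typename Net>
void train_one_epoch_keeping_weights(Net& nn,
                                     const std::vector<vec_t>& X,
                                     const std::vector<label_t>& y) {
    nn.train(X, y, /*batch_size=*/32, /*epochs=*/1,
             [] {}, [] {}, /*reset_weights=*/false);
}
```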

nyanp closed this as completed Jun 20, 2016
nyanp added a commit that referenced this issue Jul 21, 2016
* v0.0.1 -> v0.1.0

- Now we can handle non-sequential model as ```network<graph>``` #108
#153
- Catch up the latest format of caffe's proto #162
- Improve the default behaviour of re-init weight #136
- Add more tests and documents #73
- Remove dependency of OpenCV in MNIST example

* Update layer.h

* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.

* Fix linker error due to duplicate symbols.

* Update README.md

- Update layer catalogue
- Upgrade example codes to v0.1.0

* Removed a repeated include.

* Bug Fix: network::test was broken in multi-thread. closes #185

* fix test_with_stb_image.cpp and typos (tiny_cnn_hrds => tiny_cnn_hdrs) in CMakeLists.txt (#189)

* fix a compile error and warnings when the type float_t is a typedef of float (#191)

* fix README.md (#194)

* Refactor layer to handle minibatch at once (#181)

* get rid of unnecessary compiler warnings (C4018: '>=': signed/unsigned mismatch)

* refactor layer to handle a minibatch at once (previously individual input samples only)

* move new functions apply_cost_if_defined and add_sample_gradient to an anonymous namespace in order to hopefully make AppVeyor happy

* revert the changes to test_network.h

* minor type fixes to get rid of compiler warnings

* make quick changes to deconvolutional_layer so that the examples at least compile

* fix backward_activation to use all samples

* remove unused variables

* update picotest to fix operator precedence

* fix input data ordering

* fix overriding prev_delta

* change gradients to be per-sample as well

* remove unused variables

* fix indexing in convolutional_layer

* also in dropout_layer, have a different mask vector for each sample of a minibatch

* deconvolution_layer: fix allocating space for prev_out_ and cur_out_padded_

* add gradient check for minibatch

* minor: change types to fix some compiler warnings

* Add application links to doc, #158

* Fixed typo (#205)

* Add a contribution document

* fixed typo (#216)

* Add batch normalization #147

* Add batch normalization prototype & remove batch-level parallelsim #147

* add backward pass & formatting

* Add unit test for forward pass

* Add numerical check for batchnorm

* Fix convolutional::brop for pointing correct storage

* Fix bprop in batchnorm

* Change an order of arguments in ctor

to keep consistency to other layers

* add batch normalization layer

* fix compiler error on deconvolutional-layer

* Implement caffe importer

* Revert changes around calc_delta

* Fix backprop of bias factor in conv layer

* Fix compilation error in MSVC2013, close #218 #231

* Add slice layer (#233)

* Bug Fix #234

* Add BSD-3 license file #228

* Fix handling non-square input data in caffemodel #227

* Add power layer #227

* Generalization of loss functions + Correcting MSE (#232)

* generalization of loss function to vectors (solves wrong MSE)

* removed unnecessary function due to loss function generalization

* loss function df operating on vec_t

* correct df of mse

* missin brackets

* fix compile errors in conv-layer

* remove sample_count
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016
edgarriba added a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016
edgarriba pushed a commit to edgarriba/tiny-cnn that referenced this issue Aug 8, 2016