reset_weights is true by default #136
Comments
Also be careful that the solvers are not cleared of their persistent optimization data; otherwise they will not train efficiently. You could also have worked around the data size issue by swapping the data vectors in the callback after each epoch.
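The "swap the data vectors in the callback" idea can be sketched as follows. This is a toy illustration, not tiny-dnn code: `load_next_chunk`, `on_epoch`, and `run_training` are hypothetical names, and the "training" is a stand-in. The point is that `std::vector::swap` exchanges the buffers in constant time, so only one chunk of the dataset needs to live in memory at a time.

```cpp
#include <vector>

// Pretend to read the next chunk of samples from disk (hypothetical helper).
std::vector<int> load_next_chunk(int epoch) {
    return std::vector<int>(3, epoch);  // 3 samples per chunk in this toy
}

int run_training(int epochs) {
    std::vector<int> data = load_next_chunk(0);
    int samples_seen = 0;
    // Epoch callback: swap in the next chunk in O(1), without copying,
    // so the old chunk's storage is released when `next` goes out of scope.
    auto on_epoch = [&](int epoch) {
        std::vector<int> next = load_next_chunk(epoch + 1);
        data.swap(next);
    };
    for (int epoch = 0; epoch < epochs; ++epoch) {
        samples_seen += static_cast<int>(data.size());  // "train" on this chunk
        on_epoch(epoch);
    }
    return samples_seen;
}
```

Because the trainer holds a reference to the same vector across epochs, swapping its contents inside the end-of-epoch callback is safe at that point: no iteration over the data is in flight.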
Luckily, I am only doing vanilla gradient descent for this -- I am
@Ginto8
@nyanp Awesome, thanks. Thanks for making this library btw, it's really nice to have easy-to-use, fairly fast CNNs without having to go through driver hell.
- Now we can handle non-sequential model as `network<graph>` tiny-dnn#108 tiny-dnn#153
- Catch up the latest format of caffe's proto tiny-dnn#162
- Improve the default behaviour of re-init weight tiny-dnn#136
- Add more tests and documents tiny-dnn#73
- Remove dependency of OpenCV in MNIST example
* v0.0.1 -> v0.1.0
  - Now we can handle non-sequential model as `network<graph>` #108 #153
  - Catch up the latest format of caffe's proto #162
  - Improve the default behaviour of re-init weight #136
  - Add more tests and documents #73
  - Remove dependency of OpenCV in MNIST example
* Create DeviceAbstraction.puml; create test.md; update test.md; update and rename test.md to device-abstraction-uml.md; create device-abstraction-uml.puml; update readme.md; reorganize cmake; remove test warnings; remove mock files; update device-abstraction-uml.puml
* add tests & comments
* cleanup codes (currently we don't need hand-written ctor/dtors in dropout)
* Remove OpenCV dependency, closes #2
* fix comments
* update build instruction file
* support CNN_USE_OPENCV build config
* fix build configuration file
* change the location of stb_image header files
* follows the comment #167 (comment): relocate stb_image files
* Update layer.h
* Implement on deconvolution and unpooling layer.
* core module skeletons
* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.
* Fix linker error due to duplicate symbols.
* Update README.md: update layer catalogue, upgrade example codes to v0.1.0
* update first convolutional layer abstraction version with NNPACK support: base methods for new API; update find_package NNPACK; update tiny_backend; fix segfault by adding move constructor to conv layer; refactor nnp_backend; update UML with new design; fix required libs in cmake; set float_t as float and set activation vector as input in nnpack; fix padding and add assertion after nnp_convolution_inference; fix CMake warnings and reorganize modules; fix data race; reorganize CMake modules; add epsilon for broken tests; fix broken tests (float_t was missing in some layers); fix clang errors
From v0.1.0, the default value of reset_weight has changed. See: https://github.com/nyanp/tiny-cnn/blob/master/doc/Changing-from-v0_0_1.md
* v0.0.1 -> v0.1.0
  - Now we can handle non-sequential model as `network<graph>` #108 #153
  - Catch up the latest format of caffe's proto #162
  - Improve the default behaviour of re-init weight #136
  - Add more tests and documents #73
  - Remove dependency of OpenCV in MNIST example
* Update layer.h
* Added test case in separate cpp file to force link errors if there are duplicate symbols when including tiny_cnn.h in multiple .cpp files.
* Fix linker error due to duplicate symbols.
* Update README.md: update layer catalogue, upgrade example codes to v0.1.0
* Removed a repeated include.
* Bug fix: network::test was broken in multi-thread. closes #185
* fix test_with_stb_image.cpp and typos (tiny_cnn_hrds => tiny_cnn_hdrs) in CMakeLists.txt (#189)
* fix a compile error and warnings when the type float_t is a typedef of float (#191)
* fix README.md (#194)
* Refactor layer to handle minibatch at once (#181)
* get rid of unnecessary compiler warnings (C4018: '>=': signed/unsigned mismatch)
* refactor layer to handle a minibatch at once (previously individual input samples only)
* move new functions apply_cost_if_defined and add_sample_gradient to an anonymous namespace in order to hopefully make AppVeyor happy
* revert the changes to test_network.h
* minor type fixes to get rid of compiler warnings
* make quick changes to deconvolutional_layer so that the examples at least compile
* fix backward_activation to use all samples
* remove unused variables
* update picotest to fix operator precedence
* fix input data ordering
* fix overriding prev_delta
* change gradients to be per-sample as well
* remove unused variables
* fix indexing in convolutional_layer
* also in dropout_layer, have a different mask vector for each sample of a minibatch
* deconvolution_layer: fix allocating space for prev_out_ and cur_out_padded_
* add gradient check for minibatch
* minor: change types to fix some compiler warnings
* Add application links to doc, #158
* Fixed typo (#205)
* Add a contribution document
* fixed typo (#216)
* Add batch normalization #147
* Add batch normalization prototype & remove batch-level parallelism #147
* add backward pass & formatting
* Add unit test for forward pass
* Add numerical check for batchnorm
* Fix convolutional::bprop for pointing correct storage
* Fix bprop in batchnorm
* Change an order of arguments in ctor to keep consistency to other layers
* add batch normalization layer
* fix compiler error on deconvolutional-layer
* Implement caffe importer
* Revert changes around calc_delta
* Fix backprop of bias factor in conv layer
* Fix compilation error in MSVC2013, close #218 #231
* Add slice layer (#233)
* Bug fix #234
* Add BSD-3 license file #228
* Fix handling non-square input data in caffemodel #227
* Add power layer #227
* Generalization of loss functions + correcting MSE (#232)
* generalization of loss function to vectors (solves wrong MSE)
* removed unnecessary function due to loss function generalization
* loss function df operating on vec_t
* correct df of mse
* missing brackets
* fix compile errors in conv-layer
* remove sample_count
I realize this may be intended behavior for the use case the authors have in mind, but I just wasted ~30 CPU-days because I didn't realize it was the default. I was training on a dataset that couldn't fit on disk uncompressed, so I did what seemed reasonable: extract a batch, read the network from a file, run the batch, write the network back to the file, and repeat for the next batch. But since this option is true by default, each round discarded everything learned in the earlier rounds. I have fixed the problem now, but I think this is an unreasonable default.
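The failure mode described above can be reproduced with a toy stand-in (this is not tiny-dnn code; `ToyNet`, its one-weight "training", and the file format are invented purely to illustrate the interaction between checkpointing and a reset-weights-by-default `train()`):

```cpp
#include <fstream>

// Toy stand-in for a network: one weight, save/load to a file, and a
// train() that mimics a reset_weights flag defaulting to "reset".
struct ToyNet {
    double w = 0.0;
    void save(const char* path) const { std::ofstream(path) << w; }
    void load(const char* path)       { std::ifstream(path) >> w; }
    // One "round" of training adds 1.0 of progress. With reset_weights
    // true, the weight is re-initialized first, discarding whatever was
    // just loaded from the checkpoint.
    void train(bool reset_weights) {
        if (reset_weights) w = 0.0;
        w += 1.0;
    }
};

// load -> train -> save, repeated once per extracted batch, as in the
// workflow above; returns the final weight read back from the checkpoint.
double run_rounds(int rounds, bool reset_weights) {
    const char* ckpt = "toy_net.ckpt";
    ToyNet{}.save(ckpt);  // initial checkpoint
    for (int i = 0; i < rounds; ++i) {
        ToyNet net;
        net.load(ckpt);
        net.train(reset_weights);
        net.save(ckpt);
    }
    ToyNet net;
    net.load(ckpt);
    return net.w;
}
```

With `reset_weights == true`, every round wipes the loaded state before training, so after any number of rounds the checkpoint only ever holds one round of progress; with `reset_weights == false`, progress accumulates across rounds, which is what a resume-from-checkpoint loop needs.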