Batched OCR training? #91
There are a bunch of different reasons. The code was ported from Python, where batching wouldn't have helped with speed, so it was easiest and safest to leave it as is. In addition, for other networks, batching tends to result in higher test set error rates, so there wasn't much motivation to add it (it seems to have no effect either way on error rates for LSTMs). Eigen Matrix also didn't support GPU, so there wasn't much motivation for that anyway. Now that the code uses Eigen Tensor, batching would make more sense. But Eigen turns out not to be such a convenient framework for multicore or GPU computations anyway. In addition, LSTMs on GPUs are probably best implemented using fused kernels implementing multiple time steps at a time. So, the upshot is that I've started working on a separate project for OCR similar to clstm, but with a focus on parallelization and GPU support, including support for batching.
I was suffering from the speed of clstm training. How could the community contribute to the new project?
Give me a few weeks; I just moved from Google to NVIDIA. GPU support will be much better now :-)
Tom,
Currently the `CLSTMOCR` class that is defined in `clstmhl.h` can only train on single line images. Due to this, optimizations like Eigen's multi-threaded tensor operations and the GPU support have little effect, since the task size for single samples is too small for them to make a difference.
From a cursory reading of the code I could gather that batched training is supported by the lower-level API, so my question is what would have to be done to have batched training for the high-level `CLSTMOCR` (and ideally `CLSTMText` as well) API?
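As a rough illustration of what batching line images involves (this is not clstm code): text lines have different widths, so a batch generally has to be padded to a common length, with the original lengths kept so that padded time steps can be ignored later. The sketch below is plain Python; `pad_batch` and the list-of-columns layout are hypothetical names chosen for this example and are not part of the clstm API.

```python
from typing import List, Tuple

# One text line: a sequence of T column vectors, each of the same height H.
# This layout is an assumption made for the example, not clstm's format.
Line = List[List[float]]


def pad_batch(lines: List[Line], pad_value: float = 0.0) -> Tuple[List[Line], List[int]]:
    """Pad variable-length line images to a common length.

    Returns the padded lines (all as long as the longest input) together
    with the original lengths, which a trainer would need in order to
    mask out the padded time steps.
    """
    if not lines:
        return [], []
    if any(len(line) == 0 for line in lines):
        raise ValueError("every line must contain at least one column")

    height = len(lines[0][0])
    if any(len(col) != height for line in lines for col in line):
        raise ValueError("all columns must share the same height")

    lengths = [len(line) for line in lines]
    max_len = max(lengths)

    padded = []
    for line in lines:
        # Copy the real columns, then append constant-valued padding columns.
        columns = [list(col) for col in line]
        columns += [[pad_value] * height for _ in range(max_len - len(line))]
        padded.append(columns)
    return padded, lengths
```

For example, calling `pad_batch([a, b])` with a three-column line `a` and a one-column line `b` returns `b` extended by two padding columns, plus the lengths `[3, 1]`. Whether padding like this would be enough for `CLSTMOCR` depends on how the lower-level API handles sequence lengths, which this sketch does not address.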