Updated training? #37

Open
drzax opened this Issue Mar 17, 2016 · 7 comments


@drzax
drzax commented Mar 17, 2016

I'm very new to the whole world of neural networks so please forgive any silly questions.

Is there a way to update a network by training it on new text? The undocumented -init_from flag looks like it might do that, but I can't quite be sure.

@AlekzNet

I would rephrase the question as follows: when initializing from a checkpoint, what parameters/data can be changed, and what parameters will/must stay the same?

@jcjohnson
Owner

When using -init_from the model options (model type, wordvec size, rnn size, rnn layers, dropout, batchnorm) will be ignored and the architecture from the existing checkpoint will be used instead.

In theory you could use a new dataset when training with -init_from, but you would have to make sure that it had the same vocabulary as the original dataset. To support that use case, we could change the preprocessing script to take an input vocabulary as an argument, allowing multiple datasets to be encoded with the same vocabulary.
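To make that idea concrete, here is a rough sketch of what a vocabulary-reuse option for the preprocessing script could look like. This is purely illustrative: no such flag exists today, the --input_vocab name is made up, and the only assumption is that the vocab JSON contains the idx_to_token mapping mentioned later in this thread.

```python
# Illustrative sketch only -- preprocess.py does not currently accept an
# existing vocabulary. The --input_vocab flag is hypothetical.
import argparse, json

parser = argparse.ArgumentParser()
parser.add_argument('--input_txt', required=True)
parser.add_argument('--input_vocab', default=None,
                    help='hypothetical: reuse the vocab JSON from a previous dataset')
args = parser.parse_args()

with open(args.input_txt) as f:
    text = f.read()

if args.input_vocab is not None:
    # Reuse the old token -> index mapping so the new dataset is encoded
    # exactly like the one the checkpoint was trained on.
    with open(args.input_vocab) as f:
        idx_to_token = json.load(f)['idx_to_token']
    token_to_idx = {tok: int(idx) for idx, tok in idx_to_token.items()}
    unseen = set(text) - set(token_to_idx)
    if unseen:
        raise ValueError('New text contains tokens missing from the old vocab: %r'
                         % sorted(unseen))
else:
    # Otherwise build a fresh vocabulary from the new text.
    token_to_idx = {tok: i + 1 for i, tok in enumerate(sorted(set(text)))}

# Encode the text with whichever vocabulary we ended up with.
encoded = [token_to_idx[tok] for tok in text]
```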

@AlekzNet

I noticed that "checkpoint_every" also does not change.

@drzax
drzax commented Mar 17, 2016

Just so I'm clear then, the current purpose of -init_from is maybe to re-commence a failed or aborted training session, but with all the same details?

@jcjohnson
Owner

Yes, that's correct; the learning rate and learning rate decay could be different, though.


@AlekzNet

Just noticed that the letters/digits may be reassigned (idx_to_token) after subsequent runs of preprocess.py (unless I'm missing/confusing something else). I don't have the previous version of the .json file (I'll save it next time), but now I'm getting lots of "don1t"s and "it1s" instead of "don't" and "it's" in the output. So it looks like "1" took the place of "'", "'" took the place of "!", etc.

Can the tokens be lexicographically ordered by preprocess.py, so that such issues can be avoided?
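For what it's worth, a minimal sketch of the kind of ordering being asked for here (not what preprocess.py currently does): sort the unique tokens before assigning indices, so the idx_to_token mapping depends only on which tokens occur, not on the order in which they first appear.

```python
# Sketch of lexicographically ordered vocabulary assignment (a suggestion,
# not the current preprocess.py behaviour). Indices are 1-based, matching
# Lua/Torch's 1-based indexing.
def build_vocab(text):
    tokens = sorted(set(text))  # deterministic, lexicographic order
    token_to_idx = {tok: i + 1 for i, tok in enumerate(tokens)}
    idx_to_token = {i + 1: tok for i, tok in enumerate(tokens)}
    return token_to_idx, idx_to_token

# Same character set, different order of first appearance -> identical mapping.
a, _ = build_vocab("don't panic!")
b, _ = build_vocab("panic! don't")
assert a == b
```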

@gwern
gwern commented Jun 15, 2016

I would appreciate it if the data generation were deterministic. When your RNN takes weeks to train and you decide you need to change something in the data, it'd be nice if you didn't have to start over.
