Crazy idea about reusing pre-trained char-based RNNLM #747

Closed
coodoo opened this issue Apr 4, 2016 · 6 comments

coodoo commented Apr 4, 2016

This might be a crazy idea. Normally we only have a limited amount of labelled data for training a char-based RNN LM for sentiment analysis, so I'm wondering: would it be possible to first train an RNNLM on a large corpus such as the Penn Treebank or a billion words from Wikipedia, and then, starting from that model (possibly via its nn.LookupTable), continue training on our labelled data for the sentiment analysis task?

Has anyone done this before?

soumith (Member) commented Apr 4, 2016

Yes, using pretrained language model embeddings is a very standard practice.

coodoo (Author) commented Apr 4, 2016

Mind shedding some light on how to implement this with Torch and/or nn.LookupTable? :)

P.S. After a quick scan I didn't see any API to read weights out of, or load weights into, nn.LookupTable.

coodoo (Author) commented Apr 4, 2016

@soumith I hacked together a modified version of nn.LookupTable that takes in pre-trained weights here; mind having a look?

The goal is to later use that LookupTable in a model structure like the one below for sentiment analysis. I'd really love to hear your thoughts on it, thanks!

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
  (1): nn.LookupTable
  (2): nn.SplitTable
  (3): nn.Sequencer @ nn.LSTM(100 -> 100)
  (4): nn.SelectTable
  (5): nn.Linear(100 -> 7)
  (6): nn.LogSoftMax
}
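
For reference, a minimal sketch of how a model with this structure might be assembled, assuming the Element-Research rnn package for nn.Sequencer/nn.LSTM and a placeholder vocabulary size:

require 'nn'
require 'rnn'  -- assumed: Element-Research rnn package, which provides Sequencer and LSTM

local vocabSize = 10000  -- placeholder; should match the pre-trained LookupTable's vocabulary

local model = nn.Sequential()
model:add(nn.LookupTable(vocabSize, 100))   -- pre-trained weights get copied into this layer
model:add(nn.SplitTable(1))                 -- split along the time dimension (depends on your input layout)
model:add(nn.Sequencer(nn.LSTM(100, 100)))  -- apply the LSTM to every timestep
model:add(nn.SelectTable(-1))               -- keep only the last timestep's output
model:add(nn.Linear(100, 7))                -- 7 output classes, as in the structure above
model:add(nn.LogSoftMax())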

fmassa (Contributor) commented Apr 4, 2016

@coodoo one usually simply copies the weights from a pre-trained model. For example:

old_lookup = torch.load(...)           -- the pre-trained nn.LookupTable (or extract it from a loaded full model)
lookup = nn.LookupTable(...)           -- same vocabulary size and embedding dimension as the pre-trained one
lookup.weight:copy(old_lookup.weight)  -- overwrite the random initialization with the pre-trained embeddings
-- then add lookup to your network

Another way (if, for example, you only want to fine-tune the last layer) is to load the full network, remove the last layer, and add a new randomly-initialized layer on top of the network:

net = torch.load(...)        -- the full pre-trained network (an nn.Sequential)
net:remove()                 -- with no index given, removes the last layer
net:add(nn.Linear(4096,21))  -- add a new randomly-initialized layer replacing the old one

Your approach also seems to work, but it would require adding an extra constructor argument to every module whose weights one wants to reuse.
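
For illustration only, a hypothetical sketch of such a constructor-level option (nn.PretrainedLookup and its pretrainedWeight argument are made-up names, not an existing nn API):

require 'nn'

-- Hypothetical subclass of nn.LookupTable that accepts pre-trained weights at construction time.
local PretrainedLookup, parent = torch.class('nn.PretrainedLookup', 'nn.LookupTable')

function PretrainedLookup:__init(nIndex, embeddingSize, pretrainedWeight)
   parent.__init(self, nIndex, embeddingSize)
   if pretrainedWeight then
      -- overwrite the random initialization with the supplied embedding matrix
      self.weight:copy(pretrainedWeight)
   end
end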

coodoo (Author) commented Apr 4, 2016

@fmassa Thanks for the detailed instructions; those are obviously much better ways to do the job. Appreciate it! 😀

coodoo (Author) commented Apr 5, 2016

Verified that the approach works and uploaded an updated version here. @fmassa thanks again for your kind instructions, they were very helpful!
