Load pre-trained weights into LookupTable? #746

Closed
coodoo opened this issue Apr 3, 2016 · 6 comments

Comments

@coodoo

coodoo commented Apr 3, 2016

Just wondering, is it possible or practical to load pre-trained weights into nn.LookupTable and use them for text classification tasks like sentiment analysis? Thanks.

@MTSranger

Suppose you have tbl = nn.LookupTable(vocabSize, vectorDims); then tbl.weight is the matrix of weights, and tbl.weight[i][j] is the j-th dimension of the i-th weight vector. You can modify them directly.
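
For example, a minimal sketch of overwriting the random initialization with pre-trained vectors (the file name and sizes are illustrative, assuming the vectors were saved as a vocabSize x vectorDims tensor with torch.save):

require 'nn'

local vocabSize, vectorDims = 10000, 300
local pretrained = torch.load('vectors.t7')   -- vocabSize x vectorDims tensor

local tbl = nn.LookupTable(vocabSize, vectorDims)
tbl.weight:copy(pretrained)                   -- overwrite the random init in place

-- or row by row, e.g. from a hypothetical word -> vector map:
-- tbl.weight[wordIdx]:copy(word2vec[word])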

@coodoo

coodoo commented Apr 5, 2016

Thanks, ended up solving it with #747

@coodoo coodoo closed this as completed Apr 5, 2016
@octavian-ganea

What if the weight tensor is too big and I get a CUDA out-of-memory error:

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-2656/cutorch/lib/THC/generic/THCStorage.cu line=41 error=2 : out of memory

The issue is that this tensor has to be allocated twice: once at reading time and once at nn.LookupTable initialization.

@fmassa

fmassa commented Jun 10, 2016

@octavian-ganea just convert the weights to float at reading time, and then you can copy the float weights directly into the CUDA nn.LookupTable module.
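
A rough sketch of that approach (the file name is illustrative, and cunn/cutorch are assumed to be installed):

require 'nn'
require 'cunn'

local pretrained = torch.load('weights.t7'):float()   -- keep the CPU copy as float

local m = nn.LookupTable(pretrained:size(1), pretrained:size(2)):cuda()
m.weight:copy(pretrained)                              -- host -> device copy into the CUDA module

pretrained = nil
collectgarbage()                                       -- release the CPU copy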

@octavian-ganea

@fmassa: In my case I have a 4 million x 100 lookup table, so I would like to load pre-initialized weights from a file directly. Currently, I am doing the following:

m = nn.LookupTable(4000000, 100)   -- 4M x 100
m.weight = torch.load(t7_filename)

This is quite slow and occupies twice the memory of a 4M x 100 tensor, since the first line allocates a huge tensor in memory, and the second one loads another big tensor into memory and then copies it into m.weight. I would like to have only a single huge tensor allocated in memory at a time. Thanks.
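
One possible workaround, sketched under the assumption that nn.LookupTable's forward/backward only consult self.weight (and self.gradWeight when training), is to build a tiny placeholder module and point its weight at the loaded tensor instead of copying:

local pretrained = torch.load(t7_filename)   -- the single large allocation

local m = nn.LookupTable(1, 1)               -- placeholder with negligible memory
m.weight = pretrained                        -- adopt the loaded tensor, no copy

-- only needed if the embeddings are fine-tuned:
-- m.gradWeight = pretrained.new():resizeAs(pretrained):zero()

Whether this actually keeps peak memory down depends on the nn version, so treat it as a sketch rather than a guaranteed fix.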

@hashbangCoder

Hi,

As a related question, this doesn't seem to work:

b = torch.ones(9,5)
l = nn.LookupTable(10,5)
l.weight:narrow(1,2,9):set(b)  --unchanged weights 

Which I can kind of understand (the reference changes twice). My question is: is there any easy way of changing a subset of weights without a memcopy?
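
If I understand the semantics correctly, narrow() returns a view that shares storage with l.weight, while :set() re-points that temporary view at b's storage, so l.weight itself never changes. Copying into the view touches only the selected rows (a sketch with the same sizes as above):

b = torch.ones(9,5)
l = nn.LookupTable(10,5)
l.weight:narrow(1,2,9):copy(b)   -- writes into rows 2..10 of l.weight via the shared storage

The copy is limited to the sub-block, though; if the new values live in a separate tensor, some copy of that sub-block seems unavoidable.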
