Using Theano in Word Embedding Method #19

Closed
lucas0 opened this issue Nov 5, 2015 · 4 comments


lucas0 commented Nov 5, 2015

Hello,

I'm trying to use your pre-processing methods to feed my LSTM RNN, but I cannot get the code for creating the word embedding matrix to work:

import theano, numpy
from theano import tensor as T

# nv :: size of our vocabulary
# de :: dimension of the embedding space
# cs :: context window size

nv, de, cs = 1000, 50, 5
embeddings = theano.shared(0.2 * numpy.random.uniform(-1.0, 1.0, \
(nv+1, de)).astype(theano.config.floatX)) # add one for PADDING at the end
idxs = T.imatrix() # as many columns as words in the context window and as many lines as words in the sentence
x = self.emb[idxs].reshape((idxs.shape[0], de*cs))

I can do the context window, but the code provided for the Word Embedding generation gives me this error:

NameError: name 'self' is not defined

I've tried putting the code in a separate .py file, but that did not help. I'm sure there is a simple solution, but I can't figure it out. Could you please give me a hand with this?


mesnilgr commented Nov 5, 2015

mesnilgr closed this as completed Nov 5, 2015

lucas0 commented Nov 5, 2015

Seems like

x = self.emb[idxs].reshape((idxs.shape[0], de*cs))

should be:

x = embeddings[idxs].reshape((idxs.shape[0], de*cs))

Am I right? @mesnilgr
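
A minimal sketch of why the original line fails and the renamed one runs. It uses a plain NumPy array in place of the `theano.shared` variable so it runs without Theano installed (an assumption made for illustration), and the `Model` class is hypothetical, standing in for whatever class defines `self.emb` in the tutorial code:

```python
import numpy

nv, de, cs = 1000, 50, 5

# Module-level variable, as in the snippet above. A NumPy array stands in
# for the theano.shared variable (assumption for illustration).
embeddings = 0.2 * numpy.random.uniform(-1.0, 1.0, (nv + 1, de))

# Three words, each with a context window of cs word indices.
idxs = numpy.zeros((3, cs), dtype="int32")

# Outside a class there is no `self`, so this line raises NameError.
try:
    x = self.emb[idxs].reshape((idxs.shape[0], de * cs))
    error_name = None
except NameError as err:
    error_name = type(err).__name__

# Referring to the module-level variable directly works.
x = embeddings[idxs].reshape((idxs.shape[0], de * cs))


class Model(object):  # hypothetical class, for illustration only
    def __init__(self):
        # Inside a class, the matrix can be stored as an attribute...
        self.emb = embeddings

    def lookup(self, idxs):
        # ...and `self.emb` is then a valid name inside its methods.
        return self.emb[idxs].reshape((idxs.shape[0], de * cs))
```

So the choice is between using the variable name that was actually defined (`embeddings`) at module level, or wrapping the code in a class so that `self.emb` exists.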

Helicqin commented

@lucas0 I ran into the same problem. I want to run this code:

import theano, numpy
from theano import tensor as T

# nv :: size of our vocabulary
# de :: dimension of the embedding space
# cs :: context window size

nv, de, cs = 1000, 50, 5
emb = theano.shared(0.2 * numpy.random.uniform(-1.0, 1.0, \
(nv+1, de)).astype(theano.config.floatX)) # add one for PADDING at the end
idxs = T.imatrix() # as many columns as words in the context window and as many lines as words in the sentence
x = self.emb[idxs].reshape((idxs.shape[0], de*cs))

but an error occurs at emb[idxs]. I do not understand why emb[idxs] would be correct. As I understand it, emb is a matrix and idxs is also a matrix, so how can one be used to index the other?

I would appreciate it if you could give me some tips. Thanks.
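
For reference, a small sketch of what indexing a matrix with an integer matrix does. It uses NumPy rather than Theano, on the assumption (to my understanding, not confirmed in this thread) that Theano follows the same integer-array indexing rules here. The example sentence and its word indices are made up for illustration:

```python
import numpy

nv, de, cs = 1000, 50, 5

# One embedding row per word, plus one extra row (index nv) for PADDING.
emb = 0.2 * numpy.random.uniform(-1.0, 1.0, (nv + 1, de))

# A made-up sentence of 4 words. Row i holds the cs word indices in the
# context window around word i; nv marks padding at the sentence edges.
idxs = numpy.array([
    [nv, nv, 3, 7, 1],
    [nv, 3, 7, 1, 9],
    [3, 7, 1, 9, nv],
    [7, 1, 9, nv, nv],
], dtype="int32")

# Each integer in idxs selects one whole row of emb, so the result gains
# a trailing axis of length de: shape (4, cs, de).
picked = emb[idxs]

# Flatten each window into one vector: shape (4, de * cs).
x = picked.reshape((idxs.shape[0], de * cs))
```

In other words, `idxs` is not multiplied with `emb`; every entry of `idxs` is a row number, and row `i` of `x` is the concatenation of the `cs` embedding vectors in word `i`'s window.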


lucas0 commented May 12, 2017 via email
