Support a matrix of indexes to implement usual embedding layer. #25

Open
avostryakov opened this issue Sep 14, 2015 · 1 comment

Comments

@avostryakov

I tried to implement an embedding layer like the one in Lasagne (https://github.com/Lasagne/Lasagne/blob/master/lasagne/layers/embedding.py) and discovered that the following does not work:

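import numpy as np
import cgt

def main():
    # Integer matrix of word indexes
    input_var = cgt.matrix('input', dtype=np.int64)
    # Embedding table: 1000 words, 300 dimensions each
    w_glove = cgt.shared(np.zeros((1000, 300), dtype=np.float32))
    # Indexing with a matrix of indexes is the step that does not work
    output = w_glove[input_var]
    f = cgt.function([input_var], [output])

    input = np.ones((3, 3), dtype=np.int32)
    print(f(input))

if __name__ == '__main__':
    main()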

w_glove[input_var] should return a tensor3. In other words, it should replace sequences of word indexes with sequences of word vectors, which is a common NLP operation.

As a result, I have to replace output = w_glove[input_var] with:
output = w_glove[cgt.flatten(input_var)]
output = cgt.reshape(output, (3, 3, 300))

That is, I flatten the index matrix into a list of indexes and then reshape the result into a tensor3. I believe cgt could do this more efficiently.
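To make the requested behaviour concrete, here is a minimal NumPy sketch (not cgt code) showing that indexing an embedding table with a matrix of indexes gives the same tensor3 as the flatten-and-reshape workaround. The helper names embed_direct and embed_flat and the random test data are assumptions made for illustration only.

```python
import numpy as np

def embed_direct(weights, indexes):
    # NumPy integer-array indexing: result shape is indexes.shape + (dim,)
    return weights[indexes]

def embed_flat(weights, indexes):
    # Workaround: flatten the index matrix, look up rows, restore the shape
    rows = weights[indexes.reshape(-1)]
    return rows.reshape(indexes.shape + (weights.shape[1],))

rng = np.random.default_rng(0)
w = rng.standard_normal((1000, 300)).astype(np.float32)  # 1000 words, 300 dims
idx = rng.integers(0, 1000, size=(3, 3))                 # 3x3 matrix of word indexes

direct = embed_direct(w, idx)
flat = embed_flat(w, idx)
```

Both results have shape (3, 3, 300) and are element-wise equal. Deriving the output shape from indexes.shape avoids hard-coding (3, 3, 300); whether cgt offers an equivalent symbolic shape for this purpose is not confirmed here.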

@avostryakov

Moreover, I work on 64-bit Mac OS X. If I change the declaration above to:
input_var = cgt.matrix('input', dtype=np.int32)

I get the following error on this line: output = w_glove[cgt.flatten(input_var)]:

  File "/Users/vostryakov/Downloads/cgt-master/cgt/core.py", line 1899, in typ_apply
    assert input_types[1] == TensorType('i8', 1)
AssertionError

It looks like cgt.flatten(input_var) automatically casts int32 to int64 if the operating system is 64-bit.
