Randomly initialising word vectors #32

Closed
Henry-E opened this Issue May 2, 2017 · 3 comments

Henry-E commented May 2, 2017

There doesn't seem to be an option to initialise word vectors without using pretrained embeddings. There is an option to fill in vectors for tokens missing from the pretrained embeddings with normally distributed values, but it would be cool if there were a built-in option to initialise embeddings from a uniform distribution without having to specify a word embedding file at all.

jekbradbury (Collaborator) commented May 3, 2017

You'd do that in your model class's __init__ by using nn.init or some other weight initializer rather than passing the weight matrix from torchtext.
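
In practice that might look something like the sketch below. This is only illustrative: the `Classifier` class, embedding size, and uniform range are made up for the example and are not part of torchtext.

```python
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        # Plain embedding layer; no pretrained weight matrix is passed in.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Re-initialise the weights from a uniform distribution.
        nn.init.uniform_(self.embedding.weight, -0.1, 0.1)

    def forward(self, tokens):
        return self.embedding(tokens)
```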

Henry-E commented May 3, 2017

Thanks for the recommendation; I'm a bit new to this. Would the initialised vectors still go into TEXT.vocab.vectors?

jekbradbury (Collaborator) commented May 3, 2017

No, you don't need to put them there. The only place your embeddings actually need to live is in your model; TEXT.vocab.vectors is just a way to get pretrained vectors corresponding to your vocabulary and then use those to initialize your model's embeddings.
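
For contrast, a minimal sketch of that pretrained path, assuming a `TEXT` field whose vocab was built with vectors and the illustrative `Classifier` above:

```python
# Sketch only: `TEXT` is an illustrative torchtext Field whose vocab was
# built with pretrained vectors; `Classifier` is the toy model above.
model = Classifier(len(TEXT.vocab), TEXT.vocab.vectors.size(1))
# Overwrite the randomly initialised embedding weights with the
# pretrained vectors collected by the vocab.
model.embedding.weight.data.copy_(TEXT.vocab.vectors)
```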

Henry-E closed this May 3, 2017
