Vocab Limited Pretrained Embedding [2/5] #1248

geof90 · 2020-02-13T21:02:48Z

Summary: In local bento experiments, the nearest neighbors of a word in the embedding space were often misspellings of that word. This isn't useful for spoken language, where misspellings rarely occur, so this diff adds a subclass of `PretrainedEmbeddings` that restricts the embedding space to known vocab words. In local experiments, the results are much more consistent with what is expected from kNN in the embedding space.

Reviewed By: geof90

Differential Revision: D19818803
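To illustrate the idea in the summary, here is a minimal, self-contained sketch of restricting an embedding table to a known vocabulary before running a kNN lookup. It is not the code from this diff and does not use the `PretrainedEmbeddings` API: the helper names (`restrict_to_vocab`, `nearest_neighbors`), the toy vectors, and the use of cosine similarity are all assumptions made for illustration.

```python
import math
from typing import Dict, Iterable, List

Vector = List[float]


def restrict_to_vocab(
    embeddings: Dict[str, Vector], vocab: Iterable[str]
) -> Dict[str, Vector]:
    """Keep only the entries whose word is in the known vocabulary (hypothetical helper)."""
    allowed = set(vocab)
    return {word: vec for word, vec in embeddings.items() if word in allowed}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_neighbors(
    embeddings: Dict[str, Vector], query: str, k: int = 2
) -> List[str]:
    """Return the k words most similar to `query`, excluding the query itself."""
    query_vec = embeddings[query]
    scored = [
        (cosine(query_vec, vec), word)
        for word, vec in embeddings.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:k]]


# Toy vectors, invented for illustration only.
toy_embeddings = {
    "weather": [1.0, 0.0],
    "wether": [0.99, 0.05],   # misspelling that sits very close to "weather"
    "waether": [0.98, 0.08],  # another misspelling
    "forecast": [0.9, 0.4],
    "music": [0.0, 1.0],
}
known_vocab = ["weather", "forecast", "music"]

# Unrestricted lookup: the misspellings crowd out real words.
full_neighbors = nearest_neighbors(toy_embeddings, "weather")

# Vocab-limited lookup: only known words can be returned.
limited = restrict_to_vocab(toy_embeddings, known_vocab)
limited_neighbors = nearest_neighbors(limited, "weather")
```

With these toy vectors, the unrestricted lookup for "weather" returns the two misspellings, while the vocab-limited lookup returns "forecast" and "music". Filtering once up front, rather than at query time, keeps the neighbor search itself unchanged.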

fbshipit-source-id: bfac18887990f7a816e30000f8fbbfad37788fd3

facebook-github-bot · 2020-02-13T21:03:13Z

This pull request was exported from Phabricator. Differential Revision: D19818803

facebook-github-bot · 2020-02-20T06:26:08Z

This pull request has been merged in f907783.

facebook-github-bot added the CLA Signed and fb-exported labels Feb 13, 2020

facebook-github-bot closed this in f907783 Feb 20, 2020

facebook-github-bot added the Merged label Feb 20, 2020
