Feature request: Weighted average for EmbeddingBag #4068

Closed
kunaldahiya opened this issue Dec 7, 2017 · 9 comments
Assignees
Labels
high priority module: nn Related to torch.nn triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@kunaldahiya

Right now `torch.nn.EmbeddingBag` supports only `sum` and `mean`. What do you think about providing an option for weights to compute a weighted average? This would be more memory efficient than the current alternative.

For instance, something like `sp_weights` in `tf.nn.embedding_lookup_sparse` [1].

References:
[1] https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup_sparse
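The "current alternative" the request alludes to can be sketched in plain Python (a hypothetical `weighted_average_lookup` helper over a plain list-of-lists table, not the real PyTorch kernel). The point is that the per-index rows must be materialized as an intermediate of shape (num_indices, dim) before reducing, which a fused weighted `EmbeddingBag` would avoid:

```python
# Workaround sketch: emulate a weighted average by materializing every
# looked-up row first, then scaling and reducing. A fused EmbeddingBag
# would avoid allocating the (num_indices x dim) intermediate.
def weighted_average_lookup(table, indices, weights):
    gathered = [table[i] for i in indices]                  # intermediate rows
    scaled = [[w * x for x in row] for w, row in zip(weights, gathered)]
    total = sum(weights)
    return [sum(col) / total for col in zip(*scaled)]       # weighted mean

table = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
out = weighted_average_lookup(table, [0, 2], [1.0, 3.0])
# out == [(1*1 + 3*3)/4, (1*0 + 3*3)/4] == [2.5, 2.25]
```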

@glample
Contributor

glample commented Mar 13, 2019

Hi, any updates on this? Being able to provide weights (on top of the indices) would be really useful.

@zou3519 zou3519 self-assigned this Mar 14, 2019
@snakers4

snakers4 commented Apr 2, 2019

This would be very helpful for my work with the Russian language.
I understand that I can use some kind of attention + a small char-level CNN, but I suspect my own PyTorch implementation would be about 10x slower.
Many thanks!

@snakers4

snakers4 commented Apr 2, 2019

As some kind of motivation, I will just link my post, where EmbeddingBags were superior to BPE in many applications for Russian =)

@zou3519
Contributor

zou3519 commented Apr 2, 2019

API Bikeshedding: which of these two APIs would be better?

  1. nn.EmbeddingBag's forward pass accepts a per_input_weights argument. When mode='sum', this does a weighted sum; when mode='mean', a weighted mean; when mode='max', a weighted max, like the TF API.

  2. nn.EmbeddingBag's forward pass accepts a per_input_weights argument and a new mode='weighted_sum'. mode='weighted_sum' scales the output of the embedding according to the weights. No weighted mean / weighted max are implemented.

I'm leaning towards (2) because I haven't been able to find use cases for "weighted mean" (which can be emulated via a weighted sum) or "weighted max".
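The claim that a weighted mean can be emulated via a weighted sum is easy to check: normalize the weights so they sum to 1 and the two reductions coincide. A minimal sketch (hypothetical helper names, plain Python scalars rather than embedding vectors):

```python
# A "weighted mean" is just a weighted sum with the weights normalized
# to sum to 1, which is why a separate mode adds little expressive power.
def weighted_sum(values, weights):
    return sum(v * w for v, w in zip(values, weights))

def weighted_mean(values, weights):
    total = sum(weights)
    return weighted_sum(values, [w / total for w in weights])

vals, w = [1.0, 2.0, 4.0], [1.0, 1.0, 2.0]
# Emulating the mean through the sum gives the identical result.
assert weighted_mean(vals, w) == weighted_sum(vals, [x / sum(w) for x in w])
```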

@ezyang
Contributor

ezyang commented Apr 2, 2019

FWIW, you don't actually have to implement weighted mean and weighted max if you implement (1); you can just make them raise errors. (This is not necessarily in favor of (1), but it's a comment on the reasoning.)

@snakers4

snakers4 commented Apr 2, 2019

> nn.EmbeddingBag's forward pass accepts a per_input_weights argument

Most likely these weights will be calculated using some sort of attention mechanism. The simplest attention mechanism would be something like a linear layer + softmax.

I wonder whether something like this could be implemented inside of this layer.

zou3519 added a commit to zou3519/pytorch that referenced this issue Apr 2, 2019

EmbeddingBag CPU forward with per_sample_weights.

On the way to pytorch#4068.

Adds a new per_sample_weights argument to nn.EmbeddingBag's forward pass
and embedding_bag. This is only supported for mode='sum' and is
interpreted as scaling the output of the embedding before applying the
reduction.

i.e.,
```
indices: 0, 3, 7 ; 1, 2
per_sample_weights: 0.1, 0.2, 0.4 ; 0.7, -0.8
offsets: 0, 3
weights (embeddings): e_0, e_1, e_2, ..., e_7
```

returns 2 vectors:
```
0.1 * e_0 + 0.2 * e_3 + 0.4 * e_7
0.7 * e_1 - 0.8 * e_2
```

Future:
- CPU backward
- CUDA forward
- CUDA backward
- CPU differentiable per_sample_weights
- CUDA differentiable per_sample_weights

Test Plan:
- New tests
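The reduction described in the commit message can be sketched in plain Python (the real implementation is a fused C++/CUDA kernel; `embedding_bag_sum` here is a hypothetical illustrative helper). With one-hot rows standing in for the embeddings e_0..e_7, the two example bags reproduce exactly the vectors above:

```python
# Sketch of the per_sample_weights reduction for mode='sum':
# one weighted-sum output vector per bag, bags delimited by offsets.
def embedding_bag_sum(weights, indices, offsets, per_sample_weights):
    dim = len(weights[0])
    bounds = list(offsets) + [len(indices)]   # bag i spans bounds[i]:bounds[i+1]
    out = []
    for start, end in zip(bounds, bounds[1:]):
        acc = [0.0] * dim
        for pos in range(start, end):
            w = per_sample_weights[pos]       # scale the looked-up row ...
            row = weights[indices[pos]]
            acc = [a + w * x for a, x in zip(acc, row)]   # ... then reduce
        out.append(acc)
    return out

# The commit-message example, with e_i = one-hot rows for readability.
E = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
result = embedding_bag_sum(E, [0, 3, 7, 1, 2], [0, 3], [0.1, 0.2, 0.4, 0.7, -0.8])
# result[0] == 0.1*e_0 + 0.2*e_3 + 0.4*e_7
# result[1] == 0.7*e_1 - 0.8*e_2
```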
@ezyang ezyang added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module module: nn Related to torch.nn high priority labels Apr 2, 2019
@zou3519
Contributor

zou3519 commented Apr 10, 2019

Added the feature in #18957.

@snakers4

snakers4 commented Apr 11, 2019

Many thanks!
We will try adding this to our next project!

@zou3519 zou3519 closed this as completed Apr 11, 2019
@drevicko

Is there currently a plan to implement per_sample_weights on CUDA for max aggregation?


6 participants