PyTorch embedding bag module and factorization machine models for multi-value fields with weights per value. In contrast to torch.nn.EmbeddingBag, the implementations are fully vectorized for mini-batches of data. For example, imagine a dataset of movies where the "genres" column may contain a list of genres with corresponding weights representing a measure of confidence in the movie belonging to each genre.
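For instance, a movie tagged as 90% Comedy and 30% Drama could be encoded as a pair of index and weight lists; the genre-to-index mapping below is purely hypothetical:

genre_vocabulary = {"Action": 0, "Comedy": 1, "Drama": 2}  # illustrative vocabulary
genre_indices = [1, 2]      # Comedy, Drama
genre_weights = [0.9, 0.3]  # confidence per genre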
The basic component of this package is the WeightedEmbeddingBag class, which is similar to the PyTorch torch.nn.EmbeddingBag class, but supports per-value weights and bag aggregation in mini-batches. It receives three parameters: indices, offsets, and weights, depicted below.
The indices array selects embedding vectors, the weights array multiplies each selected embedding by its corresponding weight, and the offsets array defines the end-point of each bag. In the example depicted above, the selected embedding vectors are scaled by their weights and summed within each bag.
Since offsets point to the end-point of each bag, mini-batches may contain a variable number of embeddings per sample; we only have to make sure that the offsets array specifies the last bag to end before the padding of each sample.
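To make these semantics concrete, here is a naive, loop-based sketch of the computation (assuming sum aggregation within bags and inclusive end-point offsets, as in the example below); the actual module vectorizes this over the whole mini-batch:

import torch

def naive_weighted_bags(emb, indices, offsets, weights):
    # emb: (num_embeddings, dim), indices/weights: (batch, n), offsets: (batch, num_bags)
    batch, num_bags = offsets.shape
    out = torch.zeros(batch, num_bags, emb.shape[1])
    for b in range(batch):
        start = 0
        for j in range(num_bags):
            end = offsets[b, j].item()  # offsets mark the END of each bag
            for k in range(start, end + 1):
                out[b, j] += weights[b, k] * emb[indices[b, k]]
            start = end + 1
        # entries after the last offset (i.e. padding) are ignored
    return out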
On top of this, we have implemented the classical factorization machine model in the WeightedFM class, and a fully-vectorized version of a field-aware factorization machine in the WeightedFFM class.
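As a reminder, the second-order term of a factorization machine can be computed in linear time using the identity 0.5 * ((sum_i v_i)^2 - sum_i v_i^2), summed over the embedding dimension. The following is an illustrative sketch of that computation for a batch of already-weighted embedding vectors; it is not necessarily how WeightedFM is implemented internally:

import torch

def fm_pairwise(v):
    # v: (batch, n, dim) weighted embedding vectors
    sum_sq = v.sum(dim=1) ** 2      # (sum_i v_i)^2, shape (batch, dim)
    sq_sum = (v ** 2).sum(dim=1)    # sum_i v_i^2, shape (batch, dim)
    return 0.5 * (sum_sq - sq_sum).sum(dim=1)  # one scalar per sample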
Example:
We reproduce the example visualized in the previous section as one of the samples in a mini-batch. The following creates a weighted embedding bag with a 6x4 learnable matrix (the embeddings), where 4 is the embedding dimension:
emb_bag = WeightedEmbeddingBag(6, 4)
Then we create the forward parameters: indices, weights, and offsets. indices has dimensions 3x4, where the 3 indicates that we are working with a mini-batch of three samples. In the first sample we make the same selection as visualized in the previous section:
import torch

indices = torch.IntTensor([[0, 4, 2, 5], [1, 1, 4, 5], [1, 2, 3, 5]])
weights = torch.Tensor([[0.2, 0.1, 0.9, 0.8], [0.5, 0.2, 0.3, 0.3], [2., -2., 0.3, 0.6]])
offsets = torch.IntTensor([[0, 1, 3], [0, 2, 3], [1, 2, 3]])
Finally, we can pass them as the forward parameters:
output = emb_bag(indices, offsets, weights)
We now provide a more detailed explanation of the previous example.
The embedding parameters can be inspected by printing the weight of the embeddings:
>>> emb_bag.emb.weight
Parameter containing:
tensor([[-0.4172, -0.1485, 0.9262, -0.4852],
[ 0.1526, -0.5395, 0.0916, -1.2741],
[-0.3022, -0.0705, -0.1614, -0.2234],
[ 0.8110, -0.2389, 0.7278, -0.1804],
[ 1.3034, 1.2308, -0.3032, 1.7542],
[ 0.2870, -3.3745, 1.4647, -0.4888]], requires_grad=True)
We can print the forwarded parameters to see their dimensions more clearly:
>>> print("indices: \n", indices, indices.shape)
... print("weights: \n", weights, weights.shape)
... print("offsets: \n", offsets, offsets.shape)
indices:
tensor([[0, 4, 2, 5],
[1, 1, 4, 5],
[1, 2, 3, 5]], dtype=torch.int32) torch.Size([3, 4])
weights:
tensor([[ 0.2000, 0.1000, 0.9000, 0.8000],
[ 0.5000, 0.2000, 0.3000, 0.3000],
[ 2.0000, -2.0000, 0.3000, 0.6000]]) torch.Size([3, 4])
offsets:
tensor([[0, 1, 3],
[0, 2, 3],
[1, 2, 3]], dtype=torch.int32) torch.Size([3, 3])
Recall that indices makes selections from the embedding matrix (i.e., weight). Let us take a look at the output; you can verify that the first vector matches the visualization above:
>>> print("actual: \n", actual, actual.shape)
actual:
tensor([[[-0.0834, -0.0297, 0.1852, -0.0970],
[ 0.1303, 0.1231, -0.0303, 0.1754],
[-0.0424, -2.7630, 1.0265, -0.5921]],
[[ 0.0763, -0.2697, 0.0458, -0.6370],
[ 0.4215, 0.2613, -0.0726, 0.2714],
[ 0.0861, -1.0124, 0.4394, -0.1466]],
[[ 0.9096, -0.9380, 0.5059, -2.1014],
[ 0.2433, -0.0717, 0.2183, -0.0541],
[ 0.1722, -2.0247, 0.8788, -0.2933]]], grad_fn=<SubBackward0>) torch.Size([3, 3, 4])
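For example, the first bag of the first sample selects only embedding 0 with weight 0.2, so its output should equal 0.2 times the first row of the embedding matrix:

>>> torch.allclose(output[0, 0], 0.2 * emb_bag.emb.weight[0])
True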
TODO - we will fill it when we publish the package to PyPI