Skip to content
Switch branches/tags

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Sparse and structured attention mechanisms

Build Status PyPI version

Efficient implementation of structured sparsity inducing attention mechanisms: fusedmax, oscarmax and sparsemax.

Note: If you are just looking for sparsemax, I recommend the implementation in the entmax.

Currently available for pytorch >= 0.4.1. (For older versions, use a previous release of this package.) Requires python >= 2.7, cython, numpy, scipy.

Usage example:

In [1]: import torch
In [2]: import torchsparseattn
In [3]: a = torch.tensor([1, 2.1, 1.9], dtype=torch.double)
In [4]: lengths = torch.tensor([3])
In [5]: fusedmax = torchsparseattn.Fusedmax(alpha=.1)
In [6]: fusedmax(a, lengths)
Out[6]: tensor([0.0000, 0.5000, 0.5000], dtype=torch.float64)

For details, check out our paper:

Vlad Niculae and Mathieu Blondel A Regularized Framework for Sparse and Structured Neural Attention In: Proceedings of NIPS, 2017.

See also:

André F. T. Martins and Ramón Fernandez Astudillo From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification In: Proceedings of ICML, 2016

X. Zeng and M. Figueiredo, The ordered weighted L1 norm: Atomic formulation, dual norm, and projections. eprint