Skip to content

v0.3.2

Choose a tag to compare

@christopher-w-murphy christopher-w-murphy released this 21 Nov 14:15
8eba4ad

There was a request on LinkedIn from Rami to give XLNet the "one-line treatment."

What that means is someone would be able to convert their XLNet model from softmax_0 to softmax_n by adding just one line of code to their script (or two lines if you count the import statement). For example,

import transformers

from flash_attention_softmax_n.surgery import apply_attention_softmax_n


model = transformers.AutoModel.from_pretrained('xlnet-base-cased')
apply_attention_softmax_n(model=model, softmax_n_param=1.)
...
On the backend, the _xlnet subpackage contains the modified rel_attn_core method code. It also adds the XLNetRelativeAttention class to the policy registry such that apply_attention_softmax_n and AttentionSoftmaxN know to operate on it.

As far as testing, the unit tests in tests/cpu/surgery/test_xlnet.py all pass. The tests in the cpu subpackage will run on a cpu or gpu. (The gpu subpackage tests are gpu-only.)