v0.3.0
Perform "surgery" on existing models: take a pretrained model whose attention mechanism uses softmax_0 and "operate" on it to replace softmax_0 with softmax_n. Based on MosaicML's Composer.
Optionally, install the surgery extra via:
$ pip install flash-attention-softmax-n[surgery]
New Features:
- Functional API: add one line of code to your script, `flash_attention_n.surgery.apply_attention_softmax_n` (see the first sketch below).
- Object-oriented API for use with the MosaicML Composer trainer, `flash_attention_n.surgery.AttentionSoftmaxN` (see the Trainer sketch below).
- Use `flash_attention_n.surgery.surgery_functions.policy_registry` to register your own model (see the registry sketch below).
See the README for complete sample usage; minimal, illustrative sketches of each API follow.
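A minimal sketch of the functional API, using the module path listed above. The keyword arguments (`model`, `softmax_n_param`) and the example Hugging Face model are assumptions for illustration; check the README for the exact signature.

```python
from transformers import AutoModel

from flash_attention_n.surgery import apply_attention_softmax_n

# Load any pretrained model whose attention currently uses softmax_0.
model = AutoModel.from_pretrained('bert-base-uncased')

# One-line surgery: replace softmax_0 with softmax_n throughout the model.
# The argument names below are assumed for illustration.
apply_attention_softmax_n(model=model, softmax_n_param=1.0)
```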
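A sketch of the object-oriented API, passed to the MosaicML Composer `Trainer` as an algorithm. The constructor keyword and the placeholder model and dataloader are assumptions, not the library's documented contract.

```python
from composer import Trainer
from flash_attention_n.surgery import AttentionSoftmaxN

# `composer_model` and `train_dataloader` are placeholders for your own
# ComposerModel and DataLoader.
trainer = Trainer(
    model=composer_model,
    train_dataloader=train_dataloader,
    max_duration='1ep',
    # The algorithm applies the softmax_0 -> softmax_n surgery before training.
    algorithms=[AttentionSoftmaxN(softmax_n_param=1.0)],  # assumed kwarg
)
trainer.fit()
```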
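A sketch of registering a custom attention class with the policy registry, modeled on Composer-style module surgery. The decorator usage and the replacement-function signature shown here are assumptions rather than the documented interface.

```python
import torch

from flash_attention_n.surgery.surgery_functions import policy_registry


class MyAttention(torch.nn.Module):
    """Custom attention module that currently computes softmax_0."""
    ...


# Assumed pattern: map the attention class to a function that returns the
# module modified to use softmax_n.
@policy_registry.register(MyAttention)
def replace_my_attention(module: torch.nn.Module,
                         module_index: int,
                         softmax_n_param: float) -> torch.nn.Module:
    assert isinstance(module, MyAttention)
    module.softmax_n_param = softmax_n_param  # illustrative only
    return module
```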