LanDisen/FlashMorph


Morphing into Hybrid Attention Models

FlashMorph: Fast LAyer Selection for Hybrid MORPHing

Disen Lan1,2,*, Jianbin Zheng2, Yuxi Ren2, Xin Xia2, Xuanda Wang2, Xuefeng Xiao2, Xipeng Qiu1,†, Yu Cheng3,†

*Work done at ByteDance Seed. †Corresponding authors.

1Fudan University, 2ByteDance Seed, 3The Chinese University of Hong Kong

Method

FlashMorph is an effective, efficient, and scalable pipeline for converting pretrained Transformers into hybrid attention models. It performs optimization-based layer selection under a global hybrid configuration, with a fixed budget for the number of layers that remain full attention.

FlashMorph method overview
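
To make the idea of a fixed full-attention budget concrete, here is a minimal sketch. It assumes each layer already has a scalar importance score, however that score was obtained, and simply keeps the highest-scoring layers as full attention. The function name, the scores, and the top-k rule are all hypothetical illustrations; this is not FlashMorph's optimization procedure.

```python
from typing import List, Sequence


def select_full_attention_layers(scores: Sequence[float], budget: int) -> List[str]:
    """Keep the `budget` highest-scoring layers as full attention; mark the rest linear.

    Hypothetical helper for illustration only (not part of the FlashMorph codebase).
    """
    if not 0 <= budget <= len(scores):
        raise ValueError("budget must be between 0 and the number of layers")
    # Rank layer indices by score, highest first; ties go to the earlier layer.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    keep = set(ranked[:budget])
    return ["full" if i in keep else "linear" for i in range(len(scores))]


# Made-up importance scores for an 8-layer model (assumed values, not measured).
scores = [0.10, 0.80, 0.30, 0.90, 0.20, 0.70, 0.40, 0.05]
layout = select_full_attention_layers(scores, budget=2)
print(layout)
```

With these made-up scores and a budget of 2, layers 1 and 3 (zero-indexed) stay full attention and the other six are marked for conversion. The budget is a global constraint, so the number of full-attention layers is the same no matter how the scores are distributed.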

Results

Long-context Retrieval

FlashMorph achieves strong Needle-in-a-Haystack performance with only 20M layer-selection tokens.

NIAH performance table

Commonsense Reasoning and Recall-intensive Tasks

FlashMorph maintains commonsense reasoning ability and improves recall-intensive performance across different linear-attention backbones.

Zero-shot performance table

Inference Efficiency

The hybrid architecture improves long-context prefill and decode efficiency while using less GPU memory than the full-attention Transformer baseline.

Prefill and decode efficiency comparison
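
One reason a hybrid model can use less memory is that only full-attention layers keep a key-value cache that grows with sequence length. The back-of-the-envelope sketch below estimates that cache size. The model dimensions and layer counts are assumed for illustration and are not FlashMorph configurations or measured results; the fixed-size state of linear-attention layers is ignored.

```python
def kv_cache_bytes(num_full_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size for the full-attention layers of a model."""
    # The leading factor of 2 accounts for storing both keys and values.
    return (2 * num_full_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model: 8 KV heads of dimension 128, 128K-token context, fp16 cache.
seq_len = 131072
full = kv_cache_bytes(num_full_layers=32, num_kv_heads=8, head_dim=128, seq_len=seq_len)
hybrid = kv_cache_bytes(num_full_layers=8, num_kv_heads=8, head_dim=128, seq_len=seq_len)

print(f"all 32 layers full attention: {full / 2**30:.1f} GiB")   # 16.0 GiB
print(f"8 of 32 layers full attention: {hybrid / 2**30:.1f} GiB")  # 4.0 GiB
```

Under these assumptions the cache shrinks in proportion to the number of full-attention layers that remain. Actual savings depend on the backbone, the state size of the linear-attention layers, and the serving stack.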

Layer-selection Efficiency

FlashMorph substantially reduces layer-selection cost compared with prior methods.

Layer selection efficiency scaling

Citation

If you find this repo useful in your research or applications, please consider starring and citing our work:

@article{lan2026morphing,
  title={Morphing into Hybrid Attention Models},
  author={Lan, Disen and Zheng, Jianbin and Ren, Yuxi and Xia, Xin and Wang, Xuanda and Xiao, Xuefeng and Qiu, Xipeng and Cheng, Yu},
  journal={arXiv preprint arXiv:2606.30562},
  year={2026}
}

Contact

For questions or discussion, please contact Disen Lan at disenlan1002@gmail.com.
