Gaooooosh
📧 yonggaoxiao@bupt.edu.cn
🎓 Ph.D. Student
Beijing University of Posts and Telecommunications
Research interests: efficient large language model architectures for long-context modeling, and stability in ultra-long contexts.
My current work studies sequence-length-wise hybridization of linear attention and sliding-window attention to reduce computational and memory costs, while mitigating context-length-induced degradation through training-free, parameter-preserving inference-time modifications.