
Support quant for meituan-longcat/LongCat-Flash-Lite #1388

Merged
Kaihui-intel merged 1 commit into main from kaihui/skip_ngram
Feb 4, 2026
Conversation

@Kaihui-intel
Contributor

@Kaihui-intel Kaihui-intel commented Feb 3, 2026

Description

#1380

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Copilot AI review requested due to automatic review settings February 3, 2026 09:33
Contributor

Copilot AI left a comment


Pull request overview

This PR adds support for quantization of the meituan-longcat/LongCat-Flash-Lite model by filtering out NgramEmbedding modules during the block search process.

Changes:

  • Modified the _search_block function to skip NgramEmbedding modules when identifying target modules for quantization, so that these embedding layers are left unquantized

@chensuyue chensuyue requested a review from xin3he February 4, 2026 02:16
@Kaihui-intel Kaihui-intel merged commit a12e3f5 into main Feb 4, 2026
28 checks passed
@Kaihui-intel Kaihui-intel deleted the kaihui/skip_ngram branch February 4, 2026 08:24
