
Conversation

@jianyuh (Member) commented Jul 10, 2019

Summary:
In the same spirit of D16085552, we do the following in this diff:

  • Refactor the pack/unpack code for PackB: use a single `pack_unpack_` function for both the `pack` and `unpack` functions (a minimal sketch of the pattern follows this list).
  • Add a unit test.
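
For illustration, here is a minimal Python sketch of the shared-routine pattern, assuming a hypothetical column-major packed layout (FBGEMM's actual PackB code is C++ and uses a blocked layout, so this only shows the idea):

```python
def pack_unpack_(unpacked, packed, n_rows, n_cols, ispack):
    # A single routine owns the index mapping between the row-major
    # unpacked buffer and the packed buffer; `ispack` only selects the
    # copy direction, so the mapping cannot drift between pack and unpack.
    for r in range(n_rows):
        for c in range(n_cols):
            p = c * n_rows + r  # hypothetical column-major packed index
            if ispack:
                packed[p] = unpacked[r * n_cols + c]
            else:
                unpacked[r * n_cols + c] = packed[p]

def pack(unpacked, packed, n_rows, n_cols):
    pack_unpack_(unpacked, packed, n_rows, n_cols, ispack=True)

def unpack(unpacked, packed, n_rows, n_cols):
    pack_unpack_(unpacked, packed, n_rows, n_cols, ispack=False)

# Round-trip property of the kind the new unit test can assert:
src = list(range(6))
buf, out = [0] * 6, [0] * 6
pack(src, buf, 2, 3)
unpack(out, buf, 2, 3)
assert out == src
```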

Reviewed By: dskhudia

Differential Revision: D16160767

fbshipit-source-id: 800482b652c7010ff60506df7f2032ce7c8bc152
@facebook-github-bot (Contributor) commented:

This pull request has been merged in f080393.

q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
X-link: pytorch#3008

Pull Request resolved: facebookresearch/FBGEMM#103

This diff does quite a bit of facelifting to our [Marlin](https://github.com/IST-DASLab/marlin) BF16 x I4 kernels. These improvements include:

* Upgrades the kernel with the latest improvements from vLLM. This helps quite a bit with stability and fixes issues with group scaling.
* Adds template specializations so that the Marlin kernel supports both BF16 and FP16 using a single implementation.
* Fixes a BF16 dequantization issue.
* Exposes a simplified Torch custom op, `torch.ops.marlin.marlin_gemm`, and a convenient helper for quantizing to the Marlin format, `marlin_quantize` (see the usage sketch after this list).
* Adds these new ops to our quantize benchmarks.
* Adds new tests and a better directory structure.
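
As a rough usage sketch: the op and helper names come from this summary, but the shapes, argument order, and `group_size` parameter below are assumptions, not the verified API.

```python
import torch

# Assumes `marlin_quantize` is in scope (importable from FBGEMM's quantize
# utilities); the summary names the helper but not its module path.
w = torch.randn(4096, 4096, dtype=torch.bfloat16, device="cuda")  # weight
x = torch.randn(16, 4096, dtype=torch.bfloat16, device="cuda")    # activation

# Hypothetical signature: returns Marlin-packed int4 weights plus scales.
packed_w, scales = marlin_quantize(w, group_size=128)

# The custom op named above performs the BF16 x I4 GEMM against the
# packed weight; the exact argument order here is an assumption.
y = torch.ops.marlin.marlin_gemm(x, packed_w, scales)
```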

One downside of this work is that we have diverged a bit from vLLM, so it may be harder to stay in sync going forward. However, I think the benefits of the improvements in this diff outweigh the potential sync costs.

Reviewed By: jianyuh, jiawenliu64

Differential Revision: D61408771

fbshipit-source-id: 66b651ce794309a408f30244cac20a3c9ab0ce5a
