Use non-reference blocking parameters for fp32 weights in quantized inference kernel #1224

Closed
wants to merge 1 commit into from

Commits on Jul 27, 2022

  1. Use non-reference blocking parameters for fp32 weights in quantized inference kernel
    
    Summary: While this is technically a reference implementation, I'm sure it'll show up somewhere and this at least gives a ~2-3x improvement in perf for parameters I looked at.
    
    Differential Revision: D38186856
    
    fbshipit-source-id: 91c11c19d1320be18e1a4be0f138a7c00f92bf3f
    Andrew Tulloch authored and facebook-github-bot committed Jul 27, 2022
    Commit 1523425