[optimization] help us know which kernels we should integrate in Diffusers #12990

@sayakpaul

This issue is for collecting input on which kernels we should integrate into Diffusers through the `kernels` library.

Currently, we use kernels for several attention backends (FA2, FA3, and SAGE). Other layers, such as RMS Norm, could be optimized as well, though whether that pays off depends on the model size and the input payload used for benchmarking.

I took a crack at this once by replacing the norm layers with their optimized counterparts, but didn't see any noticeable gains. That may be different now.
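For anyone who wants to repeat that kind of experiment, here is a minimal sketch of swapping norm layers and timing the result. Everything in it is an assumption for illustration: `ReferenceRMSNorm`, `CandidateRMSNorm`, `swap_norms`, `time_forward`, and the toy model are hypothetical names, not Diffusers or `kernels` APIs. The candidate layer reuses the reference math so the snippet runs on CPU; a real test would call an optimized kernel there.

```python
import time

import torch
from torch import nn


class ReferenceRMSNorm(nn.Module):
    """Plain PyTorch RMSNorm, standing in for a model's existing norm layer."""

    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        variance = x.pow(2).mean(dim=-1, keepdim=True)
        return x * torch.rsqrt(variance + self.eps) * self.weight


class CandidateRMSNorm(ReferenceRMSNorm):
    """Hypothetical placeholder for a kernel-backed layer.

    It inherits the reference forward() so the sketch runs anywhere;
    a real experiment would override forward() with the optimized kernel.
    """


def swap_norms(model, old_cls, new_cls):
    """Recursively replace each `old_cls` submodule with `new_cls`; return the count."""
    swapped = 0
    for name, child in model.named_children():
        if type(child) is old_cls:
            new = new_cls(child.weight.shape[0], eps=child.eps)
            new.load_state_dict(child.state_dict())  # keep the trained weights
            setattr(model, name, new)
            swapped += 1
        else:
            swapped += swap_norms(child, old_cls, new_cls)
    return swapped


@torch.no_grad()
def time_forward(model, x, iters=20):
    """Mean wall-clock seconds per forward pass.

    CPU timing only; on GPU, call torch.cuda.synchronize() before reading the clock.
    """
    model(x)  # warm-up
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    return (time.perf_counter() - start) / iters


torch.manual_seed(0)
dim = 64
model = nn.Sequential(
    nn.Linear(dim, dim),
    ReferenceRMSNorm(dim),
    nn.Linear(dim, dim),
    ReferenceRMSNorm(dim),
)
x = torch.randn(2, 16, dim)  # (batch, sequence, hidden); vary this to change the payload

with torch.no_grad():
    expected = model(x)
baseline_s = time_forward(model, x)

num_swapped = swap_norms(model, ReferenceRMSNorm, CandidateRMSNorm)

with torch.no_grad():
    actual = model(x)
candidate_s = time_forward(model, x)

print(f"swapped {num_swapped} layers")
print(f"baseline: {baseline_s * 1e3:.3f} ms, candidate: {candidate_s * 1e3:.3f} ms")
```

Checking that outputs match before and after the swap keeps the comparison honest, and sweeping the batch size, sequence length, and hidden size shows whether a gain only appears for certain payloads.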

Resources / notes


Labels: performance (anything related to performance improvements, profiling and benchmarking)
