Right now the development model is that people copy and paste configs from the autotuning output into their code. This is nice and predictable for a single device type, but it may not be ideal when one is using many different types of hardware.
We should create some sort of caching mechanism for autotuning that allows a team with a collection of Helion kernels to easily autotune all of them for a new set of hardware.
Perhaps we can reuse some of the Inductor caching infrastructure for this.
We should have a way to provide example inputs for autotuning.
A key invariant we should maintain is an ahead-of-time autotuning model: autotuning should happen before deployment, not after.
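
To make the proposal concrete, here is a rough sketch of what such a cache could look like. It is not an existing Helion or Inductor API: `AutotuneCache`, `tune_ahead_of_time`, `lookup`, the `kernel::device` key format, and the JSON file layout are all hypothetical names chosen for illustration, and `tune_fn` stands in for whatever actually runs the autotuner on the example inputs. The point is only the split between a tuning step that writes the cache and a deployment-time lookup that never tunes.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


class AutotuneCache:
    """Hypothetical on-disk cache mapping "kernel::device" keys to configs."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        self._entries: Dict[str, Dict[str, Any]] = {}
        if self.path.exists():
            # Load configs produced by an earlier ahead-of-time tuning run.
            self._entries = json.loads(self.path.read_text())

    @staticmethod
    def _key(kernel_name: str, device_type: str) -> str:
        # Assumed key format; a real key would likely also cover input
        # shapes/dtypes and a hash of the kernel source.
        return f"{kernel_name}::{device_type}"

    def tune_ahead_of_time(
        self,
        kernel_name: str,
        device_type: str,
        tune_fn: Callable[[], Dict[str, Any]],
    ) -> Dict[str, Any]:
        """Run before deployment: tune only if no config is cached yet."""
        key = self._key(kernel_name, device_type)
        if key not in self._entries:
            # tune_fn is a placeholder for the real autotuner invocation.
            self._entries[key] = tune_fn()
            self.path.write_text(
                json.dumps(self._entries, indent=2, sort_keys=True)
            )
        return self._entries[key]

    def lookup(self, kernel_name: str, device_type: str) -> Dict[str, Any]:
        """Run at deployment: never tunes, fails loudly on a missing entry."""
        key = self._key(kernel_name, device_type)
        try:
            return self._entries[key]
        except KeyError:
            raise LookupError(
                f"no cached config for {key!r}; "
                "run ahead-of-time autotuning for this hardware first"
            ) from None
```

With something like this, bringing up a new hardware type would mean running `tune_ahead_of_time` once per kernel on that hardware and shipping the resulting file. Because `lookup` raises instead of falling back to tuning, the ahead-of-time invariant is enforced rather than assumed.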