[XLA:GPU] Experimental: Add --xla_gpu_per_fusion_autotune_cache_dir option
If the option is set, we maintain (read/write) a per-fusion autotune cache in the given directory. The directory must already exist.

Cache invalidation has to be handled by the user (e.g. use an empty directory if you want to start with an empty cache).

XLA version checks must also be done by the user (e.g. if you want to cache fusions created with different versions of XLA, use different directories). If the calling library already has a version handling mechanism, like JAX, it should not be difficult to create separate directories based on that version (and on any other parameters that matter to it).

Default: no file-based cache.

There is minimal support for multiple processes using the same cache: the rename trick is used to avoid multiple processes writing the same file at the same time, or reading incomplete files. We use SHA256 hashes in the filenames and assume that no collisions occur.

This is a simple implementation intended to let people test it and find good use cases. If needed, we can refine it later.

Considered use case: people running [multiple] [similar] models [through JAX]. For example, there are 2 similar HLOs that we want to run with JAX (using the same "XLA binary"), and it would be nice to reuse the autotune results from the first run for any kernels that appear in both. Similarly, consider a researcher sitting at a Colab session and making small changes to their model: they should mostly get cache hits.

Limitations:
- It is not recommended to change the cache directory during the run of a process, because the in-memory and the file-based cache can then become inconsistent. At the very least, clear the in-memory cache if you change it.
- When loading results with LoadAutotuneResults[FromFile], they are not written into the cache directory.

PiperOrigin-RevId: 644406688
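A new cache directory is typically pointed at through the usual XLA flag mechanism, e.g. the XLA_FLAGS environment variable (--xla_gpu_per_fusion_autotune_cache_dir=/some/existing/dir). The following is a minimal, self-contained sketch of the "rename trick" described above, not the actual XLA implementation: each entry is written to a process-unique temporary file and then renamed into place, so concurrent processes never observe a partially written file. WriteCacheEntry/ReadCacheEntry, the .textproto suffix, and the file layout are illustrative assumptions; the key is assumed to already be a SHA256 hex digest of the fusion.

// Sketch only: atomic-ish file cache entry write/read via temp file + rename.
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>
#include <unistd.h>  // getpid (POSIX)

namespace fs = std::filesystem;

// Writes `serialized_result` under `cache_dir/<sha256_key>.textproto`.
// Returns false if the write could not be completed.
bool WriteCacheEntry(const fs::path& cache_dir, const std::string& sha256_key,
                     const std::string& serialized_result) {
  const fs::path final_path = cache_dir / (sha256_key + ".textproto");
  // Temp name unique to this process, so two writers never touch the same file.
  const fs::path tmp_path =
      cache_dir / (sha256_key + ".tmp." + std::to_string(::getpid()));
  {
    std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << serialized_result;
    if (!out.good()) return false;
  }
  // rename() is atomic within a filesystem, so readers see either no file or
  // the complete new file -- never a partial write.
  std::error_code ec;
  fs::rename(tmp_path, final_path, ec);
  return !ec;
}

// Readers only look for the final filename; an absent file is a cache miss.
bool ReadCacheEntry(const fs::path& cache_dir, const std::string& sha256_key,
                    std::string* serialized_result) {
  std::ifstream in(cache_dir / (sha256_key + ".textproto"), std::ios::binary);
  if (!in) return false;
  serialized_result->assign(std::istreambuf_iterator<char>(in),
                            std::istreambuf_iterator<char>());
  return true;
}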