Add helper ops to support cache conflict misses #2571

Summary:

This diff adds helper operators that enable cache conflict miss support in SSD TBE.

Changes include:

- Extend `get_unique_indices_cuda` to compute and return inverse linear indices (the tensor that contains the original positions of linear indices before sorting).
- Extend `lru_cache_find_uncached_cuda` to compute and return the inverse cache sets (the tensor that contains the original positions of cache sets of unique indices before sorting).
- Update the SSD backend to support cache conflict misses instead of failing. Rows that experience conflict misses are stored in a scratch pad for TBE kernels to consume, and are evicted to SSD once the backward+optimizer step of TBE is completed.
- Add `ssd_generate_row_addrs` for generating row addresses of data that is fetched from SSD (data can be in either a scratch pad or the LXU cache).

Reviewed By: q10

Differential Revision: D55926421
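To illustrate what "original positions of linear indices before sorting" means, here is a minimal pure-Python sketch. It is not the CUDA implementation from this diff: the function name `unique_with_inverse_positions` is hypothetical, and the real op's output layout, dtype, and padding are not stated in the summary, so treat the shapes below as assumptions.

```python
def unique_with_inverse_positions(linear_indices):
    """Sort indices, deduplicate them, and record where each sorted entry came from.

    Hypothetical helper for illustration only; it mirrors the idea described in
    the summary, not the actual `get_unique_indices_cuda` signature.
    """
    # Stable argsort: order[k] is the original position of the k-th sorted index.
    order = sorted(range(len(linear_indices)), key=lambda i: linear_indices[i])
    sorted_indices = [linear_indices[i] for i in order]

    # Collapse runs of equal values in the sorted list into unique indices.
    unique_indices = []
    for value in sorted_indices:
        if not unique_indices or unique_indices[-1] != value:
            unique_indices.append(value)

    return unique_indices, order


unique, inverse_positions = unique_with_inverse_positions([7, 3, 7, 1, 3])
print(unique)             # [1, 3, 7]
print(inverse_positions)  # [3, 1, 4, 0, 2]
```

With the inverse positions available, a consumer can map a result computed on the sorted order back to the slot each lookup originally occupied, which is presumably why the ops return them alongside the unique indices.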

Commits on May 10, 2024