Cache System hangs when configured with fewer ways than threads #165

Open
wobanator opened this issue May 30, 2018 · 2 comments

wobanator commented May 30, 2018

Imagine the following scenario: the L1 data cache is configured with fewer ways than threads_per_core. All threads execute a data load whose address maps to the same cache set, but with different tags. The requested lines are all present in the L2 cache.

The L1 cache system will get into a deadlock. The first thread generates a miss, gets rolled back, and requests a fill from the L2 cache. The corresponding response is written into the L1 cache a few cycles later. The same happens for the remaining threads. So when the first thread is scheduled again, its load misses once more, because the correct cache line has already been replaced before it could be read even once.
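To illustrate, here is a rough behavioral sketch in Python (not the actual RTL; the single-way set, tag names, and fill latency below are made-up parameters): with one way and four round-robin threads, every fill is evicted by a later thread's fill before its own thread is scheduled again, so no load ever completes.

```python
WAYS = 1          # ways in the contested set (fewer than the thread count)
THREADS = 4       # hardware threads, each loading a different tag
FILL_LATENCY = 3  # cycles from the miss to the L2 response being written

def simulate(cycles=40):
    resident = [None] * WAYS               # tags currently in the set
    pending = []                           # (ready_cycle, tag) fills in flight
    wanted = [f"tag{t}" for t in range(THREADS)]
    hits = 0

    for cycle in range(cycles):
        # Write back any fill whose response has arrived, evicting the single way
        for fill in [f for f in pending if f[0] <= cycle]:
            resident[0] = fill[1]
            pending.remove(fill)

        thread = cycle % THREADS           # round-robin thread selection
        tag = wanted[thread]
        if tag in resident:
            hits += 1                      # load would retire
        elif all(t != tag for _, t in pending):
            # Miss: thread rolls back and requests a fill from L2
            pending.append((cycle + FILL_LATENCY, tag))

    return hits

print(simulate())  # prints 0: every fill is evicted before its thread runs again
```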

Any idea how to fix this problem? Sure, another replacement strategy would fix this specific case, but I think it is a more general problem.

Owner

jbush001 commented May 31, 2018

The easiest fix is to make sure there aren't more threads than cache ways. :)

But, yes, you are correct about the deadlock potential in this configuration. It's fairly easy to reproduce with the existing design by setting the number of cache ways to 1 (effectively making the cache direct mapped). The lockup is generally unlikely, but when it does happen, it's catastrophic, so the processor needs to handle this properly. I talk about the problem in the section on 'livelock' here:

https://jbush001.github.io/2014/07/04/messy-details.html

I discuss a few possible solutions in that post.

Another, simpler (albeit hackier) fix would be to add logic that detects the lockup and resolves it. Since the lockup is infrequent, this should have negligible performance impact. For example, the writeback stage could keep a counter of the number of rollbacks from the memory pipeline. When a thread successfully retires an instruction, the counter would be reset to zero. If the counter reaches some threshold, the core would temporarily allow only one thread to issue instructions. I know some hardware designs have 'starvation counters' to handle degenerate cases with dynamic scheduling, which is not exactly the same thing, but similar.
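Something along these lines, as a behavioral sketch only (the class, signal names, and threshold here are assumptions for illustration, not existing signals in the writeback stage):

```python
ROLLBACK_THRESHOLD = 64   # arbitrary; would be tuned for the real pipeline depth

class StarvationMonitor:
    """Counts consecutive memory-pipeline rollbacks; past a threshold,
    restricts issue to a single thread until an instruction retires."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.rollback_count = 0
        self.single_thread_mode = False

    def update(self, memory_rollback, instruction_retired):
        # Called once per cycle with the writeback stage's view of events
        if instruction_retired:
            # Forward progress: clear the counter and re-enable all threads
            self.rollback_count = 0
            self.single_thread_mode = False
        elif memory_rollback:
            self.rollback_count += 1
            if self.rollback_count >= ROLLBACK_THRESHOLD:
                self.single_thread_mode = True

    def issue_allowed(self, ready_threads):
        # In degraded mode only the lowest-numbered ready thread may issue,
        # so its fill can stay resident long enough to be read
        if self.single_thread_mode and ready_threads:
            return {min(ready_threads)}
        return set(ready_threads)

# Example: after 64 consecutive rollbacks, only thread 0 may issue
monitor = StarvationMonitor(num_threads=4)
for _ in range(ROLLBACK_THRESHOLD):
    monitor.update(memory_rollback=True, instruction_retired=False)
print(monitor.issue_allowed({0, 1, 2, 3}))   # {0}
```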

An even simpler approach would be to make thread scheduling random instead of round-robin. Assuming a decent PRNG, there would be a chance each cycle that the next scheduled thread is the one whose line was most recently filled, which would break the livelock.
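Roughly like this (again only a behavioral sketch with assumed names; in hardware the PRNG would be a small LFSR feeding the thread select arbiter rather than a software random number generator):

```python
import random

THREADS = 4
rng = random.Random(0xACE1)   # stand-in for a small hardware LFSR

def select_thread(ready_threads):
    """Pick the issuing thread pseudo-randomly from the ready set,
    instead of advancing a round-robin pointer."""
    ready = sorted(ready_threads)
    return rng.choice(ready) if ready else None

# With all four threads ready, the thread whose line was just filled has a
# 1-in-4 chance of issuing before another thread's fill evicts it, so the
# livelock is eventually broken.
print(select_thread({0, 1, 2, 3}))
```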

jbush001 changed the title from "Cache System hangs" to "Cache System hangs when configured with fewer ways than threads" on Oct 5, 2018
Owner

jbush001 commented Dec 9, 2018

But the constraint that cache_ways >= threads doesn't necessarily seem unreasonable. It would only be necessary to implement the solutions above if that constraint ruled out an otherwise valid and desirable configuration. Given that the thread count is generally small, that doesn't currently seem to be the case.
