Cache and BP Disable Possible Way #85
Hello. There is no simple way to disable the I-cache, D-cache, or branch predictor; manual modification is required. Simply disabling the I-cache and D-cache would greatly slow RSD down, since every access would then reach main memory. With some manual modification, it is possible to replace the caches with a simple memory accessible in one cycle. Disabling branch prediction is more complicated, because it is not obvious what "disabling" it means. It is relatively easy to rewrite the predictor so that it always predicts not-taken. It is harder to modify the fetcher to halt instruction fetching at every branch instruction until the branch resolves, and doing so causes significant performance degradation. If you can tell me what your goal is, I may be able to suggest a better alternative.
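To make the two interpretations of "disabling branch prediction" concrete, here is a minimal toy cost model (not RSD's actual logic; the instruction counts and penalty values below are illustrative assumptions, not parameters taken from RSD):

```python
# Toy cycle model for two "no branch prediction" interpretations.
# All parameters are hypothetical and chosen only for illustration.

def cycles_always_untaken(n_insts, n_branches, n_taken, flush_penalty=3):
    # Predicting "not-taken" is wrong only when the branch is actually
    # taken, so each taken branch costs one pipeline flush.
    # (n_branches is unused here; kept for a symmetric signature.)
    return n_insts + n_taken * flush_penalty

def cycles_stall_on_branch(n_insts, n_branches, n_taken, resolve_latency=3):
    # Halting fetch at every branch pays the resolution latency
    # regardless of whether the branch is taken.
    return n_insts + n_branches * resolve_latency

# For a block of 100 instructions with 10 branches, 4 of them taken,
# stalling at every branch is clearly the more expensive option.
print(cycles_always_untaken(100, 10, 4))    # 112
print(cycles_stall_on_branch(100, 10, 4))   # 130
```

This is why the always-not-taken rewrite is usually the more practical way to neutralize the predictor: it only penalizes taken branches rather than every branch.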
Hello, thanks for your reply. I am developing a performance simulator based on a timing database, and I use accurate cycle-latency information obtained with Verilator to build that database. The cycle counts estimated by my simulator are then compared against the Verilator results. Unfortunately, because of limited time, I cannot add a cache model to my simulator, so the final performance numbers are not comparable with the Verilator results. Adding a memory and reconnecting some ports would be one possible approach. I am writing to ask whether there is a configuration that can disable the caches simply, for example by writing some control registers. If so, it would save me a lot of effort. Thank you so much! :-)
In your case, a possible solution is to significantly increase the cache size. With such a setup, cache misses will occur only on the first access to each line. In particular, benchmarks like CoreMark and Dhrystone repeatedly run the same loop, so they will almost always hit the cache from the second run onwards. By comparing the performance of running the loop twice with that of running it once, you should obtain results similar to a scenario where every access is a cache hit.
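The subtraction described above can be sketched in a few lines; the cycle counts used here are made-up example values, not measurements from RSD:

```python
def warm_cache_cycles(cycles_one_run, cycles_two_runs):
    # With a cache large enough that only cold misses occur, the second
    # run of the loop executes with the caches already filled, so the
    # difference between the two-run and one-run totals approximates an
    # all-hit execution of the loop.
    return cycles_two_runs - cycles_one_run

# Hypothetical numbers: one run takes 12000 cycles (including cold
# misses), two runs take 20000, so the warm second run took about 8000.
print(warm_cache_cycles(12000, 20000))  # 8000
```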
Thank you for your suggestion, but it doesn't work in my case because the granularity of my timing database is instruction blocks of only a few hundred instructions each, and it does not depend on the real context of the full program execution. If you have any other ideas, please tell me; if not, I can close this issue. :-)
Thank you for your suggestions. I will close it now. :)
I'm sorry for the late reply. Have you tried pipeline visualization with Konata? The logs for the pipeline visualizer contain most of the per-cycle information about the core pipeline, so if you want per-instruction-block statistics, analyzing the log may help: it shows when each instruction was fetched and committed. By running CoreMark more than once and extracting the information from the second run, as I suggested at the beginning, you may be able to determine the number of cycles each instruction block takes when most accesses hit the caches. By the way, I think defining the number of cycles consumed by a fine-grained instruction block is not easy; for example, if you use the difference in commit cycles, it may not reflect the effect of instruction-cache misses.
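One way to turn such a log into per-block statistics is sketched below. This assumes the fetch and commit cycles have already been extracted from the visualizer log into simple (id, fetch_cycle, commit_cycle) tuples; the actual Konata log format uses per-stage records, and converting it into these tuples is not shown here:

```python
# Hypothetical pre-extracted records: (inst_id, fetch_cycle, commit_cycle).
# Extracting these from the real Konata pipeline log is left out.

def block_cycles(records, block_size=256):
    # Group committed instructions into fixed-size blocks and report
    # the span from the earliest fetch to the latest commit in each
    # block. Using fetch-to-commit spans (rather than commit-to-commit
    # differences) keeps I-cache miss latency visible in the numbers.
    blocks = []
    for i in range(0, len(records), block_size):
        chunk = records[i:i + block_size]
        first_fetch = min(r[1] for r in chunk)
        last_commit = max(r[2] for r in chunk)
        blocks.append(last_commit - first_fetch)
    return blocks

# Four instructions split into blocks of two; the second block starts
# with a late fetch, e.g. after a cache miss.
print(block_cycles([(0, 0, 5), (1, 1, 6), (2, 2, 9), (3, 10, 15)],
                   block_size=2))  # [6, 13]
```

The block size and the fetch-to-commit definition are both assumptions; as the comment above notes, commit-cycle differences alone would hide instruction-cache miss effects.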
Hello, I would like to ask whether there is an easy way to disable the I/D caches and the branch predictor. It seems that some cache and BP parameters can be adjusted in the configuration file, but they cannot be disabled entirely. Thank you so much. :-)