Larger simulations are not deterministic #32
Comments
I want to know if this bug has been fixed. I found that the results of the parallel engine also differ from those of the serial engine.
The problem is still there. If parallel simulation is used, the simulation will certainly be non-deterministic. The goal of making the simulation deterministic applies only to single-kernel serial simulation. Also, how different are the parallel simulation results from the serial ones?
In the FIR benchmark with 4096 * 32 samples to filter, the parallel simulation may be 3% slower than the serial simulation. In addition, I printed all the events and their scheduled times. In the parallel simulation, the first MMU event is scheduled at 0.0000000120, but in the serial simulation it is scheduled at 0.0000000350.
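A comparison like the one described above can be automated by dumping both event traces and locating the first entry where they disagree. The following is only a minimal sketch: the `Event` type and `firstDivergence` function are hypothetical names, not part of MGPUSim or Akita, and the `driver` entry is made-up filler. Only the two MMU timestamps come from this thread.

```go
package main

import "fmt"

// Event is a hypothetical record of one scheduled event.
// It is NOT a type from MGPUSim or Akita.
type Event struct {
	Time    float64 // scheduled time in seconds
	Handler string  // name of the handling component, e.g. "mmu"
}

// firstDivergence returns the index of the first position at which the two
// traces differ, or -1 if they are identical. A length mismatch counts as a
// divergence at the end of the shorter trace.
func firstDivergence(a, b []Event) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	// The "driver" entries are illustrative filler; the MMU times are the
	// ones reported in this thread.
	serial := []Event{
		{Time: 0.0000000010, Handler: "driver"},
		{Time: 0.0000000350, Handler: "mmu"},
	}
	parallel := []Event{
		{Time: 0.0000000010, Handler: "driver"},
		{Time: 0.0000000120, Handler: "mmu"},
	}
	fmt.Println("first divergence at index", firstDivergence(serial, parallel))
}
```

In a parallel engine, events with the same timestamp may legitimately be logged in a different order, so sorting each trace by time and handler before comparing may be needed to avoid false positives.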
I wonder what the possible cause of this problem is: Go or MGPUSim itself. If I knew the likely cause, I might be able to try fixing this bug myself.
Well, we cannot blame Go for this. There are definitely some features that cause non-deterministic execution, and we should avoid those. There is a good discussion on how to avoid non-deterministic behavior in Go at golang/go#33702, which also points to potential sources of non-deterministic behavior. One thing I am thinking about is to try creating super simple simulations. The root of the problem may be on the Akita side. The difference between parallel and serial simulations is a separate problem; I have created #45 for it. For now, can you mainly use the serial simulation?
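One well-known Go feature of this kind is map iteration: the language leaves the order of `range` over a map unspecified, and the runtime deliberately varies it between runs. Whether this is what affects MGPUSim or Akita is an assumption, not something established in this thread. The sketch below shows the usual workaround of iterating over sorted keys; `sortedKeys` and the `pending` map are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order so that callers can
// visit the map entries in a reproducible order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical per-component counters; ranging over the map directly
	// could visit them in a different order on every run.
	pending := map[string]int{"mmu": 3, "l2": 1, "dram": 2}

	for _, name := range sortedKeys(pending) {
		fmt.Println(name, pending[name])
	}
}
```

If events are scheduled while ranging over a map directly, the order in which same-time events are created can change from run to run, which is one way a serial simulation could still end up non-deterministic.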
I can use the serial simulation currently. Thanks for your reply.
@MaxKev1n Looks great! Can you start a pull request, and I can look deeper into it?
BTW, there is a deterministic test script under
@MaxKev1n Thanks for the PR. I am merging it. However, I do not think this problem is fully resolved, given the small difference that remains. Being fully deterministic mostly matters for debugging: when we find a bug, we want to rerun the program and have the bug occur at the exact same location. We will keep looking into the problem. I think we are close.
To Reproduce
MGPUSim version of commit ID:
40c4cd4
Command that recreates the problem
Current behavior
The estimated execution time differs from run to run.
Expected behavior
The estimated execution time should be the same.