Another one MP desyncs debugging #9768
there is some helpful info here: #8345 |
So, after days of debugging, I made some progress on the desync investigation.
Filed as PRs:
How I'm testing it right now, in case you are testing solo (maybe it will be useful for someone in the future):
If you are not going to buy a second copy of Civ5 and so on, you can just play with your friends until a crash or heavy desync occurs, then collect logs from all of them and compare everything in the same way. After #9767 it plays OK at least until midgame. I'm continuing the desync debugging and looking into some other possible causes right now. Just posting here for the record and for possible feedback. |
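As a rough illustration of the "collect logs and compare" step, here is a minimal C++ sketch that reports the first line at which two players' logs diverge. The function name `firstDivergence` is hypothetical, and the sketch assumes the logs were already read into memory line by line and contain no per-machine noise such as timestamps (those would need stripping first).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the zero-based index of the first line that
// differs between two logs, or -1 if the logs are identical.
// If one log is a strict prefix of the other, the divergence point is the
// length of the shorter log.
long firstDivergence(const std::vector<std::string>& hostLog,
                     const std::vector<std::string>& clientLog) {
    const std::size_t common =
        hostLog.size() < clientLog.size() ? hostLog.size() : clientLog.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (hostLog[i] != clientLog[i]) {
            return static_cast<long>(i);
        }
    }
    if (hostLog.size() != clientLog.size()) {
        return static_cast<long>(common);
    }
    return -1;
}
```

The earliest differing line is usually the most useful one, since everything after it may just be a consequence of the first divergence.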
After some debugging I noticed another root cause for desyncs:
i.e. it's confirmed while debugging desync with automated workers (elements with 'Evaluating' are sorted with I see there are a lot of |
yes that seems like a very good idea |
Agreed. |
I'm not sure what can cause such behavior but if we just replace all the |
can you create a PR for your changes then i'll take a look |
@ilteroi I could but there were just |
ok so i did a search and replace, now it's stable_sort everywhere and it's running fine. no idea what the problem is on your end. |
Strong work! For the great work desync, you might take a look at the network code for great work swapping, as we had to hijack it partially to get events to work in MP. may have unintentionally broken something else, as is tradition. |
that part should already be fixed: f0832f0 |
Oh nice |
I’ll buy everyone drinks if MP desyncs actually stop. |
@ilteroi Can you post a dll? Just tested another dll where I again replaced all the edit: It is definitely caused by edit2: Actually we loaded into a 4-player MP session with a dll with |
ilteroi code is on the oleole branch: 73a7cbb |
Well, it is definitely crashing and it is definitely |
@ilteroi I see you merged |
i think this stable_sort business is a red herring; while regular sort does not guarantee the order of equal elements the result is deterministic. we run the same code on both ends, so it should not matter. (side note: unless we are sorting raw pointers, but we don't do that and they would typically be unequal) but there is a comment here which sounded quite positive: https://forums.civfanatics.com/threads/new-version-3-4-1-may-9-2023.683656/page-2#post-16454134 -- so something seems to have improved with your other changes! |
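To make the point about equal elements concrete, here is a small C++ sketch. It is not taken from the DLL: `Candidate`, `byScoreOnly` and `byScoreThenId` are hypothetical names. With a score-only comparator, `std::sort` leaves the relative order of ties unspecified, while `std::stable_sort` keeps their input order. Adding a unique tie-breaker makes the comparator a strict total order, so the result no longer depends on the input order or on which of the two algorithms is used.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scored item (e.g. a plot a worker evaluates).
struct Candidate {
    int id;     // unique per item, assumed identical on every peer
    int score;  // may tie with other candidates
};

// Compares by score only (descending). Items with equal scores are
// "equivalent", so std::sort may place them in any relative order.
inline bool byScoreOnly(const Candidate& a, const Candidate& b) {
    return a.score > b.score;
}

// Compares by score (descending), then by unique id (ascending).
// No two distinct items are equivalent, so the sorted order is unique.
inline bool byScoreThenId(const Candidate& a, const Candidate& b) {
    if (a.score != b.score) return a.score > b.score;
    return a.id < b.id;
}

inline std::vector<int> ids(const std::vector<Candidate>& v) {
    std::vector<int> out;
    for (std::size_t i = 0; i < v.size(); ++i) out.push_back(v[i].id);
    return out;
}

// Ties keep the order they had in the input.
std::vector<int> idsAfterStableSort(std::vector<Candidate> v) {
    std::stable_sort(v.begin(), v.end(), byScoreOnly);
    return ids(v);
}

// Ties are resolved by id, independent of the input order.
std::vector<int> idsAfterTotalOrderSort(std::vector<Candidate> v) {
    std::sort(v.begin(), v.end(), byScoreThenId);
    return ids(v);
}
```

Note that `std::stable_sort` only helps if the input order is already the same on both ends; the tie-breaker variant does not need that assumption.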
Hi, I played a 200-turn game with 2 players and 4 AI. I also found a desync at turn 0 when using the Inca. In case it helps, here is the link to the issue: |
Hello! I'm doing some step-by-step debugging right now, and after fixing the desyncs caused by #9767 there were desyncs caused by `CvPlayer::m_iLastSliceMoved` on CS AI players. I did some code digging and I don't see this variable being used for anything except some local "stuck checking" (CvGame.cpp line 9854). It is set for each player with the `getTurnSlice()` value, which differs between host and non-host players. I'm not sure whether it is causing loading screens and so on, but at least the state is desynced, because `CvPlayer::m_iLastSliceMoved` on CS AI players differs between host and non-host players. Maybe we should remove this value from the synchronization check altogether?

Another interesting point is that this variable (according to the code prior to the very first commit) was synced in the original dll too. I have not tested it, but I think it was so. If so, then this desync is caused by our new (?) `m_iTurnSlice` incrementation logic for host and non-host players (I tested it and it really does update differently), and it can affect other variables too. In that case it is better to find the change that caused `m_iTurnSlice` to desync.

Anyway, I tried just removing `SYNC_ARCHIVE_VAR` and `visitor` on `CvPlayer::m_iLastSliceMoved`, and it looks like there were no "Variable Out Of Sync" messages for the first 30 turns, but I'm not sure if that is the correct way of removing a value from the synchronization check. I will do a bigger test in the coming days.
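The idea of leaving a locally timed value out of the synchronization check can be sketched in isolation as below. This is not how `SYNC_ARCHIVE_VAR` works in the DLL; `PlayerState`, `mix` and `syncChecksum` are hypothetical names, and the FNV-1a style hash is just a stand-in for whatever comparison the real sync code performs.

```cpp
#include <cstdint>

// Hypothetical, simplified player state.
struct PlayerState {
    int gold;            // gameplay state, must match on all peers
    int unitsMoved;      // gameplay state, must match on all peers
    int lastSliceMoved;  // local timing bookkeeping, may differ per peer
};

// Folds the four bytes of one int into an FNV-1a style running hash.
inline std::uint32_t mix(std::uint32_t hash, int value) {
    const std::uint32_t v = static_cast<std::uint32_t>(value);
    for (int i = 0; i < 4; ++i) {
        hash ^= (v >> (8 * i)) & 0xFFu;
        hash *= 16777619u;  // FNV prime
    }
    return hash;
}

// Checksum over the fields that take part in the sync check.
// includeLocalTiming = true mimics the situation where the slice counter
// is compared across peers; false mimics excluding it from the check.
std::uint32_t syncChecksum(const PlayerState& s, bool includeLocalTiming) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    hash = mix(hash, s.gold);
    hash = mix(hash, s.unitsMoved);
    if (includeLocalTiming) {
        hash = mix(hash, s.lastSliceMoved);
    }
    return hash;
}
```

Two peers with identical gameplay state but different slice counters produce matching checksums only when the timing field is excluded. Excluding a field hides the mismatch rather than fixing its cause, so this is only safe if the value really has no effect on gameplay decisions.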