OOS when playing cross platform without Fog Of War #102

Closed
titiger opened this issue Jan 1, 2016 · 15 comments

Comments

@titiger
Member

@titiger titiger commented Jan 1, 2016

Game gets OOS if you play multiplayer with Fog Of War disabled.
This is not a new thing! 3.11.1 has this problem too.

@titiger
Member Author

@titiger titiger commented Jan 4, 2016

OK, we tested and it looks like this does NOT happen with the git version! Maybe something changed and it is working now. Sorry for the false alarm; I'm closing this ticket for now.

@titiger titiger closed this Jan 4, 2016
@andy5995
Contributor

@andy5995 andy5995 commented Feb 10, 2016

This may need to be re-opened.

Today while connected to MG server:
http://postimg.org/image/d5t0wxt05/
That crash happened for everybody at the same time.

Earlier today while connected to my server, but not sure if it's out of sync (Jammy reports):
https://forum.megaglest.org/index.php?topic=9801

In both instances, FOW was disabled.

I'm using my own build, but I haven't modified any of the code. ;)

@andy5995
Contributor

@andy5995 andy5995 commented Feb 11, 2016

I saw this in my terminal. I noticed the time stamp was the same as in the image I posted above. (I forgot to look at my term output after the crash)

Game unique identifier is: a606ea36-d016-11e5-81be-b1749e6f3e3e
*ERROR* [2016-02-10 15:03:16] In [commander.cpp::buildCommand Line: 1010]
Can not find command type for network command = [networkCommandType = 0
unitId = 400054
commandTypeId = 5
positionX = 223
positionY = 32
unitTypeId = 19
targetId = 0
wantQueue= 0
fromFactionIndex = -1
unitFactionUnitCount = 48
unitFactionIndex = 4, commandStateType = 0, commandStateValue = -1, unitCommandGroupId = -1]
Commands:  id = 0 id = 1
for unit = 400054
[pig]
[

HP: 300/300 (Regeneration: 2)
Armor: 0 (organic)
Sight: 10
Produce: 5 food
stop]
actual local factionIndex = 4.
Unit Type Info:
[Unit Name: [pig] id = 15 maxHp = 300 hpRegeneration = 2 maxEp = 0 startEpValue = 0 startEpPercentage = 0 epRegeneration = 0 maxUnitCount = 0 fields index = 0 value = 1 fields index = 1 value = 0 properties index = 0 value = 0 properties index = 1 value = 0 armor = 0 armorType Name: [organic id = 0 light = 0 lightColor = x [0] y [0] z [0] multiSelect = 1 commandable = 1 sight = 10 size = 1 height = 1 rotatedBuildPos = 0.000000 rotationAllowed = 1 skillTypes: [3] i = 0 Stop i = 1 Move i = 2 Die commandTypes: [2] i = 0 Stop i = 1 Move storedResources: [0] levels: [0] meetingPoint = 0 countInVictoryConditions = 0]
Network unit type:
[worker]
isCancelPreMorphCommand: 0
Game out of synch.
*ERROR* [2016-02-10 15:03:17] In [game.cpp::update Line: 2783] Error [Error [#3]: Game is out of sync, please check log files for details.
Stack Trace:
megaglest:Shared::Platform::megaglest_runtime_error::megaglest_runtime_error(std::string const&, bool)address [0x7fdef244ca01] line: 0
megaglest:Glest::Game::Commander::buildCommand(Glest::Game::NetworkCommand const*) constaddress [0x7fdef1db19ea] line: 0
megaglest:Glest::Game::Commander::giveNetworkCommand(Glest::Game::NetworkCommand*) constaddress [0x7fdef1db4236] line: 0
megaglest:Glest::Game::Commander::updateNetwork(Glest::Game::Game*)address [0x7fdef1db78b7] line: 0
megaglest:Glest::Game::Game::update()address [0x7fdef1e50a54] line: 0
megaglest:Glest::Game::Program::loopWorker()address [0x7fdef1f82634] line: 0
megaglest:Glest::Game::glestMain(int, char**)address [0x7fdef1f6e1ff] line: 0
megaglest:Glest::Game::glestMainSEHWrapper(int, char**)address [0x7fdef1f727d2] line: 0
/lib/x86_64-linux-gnu/libc.so.6:__libc_start_main()address [0x7fdeed459b45] line: 0
megaglest:()address [0x7fdef1d3dbaa] line: 0
]
** #2 Socket peek error for sock = -1 err = -1 lastSocketError = 104 mustGetData = 0
** Disconnecting sock = -1
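
Reading the log above: the network command asks for commandTypeId = 5 on unit 400054, which the sender treats as a [worker], while the local unit with that id is a [pig] that only has commands 0 (Stop) and 1 (Move). A minimal sketch of that kind of failed lookup follows. The types and function names here are hypothetical stand-ins for illustration, not the actual commander.cpp code:

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not the MegaGlest classes.
struct CommandType {
    int id;
    std::string name;
};

struct UnitType {
    std::string name;
    std::vector<CommandType> commandTypes;
};

// The pig as the log describes it: commands Stop (id 0) and Move (id 1).
UnitType makePig() {
    return UnitType{"pig", {{0, "Stop"}, {1, "Move"}}};
}

// Look up a command by id on the unit type the local game holds for a unit.
// Returns std::nullopt when that unit type has no such command.
std::optional<CommandType> findCommandType(const UnitType& unitType, int commandTypeId) {
    for (const CommandType& ct : unitType.commandTypes) {
        if (ct.id == commandTypeId) {
            return ct;
        }
    }
    return std::nullopt;
}
```

With this sketch, findCommandType(makePig(), 5) comes back empty, which corresponds to the "Can not find command type" situation: the two sides no longer agree on what unit 400054 is.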

@tomreyn
Member

@tomreyn tomreyn commented Feb 14, 2016

Master server statistics for this game

Had you verified that you were all using a build produced from the same git revision?

@andy5995
Contributor

@andy5995 andy5995 commented Feb 14, 2016

tomreyn, I built using the 3.12.0 source and embedded tarballs. The data I'm using has been copied from the directory where the 3.12.0 installation package installed files to.

Yesterday I hosted 4 games, lasting 36:41, 9:24, 27:53, and 15:57. All ran with FOW enabled and completed with no crashes.

And all the games I played on MG servers yesterday were played with FOW enabled, and there were no crashes. Two games lasting 28:29 & 56:59.

I had really bad lag the first game of the day, but I was using a different WiFi adapter. After I swapped that out, I played the other five games with no problem.

Hmmm... going back to your original question... There's only a very small chance that emi or aurel were using different revisions. I played a game with emi and Jammy earlier that day, with FOW enabled, lasted for ~30 minutes and completed with no errors. And we've all played many games with aurel with no OOS trouble.

@tomreyn
Member

@tomreyn tomreyn commented Feb 14, 2016

Thanks for providing these details. I'm not yet convinced this is a general issue, but let's reopen it for now just so we remember we should test cross platform games without FOW for OOS more before the next release.

@andy5995
Contributor

@andy5995 andy5995 commented Feb 15, 2016

You are welcome, Tom. I have more details. More interesting details.

I reproduced this bug 5/5 times. The first time, I tried my own build. The last 4 times, I used the build from the installer.

I hosted a game on my LAN. In another terminal, I set my home directory to 'temp' (HOME=temp/ ./start_megaglest) and ran another instance of MG. I connected to the host for a game.

The set-up I used for the last 4 games was Conflict 4player map. Team 1: CPU and me. Team 2: CPU and me(2). Tileset Autumn. All 4 players were Roman.

This is from the game listed in the stats at 2016-02-15 10:07:48 with a duration of 00:06:48

A screenshot from the client with the OOS message:
http://s2.postimg.org/60h9d6yrd/screen3.jpg

The 5th time I ran the server and the client with the verbose option.

There are 2 verbose text files inside the following archive:
20160215_FOW_test_andy5995.zip
https://drive.google.com/file/d/0B-Ixr8t8mjDsWUJPdzVMZk4yaWM/view?usp=sharing

Each OOS crash happened within the first 10 minutes of the game. I suspect it might have something to do with Roman archers morphing into fire archers, but I do not have any real evidence of that. The only reason it occurred to me was the 'isCancelPreMorphCommand' message in the error output (morph being the keyword). It doesn't necessarily happen in the first battles, but still very early in the game, apparently after temples are built and archers are able to get fire.

@andy5995
Contributor

@andy5995 andy5995 commented Feb 16, 2016

I've run some more tests.

I've played all Indian factions, and the outcome is the same. OOS.

I've played a game using all seven factions. OOS.

In all the tests, including the ones I mentioned in my previous ticket entry, the client doesn't actually 'crash', but it spits out the OOS message (as shown in the screenshots), and disconnects from the server. I can access the menu and end the program normally.

The server stays running.

This reproduces for me every time. Titi said in his Jan 3 comment that the error can't be reproduced. Could it be related to the libs megaglest is linked to on my system? Debian 8 x64.

@andy5995
Contributor

@andy5995 andy5995 commented Apr 23, 2016

titi and I played 2 games to test, but no crash.

As I mentioned, during my tests (every one of which crashed) I was playing on my own computer. The client (using a separate userDatadir) was running on the same computer as the host. I have 16 GB of RAM and a 4-core CPU @ 3.20 GHz.

Titi suggested I test my RAM. I did that yesterday. I ran memtest from a boot CD, and got zero errors.

Today I tried testing by myself again. The first 2 games didn't crash. Then I tried a couple of different set-ups and started crashing.

What seems to reproduce an OOS crash every time is when my network opponent is on a different team than I.

This is one set-up that is crashing every time FOW is disabled.

Conflict map. Tileset: Autumn

Team 1: CPU (Mega) Indian
Team 1: Human Indian

Team 2: CPU (Mega) Magic
Team 2: Human Indian

@andy5995
Contributor

@andy5995 andy5995 commented Apr 23, 2016

Custom_95 reproduced this bug with the set-up mentioned above, 5 times in a row. We played with me hosting, then he hosted. Then we played on headless.

The odd thing is, it did reproduce once when he and I were on the same team. But the first few games, he and I were on opposing teams, as outlined in the set-up in my previous ticket entry.

All games crashed in under 7 minutes. Most crashed in under 5 minutes.

@titiger
Member Author

@titiger titiger commented Apr 24, 2016

This OOS happens for 2 Linux players too!
Here are the log files from a game between me and my son, both running the same Linux distribution on very similar hardware:
http://titi.megaglest.org/logs/

@titiger
Member Author

@titiger titiger commented May 23, 2016

I am getting very close to the problem now.
My setup to hunt the bug has client and server running on the same computer.

Map is Conflict and the setup is like this:
slot1: Server Team1
slot2: CPU-Ultra Team1
slot3: CPU-Ultra Team2
slot4: Client Team2
And of course with FOW disabled.

What I found is that the client gives a lot more commands than the server in
void UnitUpdater::updateStop(Unit *unit, int frameIndex) {
line 517.
Where the command is:
unit->giveCommand(new Command(ct, sighted->getPos()));
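
A comparison like this (client issuing more commands than the server at one call site) can be narrowed down by counting the commands issued per frame on each side and looking for the first frame where the counts differ. A minimal sketch of that comparison step follows; the function name and the idea of collecting per-frame counters into vectors are assumptions for illustration, not something MegaGlest provides:

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: each vector holds the number of commands issued at one
// call site per frame, as recorded on the server and on the client.
// Returns the index of the first frame where the two recordings differ,
// or std::nullopt if they are identical.
std::optional<std::size_t> firstDivergentFrame(const std::vector<int>& server,
                                               const std::vector<int>& client) {
    const std::size_t common = std::min(server.size(), client.size());
    for (std::size_t frame = 0; frame < common; ++frame) {
        if (server[frame] != client[frame]) {
            return frame;
        }
    }
    // One side recorded more frames than the other: that also counts as divergence.
    if (server.size() != client.size()) {
        return common;
    }
    return std::nullopt;
}
```

The first divergent frame is usually more useful than the total count, because everything after it may just be a consequence of the first difference.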

@softcoder
Member

@softcoder softcoder commented May 27, 2016

I am working on a fix for this. I have discovered the issues causing it and am working on correcting them:

  • Bad calculations in world.cpp for cell visible and explored
  • Unsafe use, in some cases, of threaded randomization
  • Floating-point rounding issues
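
To illustrate why the last two items can break a lockstep game in general, here is a small generic sketch. It is not the MegaGlest code and not the actual fix; all names are made up for the example. It shows that float addition depends on evaluation order while integer (fixed-point style) addition does not, and that when two consumers share one seeded generator, the order in which they draw decides who gets which number:

```cpp
#include <cstdint>
#include <random>

// Same three values, two evaluation orders.
template <typename T>
T sumLeftFirst(T a, T b, T c) { return (a + b) + c; }

template <typename T>
T sumRightFirst(T a, T b, T c) { return a + (b + c); }

// Hypothetical example: two units draw one number each from a shared generator.
struct Draws {
    std::mt19937::result_type unitA;
    std::mt19937::result_type unitB;
};

// With the same seed, the generator always yields the same sequence, but which
// unit receives which value depends on the order of the draws. If threads race
// for the generator, that order can differ between machines.
Draws drawInOrder(std::uint32_t seed, bool unitAFirst) {
    std::mt19937 rng(seed);
    Draws d{};
    if (unitAFirst) {
        d.unitA = rng();
        d.unitB = rng();
    } else {
        d.unitB = rng();
        d.unitA = rng();
    }
    return d;
}
```

For example, sumLeftFirst(1e20f, -1e20f, 1.0f) gives 1.0f while sumRightFirst with the same arguments gives 0.0f, because the 1.0f is lost when it is added to -1e20f first. The integer versions agree in either order as long as nothing overflows.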

@filux
Contributor

@filux filux commented May 27, 2016

In my view, once this is fixed, these two reports should also be closed:
https://forum.megaglest.org/index.php?topic=9165.0
https://forum.megaglest.org/index.php?topic=9110.0
because it is highly probable this is the same bug, known commonly as "isCancelPreMorphCommand OOS".

@titiger
Member Author

@titiger titiger commented Jun 7, 2016

This is fixed now.

@titiger titiger closed this Jun 7, 2016