Crash Or Hung Core While Npc Try Move To Water #1936

Closed
datchannin opened this issue Apr 23, 2019 · 13 comments
Labels
Expansion: TBC (2.4.3) Issues relating to the TBC Expansion (2.4.3). Info: Needs Replication Issue needs replication before further action. System: Creatures nodes, waypoints, spawn_group, texts, etc. Type: Client / Core Crash Issue causes the client or core to crash.

Comments

@datchannin

🐛 Bugreport

First of all, I'm not sure whether this counts as a crash or not. I've never seen such behavior before.
I first ran into this hang on 08 Feb 2019, and I've spent 10 weeks reproducing and confirming it.

Several kinds of NPC (I've confirmed only one, but I think there are more) hang the core when they try to move into water. You need to find a place where the ground slopes gradually into the water, and then aggro the NPC. After that, move into the water, but only by jumping. Before you jump, the NPC should not be near you. See the video to understand. Maybe someone can explain the movement more precisely.

https://radikal.ru/video/eeeImyeKE1l

The NPC starts moving toward you, and when it tries to enter the water, the core hangs. The core does not crash, but you cannot run console commands, do anything in the game, or log in to the game. CTRL+C doesn't work either; I can only close the console window. Maybe this is a kind of crash?

Expected behavior

Should not happen.

Version & Environment

Client Version: ["2.4.3" (TBC)]

CMaNGOS Repo & Commit Hash: fc792319768322ee4937f7dcf6d7c57fe56df0bd
Database Repo & Commit Hash: a5d6c50ec4d99270b642ae3a30e2ee774cf8d3ec

Operating System: [Win 64]

Steps to reproduce

  1. I found one place and one NPC that reproduce this almost 100% of the time.

  2. .go creature id 1159
    image

  3. Move to the ship and use .add npc 1159 to add the NPC on the ship.
    image

  4. Stand near the water as in the video and aggro the NPC.
    image

  5. After the aggro, move to the water, jump in, and swim down.

  6. Profit

Crashlog

  • None
@datchannin
Author

I found that it is not limited to npc=1159.
I checked with npc=1157 added at the same place, and the core hung again.
So maybe it is connected with the distance between the NPC's spawn coordinates and the player's jump coordinates.

@evil-at-wow

The way you describe the issue, it sounds like there's an infinite loop somewhere in the core. As you mention, that is not a crash (the world process keeps running), but it's not a good thing either because the core no longer makes any progress. I can't read Russian, but it looks like your client disconnects from the server at the end of your video, which would confirm that the server is no longer sending anything to your client (because it's stuck in the infinite loop).

Do you build the server yourself, or do you download a release and use that?

@datchannin
Author

I downloaded a release and used it.

Yes, at the end of the video the client was disconnected. But that was not after CTRL+C; I closed the console manually.

@evil-at-wow

I download a release and used it.

Ok, no problem. In that case it's probably not very useful to try to explain how to use the debugger to get an idea of what the core is doing when it hangs. I'll see if I can find the time this weekend to try and reproduce the issue. At least we have good instructions on how to do that 😄

Yes, at the end of video client was disconnected. But not after CTRL+C, I closed the console manual.

Actually, I think the client already disconnects before you close the server console. At 1:03 you can still see the console, but the client already seems to disconnect. Which also suggests the server is no longer responding to the client (but we know that's probably the case, because it's not responding to console commands either).

@BlessedCammi

BlessedCammi commented Apr 24, 2019

I can't seem to replicate the issue. I tried a good number of times following your instructions and video.

@datchannin
Author

Did you extract vmaps with "-l" option and maps with "-f 0" option?
I've reproduced several times on the latest release. So if you can't, could it be related to map extraction?

@datchannin
Author

I've run it in debug mode. C++ is not my main language, though; I work with SystemVerilog.

As far as I can see, there is an infinite loop in dtStatus dtNavMeshQuery::findPath(); maybe this is the root of the issue.

image

The recastnavigation update that touched this file (DetourNavMeshQuery.cpp) was done on 24 Jun.
I first ran into this hang on 08 Feb. Maybe that commit introduced the issue (commit 1e89c78913fa9c8541426f6064de6e1a1c147131).

I also think this may be connected with report #1862, about an NPC that cannot leave water after aggro (seen on 05 Feb).

@evil-at-wow

After several attempts and almost giving up thinking it was working fine on my system, I eventually managed to reproduce this (all the time trying it the way you show in the video). So it might take some attempts to hit the problem. I'm guessing here, but the trick might be to jump in the water immediately after pulling and then quickly swimming relatively far and deep. That seems to give the best "success" rate in my case.

For reference, this was my setup, although it's not that critical I think (see later):
cmangos/mangos-tbc@fc79231
cmangos/tbc-db@dcf6ebf
I'm on Linux (64-bit). Maps were extracted with the -f 0 option, vmaps were extracted with -l option. In the server configuration, vmaps and mmaps are on.

I had a look in my debugger, and you're correct: dtNavMeshQuery::findPath() goes on forever, so the call never finishes. As a result, the core doesn't make any more progress, which practically hangs the server. The loop you outline should clearly take/remove nodes from m_openList - the pop() below - and thus eventually make it empty, but that doesn't seem to happen. I put a breakpoint with commands after the m_openList.pop() line and let my gdb print bestNode details while running. The same nodes keep returning all the time, so they are removed and then inserted again later, over and over. There's not really a clearly repeating pattern I can see, but it's pretty close. Blocks of up to 6-7 nodes keep coming back again and again, interspersed with a bunch of other nodes in what seems an irregular pattern.
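
The gdb observation above (the same nodes being popped again and again) can also be captured in code. What follows is a minimal sketch under stated assumptions, not Detour's implementation: a toy best-first loop over plain integer node ids, with a per-node pop counter and an iteration cap so that a search which never drains its open list is reported instead of hanging the process. The names SearchResult, boundedSearch, shouldReinsert and maxIterations are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <vector>

// Hypothetical result type; not part of Detour or CMaNGOS.
struct SearchResult
{
    bool finished;                // true if the open list drained on its own
    std::size_t iterations;       // number of pops performed
    std::size_t maxPopsOfOneNode; // highest pop count seen for a single node
};

// Toy best-first loop. shouldReinsert(node, popCount) stands in for the
// relaxation step: returning true models a node being pushed back onto
// the open list after it was popped.
SearchResult boundedSearch(const std::vector<int>& startNodes,
                           const std::function<bool(int, std::size_t)>& shouldReinsert,
                           std::size_t maxIterations)
{
    assert(maxIterations > 0);

    // Min-heap of node ids, playing the role of the open list.
    std::priority_queue<int, std::vector<int>, std::greater<int>> openList;
    for (int node : startNodes)
        openList.push(node);

    std::map<int, std::size_t> popCount;
    SearchResult result{false, 0, 0};

    while (!openList.empty())
    {
        // Cap reached: report the situation instead of spinning forever.
        if (result.iterations >= maxIterations)
            return result;

        const int node = openList.top();
        openList.pop();
        ++result.iterations;

        const std::size_t count = ++popCount[node];
        result.maxPopsOfOneNode = std::max(result.maxPopsOfOneNode, count);

        if (shouldReinsert(node, count))
            openList.push(node);
    }

    result.finished = true;
    return result;
}
```

Calling boundedSearch with a callback that always returns true models the hang: the result comes back with finished == false and a maxPopsOfOneNode well above 1, which is the same signal as seeing identical bestNode values printed repeatedly in the debugger.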

I'm not familiar at all with the mmap/recast code, so I can't really say what's going on without spending a lot of time on it, but I think you are correct that the update of recast is what introduced this problem.

I also had a look at the upstream recast project, and I found something very interesting there: recastnavigation/recastnavigation#343. It's a known issue, and I also noticed @cyberium commenting in that thread, so it looks like he's aware as well. There's also some good discussion in that issue by @jackpoz from TrinityCore, and a link to a similar issue reported in TrinityCore (see TrinityCore/TrinityCore#23028) with more discussion and useful info. It looks like several changes have been made to fix this, but without a more in depth understanding I'm not inclined to suggest patches here. I'll let @cyberium (or someone else) handle this from here.

@datchannin
Author

I've already tried different types of map extraction. It doesn't affect the hang. Pathfinding freezes with both high- and low-resolution maps of all types.
I've also checked the same place in Trinity 3.3.5. It works there without freezing.

@jackpoz

jackpoz commented Apr 27, 2019

This issue was discussed at recastnavigation/recastnavigation#373 too which resulted in recastnavigation/recastnavigation#374 and recastnavigation/recastnavigation#381

It's basically caused by some new math optimization in recast resulting in NaN in particular conditions but it has been fixed on latest recast commit. Just update recast and you are all set.
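
To see why a NaN cost can keep a path search from terminating, recall that every ordered comparison involving NaN evaluates to false under IEEE 754. The sketch below is an illustration only and is not the recast code: isNotBetter is a hypothetical stand-in for the kind of "the new cost is no improvement, skip this node" check that a best-first search relies on, and isNotBetterNanSafe is one assumed way to guard such a check.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical skip check: true means "the new cost is no improvement,
// so do not put the node back on the open list".
bool isNotBetter(float newTotal, float oldTotal)
{
    return newTotal >= oldTotal; // false whenever either value is NaN
}

// Assumed guard: treat a NaN cost as "not better", so a node is never
// reinserted because of it.
bool isNotBetterNanSafe(float newTotal, float oldTotal)
{
    if (std::isnan(newTotal))
        return true;
    return newTotal >= oldTotal;
}

// Sanity checks documenting the behaviour described above.
void checkNanBehaviour()
{
    const float nanCost = std::numeric_limits<float>::quiet_NaN();
    assert(!isNotBetter(nanCost, 10.0f));       // the plain check never skips a NaN cost
    assert(isNotBetterNanSafe(nanCost, 10.0f)); // the guarded check does
}
```

With the unguarded check, a NaN cost is never rejected as "worse", so a node carrying it can be put back on the open list every time it is reached, which would match the remove-and-reinsert behaviour seen in the debugger.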

@datchannin
Author

I've tested the latest version of recastnavigation. findPath() works correctly and the core doesn't hang.
The NPC starts running through textures after evade, but that is a minor problem.
https://radikal.ru/video/e9RrLsLf6NT

@jimmybrancaccio jimmybrancaccio added System: Creatures nodes, waypoints, spawn_group, texts, etc. Type: Client / Core Crash Issue causes the client or core to crash. Expansion: TBC (2.4.3) Issues relating to the TBC Expansion (2.4.3). labels May 27, 2019
@jimmybrancaccio jimmybrancaccio added the Info: Needs Replication Issue needs replication before further action. label Jun 9, 2019
@berserkingyadis

berserkingyadis commented Jun 9, 2019

Did we update to the latest recastnavigation?
@cyberium

@killerwife

Recast has been updated many times since then, so I'm closing this. If the problem persists, open a new issue.
