Skip to content
This repository has been archived by the owner on Apr 17, 2022. It is now read-only.

dereference of a pointer to freed memory durring level load in event.c #1656

Closed
wzdev-ci opened this issue Mar 2, 2010 · 16 comments
Closed

Comments

@wzdev-ci
Copy link
Contributor

wzdev-ci commented Mar 2, 2010

keyword_AI_events_scripting_triggers resolution_fixed type_bug | by Ai_Tak


This happens around on line 1046 of event.c in function eventFireCallbackTrigger after you start loading a new game after the 11th time interpRunScript is called from line 1046 (3rd time after startGameLoop). psCurr (reletive to eventFireCallbackTrigger) has already been unlinked from the list (on line 1041), in a call to eventRemoveTriggerFromList (called from interpRunScript) psCurr->psNext (as seen from eventFireCallbackTrigger) is freed (line 1255) and unlinked from the list, but since psCurr is ALREADY unlinked from the list, psCurr->psNext->psNext is linked to psPrev (psPrev->psNext is updated) rather than psCurr->psNext being updated. psCurr->psNext and psNext now both point to freed memory and could potentially lead to an access violation on line 989. This happens more often debug builds and almost always with standard page heap verification enabled. In release builds it may also result in a crash, although since the freed memory may still be accessible, it would more often result in the (freed) event at psCurr->psNext (maybe?) being fired a second time in this one instance.

I've attached a diff that corrects this problem, but it is more of a workaround. Without this patch I can only start a (skirmish) game running under the debugger (msvc) about 1 out of 20 times. With it I can't get it to crash under any load conditions.


Issue migrated from trac:1656 at 2022-04-15 21:11:21 -0700

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 2, 2010

Ai_Tak uploaded file event.c.diff (0.8 KiB)

diff against beta11a (r10033)

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Per commented


Ouch. That trigger code is really dangerous. Instead of trying a desperate workaround, I would like to do deferred deletion here, by marking nodes as fit for deletion, then only deleting when safely outside the event firing iteration.

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Ai_Tak commented


That's what I was thinking would be best for such a situation. Given this is the first time I've reviewed this section of code I wasn't about attempt a comprehensive (proper) fix, at least with this workaround I can run under the debugger. It's especially hard to follow the execution process through the script interpreter (interpRunScript) calls.

I wonder which version was first to trigger this fault (is there a way to review edits to a specific file?). I also wonder about the nature of specific trigger that's getting iterated over (and possibly fired) a second time after it's freed, but I don't have enough information about ACTIVE_TRIGGER or SCRIPT_CONTEXT type objects. Wait, quick debug session and a handy function called "eventGetEventID" and I now see that psCurr->psNext (first to be deleted, and possibly re-fired) is "doResearch" and psCurr is "startLevel".

Is the deferred deletion fix something that would be aimed for the 2.3.0 release?

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Per changed priority from critical to blocker

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Per changed keywords from `` to AI events scripting triggers

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Per changed operating_system which not transferred by tractive

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 3, 2010

Per commented


This needs to go into 2.3, I think.

The bug has probably always been there, the code has been unchanged for a very long time, but it is only recently that I've added frequent use of setEventTrigger() during AI script execution, which I believe is why this bug is becoming a real issue now.

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Mar 7, 2010

Buginator commented


Yeah, I ran into this today, too bad I was debugging something else.

Will try to find what the deal is with this next time I have some more free time.

@wzdev-ci
Copy link
Contributor Author

Buginator commented


Replying to [#1656 Ai_Tak]:

I've attached a diff that corrects this problem, but it is more of a workaround. Without this patch I can only start a (skirmish) game running under the debugger (msvc) about 1 out of 20 times. With it I can't get it to crash under any load conditions.

When I run under MSVC, I rarely get that issue, but I have seen it pop up every now or then. I know it is (yet another) dangling pointer, but I just don't have enough time to find the sucker (yet).

@wzdev-ci
Copy link
Contributor Author

Ai_Tak commented


Enable standard page heap verification (only for warzone of course) with gflags.exe, any freed memory will be overwritten with the F0F0F0F0 pattern when freed. If you have enough memory you can enable full page heap verification, all freed memory will be inaccessible and fault on access.

Getting past the mapFloodFill function with full page heap verification it kinda tough as it has a linked list more than 150,000 entries long, each link gets its own page of memory... I'm not quite sure what mapFloodFill does or why it needs to make such a huge linked list (so many small allocations).

@wzdev-ci
Copy link
Contributor Author

Buginator changed priority from blocker to critical

@wzdev-ci
Copy link
Contributor Author

Buginator commented


I won't have any time to find a fix (or see if the my current fix is good enough) for this until April or later, and it isn't a 'blocker' since it only happens on MSVC builds.
The reason for that is, after the pointer becomes dangling, it is marked 0xddddddd on MSVC builds, and 0x0 on other platforms, so it don't crash.

@wzdev-ci
Copy link
Contributor Author

Ai_Tak commented


If psNext (in the freed structure) was marked as 0x0 then it would skip processing the rest of the linked list. Either the freed memory is still accessible (and points to the real next entry) or its the last item in that block (allocated by the OS) and when its freed the next dereference leads to an access violation. MSVC is just overwriting the freed memory so it happens (almost) every time freed memory is accessed.

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Nov 8, 2010

Buginator changed status from new to closed

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Nov 8, 2010

Buginator changed resolution from `` to fixed

@wzdev-ci
Copy link
Contributor Author

wzdev-ci commented Nov 8, 2010

Buginator commented


Ai_Tak, thanks for the patch & detective work.

Sorry about the delay.

Closing this ticket to prevent you from getting spammed by all the attachments / commit messages, so if you still want to follow-up, see #2300

Thanks again!

@wzdev-ci wzdev-ci closed this as completed Nov 8, 2010
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant