Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silent crashes on levels with Medusa #483

Open
tpoppins opened this issue Dec 16, 2014 · 20 comments
Open

Silent crashes on levels with Medusa #483

tpoppins opened this issue Dec 16, 2014 · 20 comments

Comments

@tpoppins
Copy link
Contributor

tpoppins commented Dec 16, 2014

This is in addition to the still-open #66, #178 and #363.

WAD:
SUDTIC.WAD E2M1
demos:

  • Sudtic1.lmp from sudticmt.zip - crash
  • SUD21NAF.LMP from sudtic_na.zip - crash
  • s2t1-238.lmp from sudt-ryb.zip - no crash
  • s2t1-136.lmp from sudticnm_na.zip - no crash

This is due to Medusa caused by linedefs 457, 834 and 836 having BROWN1 as main texture.
Results produced running -timedemo. Adding "-mb 8" and/or setting snd_samplerate to as high as 96000 doesn't make a difference (unlike with older versions). Adding "-nosound" fixes SUD21NAF but not Sudtic1.

WinXP SP3, Chocolate Doom v2.1.0.
I played this map dozens of times in vanilla under Win98 in the 90s, and many dozens more in 2003 while working on the -fast demopak. Vanilla never once crashed there, there wasn't even a slowdown (the above linedefs' main textures should not be visible under any circumstances anyway).

edit: DSDA link updated

@haleyjd
Copy link
Contributor

haleyjd commented Dec 19, 2014

Most of these issues have been getting closed. I wonder if it's time that we investigate the actual plausibility of doing structured emulation of the out-of-bounds reads that occur in the process of the Medusa Effect phenomenon and add this to Chocolate Doom? Vanilla virtually never crashes because of the Medusa Effect, rendering many wads unplayable that loaded fine in the original engine. This is the type of case where the "same" binary behavior is not the "proper" program behavior, because of the essential non-portable nature of what's going on.

@haleyjd
Copy link
Contributor

haleyjd commented Feb 25, 2015

http://eternity.mancubus.net/pics/medusaemu/

Preliminary work on a Medusa Effect emulator. I'd like to push this onto a new branch but I don't know how to do that in git yet (or knew, and then forgot at some point). The look of it is about down perfectly, but the way it changes dynamically is not. Note the first shot is from an earlier build though; I tweaked the generator afterward to prefer numbers from 0 to 16 at a statistically higher rate, which creates more authentic looking output.

@tpoppins
Copy link
Contributor Author

tpoppins commented Feb 27, 2015

Nice work, Quasar; but I start wondering if Medusa is the real culprit behind these silent crashes.
Check out Armadosia MAP04 linedef 1576
Armadosia MAP04 screenshot 1
Armadosia MAP04 screenshot 2
Bona-fide Medusa, isn't it? These were taken with Choco after you let the wall that reveals it lower normally. If you noclip to it you get a silent crash. So maybe it's not about just Medusa, after all. And that case of SUDTiC E2M1 - the offending linedefs are not even supposed to be visible. Maybe something else is going on there.
Regardless of how this develops, the current state is intolerable. No program should crash without at least an error message of some sort.

edit: screenshot links updated

@haleyjd
Copy link
Contributor

haleyjd commented Feb 27, 2015

SUDTIC is crashing because of Medusa. There is a 1 pixel gap created by roundoff error with the potential to cause out-of-bounds access. I caught this in the debugger so it is not a matter of conjecture. With my emulator in place, SUDTIC runs normally.

@tpoppins
Copy link
Contributor Author

The logical question then is why does a single-pixel Medusa cause a crash on SUDTIC E2M1, whereas a nearly full-screen one on Armadosia MAP04 doesn't?

@fragglet
Copy link
Member

Medusa is essentially a buffer overrun. Normally reading into random chunks of memory would cause an immediate crash, but Doom allocates everything as a single block of memory (the zone memory subsystem). In Vanilla Doom under DOS it also doesn't cause a crash because the DPMI system doesn't treat invalid memory accesses as fatal bugs.

Under Chocolate Doom, the Medusa effect usually doesn't cause a crash because of the zone memory subsystem. But it depends where in the zone you start reading from. If you're unlucky you can read all the way past the end of the heap and then you get a legitimate crash.

So the simplified answer is that it's non-deterministic and there are cases where Medusa can crash the game. It all depends on where in the zone heap you start reading from, which in turn depends on what's stored in the heap - level data, sound effects, etc. Some levels / texture combinations just happen to set up the perfect conditions to cause a crash.

@tpoppins
Copy link
Contributor Author

tpoppins commented Mar 1, 2015

I was afraid it might be non-determenistic. Even in vanilla the visual appearance of Medusa is not predictable: some examples.
Thank you for the explanation, fraggle

edit: link updated

@AXDOOMER
Copy link
Contributor

AXDOOMER commented Mar 1, 2015

Does the medusa emulator takes in account the case where the medusa appearance changes because the player opened the menu?

@fabiangreffrath
Copy link
Member

Maybe if, instead of allocating a dedicated memory buffer for this effect, we point the maxextent variable to the start of the zone heap buffer?

@AXDOOMER
Copy link
Contributor

AXDOOMER commented Mar 2, 2015

Is it possible to start reading memory where the medusa originally starts reading memory and then limit the buffer so it doesn't cross the end of the heap? That way, it wouldn't crash and you'd be using actual data from the heap instead of emulating the data that could be there.

@fragglet
Copy link
Member

fragglet commented Mar 2, 2015

Yes, that is pretty much exactly what @haleyjd's Medusa emulation code does, except that it reads from a large buffer filled with a simulated heap, rather than the actual heap. Never mind, missed the point.

The Medusa emulation reads from a simulated heap rather than the real thing. I'd much rather do that just because knowingly allowing the renderer code to walk through random chunks of the heap is really not good behavior. Ideally I would like for it to be possible to swap out the zone memory allocator entirely and use the z_native.c implementation.

@AXDOOMER
Copy link
Contributor

AXDOOMER commented Mar 2, 2015

Why not read from the actual heap instead of simulating one?

@fragglet
Copy link
Member

fragglet commented Mar 2, 2015

Sorry, I misunderstood. Please see edited comment.

@haleyjd
Copy link
Contributor

haleyjd commented Mar 6, 2015

Native heap will never work without introducing referential integrity for thinker_t - note using one necessitated the other for SVE. This is because the Doom engine regularly accesses thinkers after they have been through Z_Free. A naive adaptation of native heap without also enforcing referential integrity will render all of the Chocolate ports totally unplayable under OSes like BSD that have a secure heap - the first read of a Z_Free'd thinker will segv the game. While I am not using this as an argument for using the zone heap for Medusa emulation (bad idea - my "random" heap is very carefully constructed as an actual valid albeit gigantic column_t structure, so that it cannot overrun), I am warning against this as a potential major catastrophe.

@jmtd
Copy link
Contributor

jmtd commented Jul 12, 2015

Hi folks, I hit a medusa crash bug in an old test WAD of mine that I found and I remembered that Quasar had been working on medusa emulation so I had a look around, found this bug and then found the medusa_emu branch. It seems good to me! I was just wondering what further work was necessary to make it merge-worthy? I thought about pushing a rebased branch as a convenience but it merges fine with master at the moment anyway.

@fabiangreffrath
Copy link
Member

There was some discussion whether it would make sense to also properly emulate the slowdown that accompanies this effect: dbf71aa#commitcomment-9929844

@jmtd jmtd added the Medusa label Sep 8, 2015
Azarien pushed a commit to Azarien/chocolate-doom that referenced this issue Apr 11, 2020
@haleyjd
Copy link
Contributor

haleyjd commented Jan 24, 2023

It's worth noting that this continues to arise periodically (see: https://www.doomworld.com/forum/post/2599048 ) and remains one of the larger compatibility footprint divergences between Choco and vanilla thanks to the fact that the current behavior in Choco virtually always causes an illegal read exception, whereas under vanilla the game chugs onward virtually all of the time.

I would consider my earlier pull request dead at this point and would have different ideas on how to accomplish it at this point (copy off some of the actual zone heap info, and just use a small buffer and forget about attempting to emulate the slowdown aspect because it is machine dependent anyways). My hope would be that someone would take this up as I really don't have time for it.

@fabiangreffrath
Copy link
Member

There is another PR that covers this issue, but I am not sure about the performance penalties for non-medusa columns:
#1498

@jmtd
Copy link
Contributor

jmtd commented Jan 24, 2023

I'd split this into two issues: preventing the crash, and emulating the behaviour of vanilla doom when medusa is visible.

The former should be the priority.

For the latter, what I would do (I'm not working on this) is start by backporting @AlexMax's uncapped framerate patches from Crispy Doom, and then ratcheting down the framerate when medusa is visible. Finally, file off the user-knobs to change the framerate.

It's not perfect because the game logic wouldn't be slowed down which IIRC it is in vanilla medusa.

@haleyjd
Copy link
Contributor

haleyjd commented Jan 27, 2023

There is another PR that covers this issue, but I am not sure about the performance penalties for non-medusa columns: #1498

As written, I think that's going to have a very large performance impact. My old emulation PR barely had any impact in that regard. IIRC it required one branch to detect when the overflow condition began, and only then did it run any emulator code. It definitely was not memsetting a buffer every gametic either. Medusa doesn't normally change that often, and I don't think providing the ability to debug maps for mappers is appropriate for Choco Doom; that's more something that would belong in a fork like chocorenderlimits. This probably isn't the best place to comment on a PR but I didn't want to engage in a full review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants