Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Documentation] Fixing the "out of memory handles" crash #22

Closed
Pyrdacor opened this issue Jan 31, 2022 · 65 comments
Closed

[Documentation] Fixing the "out of memory handles" crash #22

Pyrdacor opened this issue Jan 31, 2022 · 65 comments
Assignees
Labels
bug Something isn't working documentation Improvements or additions to documentation fixed Fixed in upcoming release
Milestone

Comments

@Pyrdacor
Copy link
Owner

Pyrdacor commented Jan 31, 2022

The quick fix

Here is the final quick fix with a hex editor. This example uses english 1.07 AM2_BLIT.

Search for hex 10 28 00 10. You will find two occurences. Start with the first one. It should be at 00006E7E.

Cut out the 4 bytes and insert them before the preceeding 12 bytes which are 4E B9 00 02 7F 22 4E B9 00 02 13 8A. Or cut out the 12 preceeding bytes and insert them after the mentioned 4 bytes. Do as you wish.

Before:
    4E B9 00 02 7F 22 4E B9 00 02 13 8A 10 28 00 10
After:
    10 28 00 10 4E B9 00 02 7F 22 4E B9 00 02 13 8A

Now go to the second occurence of 10 28 00 10. Should be at 0000BDF8. Cut the 4 bytes but this time insert before the preceeding 6 bytes (4E B9 00 02 7F 22).

Before:
    4E B9 00 02 7F 22 10 28 00 10
After:
    10 28 00 10 4E B9 00 02 7F 22

Now search for the first occurence of 4E F9 in the file. It should be at 00000038 in AM2_BLIT. A search for 4E F9 00 00 00 2A should also work and also return a single hit. This marks the beginning of the first code hunk btw. ;)

Now take a calculator. Use the initial found offsets:

  • 00006E7E
  • 0000BDF8

Subtract the offset you found just a second ago:

  • 00006E7E - 00000038 = 00006E46
  • 0000BDF8 - 00000038 = 0000BDC0

Now subtract 0A and 04 from the first and 04 from the second value:

  • 00006E46 - 0A = 00006E3C
  • 00006E46 - 04 = 00006E42
  • 0000BDC0 - 04 = 0000BDBC

Now search for those 3 values. There should be only 1 occurence if you include the zero bytes into the search! The first two values should be found right next to each other: 00 00 6E 3C 00 00 6E 42.

Replace each long (4-bytes value) with the value + 4:

Before:
    00 00 6E 3C
After:
    00 00 6E 40

Before:
    00 00 6E 42
After:
    00 00 6E 46

Before:
    00 00 BD BC
After:
    00 00 BD C0

This is it.


Old stuff following

The fix

Open AM2_CPU and AM2_BLIT into a hex editor.

Search for the hex sequence "10 28 00 10 4E B9 00". If you find two occurences, go to the last one. Cut out the 6 bytes just in front of the found sequence and paste it back in after the "10 28 00 10".

Then search for the hex sequence "00 00 BD BC 00 00 BD C6". If you find more than one, pick the first one. Replace the "BC" with a "C0" and you are done.

This will work for english and german.

Explanation

The first change fixes the order of the first two instructions:

image

The function FUN_FreeMemoryHandleById releases a memory handle by its id. The id must be in D0 which is performed by the move instruction. Unfortunately there is a function call just above which clears (0x10,A0) internally. This way the move instruction will then only move the value 0 to D0 and thus FUN_FreeMemoryHandleById tries to free the memory handle with id 0.

Valid memory handle ids are 200 to 224 (there are 25 slots). With the above bug, the slot is not freed and can no longer be used. If this is called often enough, all 25 slots will be occupied and the error is raised. Moreover FUN_FreeMemoryHandleById will clear some long-word at the address 800 bytes before the memory handle slots as the function just subtracts 200 from the given id (for 0 this will be -200), multiplies this by 4 and adds it to the start of the memory handle section. Then it clears the long-word there. So this will be -800 bytes relative to the memory handle section.

The above bugged code is called by the "Call Eagle" spell function. This is why calling the eagle will mess up the memory handles.


The second change adjusts the relocation table as we moved a function call which uses a global 32 bit address. As we moved the move instruction before the function call and the instruction has 4 bytes, the offset inside the RELOC32 hunk must be increased by 4. This is the BC to C0 part.


Btw the function we moved (FUN_00246f22) looks exactly like this, so you see that it clears our memorized memory handle id.

image

EDIT 04-07-2022:

There is a second bug. The above bug avoided freeing the eagle sprite memory handle when calling the eagle. But when unmounting the eagle, there is similar code with the exact same issue. To fix it, again search for 10 28 00 10. This time take the other spot. Move those bytes before the function call (4e b9 00 02 7f 22) again. This will fix the whole issue once and for all!

@Pyrdacor Pyrdacor added the documentation Improvements or additions to documentation label Jan 31, 2022
@Hexaae
Copy link

Hexaae commented May 1, 2022

When calling the eagle/unmounting I still have some random crashes even with 1.13 patch applied to the WHDLoad installation of the game (already double-checked the files) 😔
Not easy to reproduce because I have to play for at least 10min... but still happens unfortunately.
WHD crash dump with readable output: https://1drv.ms/u/s!ApMUGr0cuN39gotVpE_rCGXGIKQXsA?e=IMmIjj

P.S.
I use some very old save games (2001, Ambermoon 1.08)... can it make this patch useless in this case?

@Pyrdacor
Copy link
Owner Author

Can you please try to just mount/unmount the eagle for about 15 times in a row?

I am not that familiar with WHDLoad slaves but don't they change the code inside AM2_CPU and AM2_BLIT? Did you replace those files?

@Hexaae
Copy link

Hexaae commented May 13, 2022

Yes, I replaced the exe files too. No problem after 15 times, but going on it will still happen.
For my test I usually try to fly with the eagle 🦅 over the border of the map when 0, 0 turns to 600, 600 (or so, can't remember max X, Y) and fly for 2-3 days, landing & sleeping, and flying again... and well, after I try to do so for a while and after 20-25 mount/unmount the moment I call the giant eagle again it will pop-up the usual "A majestic eagle blah blah..." then black screen to Workbench with mem error I guess).
Will now test just mount/unmount in the same place for 30 times, let's see if I can reproduce it just increasing this numbers...

@Hexaae
Copy link

Hexaae commented May 13, 2022

Confirmed it happened after 25-26 times without moving (so it's not a map border issue or a consequence of sleep day/night at least... we can exclude them)!
26 seems the exact number.
P.S.
In the explanation above I've read you were talking about 25 slots indeed for mem handles...
In the meanwhile I'll try also without WHDLoad but I doubt the installer changes something in the EXE files in memory.

@Hexaae
Copy link

Hexaae commented May 13, 2022

Ok, happened also running the patched game without WHDLoad:

An error has occurred :

There are not enough memory handles available.

This is an internal error. Our most sincere apologies.

You can try to play on simply by rebooting your computer
and restarting the game. The error may not appear every
time.

Please press RETURN.

@a1exh
Copy link

a1exh commented May 13, 2022

I am not that familiar with WHDLoad slaves but don't they change the code inside AM2_CPU and AM2_BLIT?

Yes or sometime replace the exes altogether.

It is unlikely a WHDload slave will work correctly with a modified compressed executable like Ambermoon (but not impossible).

@Hexaae
Copy link

Hexaae commented May 13, 2022

As far as I've tested, only "Ambermoon" file has to be Imploder compressed like the original, then you can use modified files for the exes (but AM2_BLIT and AM2_CPU must be decompressed to work with WHDLoad. I use xfddecrunch), data, savegames etc. (indeed I imported my WB game into a WHDLoad installation).

In short, to convert it to WHDLoad you can overwrite the data/amberfiles/ subdir of the installed game with the patch files (PyrdacorFixEnglish1.13.lha), and then just have to decompress Imploder files AM2_BLIT and AM2_CPU. WHDLoad installer will still work fine.

Are the exe files in PyrdacorFixEnglish1.13.lha already patched, right?

@Hexaae
Copy link

Hexaae commented May 13, 2022

@a1exh
Can you please try to mount/unmount 26times in a row from the eagle to confirm?

@a1exh
Copy link

a1exh commented May 13, 2022

Are the exe files in PyrdacorFixEnglish1.13.lha already patched, right?

They are supposed to be but there is always a possibility that something has gone wrong.

This is the commit

980464b

I'm a little nervous that English AM2_CPU says +0 bytes but AM2_CPU for German version says -4 Bytes.

So does AM2_BLT for both English and German say -4 Bytes.

This may be nothing. If I had time I would binary diff them and the changes.

@Pyrdacor
Copy link
Owner Author

It is also possible that there is a second crash issue. I will check that soon. Can you try with 1.14 as well? It has a complete new exe. But please use without WHDLoad and the full version as a clean state. I didn't have the time to release only the patch nor LHA files so only zip and tar.gz at the moment.

@Hexaae
Copy link

Hexaae commented May 13, 2022

Still the same auto-quit issue after 26 Eagle mounts, ran from WB no WHDLoad:

An error has occurred :

There are not enough memory handles available.

This is an internal error. Our most sincere apologies.

You can try to play on simply by rebooting your computer
and restarting the game. The error may not appear every
time.

Please press RETURN.

Could be something in the savegame or it can't be? Here is it anyway (use the MEGAHERO savegame): https://1drv.ms/u/s!ApMUGr0cuN39gowEjNyqUlDiqliexg?e=tXiZMq

P.S.
Yep, doesn't work anymore with WHDLoad. I force-downloaded the 1.14 archive of course

@a1exh
Copy link

a1exh commented Jun 13, 2022

If this is not 100% fixed can we add a bug LABEL?

@Hexaae
Copy link

Hexaae commented Jun 13, 2022

BTW, I have to report to WHDLoad team this new single exe change for a new slave...
EDIT: DONE.

@a1exh
Copy link

a1exh commented Jun 13, 2022

@Pyrdacor maybe we (and by that I mean you) should work towards removing any need for a WHDload slave? The source is included in the WHDload slave. I think they only patch 2 places.

http://whdload.de/games/Ambermoon.lha

Ambermoon Install\src\ambermoon.asm

I'm not sure there is really a need for Ambermoon WHDload but I haven't used real 68060 for a long time.

@Hexaae
Copy link

Hexaae commented Jun 13, 2022

Yes... I prefer the WHDLoad version as I've had some instability in the past after running Ambermoon from the OS.
It also reports some strange memory errors at launch on my 060 WinUAE system when I use 8MB CHIP: as I remember well it used some kind of fake fastmem tool, and gets confused if finds enough mem to contain the whole game in CHIPMEM (yep, will load & run the game from CHIP only!).
Or maybe Pyracord could just implement the same WHD fixes directly in the new exe.

@Pyrdacor Pyrdacor added the bug Something isn't working label Jun 13, 2022
@Pyrdacor
Copy link
Owner Author

I can try to do this tomorrow.

@Hexaae
Copy link

Hexaae commented Jun 29, 2022

News?

@Pyrdacor
Copy link
Owner Author

Sorry I will have a look at it asap.

@Pyrdacor
Copy link
Owner Author

I'm in contact with Wepl from WHDLoad now. He said he can create a slave for the recent release but I don't know when he will do so.

@Hexaae
Copy link

Hexaae commented Jul 1, 2022

Good. Can he also help us fixing the "26 giant eagle mounts" bug? Whd team is very skilled in fixing original games

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 1, 2022

I finally had time to have a look. And unfortunately my fix was no longer present in 1.14. I guess when I moved over to Nico's Ghidra project and re-installed the fixes there, I forgot to add the eagle fix. :( So I fixed it now again in Ghidra and will release a new version in the next days.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 1, 2022

@Hexaae Can you please test with this AM2_CPU?

AM2_CPU.zip

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 1, 2022

I released 1.15 with the fix. Hopefully it works. Otherwise feel free to reopen this issue.

@Pyrdacor Pyrdacor closed this as completed Jul 1, 2022
@Hexaae
Copy link

Hexaae commented Jul 1, 2022

Still unfixed. I used PyrdacorFixEnglish1.15.lha.

An error has occurred :

There are not enough memory handles available.

This is an internal error. Our most sincere apologies.

You can try to play on simply by rebooting your computer
and restarting the game. The error may not appear every
time.

Please press RETURN.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 1, 2022

Can you test to use the AM2_CPU from the 7z I provided earlier?

@a1exh
Copy link

a1exh commented Jul 7, 2022

I never encountered a inventory crash for example.

This was only in DE v1.01 which we've never really used.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 7, 2022

Apart some WHDLoad related fixes, some access faults and more...

I read the description. And I read the source code for the patch. I read the documentation about the macros/functions used in the patch.

The "fixed access fault due vbr access" these are not bugs, they are an incompatibility between WHDload and Ambermoon.

some insufficient interrupt acknowledgements fixed

This appears to be the only "real" bug fix but I don't understand what the cause is.

That's what I thought too.

@Hexaae
Copy link

Hexaae commented Jul 7, 2022

As mentioned before the original behaves strangely with CHIPmem allocation causing random not enough mem issues at launch or quitting the game on 060 and advanced systems... (Imploder, used for some files, also has no reliable compatibility with 060 AFAICR, but in this case an xfddecrunch can decompress them and should solve the problems...). The WHDLoad edition workarounds also all these annoyances...

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 7, 2022

The memory allocation in Ambermoon is a bit strange and can easily cause an Out of Memory error. I just recently decoded it.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 7, 2022

It works like this:

For each memory type it does 10 iterations. Each iteration asks the OS for the biggest available chunk of memory. If it is at least 5120 bytes, it is allocated. So this is very greedy. It will take all the memory if it can. But there is a value which states how much memory should remain free after all allocations. So after the 10 iterations Ambermoon again asks the OS how much free space there is. If there is not enough left (10240 bytes is the value in Ambermoon for both mem types), the allocated blocks are freed until enough memory is available for the OS. This can also be a partial free (which is a bad thing but works).

First this is done for chip mem. Then for fast mem. 280000 bytes is minimum for chip, and 75000 for fast. If not enough chip the error is raised. If not enough fast, chip is checked for the total amount (355000). If not enough the error is also raised.

Note that all memory blocks below 5120 are not considered so even if you have enough free memory but in a bad distribution, you will get the error.

@Hexaae
Copy link

Hexaae commented Jul 7, 2022

The problem is that, with the WB version, sometimes on quitting back to WB a nice pop-up tells me the "Ram Disk is full" 🤔
I suppose it's too fast to quit and return to WB while RAM is not fully released.

image

@Hexaae
Copy link

Hexaae commented Jul 8, 2022

It's unusable on my system the non-WHDLoad version: it's a jackpot running it from WB. It works most of the times, but sometimes freezes with black screen, sometimes I have strange Ram Disk full warning after I quit to WB, some others it refuses to load like the picture below...

image

I tried to manually apply the fixes to AM2_CPU and BLT as described (to run it with WHDLoad), but 1.13 binaries (decompressed with XFDDecrunch) look different in contents.... For example there is no "4e b9 00 24 6f 22" for the second patch in the 1.12/1.13 Eng exes:
image

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

I just released 1.16. First try this please. There was a bug which corrupted the seglist so that the game crashed at quit. Maybe it also caused your RAM disk issues? Give it a try.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

@a1exh FYI: The eagle-related crash should now really be fixed in 1.16.

@a1exh
Copy link

a1exh commented Jul 8, 2022

Can we split this bug into two? Close the eagle crash bug. Open a Native vs WHDload 060 compatibility bug?

@Hexaae
Copy link

Hexaae commented Jul 8, 2022

I just released 1.16. First try this please. There was a bug which corrupted the seglist so that the game crashed at quit. Maybe it also caused your RAM disk issues? Give it a try.

No, the RamDisk full error requester on quit has always been an original game bug (when running from WB) due to its hacky mem allocations... It just happened again also with 1.16. Will continue testing it to see if besides that is now a bit more reliable (launch & quit, launch & quit... many times).

@Hexaae
Copy link

Hexaae commented Jul 8, 2022

mmmh, got 2 gurus: 8000 0025 and 8000 0009

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

Can we split this bug into two? Close the eagle crash bug. Open a Native vs WHDload 060 compatibility bug?

Done in #52

@Pyrdacor Pyrdacor closed this as completed Jul 8, 2022
@Hexaae
Copy link

Hexaae commented Jul 8, 2022

EDIT 04-07-2022:

There is a second bug. The above bug avoided freeing the eagle sprite memory handle when calling the eagle. But when unmounting the eagle, there is similar code with the exact same issue. To fix it, again search for 10 28 00 10. This time take the other spot. Move those bytes before the function call (4e b9 00 24 6f 22) again. This will fix the whole issue once and for all!

Can you double-check this fix description? Can't even find 4eb900246f22 in the DE or EN exes of 1.13÷1.16 or in the original AM2_CPU and AM2_BLIT...

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

Sorry these are Ghidra addresses.

Try 4eb900027f22 instead.

In Ghidra the start address is fixed to 0021f000. In the binary, addresses are relative and then are relocated to the real address on startup. So you have to subtract 21f000 from the addresses I mentioned.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

FYI: I edited the initial post to correct this.

@Pyrdacor
Copy link
Owner Author

Pyrdacor commented Jul 8, 2022

The problem is that you also have to adjust the reloc table. This time two entries as you move two of them. I guess I will write a complete hex editor fix description. But not today. I reopen this until it's done.

@Pyrdacor Pyrdacor reopened this Jul 8, 2022
@Pyrdacor Pyrdacor assigned Pyrdacor and unassigned Hexaae Jul 8, 2022
@Hexaae
Copy link

Hexaae commented Jul 8, 2022

Ok... I'll be curious to test the fix on the original 1.07÷1.13 AM2_BLIT still compatible with WHDLoad...

@Pyrdacor
Copy link
Owner Author

@Hexaae I posted a hex editor fix description. Can you test this please?

@Hexaae
Copy link

Hexaae commented Jul 11, 2022

Thank you.
Tested starting from 1.13 AM2_BLIT (BTW, there are no further fixes >1.13 excluding the merging of CPU+BLIT?) and it worked :)
No crashes, and WHDLoad original SLAVE is still working.

BTW, the strange thing was that 1.13 already had some of hex-fixes but not all of them...

P.S.
The other day I also found a generic WHDLoad salve using kickstart40.68 (3.1), 2MB CHIP, 0.5MB FAST and it works just fine with the new 1.16 copying everything under /data/ without manual reworking. Another cheap way to run it "protected" through WHDLoad.

@Pyrdacor
Copy link
Owner Author

1.14 only merged the two files and I think externalized some data to files.

And yes the partial eagle fix was in there in 1.13 already. I only forgot the second one. ;)

Can you provide the WHDSlave that is working with 1.16? I would love to play Ambermoon on my A500 mini.

@Hexaae
Copy link

Hexaae commented Jul 11, 2022

Here is it: https://1drv.ms/u/s!ApMUGr0cuN39go1avydAuto9-wMRRg?e=hry54N
ToolTypes in the icon to launch the game with this generic slave:

SLAVE=GenericKick31.Slave
CUSTOM=Ambermoon
DATA=data
PRELOAD

@a1exh
Copy link

a1exh commented Jul 11, 2022

P.S. The other day I also found a generic WHDLoad slave using kickstart40.68 (3.1), 2MB CHIP, 0.5MB FAST and it works just fine with the new 1.16 copying everything under /data/ without manual reworking. Another cheap way to run it "protected" through WHDLoad.

I think that confirms what we thought. There are no real bug fixes in the WHDload slave but somehow in WHDload itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working documentation Improvements or additions to documentation fixed Fixed in upcoming release
Projects
None yet
Development

No branches or pull requests

3 participants