Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Banjo-Tooie crash #170

Closed
fzurita opened this issue Jul 28, 2016 · 56 comments
Closed

Banjo-Tooie crash #170

fzurita opened this issue Jul 28, 2016 · 56 comments

Comments

@fzurita
Copy link
Member

fzurita commented Jul 28, 2016

@Gillou68310

I'm seeing this crash reported on the ARM version of the new dynarec:

"oops, branch at end of block with no delay slot"

I'm seeing this happen in Banjo-Tooie. Here is a save state:
Banjo-Tooie-crash.zip

Face the big digging machine and throw an egg at it by holding Z and C-UP

This is happening in new_dynarec.c at line 8553.

@fzurita fzurita mentioned this issue Jul 29, 2016
@Gillou68310
Copy link
Member

I managed to reproduce it on arm64 but x86 does not have the issue, I'll investigate ;-)

@Gillou68310
Copy link
Member

I'm in vacation for 3 weeks so I won't be very active

@fzurita
Copy link
Member Author

fzurita commented Jul 29, 2016

OK, thanks!

@Gillou68310
Copy link
Member

@fzurita I can't reproduce it anymore on arm64!
Only think I did was to update GLiden64 to the latest version.
Did you checked if this was actually happening with other video plugin?

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

That's a good point, I didn't try any other plugins, but the crash seems to be at the core. I'll check again.

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

I just tried Angrylion and Glide64. Both crashed in the 32bit dynarec.

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Maybe you fixed it by accident?

@Gillou68310
Copy link
Member

I didn't changed anything in the core that's weird

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

I can try building your latest 64bit core and check if I get a crash there. It will have to wait until later tonight though.

@Gillou68310
Copy link
Member

Could you test with latest GLiden64 version just to be sure

@Gillou68310
Copy link
Member

And with interpreter ;-)

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Sure

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Interpreter doesn't crash with latest GLideN64

@Gillou68310
Copy link
Member

But new dynarec does?

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Yes

@Gillou68310
Copy link
Member

Ok I guess I'll have to do the test on arm32 to be in the same conditions. Time to resurrect my RPi2!

@Gillou68310
Copy link
Member

Here's the only fixes I made to the arm64 branch that you can test
#177
#178

@Gillou68310
Copy link
Member

Ok now this is weird I built the latest version of mupen64plus-ae tested with both GlideN64 and Glide64 and it's not crashing

@Gillou68310
Copy link
Member

It would be great if you can get me a savestate with the egg already thrown and crashing everytime you do a loadstate.

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Really, you can't reproduce it with the above steps and save state?

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

I'll do that real quick.

@fzurita
Copy link
Member Author

fzurita commented Aug 25, 2016

Here it is

tooie-crash.sav.zip

Load that save state and it should crash almost immediately.

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

Ok, I tried pull requests:
#177
#178

Neither of these fixes the crash here.

@Gillou68310
Copy link
Member

Ok I see what's happening here, you'll have to do the loadstate real quick after the core startup or it won't crash. I usually wait until the first logo screen and it's not crashing in this situation. This is because the block causing the assertion error is already recompiled as part of another block.
Anyway it's now 100% reproducible so I can start investigating!

@Gillou68310
Copy link
Member

Ok the crash happens because we are compiling junk! Speculative precompilation is disabled very early for this game (stop_after_jal=1) but if you do a loadstate before it has been detected, when recompiling block address 0x8037ecd0 the dynarec will try to recompile the code located after a JAL instruction which in this case is junk.

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

Oh, nice. Is there any way to make that flag part of the save state?

@fzurita fzurita closed this as completed Aug 26, 2016
@fzurita fzurita reopened this Aug 26, 2016
@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

Whoops, fat finger in mobile.

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

If we were to check all non-implemented instructions by checking to make sure it's one of the implemented ones, wouldn't that make that if statement very long? Is there a better way of doing that?

@Gillou68310
Copy link
Member

Gillou68310 commented Aug 26, 2016

Is there a better way of doing that?

if(itype[i]==NI) means that the current MIPS opcode is not valid (at least for that processor version) and should be considered as junk

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

Oh, gotcha. Sounds like that's an easy fix. The bigger question is if it's the right fix.

@Gillou68310
Copy link
Member

Gillou68310 commented Aug 26, 2016

It's unlikely to brake anything else, the real question is will this fix have an impact on performance. Maybe (itype[i]==NI) is happening quite often disabling speculative precompilation in almost every scenarios.

@Gillou68310
Copy link
Member

Ok I think you can provide a build to mupen64plus-ae testers with stop_after_jal set to 1 in new_dynarec_init to see the impact of speculative precompilation on performance.

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

Ok

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

That change fixes this issue?

Edit: Missed your previous comment, yes it would.

@fzurita
Copy link
Member Author

fzurita commented Aug 26, 2016

I tested that I don't really see any performance difference at all. I'll give the update to willing/unwilling testers and see what they think.

So little time is spent in the core compared to the graphics that I doubt it will make much difference.

@fzurita
Copy link
Member Author

fzurita commented Aug 27, 2016

Retroben tried it in his usual performance tests. He didn't see a difference in performance either.

He did mention that this cheat code on DK64 freezes the name though.
816F1156 0008
816F115E 0014

And that PJ64 works with it in interpreter mode.

@retrobenny
Copy link

retrobenny commented Aug 28, 2016

Hullo! ✋
Edit: Tip,you can add a mention by using my name on here with the at symbol behind it,and I can reach the page via notification of the mention.

@Gillou68310
Copy link
Member

Well actually I was talking about cheat codes that work with mupen's interpreter and don't work with new dynarec!

@Gillou68310
Copy link
Member

Retroben tried it in his usual performance tests. He didn't see a difference in performance either.

Ok let's set stop_after_jal to 1 by default. I think this should be done with a core config parameter.

@retrobenny
Copy link

retrobenny commented Aug 30, 2016

I need to test Groggy again via GLideN64 to see if he crashes without legacy blending enabled.

Gillou68310 pushed a commit to Gillou68310/mupen64plus-core that referenced this issue Aug 30, 2016
Gillou68310 pushed a commit to Gillou68310/mupen64plus-core that referenced this issue Aug 30, 2016
@fzurita
Copy link
Member Author

fzurita commented Aug 31, 2016

Ok, there is a problem with this. With stop_after_jal=1 Blast corps doesn't get past the opening N64 logo.

@Gillou68310
Copy link
Member

Gillou68310 commented Aug 31, 2016

For some reason @Ari64 also stops speculative precompilation after a SYSCALL.
if(i>0&&itype[i-1]==SYSCALL&&stop_after_jal) done=1;

The problem is when a JAL is located just behind the SYSCALL you get a branch at end of block with no delay slot.

  0x8029e624: ADDIU r21,r0,20
...
* 0x8029e6b4: ADDIU r1,r0,1
  0x8029e6b8: BEQ r30,r1,8029e6d4
  0x8029e6bc: SLL r0,r0,0
* 0x8029e6c0: SYSCALL
* 0x8029e6c4: JAL 8029ef80 <- end of block

Since SYSCALL doesn't have a delay slot I think we can safely end the block after the SYSCALL.
if(i>0&&itype[i]==SYSCALL&&stop_after_jal) done=1;

@fzurita
Copy link
Member Author

fzurita commented Aug 31, 2016

Ok. I'm glad you are really good at finding these issues.

@Gillou68310
Copy link
Member

Well actually issues triggering asserts are the easy ones. This one #174 is going to be harder ;-)

@fzurita
Copy link
Member Author

fzurita commented Aug 31, 2016

Oh, yeah, that will be tough I'm sure.

@retrobenny
Copy link

Oops,forgot to say it happily still doesn't crash on Groggy anymore using GLideN64 with non-legacy blending,maybe make that change only affect certain games,one being Banjo-Tooie obviously.

Slightly off topic but,too bad performance is still sucky though,even at 640x480 at the most optimal and sacrificial settings on Conker,but at least the Buga's Knuts pixelated defeat scene is still smooth.
Maybe we need to get Nvidia to add Mupen64Plus AE/FZ to the profiles list like Dolphin Emu already has before the big update comes out for Shield TV so it comes with it.
Primarily to use from 600Mhz to 900Mhz of GPU clock rate on their X1 chip.
Does GLideN64 even utilize the 256 cores within it for multi-threading?

@fzurita
Copy link
Member Author

fzurita commented Aug 31, 2016

With fragment shaders, the driver makes the decision about how many copies to use. Usually it will use as many as it can.

With CUDA/OpenCL you have fine grained control about how many to use.

Gillou68310 pushed a commit to Gillou68310/mupen64plus-core that referenced this issue Sep 11, 2016
@richard42
Copy link
Member

This should be fixed with the pull request that I just merged from Gillou68310. fzurita, can you build the latest and test?

@fzurita
Copy link
Member Author

fzurita commented Sep 14, 2016

Yes, I will test this tomorrow.

@fzurita
Copy link
Member Author

fzurita commented Sep 14, 2016

It's fixed.

@fzurita fzurita closed this as completed Sep 14, 2016
fzurita added a commit to mupen64plus-ae/mupen64plus-ae that referenced this issue Sep 14, 2016
mupen64plus/mupen64plus-core@5fa30ab

*   5fa30ab Merge pull request #182 from Gillou68310/new_dynarec_fix
|\
| * bbf2d25 new_dynarec: Dynamically allocate memory when copying original MIPS source code of a recompiled block. Previously this was done using a 2MB circular buffer. From my testing the new design usually use ~1MB Memory is freed when all dirty entry points of a recompiled block have been removed from the dirty list. The number of dirty entry points is stored in an additional word at the end of the dynamically allocated buffer and decremented every time a dirty entry point is removed from the dirty list.
| * 987c21c new_dynarec: dynamic_linker/_ds and get_addr/_32 now always return clean addresses When jumping to a dirty address verify_code* is called in order to make sure the block is still valid. Or when selecting an address from the dirty list, we already now that the address is still valid because verify_dirty returns true.
| * 540a1b8 new_dynarec: End block after SYSCALL when speculative precompilation disabled. SYSCALL doesn't have a delay slot, no need to have another instruction at the end of the block.
| * dec5b7a new_dynarec: Disable speculative precompilation by default This fixes issue mupen64plus/mupen64plus-core#170
| * 447ee55 new_dynarec: Saturate left shift amount, in the range 0 to 63
| * ba06688 new_dynarec: Discard divide by zero MIPS does not have a divide by zero exception. This fixes issue mupen64plus/mupen64plus-core#179
| * fb023df new_dynarec: Fixed invalidate_block get_bounds returns physical address not virtual address
| * 811a1da new_dynarec: Fixed trap writes to compiled pages
* |   0d30e1b Merge pull request #180 from charlemagnelasse/master
|\ \
| |/
|/|
| * b0e1301 Remove unused Travis CI define __extern_always_inline
| * 3666f1f Use Ubuntu Trusty as base system for Travis CI
|/
*   2cffe8c Merge pull request #168 from TheFlagCourier/master
|\
| * bff82e8 Update README
| * f4bced5 Create README.md to replace README.txt
|/
*   725ba74 Merge branch 'master' of https://github.com/mupen64plus/mupen64plus-core
|\
| * e9e7772 change VS2013 project files to build with the XP-compatible vc120 toolset, so the resulting binaries will run on pre-vista machines
* | 986c228 bugfix: if the video plugin requests a Core opengl context, then the SDL2 wrapper (used on systems building against SDL2) should not force a compatibility context right before creating the context
|/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants