Memory refactorings #313
I tried Super Mario and Smash Bros; neither booted with the x64 dynarec or either interpreter. It didn't crash, but CPU usage was at 100%. Here is what GDB showed when I Ctrl+C'ed out of the program:
I tested this using rsp-cxd4 in HLE video mode |
Also, malloc'ing 512MB of RAM will be pretty tight on some ARM devices. A lot of Android devices only have 2GB of RAM, my Shield Tablet has 2GB and it says only 487 is free right now (although I assume it would just close some old apps if an app requested memory, so it's probably not a problem). The Raspberry Pi also has 1GB of RAM. Is there some way to have a "fallback mode"? Either if the malloc fails or if the user requests it. I'm really excited for fast memory access so I think it should go forward, but it might mean that performance actually suffers on low memory devices |
Normally the kernel should lazy alloc it, so only pages that are really accessed are allocated. I guess it should be no problem. |
edb6c9d to c55fb8a
Several fixes which re-enable the cached interpreter and improve pifbootrom hle implementation. |
Crashing is fixed for me, tested several games on x64 dynarec and Cached/Pure Interpreter, seems to be working well now |
Just tried compiling x86 new dynarec, I get this error on linking:
I'm not sure if the error is a result of a commit from this PR or the previous one |
New_dynarec requires some more work unfortunately. I will need help from Gillou on this, because I'm not familiar with how the new_dynarec does code invalidation, and TLB translation. |
I've tested the Pure Interpreter inside my VMware machine. In my testing, this PR saves about 2% CPU time on average in Pure Interpreter mode. Sometimes it was pretty much the same. I just ran the same scene in both versions and compared what I saw in "top", so the testing wasn't anything too scientific. I don't see any regressions anyway. |
Good ! Thanks for testing ! 2% now, hope for some more with full fast mem 😀 |
This also validates the big allocation approach. So this is good. |
My dream would be to get to the point where the cached interpreter would be fast enough on mobile devices to get rid of the dynarecs altogether (but this is probably an unrealistic dream). I think having an ARM dynarec without anyone with enough time or knowledge to maintain it is really going to hold things back. |
I feel the same way ! Only one person knows how to work with the new dynarec... And it's not me. So all my refactorings take much more time than needed. I have hope (and maybe a long term plan) to unify the pure interpreter, the cached interpreter and both dynarecs. But this will take more time than I have now. |
Tried something to fix the new dynarec. Don't know if it will work, but it's a step closer I guess. |
What if only 128MB + ROM size is used and a catch for PIF access? Also how does Linux handle lazy allocation if MAP_LOCKED is used with mmap() to try and reduce page faults? |
Not really sure, but from what I understand it allocates 512MB of virtual memory, then physical pages are allocated on demand as accesses are made. Of these 512MB, only ram_size + rom_size + rsp_mem_size + pif_boot_rom_size + pif_ram_size, plus rounding up to the next physical page size, will get physically allocated (approximately). |
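To make the lazy-allocation point concrete, here is a minimal sketch of reserving the whole range up front. It is an illustration under assumptions, not the code from this PR: the helper names (`reserve_address_space`, `load_region`, `release_address_space`) are hypothetical, and it uses POSIX `mmap()` rather than whatever allocator the PR actually calls. It does not use MAP_LOCKED, so it says nothing about how locking would change the behaviour.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* 512 MiB of address space, as discussed in this thread. */
#define ADDRESS_SPACE_SIZE ((size_t)512 * 1024 * 1024)

/* Reserve the whole range in one anonymous mapping.
 * On a typical Linux setup no physical page is committed here; pages are
 * only backed by RAM when they are first touched. Returns NULL on failure. */
static uint8_t *reserve_address_space(void)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;      /* assumption: hint not to reserve swap */
#endif
    void *p = mmap(NULL, ADDRESS_SPACE_SIZE, PROT_READ | PROT_WRITE,
                   flags, -1, 0);
    return (p == MAP_FAILED) ? NULL : (uint8_t *)p;
}

/* Copy a region (e.g. a ROM image) into the block. Only the pages covered
 * by [offset, offset + len) become resident as a result of this call. */
static void load_region(uint8_t *base, size_t offset,
                        const void *src, size_t len)
{
    assert(offset <= ADDRESS_SPACE_SIZE && len <= ADDRESS_SPACE_SIZE - offset);
    memcpy(base + offset, src, len);
}

static void release_address_space(uint8_t *base)
{
    if (base != NULL)
        munmap(base, ADDRESS_SPACE_SIZE);
}
```

Untouched parts of the mapping read back as zero and cost only address space, which is why the resident size should stay close to the sum of the regions that are actually used.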
build error with arm + RPI3:
this might be a compilation options issue with -fPIC, but by default I get this. |
I briefly tested a few games with the x86 new dynarec and it seems to be working properly. EDIT: Actually I must have been pointing to the wrong thing, I still get this with x86 new dynarec:
|
3871a9f to 896a51e
Updated with 2 fixes :
Thank you guys for testing :) |
builds! but segfaults for me. unfortunately i think the backtrace is not useful:
went back to master and everything worked ok via gdb. that's all i've got, sorry :( |
@dankcushions Thanks for testing. |
1de7316 to ad3d781
4f0b5df to bd42118
Rebased against master, includes the changes from PR remove_memd, cleaned up commits. |
…,dword}_in_memory This slipped through while rebasing.
762b38b to bfb3842
The allocated memory is big enough (512M) to hold all the directly accessible physical memory an N64 system has. By having a single allocated block of memory, we don't need an if/switch to choose between ram, sp_mem, pif_ram and cart_rom in fast_mem_access. This is also a first step toward an implementation of "fast memory".
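The difference between the two lookup styles can be sketched as follows. This is a simplified illustration, not the real `fast_mem_access`: the function and struct names are hypothetical, and the region bases are the commonly documented N64 physical addresses, reduced here to four regions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PHYS_SPACE_SIZE 0x20000000u   /* 512 MiB */
#define PHYS_RDRAM      0x00000000u
#define PHYS_SP_MEM     0x04000000u
#define PHYS_CART_ROM   0x10000000u
#define PHYS_PIF_RAM    0x1fc007c0u

/* Style 1: one buffer per region, so every access must pick a region. */
struct split_mem {
    uint8_t *rdram;
    uint8_t *sp_mem;
    uint8_t *cart_rom;
    uint8_t *pif_ram;
};

static uint8_t *split_mem_access(const struct split_mem *m, uint32_t paddr)
{
    if (paddr >= PHYS_PIF_RAM)
        return m->pif_ram + (paddr - PHYS_PIF_RAM);
    if (paddr >= PHYS_CART_ROM)
        return m->cart_rom + (paddr - PHYS_CART_ROM);
    if (paddr >= PHYS_SP_MEM)
        return m->sp_mem + (paddr - PHYS_SP_MEM);
    return m->rdram + (paddr - PHYS_RDRAM);
}

/* Style 2: one block covering the whole physical range, so the lookup
 * collapses to a single addition with no branching on the region. */
static uint8_t *flat_mem_access(uint8_t *base, uint32_t paddr)
{
    assert(paddr < PHYS_SPACE_SIZE);
    return base + paddr;
}

/* Allocate the single zero-filled block; returns NULL on failure. */
static uint8_t *alloc_flat_mem(void)
{
    return calloc(1, PHYS_SPACE_SIZE);
}
```

With the flat layout each region simply lives at `base + region_start`, so code that still wants per-region pointers can derive them from the single block.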
I broke it when introducing opaque data to read/write memory handlers. The issue was that there was a single opaque pointer for both the read and write functions, whereas read and write breakpoints can be activated independently. This led to a situation where the wrong opaque pointers were passed to the read/write handlers.
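One way to avoid that kind of mix-up is to pair each callback with its own opaque pointer. The sketch below only illustrates the idea; all type and function names are hypothetical and it is not the handler structure used in this PR.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*read_fn)(void *opaque, uint32_t addr);
typedef void (*write_fn)(void *opaque, uint32_t addr, uint32_t value);

/* Each direction carries its own opaque pointer, so hooking reads
 * cannot change what the write callback receives, and vice versa. */
struct mem_handler {
    void    *read_opaque;
    read_fn  read;
    void    *write_opaque;
    write_fn write;
};

/* A toy device with a single register. */
struct device { uint32_t reg; };

/* Context used only while a read breakpoint is active. */
struct breakpoint_ctx { struct device *dev; unsigned hits; };

static uint32_t device_read(void *opaque, uint32_t addr)
{
    struct device *dev = opaque;
    (void)addr;
    return dev->reg;
}

static void device_write(void *opaque, uint32_t addr, uint32_t value)
{
    struct device *dev = opaque;
    (void)addr;
    dev->reg = value;
}

/* Breakpoint wrapper: counts the access, then forwards to the device. */
static uint32_t breakpoint_read(void *opaque, uint32_t addr)
{
    struct breakpoint_ctx *bp = opaque;
    bp->hits++;
    return device_read(bp->dev, addr);
}

/* Hook reads only; the write side keeps pointing at the device. */
static void enable_read_breakpoint(struct mem_handler *h,
                                   struct breakpoint_ctx *bp)
{
    h->read_opaque = bp;
    h->read = breakpoint_read;
}
```

With a single shared opaque pointer, enabling the read breakpoint would hand `device_write` a `struct breakpoint_ctx *` that it would then misinterpret as a `struct device *`.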
bfb3842 to 64bf012
Rebased against master. Finally, this PR should be ready for merging. |
@bsmiles32 something in this PR broke the reset functionality |
It was introduced by 1c01c23 |
DON'T MERGE !
This is a WIP, with several refactorings concerning mainly the memory module :
edit:
It also breaks the cached_interpreter, don't know why. And I didn't really bother with the new_dynarec, so this should break it quite badly. When I get some time, I will try to rework it and put it into shape. But early testers are welcome :)
In particular, I'd like to know if it improves the emulation speed on other computers without bad regressions.