Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect multiplication behavior on last cycle prior to next multiplication/division #256

Open
KungFuFurby opened this issue Aug 31, 2022 · 5 comments

Comments

@KungFuFurby
Copy link

KungFuFurby commented Aug 31, 2022

Forwarded from the NESDev forums. Tested on my end using the latest master commit, 6fc6bf1 .

According to regiscaelus...

  • On write to WRMPYB ($4203), always clear RDMPY ($4216-$4217), but only start a new multiplication if the current multiplication is complete (> 8 cycles).
  • RDDIV ($4214-$4215) is updated with WRMPYA ($4202) & WRMPYB ($4203) if multiplication is complete or on the last cycle (>7 cycles).
  • On cycle 8, multiplication should not be started and RDDIV ($4214-$4215) should only get updated once with WRMPYA ($4202) & WRMPYB ($4203.
  • On write to WRDIVB ($4206),

It looks like the cycle 16 case is not emulated correctly on division, and the cycle 8 case on multiplication is also not emulated correctly.

Two test ROMs were provided: one by Myself086, and one by Undisbeliever.

@Max833
Copy link

Max833 commented Aug 31, 2022

Unfortunatly, this project is pretty much dead. But thank you for the report. Maybe ares can do something with that info.

@carmiker
Copy link
Contributor

I will take a look tonight and see if I can come up with a working solution.

@KungFuFurby
Copy link
Author

Sure! Then this fix can be ported to the other two forks (particularly ares, since that's still actively maintained to my knowledge).

@carmiker
Copy link
Contributor

Here is a draft solution against the master branch of this repo which could probably be cleaned up a bit, but it appears to have correct behaviour. If anyone would like to test this to make sure it does not introduce regressions, that would be wonderful. I also have a solution for bsnes-jg which is posted on the nesdev forum (pending approval).

diff --git a/bsnes/sfc/cpu/cpu.hpp b/bsnes/sfc/cpu/cpu.hpp
index 19a486be..bf68bea3 100644
--- a/bsnes/sfc/cpu/cpu.hpp
+++ b/bsnes/sfc/cpu/cpu.hpp
@@ -163,7 +163,9 @@ private:
 
   struct ALU {
     uint mpyctr = 0;
+    uint mpylast = 0;
     uint divctr = 0;
+    uint divlast = 0;
     uint shift = 0;
   } alu;
 
diff --git a/bsnes/sfc/cpu/io.cpp b/bsnes/sfc/cpu/io.cpp
index 3338051c..99d81791 100644
--- a/bsnes/sfc/cpu/io.cpp
+++ b/bsnes/sfc/cpu/io.cpp
@@ -157,8 +157,10 @@ auto CPU::writeCPU(uint addr, uint8 data) -> void {
     io.rddiv = io.wrmpyb << 8 | io.wrmpya;
 
     if(!configuration.hacks.cpu.fastMath) {
-      alu.mpyctr = 8;  //perform multiplication over the next eight cycles
-      alu.shift = io.wrmpyb;
+      if (!alu.mpylast) {
+        alu.mpyctr = 8;  //perform multiplication over the next eight cycles
+        alu.shift = io.wrmpyb;
+      }
     } else {
       io.rdmpy = io.wrmpya * io.wrmpyb;
     }
@@ -176,12 +178,14 @@ auto CPU::writeCPU(uint addr, uint8 data) -> void {
     io.rdmpy = io.wrdiva;
     if(alu.mpyctr || alu.divctr) return;
 
-    io.wrdivb = data;
-
     if(!configuration.hacks.cpu.fastMath) {
-      alu.divctr = 16;  //perform division over the next sixteen cycles
-      alu.shift = io.wrdivb << 16;
+      if (!alu.divlast) {
+        io.wrdivb = data;
+        alu.divctr = 16;  //perform division over the next sixteen cycles
+        alu.shift = io.wrdivb << 16;
+      }
     } else {
+      io.wrdivb = data;
       if(io.wrdivb) {
         io.rddiv = io.wrdiva / io.wrdivb;
         io.rdmpy = io.wrdiva % io.wrdivb;
diff --git a/bsnes/sfc/cpu/timing.cpp b/bsnes/sfc/cpu/timing.cpp
index e443e5a5..3df7e35b 100644
--- a/bsnes/sfc/cpu/timing.cpp
+++ b/bsnes/sfc/cpu/timing.cpp
@@ -142,13 +142,19 @@ auto CPU::scanline() -> void {
 auto CPU::aluEdge() -> void {
   if(alu.mpyctr) {
     alu.mpyctr--;
+    if (!alu.mpyctr)
+      alu.mpylast = 1;
     if(io.rddiv & 1) io.rdmpy += alu.shift;
     io.rddiv >>= 1;
     alu.shift <<= 1;
   }
+  else
+    alu.mpylast = 0;
 
   if(alu.divctr) {
     alu.divctr--;
+    if (!alu.divctr)
+      alu.divlast = 1;
     io.rddiv <<= 1;
     alu.shift >>= 1;
     if(io.rdmpy >= alu.shift) {
@@ -156,6 +162,8 @@ auto CPU::aluEdge() -> void {
       io.rddiv |= 1;
     }
   }
+  else
+    alu.divlast = 0;
 }
 
 auto CPU::dmaEdge() -> void {

@orbea
Copy link
Contributor

orbea commented Sep 3, 2022

For reference here is the fix for the jgemu bsnes fork.

https://gitlab.com/jgemu/bsnes/-/commit/ff43a0d611ae99e306af6050a9de062fcf7bc043

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants