Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JALR causes wrong behavior on AsmMachine when rs1 and rd use the same register. #92

Closed
jjyr opened this issue Dec 17, 2019 · 3 comments · Fixed by #99
Closed

JALR causes wrong behavior on AsmMachine when rs1 and rd use the same register. #92

jjyr opened this issue Dec 17, 2019 · 3 comments · Fixed by #99
Labels
t:bug

Comments

@jjyr
Copy link

@jjyr jjyr commented Dec 17, 2019

Describe the bug

The following instruction will cause different behavior on DefaultMachine(rust) and AsmMachine(asm).

   1021a:	00000097          	auipc	ra,0x0
   1021e:	fd4080e7          	jalr	-44(ra) # 0x101ee

Let's focus on the instruction 1021e. We expect this instruction to jump to rs1(which is ra) - 14 and set the pc + 4 to the rd(which is ra) register. The DefaultMachine handles this as expected. But unfortunately, the AsmMachine set the registers in a wrong ordering:

https://github.com/nervosnetwork/ckb-vm/blob/develop/src/machine/asm/execute.S#L441

AsmMachine set rd first, then read from rs1, when the rs1 and rd use the different registers this behavior is correct, but if the rs1 and rd use the same register then the WRITE_RD operation overwrite the value of rs1.

How to fix

To simply fix this problem, we can move the WRITE_RD to the after of the REGISTER_ADDRESS(RS1).

The change may cause the network consensus split. We need to wait for a hard-fork to apply it.

@jjyr jjyr added m:consensus t:bug labels Dec 17, 2019
@jjyr jjyr changed the title JALR causes wrong behavior on ASMMachine when rs1 and rd use the same register. JALR causes wrong behavior on AsmMachine when rs1 and rd use the same register. Dec 17, 2019
@xxuejie
Copy link
Member

@xxuejie xxuejie commented Dec 18, 2019

First, I've checked that all the system scripts do not have contain code that can trigger this bug, the problem is that gcc would optimize the following code:

   1021a:	00000097          	auipc	ra,0x0
   1021e:	fd4080e7          	jalr	-44(ra) # 0x101ee

into the following one instruction:

   jal ra, 0x101ee

jal won't trigger the above bug, hence the behavior we will observe here is: for programs compiled with gcc, CKB VM can run them perfectly, but for programs compiled via LLVM(it needs to be checked if latest LLVM has already fixed this), this bug might be encountered.

But this optimization is missing at least in the LLVM version used in Rust right now.

A proper fix here will need consensus change, hence we can only wait till the time a hardfork is performed. However, for the moment we can build a separate patcher that applies the transformation above manually, the patched binary will work both right now, and when the proper fix is made.

@xxuejie
Copy link
Member

@xxuejie xxuejie commented Dec 18, 2019

This tool provides a workaround for now that patches RISC-V binary so it won't use the code sequence that is triggering this bug. It can make the code work for now till one of the following conditions are met:

  1. This bug is resolved in a future fork
  2. The compiler adds linker optimization and stops generating this code sequence

@doitian doitian added this to 📬 Needs triage in CKB - Issues Dec 23, 2019
@doitian doitian moved this from 📬 Needs triage to ❄️ Icebox in CKB - Issues Dec 23, 2019
@doitian doitian moved this from ❄️ Icebox to 🔙 Backlog in CKB - Issues Jan 8, 2020
@doitian doitian moved this from 📬 Needs triage to ⚙️Chores in CKB - Issues Jan 8, 2020
@doitian doitian moved this from ⚙️Chores to 📬 Needs triage in CKB - Issues Jan 8, 2020
@doitian doitian moved this from 📬 Needs triage to ❄️Icebox in CKB - Issues Jan 8, 2020
@doitian doitian added this to Next Release in CKB Upgrade Plan via automation Jan 13, 2020
@doitian doitian moved this from ❄️ Icebox to 🍴 Next Fork in CKB - Issues Jan 15, 2020
xxuejie added a commit to xxuejie/ckb-vm that referenced this issue Feb 26, 2020
@xxuejie
Copy link
Member

@xxuejie xxuejie commented Feb 26, 2020

ASM & AOT mode fixed in #99
AOT mode will need a separate PR

xxuejie added a commit to xxuejie/ckb-vm that referenced this issue Mar 6, 2020
xxuejie added a commit to xxuejie/ckb-vm that referenced this issue Aug 19, 2020
@xxuejie xxuejie closed this in #99 Aug 27, 2020
CKB - Issues automation moved this from 🍴 Next Fork to ✅ Closed Aug 27, 2020
xxuejie added a commit that referenced this issue Aug 27, 2020
* feat: Introduce VM version

Since one major use case of CKB VM is on blockchains, we cannot just
change the code in case of bugs. We have to wait for the next fork
event, only then can we upgrade the code. Hence we are introducing a
concept of versions in the VM, whenever we fix a bug that might alter
behavior, we will insert checking code to only include the bug fix in
the next VM version. This way blockchains can expect consistent VM
behavior when they lock specific VM version.

* fix: #97 #98 in version 1

* fix: #92 in version 1

* fix: Reading the last byte of memory would trigger OutOfBound error

* fix: Only compile version tests when we do have asm mode

* fix: Bench

* fix: Bug fixes for AOT mode

* fix: VM Test Suite version

* fix: Loading binary with unaligned executable page would trigger error

* fix: Replicate the same bugs in Rust interpreter as well.

* test: Add missing alignment tests

* fix: Default run method should always use the latest non-buggy code

* test: Remove unneeded tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t:bug
Projects
CKB - Issues
✅ Closed
CKB Upgrade Plan
  
Next Upgrade
Development

Successfully merging a pull request may close this issue.

3 participants