Skip to content
This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

why in zcmp ra is placed in lowest address slot of GPR save area? #182

Closed
JoshCafe opened this issue Sep 19, 2022 · 8 comments
Closed

why in zcmp ra is placed in lowest address slot of GPR save area? #182

JoshCafe opened this issue Sep 19, 2022 · 8 comments

Comments

@JoshCafe
Copy link

JoshCafe commented Sep 19, 2022

hi

I found in GCC ra is placed in highest address slot of GPR save area in stack frame.
image
Same is obseved when -msave-restore is enabled.

However, in zcmp ra is placed in lowest address slot.
image

The reserved storage order causes different cfi offset of GPRs saved in GC/msave-restore cases and zcmp case.
May i know why in zcmp ra is placed in lowest address slot of GPR save area? and can we align it to GC/msave-restore cases?

BR,
Josh

@JoshCafe JoshCafe changed the title why why in zcmp ra is placed in highest address slot? Sep 19, 2022
@JoshCafe JoshCafe changed the title why in zcmp ra is placed in highest address slot? why in zcmp ra is placed in lowest address slot of GPR save area? Sep 19, 2022
@linsinan1995
Copy link
Contributor

iirc, there is no regulation for the storage order (see issue).

In the current GCC implementation, the storage order is supposed to be the same as save-restore (the placement order for push is ra, s0, s1, ..., s11) and have the same CFI order as save-restore, since this part in GCC was adapted from save-restore.

@JoshCafe
Copy link
Author

JoshCafe commented Sep 20, 2022

iirc, there is no regulation for the storage order (see issue).

In the current GCC implementation, the storage order is supposed to be the same as save-restore (the placement order for push is ra, s0, s1, ..., s11) and have the same CFI order as save-restore, since this part in GCC was adapted from save-restore.

Thanks for providing the previously discussed issue thread. (see #166).

From tariqkurd-repo's reply, the bytes in the block of memory can be accessed in any order - but they must be read from the correct locations.

The toolchain you mentioned has no functional issue in running push and pop pairs, but has issues in debugging. Because the CFI generated from toolchain could not read from the correct location.

Toolchain has to generate CFI offset info as per the zcmp spec, or the spec needs to reverse the push order.

@tariqkurd-repo, could you please clarify here also, Thanks.

@tariqkurd-repo
Copy link
Contributor

what's the problem here? I don't know what CFI means.
Zcmt is frozen, and will soon go to public review, so it's very unlikely to change.

@abukharmeh
Copy link
Contributor

abukharmeh commented Sep 22, 2022

@JoshCafe I think you meant Zcmp not Zcmt and Call Frame Information (Part of DWARF) ?, I am not sure how this matters when they are single instructions ?

@abukharmeh
Copy link
Contributor

abukharmeh commented Sep 22, 2022

It feels like this is needed for debugger to unwind stack frame, right ? https://stackoverflow.com/questions/24462106/what-do-the-cfi-directives-mean-and-some-more-questions I don't see any issue with the compiler generating different CFI information to match the implementation ? IIRC the order was set like this to optimise for interruptibility!

@JoshCafe
Copy link
Author

JoshCafe commented Sep 23, 2022

It feels like this is needed for debugger to unwind stack frame, right ? https://stackoverflow.com/questions/24462106/what-do-the-cfi-directives-mean-and-some-more-questions I don't see any issue with the compiler generating different CFI information to match the implementation ? IIRC the order was set like this to optimise for interruptibility!

@JoshCafe I think you meant Zcmp not Zcmt and Call Frame Information (Part of DWARF) ?, I am not sure how this matters when they are single instructions ?

yes. I mean Zcmp and Call Frame Information (Part of DWARF) which is needed for debugger to unwind stack frame.

https://sourceware.org/binutils/docs/as/CFI-directives.html has a breif description of the CFI directives.

If user compiles with -g compiler option, CFI-directives will be generated. As cm.push grows the stack and stores GPRs into it, .cfi_adjust_cfa_offset offset and multiple .cfi_offset register, offset will be generated accordingly.

let me reuse example from @Silabs-ArjanB(see #166)

This is the order from Zcmp spec:
sw s5, -4(sp)
sw s4, -8(sp)
sw s3,-12(sp)
sw s2,-16(sp)
sw s1,-20(sp)
sw s0,-24(sp)
sw ra,-28(sp)

This is the order current toolchain implements. It alligns with what GCC does for save-restore from @linsinan1995 .
sw ra, -4(sp)
sw s0, -8(sp)
sw s1,-12(sp)
sw s2,-16(sp)
sw s3,-20(sp)
sw s4,-24(sp)
sw s5,-28(sp)

If a crash happens between cm.push and cm.popret, the CFI info will be used to unwind stack frame. In this case, offset -4, as in sw ra, -4(sp), will be used as the stack frame offet to take ra, while in hardware, as per Zcmp spec, offset -4 is actually the slot for s5. So this mismatch will cause an error in unwinding the stack frame.

As @tariqkurd-repo mentioned, Zcmp spec if frozen, we has to take this issue into consideration in toolchain.

@tariqkurd-repo
Copy link
Contributor

Hi @JoshCafe , you're right, the toolchain will have to be updated to support this.

@tovine
Copy link
Contributor

tovine commented Sep 23, 2022

I don't remember what the justification was (if any) for the current ordering, but what would be the point against aligning with the ordering from save-restore in the toolchain (obviously apart from the fact that it's frozen)?
Guess it depends on what's easier to change - spec or toolchain and if all toolchains do it the same way or not, but I can definitely see the benefit of having a common alignment on this across the instructions and the software equivalents.

tclin914 added a commit to llvm/llvm-project that referenced this issue Aug 2, 2023
…push.

Issue mentioned: riscvarchive/riscv-code-size-reduction#182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

CFI offset should be calculed as reversed order for correct stack unwinding.

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437
tclin914 added a commit to tclin914/llvm-project that referenced this issue Aug 2, 2023
…push.

Issue mentioned: riscvarchive/riscv-code-size-reduction#182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

CFI offset should be calculed as reversed order for correct stack unwinding.

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437
tru pushed a commit to llvm/llvm-project-release-prs that referenced this issue Aug 3, 2023
…push.

Issue mentioned: riscvarchive/riscv-code-size-reduction#182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

CFI offset should be calculed as reversed order for correct stack unwinding.

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437
doru1004 pushed a commit to doru1004/llvm-project that referenced this issue Aug 3, 2023
…push.

Issue mentioned: riscvarchive/riscv-code-size-reduction#182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

CFI offset should be calculed as reversed order for correct stack unwinding.

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437
razmser pushed a commit to razmser/llvm-project that referenced this issue Sep 8, 2023
…push.

Issue mentioned: riscvarchive/riscv-code-size-reduction#182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

CFI offset should be calculed as reversed order for correct stack unwinding.

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants