Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arch/riscv64: Add basic mcount tracing support for RISC-V 64bit #1815

Merged
merged 3 commits into from
Oct 15, 2023

Conversation

gichoel
Copy link
Contributor

@gichoel gichoel commented Aug 29, 2023

This pull request is a porting effort to enable successful, error-free builds on the RISC-V 64-bit architecture environment and to support tracing using the mcount function.

This feature has also been tested on the VisionFive2 RISC-V board.

The following features are currently supported in the PR.

  • Basic mcount Tracing
  • Argument Handling

The following features are not yet supported

  • PLT hooking (this feature is in progress)
  • Dynamic tracing

This PR was born out of this issue.


This work has been discussed previously with @honggyukim in the PR link below.

The Argument Handling feature was split into the PRs below and then merged back together.

@gichoel
Copy link
Contributor Author

gichoel commented Aug 29, 2023

If you are interested in doing basic tracing with the mcount function in a RISC-V environment, please see the branch at the link below.

@SEOKMIN83
Copy link
Contributor

SEOKMIN83 commented Aug 30, 2023

If you are interested in doing basic tracing with the mcount function in a RISC-V environment, please see the branch at the link below.

I tried basic tracing on RISC-V with your implementation,
and it worked fine.

The result of basic test is below.

sm_ple38@starfive:~/uftrace$ ./uftrace --libmcount-path=. -v --no-libcall t-abc
uftrace: running uftrace  ( riscv64 dwarf python3 tui perf sched dynamic )
uftrace: checking binary t-abc
uftrace: skipping perf event due to error: Function not implemented
uftrace: creating 1 thread(s) for recording
uftrace: using ./libmcount/libmcount.so library for tracing
mcount: initializing mcount library
mcount: mcount setup done
mcount: new session started: 51b82f567d330d82: t-abc
mcount: mcount trace finished
mcount: exit from libmcount
uftrace: child terminated with exit code: 8
uftrace: reading /tmp/uftrace-live-iGLq3T/task.txt file
uftrace: flushing /uftrace-51b82f567d330d82-234695-000
uftrace: live-record finished.. 
uftrace: start live-replaying...
uftrace: reading /tmp/uftrace-live-iGLq3T/task.txt file
fstack: fixup for some special functions
uftrace: add field "duration"
uftrace: add field "tid"
# DURATION     TID     FUNCTION
            [234695] | main() {
            [234695] |   a() {
            [234695] |     b() {
   5.250 us [234695] |       c();
  14.000 us [234695] |     } /* b */
  17.500 us [234695] |   } /* a */
  23.251 us [234695] | } /* main */
uftrace: removing /tmp/uftrace-live-iGLq3T directory

Thank you so much for making uftrace usable
in the RISC-V environment.

@SEOKMIN83
Copy link
Contributor

In my opinion,
it would be more useful to have a detailed manual
on how to do basic tracing tests in a RISC-V environment.

@gichoel
Copy link
Contributor Author

gichoel commented Aug 30, 2023

Thanks for the good point @SEOKMIN83. I had forgotten the part about needing a way to run this.

Currently, this task can be executed as follows after building from the root directory of the uftrace source code.

  1. compile the target program
  • Using gcc

    $ gcc -o t-abc -pg tests/s-abc.c
    
  • Using clang

    $ clang -o t-abc -pg tests/s-abc.c`
    
  1. Trace the target program with uftrace
  • Run uftrace with Target Program
    $ ./uftrace --libmcount-path=. --no-libcall t-abc
    

@SEOKMIN83
Copy link
Contributor

Thanks for the good point @SEOKMIN83. I had forgotten the part about needing a way to run this.

Currently, this task can be executed as follows after building from the root directory of the uftrace source code.

  1. compile the target program
  • Using gcc
    $ gcc -o t-abc -pg tests/s-abc.c
    
  • Using clang
    $ clang -o t-abc -pg tests/s-abc.c`
    
  1. Trace the target program with uftrace
  • Run uftrace with Target Program
    $ ./uftrace --libmcount-path=. --no-libcall t-abc
    

Thank you☺️☺️🌸🌸
It'll be great help to those who want to test it.

@gichoel
Copy link
Contributor Author

gichoel commented Aug 31, 2023

I believe the mcount_return code can be reduced to the following after testing.

diff --git a/arch/riscv64/mcount.S b/arch/riscv64/mcount.S
index 91d677bb..498f38e0 100644
--- a/arch/riscv64/mcount.S
+++ b/arch/riscv64/mcount.S
@@ -59,20 +59,12 @@ ENTRY(mcount_return)
        sd fp, 8(sp)
        addi fp, sp, 24

        /* save return values */
        sd a0, 0(sp)

        /* set the first argument of mcount_exit as pointer to return values */
        addi a0, sp, 0

        /* call mcount_exit func */
        call mcount_exit

-       mv t1, a0
-
-       /* restore return values */
-       ld a0, 0(sp)
-
        /* restore frame pointer */
        ld s0, 8(sp)
        ld ra, 16(sp)
@@ -80,5 +72,5 @@ ENTRY(mcount_return)
        addi sp, sp, 24

        /* call return address */
-       jr t1
+      jr a0
 END(mcount_return)

If you try this code and see any issues, please let me know in the comments.

@gichoel gichoel changed the title arch/riscv64: Add basic mcount tracing support for RISC-V 64bit [WIP] arch/riscv64: Add basic mcount tracing support for RISC-V 64bit Aug 31, 2023
@gichoel gichoel changed the title [WIP] arch/riscv64: Add basic mcount tracing support for RISC-V 64bit arch/riscv64: Add basic mcount tracing support for RISC-V 64bit Aug 31, 2023
Copy link
Collaborator

@honggyukim honggyukim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In d8ad79e arch/riscv64: Update 'mcount.S' to support tracking based on mcount func,

Info : When compiling the target program with gcc, you need to add
'-O0' or add '-fno-omit-frame-pointer' as an option because it will
disable frame pointers when optimization options are enabled.

Users do not have to specify -O0 because it is the default setting. You better explain it as follows.

In order to figure out the address of parent_loc, we need the frame pointer,
but compiler optimization such as `-O2` in gcc removes the `fp` so we won't
be able know where to change to hijack the return address to `mcount_return`.
This problem only happens in gcc, but not in clang.

To avoid the problem, `-fno-omit-frame-pointer` must be used when gcc
optimization option is used in riscv64.

In addition, we better have the following change in this PR.

diff --git a/uftrace.c b/uftrace.c
index 69e6cefc..69486e94 100644
--- a/uftrace.c
+++ b/uftrace.c
@@ -1389,7 +1389,10 @@ int main(int argc, char *argv[])
        struct uftrace_opts opts = {
                .mode = UFTRACE_MODE_INVALID,
                .dirname = UFTRACE_DIR_NAME,
+#if !defined(__riscv)
+               /* FIXME: disable libcall until PLT hooking is implemented in riscv64. */
                .libcall = true,
+#endif
                .bufsize = SHMEM_BUFFER_SIZE,
                .max_stack = OPT_RSTACK_DEFAULT,
                .port = UFTRACE_RECV_PORT,

This is needed to avoid segfault when --no-libcall is not used.

@honggyukim
Copy link
Collaborator

It looks the argument handling is incorrect even for integer types for now.

$ gcc -pg -g tests/s-abc.c -o t-abc
$ uftrace -a --no-libcall t-abc 
# DURATION     TID     FUNCTION
            [265212] | main(0x3f80d79808, 0x3fc59ab208) {
            [265212] |   a() {
            [265212] |     b() {
   2.500 us [265212] |       c() = 65212;
   7.250 us [265212] |     } = 0xffffffffc7cc78c5; /* b */
   8.500 us [265212] |   } = 0xffffffffc7cc789f; /* a */
  13.500 us [265212] | } = 0; /* main */

The return address of b and a are incorrect.

@gichoel
Copy link
Contributor Author

gichoel commented Sep 2, 2023

I will update the commit to incorporate your feedback.

Also, the argument handling issue has been confirmed in my environment and seems to be related to the mcount_return part, which I recently removed as unnecessary.

As such, I will be reverting that code back.

@gichoel
Copy link
Contributor Author

gichoel commented Sep 2, 2023

And I was able to see the following results after the patch.

Thanks to this, we no longer need to use the --no-libcall option!

$ gcc -pg -g tests/s-abc.c -o t-abc
$ ./uftrace --libmcount-path=. -a t-abc
# DURATION     TID     FUNCTION
            [274658] | main(0x3f8da21808, 0x3fe7908478) {
            [274658] |   a() {
            [274658] |     b() {
   2.000 us [274658] |       c() = 74658;
   6.250 us [274658] |     } = 74659; /* b */
   8.250 us [274658] |   } = 74658; /* a */
  13.000 us [274658] | } = 0; /* main */

@SEOKMIN83
Copy link
Contributor

And I was able to see the following results after the patch.

Thanks to this, we no longer need to use the --no-libcall option!

$ gcc -pg -g tests/s-abc.c -o t-abc
$ ./uftrace --libmcount-path=. -a t-abc
# DURATION     TID     FUNCTION
            [274658] | main(0x3f8da21808, 0x3fe7908478) {
            [274658] |   a() {
            [274658] |     b() {
   2.000 us [274658] |       c() = 74658;
   6.250 us [274658] |     } = 74659; /* b */
   8.250 us [274658] |   } = 74658; /* a */
  13.000 us [274658] | } = 0; /* main */

It works well. !!!!!
The below is the result of no use the --no-libcall option.

sm_ple38@starfive:~/uftrace$ gcc -pg -g tests/s-abc.c -o t-abc
sm_ple38@starfive:~/uftrace$ ./uftrace --libmcount-path=. -a t-abc
# DURATION     TID     FUNCTION
            [285395] | main(0x3f8ba48808, 0x3fe94d3338) {
            [285395] |   a() {
            [285395] |     b() {
   1.750 us [285395] |       c() = 85395;
   6.000 us [285395] |     } = 85396; /* b */
   7.250 us [285395] |   } = 85395; /* a */
  12.750 us [285395] | } = 0; /* main */

@coyotek
Copy link
Contributor

coyotek commented Sep 3, 2023

If it is a string, it will not work.
The second argument of the bar() function is different.
Below are the test results.
The test source file is ./tests/s-arg.c.

  • for x86_64
$ ./uftrace -a t-arg
# DURATION     TID     FUNCTION
   0.691 us [224359] | __monstartup();
   0.140 us [224359] | __cxa_atexit();
            [224359] | main(1, 0x7fff378d7338) {
            [224359] |   foo(3) {
            [224359] |     bar(2, "c") {
   0.561 us [224359] |       strcmp("c", "c") = 0;
 184.333 us [224359] |     } = 0; /* bar */
            [224359] |     bar(1, "b") {
   0.220 us [224359] |       strcmp("b", "b") = 0;
   0.872 us [224359] |     } = 0; /* bar */
            [224359] |     bar(0, "a") {
   0.190 us [224359] |       strcmp("a", "a") = 0;
   0.682 us [224359] |     } = 0; /* bar */
 187.098 us [224359] |   } = 0; /* foo */
   0.270 us [224359] |   many(12) = 0;
            [224359] |   pass(3) {
   0.531 us [224359] |     check(big{...}, 0x7fff378d7190) = 0;
   1.082 us [224359] |   } = 0; /* pass */
 190.775 us [224359] | } = 0; /* main */
  • for RISC-V
$ ./uftrace -a t-arg
# DURATION     TID     FUNCTION
            [286418] | main(1, 0x3ff0657448) {
            [286418] |   foo(3) {
   2.501 us [286418] |     bar(2, 68944) = 0;
   0.500 us [286418] |     bar(1, 68942) = 0;
   0.500 us [286418] |     bar(0, 68940) = 0;
  10.001 us [286418] |   } = 0; /* foo */
   1.250 us [286418] |   many(12) = 0;
            [286418] |   pass(3) {
   1.250 us [286418] |     check(big{...}, 0x3ff0657200) = 0;
   4.250 us [286418] |   } = 0; /* pass */
  21.501 us [286418] | } = 0; /* main */

Also, if you change the order of the arguments in the bar() function, the result is the same.

$ ./uftrace -a t-arg2                   
# DURATION     TID     FUNCTION                                 
            [287700] | main(1, 0x3fd905b438) {                  
            [287700] |   foo(3) {                               
   3.000 us [287700] |     bar(68944, 2) = 0;                   
   0.750 us [287700] |     bar(68942, 1) = 0;                   
   0.750 us [287700] |     bar(68940, 0) = 0;                   
  10.750 us [287700] |   } = 0; /* foo */                       
   1.000 us [287700] |   many(12) = 0;                          
            [287700] |   pass(3) {                              
   1.500 us [287700] |     check(big{...}, 0x3fd905b1f0) = 0;   
   4.750 us [287700] |   } = 0; /* pass */                      
  24.000 us [287700] | } = 0; /* main */                        

@gichoel
Copy link
Contributor Author

gichoel commented Sep 3, 2023

Hmm... I found the same problem on RISC-V and tested characters and strings.

  • Testing character output

    /* test2.c */
    #include <stdio.h>
    
    void foo(int i, char c){
            printf("%d %c\n",i, c);
    }
    
    int main(void){
            foo(15, 'c');
            return 0;
    }
    $ clang -pg -g test2.c -o t-arg3
    $ ./uftrace --libmcount-path=. -a t-arg3
    
    # DURATION     TID     FUNCTION
                [288561] | main() {
      12.501 us [288561] |   foo(15, 'c');
      34.001 us [288561] | } = 0; /* main */
  • Testing string output

    $ clang -pg -g tests/s-arg.c -o t-arg
    $ ./uftrace --libmcount-path=. -a t-arg
    
    # DURATION     TID     FUNCTION
                [288583] | main(1, 0x3ff7bc7478) {
                [288583] |   foo(3) {
       3.250 us [288583] |     bar(2, 68944) = 0;
       1.250 us [288583] |     bar(1, 68942) = 0;
       1.000 us [288583] |     bar(0, 68940) = 0;
      91.001 us [288583] |   } = 0; /* foo */
       1.250 us [288583] |   many(12) = 0;
                [288583] |   pass(3) {
       1.500 us [288583] |     check(big{...}, 0x3ff7bc7230) = 0;
      32.001 us [288583] |   } = 0; /* pass */
     223.004 us [288583] | } = 0; /* main */
  • Debugging with gdb

    $ gdb --args ./uftrace --libmcount-path=. -a t-arg
    
    ......
    
    (gdb) b _mcount:1
    (gdb) r
    
    ......
    
    /* Skip main() func */
    Thread 2.1 "t-arg" hit Breakpoint 1, _mcount () at ...
    7               addi sp, sp, -80
    (gdb) c
    
    ......
    
    /* Skip foo() func */
    Thread 2.1 "t-arg" hit Breakpoint 1, _mcount () at  ...
    7               addi sp, sp, -80
    (gdb) c
    
    ......
    
    /* in bar() func */
    Thread 2.1 "t-arg" hit Breakpoint 1, _mcount () at ...
    7               addi sp, sp, -80
    (gdb) n
    
    ......
    
    (gdb) n
    20              sd a0, 0(sp)
    
    (gdb) info register
    ra             0x107e0  0x107e0 <bar+20>
    sp             0x3ffffff1b0     0x3ffffff1b0
    gp             0x12800  0x12800
    tp             0x3ff7afe7b0     0x3ff7afe7b0
    t0             0x3fffffec08     274877901832
    t1             0x106dc  67292
    t2             0x10     16
    fp             0x3ffffff200     0x3ffffff200
    s1             0x0      0
    a0             0x2      2
    a1             0x10d50  68944
    a2             0x0      0
    a3             0x0      0
    a4             0x3ffffff2f8     274877903608
    a5             0x10b72  68466
    a6             0x0      0
    a7             0x3ff7e95600     274742203904
    s2             0x3ffffff548     274877904200
    s3             0x0      0
    s4             0x3      3
    s5             0x0      0
    s6             0x0      0
    s7             0x2aaaaf9548     183252260168
    s8             0x3ff7cd2450     274740356176
    s9             0x3ffffff1bc     274877903292
    s10            0x3      3
    s11            0x3ffffff7e0     274877904864
    t3             0x3ff7fcbaa2     274743474850
    t4             0x17     23
    t5             0x1483357adb5cdd 5773765251914973
    t6             0x2cb1572ace4    3071261453540
    pc             0x3ff7fcbab8     0x3ff7fcbab8 <_mcount+22>
    
    (gdb) x/b 0x10d50
    0x10d50:        99 

In the ASCII code table, the decimal 99 corresponds to 'c', and debugging result that the value of the string argument to the _mcount function is normal.

So I'm guessing it's a matter of the code below in utils/dwarf.c not recognizing it as a string, but I need a little more confirmation.

	/* utils/dwarf.c */
931:	if (td->pointer) {
932:		td->size = sizeof(long) * 8;
933:
934:		/* treat 'char *' as string */
935:		if (td->pointer == 1 && tname && !strcmp(tname, "char"))
936:			td->fmt = ARG_FMT_STR;
937:		else
938:			td->fmt = ARG_FMT_PTR;
939:		return true;
940:	}

@coyotek
Copy link
Contributor

coyotek commented Sep 3, 2023

Like you said
It should print the value 0x10d50, but it seems to be printing an address.

Copy link
Collaborator

@honggyukim honggyukim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the following error when running it in debug build.

$ uftrace record -a a.out 
/home/user/uftrace/utils/regs.c:437: arch_register_dwarf_name: ASSERT `arch < ARRAY_SIZE(arch_dwarf_tables)' failed.
Stack trace:
  #1  0x003fb30d3ca2 stacktrace + 0x34

Please report this bug to https://github.com/namhyung/uftrace/issues.

WARN: child terminated by signal: 5: Trace/breakpoint trap

Attaching gdb shows backtrace info as follows.

(gdb) r
    ...
/home/user/uftrace/utils/regs.c:437: arch_register_dwarf_name: ASSERT `arch < ARRAY_SIZE(arch_dwarf_tables)' failed.
Stack trace:
  #1  0x003ff7fbaca2 stacktrace + 0x34

Please report this bug to https://github.com/namhyung/uftrace/issues.

Thread 2.1 "a.out" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3ff7ac9010 (LWP 290922)]
__GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:49
49      in ../sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x0000003ff7fb0092 in arch_register_dwarf_name (arch=UFT_CPU_RISCV64, dwarf_reg=90) at /home/user/uftrace/utils/regs.c:437
#2  0x0000003ff7fc1456 in add_location (spec=0x3fffffea10 "arg1", len=256, die=0x3fffffeb18, ad=0x3fffffeb80)
    at /home/user/uftrace/utils/dwarf.c:1184
#3  0x0000003ff7fc185e in get_argspec (die=0x3fffffec98, data=0x3fffffeb80) at /home/user/uftrace/utils/dwarf.c:1290
#4  0x0000003ff7fc1fec in get_dwarfspecs_cb (die=0x3fffffec98, data=0x3fffffee10) at /home/user/uftrace/utils/dwarf.c:1512
#5  0x0000003ff7c348b8 in ?? () from /usr/lib/riscv64-linux-gnu/libdw.so.1

It looks you need to implement utf_riscv64_dwarf_table at https://github.com/namhyung/uftrace/blob/v0.14/utils/regs.c#L415-L421.

static const struct uftrace_reg_table *arch_dwarf_tables[] = {
        NULL,   
        uft_x86_64_dwarf_table,
        uft_arm_dwarf_table,
        uft_aarch64_dwarf_table,
        uft_i386_dwarf_table,
};

Comment on lines 79 to 83
default:
return -1;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My compiler complains this part as follows.

/home/user/uftrace/arch/riscv64/mcount-support.c: In function 'mcount_get_register_arg':
/home/user/uftrace/arch/riscv64/mcount-support.c:82:1: error: control reaches end of non-void function [-Werror=return-type]
   82 | }
      | ^
cc1: all warnings being treated as errors

You better remove default case, then move return -1 to the end of this function.

Copy link
Contributor Author

@gichoel gichoel Sep 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also got an error message after the make DEBUG=1 debug build, which looked like it needed fixing.

Your suggestion is good, but mcount-support.c in other architectures uses return 0;.

This was my mistake during the porting process and I think adding return 0; is a good way to be consistent, so I will apply the same method as the other architectures.

@gichoel gichoel marked this pull request as draft September 4, 2023 16:29
@gichoel
Copy link
Contributor Author

gichoel commented Sep 4, 2023

I was referring to the psABI documentation for RISC-V to implement the DWARF-specific code that was the feedback regarding the Argument Handling feature.

However, I've realized that the list of changes to be made is longer than I thought and that I need to reorder the previous commits to make this work.


Therefore, I reordered the commits and made the modifications in the arch/riscv-support-mcount-v2 branch, but there is a problem when building using the make DEBUG=1 command.

I followed the problem backwards using git bisect and encountered the following error in the first commit of the RISC-V porting work, so I will fix it and get it reviewed again.

/home/test/workdir/uftrace/arch/riscv64/mcount-support.c:18:13: error: 'mcount_get_struct_arg' defined but not used [-Werror=unused-function].
   18 | static void mcount_get_struct_arg(struct mcount_arg_context *ctx, struct uftrace_arg_spec *spec)
      | ^~~~~~~~~~~~~~~~~~~~~
/home/test/workdir/uftrace/arch/riscv64/mcount-support.c:13:13: error: 'mcount_get_stack_arg' defined but not used [-Werror=unused-function].
   13 | static void mcount_get_stack_arg(struct mcount_arg_context *ctx, struct uftrace_arg_spec *spec)
      | ^~~~~~~~~~~~~~~~~~~~
/home/test/workdir/uftrace/arch/riscv64/mcount-support.c:8:12: error: 'mcount_get_register_arg' defined but not used [-Werror=unused-function].
    8 | static int mcount_get_register_arg(struct mcount_arg_context *ctx, struct uftrace_arg_spec *spec)
      | ^~~~~~~~~~~~~~~~~~~~~~~

@namhyung
Copy link
Owner

namhyung commented Sep 4, 2023

I think we can merge the basic RISC-V support code first without argument handling. The new architecture support is not a simple task, we can approach it step by step. You may want to file a separate issue for the arguments and return value tracing.

@gichoel
Copy link
Contributor Author

gichoel commented Sep 5, 2023

Thanks for the feedback @namhyung

I will leave only the basic RISC-V support code in the current PR and create a new PR for the argument handling.

However, when I built the first commit for RISC-V porting using the make command, there was no error, but when I built it using the make DEBUG=1 command, there was an error.

Therefore, I will verify the commits related to the basic RISC-V Support code using both commands and if everything is fine, I will re-upload the commits to the current PR.

@gichoel gichoel force-pushed the arch/riscv-support-mcount branch 2 times, most recently from d1d24a7 to 0ed96e9 Compare September 5, 2023 13:33
@gichoel gichoel marked this pull request as ready for review September 5, 2023 13:34
@namhyung
Copy link
Owner

The unittest is a separate binary including all the code but with a different main function. Some test codes are in the arch/ directory and only included for the architecture (x86_64).

@honggyukim
Copy link
Collaborator

I have investigated the unittest and found that it doesn't work only in gcc.

Switching the compiler to clang makes it work fine. You can manually modify CC := gcc to CC := clang in .config.

    ...
unit test stats
====================
 94 ran successfully
  0 failed
  1 skipped
  0 signal caught
  0 unknown result

The reason looks the uftrace.unit_test section is not properly filled out.

1. gcc built unittest

$ objdump -D unittest
      ...
Disassembly of section uftrace.unit_test:

00000000000e0cb8 <test_unittest_framework>:
        ...

00000000000e0cc8 <test_option_parsing1>:
        ...

00000000000e0cd8 <test_option_parsing2>:
        ...

00000000000e0ce8 <test_option_parsing3>:
        ...

00000000000e0cf8 <test_option_parsing4>:
        ...

2. clang built unittest

$ objdump -D unittest
      ...
Disassembly of section uftrace.unit_test:

00000000000a4358 <test_unittest_framework>:
   a4358:       c88c                    sw      a1,16(s1)
   a435a:       00000007                .4byte  0x7
   a435e:       0000                    unimp
   a4360:       6eb8                    ld      a4,88(a3)
   a4362:       0001                    nop
   a4364:       0000                    unimp
        ...

00000000000a4368 <test_option_parsing1>:
   a4368:       0111                    addi    sp,sp,4
   a436a:       0008                    .2byte  0x8
   a436c:       0000                    unimp
   a436e:       0000                    unimp
   a4370:       8334                    .2byte  0x8334
   a4372:       0001                    nop
   a4374:       0000                    unimp
        ...

00000000000a4378 <test_option_parsing2>:
   a4378:       07e0                    addi    s0,sp,972
   a437a:       0008                    .2byte  0x8
   a437c:       0000                    unimp
   a437e:       0000                    unimp
   a4380:       9406                    add     s0,s0,ra
   a4382:       0001                    nop
   a4384:       0000                    unimp
        ...

@gichoel It looks a bug in gcc on riscv so I think you can ignore this issue. I think clang better supports RISC-V for our purpose because of multiple reasons.

clang doesn't have the following problems that gcc have.

  1. broken unittest
  2. -fno-omit-frame-pointer required in optimization builds
  3. the first argument capture is broken

Let's focus on clang support for RISC-V as of now.

@gichoel
Copy link
Contributor Author

gichoel commented Sep 17, 2023

As you confirmed, switching to clang has resulted in all test cases working fine except for one skip. (The skip occurs in the same test case on x86_64).

All future RISC-V porting work will be done with clang as you suggested.

Thanks.

@namhyung
Copy link
Owner

I'm ok with proceeding for clang, but could you please share more details of the GCC problems? Specially I'd like to see the actual objdump -D output and ELF section headers (like readelf -SW).

@gichoel
Copy link
Contributor Author

gichoel commented Sep 18, 2023

The results of your requested objdump -D and readelf -SW are shown below, and gcc's objdump -D result was outputting ... literally, rather than omitting the contents.

Additionally, these results were run on a RISC-V board.

1. gcc built unittest

$ objdump -D unittest

......

Disassembly of section uftrace.unit_test:

000000000009e5a0 <test_unittest_framework>:
	...

000000000009e5b0 <test_option_parsing5>:
	...

000000000009e5c0 <test_option_parsing4>:
	...

000000000009e5d0 <test_option_parsing3>:
	...

000000000009e5e0 <test_option_parsing2>:
	...

000000000009e5f0 <test_option_parsing1>:
	...
$ readelf -SW unittest

There are 40 section headers, starting at offset 0x3c8210:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00000000000002a8 0002a8 000021 00   A  0   0  1
  [ 2] .note.gnu.build-id NOTE            00000000000002cc 0002cc 000024 00   A  0   0  4
  [ 3] .note.ABI-tag     NOTE            00000000000002f0 0002f0 000020 00   A  0   0  4
  [ 4] .gnu.hash         GNU_HASH        0000000000000310 000310 0001d0 00   A  5   0  8
  [ 5] .dynsym           DYNSYM          00000000000004e0 0004e0 0020d0 18   A  6   2  8
  [ 6] .dynstr           STRTAB          00000000000025b0 0025b0 000f13 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          00000000000034c4 0034c4 0002bc 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         0000000000003780 003780 000190 00   A  6   9  8
  [ 9] .rela.dyn         RELA            0000000000003910 003910 00b538 18   A  5   0  8
  [10] .rela.plt         RELA            000000000000ee48 00ee48 001950 18  AI  5  24  8
  [11] .plt              PROGBITS        00000000000107a0 0107a0 001100 10  AX  0   0 16
  [12] .text             PROGBITS        00000000000118a0 0118a0 05e69a 00  AX  0   0  4
  [13] .rodata           PROGBITS        000000000006ff40 06ff40 0270fa 00   A  0   0  8
  [14] .eh_frame_hdr     PROGBITS        000000000009703c 09703c 000014 00   A  0   0  4
  [15] .eh_frame         PROGBITS        0000000000097050 097050 00002c 00   A  0   0  8
  [16] .tbss             NOBITS          0000000000098360 097360 0010f0 00 WAT  0   0  8
  [17] .preinit_array    PREINIT_ARRAY   0000000000098360 097360 000008 08  WA  0   0  1
  [18] .init_array       INIT_ARRAY      0000000000098368 097368 000008 08  WA  0   0  8
  [19] .fini_array       FINI_ARRAY      0000000000098370 097370 000008 08  WA  0   0  8
  [20] .data.rel.ro      PROGBITS        0000000000098378 097378 002a08 00  WA  0   0  8
  [21] .dynamic          DYNAMIC         000000000009ad80 099d80 000280 10  WA  6   0  8
  [22] .data             PROGBITS        000000000009b000 09a000 00359c 00  WA  0   0  8
  [23] uftrace.unit_test PROGBITS        000000000009e5a0 09d5a0 0005f0 00  WA  0   0  8
  [24] .got              PROGBITS        000000000009eb90 09db90 000a40 08  WA  0   0  8
  [25] .bss              NOBITS          000000000009f5d0 09e5d0 003368 00  WA  0   0  8
  [26] .comment          PROGBITS        0000000000000000 09e5d0 00001e 01  MS  0   0  1
  [27] .riscv.attributes RISCV_ATTRIBUTES 0000000000000000 09e5ee 000037 00      0   0  1
  [28] .debug_aranges    PROGBITS        0000000000000000 09e625 000aa0 00      0   0  1
  [29] .debug_info       PROGBITS        0000000000000000 09f0c5 11fb40 00      0   0  1
  [30] .debug_abbrev     PROGBITS        0000000000000000 1bec05 013545 00      0   0  1
  [31] .debug_line       PROGBITS        0000000000000000 1d214a 0bab1a 00      0   0  1
  [32] .debug_frame      PROGBITS        0000000000000000 28cc68 011128 00      0   0  8
  [33] .debug_str        PROGBITS        0000000000000000 29dd90 015fb8 01  MS  0   0  1
  [34] .debug_line_str   PROGBITS        0000000000000000 2b3d48 00116a 01  MS  0   0  1
  [35] .debug_loclists   PROGBITS        0000000000000000 2b4eb2 0d0ce1 00      0   0  1
  [36] .debug_rnglists   PROGBITS        0000000000000000 385b93 029e51 00      0   0  1
  [37] .symtab           SYMTAB          0000000000000000 3af9e8 00f600 18     38 1534  8
  [38] .strtab           STRTAB          0000000000000000 3befe8 009072 00      0   0  1
  [39] .shstrtab         STRTAB          0000000000000000 3c805a 0001b1 00      0   0  1

2. clang built unittest

$ objdump -D unittest

......

Disassembly of section uftrace.unit_test:

00000000000a3760 <test_unittest_framework>:
   a3760:	c49c                	sw	a5,8(s1)
   a3762:	00000007          	.4byte	0x7
   a3766:	0000                	unimp
   a3768:	6eb8                	ld	a4,88(a3)
   a376a:	0001                	nop
   a376c:	0000                	unimp
	...

00000000000a3770 <test_option_parsing1>:
   a3770:	fcd5                	bnez	s1,a372c <type_size.table+0xe5c>
   a3772:	00000007          	.4byte	0x7
   a3776:	0000                	unimp
   a3778:	82f8                	.2byte	0x82f8
   a377a:	0001                	nop
   a377c:	0000                	unimp
	...

00000000000a3780 <test_option_parsing2>:
   a3780:	03a4                	addi	s1,sp,456
   a3782:	0008                	.2byte	0x8
   a3784:	0000                	unimp
   a3786:	0000                	unimp
   a3788:	93ca                	add	t2,t2,s2
   a378a:	0001                	nop
   a378c:	0000                	unimp
	...

00000000000a3790 <test_option_parsing3>:
   a3790:	051f 0008 0000      	.byte	0x1f, 0x05, 0x08, 0x00, 0x00, 0x00
   a3796:	0000                	unimp
   a3798:	97fc                	.2byte	0x97fc
   a379a:	0001                	nop
   a379c:	0000                	unimp
	...

00000000000a37a0 <test_option_parsing4>:
   a37a0:	0698                	addi	a4,sp,832
   a37a2:	0008                	.2byte	0x8
   a37a4:	0000                	unimp
   a37a6:	0000                	unimp
   a37a8:	9d22                	add	s10,s10,s0
   a37aa:	0001                	nop
   a37ac:	0000                	unimp
	...

00000000000a37b0 <test_option_parsing5>:
   a37b0:	0008078f          	.4byte	0x8078f
   a37b4:	0000                	unimp
   a37b6:	0000                	unimp
   a37b8:	a1fc                	fsd	fa5,192(a1)
   a37ba:	0001                	nop
   a37bc:	0000                	unimp
	...
$ readelf -SW unittest

There are 40 section headers, starting at offset 0x5cbd18:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00000000000102a8 0002a8 000021 00   A  0   0  1
  [ 2] .note.gnu.build-id NOTE            00000000000102cc 0002cc 000024 00   A  0   0  4
  [ 3] .note.ABI-tag     NOTE            00000000000102f0 0002f0 000020 00   A  0   0  4
  [ 4] .hash             HASH            0000000000010310 000310 0008e0 04   A  6   0  8
  [ 5] .gnu.hash         GNU_HASH        0000000000010bf0 000bf0 0009dc 00   A  6   0  8
  [ 6] .dynsym           DYNSYM          00000000000115d0 0015d0 001c68 18   A  7   1  8
  [ 7] .dynstr           STRTAB          0000000000013238 003238 000c8c 00   A  0   0  1
  [ 8] .gnu.version      VERSYM          0000000000013ec4 003ec4 00025e 02   A  6   0  2
  [ 9] .gnu.version_r    VERNEED         0000000000014128 004128 000190 00   A  7   9  8
  [10] .rela.dyn         RELA            00000000000142b8 0042b8 0000d8 18   A  6   0  8
  [11] .rela.plt         RELA            0000000000014390 004390 001968 18  AI  6  24  8
  [12] .plt              PROGBITS        0000000000015d00 005d00 001110 10  AX  0   0 16
  [13] .text             PROGBITS        0000000000016e10 006e10 065688 00  AX  0   0  4
  [14] .rodata           PROGBITS        000000000007c498 06c498 0272c5 00   A  0   0  8
  [15] uftrace.unit_test PROGBITS        00000000000a3760 093760 0005f0 00   A  0   0  8
  [16] .eh_frame_hdr     PROGBITS        00000000000a3d50 093d50 000014 00   A  0   0  4
  [17] .eh_frame         PROGBITS        00000000000a3d68 093d68 00002c 00   A  0   0  8
  [18] .tbss             NOBITS          00000000000a4d78 094d78 0010f0 00 WAT  0   0  8
  [19] .preinit_array    PREINIT_ARRAY   00000000000a4d78 094d78 000008 08  WA  0   0  1
  [20] .init_array       INIT_ARRAY      00000000000a4d80 094d80 000008 08  WA  0   0  8
  [21] .fini_array       FINI_ARRAY      00000000000a4d88 094d88 000008 08  WA  0   0  8
  [22] .dynamic          DYNAMIC         00000000000a4d90 094d90 000270 10  WA  7   0  8
  [23] .data             PROGBITS        00000000000a5000 095000 0029c8 00  WA  0   0  8
  [24] .got              PROGBITS        00000000000a79c8 0979c8 000898 08  WA  0   0  8
  [25] .sdata            PROGBITS        00000000000a8260 098260 000158 00  WA  0   0  8
  [26] .sbss             NOBITS          00000000000a83b8 0983b8 000459 00  WA  0   0  8
  [27] .bss              NOBITS          00000000000a8818 0983b8 002f48 00  WA  0   0  8
  [28] .comment          PROGBITS        0000000000000000 0983b8 00003c 01  MS  0   0  1
  [29] .riscv.attributes RISCV_ATTRIBUTES 0000000000000000 0983f4 000039 00      0   0  1
  [30] .debug_info       PROGBITS        0000000000000000 09842d 0b0987 00      0   0  1
  [31] .debug_abbrev     PROGBITS        0000000000000000 148db4 00bc54 00      0   0  1
  [32] .debug_line       PROGBITS        0000000000000000 154a08 0754fb 00      0   0  1
  [33] .debug_frame      PROGBITS        0000000000000000 1c9f08 00b7b8 00      0   0  8
  [34] .debug_str        PROGBITS        0000000000000000 1d56c0 014dc3 01  MS  0   0  1
  [35] .debug_loc        PROGBITS        0000000000000000 1ea483 0a74be 00      0   0  1
  [36] .debug_ranges     PROGBITS        0000000000000000 291941 01cbe0 00      0   0  1
  [37] .symtab           SYMTAB          0000000000000000 2ae528 3147d8 18     38 133482  8
  [38] .strtab           STRTAB          0000000000000000 5c2d00 008e88 00      0   0  1
  [39] .shstrtab         STRTAB          0000000000000000 5cbb88 00018b 00      0   0  1

@gichoel
Copy link
Contributor Author

gichoel commented Oct 4, 2023

umm... In the last discussion, we discussed an issue with the execution of the make unittest command.
This was analyzed by @honggyukim and he concluded that it was just a gcc issue in the RISC-V environment, so we decided to move on to clang.

I don't see anything further mentioned in that PR, so I'm curious.
My recollection is that there were no unresolved issues prior to the make unittest issue.
Are there any further issues being investigated, or should we wait for others to analyze this further?

PS : I've been working on researching and analyzing material for a porting of the PLT hooking feature for a few weeks now, and I'm still working on it.

So if more testing is needed or there are additional issues, there's no need to rush this PR.

@namhyung
Copy link
Owner

namhyung commented Oct 5, 2023

Sorry I didn't follow the development lately and thought there are more remaining issues. The test failures on GCC are unfortunate but as long as it works on clang, there's should be no fundamental problems. I'm more concerning about the PLT problems and looking forward to seeing a solution soon.

If @honggyukim is ok, I can merge this.

@honggyukim
Copy link
Collaborator

Sorry @gichoel, we missed further discussion after going into the unittest topic. I'm okay for almost most of the implementation.

But I just found that the floating point argument handling is broken. You can reproduce the problem as follows.

$ ./runtest.py -vdp -O0 -c clang 083
      ...
t083_arg_float: diff result of clang -pg -O0
--- expect      2023-10-05 13:19:42.213047344 +0000
+++ result      2023-10-05 13:19:42.213047344 +0000
@@ -1,6 +1,6 @@
 main() {
-   float_add(-0.100000, 0.200000) = 0.100000;
-   float_sub(0.100000, 0.200000) = -0.100000;
-   float_mul(300.000000, 400.000000) = 120000.000000;
-   float_div(40000000000.000000, -0.020000) = -2000000000000.000000;
+   float_add(-0.100000, 0.200000) = nan;
+   float_sub(0.100000, 0.200000) = nan;
+   float_mul(300.000000, 400.000000) = 0.000000;
+   float_div(40000000000.000000, -0.020000) = 0.000000;
  } /* main */

083 arg_float           : NG

Except for this issue, I'm fine with this work for merging into master.

@gichoel
Copy link
Contributor Author

gichoel commented Oct 6, 2023

Hi, @namhyung and @honggyukim.

I really appreciate you both taking the time to review and give feedback each time, I'm learning a lot from you :)

make unittest with clang there was no problem, so I think the new issue is an exceptional case that only happens in the Python test code.

Hmm... I wonder if it might be related to the x8, x18 registers issue identified in the aarch64 architecture, but I'm not sure, so I'll have to keep my mind open and analyze various possibilities.

@gichoel
Copy link
Contributor Author

gichoel commented Oct 9, 2023

I am investigating a previously presented issue that occurs when running the command $ ./runtest.py -vdp -O0 -c clang 083.

let's summarize conclusions, which are twofold

  1. $ ./runtest.py -vdp -O0 -c clang 083 was also unable to get a return value in the RISC-V QEMU environment.
  2. make unittest failed with the same ELF, SIG error with clang on RISC-V QEMU.
    - On the RISC-V QEMU environment, even with clang, the results of running objdump and readelf are the same as running them with gcc on the visionfive2.

the reason I did the second task was to see if other RISC-V environments had the same issue with the make unittest command.

Previously, it didn't happen when using clang in make unittest, so we just let it go.

However, it turns out that this is not the case in the RISC-V QEMU environment.
Should we look into make unittest issue first and analyze the cause?


Each environment used for testing is shown below.

[visionfive2 Board - Debian 12 ]

$ clang --version
Debian clang version 13.0.1-6
Target: riscv64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ uname -a
Linux starfive 5.15.0-starfive #1 SMP Mon Dec 19 07:56:37 EST 2022 riscv64 GNU/Linux

$ cat /etc/issue
Debian GNU/Linux bookworm/sid

[RISC-V QEMU - Ubuntu 23.04 ]

$ clang --version
Ubuntu clang version 15.0.7
Target: riscv64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ uname -a
Linux ubuntu 6.2.0-19-generic #19.1-Ubuntu SMP Fri Mar 31 12:41:53 UTC 2023 riscv64 riscv64 riscv64 GNU/Linux

$ cat /etc/issue
Ubuntu 23.04

[RISC-V QEMU - Ubuntu 22.04 ]

$ clang --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: riscv64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ uname -a
Linux ubuntu 5.19.0-1021-generic #23~22.04.1-Ubuntu SMP Thu Jun 22 12:49:35 UTC 2023 riscv64 riscv64 riscv64 GNU/Linux

$ cat /etc/issue
Ubuntu 22.04.3 LTS

@namhyung
Copy link
Owner

Can you please build tests/s-exp-float.c manually and check the output?

$ gcc -pg -g -o t-exp-float  tests/s-exp-float.c
$ uftrace -a ./t-exp-float

@gichoel
Copy link
Contributor Author

gichoel commented Oct 10, 2023

I manually compiled tests/s-exp-float.c as advised and ran it twice as shown below.

$ gcc -pg -g -o t-exp-float tests/s-exp-float.c

$ ./uftrace --libmcount-path=. -a ./t-exp-float
# duration tid function
            [ 881] | main(0x3fa04631d4, 0x3fc2ac3298) {
   5.500 us [ 881] | float_add(-0.100000, 0.200000) = -86.095337;
   2.500 us [ 881] | float_sub(0.100000, 0.200000) = -86.095337;
   1.700 us [ 881] | float_mul(300.000000, 400.000000) = 0.000000;
   1.500 us [ 881] | float_div(40000000000.000000, -0.020000) = 0.000000;
  34.401 us [ 881] | } = 0; /* main */

$ ./uftrace --libmcount-path=. -a ./t-exp-float
# duration tid function
            [ 934] | main(0x3f88fca1d4, 0x3fda4cb298) {
   5.700 us [ 934] | float_add(-0.100000, 0.200000) = -14403825662164992.000000;
   2.500 us [ 934] | float_sub(0.100000, 0.200000) = -14403825662164992.000000;
   1.400 us [ 934] | float_mul(300.000000, 400.000000) = 0.000000;
   1.400 us [ 934] | float_div(40000000000.000000, -0.020000) = 0.000000;
  33.500 us [ 934] | } = 0; /* main */

@namhyung
Copy link
Owner

Yeah, the return value seems not correct. But I don't think it should be a blocker at this point.

Copy link
Collaborator

@honggyukim honggyukim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to fix it but I also think that it doesn't have block it from being merged into master. I will give LGTM now. Thanks very much for your effort and great contribution!

@namhyung
Copy link
Owner

Good, can you please rebase your changes onto the current master?

@gichoel
Copy link
Contributor Author

gichoel commented Oct 14, 2023

It's great to see the early work on RISC-V support finally coalescing into master, but I'm a little confused.

That's because there are still issues with make unittest and return value.

The make unittest issue seems to be a compiler issue, which I understand.
But I wonder why they are merged when the return value issue exists.

@honggyukim Can we talk about this issue soon?

@namhyung Yes, I will deal with it right away.

gichoel and others added 3 commits October 14, 2023 22:39
This is a commit to ensure a successful build in a RISC-V 64bit
environment, no functional behavior is implemented.

Tested-by: Seonghee Jin <shjy180909@gmail.com>
Co-authored-by: Honggyu Kim <honggyu.kp@gmail.com>
Signed-off-by: Gichoel Choi <gichoel3101@gmail.com>
In order to figure out the address of parent_loc, we need the frame
pointer, but compiler optimization such as `-O2` in gcc removes the
`fp` so we won't be able know where to change to hijack the return
address to `mcount_return`.

This problem only happens in gcc, but not in clang.

To avoid the problem, `-fno-omit-frame-pointer` must be used when gcc
optimization option is used in riscv64.

Tested-by: Seonghee Jin <shjy180909@gmail.com>
Tested-by: Paran Lee <p4ranlee@gmail.com>
Co-authored-by: Honggyu Kim <honggyu.kp@gmail.com>
Signed-off-by: Gichoel Choi <gichoel3101@gmail.com>
This is work to support argument handling on the RISC-V 64-bit
architecture.

If you are curious about how RISC-V 64bit register numbers map to
DWARF register numbers, please refer to the 'RISC-V Run-time ABI
Specification' in 'RISC-V ELF psABI'.

Tested-by: Seonghee Jin <shjy180909@gmail.com>
Tested-by: Jungmin Kim <jungmin82@gmail.com>
Tested-by: SeokMin Kwon <ksm012015@naver.com>
Reviewed-by: Honggyu Kim <honggyu.kp@gmail.com>
Signed-off-by: Gichoel Choi <gichoel3101@gmail.com>
@honggyukim
Copy link
Collaborator

But I wonder why they are merged when the return value issue exists.

It's because we don't want to hold this PR too long and so we can merge this early then handle floating point return value issue as a bug fix later.

We're also aware that this work doesn't include plthook support so we know that this is incomplete somehow, but supporting all the features for a new architecture at once is difficult. We rather think that it'd be better to support RISC-V port incrementally.

So I think it's good to consider this PR has completed the initial milestone for the RISC-V support.

Copy link
Collaborator

@honggyukim honggyukim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rebase is done so I can give LGTM again.

@gichoel
Copy link
Contributor Author

gichoel commented Oct 14, 2023

I now understand why it was merged, thank you.

@namhyung namhyung merged commit b5c3bc6 into namhyung:master Oct 15, 2023
3 checks passed
@namhyung
Copy link
Owner

I've pushed check/riscv-fpret. Can you please take a look?

@gichoel
Copy link
Contributor Author

gichoel commented Oct 16, 2023

I ran the check/riscv-fpret branch code on my RISC-V board, and got the following results.

$ clang -pg -g -o t-exp-float tests/s-exp-float.c
$ ./uftrace --libmcount-path=. -a ./t-exp-float
# DURATION     TID     FUNCTION
            [  3784] | main(1, 0x3fe497a218) {
   1.250 us [  3784] |   float_add(-0.100000, 0.200000) = 0.100000;
   0.750 us [  3784] |   float_sub(0.100000, 0.200000) = -0.100000;
   0.750 us [  3784] |   float_mul(300.000000, 400.000000) = 120000.000000;
   0.750 us [  3784] |   float_div(40000000000.000000, -0.020000) = -2000000000000.000000;
  14.750 us [  3784] | } = 0; /* main */

Wow... this surprisingly solved the floating point return value issue.


I was curious as to how it was done, so I analyzed the commit and came up with the following conclusions.

  • Since it was a return value issue, the mcount_return function needs to be modified.
  • (in mcount.S) Save fa0, the first return register of the floating point, to the stack before calling the mcount_exit function.
  • (in mcount-support.c) Save the value of the fa0 register directly to ctx->val.v in mcount_arch_get_retval, the function that gets the return value of the function.

However, I couldn't figure out why the problem occurred with the existing method of using the float_retval variable.

@namhyung
Copy link
Owner

You used ctx->retval - 2 to record the return value but forgot to save it in mcount_return(). In the above code, it'd be +2 instead -2 though. Also we can just use fa0 directly (should be same as ctx->retval + 2).

FYI, the return value is passed to mcount_exit() as the first argument in a0. The sp should point to a stack location that contains the following data.

        ...
        |--------------------------|
     40 | ra (return address)      |
     32 | fp (frame pointer)       |
        |     (rest of fa0       ) |
     16 | fa0 (return value in FP) |
      8 | a1 (return value2)       |
sp    0 | a0 (return value1)       |   <---  ctx->retval
        |--------------------------|

@gichoel
Copy link
Contributor Author

gichoel commented Oct 17, 2023

Thanks for the explanation.

Your explanation above helped me understand why it is possible to avoid using the float_retval variable and what I was missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants