xen0n
Contributor

@xen0n xen0n commented Dec 26, 2021

This is based on the WIP Linux port still under review, but the port is
re-using asm-generic syscall numbers and parameters, so breakage should
be minimal when the port is eventually merged, if at all.

Currently only the LP64* ABIs are implemented in the toolchains, so only
support the host type "loongarch64".

  • NEWS: Mention this.
  • configure.ac [$host_cpu == loongarch64]: Define LOONGARCH64.
  • src/Makefile.am (EXTRA_DIST): Add the new files.
  • src/linux/loongarch64/arch_defs_.h: New file.
  • src/linux/loongarch64/arch_regs.c: Likewise.
  • src/linux/loongarch64/get_error.c: Likewise.
  • src/linux/loongarch64/get_scno.c: Likewise.
  • src/linux/loongarch64/get_syscall_args.c: Likewise.
  • src/linux/loongarch64/ioctls_arch0.h: Likewise.
  • src/linux/loongarch64/ioctls_inc0.h: Likewise.
  • src/linux/loongarch64/raw_syscall.h: Likewise.
  • src/linux/loongarch64/set_error.c: Likewise.
  • src/linux/loongarch64/set_scno.c: Likewise.
  • src/linux/loongarch64/syscallent.h: Likewise.
  • src/xlat/elf_em.in: Add EM_LOONGARCH.

* SPDX-License-Identifier: LGPL-2.1-or-later
*/

#define PERSONALITY0_AUDIT_ARCH { AUDIT_ARCH_LOONGARCH64, 0 }
Member

Please add AUDIT_ARCH_LOONGARCH64 to src/xlat/audit_arch.in, too.

Contributor Author

Sure; I'll do so later (after finishing my $DAY_JOB).

@ldv-alt
Member

ldv-alt commented Jan 7, 2022

I've tested this on gcc401.fsffrance.org:
Linux gcc401 5.15.0+ #348 SMP PREEMPT Thu Dec 30 13:46:50 CST 2021 loongarch64 GNU/Linux

Testsuite summary:

FAIL: bpf-success.gen
FAIL: bpf-success-v.gen
FAIL: clone3-success.gen
FAIL: clone3-success-Xabbrev.gen
FAIL: clone3-success-Xraw.gen
FAIL: clone3-success-Xverbose.gen
FAIL: landlock_create_ruleset-success.gen
FAIL: landlock_create_ruleset-success-y.gen
FAIL: prctl-fp-mode.gen
FAIL: prctl-spec-inject.gen
FAIL: prlimit64-success.gen
FAIL: prlimit64-success--pidns-translation.gen
FAIL: quotactl-success.gen
FAIL: quotactl-success-v.gen
FAIL: quotactl-xfs-success.gen
FAIL: quotactl-xfs-success-v.gen
FAIL: seccomp_get_notif_sizes-success.gen
FAIL: bpf-success-long-y
FAIL: qual_fault-exit_group
FAIL: syslog-success
FAIL: poke-ptrace
FAIL: poke-unaligned
FAIL: printpath-umovestr-legacy
FAIL: printstrn-umoven-legacy
FAIL: qual_fault
FAIL: qual_inject-error-signal
FAIL: qual_inject-retval

In other words, all tests based on syscall tampering are failing there.

Assuming that src/linux/loongarch64/set_scno.c and src/linux/loongarch64/set_error.c are correct,
the kernel seems to be missing syscall tampering support.

@codecov

codecov bot commented Jan 7, 2022

Codecov Report

Merging #205 (2bf68c3) into master (6efe7d7) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #205      +/-   ##
==========================================
- Coverage   89.88%   89.88%   -0.01%     
==========================================
  Files         287      287              
  Lines       24047    24047              
==========================================
- Hits        21615    21614       -1     
- Misses       2432     2433       +1     
Impacted Files Coverage Δ
src/pidns.c 94.44% <0.00%> (-0.51%) ⬇️

@xen0n
Contributor Author

xen0n commented Jan 7, 2022

cc @chenhuacai

@xen0n
Contributor Author

xen0n commented Jan 7, 2022

> I've tested this on gcc401.fsffrance.org:
> Linux gcc401 5.15.0+ #348 SMP PREEMPT Thu Dec 30 13:46:50 CST 2021 loongarch64 GNU/Linux
>
> Testsuite summary:
>
> FAIL: bpf-success.gen
> FAIL: bpf-success-v.gen
> FAIL: clone3-success.gen
> FAIL: clone3-success-Xabbrev.gen
> FAIL: clone3-success-Xraw.gen
> FAIL: clone3-success-Xverbose.gen
> FAIL: landlock_create_ruleset-success.gen
> FAIL: landlock_create_ruleset-success-y.gen
> FAIL: prctl-fp-mode.gen
> FAIL: prctl-spec-inject.gen
> FAIL: prlimit64-success.gen
> FAIL: prlimit64-success--pidns-translation.gen
> FAIL: quotactl-success.gen
> FAIL: quotactl-success-v.gen
> FAIL: quotactl-xfs-success.gen
> FAIL: quotactl-xfs-success-v.gen
> FAIL: seccomp_get_notif_sizes-success.gen
> FAIL: bpf-success-long-y
> FAIL: qual_fault-exit_group
> FAIL: syslog-success
> FAIL: poke-ptrace
> FAIL: poke-unaligned
> FAIL: printpath-umovestr-legacy
> FAIL: printstrn-umoven-legacy
> FAIL: qual_fault
> FAIL: qual_inject-error-signal
> FAIL: qual_inject-retval
>
> In other words, all tests based on syscall tampering are failing there.
>
> Assuming that src/linux/loongarch64/set_scno.c and src/linux/loongarch64/set_error.c are correct, the kernel seems to be missing syscall tampering support.

Okay... this is the same result I've been seeing all the time (and exactly why I haven't pushed the xlat changes recently).

It does indeed seem that ptrace functionality is missing, judging from the latest kernel source: I cannot see any syscall tracing machinery there. Maybe @chenhuacai could implement this in the next revision of the loongarch-next series?

@ldv-alt
Member

ldv-alt commented Jan 7, 2022

> > In other words, all tests based on syscall tampering are failing there.
> > Assuming that src/linux/loongarch64/set_scno.c and src/linux/loongarch64/set_error.c are correct, the kernel seems to be missing syscall tampering support.
>
> Okay... this is the same result I've been seeing all the time (and exactly why I haven't pushed the xlat changes recently).
>
> It indeed seems ptrace functionality is missing, judging from the latest kernel source. Syscall tracing machinery cannot be seen. Maybe @chenhuacai could implement this in next revision of loongarch-next series?

There is some support for changing the syscall nr in do_syscall, but, apparently, it doesn't work as expected.

@ldv-alt
Member

ldv-alt commented Jan 7, 2022

Well, the syscall cancellation itself seems to be working, it's the return value handling that is not implemented properly in the kernel:

[gcc401 src]$ cat close2.c
#include <unistd.h>
int main() {
	dup2(0, 42);
	close(42);
	close(42);
	return 0;
}
[gcc401 src]$ ./strace -qq -eclose ./close2
close(3)                                = 0
close(42)                               = 0
close(42)                               = -1 EBADF (Bad file descriptor)
[gcc401 src]$ ./strace -qq -eclose -einject=close:error=enosys:when=2 ./close2
close(3)                                = 0
close(42./strace: Failed to tamper with process 2758428: unexpectedly got no error (return value 0x2a, error 0)
)                               = 42 (INJECTED)
close(42)                               = 0
[gcc401 src]$ ./strace -qq -eclose -einject=close:retval=9:when=2 ./close2
close(3)                                = 0
close(42./strace: Failed to tamper with process 2758433: unexpectedly got no error (return value 0x2a, error 0)
)                               = 42 (INJECTED)
close(42)                               = 0
[gcc401 src]$ ./strace -qq -esignal=none -eptrace ./strace -qq -eclose -einject=close:error=enosys:when=2 -o/dev/null ./close2
...
ptrace(PTRACE_GET_SYSCALL_INFO, 2758442, 88, {op=PTRACE_SYSCALL_INFO_ENTRY, arch=AUDIT_ARCH_LOONGARCH64, instruction_pointer=0x7ffff2fab930, stack_pointer=0x7ffffbb55fb0, entry={nr=__NR_close, args=[0x2a, 0x2a, 0, 0x7ffff303c538, 0, 0x7ffff305af90]}}) = 80
ptrace(PTRACE_GETREGSET, 2758442, NT_PRSTATUS, {iov_base={regs=[0, 0x12000054c, 0x7ffff307af90, 0x7ffffbb55fb0, 0x2a, 0x2a, 0, 0x7ffff303c538, 0, 0x7ffff305af90, 0x7ffffbb56140, 0x39, 0x7ffff3079200, 0x8, 0x1200084f0, 0x7ffff3061090, 0x1, 0x101010101010101, 0x7f7f7f7f7f7f7f7f, 0x10, 0x2, 0x7ac0000, 0, 0x7ffffbb56148, 0x1, 0, 0x120000530, 0x7ffff3077228, 0, 0x7ffffbb56158, 0x120007e38, 0], csr_era=0x7ffff2fab930, csr_badv=0x7ffff2fab9f0, reserved=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, iov_len=360}) = 0
ptrace(PTRACE_SETREGSET, 2758442, NT_PRSTATUS, {iov_base={regs=[0, 0x12000054c, 0x7ffff307af90, 0x7ffffbb55fb0, 0x2a, 0x2a, 0, 0x7ffff303c538, 0, 0x7ffff305af90, 0x7ffffbb56140, 0xffffffffffffffff, 0x7ffff3079200, 0x8, 0x1200084f0, 0x7ffff3061090, 0x1, 0x101010101010101, 0x7f7f7f7f7f7f7f7f, 0x10, 0x2, 0x7ac0000, 0, 0x7ffffbb56148, 0x1, 0, 0x120000530, 0x7ffff3077228, 0, 0x7ffffbb56158, 0x120007e38, 0], csr_era=0x7ffff2fab930, csr_badv=0x7ffff2fab9f0, reserved=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, iov_len=360}) = 0
ptrace(PTRACE_SYSCALL, 2758442, NULL, 0) = 0
ptrace(PTRACE_GET_SYSCALL_INFO, 2758442, 88, {op=PTRACE_SYSCALL_INFO_EXIT, arch=AUDIT_ARCH_LOONGARCH64, instruction_pointer=0x7ffff2fab934, stack_pointer=0x7ffffbb55fb0, exit={rval=42, is_error=0}}) = 33
./strace: Failed to tamper with process 2758442: unexpectedly got no error (return value 0x2a, error 0)
...

You can see from this example that in case of syscall cancellation the first syscall argument is returned unconditionally as the return value.
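
The effect seen in the log can be reproduced with a small user-space model. This is only a sketch and not kernel code: the struct, the callback type and every `cancel_model_*` name are hypothetical. The details taken from the trace are that regs[11] (a7) carries the syscall number, that regs[4] (a0) carries both the first argument and the return value, and that strace cancels the syscall by writing -1 into regs[11]. The model assumes the entry path skips the syscall body without touching regs[4] when it sees the cancellation.

```c
#include <errno.h>

#define CANCEL_MODEL_NR_SYSCALLS 512UL	/* hypothetical table size */

/* Hypothetical stand-in for the kernel's register frame. */
struct cancel_model_regs {
	unsigned long regs[32];	/* regs[4] is a0, regs[11] is a7 (syscall nr) */
};

/* A "tracer" is modelled as a callback run at the syscall-entry stop. */
typedef void (*cancel_model_tracer)(struct cancel_model_regs *);

/* Cancels the syscall the way the log shows: a7 is set to -1. */
void cancel_model_cancel(struct cancel_model_regs *r)
{
	r->regs[11] = -1UL;
}

/* Toy syscall body: always succeeds. */
unsigned long cancel_model_sys_close(unsigned long fd)
{
	(void) fd;
	return 0;
}

/* Entry path that leaves regs[4] alone when the syscall was cancelled. */
long cancel_model_do_syscall(struct cancel_model_regs *r,
			     cancel_model_tracer tracer)
{
	unsigned long onr = r->regs[11];
	unsigned long nr;

	if (tracer)
		tracer(r);
	nr = r->regs[11];

	if (nr == -1UL && onr != -1UL)
		return (long) r->regs[4];	/* a0 leaks out as the result */

	if (nr < CANCEL_MODEL_NR_SYSCALLS)
		r->regs[4] = cancel_model_sys_close(r->regs[4]);
	else
		r->regs[4] = (unsigned long) -ENOSYS;

	return (long) r->regs[4];
}
```

With regs[4] = 42, regs[11] = 0x39 and `cancel_model_cancel` as the tracer, the model returns 42, which corresponds to the "= 42 (INJECTED)" line above.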

@ldv-alt
Member

ldv-alt commented Jan 7, 2022

The problematic piece of kernel code is

        if ((nr == -1UL) && (onr != -1UL))
                goto out; /* Don't set -ENOSYS for SECCOMP */

Here the "Don't set -ENOSYS for SECCOMP" shortcut is the culprit:
when the syscall is cancelled by ptrace/seccomp, the return value must be set properly to -ENOSYS.

@ldv-alt
Member

ldv-alt commented Jan 7, 2022

> The problematic piece of kernel code is
>
>         if ((nr == -1UL) && (onr != -1UL))
>                 goto out; /* Don't set -ENOSYS for SECCOMP */
>
> Here the "Don't set -ENOSYS for SECCOMP" shortcut is the culprit: when the syscall is cancelled by ptrace/seccomp, the return value must be set properly to -ENOSYS.

This piece of code is correct for SECCOMP because the latter calls syscall_set_return_value itself.

Unfortunately, this approach is not applicable for ptrace:
ptracer cannot set syscall return value on entering syscall
because the register used for the return value is often used for other purposes.
For example, on x86 it's also used for the syscall nr, and on loongarch it's also used for the first syscall argument.

I'm afraid syscall_enter_from_user_mode needs to be changed before it could be used on loongarch.
Besides loongarch, the only architecture where syscall_enter_from_user_mode is used
is x86 where rax is initialized to -ENOSYS before syscall_enter_from_user_mode is invoked.

@chenhuacai

> The problematic piece of kernel code is
>
>         if ((nr == -1UL) && (onr != -1UL))
>                 goto out; /* Don't set -ENOSYS for SECCOMP */
>
> Here the "Don't set -ENOSYS for SECCOMP" shortcut is the culprit: when the syscall is cancelled by ptrace/seccomp, the return value must be set properly to -ENOSYS.
>
> This piece of code is correct for SECCOMP because the latter calls syscall_set_return_value itself.
>
> Unfortunately, this approach is not applicable for ptrace: ptracer cannot set syscall return value on entering syscall because the register used for the return value is often used for other purposes. For example, on x86 it's also used for the syscall nr, and on loongarch it's also used for the first syscall argument.
>
> I'm afraid syscall_enter_from_user_mode needs to be changed before it could be used on loongarch. Besides loongarch, the only architecture where syscall_enter_from_user_mode is used is x86 where rax is initialized to -ENOSYS before syscall_enter_from_user_mode is invoked.

Hi, could you please change do_syscall() to be like this and test ptrace/strace (I have tested seccomp)?

void noinstr do_syscall(struct pt_regs *regs)
{
	unsigned long nr;
	sys_call_fn syscall_fn;

	nr = regs->regs[11];

	/* Set for syscall restarting */
	if (nr < NR_syscalls)
		regs->regs[0] = nr + 1;

	regs->csr_era += 4;
	regs->orig_a0 = regs->regs[4];
	regs->regs[4] = -ENOSYS;
	nr = syscall_enter_from_user_mode(regs, nr);

	if (nr < NR_syscalls) {
		syscall_fn = sys_call_table[nr];
		regs->regs[4] = syscall_fn(regs->orig_a0, regs->regs[5], regs->regs[6],
					   regs->regs[7], regs->regs[8], regs->regs[9]);
	}

	syscall_exit_to_user_mode(regs);
}

@ldv-alt
Member

ldv-alt commented Jan 8, 2022

> regs->orig_a0 = regs->regs[4];
> regs->regs[4] = -ENOSYS;
> nr = syscall_enter_from_user_mode(regs, nr);

This way the first syscall argument won't be available for ptracers because regs->orig_a0 is not exposed in struct user_pt_regs and not handled by gpr_get and gpr_set.
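
The objection can be illustrated with a user-space model of the proposed ordering. This is a sketch under stated assumptions, not kernel code: all `preset_model_*` names are hypothetical, the tracer is modelled as a callback that receives only the regs[] array (standing in for what struct user_pt_regs exposes), and orig_a0 is deliberately kept out of its reach.

```c
#include <errno.h>

#define PRESET_MODEL_NR_SYSCALLS 512UL	/* hypothetical table size */

struct preset_model_regs {
	unsigned long regs[32];	/* exposed to the tracer in this model */
	unsigned long orig_a0;	/* kernel-private: the tracer never sees it */
};

/* The tracer callback only receives the exposed registers. */
typedef void (*preset_model_tracer)(unsigned long *exposed, void *ctx);

/* Records what a tracer would read as the first argument at syscall entry. */
void preset_model_record_a0(unsigned long *exposed, void *ctx)
{
	*(unsigned long *) ctx = exposed[4];
}

/* Toy syscall body that returns its first argument unchanged. */
unsigned long preset_model_echo(unsigned long a0)
{
	return a0;
}

long preset_model_do_syscall(struct preset_model_regs *r,
			     preset_model_tracer tracer, void *ctx)
{
	unsigned long nr;

	r->orig_a0 = r->regs[4];
	r->regs[4] = (unsigned long) -ENOSYS;	/* preset before the entry stop */

	if (tracer)
		tracer(r->regs, ctx);
	nr = r->regs[11];

	if (nr < PRESET_MODEL_NR_SYSCALLS)
		r->regs[4] = preset_model_echo(r->orig_a0);

	return (long) r->regs[4];
}
```

In this model a tracer that reads a0 at entry sees -ENOSYS instead of the real first argument, while the syscall body still receives the original value through orig_a0.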

@chenhuacai

chenhuacai commented Jan 8, 2022

> > regs->orig_a0 = regs->regs[4];
> > regs->regs[4] = -ENOSYS;
> > nr = syscall_enter_from_user_mode(regs, nr);
>
> This way the first syscall argument won't be available for ptracers because regs->orig_a0 is not exposed in struct user_pt_regs and not handled by gpr_get and gpr_set.

Hmm, can we keep do_syscall() as is and solve the ptrace problem by overriding arch_syscall_enter_tracehook() like this?

static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
{
	int ret = tracehook_report_syscall_entry(regs);

	if (ret)
		syscall_set_return_value(current, current_pt_regs(), -ENOSYS, 0);

	return ret;
}

xen0n added a commit to xen0n/linux that referenced this pull request Jan 8, 2022
see strace/strace#205

Suggested-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
@xen0n
Contributor Author

xen0n commented Jan 8, 2022

> > > regs->orig_a0 = regs->regs[4];
> > > regs->regs[4] = -ENOSYS;
> > > nr = syscall_enter_from_user_mode(regs, nr);
> >
> > This way the first syscall argument won't be available for ptracers because regs->orig_a0 is not exposed in struct user_pt_regs and not handled by gpr_get and gpr_set.
>
> Hmm, can we keep do_syscall() as is and solve the ptrace problem by overriding arch_syscall_enter_tracehook() like this?
>
> static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
> {
> 	int ret = tracehook_report_syscall_entry(regs);
>
> 	if (ret)
> 		syscall_set_return_value(current, current_pt_regs(), -ENOSYS, 0);
>
> 	return ret;
> }

Hi @chenhuacai, I tried in this commit and confirmed it's not working; exactly the same failures as before.

@ldv-alt
Member

ldv-alt commented Jan 9, 2022

> Hmm, can we keep do_syscall() as is and solve the ptrace problem by overriding arch_syscall_enter_tracehook() like this?
>
> static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
> {
> 	int ret = tracehook_report_syscall_entry(regs);
>
> 	if (ret)
> 		syscall_set_return_value(current, current_pt_regs(), -ENOSYS, 0);

It's better to use regs instead of current_pt_regs() here.

> 	return ret;
> }

No, this cannot help to solve the problem because syscall tampering doesn't make tracehook_report_syscall_entry return a nonzero value.

However, this approach makes sense for a different reason - the lengthy comment before tracehook_report_syscall_entry says:
Returns zero normally, or nonzero if the calling arch code should abort the system call. That must prevent normal entry so no system call is made. If @task ever returns to user mode after this, its register state is unspecified, but should be something harmless like an %ENOSYS error return.

@chenhuacai

chenhuacai commented Jan 9, 2022

> > Hmm, can we keep do_syscall() as is and solve the ptrace problem by overriding arch_syscall_enter_tracehook() like this?
> >
> > static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
> > {
> > 	int ret = tracehook_report_syscall_entry(regs);
> >
> > 	if (ret)
> > 		syscall_set_return_value(current, current_pt_regs(), -ENOSYS, 0);
>
> It's better to use regs instead of current_pt_regs() here.
>
> > 	return ret;
> > }
>
> No, this cannot help to solve the problem because syscall tampering doesn't make tracehook_report_syscall_entry return a nonzero value.

I'm puzzled here. I think the original problem is that if syscall_enter_from_user_mode() returns -1 because of ptrace, we should make the syscall return -ENOSYS. And we don't need to do anything if tracehook_report_syscall_entry() returns 0, because in that case syscall_enter_from_user_mode() also returns 0 (there is no SYSCALL_WORK_SYSCALL_EMU on LoongArch), and then do_syscall() will do the real syscall.

> However, this approach makes sense for a different reason - the lengthy comment before tracehook_report_syscall_entry says: Returns zero normally, or nonzero if the calling arch code should abort the system call. That must prevent normal entry so no system call is made. If @task ever returns to user mode after this, its register state is unspecified, but should be something harmless like an %ENOSYS error return.

@ldv-alt
Member

ldv-alt commented Jan 9, 2022

> I'm puzzled here. I think the original problem is that if syscall_enter_from_user_mode() returns -1 because of ptrace, we should make the syscall return -ENOSYS. And we don't need to do anything if tracehook_report_syscall_entry() returns 0, because in that case syscall_enter_from_user_mode() also returns 0 (there is no SYSCALL_WORK_SYSCALL_EMU on LoongArch), and then do_syscall() will do the real syscall.

The problem is that syscall_enter_from_user_mode does a few different things but has a single return value - the new syscall nr. In particular, it calls both arch_syscall_enter_tracehook and __secure_computing, each of them can change pt_regs, but they have different semantics: while __secure_computing is guaranteed to call syscall_set_return_value e.g. in case of SECCOMP_RET_ERRNO, arch_syscall_enter_tracehook has no such option because syscall_set_return_value is not exposed to userspace and regs[4] is used for different purposes on entering and exiting syscall.

btw, another problem with the current implementation of do_syscall is that unlike other architectures, here the ptracer cannot change the first syscall argument on entering syscall.

@xen0n
Contributor Author

xen0n commented Jan 9, 2022

With @chenhuacai's latest patch, every test passes except this:

FAIL: xet_robust_list--pidns-translation.gen
============================================

../../src/strace: Exit of unknown pid 5 ignored
../../src/strace: Exit of unknown pid 7 ignored
../../src/strace: Exit of unknown pid 8 ignored
../../src/strace: Exit of unknown pid 10 ignored
--- exp 2022-01-09 18:28:42.137625044 +0800
+++ out 2022-01-09 18:28:42.145624888 +0800
@@ -19,7 +19,8 @@
 9     get_robust_list(4 /* 9 in strace's PID NS */, [0x7ffff3d23fe8], [24]) = 0
 9     +++ exited with 0 +++
 10    get_robust_list(0, NULL, NULL) = -1 EFAULT (Bad address)
-10    get_robust_list(5 /* 10 in strace's PID NS */, [0x7ffff3f0a970], [24]) = 0
+10    get_robust_list(5 /* 10 in strace's PID NS */,  <unfinished ...>
+10    <... get_robust_list resumed>[0x7ffff3f0a970], [24]) = 0
 10    set_robust_list(0x7ffff3d23fe8, 24) = 0
 10    get_robust_list(5 /* 10 in strace's PID NS */, [0x7ffff3d23fe8], [24]) = 0
 10    +++ exited with 0 +++
xet_robust_list--pidns-translation.gen.test: failed test: unshare -Urpf ../../src/strace --decode-pids=pidns -f -a24 -e trace=get_robust_list,set_robust_list ../xet_robust_list--pidns-translation output mismatch
FAIL xet_robust_list--pidns-translation.gen.test (exit status: 1)

@ldv-alt
Member

ldv-alt commented Jan 9, 2022

> With @chenhuacai's latest patch, every test passes except this:
>
> FAIL: xet_robust_list--pidns-translation.gen
> ...
> -10    get_robust_list(5 /* 10 in strace's PID NS */, [0x7ffff3f0a970], [24]) = 0
> +10    get_robust_list(5 /* 10 in strace's PID NS */,  <unfinished ...>
> +10    <... get_robust_list resumed>[0x7ffff3f0a970], [24]) = 0

This looks like a race, how often do you see this?
Could you try running this particular test a few more times, e.g.
make -s check TESTS=xet_robust_list--pidns-translation.gen.test VERBOSE=1?

@ldv-alt
Member

ldv-alt commented Jan 9, 2022

> With @chenhuacai's latest patch, every test passes except this:

Yes, this looks plausible.

As I said earlier, another problem with the current implementation of do_syscall is that unlike other architectures, here the ptracer cannot change the first syscall argument because it is currently stored in regs->orig_a0 which is out of reach of ptracers. What do you think about exposing regs->orig_a0 to ptracers?

@chenhuacai

> > With @chenhuacai's latest patch, every test passes except this:
>
> Yes, this looks plausible.
>
> As I said earlier, another problem with the current implementation of do_syscall is that unlike other architectures, here the ptracer cannot change the first syscall argument because it is currently stored in regs->orig_a0 which is out of reach of ptracers. What do you think about exposing regs->orig_a0 to ptracers?

Exposing orig_a0 means modifying user_pt_regs, gpr_get(), gpr_set() and all tracers (gdb, strace, etc.), and then there is no need to override arch_syscall_enter_tracehook(), right?

@xen0n
Contributor Author

xen0n commented Jan 9, 2022

> > With @chenhuacai's latest patch, every test passes except this:
> >
> > FAIL: xet_robust_list--pidns-translation.gen
> > ...
> > -10    get_robust_list(5 /* 10 in strace's PID NS */, [0x7ffff3f0a970], [24]) = 0
> > +10    get_robust_list(5 /* 10 in strace's PID NS */,  <unfinished ...>
> > +10    <... get_robust_list resumed>[0x7ffff3f0a970], [24]) = 0
>
> This looks like a race, how often do you see this? Could you try running this particular test a few more times, e.g. make -s check TESTS=xet_robust_list--pidns-translation.gen.test VERBOSE=1?

Hmm, I'm back home, and I just re-ran that failing case multiple times with the command line you provided, and it now passes every time. 🤦

@ldv-alt
Member

ldv-alt commented Jan 9, 2022 via email

@chenhuacai

chenhuacai commented Jan 12, 2022

> On Sun, Jan 09, 2022 at 05:58:44AM -0800, Huacai Chen wrote:
> > > With @chenhuacai's latest patch, every test passes except this:
> > >
> > > Yes, this looks plausible.
> > >
> > > As I said earlier, another problem with the current implementation of do_syscall is that unlike other architectures, here the ptracer cannot change the first syscall argument because it is currently stored in regs->orig_a0 which is out of reach of ptracers. What do you think about exposing regs->orig_a0 to ptracers?
> >
> > Exposing orig_a0 means modifying user_pt_regs, gpr_get(), gpr_set() and all tracers (gdb, strace, etc) and don't need to overriding arch_syscall_enter_tracehook(), right?
>
> Yes, exposing orig_a0 means modifying user_pt_regs, gpr_get(), gpr_set(), and all tracers. This would allow doing regs->regs[4] = -ENOSYS; in do_syscall() right before the syscall_enter_from_user_mode() invocation, and there won't be any need to override arch_syscall_enter_tracehook() then.

Hmm, there is a simpler solution: override arch_syscall_enter_tracehook() as below:

static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
{
	int ret = tracehook_report_syscall_entry(regs);

	regs->orig_a0 = regs->regs[4];
	syscall_set_return_value(current, regs, -ENOSYS, 0);

	return ret;
}
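
The ordering in this override can be checked with a user-space model. It is a sketch only: every `hook_model_*` name is hypothetical, the tracer is a callback that sees just the regs[] array, and the model assumes that do_syscall() afterwards passes orig_a0 as the first argument, as in the earlier do_syscall() proposal in this thread. The point it illustrates is that the entry stop happens first, so the tracer still sees (and may change) the real a0, and only then is a0 saved and the return register preset to -ENOSYS.

```c
#include <errno.h>

#define HOOK_MODEL_NR_SYSCALLS 512UL	/* hypothetical table size */

struct hook_model_regs {
	unsigned long regs[32];	/* regs[4] is a0, regs[11] is a7 (syscall nr) */
	unsigned long orig_a0;	/* saved first argument, not exposed to tracers */
};

/* The tracer callback only receives the exposed registers. */
typedef void (*hook_model_tracer)(unsigned long *exposed);

/* Tracer that cancels the syscall by setting the syscall number to -1. */
void hook_model_cancel(unsigned long *exposed)
{
	exposed[11] = -1UL;
}

/* Tracer that rewrites the first argument at syscall entry. */
void hook_model_set_a0_to_7(unsigned long *exposed)
{
	exposed[4] = 7;
}

/* Toy syscall body that returns its first argument unchanged. */
unsigned long hook_model_echo(unsigned long a0)
{
	return a0;
}

/* Same order as the override: entry stop first, then save a0, then preset. */
void hook_model_enter_tracehook(struct hook_model_regs *r,
				hook_model_tracer tracer)
{
	if (tracer)
		tracer(r->regs);
	r->orig_a0 = r->regs[4];
	r->regs[4] = (unsigned long) -ENOSYS;
}

long hook_model_do_syscall(struct hook_model_regs *r, hook_model_tracer tracer)
{
	unsigned long nr;

	hook_model_enter_tracehook(r, tracer);
	nr = r->regs[11];

	if (nr < HOOK_MODEL_NR_SYSCALLS)
		r->regs[4] = hook_model_echo(r->orig_a0);

	return (long) r->regs[4];
}
```

In the model, a cancelled syscall returns -ENOSYS, a rewritten first argument reaches the syscall body, and an untraced call behaves as before.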

@xen0n
Contributor Author

xen0n commented Jan 12, 2022

The latest suggested modification made all tests pass:

============================================================================
Testsuite summary for strace 5.13.0.317.9f96e2
============================================================================
# TOTAL: 1153
# PASS:  958
# SKIP:  195
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================

(BTW please fix your Markdown syntax @chenhuacai)

@ldv-alt
Member

ldv-alt commented Jan 12, 2022 via email

@ldv-alt
Member

ldv-alt commented Jan 12, 2022 via email

@xen0n
Contributor Author

xen0n commented Jan 13, 2022

> On Wed, Jan 12, 2022 at 04:05:21AM -0800, Dmitry V. Levin wrote:
> > On Wed, Jan 12, 2022 at 01:13:01AM -0800, WÁNG Xuěruì wrote:
> > > The latest suggested modification made all tests pass:
> >
> > Yes, this should work indeed, thanks.
>
> I suppose this PR could be merged as soon as it passes all tests on a cfarm loongarch host.

I've pinged @chenhuacai for arranging the kernel updates for the cfarm hosts; meanwhile the test failures in the commit message could be removed as well.

@chenhuacai

> > On Wed, Jan 12, 2022 at 04:05:21AM -0800, Dmitry V. Levin wrote:
> > > On Wed, Jan 12, 2022 at 01:13:01AM -0800, WÁNG Xuěruì wrote:
> > > > The latest suggested modification made all tests pass:
> > >
> > > Yes, this should work indeed, thanks.
> >
> > I suppose this PR could be merged as soon as it passes all tests on a cfarm loongarch host.
>
> I've pinged @chenhuacai for arranging the kernel updates for the cfarm hosts; meanwhile the test failures in the commit message could be removed as well.

The kernel has been updated.

@xen0n
Contributor Author

xen0n commented Jan 13, 2022

I've just rebased to latest master branch and removed the test failure message from the commit message; I'll mark this as ready after re-running the test suite.

@xen0n
Contributor Author

xen0n commented Jan 13, 2022

Tests pass on my development box:

============================================================================
Testsuite summary for strace 5.13.0.368.351b
============================================================================
# TOTAL: 1184
# PASS:  959
# SKIP:  225
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================

uname -a: Linux lily 5.16.0-rc8-00066-g60ef89495465 #3 SMP PREEMPT Thu Nov 25 11:42:54 PM CST 2021 loongarch64 GNU/Linux (built from this commit)

I think this should be ready now. Thanks for the help and revision to my code!

@xen0n xen0n marked this pull request as ready for review January 13, 2022 11:41
@xen0n xen0n changed the title [RFC] Add 64-bit LoongArch support Add 64-bit LoongArch support Jan 13, 2022
xen0n and others added 3 commits January 13, 2022 19:43
This is based on the WIP Linux port still under review, but the port is
re-using asm-generic syscall numbers and parameters, so breakage should
be minimal when the port is eventually merged, if at all.

Currently only the LP64* ABIs are implemented in the toolchains, so only
support the host type "loongarch64".

* NEWS: Mention this.
* configure.ac [$host_cpu == loongarch64]: Define LOONGARCH64.
* src/linux/loongarch64/arch_defs_.h: New file.
* src/linux/loongarch64/arch_regs.c: Likewise.
* src/linux/loongarch64/get_error.c: Likewise.
* src/linux/loongarch64/get_scno.c: Likewise.
* src/linux/loongarch64/get_syscall_args.c: Likewise.
* src/linux/loongarch64/ioctls_arch0.h: Likewise.
* src/linux/loongarch64/ioctls_inc0.h: Likewise.
* src/linux/loongarch64/raw_syscall.h: Likewise.
* src/linux/loongarch64/set_error.c: Likewise.
* src/linux/loongarch64/set_scno.c: Likewise.
* src/linux/loongarch64/syscallent.h: Likewise.
* src/Makefile.am (EXTRA_DIST): Add them.
* src/xlat/elf_em.in: Add EM_LOONGARCH.

Link: https://lore.kernel.org/lkml/20211013063656.3084555-1-chenhuacai@loongson.cn/
* src/xlat/audit_arch.in (AUDIT_ARCH_LOONGARCH32,
AUDIT_ARCH_LOONGARCH64): New constants.

Link: https://lore.kernel.org/lkml/20211013063656.3084555-4-chenhuacai@loongson.cn/
* src/linux/loongarch64/arch_fpregset.c: New file.
* src/linux/loongarch64/arch_fpregset.h: Likewise.
* src/linux/loongarch64/arch_prstatus_regset.c: Likewise.
* src/linux/loongarch64/arch_prstatus_regset.h: Likewise.
* src/Makefile.am (EXTRA_DIST): Add them.
* tests/ptrace.c [__loongarch__] (TRACEE_REGS_STRUCT): Define.
(print_prstatus_regset, print_fpregset) [__loongarch__]: Update
expected output.
@xen0n
Contributor Author

xen0n commented Jan 13, 2022

(For the record: the HEAD before my force-push was 644ac28, and if the missing syscall tampering support is to be called out (as is the case for the linked version of Linux port), the commit message could be recovered from there.)

Member

@ldv-alt ldv-alt left a comment

Tested on gcc401.fsffrance.org:
Linux gcc401 5.16.0+ #1266 SMP PREEMPT Thu Jan 13 18:08:10 CST 2022 loongarch64 GNU/Linux

Testsuite summary for strace 5.16.0.5.351b
============================================================================
# TOTAL: 1181
# PASS:  961
# SKIP:  220
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

@ldv-alt ldv-alt merged commit 2bf68c3 into strace:master Jan 13, 2022
@xen0n xen0n deleted the loongarch-master branch January 14, 2022 01:10
@chenhuacai

I'm sorry, but we still have to export orig_a0, because the current solution does not let ptrace modify arg0 and the return value at the same time.

@xen0n
Contributor Author

xen0n commented Jan 23, 2022

> I'm sorry that we still have to export orig_a0, because the current solution cannot let ptrace to modify arg0 and return value at the same time.

Well, full user-space rebuilds are ready, it's only a matter of putting out the new branch and corresponding glibc sources on your part before I can test...

@xen0n
Contributor Author

xen0n commented Jan 23, 2022

Also I don't think the already merged strace code is affected, but it's nice for you to sync this info with us. 👍

@ldv-alt
Member

ldv-alt commented Jan 23, 2022 via email

@ldv-alt
Member

ldv-alt commented Jan 23, 2022 via email

@chenhuacai

> On Sun, Jan 23, 2022 at 02:49:27AM -0800, Huacai Chen wrote:
> > I'm sorry that we still have to export orig_a0, because the current solution cannot let ptrace to modify arg0 and return value at the same time.
>
> What ptracers usually do is tampering with the syscall number and/or its arguments on entering syscall and tampering with the return value on exiting syscall. Tampering with the return value on entering syscall is not reliable as it is not supported by all linux architectures. That is, I don't see why the change is necessary.

Could you please look at tools/testing/selftests/seccomp/seccomp_bpf.c in the kernel tree? TRACE_syscall.ptrace.syscall_errno and TRACE_syscall.ptrace.syscall_faked fail with the current solution.

@ldv-alt
Member

ldv-alt commented Jan 23, 2022 via email

@xen0n
Contributor Author

xen0n commented Jan 23, 2022

> On Sun, Jan 23, 2022 at 02:55:10AM -0800, WÁNG Xuěruì wrote:
> > Also I don't think the already merged strace code is affected, but it's nice for you to sync this info with us. 👍
>
> If struct user_pt_regs is changed, then at least its decoder and the test
> (src/linux/loongarch64/arch_prstatus_regset.c and tests/ptrace.c)
> would have to be updated.

Oops, of course; I didn't implement the bits in my first take but the changes are definitely needed. Lucky we still have the chance to do that before the ABI freeze...
