Skip to content

Commit

Permalink
Adds a new ioctl32 syscall for backwards compatibility layers
Browse files Browse the repository at this point in the history
Problem presented:
A backwards compatibility layer that allows running x86-64 and x86
processes inside of an AArch64 process.
  - CPU is emulated
  - Syscall interface is mostly passthrough
  - Some syscalls require patching or emulation depending on behaviour
  - Not viable from the emulator design to use an AArch32 host process

x86-64 and x86 userspace emulator source:
https://github.com/FEX-Emu/FEX
Usage of ioctl32 is currently in a downstream fork. This will be the
first user of the syscall.

Cross documentation:
https://github.com/FEX-Emu/FEX/wiki/32Bit-x86-Woes#ioctl---54

ioctls are opaque from the emulator perspective and the data wants to be
passed through a syscall as unimpeded as possible.
Sadly due to ioctl struct differences between x86 and x86-64, we need a
syscall that exposes the compatibility ioctl handler to userspace in a
64bit process.

This is necessary behaves of the behaviour differences that occur
between an x86 process doing an ioctl and an x86-64 process doing an
ioctl.

Both of which are captured and passed through the AArch64 ioctl space.
This is implementing a new ioctl32 syscall that allows us to pass 32bit
x86 ioctls through to the kernel with zero or minimal manipulation.

The only supported hosts where we care about this currently is AArch64
and x86-64 (For testing purposes).
PPC64LE, MIPS64LE, and RISC-V64 might be interesting to support in the
future; But I don't have any platforms that get anywhere near Cortex-A77
performance in those architectures. Nor do I have the time to bring up
the emulator on them.
x86-64 can get to the compatibility ioctl through the int $0x80 handler.

This does not solve the following problems:
1) compat_alloc_user_space inside ioctl
2) ioctls that check task mode instead of entry point for behaviour
3) ioctls allocating memory
4) struct packing problems between architectures

Workarounds for the problems presented:
1a) Do a stack pivot to the lower 32bits from userspace
  - Forces host 64bit process to have its thread stacks to live in 32bit
  space. Not ideal.
  - Only do a stack pivot on ioctl to save previous 32bit VA space
1b) Teach kernel that compat_alloc_userspace can return a 64bit pointer
  - x86-64 truncates stack from this function
  - AArch64 returns the full stack pointer
  - Only ~29 users. Validating all of them support a 64bit stack is
  trivial?

2a) Any application using these can be checked for compatibility in
userspace and put on a block list.
2b) Fix any ioctls doing broken behaviour based on task mode rather than
ioctl entry point

3a) Userspace consumes all VA space above 32bit. Forcing allocations to
occur in lower 32bits
  - This is the current implementation
3b) Ensure any allocation in the ioctl handles ioctl entrypoint rather
than just allow generic memory allocations in full VA space
  - This is hard to guarantee

4a) Blocklist any application using ioctls that have different struct
packing across the boundary
  - Can happen when struct packing of 32bit x86 application goes down
  the aarch64 compat_ioctl path
  - Userspace is a AArch64 process passing 32bit x86 ioctl structures
  through the compat_ioctl path which is typically for AArch32 processes
  - None currently identified
4b) Work with upstream kernel and userspace projects to evaluate and fix
  - Identify the problem ioctls
  - Implement a new ioctl with more sane struct packing that matches
  cross-arch
  - Implement new ioctl while maintaining backwards compatibility with
  previous ioctl handler
  - Change upstream project to use the new compatibility ioctl
  - ioctl deprecation will be case by case per device and project
4b) Userspace implements a full ioctl emulation layer
  - Parses the full ioctl tree
  - Either passes through ioctls that it doesn't understand or
  transforms ioctls that it knows are trouble
  - Has the downside that it can still run in to edge cases that will
  fail
  - Performance of additional tracking is a concern
  - Prone to failure keeping the kernel ioctl and userspace ioctl
  handling in sync
  - Really want to have it in the kernel space as much as possible

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
  • Loading branch information
Sonicadvance1 authored and intel-lab-lkp committed Jan 6, 2021
1 parent e71ba94 commit a927032
Show file tree
Hide file tree
Showing 7 changed files with 38 additions and 5 deletions.
2 changes: 1 addition & 1 deletion arch/arm64/include/asm/unistd.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

#define __NR_compat_syscalls 442
#define __NR_compat_syscalls 443
#endif

#define __ARCH_WANT_SYS_CLONE
Expand Down
2 changes: 2 additions & 0 deletions arch/arm64/include/asm/unistd32.h
Original file line number Diff line number Diff line change
Expand Up @@ -891,6 +891,8 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
__SYSCALL(__NR_process_madvise, sys_process_madvise)
#define __NR_epoll_pwait2 441
__SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2)
#define __NR_ioctl32 442
__SYSCALL(__NR_ioctl32, compat_sys_ioctl)

/*
* Please add new compat syscalls above this comment and update
Expand Down
16 changes: 14 additions & 2 deletions fs/ioctl.c
Original file line number Diff line number Diff line change
Expand Up @@ -790,8 +790,8 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
}
EXPORT_SYMBOL(compat_ptr_ioctl);

COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
compat_ulong_t, arg)
long do_ioctl32(unsigned int fd, unsigned int cmd,
compat_ulong_t arg)
{
struct fd f = fdget(fd);
int error;
Expand Down Expand Up @@ -850,4 +850,16 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,

return error;
}

COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
compat_ulong_t, arg)
{
return do_ioctl32(fd, cmd, arg);
}

SYSCALL_DEFINE3(ioctl32, unsigned int, fd, unsigned int, cmd,
compat_ulong_t, arg)
{
return do_ioctl32(fd, cmd, arg);
}
#endif
2 changes: 2 additions & 0 deletions include/linux/syscalls.h
Original file line number Diff line number Diff line change
Expand Up @@ -386,6 +386,8 @@ asmlinkage long sys_inotify_rm_watch(int fd, __s32 wd);
/* fs/ioctl.c */
asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd,
unsigned long arg);
asmlinkage long sys_ioctl32(unsigned int fd, unsigned int cmd,
compat_ulong_t arg);

/* fs/ioprio.c */
asmlinkage long sys_ioprio_set(int which, int who, int ioprio);
Expand Down
9 changes: 8 additions & 1 deletion include/uapi/asm-generic/unistd.h
Original file line number Diff line number Diff line change
Expand Up @@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
#define __NR_epoll_pwait2 441
__SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)

#define __NR_ioctl32 442
#ifdef CONFIG_COMPAT
__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
#else
__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
#endif

#undef __NR_syscalls
#define __NR_syscalls 442
#define __NR_syscalls 443

/*
* 32 bit systems traditionally used different
Expand Down
3 changes: 3 additions & 0 deletions kernel/sys_ni.c
Original file line number Diff line number Diff line change
Expand Up @@ -302,6 +302,9 @@ COND_SYSCALL(recvmmsg_time32);
COND_SYSCALL_COMPAT(recvmmsg_time32);
COND_SYSCALL_COMPAT(recvmmsg_time64);

COND_SYSCALL(ioctl32);
COND_SYSCALL_COMPAT(ioctl32);

/*
* Architecture specific syscalls: see further below
*/
Expand Down
9 changes: 8 additions & 1 deletion tools/include/uapi/asm-generic/unistd.h
Original file line number Diff line number Diff line change
Expand Up @@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
#define __NR_epoll_pwait2 441
__SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)

#define __NR_ioctl32 442
#ifdef CONFIG_COMPAT
__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
#else
__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
#endif

#undef __NR_syscalls
#define __NR_syscalls 442
#define __NR_syscalls 443

/*
* 32 bit systems traditionally used different
Expand Down

0 comments on commit a927032

Please sign in to comment.