Skip to content

Commit

Permalink
Merge branch 'for-next/mte' into for-next/core
Browse files Browse the repository at this point in the history
Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.

(Catalin Marinas and others)
* for-next/mte: (30 commits)
  arm64: mte: Fix typo in memory tagging ABI documentation
  arm64: mte: Add Memory Tagging Extension documentation
  arm64: mte: Kconfig entry
  arm64: mte: Save tags when hibernating
  arm64: mte: Enable swap of tagged pages
  mm: Add arch hooks for saving/restoring tags
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
  arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
  arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Allow user control of the generated random tags via prctl()
  arm64: mte: Allow user control of the tag check mode via prctl()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Introduce arch_validate_flags()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Tags-aware aware memcmp_pages() implementation
  arm64: Avoid unnecessary clear_user_page() indirection
  ...
  • Loading branch information
willdeacon committed Oct 2, 2020
2 parents 0a21ac0 + b575614 commit baab853
Show file tree
Hide file tree
Showing 63 changed files with 1,764 additions and 61 deletions.
2 changes: 2 additions & 0 deletions Documentation/arm64/cpu-feature-registers.rst
Expand Up @@ -175,6 +175,8 @@ infrastructure:
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
| MTE | [11-8] | y |
+------------------------------+---------+---------+
| SSBS | [7-4] | y |
+------------------------------+---------+---------+
| BT | [3-0] | y |
Expand Down
4 changes: 4 additions & 0 deletions Documentation/arm64/elf_hwcaps.rst
Expand Up @@ -240,6 +240,10 @@ HWCAP2_BTI

Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.

HWCAP2_MTE

Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
by Documentation/arm64/memory-tagging-extension.rst.

4. Unused AT_HWCAP bits
-----------------------
Expand Down
1 change: 1 addition & 0 deletions Documentation/arm64/index.rst
Expand Up @@ -14,6 +14,7 @@ ARM64 Architecture
hugetlbpage
legacy_instructions
memory
memory-tagging-extension
perf
pointer-authentication
silicon-errata
Expand Down
305 changes: 305 additions & 0 deletions Documentation/arm64/memory-tagging-extension.rst
@@ -0,0 +1,305 @@
===============================================
Memory Tagging Extension (MTE) in AArch64 Linux
===============================================

Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
Catalin Marinas <catalin.marinas@arm.com>

Date: 2020-02-25

This document describes the provision of the Memory Tagging Extension
functionality in AArch64 Linux.

Introduction
============

ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
(Top Byte Ignore) feature and allows software to access a 4-bit
allocation tag for each 16-byte granule in the physical address space.
Such memory range must be mapped with the Normal-Tagged memory
attribute. A logical tag is derived from bits 59-56 of the virtual
address used for the memory access. A CPU with MTE enabled will compare
the logical tag against the allocation tag and potentially raise an
exception on mismatch, subject to system registers configuration.

Userspace Support
=================

When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
supported by the hardware, the kernel advertises the feature to
userspace via ``HWCAP2_MTE``.

PROT_MTE
--------

To access the allocation tags, a user process must enable the Tagged
memory attribute on an address range using a new ``prot`` flag for
``mmap()`` and ``mprotect()``:

``PROT_MTE`` - Pages allow access to the MTE allocation tags.

The allocation tag is set to 0 when such pages are first mapped in the
user address space and preserved on copy-on-write. ``MAP_SHARED`` is
supported and the allocation tags can be shared between processes.

**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
types of mapping will result in ``-EINVAL`` returned by these system
calls.

**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
be cleared by ``mprotect()``.

**Note**: ``madvise()`` memory ranges with ``MADV_DONTNEED`` and
``MADV_FREE`` may have the allocation tags cleared (set to 0) at any
point after the system call.

Tag Check Faults
----------------

When ``PROT_MTE`` is enabled on an address range and a mismatch between
the logical and allocation tags occurs on access, there are three
configurable behaviours:

- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
tag check fault.

- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
memory access is not performed. If ``SIGSEGV`` is ignored or blocked
by the offending thread, the containing process is terminated with a
``coredump``.

- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the offending
thread, asynchronously following one or multiple tag check faults,
with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0`` (the faulting
address is unknown).

The user can select the above modes, per thread, using the
``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
bit-field:

- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
- ``PR_MTE_TCF_SYNC`` - *Synchronous* tag check fault mode
- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode

The current tag check fault mode can be read using the
``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.

Tag checking can also be disabled for a user thread by setting the
``PSTATE.TCO`` bit with ``MSR TCO, #1``.

**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
irrespective of the interrupted context. ``PSTATE.TCO`` is restored on
``sigreturn()``.

**Note**: There are no *match-all* logical tags available for user
applications.

**Note**: Kernel accesses to the user address space (e.g. ``read()``
system call) are not checked if the user thread tag checking mode is
``PR_MTE_TCF_NONE`` or ``PR_MTE_TCF_ASYNC``. If the tag checking mode is
``PR_MTE_TCF_SYNC``, the kernel makes a best effort to check its user
address accesses, however it cannot always guarantee it.

Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
-----------------------------------------------------------------

The architecture allows excluding certain tags to be randomly generated
via the ``GCR_EL1.Exclude`` register bit-field. By default, Linux
excludes all tags other than 0. A user thread can enable specific tags
in the randomly generated set using the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
flags, 0, 0, 0)`` system call where ``flags`` contains the tags bitmap
in the ``PR_MTE_TAG_MASK`` bit-field.

**Note**: The hardware uses an exclude mask but the ``prctl()``
interface provides an include mask. An include mask of ``0`` (exclusion
mask ``0xffff``) results in the CPU always generating tag ``0``.

Initial process state
---------------------

On ``execve()``, the new process has the following configuration:

- ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
- Tag checking mode set to ``PR_MTE_TCF_NONE``
- ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
- ``PSTATE.TCO`` set to 0
- ``PROT_MTE`` not set on any of the initial memory maps

On ``fork()``, the new process inherits the parent's configuration and
memory map attributes with the exception of the ``madvise()`` ranges
with ``MADV_WIPEONFORK`` which will have the data and tags cleared (set
to 0).

The ``ptrace()`` interface
--------------------------

``PTRACE_PEEKMTETAGS`` and ``PTRACE_POKEMTETAGS`` allow a tracer to read
the tags from or set the tags to a tracee's address space. The
``ptrace()`` system call is invoked as ``ptrace(request, pid, addr,
data)`` where:

- ``request`` - one of ``PTRACE_PEEKMTETAGS`` or ``PTRACE_POKEMTETAGS``.
- ``pid`` - the tracee's PID.
- ``addr`` - address in the tracee's address space.
- ``data`` - pointer to a ``struct iovec`` where ``iov_base`` points to
a buffer of ``iov_len`` length in the tracer's address space.

The tags in the tracer's ``iov_base`` buffer are represented as one
4-bit tag per byte and correspond to a 16-byte MTE tag granule in the
tracee's address space.

**Note**: If ``addr`` is not aligned to a 16-byte granule, the kernel
will use the corresponding aligned address.

``ptrace()`` return value:

- 0 - tags were copied, the tracer's ``iov_len`` was updated to the
number of tags transferred. This may be smaller than the requested
``iov_len`` if the requested address range in the tracee's or the
tracer's space cannot be accessed or does not have valid tags.
- ``-EPERM`` - the specified process cannot be traced.
- ``-EIO`` - the tracee's address range cannot be accessed (e.g. invalid
address) and no tags copied. ``iov_len`` not updated.
- ``-EFAULT`` - fault on accessing the tracer's memory (``struct iovec``
or ``iov_base`` buffer) and no tags copied. ``iov_len`` not updated.
- ``-EOPNOTSUPP`` - the tracee's address does not have valid tags (never
mapped with the ``PROT_MTE`` flag). ``iov_len`` not updated.

**Note**: There are no transient errors for the requests above, so user
programs should not retry in case of a non-zero system call return.

``PTRACE_GETREGSET`` and ``PTRACE_SETREGSET`` with ``addr ==
``NT_ARM_TAGGED_ADDR_CTRL`` allow ``ptrace()`` access to the tagged
address ABI control and MTE configuration of a process as per the
``prctl()`` options described in
Documentation/arm64/tagged-address-abi.rst and above. The corresponding
``regset`` is 1 element of 8 bytes (``sizeof(long))``).

Example of correct usage
========================

*MTE Example code*

.. code-block:: c
/*
* To be compiled with -march=armv8.5-a+memtag
*/
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <sys/prctl.h>
/*
* From arch/arm64/include/uapi/asm/hwcap.h
*/
#define HWCAP2_MTE (1 << 18)
/*
* From arch/arm64/include/uapi/asm/mman.h
*/
#define PROT_MTE 0x20
/*
* From include/uapi/linux/prctl.h
*/
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
# define PR_MTE_TCF_SHIFT 1
# define PR_MTE_TCF_NONE (0UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_SYNC (1UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_ASYNC (2UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_MASK (3UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TAG_SHIFT 3
# define PR_MTE_TAG_MASK (0xffffUL << PR_MTE_TAG_SHIFT)
/*
* Insert a random logical tag into the given pointer.
*/
#define insert_random_tag(ptr) ({ \
uint64_t __val; \
asm("irg %0, %1" : "=r" (__val) : "r" (ptr)); \
__val; \
})
/*
* Set the allocation tag on the destination address.
*/
#define set_tag(tagged_addr) do { \
asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
} while (0)
int main()
{
unsigned char *a;
unsigned long page_sz = sysconf(_SC_PAGESIZE);
unsigned long hwcap2 = getauxval(AT_HWCAP2);
/* check if MTE is present */
if (!(hwcap2 & HWCAP2_MTE))
return EXIT_FAILURE;
/*
* Enable the tagged address ABI, synchronous MTE tag check faults and
* allow all non-zero tags in the randomly generated set.
*/
if (prctl(PR_SET_TAGGED_ADDR_CTRL,
PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
0, 0, 0)) {
perror("prctl() failed");
return EXIT_FAILURE;
}
a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (a == MAP_FAILED) {
perror("mmap() failed");
return EXIT_FAILURE;
}
/*
* Enable MTE on the above anonymous mmap. The flag could be passed
* directly to mmap() and skip this step.
*/
if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
perror("mprotect() failed");
return EXIT_FAILURE;
}
/* access with the default tag (0) */
a[0] = 1;
a[1] = 2;
printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
/* set the logical and allocation tags */
a = (unsigned char *)insert_random_tag(a);
set_tag(a);
printf("%p\n", a);
/* non-zero tag access */
a[0] = 3;
printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
/*
* If MTE is enabled correctly the next instruction will generate an
* exception.
*/
printf("Expecting SIGSEGV...\n");
a[16] = 0xdd;
/* this should not be printed in the PR_MTE_TCF_SYNC mode */
printf("...haven't got one\n");
return EXIT_FAILURE;
}
33 changes: 33 additions & 0 deletions arch/arm64/Kconfig
Expand Up @@ -1645,6 +1645,39 @@ config ARCH_RANDOM
provides a high bandwidth, cryptographically secure
hardware random number generator.

config ARM64_AS_HAS_MTE
# Initial support for MTE went in binutils 2.32.0, checked with
# ".arch armv8.5-a+memtag" below. However, this was incomplete
# as a late addition to the final architecture spec (LDGM/STGM)
# is only supported in the newer 2.32.x and 2.33 binutils
# versions, hence the extra "stgm" instruction check below.
def_bool $(as-instr,.arch armv8.5-a+memtag\nstgm xzr$(comma)[x0])

config ARM64_MTE
bool "Memory Tagging Extension support"
default y
depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
select ARCH_USES_HIGH_VMA_FLAGS
help
Memory Tagging (part of the ARMv8.5 Extensions) provides
architectural support for run-time, always-on detection of
various classes of memory error to aid with software debugging
to eliminate vulnerabilities arising from memory-unsafe
languages.

This option enables the support for the Memory Tagging
Extension at EL0 (i.e. for userspace).

Selecting this option allows the feature to be detected at
runtime. Any secondary CPU not implementing this feature will
not be allowed a late bring-up.

Userspace binaries that want to use this feature must
explicitly opt in. The mechanism for the userspace is
described in:

Documentation/arm64/memory-tagging-extension.rst.

endmenu

config ARM64_SVE
Expand Down
3 changes: 2 additions & 1 deletion arch/arm64/include/asm/cpucaps.h
Expand Up @@ -64,7 +64,8 @@
#define ARM64_BTI 54
#define ARM64_HAS_ARMv8_4_TTL 55
#define ARM64_HAS_TLB_RANGE 56
#define ARM64_MTE 57

#define ARM64_NCAPS 57
#define ARM64_NCAPS 58

#endif /* __ASM_CPUCAPS_H */
6 changes: 6 additions & 0 deletions arch/arm64/include/asm/cpufeature.h
Expand Up @@ -681,6 +681,12 @@ static __always_inline bool system_uses_irq_prio_masking(void)
cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
}

static inline bool system_supports_mte(void)
{
return IS_ENABLED(CONFIG_ARM64_MTE) &&
cpus_have_const_cap(ARM64_MTE);
}

static inline bool system_has_prio_mask_debugging(void)
{
return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
Expand Down

0 comments on commit baab853

Please sign in to comment.