Branch: landlock-v12
Commits on Oct 31, 2019
  1. landlock: Add user and kernel documentation for Landlock

    l0kod committed Oct 31, 2019
    This documentation can be built with the Sphinx framework.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v11:
    * cosmetic improvements
    
    Changes since v10:
    * replace the filesystem hooks with the ptrace one
    * remove the triggers
    * update example
    * add documentation for Landlock domains and seccomp interaction
    * reference more kernel documentation (e.g. LSM hooks)
    
    Changes since v9:
    * update with expected attach type and expected attach triggers
    
    Changes since v8:
    * remove documentation related to chaining and tagging according to this
      patch series
    
    Changes since v7:
    * update documentation according to the Landlock revamp
    
    Changes since v6:
    * add a check for ctx->event
    * rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
    * rename Landlock version to ABI to better reflect its purpose and add a
      dedicated changelog section
    * update tables
    * relax no_new_privs recommendations
    * remove ABILITY_WRITE related functions
    * reword rule "appending" to "prepending" and explain it
    * cosmetic fixes
    
    Changes since v5:
    * update the rule hierarchy inheritance explanation
    * briefly explain ctx->arg2
    * add ptrace restrictions
    * explain EPERM
    * update example (subtype)
    * use ":manpage:"
  2. bpf,landlock: Add tests for the Landlock ptrace program type

    l0kod committed Oct 31, 2019
    Test eBPF program context access and the semantics of the ptrace hooks.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v11:
    * cosmetic fixes
    
    Changes since v10:
    * rework tests with new Landlock ptrace programs which restrict ptrace
      thanks to the task_landlock_ptrace_ancestor() helper
    * simplify ptrace tests (make expect_ptrace implicit)
    * add tests:
      * check a child process tracing its parent
      * check Landlock domain without ptrace enforcement (e.g. useful for
        audit/signaling purposes)
      * check inherited-only domains
      * check task pointer arithmetic
    * fix flaky test for multi-core
    * increase log size
    * cosmetic renames
    * update and improve the Makefile
    
    Changes since v9:
    * replace subtype with expected_attach_type and expected_attach_triggers
    * rename inode_map_lookup() into inode_map_lookup_elem()
    * check for inode map entry without value (which is now possible thanks
      to the pointer null check)
    * use read-only inode map for Landlock programs
    
    Changes since v8:
    * update eBPF include path for macros
    * use TEST_GEN_PROGS and use the generic "clean" target
    * add more verbose errors
    * update the bpf/verifier files
    * remove chain tests (from landlock and bpf/verifier)
    * replace the whitelist tests with blacklist tests (because of stateless
      Landlock programs): remove "dotdot" tests and other depth tests
    * sync the landlock Makefile with its bpf sibling directory and use
      bpf_load_program_xattr()
    
    Changes since v7:
    * update tests and add new ones for filesystem hierarchy and Landlock
      chains.
    
    Changes since v6:
    * use the new kselftest_harness.h
    * use const variables
    * replace ASSERT_STEP with ASSERT_*
    * rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
    * force sample library rebuild
    * fix install target
    
    Changes since v5:
    * add subtype test
    * add ptrace tests
    * split and rename files
    * cleanup and rebase
  3. bpf,landlock: Add task_landlock_ptrace_ancestor() helper

    l0kod committed Oct 31, 2019
    This new task_landlock_ptrace_ancestor() helper can be used to identify
    if the Landlock domain tied to the current tracer is in the same
    hierarchy as the domain of the tracee.
    
    Indeed, ptrace(2) can be used to impersonate an unsandboxed process and
    lead to a privilege escalation.  A common use-case when sandboxing a
    process is then to forbid it from debugging a more-privileged process.  A
    sandboxed process (tracer) should only be allowed to trace another process
    (tracee) if the tracee has fewer privileges than the tracer.  This
    policy can be implemented with this helper.
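    
    The tracer/tracee rule can be illustrated with a small userspace model.
    This is only a sketch, not the kernel helper: struct domain,
    domain_is_ancestor() and may_ptrace() are hypothetical names, a NULL
    domain stands for an unsandboxed task, and treating "same domain" as
    allowed is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a Landlock domain: each domain points to the
 * domain it was derived from (NULL for a root domain). */
struct domain {
	const struct domain *parent;
};

/* Return true if @ancestor is @child itself or one of its ancestors. */
static bool domain_is_ancestor(const struct domain *ancestor,
			       const struct domain *child)
{
	for (; child; child = child->parent) {
		if (child == ancestor)
			return true;
	}
	return false;
}

/* Allow tracing only if the tracee is at least as restricted as the
 * tracer, i.e. the tracer's domain is in the tracee's hierarchy. */
static bool may_ptrace(const struct domain *tracer,
		       const struct domain *tracee)
{
	if (!tracer)
		return true;	/* unsandboxed tracer: nothing to enforce here */
	return domain_is_ancestor(tracer, tracee);
}
```

    With this model, a task in a root domain may trace a task in a domain
    derived from it, but not the other way around, and a sandboxed task may
    never trace an unsandboxed one.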
    
    More complex helpers could be added in the future to enable other ways
    to check the relation between the tracer and the tracee.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v10:
    * new patch taking inspiration from the previous static ptrace policy
  4. landlock: Add ptrace LSM hooks

    l0kod committed Oct 31, 2019
    Add a first Landlock hook that can be used to enforce a security policy
    or to audit some process activities.  For a sandboxing use-case, the
    kernel needs to know whether a task can legitimately debug another.
    ptrace(2) can also be used by an attacker to impersonate another task
    and remain undetected while performing malicious activities.
    
    Using ptrace(2) and related features on a target process can lead to a
    privilege escalation.  A sandboxed task must then be able to tell the
    kernel if another task is more privileged, via ptrace_may_access().
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v10:
    * revamp and replace the static policy with a Landlock hook which may be
      used by the corresponding BPF_LANDLOCK_PTRACE program (attach) type
      and a dedicated process_cmp_landlock_ptrace() BPF helper
    * check prog return value against LANDLOCK_RET_DENY (ret is a bitmask)
    
    Changes since v6:
    * factor out ptrace check
    * constify pointers
    * cleanup headers
    * use the new security_add_hooks()
  5. landlock,seccomp: Load Landlock programs per process hierarchy

    l0kod committed Oct 31, 2019
    The seccomp(2) syscall can be used by a task to apply a Landlock program
    to itself. As a seccomp filter, a Landlock program is enforced for the
    current task and all its future children. A program is immutable and a
    task can only add new restricting programs to itself, forming a list of
    programs.
    
    A Landlock program is tied to a Landlock hook. If the action on a kernel
    object is allowed by the other Linux security mechanisms (e.g. DAC,
    capabilities, other LSM), then a Landlock hook related to this kind of
    object is triggered. The list of programs for this hook is then
    evaluated. Each program returns a binary value which can deny the action
    on a kernel object with a non-zero value. If every program in the list
    returns zero, then the action on the object is allowed.
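    
    The evaluation rule described above can be sketched in plain C.  This
    is a userspace illustration under stated assumptions, not the kernel
    code: prog_fn, run_hook_programs(), allow_all() and deny_all() are
    hypothetical names, and mapping a denial to -EPERM is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical program signature: returns 0 to allow, non-zero to deny. */
typedef int (*prog_fn)(const void *ctx);

/* Run every program attached to one hook.  The first non-zero result
 * denies the action; an empty list or all-zero results allow it. */
static int run_hook_programs(const prog_fn *progs, size_t count,
			     const void *ctx)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (progs[i](ctx) != 0)
			return -EPERM;	/* assumed error code for a denial */
	}
	return 0;
}

/* Two trivial programs used to exercise the evaluation loop. */
static int allow_all(const void *ctx)
{
	(void)ctx;
	return 0;
}

static int deny_all(const void *ctx)
{
	(void)ctx;
	return 1;
}
```

    Because a single denying program is enough to refuse the action,
    adding a program to the list can only make the policy stricter.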
    
    The next commit adds the LSM hooks to enforce the memory protection
    programs on the appropriate process hierarchies.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    Link: https://lore.kernel.org/lkml/c10a503d-5e35-7785-2f3d-25ed8dd63fab@digikod.net/
    ---
    
    Changes since v11:
    * fix build of seccomp without Landlock (reported by kbuild test robot)
    
    Changes since v10:
    * rewrite the Landlock program attaching mechanism to not rely on
      internal seccomp structures but only on the (LSM-stacked) task's
      credentials:
      * make the use of seccomp (and task's credentials) optional if not
        relying on its syscall, which may be useful for domains defined by
        other means (e.g. cgroups or system-wide thanks to a dedicated
        securityfs)
    
    Changes since v9:
    * replace subtype with expected_attach_type and expected_attach_triggers
    
    Changes since v8:
    * Remove the chaining concept from the eBPF program contexts (chain and
      cookie). We need to keep these subtypes this way to be able to make
      them evolve, though.
    
    Changes since v7:
    * handle and verify program chains
    * split and rename providers.c to enforce.c and enforce_seccomp.c
    * rename LANDLOCK_SUBTYPE_* to LANDLOCK_*
    
    Changes since v6:
    * rename some functions with more accurate names to reflect that an eBPF
      program for Landlock could be used for something other than a rule
    * reword rule "appending" to "prepending" and explain it
    * remove the superfluous no_new_privs check, only check global
      CAP_SYS_ADMIN when prepending a Landlock rule (needed for containers)
    * create and use {get,put}_seccomp_landlock() (suggested by Kees Cook)
    * replace ifdef with static inlined function (suggested by Kees Cook)
    * use get_user() (suggested by Kees Cook)
    * replace atomic_t with refcount_t (requested by Kees Cook)
    * move struct landlock_{rule,events} from landlock.h to common.h
    * cleanup headers
    
    Changes since v5:
    * remove struct landlock_node and use a similar inheritance mechanism
      as seccomp-bpf (requested by Andy Lutomirski)
    * rename SECCOMP_ADD_LANDLOCK_RULE to SECCOMP_APPEND_LANDLOCK_RULE
    * rename file manager.c to providers.c
    * add comments
    * typo and cosmetic fixes
    
    Changes since v4:
    * merge manager and seccomp patches
    * return -EFAULT in seccomp(2) when user_bpf_fd is null to easily check
      if Landlock is supported
    * only allow a process with the global CAP_SYS_ADMIN to use Landlock
      (will be lifted in the future)
    * add an early check to exit as soon as possible if the current process
      does not have Landlock rules
    
    Changes since v3:
    * remove the hard link with seccomp (suggested by Andy Lutomirski and
      Kees Cook):
      * remove the cookie which could imply multiple evaluation of Landlock
        rules
      * remove the origin field in struct landlock_data
    * remove documentation fix (merged upstream)
    * rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
    * internal renaming
    * split commit
    * new design to be able to inherit on the fly the parent rules
    
    Changes since v2:
    * Landlock programs can now be run without seccomp filter but for any
      syscall (from the process) or interruption
    * move Landlock related functions and structs into security/landlock/*
      (to manage cgroups as well)
    * fix seccomp filter handling: run Landlock programs for each of their
      legitimate seccomp filter
    * properly clean up all seccomp results
    * cosmetic changes to ease the understanding
    * fix some ifdef
  6. landlock: Add the management of domains

    l0kod committed Oct 31, 2019
    A Landlock domain is a set of eBPF programs.  There is a list for each
    different program types that can be run on a specific Landlock hook
    (e.g. ptrace).  A domain is tied to a set of subjects (i.e. tasks).  A
    Landlock program should not try (nor be able) to infer which subject is
    currently enforced, but to have a unique security policy for all
    subjects tied to the same domain.  This makes the reasoning much easier
    and helps avoid pitfalls.
    
    The next commits tie a domain to a task's credentials thanks to
    seccomp(2), but we could use cgroups or a security file-system to
    enforce a sysadmin-defined policy.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v11:
    * remove old code from previous refactoring (removing the program
      chaining concept) and simplify program prepending (reported by Serge
      E. Hallyn):
      * simplify landlock_prepend_prog() and merge it with
        store_landlock_prog()
      * add new_prog_list() and rework new_landlock_domain()
      * remove the extra page allocation checks, only rely on the eBPF
        program checks
    * replace the -EINVAL for the duplicate program check with the -EEXIST
    
    Changes since v10:
    * rename files and names to clearly define a domain
    * create a standalone patch to ease review
  7. bpf,landlock: Define an eBPF program type for Landlock hooks

    l0kod committed Oct 31, 2019
    Add a new type of eBPF program used by Landlock hooks.  The goal of this
    type of program is to accept or deny a requested access from userspace
    to a kernel object (e.g. process).  This will be more useful with the
    next commit adding a new eBPF helper.
    
    This new BPF program type will be registered with the Landlock LSM
    initialization.
    
    Add an initial Landlock Kconfig and update the MAINTAINERS file.
    
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morris <jmorris@namei.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Serge E. Hallyn <serge@hallyn.com>
    Cc: Will Drewry <wad@chromium.org>
    ---
    
    Changes since v10:
    * replace file system program types with a (simpler) ptrace program type
    * add an eBPF task pointer type
    * split files
    
    Changes since v9:
    * handle inode put and map put, which fixes unmount (reported by Al Viro)
    * replace subtype with expected_attach_type and expected_attach_triggers
    * check eBPF program return code
    
    Changes since v8:
    * Remove the chaining concept from the eBPF program contexts (chain and
      cookie). We need to keep these subtypes this way to be able to make
      them evolve, though.
    * remove bpf_landlock_put_extra() because there is no longer a "previous"
      field to free (for now)
    
    Changes since v7:
    * cosmetic fixes
    * rename LANDLOCK_SUBTYPE_* to LANDLOCK_*
    * cleanup UAPI definitions and move them from bpf.h to landlock.h
      (suggested by Alexei Starovoitov)
    * disable Landlock by default (suggested by Alexei Starovoitov)
    * rename BPF_PROG_TYPE_LANDLOCK_{RULE,HOOK}
    * update the Kconfig
    * update the MAINTAINERS file
    * replace the IOCTL, LOCK and FCNTL events with FS_PICK, FS_WALK and
      FS_GET hook types
    * add the ability to chain programs with an eBPF program file descriptor
      (i.e. the "previous" field in a Landlock subtype) and keep a state
      with a "cookie" value available from the context
    * add a "triggers" subtype bitfield to match specific actions (e.g.
      append, chdir, read...)
    
    Changes since v6:
    * add 3 more sub-events: IOCTL, LOCK, FCNTL
      https://lkml.kernel.org/r/2fbc99a6-f190-f335-bd14-04bdeed35571@digikod.net
    * rename LANDLOCK_VERSION to LANDLOCK_ABI to better reflect its purpose,
      and move it from landlock.h to common.h
    * rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE: an eBPF
      program could be used for something other than a rule
    * simplify struct landlock_context by removing the arch and syscall_nr fields
    * remove all eBPF map functions call, remove ABILITY_WRITE
    * refactor bpf_landlock_func_proto() (suggested by Kees Cook)
    * constify pointers
    * fix doc inclusion
    
    Changes since v5:
    * rename file hooks.c to init.c
    * fix spelling
    
    Changes since v4:
    * merge a minimal (not enabled) LSM code and Kconfig in this commit
    
    Changes since v3:
    * split commit
    * revamp the landlock_context:
      * add arch, syscall_nr and syscall_cmd (ioctl, fcntl…) to be able to
        cross-check action with the event type
      * replace args array with dedicated fields to ease the addition of new
        fields
Commits on Oct 28, 2019
  1. selftests/bpf: Restore $(OUTPUT)/test_stub.o rule

    iii-i authored and borkmann committed Oct 28, 2019
    `make O=/linux-build kselftest TARGETS=bpf` fails with
    
    	make[3]: *** No rule to make target '/linux-build/bpf/test_stub.o', needed by '/linux-build/bpf/test_verifier'
    
    The same command without the O= part works, presumably thanks to the
    implicit rule.
    
    Fix by restoring the explicit $(OUTPUT)/test_stub.o rule.
    
    Fixes: 74b5a59 ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andriin@fb.com>
    Link: https://lore.kernel.org/bpf/20191028102110.7545-1-iii@linux.ibm.com
  2. selftest/bpf: Use -m{little, big}-endian for clang

    iii-i authored and borkmann committed Oct 28, 2019
    When cross-compiling tests from x86 to s390, the resulting BPF objects
    fail to load due to endianness mismatch.
    
    Fix by using BPF-GCC endianness check for clang as well.
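    
    The idea behind the endianness check can be shown with the byte-order
    macros that GCC and clang predefine.  This is only an illustration of
    the selection logic, not the Makefile change itself; bpf_endian_flag()
    is a hypothetical name.

```c
/* Pick the endianness flag matching the target this file is compiled
 * for.  __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ are predefined by
 * both GCC and clang. */
static const char *bpf_endian_flag(void)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return "-mlittle-endian";
#else
	return "-mbig-endian";
#endif
}
```

    Since the macros describe the compilation target rather than the build
    host, a cross-compiler targeting s390 would select -mbig-endian even
    when it runs on x86.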
    
    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andriin@fb.com>
    Link: https://lore.kernel.org/bpf/20191028102049.7489-1-iii@linux.ibm.com
Commits on Oct 27, 2019
  1. Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

    davem330 committed Oct 27, 2019
    Daniel Borkmann says:
    
    ====================
    pull-request: bpf-next 2019-10-27
    
    The following pull-request contains BPF updates for your *net-next* tree.
    
    We've added 52 non-merge commits during the last 11 day(s) which contain
    a total of 65 files changed, 2604 insertions(+), 1100 deletions(-).
    
    The main changes are:
    
     1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF
        assembly code. The work here teaches BPF verifier to recognize
        kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints
        such that verifier allows direct use of bpf_skb_event_output() helper
        used in tc BPF et al (w/o probing memory access) that dumps skb data
        into perf ring buffer. Also add direct loads to probe memory in order
        to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.
    
     2) Big batch of changes to improve libbpf and BPF kselftests. Besides
        others: generalization of libbpf's CO-RE relocation support to now
        also include field existence relocations, revamp the BPF kselftest
        Makefile to add test runner concept allowing to exercise various
        ways to build BPF programs, and teach bpf_object__open() and friends
        to automatically derive BPF program type/expected attach type from
        section names to ease their use, from Andrii Nakryiko.
    
     3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.
    
     4) Allow to read BTF as raw data from bpftool. Most notable use case
        is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.
    
     5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which
        manages to improve "rx_drop" performance by ~4%, from Björn Töpel.
    
     6) Fix to restore the flow dissector after reattach BPF test and also
        fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.
    
     7) Improve verifier's BTF ctx access for use outside of raw_tp, from
        Martin KaFai Lau.
    
     8) Improve documentation for AF_XDP with new sections and to reflect
        latest features, from Magnus Karlsson.
    
     9) Add back 'version' section parsing to libbpf for old kernels, from
        John Fastabend.
    
    10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(),
        from KP Singh.
    
    11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order
        to improve insn coverage for built BPF progs, from Yonghong Song.
    
    12) Misc minor cleanups and fixes, from various others.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
Commits on Oct 26, 2019
  1. tc-testing: list required kernel options for act_ct action

    Roman Mashak authored and davem330 committed Oct 26, 2019
    Updated config with required kernel options for conntrack TC action,
    so that tdc can run the tests.
    
    Signed-off-by: Roman Mashak <mrv@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

    davem330 committed Oct 26, 2019
    Pablo Neira Ayuso says:
    
    ====================
    Netfilter/IPVS updates for net-next
    
    The following patchset contains Netfilter/IPVS updates for net-next,
    more specifically:
    
    * Updates for ipset:
    
    1) Coding style fix for ipset comment extension, from Jeremy Sowden.
    
    2) De-inline many functions in ipset, from Jeremy Sowden.
    
    3) Move ipset function definition from header to source file.
    
    4) Move ip_set_put_flags() to source, export it as a symbol, remove
       inline.
    
    5) Move range_to_mask() to the source file where this is used.
    
    6) Move ip_set_get_ip_port() to the source file where this is used.
    
    * IPVS selftests and netns improvements:
    
    7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.
    
    8) Three patches to add selftest script for ipvs, also from
       Haishuang Yan.
    
    * Conntrack updates and new nf_hook_slow_list() function:
    
    9) Document ct ecache extension, from Florian Westphal.
    
    10) Skip ct extensions from ctnetlink dump, from Florian.
    
    11) Free ct extension immediately, from Florian.
    
    12) Skip access to ecache extension from nf_ct_deliver_cached_events();
        this is not correct, as reported by syzbot.
    
    13) Add and use nf_hook_slow_list(), from Florian.
    
    * Flowtable infrastructure updates:
    
    14) Move priority to nf_flowtable definition.
    
    15) Dynamic allocation of per-device hooks in flowtables.
    
    16) Allow to include netdevice only once in flowtable definitions.
    
    17) Raise maximum number of devices per flowtable.
    
    * Netfilter hardware offload infrastructure updates:
    
    18) Add nft_flow_block_chain() helper function.
    
    19) Pass callback list to nft_setup_cb_call().
    
    20) Add nft_flow_cls_offload_setup() helper function.
    
    21) Remove rules for the unregistered device via netdevice event.
    
    22) Support for multiple devices in a basechain definition at the
        ingress hook.
    
    23) Add nft_chain_offload_cmd() helper function.
    
    24) Add nft_flow_block_offload_init() helper function.
    
    25) Rewind in case of failing to bind multiple devices to hook.
    
    26) Fix typo in IPv6 tproxy module description, from Norman Rasmussen.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
  3. Merge branch 'net-aquantia-ptp-followup-fixes'

    davem330 committed Oct 26, 2019
    Igor Russkikh says:
    
    ====================
    net: aquantia: ptp followup fixes
    
    Here are fixes for two sparse warnings; the third patch is a fix for
    the missing scaled_ppm_to_ppb. Eventually I reworked this
    to exclude the ptp module from the build. Please consider it instead
    of this patch: https://patchwork.ozlabs.org/patch/1184171/
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
  4. net: aquantia: disable ptp object build if no config

    cail authored and davem330 committed Oct 26, 2019
    Disable the aq_ptp module build, using inline
    stubs instead, when CONFIG_PTP_1588_CLOCK is not set.
    
    This reduces module size and removes unnecessary code.
    
    Reported-by: YueHaibing <yuehaibing@huawei.com>
    Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
    Acked-by: Richard Cochran <richardcochran@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  5. net: aquantia: fix warnings on endianness

    cail authored and davem330 committed Oct 26, 2019
    fixes to remove sparse warnings:
    sparse: sparse: cast to restricted __be64
    
    Fixes: 04a1839 ("net: aquantia: implement data PTP datapath")
    Reported-by: kbuild test robot <lkp@intel.com>
    Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  6. net: aquantia: fix var initialization warning

    cail authored and davem330 committed Oct 26, 2019
    Found by sparse: a useless local initialization with zero.
    
    Fixes: 94ad945 ("net: aquantia: add PTP rings infrastructure")
    Reported-by: kbuild test robot <lkp@intel.com>
    Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  7. netfilter: nf_tables_offload: unbind if multi-device binding fails

    ummakynes committed Oct 24, 2019
    nft_flow_block_chain() needs to unbind in case of error when performing
    the multi-device binding.
    
    Fixes: d54725c ("netfilter: nf_tables: support for multiple devices per netdev hook")
    Reported-by: wenxu <wenxu@ucloud.cn>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  8. netfilter: nf_tables_offload: add nft_flow_block_offload_init()

    ummakynes committed Oct 24, 2019
    This patch adds the nft_flow_block_offload_init() helper function to
    initialize the flow_block_offload object.
    
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  9. netfilter: nf_tables_offload: add nft_chain_offload_cmd()

    ummakynes committed Oct 24, 2019
    This patch adds the nft_chain_offload_cmd() helper function.
    
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  10. netfilter: ecache: don't look for ecache extension on dying/unconfirm…

    Florian Westphal authored and ummakynes committed Oct 22, 2019
    …ed conntracks
    
    syzbot reported the following splat:
    BUG: KASAN: use-after-free in __nf_ct_ext_exist
    include/net/netfilter/nf_conntrack_extend.h:53 [inline]
    BUG: KASAN: use-after-free in nf_ct_deliver_cached_events+0x5c3/0x6d0
    net/netfilter/nf_conntrack_ecache.c:205
    nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:65 [inline]
    nf_confirm+0x3d8/0x4d0 net/netfilter/nf_conntrack_proto.c:154
    [..]
    
    While there is no reproducer yet, the syzbot report contains one
    interesting bit of information:
    
    Freed by task 27585:
    [..]
     kfree+0x10a/0x2c0 mm/slab.c:3757
     nf_ct_ext_destroy+0x2ab/0x2e0 net/netfilter/nf_conntrack_extend.c:38
     nf_conntrack_free+0x8f/0xe0 net/netfilter/nf_conntrack_core.c:1418
     destroy_conntrack+0x1a2/0x270 net/netfilter/nf_conntrack_core.c:626
     nf_conntrack_put include/linux/netfilter/nf_conntrack_common.h:31 [inline]
     nf_ct_resolve_clash net/netfilter/nf_conntrack_core.c:915 [inline]
     ^^^^^^^^^^^^^^^^^^^
     __nf_conntrack_confirm+0x21ca/0x2830 net/netfilter/nf_conntrack_core.c:1038
     nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:63 [inline]
     nf_confirm+0x3e7/0x4d0 net/netfilter/nf_conntrack_proto.c:154
    
    This is what's happening:
    
    1. a conntrack entry is about to be confirmed (added to hash table).
    2. a clash with existing entry is detected.
    3. nf_ct_resolve_clash() puts skb->nfct (the "losing" entry).
    4. this entry now has a refcount of 0 and is freed to SLAB_TYPESAFE_BY_RCU
       kmem cache.
    
    skb->nfct has been replaced by the one found in the hash.
    Problem is that nf_conntrack_confirm() uses the old ct:
    
    static inline int nf_conntrack_confirm(struct sk_buff *skb)
    {
    	struct nf_conn *ct = (struct nf_conn *)skb_nfct(skb);
    	int ret = NF_ACCEPT;
    
    	if (ct) {
    		if (!nf_ct_is_confirmed(ct))
    			ret = __nf_conntrack_confirm(skb);
    		if (likely(ret == NF_ACCEPT))
    			nf_ct_deliver_cached_events(ct); /* This ct has refcount 0! */
    	}
    	return ret;
    }
    
    As of "netfilter: conntrack: free extension area immediately", we can't
    access conntrack extensions in this case.
    
    To fix this, make sure we check the dying bit presence before attempting
    to get the ecache extension.
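    
    The ordering of the checks can be modelled in userspace as follows.
    This is a simplified sketch, not the actual patch: struct conn,
    struct ecache_ext and deliver_cached_events() are hypothetical
    stand-ins for the conntrack entry, its ecache extension and
    nf_ct_deliver_cached_events().

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the ecache extension: a bitmask of pending events. */
struct ecache_ext {
	unsigned long pending_events;
};

/* Stand-in for a conntrack entry. */
struct conn {
	bool dying;			/* set once the entry is being destroyed */
	struct ecache_ext *ecache;	/* must not be dereferenced when dying */
};

/* Deliver cached events and return how many event bits were flushed.
 * The dying check comes first, so the extension pointer is never
 * touched for an entry whose extension area may already be freed. */
static unsigned int deliver_cached_events(struct conn *ct)
{
	unsigned long events;
	unsigned int delivered = 0;

	if (ct->dying)
		return 0;
	if (!ct->ecache)
		return 0;

	events = ct->ecache->pending_events;
	ct->ecache->pending_events = 0;
	for (; events; events &= events - 1)	/* clear the lowest set bit */
		delivered++;
	return delivered;
}
```

    The point of the sketch is only the order of operations: the state of
    the entry is tested before any access to its extension area.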
    
    Reported-by: syzbot+c7aabc9fe93e7f3637ba@syzkaller.appspotmail.com
    Fixes: 2ad9d77 ("netfilter: conntrack: free extension area immediately")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  11. Merge branch 'ionic-updates'

    davem330 committed Oct 26, 2019
    Shannon Nelson says:
    
    ====================
    ionic updates
    
    These are a few of the driver updates we've been working on internally.
    These clean up a few mismatched struct comments, add checking for dead
    firmware, fix an initialization bug, and change the Rx buffer management.
    
    These are based on net-next v5.4-rc3-709-g985fd98ab5cc.
    
    v2: clear napi->skb in the error case in ionic_rx_frags()
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
  12. ionic: update driver version

    emusln authored and davem330 committed Oct 24, 2019
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  13. ionic: implement support for rx sgl

    emusln authored and davem330 committed Oct 24, 2019
    Even out Rx performance across MTU sizes by changing from full
    skb allocations to page-based frag allocations.  The device
    supports a form of scatter-gather in the Rx path, so we can
    set up a number of pages for each descriptor, all of which are
    easier to alloc and pass around than the standard kzalloc'd
    buffer.  An skb is wrapped around the pages while processing
    the received packets, and pages are recycled as needed, or
    left alone if they weren't used in the Rx.
    
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  14. ionic: add a watchdog timer to monitor heartbeat

    emusln authored and davem330 committed Oct 24, 2019
    Add a watchdog to periodically monitor the NIC heartbeat.
    
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  15. ionic: add heartbeat check

    emusln authored and davem330 committed Oct 24, 2019
    Most of our firmware has a heartbeat feature that the driver
    can watch for to see if the FW is still alive and likely to
    answer a dev_cmd or AdminQ request.
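
A minimal sketch of how such a liveness check might work. It assumes the heartbeat is a counter that the firmware keeps changing, and that an unchanged value between two polls means the firmware has stalled. The commit message does not spell out this mechanism, and every `sketch_` name is hypothetical rather than taken from the ionic driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device heartbeat bookkeeping (not the driver's struct). */
struct sketch_hb_state {
	uint32_t last_seen;   /* heartbeat value observed at the previous poll */
	bool have_sample;     /* false until the first poll has happened */
};

/*
 * Poll once with the current heartbeat value.
 * Returns true if the firmware looks alive: either this is the first
 * sample, or the value changed since the previous poll (assumed semantics).
 */
static bool sketch_hb_check(struct sketch_hb_state *st, uint32_t current)
{
	bool alive = !st->have_sample || current != st->last_seen;

	st->last_seen = current;
	st->have_sample = true;
	return alive;
}
```

A caller would run the check from a periodic timer and skip issuing dev_cmd or AdminQ requests while it reports false.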
    
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  16. ionic: reverse an interrupt coalesce calculation

    emusln authored and davem330 committed Oct 24, 2019
    Fix the initial interrupt coalesce usec-to-hw setting
    to actually be usec-to-hw.
    
    Fixes: 780eded ("ionic: report users coalesce request")
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  17. ionic: fix up struct name comments

    emusln authored and davem330 committed Oct 24, 2019
    Fix up struct names in the ionic_if.h comments
    
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  18. r8169: improve rtl8169_rx_fill

    hkallweit authored and davem330 committed Oct 23, 2019
    We have only one user of the error path, so we can inline it.
    In addition the call to rtl8169_make_unusable_by_asic() can be removed
    because rtl8169_alloc_rx_data() didn't call rtl8169_mark_to_asic() yet
    for the respective index if returning NULL.
    
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  19. r8169: align fix_features callback with vendor driver

    hkallweit authored and davem330 committed Oct 23, 2019
    This patch aligns the fix_features callback with the vendor driver and
    also disables IPv6 HW checksumming and TSO if jumbo packets are used
    on RTL8101/RTL8168/RTL8125.
    
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  20. Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/ker…

    davem330 committed Oct 26, 2019
    …nel/git/bluetooth/bluetooth-next
    
    Johan Hedberg says:
    
    ====================
    pull request: bluetooth-next 2019-10-23
    
    Here's the main bluetooth-next pull request for the 5.5 kernel:
    
     - Multiple fixes to hci_qca driver
     - Fix for HCI_USER_CHANNEL initialization
     - btwilink: drop superseded driver
     - Add support for Intel FW download error recovery
     - Various other smaller fixes & improvements
    
    Please let me know if there are any issues pulling. Thanks.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
  21. tcp: add TCP_INFO status for failed client TFO

    almostivan authored and davem330 committed Oct 23, 2019
    The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
    or not data-in-SYN was ack'd on both the client and server side. We'd like
    to gather more information on the client-side in the failure case in order
    to indicate the reason for the failure. This can be useful for not only
    debugging TFO, but also for creating TFO socket policies. For example, if
    a middle box removes the TFO option or drops a data-in-SYN, we can
    detect this case and turn off TFO for these connections, saving the
    extra retransmits.
    
    The newly added tcpi_fastopen_client_fail status is 2 bits and has the
    following 4 states:
    
    1) TFO_STATUS_UNSPEC
    
    Catch-all state which includes when TFO is disabled via black hole
    detection, which is indicated via LINUX_MIB_TCPFASTOPENBLACKHOLE.
    
    2) TFO_COOKIE_UNAVAILABLE
    
    If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
    is available in the cache.
    
    3) TFO_DATA_NOT_ACKED
    
    Data was sent with SYN, we received a SYN/ACK but it did not cover the data
    portion. Cookie is not accepted by server because the cookie may be invalid
    or the server may be overloaded.
    
    4) TFO_SYN_RETRANSMITTED
    
    Data was sent with SYN, we received a SYN/ACK which did not cover the data
    after at least 1 additional SYN was sent (without data). It may be the case
    that a middle-box is dropping data-in-SYN packets. Thus, it would be more
    efficient to not use TFO on this connection to avoid extra retransmits
    during connection establishment.
    
    These new fields do not cover all the cases where TFO may fail, but other
    failures, such as SYN/ACK + data being dropped, will result in the
    connection not becoming established. And a connection blackhole after
    session establishment shows up as a stalled connection.
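
A minimal sketch of how a client-side policy could consume the 2-bit status. It assumes the four states are numbered 0 to 3 in the order listed above. The `sketch_` names are illustrative and are not kernel definitions. The policy itself (stop using TFO once a SYN had to be retransmitted) is one possible reading of the TFO_SYN_RETRANSMITTED description, not something this patch implements.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the four states, assumed to be numbered in listed order. */
enum sketch_tfo_status {
	SKETCH_TFO_STATUS_UNSPEC = 0,
	SKETCH_TFO_COOKIE_UNAVAILABLE = 1,
	SKETCH_TFO_DATA_NOT_ACKED = 2,
	SKETCH_TFO_SYN_RETRANSMITTED = 3,
};

/* Hypothetical helper: printable name for a status value (must fit in 2 bits). */
static const char *sketch_tfo_status_name(unsigned int status)
{
	static const char *const names[] = {
		"TFO_STATUS_UNSPEC",
		"TFO_COOKIE_UNAVAILABLE",
		"TFO_DATA_NOT_ACKED",
		"TFO_SYN_RETRANSMITTED",
	};

	assert(status <= SKETCH_TFO_SYN_RETRANSMITTED);
	return names[status];
}

/*
 * Hypothetical policy: a retransmitted SYN hints at a middle box dropping
 * data-in-SYN packets, so avoid TFO for later connections to that peer.
 */
static bool sketch_tfo_should_disable(unsigned int status)
{
	return status == SKETCH_TFO_SYN_RETRANSMITTED;
}
```

An application would read the status from TCP_INFO after connecting and feed it to a policy of this kind; that lookup is left out here because it depends on the installed kernel headers.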
    
    Signed-off-by: Jason Baron <jbaron@akamai.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Cc: Christoph Paasch <cpaasch@apple.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Acked-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  22. Merge branch 'phy-dp83867-enable-robust-auto-mdix'

    davem330 committed Oct 26, 2019
    Grygorii Strashko says:
    
    ====================
    net: phy: dp83867: enable robust auto-mdix
    
    Patch 1 - improves link detection when dp83867 PHY is configured in manual mode
    by enabling CFG3[9] Robust Auto-MDIX option.
    
    Patch 2 - is a minor optimization.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
  23. net: phy: dp83867: move dt parsing to probe

    grygoriyS authored and davem330 committed Oct 23, 2019
    Move the DT parsing code to dp83867_probe(), as it is a one-time operation.
    
    Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  24. net: phy: dp83867: enable robust auto-mdix

    grygoriyS authored and davem330 committed Oct 23, 2019
    The link detection timeouts can be observed (or link might not be detected
    at all) when dp83867 PHY is configured in manual mode (speed/duplex).
    
    The CFG3[9] Robust Auto-MDIX option significantly improves link detection
    when the dp83867 is configured in manual mode and reduces link detection
    time.
    As per DM: "If link partners are configured to operational modes that are
    not supported by normal Auto MDI/MDIX mode (like Auto-Neg versus Force
    100Base-TX or Force 100Base-TX versus Force 100Base-TX), this Robust Auto
    MDI/MDIX mode allows MDI/MDIX resolution and prevents deadlock."
    
    Hence, enable this option by default as there are no known reasons
    not to do so.
    
    Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  25. net: sch_generic: Use pfifo_fast as fallback scheduler for CAN hardware

    nefethael authored and davem330 committed Oct 23, 2019
    There is networking hardware that isn't based on Ethernet for layers 1 and 2.
    
    For example CAN.
    
    CAN is a multi-master serial bus standard for connecting Electronic Control
    Units [ECUs] also known as nodes. A frame on the CAN bus carries up to 8 bytes
    of payload. Frame corruption is detected by a CRC. Frame loss due to
    corruption is possible, but quite unusual.
    
    While fq_codel works great for TCP/IP, it doesn't for CAN. There are a lot of
    legacy protocols on top of CAN, which are not built with flow control or high
    CAN frame drop rates in mind.
    
    When using fq_codel, as soon as the queue reaches a certain delay-based length,
    skbs from the head of the queue are silently dropped. Silently means that
    user space, using send() or a similar syscall, doesn't get an error. However,
    TCP's flow control algorithm will detect dropped packets and adjust the
    bandwidth accordingly.
    
    When using fq_codel and sending raw frames over CAN, which is the common use
    case, user space thinks the frame has been sent without problems, because
    send() returned without an error. pfifo_fast also drops skbs if the queue
    length exceeds the maximum, but with this scheduler the skbs at the tail are
    dropped and an error (-ENOBUFS) is propagated to user space, so that user
    space can slow down frame generation.
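
The tail-drop behaviour described above can be modelled in a few lines of user-space C. This is only a sketch: the queue is a plain counter, waiting is modelled by draining one slot, and all `sketch_` names are hypothetical. A real sender would call send() on a CAN socket and sleep or poll when it sees ENOBUFS.

```c
#include <errno.h>

/* Toy model of a bounded transmit queue (not a kernel structure). */
struct sketch_queue {
	unsigned int len;     /* frames currently queued */
	unsigned int limit;   /* maximum queue length */
};

/* Tail drop: refuse the new frame when the queue is full and report it. */
static int sketch_enqueue(struct sketch_queue *q)
{
	if (q->len >= q->limit) {
		errno = ENOBUFS;
		return -1;
	}
	q->len++;
	return 0;
}

/* Stand-in for the device transmitting n frames while the sender waits. */
static void sketch_drain(struct sketch_queue *q, unsigned int n)
{
	q->len = n > q->len ? 0 : q->len - n;
}

/*
 * Sender that slows down on ENOBUFS: wait (modelled by draining one frame)
 * and retry, up to max_retries times. Reports how many retries were needed.
 */
static int sketch_send_paced(struct sketch_queue *q, unsigned int max_retries,
			     unsigned int *retries_used)
{
	*retries_used = 0;
	for (;;) {
		if (sketch_enqueue(q) == 0)
			return 0;
		if (errno != ENOBUFS || *retries_used >= max_retries)
			return -1;
		sketch_drain(q, 1);
		(*retries_used)++;
	}
}
```

With a head-drop scheduler that reports no error, the sender never sees a failure, so a retry loop like this one cannot react to the loss.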
    
    On distributions where fq_codel is made the default via CONFIG_DEFAULT_NET_SCH
    at compile time, or set as the default at runtime with sysctl
    net.core.default_qdisc (see [1]), we get a bad user experience. In my test case
    with pfifo_fast, I can transfer thousands of millions of CAN frames without a
    frame drop. On the other hand, with fq_codel there is more than one lost CAN
    frame per thousand frames.
    
    As pointed out, fq_codel is not suited for CAN hardware, so this patch changes
    attach_one_default_qdisc() to use pfifo_fast for "ARPHRD_CAN" network devices.
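
The device-type check can be sketched as a small user-space model. This is not the kernel code: the function name and the use of strings in place of qdisc ops are assumptions made for illustration, and the constants are local copies of the ARPHRD_* values from <linux/if_arp.h>.

```c
/* Local copies of two ARPHRD_* device types, for illustration only. */
#define SKETCH_ARPHRD_ETHER 1
#define SKETCH_ARPHRD_CAN   280

/*
 * Hypothetical model of the selection: CAN devices always get pfifo_fast,
 * every other device type keeps the configured default scheduler.
 */
static const char *sketch_pick_default_qdisc(unsigned short dev_type,
					     const char *configured_default)
{
	if (dev_type == SKETCH_ARPHRD_CAN)
		return "pfifo_fast";
	return configured_default;
}
```

The point of the design is that the override is keyed on the device type alone, so it applies whether fq_codel became the default at compile time or through sysctl.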
    
    During transition of a netdev from down to up state the default queuing
    discipline is attached by attach_default_qdiscs() with the help of
    attach_one_default_qdisc(). This patch modifies attach_one_default_qdisc() to
    attach the pfifo_fast (pfifo_fast_ops) if the network device type is
    "ARPHRD_CAN".
    
    [1] systemd/systemd#9194
    
    Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Signed-off-by: Vincent Prince <vincent.prince.fr@gmail.com>
    Acked-by: Dave Taht <dave.taht@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>