Commits on Nov 22, 2021

  1. samples/bpf: Add example to checkpoint/restore io_uring

    The sample demonstrates how the task and io_uring BPF iterators can
    be used to checkpoint the state of an io_uring instance and then
    recreate it from that information, as a working example of how these
    iterators will be used for this purpose by userspace projects like
    CRIU.
    
    This is very similar to how CRIU actually works in principle: on
    dump, all data is written to protobuf images, which are then read
    during restore to reconstruct the task and its resources. Here we
    use a custom binary format and pipe the io_uring "image(s)" (when
    wq_fd is set there will be multiple images) to the restorer, which
    consumes this information to form a total ordering of the restore
    actions it must execute to reach the same state.
    
    The sample restores all features that currently cannot be restored
    without BPF iterators, and hence is a good demonstration of what we
    would like to achieve with these new facilities. As is evident, we
    need a single iteration pass in each iterator to obtain all the
    information we require.
    
    io_uring ring buffer restoration is orthogonal and not specific to
    iterators, so it has been left out.
    
    Our example app also shares its workqueue with a parent io_uring;
    this is detected by the dumper tool, which then dumps the parent
    io_uring first. io_uring does not allow creating cycles in this
    case, so the chain always terminates in practice. For now only a
    single parent is supported, but it is easy to extend this to
    arbitrary-length chains (by recursing with a depth limit in
    do_dump_parent after detecting wq_fd > 0).
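    The bounded parent-chain walk could be sketched like this (a minimal
    userspace sketch; get_wq_fd() and dump_one() are hypothetical
    stand-ins for the tool's real helpers, backed here by a stub parent
    table):

```c
#include <stdio.h>

#define MAX_PARENT_DEPTH 16

/* Stub standing in for reading wq_fd out of a ring's dump image:
 * parent_of[fd] is the wq_fd the ring was created with, 0 if none. */
static const int parent_of[8] = { [4] = 3 };

static int get_wq_fd(int fd)
{
	return (fd >= 0 && fd < 8) ? parent_of[fd] : 0;
}

/* Hypothetical stand-in for emitting one DUMP_SETUP image. */
static int dump_one(int fd)
{
	printf("DUMP_SETUP: io_uring_fd %d\n", fd);
	return 0;
}

/* Dump any parent ring first so the restorer sees parents before
 * dependents. io_uring forbids wq_fd cycles, so the chain terminates,
 * but cap the recursion depth defensively anyway. */
static int do_dump_parent(int fd, int depth)
{
	int wq_fd = get_wq_fd(fd);

	if (depth >= MAX_PARENT_DEPTH)
		return -1;
	if (wq_fd > 0 && do_dump_parent(wq_fd, depth + 1))
		return -1;
	return dump_one(fd);
}
```

    Calling do_dump_parent(4, 0) on the stub table prints the image for
    fd 3 before fd 4, matching the dump order shown below.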
    
    The epoll iterator usecase is similar to what we do in
    dump_io_uring_file, and would significantly simplify the current
    implementation [0].
    
      [0]: https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/eventpoll.c
    
    The dry-run mode of bpf_cr tool prints the dump image:
    
    $ ./bpf_cr app &
    PID: 318, Parent io_uring: 3, Dependent io_uring: 4
    
    $ ./bpf_cr dump 318 4 | ./bpf_cr restore --dry-run
    DUMP_SETUP:
    	io_uring_fd: 3
    	end: true
    		flags: 14
    		sq_entries: 2
    		cq_entries: 4
    		sq_thread_cpu: 0
    		sq_thread_idle: 1500
    		wq_fd: 0
    DUMP_SETUP:
    	io_uring_fd: 4
    	end: false
    		flags: 46
    		sq_entries: 2
    		cq_entries: 4
    		sq_thread_cpu: 0
    		sq_thread_idle: 1500
    		wq_fd: 3
    DUMP_EVENTFD:
    	io_uring_fd: 4
    	end: false
    		eventfd: 5
    		async: true
    DUMP_REG_FD:
    	io_uring_fd: 4
    	end: false
    		reg_fd: 0
    		index: 0
    DUMP_REG_FD:
    	io_uring_fd: 4
    	end: false
    		reg_fd: 0
    		index: 2
    DUMP_REG_FD:
    	io_uring_fd: 4
    	end: false
    		reg_fd: 0
    		index: 4
    DUMP_REG_BUF:
    	io_uring_fd: 4
    	end: false
    		addr: 0
    		len: 0
    		index: 0
    DUMP_REG_BUF:
    	io_uring_fd: 4
    	end: true
    		addr: 140721288339216
    		len: 120
    		index: 1
    Nothing to do, exiting...
    
    ======
    
    The trace is as follows:
    // We can shift fd number around randomly, it doesn't impact C/R
    $ exec 3<> /dev/urandom
    $ exec 4<> /dev/random
    $ exec 5<> /dev/null
    $ strace ./bpf_cr app &
    	...
    	io_uring_setup(2, {flags=IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF|IORING_SETUP_CQSIZE, sq_thread_cpu=0, sq_thread_idle=1500, sq_entries=2, cq_entries=4, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|IORING_FEAT_SQPOLL_NONFIXED|IORING_FEAT_EXT_ARG|IORING_FEAT_NATIVE_WORKERS|IORING_FEAT_RSRC_TAGS, sq_off={head=0, tail=64, ring_mask=256, ring_entries=264, flags=276, dropped=272, array=384}, cq_off={head=128, tail=192, ring_mask=260, ring_entries=268, overflow=284, cqes=320, flags=0x118 /* IORING_CQ_??? */}}) = 6
    	getpid()                                = 324
    	...
    	io_uring_setup(2, {flags=IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF|IORING_SETUP_CQSIZE|IORING_SETUP_ATTACH_WQ, sq_thread_cpu=0, sq_thread_idle=1500, wq_fd=6, sq_entries=2, cq_entries=4, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|IORING_FEAT_SQPOLL_NONFIXED|IORING_FEAT_EXT_ARG|IORING_FEAT_NATIVE_WORKERS|IORING_FEAT_RSRC_TAGS, sq_off={head=0, tail=64, ring_mask=256, ring_entries=264, flags=276, dropped=272, array=384}, cq_off={head=128, tail=192, ring_mask=260, ring_entries=268, overflow=284, cqes=320, flags=0x118 /* IORING_CQ_??? */}}) = 7
    	...
    	// PID: 324, Parent io_uring: 6, Dependent io_uring: 7
    	...
    	eventfd2(42, 0)                         = 8
    	io_uring_register(7, IORING_REGISTER_EVENTFD_ASYNC, [8], 1) = 0
    	io_uring_register(7, IORING_REGISTER_FILES, [0, -1, 1, -1, 2], 5) = 0
    	io_uring_register(7, IORING_REGISTER_BUFFERS, [{iov_base=NULL, iov_len=0}, {iov_base=0x7ffdf1a27680, iov_len=120}], 2) = 0
    
    The restore trace is as follows; the restorer detects the wq_fd on
    its own and dumps and restores it as well, before restoring fd 7:
    
    $ ./bpf_cr dump 326 7 | strace ./bpf_cr restore
    	...
    	io_uring_setup(2, {flags=IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF|IORING_SETUP_CQSIZE, sq_thread_cpu=0, sq_thread_idle=1500, sq_entries=2, cq_entries=4, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|IORING_FEAT_SQPOLL_NONFIXED|IORING_FEAT_EXT_ARG|IORING_FEAT_NATIVE_WORKERS|IORING_FEAT_RSRC_TAGS, sq_off={head=0, tail=64, ring_mask=256, ring_entries=264, flags=276, dropped=272, array=384}, cq_off={head=128, tail=192, ring_mask=260, ring_entries=268, overflow=284, cqes=320, flags=0x118 /* IORING_CQ_??? */}}) = 6
    	dup2(6, 6)                              = 6
    	...
    	io_uring_setup(2, {flags=IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF|IORING_SETUP_CQSIZE|IORING_SETUP_ATTACH_WQ, sq_thread_cpu=0, sq_thread_idle=1500, wq_fd=6, sq_entries=2, cq_entries=4, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|IORING_FEAT_SQPOLL_NONFIXED|IORING_FEAT_EXT_ARG|IORING_FEAT_NATIVE_WORKERS|IORING_FEAT_RSRC_TAGS, sq_off={head=0, tail=64, ring_mask=256, ring_entries=264, flags=276, dropped=272, array=384}, cq_off={head=128, tail=192, ring_mask=260, ring_entries=268, overflow=284, cqes=320, flags=0x118 /* IORING_CQ_??? */}}) = 7
    	dup2(7, 7)                              = 7
    	...
    	eventfd2(42, 0)                         = 8
    	io_uring_register(7, IORING_REGISTER_EVENTFD_ASYNC, [8], 1) = 0
    	...
    	// fd 0 refers to the same file as fds 1 and 2, hence the lowest one
    	// is used during restore; it doesn't matter, as the underlying
    	// struct file is the same...
    	io_uring_register(7, IORING_REGISTER_FILES, [0, -1, 0, -1, 0], 5) = 0
    	// This step would happen after restoring the mm, so for now it fails
    	// for the second iovec
    	io_uring_register(7, IORING_REGISTER_BUFFERS, [{iov_base=NULL, iov_len=0}, {iov_base=0x7ffdf1a27680, iov_len=120}], 2) = -1 EFAULT (Bad address)
    	...
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  2. selftests/bpf: Fix btf_dump test for bpf_iter_link_info

    Since we changed the definition while adding io_uring and epoll iterator
    support, adjust the selftest to check against the updated definition.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  3. selftests/bpf: Test partial reads for io_uring, epoll iterators

    Ensure that the output is consistent in the face of partial reads
    that return to userspace and then resume later. To this end, we do
    reads in 1-byte chunks, which would be silly in real life, but works
    well to simulate interrupted iteration. This also tests the case
    where the seq_file buffer is consumed (after seq_printf) on an
    interrupted read before the iterator invokes the BPF prog again.
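    The 1-byte read loop the test relies on can be sketched as follows
    (a userspace sketch, not the selftest's actual code):

```c
#include <unistd.h>

/* Read up to cap bytes from fd one byte at a time, mimicking how the
 * selftest forces the iterator's seq_file machinery to stop after each
 * byte and resume on the next read(2). Returns bytes read or -1. */
static long read_one_byte_chunks(int fd, char *buf, long cap)
{
	long total = 0;

	while (total < cap) {
		ssize_t n = read(fd, buf + total, 1);

		if (n < 0)
			return -1;
		if (n == 0)
			break;	/* EOF: iteration finished */
		total += n;
	}
	return total;
}
```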
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  4. selftests/bpf: Add test for epoll BPF iterator

    This tests the epoll iterator, including peeking into the epitem to
    inspect the registered file and fd number, and verifying that in
    userspace.
    
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  5. selftests/bpf: Add test for io_uring BPF iterators

    This exercises the io_uring_buf and io_uring_file iterators, and tests
    sparse file sets as well.
    
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Pavel Begunkov <asml.silence@gmail.com>
    Cc: io-uring@vger.kernel.org
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  6. bpftool: Output io_uring iterator info

    Output the sole field related to the io_uring iterator (the inode of
    the attached io_uring) so that it is useful for informational as
    well as debugging purposes (e.g. finding the actual io_uring fd
    attached to the iterator).
    
    Output:
    89: iter  prog 262  target_name io_uring_file  io_uring_inode 16764
    	pids test_progs(384)
    
    [
      {
        "id": 123,
        "type": "iter",
        "prog_id": 463,
        "target_name": "io_uring_buf",
        "io_uring_inode": 16871,
        "pids": [
          {
            "pid": 443,
            "comm": "test_progs"
          }
        ]
      }
    ]
    
    [
      {
        "id": 126,
        "type": "iter",
        "prog_id": 483,
        "target_name": "io_uring_file",
        "io_uring_inode": 16887,
        "pids": [
          {
            "pid": 448,
            "comm": "test_progs"
          }
        ]
      }
    ]
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  7. epoll: Implement eBPF iterator for registered items

    This patch adds an eBPF iterator for epoll items (epitems)
    registered in an epoll instance. It gives access to the eventpoll
    ctx and the registered epoll item (struct epitem). This allows the
    iterator to inspect the registered file and to use other iterators
    to associate it with a task's fdtable.
    
    The primary usecase this enables is expediting the existing
    eventpoll checkpoint/restore support in the CRIU project. This
    iterator allows us to switch from a worst case O(n^2) algorithm to a
    single O(n) pass over the task and epoll registered descriptors.
    
    We also make sure we're iterating over a live file, one that is not
    going away. The case we're concerned about is a file that has its
    f_count as zero, but is waiting for iterator bpf_seq_read to release
    ep->mtx, so that it can remove its epitem. Since such a file will
    disappear once iteration is done, and it is being destructed, we use
    get_file_rcu to ensure it is alive when invoking the BPF program.
    
    Getting access to a file that is going to disappear after iteration
    is not useful anyway. This does have a performance overhead, however
    (a file reference will be raised and dropped for each file).
    
    The rcu_read_lock around get_file_rcu isn't strictly required for
    lifetime management since fput path is serialized on ep->mtx to call
    ep_remove, hence the epi->ffd.file pointer remains stable during our
    seq_start/seq_stop bracketing.
    
    To be able to continue from the position we were iterating at, we
    store epi->ffd.fd and use ep_find_tfd to find the target file again.
    It would be more appropriate to use both the struct file pointer and
    the fd number to find the last file, but see below for why that
    cannot be done.
    
    Taking a reference to the struct file and walking the RB-Tree to
    find it again would lead to a reference cycle issue if the iterator,
    after a partial read, takes a reference to a socket which is later
    used in creating a descriptor cycle using SCM_RIGHTS. An example
    encountered while working on this is described below.
    
      Let there be Unix sockets SK1, SK2, epoll fd EP, and epoll iterator
      ITER.
      Let SK1 be registered in EP, then on a partial read it is possible
      that ITER returns from read and takes reference to SK1 to be able to
      find it later in RB-Tree and continue the iteration.  If SK1 sends
      ITER over to SK2 using SCM_RIGHTS, and SK2 sends SK2 over to SK1 using
      SCM_RIGHTS, and both fds are not consumed on the corresponding receive
      ends, a cycle is created.  When all of SK1, SK2, EP, and ITER are
      closed, SK1's receive queue holds reference to SK2, and SK2's receive
      queue holds reference to ITER, which holds a reference to SK1.
      All file descriptors except EP leak.
    
    To resolve this, we would need to hook into the Unix socket GC
    mechanism, whereas the alternative of using ep_find_tfd is much
    simpler. Finding the last position in the face of concurrent
    modification of the epoll set is at best an approximation anyway.
    For the case of CRIU, the epoll set remains stable.
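    For reference, the SCM_RIGHTS transfer that builds such a cycle is
    ordinary fd passing over a Unix socket; a minimal sketch of the
    sending side:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send fd over the connected Unix socket sock via SCM_RIGHTS. The fd
 * then sits in the receiver's queue holding a struct file reference
 * until it is received or the socket is torn down -- the reference
 * that the Unix socket GC has to untangle in the cycle above. */
static int send_fd(int sock, int fd)
{
	char dummy = 'x';
	char cbuf[CMSG_SPACE(sizeof(int))];
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = cbuf, .msg_controllen = sizeof(cbuf),
	};
	struct cmsghdr *cmsg;

	memset(cbuf, 0, sizeof(cbuf));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```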
    
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  8. io_uring: Implement eBPF iterator for registered files

    This change adds an eBPF iterator for files registered in an
    io_uring ctx. It gives access to the ctx, the index of the
    registered file, and a pointer to the struct file itself. This
    allows the iterator to save info related to files added to an
    io_uring instance that isn't easy to export using the fdinfo
    interface (like being able to match registered files to a task's
    file set). Getting access to the underlying struct file allows
    deduplication and efficient pairing with the task's file set
    (obtained using the task_file iterator).
    
    The primary usecase this is enabling is checkpoint/restore support.
    
    Note that we need to use mutex_trylock in the seq_start functions
    when the file is read from, as the lock ordering is the opposite of
    what it would be when an io_uring operation reads the same file. We
    take seq_file->lock, then ctx->uring_lock, while io_uring would
    first take ctx->uring_lock and then seq_file->lock for the same ctx.
    
    This can lead to a deadlock scenario described below:
    
    The sequence on CPU 0 is for a normal read(2) on the iterator. For
    CPU 1, it is an io_uring instance trying to do the same on an
    iterator attached to itself.
    
    So CPU 0 does
    
    sys_read
    vfs_read
     bpf_seq_read
     mutex_lock(&seq_file->lock)    # A
      io_uring_buf_seq_start
      mutex_lock(&ctx->uring_lock)  # B
    
    and CPU 1 does
    
    io_uring_enter
    mutex_lock(&ctx->uring_lock)    # B
     io_read
      bpf_seq_read
      mutex_lock(&seq_file->lock)   # A
      ...
    
    Since the lock order is opposite, this can deadlock. So we switch
    the mutex_lock in the seq_start function to a trylock, so that it
    can return an error in this case; it then releases seq_file->lock
    and CPU 1 makes progress.
    
    The trylock also protects the case where io_uring tries to read from
    iterator attached to itself (same ctx), where the order of locks would
    be:
     io_uring_enter
     mutex_lock(&ctx->uring_lock) <------------.
      io_read				    \
       seq_read				     \
        mutex_lock(&seq_file->lock)		     /
        mutex_lock(&ctx->uring_lock) # deadlock-`
    
    In both these cases (recursive read and contended uring_lock), -EDEADLK
    is returned to userspace.
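    The trylock pattern can be illustrated in userspace terms (a sketch
    with pthread mutexes standing in for seq_file->lock and
    ctx->uring_lock; not the kernel code itself):

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t seq_lock = PTHREAD_MUTEX_INITIALIZER;   /* A */
static pthread_mutex_t uring_lock = PTHREAD_MUTEX_INITIALIZER; /* B */

/* Reader path: take A, then only *try* B. On contention we back off
 * and report -EDEADLK instead of blocking in A -> B order while the
 * io_uring path blocks in B -> A order. */
static int seq_read_sketch(void)
{
	int ret = 0;

	pthread_mutex_lock(&seq_lock);              /* A */
	if (pthread_mutex_trylock(&uring_lock)) {   /* B, non-blocking */
		ret = -EDEADLK;                     /* caller sees error */
	} else {
		/* ... walk registered files under both locks ... */
		pthread_mutex_unlock(&uring_lock);
	}
	pthread_mutex_unlock(&seq_lock);
	return ret;
}
```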
    
    With the advent of descriptorless files supported by io_uring, this
    iterator provides the required visibility and introspection into an
    io_uring instance for the purposes of dumping and restoring it.
    
    In the future, this iterator will be extended to support direct
    inspection of a lot of file state (currently descriptorless files
    are obtained using openat2 and socket) to dump file state for these
    hidden files. Later, we can explore filling in the gaps for dumping
    file state for more file types (those not hidden in the io_uring
    ctx). All this is out of scope for the current series, but builds
    upon this iterator.
    
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Pavel Begunkov <asml.silence@gmail.com>
    Cc: io-uring@vger.kernel.org
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  9. bpf: Add bpf_page_to_pfn helper

    In CRIU, we need to be able to determine whether a page pinned by
    io_uring is still present at the same range in the process VMA.
    /proc/<pid>/pagemap gives us the PFN, hence using this helper we can
    easily establish this mapping from the iterator side.
    
    It is a simple wrapper over the in-kernel page_to_pfn macro, and it
    ensures the passed-in pointer is a struct page PTR_TO_BTF_ID. This
    is obtained from the bvec of io_uring_ubuf for the CRIU usecase.
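    For context, the userspace side of that PFN lookup reads 64-bit
    entries from /proc/<pid>/pagemap, indexed by virtual page number (a
    sketch; note that since Linux 4.0 the PFN bits are zeroed for
    unprivileged readers, while flags like the present bit remain
    visible):

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Entry layout: PFN in bits 0-54, page-present flag in bit 63. */
#define PAGEMAP_PRESENT(e)	(((e) >> 63) & 1)
#define PAGEMAP_PFN(e)		((e) & ((UINT64_C(1) << 55) - 1))

/* Return the raw pagemap entry for vaddr in this process, 0 on error. */
static uint64_t pagemap_entry(uintptr_t vaddr)
{
	long psize = sysconf(_SC_PAGESIZE);
	uint64_t entry = 0;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return 0;
	if (pread(fd, &entry, sizeof(entry),
		  (off_t)(vaddr / psize) * sizeof(entry)) != sizeof(entry))
		entry = 0;
	close(fd);
	return entry;
}
```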
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021
  10. io_uring: Implement eBPF iterator for registered buffers

    This change adds an eBPF iterator for buffers registered in an
    io_uring ctx. It gives access to the ctx, the index of the
    registered buffer, and a pointer to the io_uring_ubuf itself. This
    allows the iterator to save info related to buffers added to an
    io_uring instance that isn't easy to export using the fdinfo
    interface (like the exact struct pages composing the registered
    buffer).
    
    The primary usecase this is enabling is checkpoint/restore support.
    
    Note that we need to use mutex_trylock in the seq_start functions
    when the file is read from, as the lock ordering is the opposite of
    what it would be when an io_uring operation reads the same file. We
    take seq_file->lock, then ctx->uring_lock, while io_uring would
    first take ctx->uring_lock and then seq_file->lock for the same ctx.
    
    This can lead to a deadlock scenario described below:
    
    The sequence on CPU 0 is for a normal read(2) on the iterator. For
    CPU 1, it is an io_uring instance trying to do the same on an
    iterator attached to itself.
    
    So CPU 0 does
    
    sys_read
    vfs_read
     bpf_seq_read
     mutex_lock(&seq_file->lock)    # A
      io_uring_buf_seq_start
      mutex_lock(&ctx->uring_lock)  # B
    
    and CPU 1 does
    
    io_uring_enter
    mutex_lock(&ctx->uring_lock)    # B
     io_read
      bpf_seq_read
      mutex_lock(&seq_file->lock)   # A
      ...
    
    Since the lock order is opposite, this can deadlock. So we switch
    the mutex_lock in io_uring_buf_seq_start to a trylock, so that it
    can return an error in this case; it then releases seq_file->lock
    and CPU 1 makes progress.
    
    The trylock also protects the case where io_uring tries to read from
    iterator attached to itself (same ctx), where the order of locks would
    be:
     io_uring_enter
     mutex_lock(&ctx->uring_lock) <------------.
      io_read				    \
       seq_read				     \
        mutex_lock(&seq_file->lock)		     /
        mutex_lock(&ctx->uring_lock) # deadlock-`
    
    In both these cases (recursive read and contended uring_lock), -EDEADLK
    is returned to userspace.
    
    In the future, this iterator will be extended to directly support
    iteration of the bvec flexible array member, so that when there is
    no corresponding VMA that maps to the registered buffer (e.g. if the
    VMA is destroyed after pinning the pages), we are able to
    reconstruct the registration on restore by dumping the page contents
    and then replaying them into a temporary mapping used for
    registration later. All this is out of scope for the current series,
    but builds upon this iterator.
    
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Pavel Begunkov <asml.silence@gmail.com>
    Cc: io-uring@vger.kernel.org
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    kkdwivedi authored and intel-lab-lkp committed Nov 22, 2021

Commits on Nov 19, 2021

  1. libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags

    bpf_program__set_extra_flags has just been introduced so we can still
    change it without breaking users.
    
    This new interface is a bit more flexible (for example if someone wants
    to clear a flag).
    
    Signed-off-by: Florent Revest <revest@chromium.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org
    Florent Revest authored and anakryiko committed Nov 19, 2021
  2. selftests/bpf: Add btf_dedup case with duplicated structs within CU

    Add an artificial minimal example simulating compilers producing two
    different types within a single CU that correspond to identical struct
    definitions.
    
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211117194114.347675-2-andrii@kernel.org
    olsajiri authored and borkmann committed Nov 19, 2021
  3. libbpf: Accommodate DWARF/compiler bug with duplicated structs

    According to [0], compilers sometimes might produce duplicate DWARF
    definitions for exactly the same struct/union within the same
    compilation unit (CU). We've had similar issues with identical arrays
    and handled them with a similar workaround in 6b6e6b1 ("libbpf:
    Accomodate DWARF/compiler bug with duplicated identical arrays"). Do the
    same for struct/union by ensuring that two structs/unions are exactly
    the same, down to the integer values of field referenced type IDs.
    
    Solving this more generically (allowing referenced types to be
    equivalent, but using different type IDs, all within a single CU)
    requires a huge complexity increase to handle many-to-many mappings
    between canonical and candidate type graphs. Before we invest in
    that, let's see if this approach handles all the instances of this
    issue in practice. Thankfully it seems to be pretty rare.
    
      [0] https://lore.kernel.org/bpf/YXr2NFlJTAhHdZqq@krava/
    
    Reported-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211117194114.347675-1-andrii@kernel.org
    anakryiko authored and borkmann committed Nov 19, 2021
  4. libbpf: Add runtime APIs to query libbpf version

    Libbpf provided LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros
    to check the libbpf version at compilation time. This doesn't cover
    all needs, though, because the version of libbpf that an application
    is compiled against doesn't necessarily match the version of libbpf
    at runtime, especially if libbpf is used as a shared library.
    
    Add libbpf_major_version() and libbpf_minor_version(), returning the
    major and minor versions, respectively, as integers. Also add a
    convenience libbpf_version_string() for various tooling using libbpf
    to print out the libbpf version in a human-readable form. Currently
    it will return "v0.6", but in the future it may contain extra
    information, so the format itself is not part of the stable API and
    shouldn't be relied upon.
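    The compile-time vs runtime distinction can be sketched without
    libbpf itself; the function bodies below are stubs that only mirror
    the shape of the new API (real users include <bpf/libbpf.h> and link
    against libbpf):

```c
/* Compile-time version from the headers the application built against. */
#define LIBBPF_MAJOR_VERSION 0
#define LIBBPF_MINOR_VERSION 6

/* Runtime versions, normally resolved from the libbpf.so actually
 * loaded; stubbed here so the sketch is self-contained. */
static int libbpf_major_version(void) { return 0; }
static int libbpf_minor_version(void) { return 6; }
static const char *libbpf_version_string(void) { return "v0.6"; }

/* An application can now detect a header/runtime mismatch, e.g. being
 * compiled against newer headers than the installed shared library. */
static int libbpf_runtime_is_new_enough(void)
{
	return libbpf_major_version() == LIBBPF_MAJOR_VERSION &&
	       libbpf_minor_version() >= LIBBPF_MINOR_VERSION;
}
```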
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/bpf/20211118174054.2699477-1-andrii@kernel.org
    anakryiko authored and borkmann committed Nov 19, 2021

Commits on Nov 18, 2021

  1. selftests/bpf: Adapt vmtest.sh to s390 libbpf CI changes

    [1] added s390 support to libbpf CI and added an ${ARCH} prefix to a
    number of paths and identifiers in libbpf GitHub repo, which vmtest.sh
    relies upon. Update these and make use of the new s390 support.
    
    [1] libbpf/libbpf#204
    
    Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211118115225.1349726-1-iii@linux.ibm.com
    iii-i authored and anakryiko committed Nov 18, 2021

Commits on Nov 17, 2021

  1. selftests/bpf: Fix xdpxceiver failures for no hugepages

    xsk_configure_umem() needs hugepages to work in unaligned mode, so
    when hugepages are not configured, the 'unaligned' tests should be
    skipped, as determined by the helper function hugepages_present().
    This function erroneously returned true even when no hugepages were
    configured, because of the MAP_NORESERVE flag; removing the flag
    fixes the issue.
    
    The TEST_TYPE_UNALIGNED_INV_DESC test also needs to be skipped when
    there are no hugepages. However, it was not skipped, as there was no
    check for the presence of hugepages, and hence it was failing. The
    check to skip the test has now been added.
    
    Fixes: a4ba98d (selftests: xsk: Add test for unaligned mode)
    Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211117123613.22288-1-tirthendu.sarkar@intel.com
    tirthendu-intel authored and borkmann committed Nov 17, 2021
  2. bpf, docs: Fix ordering of bpf documentation

    This commit fixes the display of the BPF documentation in the sidebar
    when rendered as HTML.
    
    Before this patch, the sidebar would render as follows for some
    sections:
    
    | BPF Documentation
      |- BPF Type Format (BTF)
        |- BPF Type Format (BTF)
    
    This was due to creating a heading in index.rst followed by
    a sphinx toctree, where the file referenced carries the same
    title as the section heading.
    
    To fix this I applied a pattern that has been established in other
    subfolders of Documentation:
    
    1. Re-wrote index.rst to have a single toctree
    2. Split the sections out in to their own files
    
    Additionally, maps.rst and programs.rst make use of a glob pattern
    to include map_* or prog_* rst files in their toctrees, meaning
    future map or program type documentation will be included
    automatically.
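    The glob form in those toctrees looks roughly like this (a sketch of
    the Sphinx directive as it might appear in maps.rst, not the exact
    file contents):

```rst
.. toctree::
   :maxdepth: 1
   :glob:

   map_*
```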
    
    Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/1a1eed800e7b9dc13b458de113a489641519b0cc.1636749493.git.dave@dtucker.co.uk
    dave-tucker authored and borkmann committed Nov 17, 2021
  3. bpf, docs: Rename bpf_lsm.rst to prog_lsm.rst

    This allows for documentation relating to BPF Program Types to be
    matched by the glob pattern prog_* for inclusion in a sphinx toctree
    
    Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: KP Singh <kpsingh@kernel.org>
    Link: https://lore.kernel.org/bpf/49fe0f370a2b28500c1b60f1fdb6fb7ec90de28a.1636749493.git.dave@dtucker.co.uk
    dave-tucker authored and borkmann committed Nov 17, 2021
  4. bpf, docs: Change underline in btf to match style guide

    This changes the type of underline used to follow the guidelines in
    Documentation/doc-guide/sphinx.rst which also ensures that the headings
    are rendered at the correct level in the HTML sidebar
    
    Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/981b27485cc294206480df36fca46817e2553e39.1636749493.git.dave@dtucker.co.uk
    dave-tucker authored and borkmann committed Nov 17, 2021
  5. selftests/bpf: Mark variable as static

    Fix warnings from checkstyle.pl
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-4-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021
  6. selftests/bpf: Variable naming fix

    Change log_fd to log_fp to reflect its type correctly.
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-3-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021
  7. selftests/bpf: Move summary line after the error logs

    Makes it easier to find the summary line when there are a lot of
    logs to scroll back through.
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-2-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021

Commits on Nov 16, 2021

  1. selftests/bpf: Add uprobe triggering overhead benchmarks

    Add benchmarks to measure the overhead of uprobes and uretprobes.
    Also add a baseline (no uprobe attached) benchmark.
    
    On my dev machine, the baseline benchmark can trigger 130M
    user_target() invocations per second. With a uprobe attached, this
    falls to just 700K per second. With a uretprobe, we get down to
    520K per second:
    
      $ sudo ./bench trig-uprobe-base -a
      Summary: hits  131.289 ± 2.872M/s
    
      # UPROBE
      $ sudo ./bench -a trig-uprobe-without-nop
      Summary: hits    0.729 ± 0.007M/s
    
      $ sudo ./bench -a trig-uprobe-with-nop
      Summary: hits    1.798 ± 0.017M/s
    
      # URETPROBE
      $ sudo ./bench -a trig-uretprobe-without-nop
      Summary: hits    0.508 ± 0.012M/s
    
      $ sudo ./bench -a trig-uretprobe-with-nop
      Summary: hits    0.883 ± 0.008M/s
    
    So there is almost a 2.5x performance difference between probing a
    nop vs a non-nop instruction for an entry uprobe, and a 1.7x
    difference for a uretprobe.
    
    This means that the overhead is around 1.4 microseconds for a
    non-nop uprobe and around 2 microseconds for a non-nop uretprobe.
    
    For the nop variants, the uprobe and uretprobe overheads are down to
    0.556 and 1.13 microseconds, respectively.
    
    For comparison, just doing a very low-overhead syscall (with no BPF
    programs attached anywhere) gives:
    
      $ sudo ./bench trig-base -a
      Summary: hits    4.830 ± 0.036M/s
    
    So uprobes are about 2.67x slower than pure context switch.
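    The per-hit numbers above follow directly from the throughputs (a
    small helper to make the arithmetic explicit; the ~131 M/s baseline
    contributes under 0.01 us per hit, so it is nearly negligible):

```c
/* Per-hit cost in microseconds given a throughput in M hits/sec:
 * 1 / (M hits/sec) = microseconds per hit. */
static double us_per_hit(double mhits_per_sec)
{
	return 1.0 / mhits_per_sec;
}

/* Probe overhead: attached per-hit cost minus the baseline cost. */
static double probe_overhead_us(double attached, double baseline)
{
	return us_per_hit(attached) - us_per_hit(baseline);
}
```

    For example, probe_overhead_us(0.729, 131.289) gives roughly the 1.4
    microseconds quoted for the non-nop uprobe.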
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
    anakryiko authored and borkmann committed Nov 16, 2021
  2. bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

    In the current code, the actual max tail call count is 33, which is
    greater than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is
    thus not consistent with the meaning of MAX_TAIL_CALL_CNT, which is
    confusing at first glance. We can see the historical evolution from
    commit 04fd61a ("bpf: allow bpf programs to tail-call other bpf
    programs") and commit f9dabe0 ("bpf: Undo off-by-one in interpreter
    tail call count limit"). In order to avoid changing existing
    behavior, keeping the actual limit at 33 is reasonable.
    
    After commit 874be05 ("bpf, tests: Add tail call test suite"), we can
    see that a testcase fails.
    
    On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
     # echo 0 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf
     # dmesg | grep -w FAIL
     Tail call error path, max count reached jited:0 ret 34 != 33 FAIL
    
    On some archs:
     # echo 1 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf
     # dmesg | grep -w FAIL
     Tail call error path, max count reached jited:1 ret 34 != 33 FAIL
    
    Although the above failed testcase has been fixed in commit 18935a7
    ("bpf/tests: Fix error in tail call limit tests"), it would still be good
    to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
    more readable.
    
    The 32-bit x86 JIT was using a limit of 32; fix its incorrect comments and
    raise its limit to 33 tail calls to match the updated MAX_TAIL_CALL_CNT. For
    the mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
    For the riscv JIT, use RV_REG_TCC directly to save one register move as
    suggested by Björn Töpel. For the other implementations, there are no
    functional changes: the current limit of 33 is unchanged, the new value of
    MAX_TAIL_CALL_CNT now reflects the actual max tail call count, and the
    related tail call testcases in the test_bpf module and selftests work for
    both the interpreter and the JITs.
    
    Here are the test results on x86_64:
    
     # uname -m
     x86_64
     # echo 0 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf test_suite=test_tail_calls
     # dmesg | tail -1
     test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
     # rmmod test_bpf
     # echo 1 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf test_suite=test_tail_calls
     # dmesg | tail -1
     test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
     # rmmod test_bpf
     # ./test_progs -t tailcalls
     #142 tailcalls:OK
     Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED
    
    Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Acked-by: Björn Töpel <bjorn@kernel.org>
    Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
    Tiezhu Yang authored and borkmann committed Nov 16, 2021
  3. selftests/bpf: Configure dir paths via env in test_bpftool_synctypes.py

    Script test_bpftool_synctypes.py parses a number of files in the bpftool
    directory (or even elsewhere in the repo) to make sure that the lists of
    types or options in those different files are consistent. Instead of
    having fixed paths, let's make the directories configurable through
    environment variables. This should make it easier in the future to run the
    script in a different setup, for example on an out-of-tree bpftool
    mirror with a different layout.
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-4-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  4. bpftool: Update doc (use susbtitutions) and test_bpftool_synctypes.py

    test_bpftool_synctypes.py helps detect inconsistencies in bpftool
    between the different lists of types and options scattered across the
    sources, the documentation, and the bash completion. For options that
    apply to all bpftool commands, the script had a hardcoded list of
    values, and would use it to check whether the man pages are
    up-to-date. When writing the script, it felt acceptable to keep this
    list in order to avoid opening and parsing bpftool's main.h every time,
    and because the list of global options in bpftool doesn't change often.
    
    However, this is prone to omissions, and we recently added a new
    -l|--legacy option which was described in common_options.rst, but not
    listed in the options summary of each manual page. The script did not
    complain, because it kept comparing the hardcoded list to the (now)
    outdated list in the header file.
    
    To address the issue, this commit brings the following changes:
    
    - Options that are common to all bpftool commands (--json, --pretty, and
      --debug) are moved to a dedicated file, and used in the definition of
      a RST substitution. This substitution is used in the sources of all
      the man pages.
    
    - This list of common options is updated, with the addition of the new
      -l|--legacy option.
    
    - The script test_bpftool_synctypes.py is updated to compare:
        - Options specific to a command, found in C files, for the
          interactive help messages, with the same specific options from the
          relevant man page for that command.
        - Common options, checked just once: the list in main.h is
          compared with the new list in substitutions.rst.
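    The substitution mechanism looks roughly like the fragment below; the
    exact file layout and substitution name in the tree may differ from this
    sketch:

```rst
.. In substitutions.rst (shared definition):
.. |COMMON_OPTIONS| replace:: { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-d** | **--debug** } | { **-l** | **--legacy** }

.. In each man page source:
.. include:: substitutions.rst

OPTIONS
=======
	|COMMON_OPTIONS|
```

    With this, adding a new global option means editing one file; every man
    page picks it up, and the script only needs to compare main.h against the
    single shared list.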
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-3-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  5. bpftool: Add SPDX tags to RST documentation files

    Most files in the kernel repository have an SPDX tag. The files that
    don't have such a tag (or another license boilerplate) tend to fall
    under the GPL-2.0 license. In the past, bpftool's Makefile (for example)
    has been marked as GPL-2.0 for that reason, when in fact all of bpftool is
    dual-licensed.
    
    To prevent a similar confusion from happening with the RST documentation
    files for bpftool, let's explicitly mark all files as dual-licensed.
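    In an RST file the SPDX tag takes the form of a comment at the top of the
    file; the license expression below matches bpftool's dual license, though
    the tree should be checked for the exact tag the commit added:

```rst
.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
```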
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-2-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  6. selftests/bpf: Add a dedup selftest with equivalent structure types

    Without the previous libbpf patch, the following error will occur:
    
      $ ./test_progs -t btf
      ...
      do_test_dedup:FAIL:check btf_dedup failed errno:-22#13/205 btf/dedup: btf_type_tag #5, struct:FAIL
    
    And the previous libbpf patch fixed the issue.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115163943.3922547-1-yhs@fb.com
    yonghong-song authored and borkmann committed Nov 16, 2021
  7. libbpf: Fix a couple of missed btf_type_tag handling in btf.c

    Commit 2dc1e48 ("libbpf: Support BTF_KIND_TYPE_TAG") added the
    BTF_KIND_TYPE_TAG support. But to test vmlinux build with ...
    
      #define __user __attribute__((btf_type_tag("user")))
    
    ... I needed to sync the libbpf repo and manually copy libbpf sources to
    pahole. To simplify the process, I used BTF_KIND_RESTRICT to simulate
    BTF_KIND_TYPE_TAG in the vmlinux build, as the "restrict" modifier is
    barely used in the kernel.
    
    But this approach missed one case in dedup with structures, where
    BTF_KIND_RESTRICT is handled but BTF_KIND_TYPE_TAG is not in
    btf_dedup_is_equiv(), and this results in a pahole dedup failure.
    This patch fixes the issue, and a selftest is added in the subsequent
    patch to cover this scenario.
    
    The other missed handling is in btf__resolve_size(). Currently the compiler
    always emits chains like PTR->TYPE_TAG->..., so in practice we don't hit the
    missing BTF_KIND_TYPE_TAG handling with compiler-generated code. But let's
    add a case for BTF_KIND_TYPE_TAG in the switch statement to be future proof.
    
    Fixes: 2dc1e48 ("libbpf: Support BTF_KIND_TYPE_TAG")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com
    yonghong-song authored and borkmann committed Nov 16, 2021
  8. bpftool: Add current libbpf_strict mode to version output

    + bpftool --legacy --version
    bpftool v5.15.0
    features: libbfd, skeletons
    + bpftool --version
    bpftool v5.15.0
    features: libbfd, libbpf_strict, skeletons
    
    + bpftool --legacy --help
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    + bpftool --help
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    
    + bpftool --legacy
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    + bpftool
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    
    + bpftool --legacy version
    bpftool v5.15.0
    features: libbfd, skeletons
    + bpftool version
    bpftool v5.15.0
    features: libbfd, libbpf_strict, skeletons
    
    + bpftool --json --legacy version
    {"version":"5.15.0","features":{"libbfd":true,"libbpf_strict":false,"skeletons":true}}
    + bpftool --json version
    {"version":"5.15.0","features":{"libbfd":true,"libbpf_strict":true,"skeletons":true}}
    
    Suggested-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Quentin Monnet <quentin@isovalent.com>
    Link: https://lore.kernel.org/bpf/20211116000448.2918854-1-sdf@google.com
    fomichev authored and borkmann committed Nov 16, 2021

Commits on Nov 15, 2021

  1. Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

    Daniel Borkmann says:
    
    ====================
    pull-request: bpf-next 2021-11-15
    
    We've added 72 non-merge commits during the last 13 day(s) which contain
    a total of 171 files changed, 2728 insertions(+), 1143 deletions(-).
    
    The main changes are:
    
    1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to
       BTF such that BPF verifier will be able to detect misuse, from Yonghong Song.
    
    2) Big batch of libbpf improvements including various fixes, future proofing APIs,
       and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko.
    
    3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the
       programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush.
    
    4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to
       ensure exception handling still works, from Russell King and Alan Maguire.
    
    5) Add a new bpf_find_vma() helper for tracing to map an address to the backing
       file such as shared library, from Song Liu.
    
    6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump,
       updating documentation and bash-completion among others, from Quentin Monnet.
    
    7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as
       the API is heavily tailored around perf and is non-generic, from Dave Marchevsky.
    
    8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an
       opt-out for more relaxed BPF program requirements, from Stanislav Fomichev.
    
    9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen.
    
    * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
      bpftool: Use libbpf_get_error() to check error
      bpftool: Fix mixed indentation in documentation
      bpftool: Update the lists of names for maps and prog-attach types
      bpftool: Fix indent in option lists in the documentation
      bpftool: Remove inclusion of utilities.mak from Makefiles
      bpftool: Fix memory leak in prog_dump()
      selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
      selftests/bpf: Fix an unused-but-set-variable compiler warning
      bpf: Introduce btf_tracing_ids
      bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
      bpftool: Enable libbpf's strict mode by default
      docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
      selftests/bpf: Clarify llvm dependency with btf_tag selftest
      selftests/bpf: Add a C test for btf_type_tag
      selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
      selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
      selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
      selftests/bpf: Test libbpf API function btf__add_type_tag()
      bpftool: Support BTF_KIND_TYPE_TAG
      libbpf: Support BTF_KIND_TYPE_TAG
      ...
    ====================
    
    Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Nov 15, 2021
  2. Revert "Merge branch 'mctp-i2c-driver'"

    This reverts commit 71812af, reversing
    changes made to cc0be1a.
    
    Wolfram Sang says:
    
    Please revert. Besides the driver in net, it modifies the I2C core
    code. This has not been acked by the I2C maintainer (in this case me).
    So, please don't pull this in via the net tree. The question raised here
    (extending SMBus calls to 255 bytes) is complicated because we need ABI
    backwards compatibility.
    
    Link: https://lore.kernel.org/all/YZJ9H4eM%2FM7OXVN0@shikoro/
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Nov 15, 2021
  3. Merge branch 'generic-phylink-validation'

    Russell King says:
    
    ====================
    introduce generic phylink validation
    
    The various validate method implementations we have in phylink users
    have been quite repetitive but also prone to bugs. These patches
    introduce a generic implementation which relies solely on the
    supported_interfaces bitmap introduced during last cycle, and in the
    first patch, a bit array of MAC capabilities.
    
    MAC drivers are free to continue to do their own thing if they have
    special requirements - such as mvneta and mvpp2 which do not support
    1000base-X without AN enabled. Most implementations currently in the
    kernel can be converted to call phylink_generic_validate() directly
    from the phylink MAC operations structure once they fill in the
    supported_interfaces and mac_capabilities members of phylink_config.
    
    This series introduces the generic implementation, and converts mvneta
    and mvpp2 to use it.
    ====================
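    In driver terms, the conversion this series enables looks roughly like the
    fragment below (kernel code, not standalone-buildable; the capability flags
    chosen here are an illustrative set, each MAC driver fills in its own):

```c
/* Fill in phylink_config once at probe time... */
static void example_phylink_setup(struct phylink_config *config)
{
	config->mac_capabilities = MAC_SYM_PAUSE | MAC_ASYM_PAUSE |
				   MAC_10 | MAC_100 | MAC_1000FD;
	__set_bit(PHY_INTERFACE_MODE_SGMII, config->supported_interfaces);
	__set_bit(PHY_INTERFACE_MODE_1000BASEX, config->supported_interfaces);
}

/* ...then the MAC ops can point straight at the generic helper instead
 * of a hand-rolled validate() implementation. */
static const struct phylink_mac_ops example_mac_ops = {
	.validate = phylink_generic_validate,
	/* .mac_config, .mac_link_up, ... as before */
};
```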
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Nov 15, 2021
  4. net: mvpp2: use phylink_generic_validate()

    Convert mvpp2 to use phylink_generic_validate() for the bulk of its
    validate() implementation. This network adapter has a restriction
    that for 802.3z links, autonegotiation must be enabled.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Nov 15, 2021
  5. net: mvneta: use phylink_generic_validate()

    Convert mvneta to use phylink_generic_validate() for the bulk of its
    validate() implementation. This network adapter has a restriction
    that for 802.3z links, autonegotiation must be enabled.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Nov 15, 2021