Usama-Arif/bpf…

Commits on Jan 21, 2022

  1. selftests/bpf: add test for module helper

    This is a simple test for a module helper that accepts
    two pointers to integers, prints them (using printk, which isn't
    directly accessible to eBPF applications), and returns their sum.
    The test has been adapted from test_ksyms_module.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    uarif1 authored and intel-lab-lkp committed Jan 21, 2022
  2. bpf: add support for module helpers in verifier

    After the kernel module registers the helper, its BTF id
    and func_proto are available during verification. The verifier
    checks whether insn->imm is present in the list of module
    helper BTF ids. If it is, check_helper_call is called;
    otherwise, check_kfunc_call is called. The module helper
    function proto is obtained in check_helper_call via the
    get_mod_helper_proto function.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    uarif1 authored and intel-lab-lkp committed Jan 21, 2022
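
    The dispatch described in this commit can be pictured with a small
    stand-alone model. It is only a sketch: the names mod_helper_ids,
    is_mod_helper and classify_call, and the ID values, are hypothetical
    and not taken from the patch. Only the decision itself (ID found in
    the module helper list means the helper path, otherwise the kfunc
    path) comes from the commit message.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for the BTF IDs registered by a module. */
    static const uint32_t mod_helper_ids[] = { 1201, 1207, 1350 };

    enum call_path { PATH_HELPER, PATH_KFUNC };

    /* Linear search: is this BTF ID one of the registered module helpers? */
    static bool is_mod_helper(const uint32_t *ids, size_t n, uint32_t btf_id)
    {
        assert(ids != NULL || n == 0);
        for (size_t i = 0; i < n; i++)
            if (ids[i] == btf_id)
                return true;
        return false;
    }

    /* Models the choice between check_helper_call and check_kfunc_call. */
    static enum call_path classify_call(uint32_t imm)
    {
        size_t n = sizeof(mod_helper_ids) / sizeof(mod_helper_ids[0]);

        return is_mod_helper(mod_helper_ids, n, imm) ? PATH_HELPER : PATH_KFUNC;
    }
    ```

    With these made-up IDs, classify_call(1207) takes the helper path and
    any unregistered ID falls through to the kfunc path.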
  3. bpf: btf: Introduce infrastructure for module helpers

    This adds support for calling helper functions in eBPF applications
    that have been declared in a kernel module. The corresponding
    verifier changes for module helpers will be added in a later patch.
    
    Module helpers are useful because:
    - They support more argument and return types than module kfuncs.
    - They provide a way to have helper functions that are too specialized
      for a specific use case to merge upstream, but that can have a
      constant API and can be maintained in kernel modules.
    - The number of in-kernel helpers has grown large (187 at the time of
      writing this commit). Module helper functions could slow the growth
      in the number of in-kernel helpers that are maintained upstream.
    
    When the kernel module registers the helper, the module owner, the
    BTF id set of the function, and the function proto are stored as part
    of a btf_mod_helper entry in a btf_mod_helper_list, which is part of
    struct btf. This entry can be removed in the unregister function
    while exiting the module, and can be used by the bpf verifier to
    check the helper call and get the function proto.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    uarif1 authored and intel-lab-lkp committed Jan 21, 2022
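
    The register, unregister and lookup cycle described in this commit can
    be sketched as a plain linked list. This is a simplified user-space
    model, not the kernel code: struct mod_helper and the three functions
    are hypothetical names, the owner and proto are reduced to strings,
    and a single ID stands in for the BTF id set.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical, simplified entry (one ID instead of a BTF id set). */
    struct mod_helper {
        const char *owner;          /* stands in for the module owner */
        uint32_t btf_id;
        const char *proto;          /* stands in for the function proto */
        struct mod_helper *next;
    };

    struct mod_helper_list {
        struct mod_helper *head;
    };

    /* Called when the module registers a helper. */
    static int mod_helper_register(struct mod_helper_list *l, const char *owner,
                                   uint32_t btf_id, const char *proto)
    {
        struct mod_helper *e;

        assert(l != NULL && owner != NULL);
        e = malloc(sizeof(*e));
        if (!e)
            return -1;
        e->owner = owner;
        e->btf_id = btf_id;
        e->proto = proto;
        e->next = l->head;
        l->head = e;
        return 0;
    }

    /* Called on module exit: drop every entry owned by this module. */
    static void mod_helper_unregister(struct mod_helper_list *l, const char *owner)
    {
        struct mod_helper **pp = &l->head;

        while (*pp) {
            struct mod_helper *e = *pp;

            if (strcmp(e->owner, owner) == 0) {
                *pp = e->next;
                free(e);
            } else {
                pp = &e->next;
            }
        }
    }

    /* What a verifier-side lookup would need: the proto for a given ID. */
    static const char *mod_helper_proto(const struct mod_helper_list *l,
                                        uint32_t btf_id)
    {
        for (const struct mod_helper *e = l->head; e; e = e->next)
            if (e->btf_id == btf_id)
                return e->proto;
        return NULL;
    }
    ```

    After unregistering, a lookup for the same ID returns NULL, which is
    the property the verifier relies on once a module has been unloaded.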
  4. selftests: bpf: test BPF_PROG_QUERY for progs attached to sockmap

    Add a test for querying progs attached to a sockmap. We use an existing
    libbpf query interface to query the prog count before and after
    attaching progs to the sockmap and check whether the queried prog id
    is right.
    
    Signed-off-by: Di Zhu <zhudi2@huawei.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/r/20220119014005.1209-2-zhudi2@huawei.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Di Zhu authored and Alexei Starovoitov committed Jan 21, 2022
  5. bpf: support BPF_PROG_QUERY for progs attached to sockmap

    Right now there is no way to query whether BPF programs are
    attached to a sockmap or not.
    
    We can use the standard interface in libbpf to query, such as:
    bpf_prog_query(mapFd, BPF_SK_SKB_STREAM_PARSER, 0, NULL, ...);
    where mapFd is the fd of the sockmap.
    
    Signed-off-by: Di Zhu <zhudi2@huawei.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/r/20220119014005.1209-1-zhudi2@huawei.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Di Zhu authored and Alexei Starovoitov committed Jan 21, 2022
  6. Merge branch 'libbpf: streamline netlink-based XDP APIs'

    Andrii Nakryiko says:
    
    ====================
    
    Revamp existing low-level XDP APIs provided by libbpf to follow more
    consistent naming (new APIs follow bpf_tc_xxx() approach where it makes
    sense) and be extensible without ABI breakages (OPTS-based). See patch #1 for
    details, remaining patches switch bpftool, selftests/bpf and samples/bpf to
    new APIs.
    ====================
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Alexei Starovoitov committed Jan 21, 2022
  7. samples/bpf: adapt samples/bpf to bpf_xdp_xxx() APIs

    Use new bpf_xdp_*() APIs across all XDP-related BPF samples.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120061422.2710637-5-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
  8. selftests/bpf: switch to new libbpf XDP APIs

    Switch to using the new bpf_xdp_*() APIs across all selftests. Take
    advantage of the more straightforward and user-friendly semantics of
    old_prog_fd (0 means "don't care") in a few places.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120061422.2710637-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
  9. bpftool: use new API for attaching XDP program

    Switch to new bpf_xdp_attach() API to avoid deprecation warnings.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120061422.2710637-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
  10. libbpf: streamline low-level XDP APIs

    Introduce 4 new netlink-based XDP APIs for attaching, detaching, and
    querying XDP programs:
      - bpf_xdp_attach;
      - bpf_xdp_detach;
      - bpf_xdp_query;
      - bpf_xdp_query_id.
    
    These APIs replace bpf_set_link_xdp_fd, bpf_set_link_xdp_fd_opts,
    bpf_get_link_xdp_id, and bpf_get_link_xdp_info APIs ([0]). The latter
    don't follow a consistent naming pattern and some of them use
    non-extensible approaches (e.g., struct xdp_link_info which can't be
    modified without breaking libbpf ABI).
    
    The approach I took with these low-level XDP APIs is similar to what we
    did with low-level TC APIs. There is a nice duality of bpf_tc_attach vs
    bpf_xdp_attach, and so on. I left bpf_xdp_attach() to support detaching
    when -1 is specified for prog_fd for generality and convenience, but
    bpf_xdp_detach() is preferred due to clearer naming and associated
    semantics. Both bpf_xdp_attach() and bpf_xdp_detach() accept the same
    opts struct, which allows specifying the expected old_prog_fd.
    
    While doing the refactoring, I noticed that old APIs require users to
    specify opts with old_fd == -1 to declare "don't care about already
    attached XDP prog fd" condition. Otherwise, FD 0 is assumed, which is
    essentially never an intended behavior. So I made this behavior
    consistent with other kernel and libbpf APIs, in which zero FD means "no
    FD". This seems to be more in line with the latest thinking in BPF land
    and should cause less user confusion, hopefully.
    
    For querying, I left two APIs: the more generic bpf_xdp_query(), which
    allows querying multiple IDs and the attach mode, and a specialization
    of it, bpf_xdp_query_id(), which returns only the requested prog_id.
    Uses of the prog_id-returning bpf_get_link_xdp_id() were so prevalent
    across selftests and samples that it seemed a very common use case,
    and using bpf_xdp_query() for it felt very cumbersome, with a heavily
    branched if/else chain based on flags and attach mode.
    
    Old APIs are scheduled for deprecation in libbpf 0.8 release.
    
      [0] Closes: libbpf/libbpf#309
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/r/20220120061422.2710637-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
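
    The change in "don't care" semantics described in this commit is easy
    to get wrong, so here is a small model of it. The struct and function
    names are hypothetical and no libbpf call is made; the sketch only
    encodes the rule from the commit message: the old API treats -1 as
    "don't care" (so a zeroed opts struct silently expects FD 0), while
    the new API treats 0 as "no FD".

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical stand-ins for the old and new option structs. */
    struct old_xdp_opts { int old_fd; };
    struct new_xdp_opts { int old_prog_fd; };

    /* Old rule: only -1 means "don't care"; 0 is taken as a real FD. */
    static bool old_api_expects_prog(const struct old_xdp_opts *opts)
    {
        assert(opts != NULL);
        return opts->old_fd != -1;
    }

    /* New rule: 0 means "no FD", so zero-initialized opts don't care. */
    static bool new_api_expects_prog(const struct new_xdp_opts *opts)
    {
        assert(opts != NULL);
        return opts->old_prog_fd != 0;
    }
    ```

    A zero-initialized old_xdp_opts therefore asks for "replace the
    program with FD 0", which is the surprising behavior the commit
    removes; the same zeroed new_xdp_opts expresses no expectation.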
  11. Merge branch 'libbpf: deprecate legacy BPF map definitions'

    Andrii Nakryiko says:
    
    ====================
    
    Officially deprecate legacy BPF map definitions in libbpf. They've been slated
    for deprecation for a while in favor of more powerful BTF-defined map
    definitions and this patch set adds warnings and a way to enforce this in
    libbpf through LIBBPF_STRICT_MAP_DEFINITIONS strict mode flag.
    
    Selftests are fixed up and updated, BPF documentation is updated, bpftool's
    strict mode usage is adjusted to avoid breaking users unnecessarily.
    
    v1->v2:
      - replace missed bpf_map_def case in Documentation/bpf/btf.rst (Alexei).
    ====================
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Alexei Starovoitov committed Jan 21, 2022
  12. docs/bpf: update BPF map definition example

    Use BTF-defined map definition in the documentation example.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120060529.1890907-5-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
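
    For readers who have not seen one, a BTF-defined map looks roughly
    like the sketch below. To keep it compilable on its own, the __uint
    and __type macros are redefined locally (BPF programs normally get
    them from bpf_helpers.h), the SEC(".maps") annotation is left out,
    and demo_map with its key and value types is a made-up example rather
    than the one used in the documentation.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Local copies so the sketch builds without libbpf headers. */
    #define __uint(name, val) int (*name)[val]
    #define __type(name, val) __typeof__(val) *name

    #define BPF_MAP_TYPE_HASH 1   /* value of the UAPI enumerator */

    /*
     * BTF-defined map: every parameter is encoded in the *type* of a
     * member, so the loader can read it back from BTF. Real BPF code
     * would additionally place this object in the ".maps" section.
     */
    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 1024);
        __type(key, uint32_t);
        __type(value, uint64_t);
    } demo_map;

    /* The key type travels as the pointee type of the 'key' member. */
    static_assert(sizeof(*demo_map.key) == sizeof(uint32_t),
                  "key type is carried by the member's pointee type");

    /* Integer parameters travel as the length of an int array type. */
    static size_t demo_map_max_entries(void)
    {
        return sizeof(*demo_map.max_entries) / sizeof(int);
    }
    ```

    Nothing is stored in the members at run time; only their types matter.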
  13. libbpf: deprecate legacy BPF map definitions

    Enact deprecation of legacy BPF map definition in SEC("maps") ([0]). For
    the definitions themselves introduce LIBBPF_STRICT_MAP_DEFINITIONS flag
    for libbpf strict mode. If it is set, error out on any struct
    bpf_map_def-based map definition. If not set, libbpf will print out
    a warning for each legacy BPF map to raise awareness that it goes away.
    
    For any use of BPF_ANNOTATE_KV_PAIR() macro providing a legacy way to
    associate BTF key/value type information with legacy BPF map definition,
    warn through libbpf's pr_warn() error message (but don't fail BPF object
    open).
    
    BPF-side struct bpf_map_def is marked as deprecated. User-space struct
    bpf_map_def has to be used internally in libbpf, so it is left
    untouched. It should be enough for bpf_map__def() to be marked
    deprecated to raise awareness that it goes away.
    
    bpftool is an interesting case: it uses libbpf to open a BPF ELF
    object in order to generate a skeleton. As such, even though bpftool
    itself uses full-on strict libbpf mode (LIBBPF_STRICT_ALL), it has to
    relax it a bit for BPF map definition handling to minimize unnecessary
    disruptions. So opt out of LIBBPF_STRICT_MAP_DEFINITIONS for bpftool.
    User code that later uses the generated skeleton will make its own
    decision whether to enforce LIBBPF_STRICT_MAP_DEFINITIONS or not.
    
    There are a few tests in selftests/bpf that consciously use legacy
    BPF map definitions to test libbpf functionality. For those,
    temporarily opt out of LIBBPF_STRICT_MAP_DEFINITIONS mode for the
    duration of those tests.
    
      [0] Closes: libbpf/libbpf#272
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120060529.1890907-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
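
    The warn-or-fail rule and the bpftool opt-out described in this commit
    come down to one bit test. The sketch below models that with
    hypothetical names (STRICT_MAP_DEFINITIONS, STRICT_ALL,
    handle_legacy_map); the numeric values are illustrative stand-ins,
    and the real flags live in libbpf's strict-mode enum.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative stand-ins for the libbpf strict-mode flags. */
    #define STRICT_MAP_DEFINITIONS 0x20u
    #define STRICT_ALL             0xffffffffu

    enum legacy_map_result { LEGACY_MAP_WARN, LEGACY_MAP_ERROR };

    /* Flag set: reject the legacy definition. Flag clear: warn and go on. */
    static enum legacy_map_result handle_legacy_map(uint32_t strict_mode,
                                                    const char *map_name)
    {
        assert(map_name != NULL);
        if (strict_mode & STRICT_MAP_DEFINITIONS)
            return LEGACY_MAP_ERROR;
        fprintf(stderr, "warning: map '%s' uses a legacy definition\n",
                map_name);
        return LEGACY_MAP_WARN;
    }

    /* The bpftool approach: everything strict except map definitions. */
    static uint32_t bpftool_style_mode(void)
    {
        return STRICT_ALL & ~STRICT_MAP_DEFINITIONS;
    }
    ```

    Masking a single bit out of the "all" value keeps every other strict
    check enabled, which is why the opt-out does not weaken bpftool
    elsewhere.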
  14. selftests/bpf: convert remaining legacy map definitions

    Convert the few remaining legacy BPF map definitions to BTF-defined
    ones. For the remaining two bpf_map_def-based legacy definitions that
    we want to keep for testing purposes until the libbpf 1.0 release,
    guard them with a pragma to suppress the deprecation warnings that
    will be added to libbpf in the next commit.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120060529.1890907-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022
  15. selftests/bpf: fail build on compilation warning

    It's very easy to miss compilation warnings without -Werror, which is
    not set for selftests. libbpf and bpftool are already strict about this,
    so make selftests/bpf also treat compilation warnings as errors to catch
    such regressions early.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20220120060529.1890907-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    anakryiko authored and Alexei Starovoitov committed Jan 21, 2022

Commits on Jan 20, 2022

  1. selftests/bpf: Do not fail build if CONFIG_NF_CONNTRACK=m/n

    Some users have complained that selftests fail to build when
    CONFIG_NF_CONNTRACK=m. It would be useful to allow building as long as
    it is set to module or built-in, even though, when it is built as a
    module, the user would need to load it before running the selftest.
    Note that this also allows building the selftest when
    CONFIG_NF_CONNTRACK is disabled.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220120164932.2798544-1-memxor@gmail.com
    kkdwivedi authored and anakryiko committed Jan 20, 2022
  2. selftests: bpf: Fix bind on used port

    The bind_perm BPF selftest failed when port 111/tcp was already in use
    during the test. To fix this, the test now runs in its own network
    namespace.
    
    To use unshare, it is necessary to reorder the includes. The style of
    the includes is adapted to be consistent with the other prog_tests.
    
    v2: Replace deprecated CHECK macro with ASSERT_OK
    
    Fixes: 8259fde ("selftests/bpf: Verify that rebinding to port < 1024 from BPF works")
    Signed-off-by: Felix Maurer <fmaurer@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/551ee65533bb987a43f93d88eaf2368b416ccd32.1642518457.git.fmaurer@redhat.com
    fmaurer-rh authored and anakryiko committed Jan 20, 2022
  3. Merge branch 'rely on ASSERT marcos in xdp_bpf2bpf.c/xdp_adjust_tail.c'

    Lorenzo Bianconi says:
    
    ====================
    
    Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_bpf2bpf and
    xdp_adjust_tail bpf selftests.
    This is a preliminary series for XDP multi-frags support.
    
    Changes since v1:
    - run each ASSERT test separately
    - drop unnecessary return statements
    - drop unnecessary if condition in test_xdp_bpf2bpf()
    ====================
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    anakryiko committed Jan 20, 2022
  4. bpf: selftests: Get rid of CHECK macro in xdp_bpf2bpf.c

    Rely on ASSERT* macros and get rid of deprecated CHECK ones in
    xdp_bpf2bpf bpf selftest.
    
    Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/df7e5098465016e27d91f2c69a376a35d63a7621.1642679130.git.lorenzo@kernel.org
    LorenzoBianconi authored and anakryiko committed Jan 20, 2022
  5. bpf: selftests: Get rid of CHECK macro in xdp_adjust_tail.c

    Rely on ASSERT* macros and get rid of deprecated CHECK ones in
    xdp_adjust_tail bpf selftest.
    
    Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/c0ab002ffa647a20ec9e584214bf0d4373142b54.1642679130.git.lorenzo@kernel.org
    LorenzoBianconi authored and anakryiko committed Jan 20, 2022

Commits on Jan 19, 2022

  1. Merge branch 'bpf: allow cgroup progs to export custom retval to user…

    …space'
    
    YiFei Zhu says:
    
    ====================
    
    Right now, most cgroup hooks are best used for permission checks. They
    can only reject a syscall with -EPERM, so the cause of a rejection, if
    the syscall is rejected by eBPF cgroup hooks, is ambiguous to
    userspace. Additionally, if the syscalls are implemented in eBPF, all
    permission checks and the implementation have to happen within the
    same filter, as programs executed later in the series of progs are
    unaware of the return values returned by the previous progs.
    
    This patch series adds two helpers, bpf_get_retval and bpf_set_retval,
    that allow hooks to get and set the return value of the syscall to
    userspace. This also allows later progs to retrieve the retval set by
    previous progs.
    
    For backwards compatibility with legacy programs that reject a syscall
    without setting the retval: if a prog rejects without itself or a
    prior prog setting retval to an -err, the retval is set by the kernel
    to -EPERM.
    
    For getsockopt hooks that have ctx->retval, this variable mirrors the
    value accessed by the helpers.
    
    Additionally, the following user-visible behavior for getsockopt
    hooks has changed:
      - If a prior filter rejected the syscall, it will be visible
        in ctx->retval.
      - Attempting to change the retval arbitrarily is now allowed and
        will not cause an -EFAULT.
      - If kernel rejects a getsockopt syscall before running the hooks,
        the error will be visible in ctx->retval. Returning 0 from the
        prog will not overwrite the error to -EPERM unless there is an
        explicit call of bpf_set_retval(-EPERM)
    
    Tests have been added in this series to test the behavior of the
    helpers with cgroup setsockopt and getsockopt hooks.
    
    Patch 1 changes the API of macros to prepare for the next patch and
      should be a no-op.
    Patch 2 moves ctx->retval to a struct pointed to by current
      task_struct.
    Patch 3 implements the helpers.
    Patch 4 tests the behaviors of the helpers.
    Patch 5 updates a test after the test broke due to the visible changes.
    
    v1 -> v2:
      - errno -> retval
      - split one helper to get & set helpers
      - allow retval to be set arbitrarily in the general case
      - made the helper retval and context retval mirror each other
    ====================
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Alexei Starovoitov committed Jan 19, 2022
  2. selftests/bpf: Update sockopt_sk test to the use bpf_set_retval

    The tests would break without this patch, because at one point the
    test calls
      getsockopt(fd, SOL_TCP, TCP_ZEROCOPY_RECEIVE, &buf, &optlen)
    This getsockopt receives the kernel-set -EINVAL. Prior to this patch
    series, the eBPF getsockopt hook's -EPERM would override the kernel's
    -EINVAL. After this patch series, the automatic -EPERM from returning
    0 will not; the eBPF prog has to explicitly call
    bpf_set_retval(-EPERM) if that is wanted.
    
    I also removed the explicit mentions of EPERM in the comments in the
    prog.
    
    Signed-off-by: YiFei Zhu <zhuyifei@google.com>
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/4f20b77cb46812dbc2bdcd7e3fa87c7573bde55e.1639619851.git.zhuyifei@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    zhuyifei1999 authored and Alexei Starovoitov committed Jan 19, 2022
  3. selftests/bpf: Test bpf_{get,set}_retval behavior with cgroup/sockopt

    The tests check how different ways of interacting with the helpers
    (getting retval, setting EUNATCH, EISCONN, and legacy reject
    returning 0 without setting retval) produce different results in
    both the setsockopt syscall and the retval returned by the helper.
    A few more tests verify the interaction between the retval of the
    helper and the retval in getsockopt context.
    
    Signed-off-by: YiFei Zhu <zhuyifei@google.com>
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/43ec60d679ae3f4f6fd2460559c28b63cb93cd12.1639619851.git.zhuyifei@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    zhuyifei1999 authored and Alexei Starovoitov committed Jan 19, 2022
  4. bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall retur…

    …n value
    
    The helpers continue to use int for retval because all the hooks
    are int-returning rather than long-returning. The return value of
    bpf_set_retval is int for future-proofing, in case in the future
    there may be errors trying to set the retval.
    
    After the previous patch, if a program rejects a syscall by
    returning 0, an -EPERM will be generated regardless of whether the
    retval is already set to -err. This patch changes that so -EPERM is
    forced only if retval is not -err. This is because we want to
    support, for example, invoking bpf_set_retval(-EINVAL) and returning
    0, and have the syscall return value be -EINVAL, not -EPERM.
    
    For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
    that, if the return value is NET_XMIT_DROP, the packet is silently
    dropped. We preserve this behavior for backward compatibility
    reasons, so even if an errno is set, the errno is not returned to
    the caller. However, a non-err retval cannot be propagated, so
    setting one is not allowed and we return -EFAULT in that case.
    
    Signed-off-by: YiFei Zhu <zhuyifei@google.com>
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/b4013fd5d16bed0b01977c1fafdeae12e1de61fb.1639619851.git.zhuyifei@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    zhuyifei1999 authored and Alexei Starovoitov committed Jan 19, 2022
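
    The rule for the final syscall return value described in this commit
    can be written down in a few lines. The sketch below is a user-space
    model with hypothetical names (is_err, final_retval); it covers only
    the generic reject path, not the NET_XMIT_DROP special case for
    egress programs.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>

    /* Is this a negative errno value ("-err" in the commit message)? */
    static bool is_err(int retval)
    {
        return retval < 0 && retval >= -4095;
    }

    /*
     * prog_ret: what the BPF program returned (0 = reject, 1 = allow)
     * retval:   value left by bpf_set_retval(), or the initial value
     */
    static int final_retval(int prog_ret, int retval)
    {
        assert(prog_ret == 0 || prog_ret == 1);
        if (prog_ret == 0 && !is_err(retval))
            return -EPERM;      /* legacy reject: no error was set */
        return retval;          /* an explicitly set error wins */
    }
    ```

    So a program that calls bpf_set_retval(-EINVAL) and returns 0 makes
    the syscall fail with -EINVAL, while a legacy program that just
    returns 0 still produces -EPERM.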
  5. bpf: Move getsockopt retval to struct bpf_cg_run_ctx

    The retval is moved to struct bpf_cg_run_ctx for ease of access
    in different prog types with different context struct layouts. The
    helper implementation (to be added in a later patch in the series) can
    simply perform a container_of from current->bpf_ctx to retrieve
    bpf_cg_run_ctx.
    
    Unfortunately, there is no easy way to access the current task_struct
    via the verifier BPF bytecode rewrite, aside from possibly calling a
    helper, so a pointer to current task is added to struct bpf_sockopt_kern
    so that the rewritten BPF bytecode can access struct bpf_cg_run_ctx with
    an indirection.
    
    For backward compatibility, if a getsockopt program rejects a syscall
    by returning 0, an -EPERM will be generated, by having the
    BPF_PROG_RUN_ARRAY_CG family macros automatically set the retval to
    -EPERM. Unlike prior to this patch, this -EPERM will be visible to
    ctx->retval for any other hooks down the line in the prog array.
    
    Additionally, the restriction that getsockopt filters can only set
    the retval to 0 is removed, considering that certain getsockopt
    implementations may return optlen. Filters are now able to set the
    value arbitrarily.
    
    Signed-off-by: YiFei Zhu <zhuyifei@google.com>
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/73b0325f5c29912ccea7ea57ec1ed4d388fc1d37.1639619851.git.zhuyifei@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    zhuyifei1999 authored and Alexei Starovoitov committed Jan 19, 2022
  6. bpf: Make BPF_PROG_RUN_ARRAY return -err instead of allow boolean

    Right now BPF_PROG_RUN_ARRAY and related macros return 1 or 0
    for whether the prog array allows or rejects whatever is being
    hooked. The caller of these macros then returns -EPERM or continues
    processing based on the macro's return value. Unfortunately this is
    inflexible, since -EPERM is the only err that can be returned.
    
    This patch should be a no-op; it prepares for the next patch. The
    returning of the -EPERM is moved to inside the macros, so the outer
    functions are directly returning what the macros returned if they
    are non-zero.
    
    Signed-off-by: YiFei Zhu <zhuyifei@google.com>
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/788abcdca55886d1f43274c918eaa9f792a9f33b.1639619851.git.zhuyifei@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    zhuyifei1999 authored and Alexei Starovoitov committed Jan 19, 2022
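
    The refactoring described in this commit moves the -EPERM translation
    from the caller into the macro. The model below shows both
    conventions side by side; prog_fn, run_array_bool, run_array_err and
    the two hook functions are hypothetical stand-ins, with plain
    functions taking the place of the kernel macros.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>

    /* Each prog returns 1 (allow) or 0 (reject). */
    typedef int (*prog_fn)(void);

    /* Old convention: 1 if every prog allowed, otherwise 0. */
    static int run_array_bool(const prog_fn *progs, size_t n)
    {
        int allow = 1;

        assert(progs != NULL || n == 0);
        for (size_t i = 0; i < n; i++)
            allow &= progs[i]();
        return allow;
    }

    /* New convention: 0 on allow, -EPERM on reject. */
    static int run_array_err(const prog_fn *progs, size_t n)
    {
        return run_array_bool(progs, n) ? 0 : -EPERM;
    }

    /* Caller before the patch: translates the boolean itself. */
    static int hook_old(const prog_fn *progs, size_t n)
    {
        return run_array_bool(progs, n) == 1 ? 0 : -EPERM;
    }

    /* Caller after the patch: passes the result through unchanged. */
    static int hook_new(const prog_fn *progs, size_t n)
    {
        return run_array_err(progs, n);
    }

    static int prog_allow(void)  { return 1; }
    static int prog_reject(void) { return 0; }
    ```

    Both hooks return the same value for every input, which matches the
    commit's claim that the change is a no-op on its own.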
  7. libbpf: Improve btf__add_btf() with an additional hashmap for strings.

    Add a hashmap to map the string offsets from a source btf to the
    string offsets from a target btf to reduce overheads.
    
    btf__add_btf() calls btf__add_str() to add strings from a source to a
    target btf. This causes many string comparisons and is a major
    hotspot when adding a big btf, because btf__add_str() uses strcmp()
    to check whether a hash entry is the right one. The extra hashmap
    here compares string offsets, which is much cheaper. It remembers
    the results of btf__add_str() for later use to reduce the cost.
    
    We are parallelizing BTF encoding for pahole by creating separate btf
    instances for worker threads. These per-thread btf instances will be
    added to the btf instance of the main thread by calling btf__add_btf()
    to deduplicate and write out. With this patch and -j4, the running
    time of pahole drops to about 6.0s from 6.6s.
    
    The following lines are the summary of 'perf stat' w/o the change.
    
           6.668126396 seconds time elapsed
    
          13.451054000 seconds user
           0.715520000 seconds sys
    
    The following lines are the summary w/ the change.
    
           5.986973919 seconds time elapsed
    
          12.939903000 seconds user
           0.724152000 seconds sys
    
    v4 fixes the error check on the pointer returned by
    hashmap__new().
    
    [v3] https://lore.kernel.org/bpf/20220118232053.2113139-1-kuifeng@fb.com/
    [v2] https://lore.kernel.org/bpf/20220114193713.461349-1-kuifeng@fb.com/
    
    Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220119180214.255634-1-kuifeng@fb.com
    Kui-Feng Lee authored and anakryiko committed Jan 19, 2022
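
    The idea of remembering offset translations can be shown with a toy
    string table. This is a sketch under simplifying assumptions, not the
    libbpf code: add_str is a linear-scan stand-in for btf__add_str(), and
    a directly indexed array replaces the hashmap the patch uses. All
    names and size limits are hypothetical.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    #define MAX_OFF 1024

    struct str_table {
        char data[MAX_OFF];          /* NUL-separated strings */
        uint32_t len;
        unsigned long strcmp_calls;  /* instrumentation for the demo */
    };

    /* Slow path: find or append by comparing string contents. */
    static int add_str(struct str_table *t, const char *s)
    {
        uint32_t off = 0;
        size_t n = strlen(s) + 1;

        while (off < t->len) {
            t->strcmp_calls++;
            if (strcmp(t->data + off, s) == 0)
                return (int)off;
            off += (uint32_t)strlen(t->data + off) + 1;
        }
        if (t->len + n > MAX_OFF)
            return -1;
        memcpy(t->data + t->len, s, n);
        t->len += (uint32_t)n;
        return (int)off;             /* off == old length here */
    }

    /* Cache keyed by source offset; -1 means "not translated yet". */
    struct off_cache {
        int dst_off[MAX_OFF];
    };

    static void off_cache_init(struct off_cache *c)
    {
        for (int i = 0; i < MAX_OFF; i++)
            c->dst_off[i] = -1;
    }

    /* Fast path: reuse an earlier translation, comparing offsets only. */
    static int map_str_off(struct off_cache *c, struct str_table *dst,
                           const struct str_table *src, uint32_t src_off)
    {
        assert(src_off < src->len);
        if (c->dst_off[src_off] < 0)
            c->dst_off[src_off] = add_str(dst, src->data + src_off);
        return c->dst_off[src_off];
    }
    ```

    The second lookup of the same source offset never reaches add_str(),
    so no further string comparisons are made for it.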
  8. bpf/scripts: Raise an exception if the correct number of sycalls are …

    …not generated
    
    Currently the syscalls rst and subsequently man page are auto-generated
    using function documentation present in bpf.h. If the documentation for the
    syscall is missing or doesn't follow a specific format, then that syscall
    is not dumped in the auto-generated rst.
    
    This patch compares the number of syscalls documented within the header
    file with the number present in enum bpf_cmd and raises an Exception if
    they don't match. It is not needed with the currently documented upstream
    syscalls, but can help in debugging when developing new syscalls when
    there might be missing or misformatted documentation.
    
    The function helper_number_check is moved to the Printer parent
    class and renamed to elem_number_check, as all of the most-derived
    child classes are using this function now.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Quentin Monnet <quentin@isovalent.com>
    Link: https://lore.kernel.org/bpf/20220119114442.1452088-3-usama.arif@bytedance.com
    uarif1 authored and anakryiko committed Jan 19, 2022
  9. bpf/scripts: Make description and returns section for helpers/syscall…

    …s mandatory
    
    This enforces minimal formatting consistency for the documentation. The
    description and returns missing for a few helpers have also been added.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Quentin Monnet <quentin@isovalent.com>
    Link: https://lore.kernel.org/bpf/20220119114442.1452088-2-usama.arif@bytedance.com
    uarif1 authored and anakryiko committed Jan 19, 2022
  10. uapi/bpf: Add missing description and returns for helper documentation

    Both description and returns section will become mandatory
    for helpers and syscalls in a later commit to generate man pages.
    
    This commit also adds in the documentation that BPF_PROG_RUN is
    an alias for BPF_PROG_TEST_RUN for anyone searching for the
    syscall in the generated man pages.
    
    Signed-off-by: Usama Arif <usama.arif@bytedance.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220119114442.1452088-1-usama.arif@bytedance.com
    uarif1 authored and anakryiko committed Jan 19, 2022
  11. bpftool: Adding support for BTF program names

    `bpftool prog list` and other bpftool subcommands that show
    BPF program names currently get them from bpf_prog_info.name.
    That field is limited to 16 (BPF_OBJ_NAME_LEN) chars which leads
    to truncated names since many progs have much longer names.
    
    The idea of this change is to improve all bpftool commands that
    output prog name so that bpftool uses info from BTF to print
    program names if available.
    
    It tries bpf_prog_info.name first and falls back to BTF only if
    the name is suspected to be truncated (is 15 characters long).
    
    Right now `bpftool p show id <id>` returns capped prog name
    
    <id>: kprobe  name example_cap_cap  tag 712e...
    ...
    
    With this change it would return
    
    <id>: kprobe  name example_cap_capable  tag 712e...
    ...
    
    Note, other commands that print prog names (e.g. "bpftool
    cgroup tree") are also addressed in this change.
    
    Signed-off-by: Raman Shukhau <ramasha@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220119100255.1068997-1-ramasha@fb.com
    Raman Shukhau authored and anakryiko committed Jan 19, 2022
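
    The fallback rule described in this commit fits in one small function.
    The sketch below is a stand-alone model: pick_prog_name is a
    hypothetical name, and the BTF name is simply passed in as a string
    (NULL when no BTF is available) instead of being looked up.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define BPF_OBJ_NAME_LEN 16U   /* limit quoted in the commit message */

    /*
     * Prefer the short name, but if it fills the whole buffer
     * (15 characters plus the terminating NUL) assume it may have been
     * truncated and use the BTF name when there is one.
     */
    static const char *pick_prog_name(const char *info_name,
                                      const char *btf_name)
    {
        assert(info_name != NULL);
        if (strlen(info_name) == BPF_OBJ_NAME_LEN - 1 && btf_name != NULL)
            return btf_name;
        return info_name;
    }
    ```

    A name that is exactly 15 characters long but was never truncated is
    still resolved through BTF, which is harmless because both sources
    then agree.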
  12. libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation errors

    The btf.h header included with libbpf contains inline helper functions to
    check for various BTF kinds. These helpers directly reference the
    BTF_KIND_* constants defined in the kernel header, and because the header
    file is included in user applications, this happens in the user application
    compile units.
    
    This presents a problem if a user application is compiled on a system with
    older kernel headers because the constants are not available. To avoid
    this, add #defines of the constants directly in btf.h before using them.
    
    Since the kernel header moved to an enum for BTF_KIND_*, the #defines can
    shadow the enum values without any errors, so we only need #ifndef guards
    for the constants that predate the conversion to enum. We group these so
    there's only one guard for groups of values that were added together.
    
      [0] Closes: libbpf/libbpf#436
    
    Fixes: 223f903 ("bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG")
    Fixes: 5b84bd1 ("libbpf: Add support for BTF_KIND_TAG")
    Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Link: https://lore.kernel.org/bpf/20220118141327.34231-1-toke@redhat.com
    tohojo authored and anakryiko committed Jan 19, 2022
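
    The guard scheme described in this commit can be tried out in a single
    file. The sketch below fakes two situations at once for brevity: an
    older kernel header that used #define and stops at
    BTF_KIND_FUNC_PROTO, and a newer one that declares BTF_KIND_DECL_TAG
    as an enumerator. The "header" excerpts and kind_is_var are
    hypothetical; the numeric kind values are the ones used by the kernel.

    ```c
    #include <assert.h>

    /* Stand-in for an older kernel header that still used #define. */
    #define BTF_KIND_FUNC        12
    #define BTF_KIND_FUNC_PROTO  13

    /* Stand-in for a newer kernel header that uses an enum instead. */
    enum { BTF_KIND_DECL_TAG = 17 };

    /* Guarded group: skipped, the "old header" already provides it. */
    #ifndef BTF_KIND_FUNC
    #define BTF_KIND_FUNC        12
    #define BTF_KIND_FUNC_PROTO  13
    #endif

    /* Guarded group: defined here, the "old header" lacks it. */
    #ifndef BTF_KIND_VAR
    #define BTF_KIND_VAR         14
    #define BTF_KIND_DATASEC     15
    #endif

    /* No guard needed: a macro may shadow an enumerator of the same name,
     * as long as the enum was declared before the #define. */
    #define BTF_KIND_DECL_TAG    17

    static_assert(BTF_KIND_DECL_TAG == 17, "macro and enumerator agree");

    /* Simplified kind check; the real helpers take a struct btf_type *. */
    static int kind_is_var(int kind)
    {
        return kind == BTF_KIND_VAR;
    }
    ```

    Redefining BTF_KIND_FUNC without the #ifndef would still compile here
    because the values are identical, but a guard keeps the header robust
    against any difference in spelling of the replacement text.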

Commits on Jan 18, 2022

  1. Merge branch 'bpf: Batching iter for AF_UNIX sockets.'

    Kuniyuki Iwashima says:
    
    ====================
    
    Last year the commit afd20b9 ("af_unix: Replace the big lock with
    small locks.") landed on bpf-next.  Now we can use a batching algorithm
    for the AF_UNIX bpf iter, as the TCP bpf iter does.
    
    Changelog:
    - Add the 1st patch.
    - Call unix_get_first() in .start()/.next() to always acquire a lock in
      each iteration in the 2nd patch.
    ====================
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Alexei Starovoitov committed Jan 18, 2022
  2. selftest/bpf: Fix a stale comment.

    The commit b8a58aa ("af_unix: Cut unix_validate_addr() out of
    unix_mkname().") moved the bound test part into unix_validate_addr().
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Link: https://lore.kernel.org/r/20220113002849.4384-6-kuniyu@amazon.co.jp
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    q2ven authored and Alexei Starovoitov committed Jan 18, 2022
  3. selftest/bpf: Test batching and bpf_(get|set)sockopt in bpf unix iter.

    This patch adds a test for the batching and bpf_(get|set)sockopt in bpf
    unix iter.
    
    It does the following:
    
      1. Create an abstract UNIX domain socket
      2. Call bpf_setsockopt()
      3. Call bpf_getsockopt() and save the value
      4. Call setsockopt()
      5. Call getsockopt() and save the value
      6. Compare the saved values
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Link: https://lore.kernel.org/r/20220113002849.4384-5-kuniyu@amazon.co.jp
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    q2ven authored and Alexei Starovoitov committed Jan 18, 2022
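
    The user-space half of the steps listed in this commit (steps 1, 4
    and 5) can be sketched with plain socket calls. The BPF iterator half
    is not shown. The function names, the socket name and the choice of
    SO_SNDBUF as the option are assumptions made for illustration; the
    sketch is Linux-specific because abstract addresses are.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    /* Create a UNIX stream socket bound to an abstract address, that is,
     * one whose sun_path starts with a NUL byte. Returns the fd or -1. */
    static int abstract_unix_socket(const char *name)
    {
        struct sockaddr_un addr;
        size_t len = strlen(name);
        int fd;

        assert(len > 0 && len < sizeof(addr.sun_path) - 1);
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        memcpy(addr.sun_path + 1, name, len);   /* sun_path[0] stays NUL */

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (bind(fd, (struct sockaddr *)&addr,
                 (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len)) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    /* Set SO_SNDBUF, read it back, and return what the kernel reports
     * (or -1 on failure). The reported value may differ from the request. */
    static int set_and_get_sndbuf(int fd, int requested)
    {
        int val = 0;
        socklen_t optlen = sizeof(val);

        if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested,
                       sizeof(requested)) < 0)
            return -1;
        if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &val, &optlen) < 0)
            return -1;
        return val;
    }
    ```

    A test in the spirit of the commit would compare the value returned
    here with the one saved by the BPF program after its own
    bpf_setsockopt() and bpf_getsockopt() calls on the same kind of socket.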