Beau-Belgrave/…

Commits on Jan 11, 2022

  1. user_events: Add documentation file

    Add a documentation file about user_events with example code, etc.
    explaining how it may be used.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  2. user_events: Add sample code for typical usage

    Add sample code for user_events typical usage to show how to register
    and monitor status, as well as to write out data.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  3. user_events: Add self-test for validator boundaries

    Tests to ensure validator boundary cases are working correctly within
    close and far bounds. Ensures __data_loc and __rel_loc strings are
    null terminated and within range. Ensures min size checks work as
    expected.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  4. user_events: Add self-test for perf_event integration

    Tests perf can be attached to and written out correctly. Ensures attach
    updates status bits in user programs.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  5. user_events: Add self-test for dynamic_events integration

    Tests matching deletes, creation of basic and complex types. Ensures
    common patterns work correctly when interacting with dynamic_events
    file.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  6. user_events: Add self-test for ftrace integration

    Tests basic functionality of registering/deregistering, status and
    writing data out via ftrace mechanisms within user_events.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  7. user_events: Validate user payloads for size and null termination

    Add validation to ensure data is at or greater than the min size for the
    fields of the event. If a dynamic array is used and is a type of char,
    ensure null termination of the array exists.
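
The two checks described above can be sketched in user-space C. This is only an illustration of the idea, not the kernel's validator: the function names and the (offset, len) parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: the payload must cover the event's fixed-size fields. */
bool payload_has_min_size(size_t payload_len, size_t min_size)
{
	return payload_len >= min_size;
}

/*
 * Hypothetical check for a dynamic char array occupying
 * [offset, offset + len) inside the payload: the range must lie within
 * the payload and its last byte must be the terminating '\0'.
 */
bool dyn_string_is_valid(const char *payload, size_t payload_len,
			 size_t offset, size_t len)
{
	assert(payload != NULL);

	if (len == 0 || offset > payload_len || len > payload_len - offset)
		return false;

	return payload[offset + len - 1] == '\0';
}
```

The range check is written as `len > payload_len - offset` rather than `offset + len > payload_len` so that a large offset cannot wrap around.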
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  8. user_events: Optimize writing events by only copying data once

    Pass iterator through to probes to allow copying data directly to the
    probe buffers instead of taking multiple copies. Enables eBPF user and
    raw iterator types out to programs for no-copy scenarios.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  9. user_events: Add basic perf and eBPF support

    Adds support to write out user_event data to perf_probe/perf files as
    well as to any attached eBPF program.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  10. user_events: Handle matching arguments from dyn_events

    Ensures that when dynamic events requests a match with arguments, the
    arguments match what is in the user_event.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  11. user_events: Add print_fmt generation support for basic types

    Adds print_fmt format generation for basic types that are supported for
    user processes. Only supports sizes that are the same on 32 and 64 bit.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022
  12. user_events: Add minimal support for trace_event into ftrace

    Minimal support for interacting with dynamic events, trace_event and
    ftrace. Core outline of flow between user process, ioctl and trace_event
    APIs.
    
    User mode processes that wish to use trace events to get data into
    ftrace, perf, eBPF, etc. are limited to uprobes today. The user events
    feature enables an ABI for user mode processes to create and write to
    trace events that are isolated from kernel level trace events. This
    enables a faster path for tracing from user mode data and opens up
    managed code to participate in trace events, where stub locations are
    dynamic.
    
    User processes often want to trace only when it's useful. To enable this,
    a set of pages is mapped into the user process space that indicates the
    current state of the user events that have been registered. User
    processes can check if their event is hooked to a trace/probe, and if it
    is, emit the event data out via the write() syscall.
    
    Two new files are introduced into tracefs to accomplish this:
    user_events_status - This file is mmap'd into participating user mode
    processes to indicate event status.
    
    user_events_data - This file is opened and register/delete ioctl's are
    issued to create/open/delete trace events that can be used for tracing.
    
    The typical scenario is on process start to mmap user_events_status. Processes
    then register the events they plan to use via the REG ioctl. The ioctl reads
    and updates the passed in user_reg struct. The status_index of the struct is
    used to know the byte in the status page to check for that event. The
    write_index of the struct is used to describe that event when writing out to
    the fd that was used for the ioctl call. The data must always include this
    index first when writing out data for an event. Data can be written either by
    write() or by writev().
    
    For example, in memory:
    int index;
    char data[];
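
As a sketch of that layout for the single write() case, the helper below packs the index and then the payload into one buffer. The function name is hypothetical, and the index is assumed to be a plain int as in the layout above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Pack one event into buf: the write index first, then the payload.
 * Returns the number of bytes to pass to write(), or 0 if buf is too small.
 */
size_t pack_user_event(void *buf, size_t buf_len, int write_index,
		       const void *payload, size_t payload_len)
{
	assert(buf != NULL && payload != NULL);

	if (payload_len > SIZE_MAX - sizeof(write_index) ||
	    sizeof(write_index) + payload_len > buf_len)
		return 0;

	memcpy(buf, &write_index, sizeof(write_index));
	memcpy((char *)buf + sizeof(write_index), payload, payload_len);

	return sizeof(write_index) + payload_len;
}
```

With writev() the same ordering applies, but the index and the payload can stay in separate buffers, which avoids the extra copy.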
    
    Pseudo code example of typical usage:
    struct user_reg reg;
    
    int page_fd = open("user_events_status", O_RDWR);
    char *page_data = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, page_fd, 0);
    close(page_fd);
    
    int data_fd = open("user_events_data", O_RDWR);
    
    reg.size = sizeof(reg);
    reg.name_args = (__u64)"test";
    
    ioctl(data_fd, DIAG_IOCSREG, &reg);
    int status_id = reg.status_index;
    int write_id = reg.write_index;
    
    struct iovec io[2];
    io[0].iov_base = &write_id;
    io[0].iov_len = sizeof(write_id);
    io[1].iov_base = payload;
    io[1].iov_len = sizeof(payload);
    
    if (page_data[status_id])
    	writev(data_fd, io, 2);
    
    User events are also exposed via the dynamic_events tracefs file for
    both create and delete. Current status is exposed via the user_events_status
    tracefs file.
    
    Simple example to register a user event via dynamic_events:
    	echo u:test >> dynamic_events
    	cat dynamic_events
    	u:test
    
    If an event is hooked to a probe, the probe hooked shows up:
    	echo 1 > events/user_events/test/enable
    	cat user_events_status
    	1:test # Used by ftrace
    
    	Active: 1
    	Busy: 1
    	Max: 4096
    
    If an event is not hooked to a probe, no probe status shows up:
    	echo 0 > events/user_events/test/enable
    	cat user_events_status
    	1:test
    
    	Active: 1
    	Busy: 0
    	Max: 4096
    
    Users can describe the trace event format via the following format:
    	name[:FLAG1[,FLAG2...]] [field1[;field2...]]
    
    Each field has the following format:
    	type name
    
    Example for char array with a size of 20 named msg:
    	echo 'u:detailed char[20] msg' >> dynamic_events
    	cat dynamic_events
    	u:detailed char[20] msg
    
    Data offsets are based on the data written out via write() and will be
    updated to reflect the correct offset in the trace_event fields. For dynamic
    data it is recommended to use the new __rel_loc data type. This type will be
    the same as __data_loc, but the offset is relative to this entry. This allows
    user_events to not worry about what common fields are being inserted before
    the data.
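
A minimal sketch of a payload that carries one string through a __rel_loc field, assuming the encoding used for the tracing __rel_loc attribute (length in the upper 16 bits, offset from the end of the field in the lower 16 bits). The function name and the single-field layout are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a payload of the form | __rel_loc (4 bytes) | string bytes |.
 * Returns the payload size, or 0 if it does not fit.
 */
size_t build_rel_loc_payload(void *buf, size_t buf_len, const char *msg)
{
	size_t msg_len;
	uint32_t rel_loc;

	assert(buf != NULL && msg != NULL);

	msg_len = strlen(msg) + 1;	/* include the terminating NUL */
	if (msg_len > UINT16_MAX || buf_len < sizeof(rel_loc) + msg_len)
		return 0;

	/* The string starts right after the field, so the offset is 0. */
	rel_loc = ((uint32_t)msg_len << 16) | 0;

	memcpy(buf, &rel_loc, sizeof(rel_loc));
	memcpy((char *)buf + sizeof(rel_loc), msg, msg_len);

	return sizeof(rel_loc) + msg_len;
}
```

Because the offset is relative to the field, the writer does not need to know how many bytes of common fields precede the payload in the final record.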
    
    The above format is valid for both the ioctl and the dynamic_events file.
    
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Beau Belgrave authored and intel-lab-lkp committed Jan 11, 2022

Commits on Dec 11, 2021

  1. tracing: Have existing event_command.parse() implementations use helpers

    Simplify the existing event_command.parse() implementations by having
    them make use of the helper functions previously introduced.
    
    Link: https://lkml.kernel.org/r/c62ecf84dee35ff23b14ae3db0bb0006c35caff7.1639170140.git.zanussi@kernel.org
    
    Signed-off-by: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Tom Zanussi authored and rostedt committed Dec 11, 2021
  2. tracing: Add helper functions to simplify event_command.parse() callback handling

    The event_command.parse() callback is responsible for parsing and
    registering triggers.  The existing command implementations for this
    callback duplicate a lot of the same code, so to clean up and
    consolidate those implementations, introduce a handful of helper
    functions for implementors to use.
    
    This also makes it easier for new commands to be implemented and
    allows them to focus more on the customizations they provide rather
    than obscuring and complicating it with boilerplate code.
    
    Link: https://lkml.kernel.org/r/4579e8fbf60ca9f471a3ff680ffc5c469be83f06.1639170140.git.zanussi@kernel.org
    
    Signed-off-by: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Tom Zanussi authored and rostedt committed Dec 11, 2021
  3. tracing: Change event_trigger_ops func() to trigger()

    The name of the func() callback on event_trigger_ops is too generic
    and is easily confused with other callbacks with that name, so change
    it to something that reflects its actual purpose.
    
    In this case, the main purpose of the callback is to implement an
    event trigger, so call it trigger() instead.
    
    Also add some more documentation to event_trigger_ops describing the
    callbacks a bit better.
    
    Link: https://lkml.kernel.org/r/36ab812e3ee74ee03ae0043fda41a858ee728c00.1639170140.git.zanussi@kernel.org
    
    Signed-off-by: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Tom Zanussi authored and rostedt committed Dec 11, 2021
  4. tracing: Change event_command func() to parse()

    The name of the func() callback on event_command is too generic and is
    easily confused with other callbacks with that name, so change it to
    something that reflects its actual purpose.
    
    In this case, the main purpose of the callback is to parse an event
    command, so call it parse() instead.
    
    Link: https://lkml.kernel.org/r/7784e321840752ed88aac0b349c0c685fc9247b1.1639170140.git.zanussi@kernel.org
    
    Signed-off-by: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Tom Zanussi authored and rostedt committed Dec 11, 2021
  5. tracing: Use trace_iterator_reset() in tracing_read_pipe()

    Currently tracing_read_pipe() open codes trace_iterator_reset(). Just have
    it use trace_iterator_reset() instead.
    
    Link: https://lkml.kernel.org/r/20211210202616.64d432d2@gandalf.local.home
    
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 11, 2021
  6. tracing: Use memset_startat helper in trace_iterator_reset()

    Make use of the memset_startat helper to simplify the code. There should
    be no functional change as a result of this patch.
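
For readers unfamiliar with the helper, the idea is to clear a structure from a given member through to its end while leaving the earlier members intact. Below is a user-space approximation: the macro takes the type explicitly instead of using typeof as the kernel helper does, and the struct and its field names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative iterator: the first two members must survive a reset. */
struct iter_sketch {
	int keep_a;
	int keep_b;
	int reset_from;		/* everything from here on gets cleared */
	long pos;
	char buf[8];
};

/* Approximation of memset_startat(): fill from 'member' to the end. */
#define memset_startat_sketch(obj, type, v, member)			\
	memset((unsigned char *)(obj) + offsetof(type, member), (v),	\
	       sizeof(type) - offsetof(type, member))

void iter_sketch_reset(struct iter_sketch *it)
{
	assert(it != NULL);
	memset_startat_sketch(it, struct iter_sketch, 0, reset_from);
}
```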
    
    Link: https://lkml.kernel.org/r/20211210012245.207489-1-xiujianfeng@huawei.com
    
    Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Xiu Jianfeng authored and rostedt committed Dec 11, 2021
  7. tracing: Do not let synth_events block other dyn_event systems during create

    synth_events is returning -EINVAL if the dyn_event create command does
    not contain ' \t'. This prevents other systems from getting called back.
    synth_events needs to return -ECANCELED in these cases when the command
    is not targeting the synth_event system.
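
The return-code convention can be modelled with a simplified dispatcher. This is a sketch of the idea rather than the kernel's dyn_event code: the handler names and the "s:" and "u:" prefixes are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*create_fn)(const char *cmd);

/* Hypothetical system that owns commands starting with "s:". */
int synth_like_create(const char *cmd)
{
	if (strncmp(cmd, "s:", 2) != 0)
		return -ECANCELED;	/* not ours: let other systems try */
	return cmd[2] != '\0' ? 0 : -EINVAL;	/* ours, maybe malformed */
}

/* Hypothetical system that owns commands starting with "u:". */
int user_like_create(const char *cmd)
{
	if (strncmp(cmd, "u:", 2) != 0)
		return -ECANCELED;
	return cmd[2] != '\0' ? 0 : -EINVAL;
}

/* Try each system in turn; stop at the first one that claims the command. */
int dispatch_create(const create_fn *systems, size_t count, const char *cmd)
{
	size_t i;

	assert(systems != NULL && cmd != NULL);

	for (i = 0; i < count; i++) {
		int ret = systems[i](cmd);

		if (ret != -ECANCELED)
			return ret;
	}
	return -EINVAL;		/* nobody claimed the command */
}
```

If the first handler returned -EINVAL for a command it does not own, the loop would stop there and the "u:" handler would never be called, which is the blocking behaviour this commit fixes.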
    
    Link: https://lore.kernel.org/linux-trace-devel/20210930223821.11025-1-beaub@linux.microsoft.com
    
    Fixes: c9e759b ("tracing: Rework synthetic event command parsing")
    Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Beau Belgrave authored and rostedt committed Dec 11, 2021
  8. tracing: Iterate trace_[ku]probe objects directly

    As suggested by Linus [1], use list_for_each_entry to iterate directly
    over trace_[ku]probe objects so we can skip another call to
    container_of in these loops.
    
    [1] https://lore.kernel.org/r/CAHk-=wjakjw6-rDzDDBsuMoDCqd+9ogifR_EE1F0K-jYek1CdA@mail.gmail.com
    
    Link: https://lkml.kernel.org/r/20211125202852.406405-1-jolsa@kernel.org
    
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Jiri Olsa authored and rostedt committed Dec 11, 2021
  9. tracefs: Use d_inode() helper function to get the dentry inode

    Instead of referencing the inode from a dentry via dentry->d_inode, use
    the helper function d_inode(dentry). This is considered the correct way
    to access it.
    
    Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
    Reported: https://lore.kernel.org/all/20211208104454.nhxyvmmn6d2qhpwl@wittgenstein/
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 11, 2021
  10. script/sorttable: Code style improvements

    Fix the if () {} code style issues to keep the code style consistent.
    
    Link: https://lkml.kernel.org/r/20211207151348.54921-3-yinan@linux.alibaba.com
    
    Signed-off-by: Yinan Liu <yinan@linux.alibaba.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Yinan Liu authored and rostedt committed Dec 11, 2021

Commits on Dec 8, 2021

  1. tracing: Make trace_marker{,_raw} stream-like

    The tracing marker files are write-only streams with no meaningful
    concept of file position.  Using stream_open() to mark them as
    stream-like indicates this and has the added advantage that a single
    file descriptor can now be used from multiple threads without contention
    thanks to clearing FMODE_ATOMIC_POS.
    
    Note that this has the potential to break existing userspace since
    both lseek(2) and pwrite(2) will now return ESPIPE when previously lseek
    would have updated the stored offset and pwrite would have appended to
    the trace.  A survey of libtracefs and several other projects found to
    use trace_marker(_raw) [1][2][3] suggests that everyone limits
    themselves to calling write(2) and close(2) on these file descriptors so
    there is a good chance this will go unnoticed and the benefits of
    reduced overhead and lock contention seem worth the risk.
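
A program that wants to detect this behaviour can probe the descriptor with lseek(2). The helper below is a hypothetical sketch; it can be tried on the write end of a pipe, which is also a stream with no file position, without needing tracefs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns true if fd rejects seeking with ESPIPE, as stream-like files do. */
bool fd_is_stream_like(int fd)
{
	errno = 0;
	return lseek(fd, 0, SEEK_CUR) == (off_t)-1 && errno == ESPIPE;
}
```

Code that only calls write(2) and close(2) on the descriptor never reaches this path and is unaffected by the change.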
    
    [1] https://github.com/google/perfetto
    [2] https://github.com/intel/media-driver/
    [3] https://w1.fi/cgit/hostap/
    
    Link: https://lkml.kernel.org/r/20211207142558.347029-1-john@metanate.com
    
    Signed-off-by: John Keeping <john@metanate.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    johnkeeping authored and rostedt committed Dec 8, 2021

Commits on Dec 6, 2021

  1. tracing: Switch to kvfree_rcu() API

    Instead of invoking a synchronize_rcu() to free a pointer
    after a grace period we can directly make use of new API
    that does the same but in a more efficient way.
    
    Link: https://lkml.kernel.org/r/20211124110308.2053-10-urezki@gmail.com
    
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    urezki authored and rostedt committed Dec 6, 2021
  2. tracing: Fix synth_event_add_val() kernel-doc comment

    It's named field here.
    
    Link: https://lkml.kernel.org/r/20210516022410.64271-1-hqjagain@gmail.com
    
    Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Qiujun Huang authored and rostedt committed Dec 6, 2021
  3. tracing/uprobes: Use trace_event_buffer_reserve() helper

    To be consistent with kprobes and eprobes, use
    trace_event_buffer_reserve() and trace_event_buffer_commit(). This will
    ensure that any updates to trace events will also be implemented on uprobe
    events.
    
    Link: https://lkml.kernel.org/r/20211206162440.69fbf96c@gandalf.local.home
    
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 6, 2021
  4. tracing/kprobes: Do not open code event reserve logic

    As kprobe events use trace_event_buffer_commit() to commit the event to
    the ftrace ring buffer, for consistency, it should use
    trace_event_buffer_reserve() to allocate it, as the two functions are
    related.
    
    Link: https://lkml.kernel.org/r/20211130024319.257430762@goodmis.org
    
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 6, 2021
  5. tracing: Have eprobes use filtering logic of trace events

    The eprobes open code the reserving of the event on the ring buffer for
    ftrace instead of using the ftrace event wrappers, which means that it
    doesn't get affected by the filters, breaking the filtering logic on user
    space.
    
    Link: https://lkml.kernel.org/r/20211130024319.068451680@goodmis.org
    
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 6, 2021
  6. tracing: Disable preemption when using the filter buffer

    In case trace_event_buffer_lock_reserve() is called with preemption
    enabled, the algorithm that defines the usage of the per cpu filter buffer
    may fail if the task schedules to another CPU after determining which
    buffer it will use.
    
    Disable preemption when using the filter buffer. And because that same
    buffer must be used throughout the call, keep preemption disabled until
    the filter buffer is released.
    
    This will also keep the semantics between the use case of when the filter
    buffer is used, and when the ring buffer itself is used, as that case also
    disables preemption until the ring buffer is released.
    
    Link: https://lkml.kernel.org/r/20211130024318.880190623@goodmis.org
    
    [ Fixed warning of assignment in if statement
      Reported-by: kernel test robot <lkp@intel.com> ]
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 6, 2021
  7. tracing: Use __this_cpu_read() in trace_event_buffer_lock_reserve()

    The value read by this_cpu_read() is used later, and that use is expected
    to happen on the same CPU the value was read on. But this_cpu_read() does
    not warn if it is called without preemption disabled, whereas
    __this_cpu_read() will check if preemption is disabled on
    CONFIG_DEBUG_PREEMPT.
    
    Currently all callers have preemption disabled, but there may be new
    callers in the future that may not.
    
    Link: https://lkml.kernel.org/r/20211130024318.698165354@goodmis.org
    
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Dec 6, 2021
  8. tools/perf: Add '__rel_loc' event field parsing support

    Add new '__rel_loc' dynamic data location attribute support.
    This type attribute is similar to the '__data_loc' but records the
    offset from the field itself.
    The libtraceevent adds TEP_FIELD_IS_RELATIVE to the
    'tep_format_field::flags' with TEP_FIELD_IS_DYNAMIC for '__rel_loc'.
    
    Link: https://lkml.kernel.org/r/163757344810.510314.12449413842136229871.stgit@devnote2
    
    Cc: Beau Belgrave <beaub@linux.microsoft.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    mhiramat authored and rostedt committed Dec 6, 2021
  9. libtraceevent: Add __rel_loc relative location attribute support

    Add a new '__rel_loc' dynamic data location attribute, which encodes
    the data location relative to the end of the field itself. This is
    similar to '__data_loc', but the location offset is counted not from
    the event entry but from the end of the field.
    
    This patch adds '__rel_loc' decoding support in the libtraceevent.
    
    Link: https://lkml.kernel.org/r/163757343994.510314.13241077597729303802.stgit@devnote2
    
    Cc: Beau Belgrave <beaub@linux.microsoft.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    mhiramat authored and rostedt committed Dec 6, 2021
  10. samples/trace_event: Add '__rel_loc' using sample event

    Add a sample event that uses '__rel_loc' for testing.
    Users can use this for testing purposes. There is
    no reason to use this macro from the kernel.
    
    Link: https://lkml.kernel.org/r/163757343050.510314.2876529802471645178.stgit@devnote2
    
    Cc: Beau Belgrave <beaub@linux.microsoft.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    mhiramat authored and rostedt committed Dec 6, 2021
  11. tracing: Add '__rel_loc' using trace event macros

    Add trace event macros that use '__rel_loc'. These macros are usually
    not used in the kernel, except for testing purposes.
    This also adds "rel_" variants of the macros for dynamic_array, string,
    and bitmask.
    
    Link: https://lkml.kernel.org/r/163757342119.510314.816029622439099016.stgit@devnote2
    
    Cc: Beau Belgrave <beaub@linux.microsoft.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    mhiramat authored and rostedt committed Dec 6, 2021
  12. tracing: Support __rel_loc relative dynamic data location attribute

    Add a new '__rel_loc' dynamic data location attribute, which encodes
    the data location relative to the end of the field itself.
    
    The '__data_loc' is used for encoding the dynamic data location on
    the trace event record. But '__data_loc' is not useful if the writer
    doesn't know the event header (e.g. user event), because it records
    the dynamic data offset from the entry of the record, not the field
    itself.
    
    This new '__rel_loc' attribute encodes the data location relative to
    the end of the field. For example, when there is a record like the one
    below (the number in the parentheses is the size of the fields):
    
     |header(N)|common(M)|fields(K)|__data_loc(4)|fields(L)|data(G)|
    
    In this case, '__data_loc' field will be
    
     __data_loc = (G << 16) | (N+M+K+4+L)
    
    If '__rel_loc' is used, this will be
    
     |header(N)|common(M)|fields(K)|__rel_loc(4)|fields(L)|data(G)|
    
    where
    
     __rel_loc = (G << 16) | (L)
    
    This case shows L bytes of fields after the '__rel_loc' attribute field;
    if there are no fields after the __rel_loc field, L must be 0.
    
    This is relatively easy to compute (with no need to consider kernel
    header changes) when the event data fields are composed by a user who
    doesn't know the header and common fields.
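
The two formulas above can be written out as small C helpers to make the difference concrete. The function names are illustrative; the parameters follow the N, M, K, L and G sizes from the record layouts above.

```c
#include <assert.h>
#include <stdint.h>

/* __data_loc: offset counted from the start of the record. */
uint32_t encode_data_loc(uint16_t n, uint16_t m, uint16_t k,
			 uint16_t l, uint16_t g)
{
	uint32_t offset = (uint32_t)n + m + k + 4 + l;

	assert(offset <= UINT16_MAX);	/* must fit in the lower 16 bits */
	return ((uint32_t)g << 16) | offset;
}

/* __rel_loc: offset counted from the end of the 4-byte field itself. */
uint32_t encode_rel_loc(uint16_t l, uint16_t g)
{
	return ((uint32_t)g << 16) | l;
}
```

For example, with N=8, M=8, K=4, L=12 and G=6, encode_data_loc() yields 0x00060024 and encode_rel_loc() yields 0x0006000C. If the header later grows to N=16, only the __data_loc value changes.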
    
    Link: https://lkml.kernel.org/r/163757341258.510314.4214431827833229956.stgit@devnote2
    
    Cc: Beau Belgrave <beaub@linux.microsoft.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    mhiramat authored and rostedt committed Dec 6, 2021