rasdaemon: Fix poll() on per_cpu trace_pipe_raw blocks indefinitely #86

Closed

Conversation

shijujose4
Contributor

Error events have not been received by rasdaemon since kernel 6.1-rc6. The issue was first detected and reported while testing CXL error events in rasdaemon.

Debugging showed that poll() on trace_pipe_raw in ras-events.c does not return. The issue appears after commit
42fb0a1e84ff525ebe560e2baf9451ab69127e2b ("tracing/ring-buffer: Have polling block on watermark").

The issue was also reproduced with a test application that calls poll() and select() on per_cpu trace_pipe_raw.
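The test application itself is not included here. A minimal sketch of the kind of check it might perform follows, assuming a hypothetical helper `wait_readable()` that wraps poll() with a timeout so that a blocked descriptor shows up as a timeout rather than a hang:

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper, not part of rasdaemon: wait up to timeout_ms for
 * fd to become readable.
 * Returns 1 if data is ready, 0 on timeout, or a negative errno value.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* Retry if a signal interrupts poll(); the full timeout restarts. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -errno;
	if (ret == 0)
		return 0;	/* timed out, nothing to read */

	/* Treat POLLERR/POLLHUP without POLLIN as an I/O error. */
	return (pfd.revents & POLLIN) ? 1 : -EIO;
}
```

To use it against the tracing buffer, one would open a per_cpu/cpuX/trace_pipe_raw file read-only, trigger an event, and call `wait_readable()` on the descriptor. On an affected kernel the call would be expected to return 0 (timeout) even though events were generated.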

A bug report also exists for this issue:
https://lore.kernel.org/all/31eb3b12-3350-90a4-a0d9-d1494db7cf74@oracle.com/

The issue occurs in the per_cpu case, which calls ring_buffer_poll_wait() in kernel/trace/ring_buffer.c with buffer_percent > 0 and then waits until that percentage of pages is available. The default buffer_percent, set in kernel/trace/trace.c, is 50. However, poll() does not return even when the percentage-of-pages condition is met.

As a fix, rasdaemon sets buffer_percent to 0 through /sys/kernel/debug/tracing/instances/rasdaemon/buffer_percent. The task then wakes up as soon as data is added to any of the per-CPU buffers, and poll() on per_cpu/cpuX/trace_pipe_raw no longer blocks indefinitely.
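The write to buffer_percent can be sketched as below. This is an illustration only, not the code from the patch: `set_buffer_percent()` is a hypothetical helper name, and the 0..100 range check reflects an assumption about what the kernel accepts for this file.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not rasdaemon's actual function: write a watermark
 * percentage to a buffer_percent file, for example
 * /sys/kernel/debug/tracing/instances/rasdaemon/buffer_percent.
 * Returns 0 on success or a negative errno value on failure.
 */
static int set_buffer_percent(const char *path, int percent)
{
	char buf[16];
	ssize_t written;
	int fd, len, ret = 0;

	/* Assumed valid range for the tracefs file. */
	if (percent < 0 || percent > 100)
		return -EINVAL;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "%d\n", percent);
	written = write(fd, buf, len);
	if (written < 0)
		ret = -errno;
	else if (written != len)
		ret = -EIO;	/* short write */

	close(fd);
	return ret;
}
```

Calling `set_buffer_percent(path, 0)` on the instance's buffer_percent file before entering the poll() loop would request a wakeup as soon as any data is available.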

This change depends on the kernel fix commit
3e46d910d8acf94e5360126593b68bf4fee4c4a1 ("tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw").

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>

@mchehab
Owner

mchehab commented Feb 18, 2023

Merged, thanks!

@mchehab mchehab closed this Feb 18, 2023