rasdaemon: Fix poll() on per_cpu trace_pipe_raw blocks indefinitely #86
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The error events are not received in the rasdaemon since kernel 6.1-rc6. This issue is firstly detected and reported, when testing the CXL error events in the rasdaemon.
Debugging showed, poll() on trace_pipe_raw in the ras-events.c do not return and this issue is seen after the commit
42fb0a1e84ff525ebe560e2baf9451ab69127e2b ("tracing/ring-buffer: Have polling block on watermark").
This issue is also verified using a test application for poll() and select() on per_cpu trace_pipe_raw.
There is also a bug reported on this issue,
https://lore.kernel.org/all/31eb3b12-3350-90a4-a0d9-d1494db7cf74@oracle.com/
This issue occurs for the per_cpu case, which calls the ring_buffer_poll_wait(), in kernel/trace/ring_buffer.c, with the buffer_percent > 0 and then wait until the percentage of pages are available. The default value set for the buffer_percent is 50 in the kernel/trace/trace.c. However poll() does not return even met the percentage of pages condition.
As a fix, rasdaemon set buffer_percent as 0 through the /sys/kernel/debug/tracing/instances/rasdaemon/buffer_percent, then the task will wake up as soon as data is added to any of the specific cpu buffer and poll() on per_cpu/cpuX/trace_pipe_raw does not block indefinitely.
Dependency on the kernel fix commit
3e46d910d8acf94e5360126593b68bf4fee4c4a1("tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw")
Signed-off-by: Shiju Jose shiju.jose@huawei.com