Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logging: log_core: regression in handling of CONFIG_LOG_BLOCK_IN_THREAD_TIMEOUT_MS #63965

Closed
cfriedt opened this issue Oct 15, 2023 · 0 comments · Fixed by #63958
Closed

logging: log_core: regression in handling of CONFIG_LOG_BLOCK_IN_THREAD_TIMEOUT_MS #63965

cfriedt opened this issue Oct 15, 2023 · 0 comments · Fixed by #63958
Assignees
Labels
area: Logging bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug Regression Something, which was working, does not anymore

Comments

@cfriedt
Copy link
Member

cfriedt commented Oct 15, 2023

Describe the bug
With deferred logging, the CONFIG_LOG_BLOCK_IN_THREAD_TIMEOUT_MS was introduced in order to block the log processing thread and preserve all messages.

For reference, the blocking feature was described in #15196 and added in #16194. Then the logging v2 API was added in f6a40ae which did not carry-over the feature / fix.

There does not seem to be any test coverage for this feature though, and a regression appeared that would hang systems.

Please also mention any information which could help others to understand
the problem you're facing:

To Reproduce

  1. Check out tests: logging: add log_blocking testsuite to demonstrate bugfix #63959
  2. Revert the commit that includes the fix.
  3. west build -p auto -b tests/subsys/logging/log_blocking/
  4. See error

Expected behavior
The testsuite should succeed with all permutations specified in the testcase.yaml provided.

Impact
This has been a massive pain point for us, since logging behaviour changed quite unexpectedly in the v2.x series.

Per policy, not just for diagnostics, we need enable the collection of logs from our devices, and nobody had been able to narrow down why this changed for .. quite a while. We have been evaluating 3.x releases recently in order to prepare moving from LTSv2. This was the one regression that is holding us back.

Logs and console output
Working on a testsuite at the moment, but the simplest way to reproduce this would be to verify the following failure in any testsuite.

// we expect this assertion to succeed, but it is  current, unintended behaviour of the log core
__ASSERT(K_TIMEOUT_EQ(K_NO_WAIT, K_MSEC(-1)), "K_NO_WAIT is not equal to K_MSEC(-1)");
// we expect this assertion to fail, but it is the intended behaviour of the log core
__ASSERT(K_TIMEOUT_EQ(K_FOREVER, K_MSEC(-1)), "K_FOREVER is not equal to K_MSEC(-1)");
*** Booting Zephyr OS build v3.5.0-rc3-8-g3aa2d62d2096 ***
Running TESTSUITE log_blocking
===================================================================
ASSERTION FAIL [((((k_timeout_t) { .ticks = (((k_ticks_t) -1)) })).ticks == (((k_timeout_t) { .ticks = ((k_ticks_t)((1) ? ( ((100) == (1000)) ? (uint64_t) ((((-1) > (0)) ? (-1) : (0))) : ((1000) > (100) && (1000) % (100) == 0U) ? (((uint64_t) ((((-1) > (0)) ? (-1) : (0))) + ((0) ? ((1000) / (100)) / 2 : (1) ? ((1000) / (100)) - 1 : 0)) / ((1000) / (100))) : ((100) > (1000) && (100) % (1000) == 0U) ? (uint64_t) ((((-1) > (0)) ? (-1) : (0)))*((100) / (1000)) : ((((((365 * 24ULL * 3600ULL * 1000) + ((0xffffffffU)) - 1) / ((0xffffffffU))) * 100) <= (0xffffffffU)) ? (((uint64_t) ((((-1) > (0)) ? (-1) : (0)))*(100) + ((0) ? (1000) / 2 : (1) ? (1000) - 1 : 0)) / (1000)) : (((uint64_t) ((((-1) > (0)) ? (-1) : (0))) / (1000))*(100) + (((uint64_t) ((((-1) > (0)) ? (-1) : (0))) % (1000))*(100) + ((0) ? (1000) / 2 : (1) ? (1000) - 1 : 0)) / (1000))) ) : (((uint64_t) ((((-1) > (0)) ? (-1) : (0))) / (1000))*(100) + (((uint64_t) ((((-1) > (0)) ? (-1) : (0))) % (1000))*(100) + ((0) ? (1000) / 2 : (1) ? (1000) - 1 : 0)) / (1000)) )) })).ticks)] @ WEST_TOPDIR/zephyr/tests/subsys/logging/log_blocking/src/main.c:107
        K_FOREVER is not equal to K_MSEC(-1)

    Assertion failed at WEST_TOPDIR/zephyr/tests/subsys/logging/log_blocking/src/main.c:55: panic: false is false

There is no test coverage to point to (yet)

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: Zephyr SDK v0.16.1
  • Commit SHA or Version used: 90b9809

Additional context

@cfriedt cfriedt added bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug area: Logging Regression Something, which was working, does not anymore labels Oct 15, 2023
@cfriedt cfriedt added backport v2.7-branch Request backport to the v2.7-branch backport v3.4-branch Backport to the v3.4-branch and removed backport v2.7-branch Request backport to the v2.7-branch backport v3.4-branch Backport to the v3.4-branch labels Oct 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Logging bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug Regression Something, which was working, does not anymore
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants