-
Notifications
You must be signed in to change notification settings - Fork 4
[PWCI] "[V2] net/intel/e1000: reduce the optimization level for gcc > 11" #145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Running rte_exit() in a forked process means that shared memory will be released by the child process before the parent process does the same. This issue has been seen recently when some GHA virtual machine (with some mlx5 devices) runs the debug_autotest unit test. Instead, run rte_panic() and rte_exit() from a new DPDK process spawned like for other recursive unit tests. Bugzilla ID: 1796 Fixes: af75078 ("first public release") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
The e1000 PMD stopped working under Ubuntu-24.04 (using gcc-13) when compiled with -O3 (default level for all DPDK code). There is a crash when starting testpmd: > (gdb) bt > #0 rte_read32_relaxed (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:290 > #1 rte_read32 (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:345 > #2 e1000_read_addr (addr=0x1100800e00) at ../sources/drivers/net/intel/e1000/base/e1000_osdep.h:106 > #3 e1000_id_led_init_generic (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/base/e1000_mac.c:1844 > #4 0x000062aaf653c85f in e1000_init_hw_82540 (hw=0x1586788c0) > at ../sources/drivers/net/intel/e1000/base/e1000_82540.c:308 > #5 0x000062aaf6db8227 in em_hardware_init (hw=hw@entry=0x1586788c0) > at ../sources/drivers/net/intel/e1000/em_ethdev.c:920 > #6 0x000062aaf65340ff in em_hw_init (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/em_ethdev.c:445 > #7 eth_em_dev_init (eth_dev=eth_dev@entry=0x62aaff346000 <rte_eth_devices>) > at ../sources/drivers/net/intel/e1000/em_ethdev.c:314 > #8 0x000062aaf6db8b71 in rte_eth_dev_pci_generic_probe (private_data_size=11240, > dev_init=0x62aaf6db8310 <eth_em_dev_init>, pci_dev=0x62ab2853dd90) at ../sources/lib/ethdev/ethdev_pci.h:150 > #9 eth_em_pci_probe (pci_drv=<optimized out>, pci_dev=0x62ab2853dd90) > at ../sources/drivers/net/intel/e1000/em_ethdev.c:365 > #10 0x000062aaf646adf5 in rte_pci_probe_one_driver (dr=dr@entry=0x62aaf82d8020 <rte_em_pmd>, > dev=dev@entry=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:299 > #11 0x000062aaf6a15f7d in pci_probe_all_drivers (dev=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:383 > #12 pci_probe () at ../sources/drivers/bus/pci/pci_common.c:410 > #13 0x000062aaf7a485f3 in rte_bus_probe () at ../sources/lib/eal/common/eal_common_bus.c:84 > #14 0x000062aaf670585d in rte_eal_init (argc=argc@entry=146, argv=argv@entry=0x7fffca468898) > at ../sources/lib/eal/linux/eal.c:1253 The crash is linked to the use of gcc-13: under Ubuntu-24.04 testpmd compiled with gcc-11 from the same DPDK tree works as expected. The perfect solution would be for someone to investigate why the PMD crashes. However, this depends on Maintainer availability. A less-perfect solution is to reduce the optimization level (like another proposal for net/qede: see Link). Note: if more regressions are seen in less-frequently used PMDs, maybe we should switch the default optimization level to -O1, (tree-wide) and only rise the optimization level for actively maintained PMDs, which are proven to work as expected with higher optimization levels. Link: http://patches.dpdk.org/project/dpdk/patch/20250909054023.3263401-1-thierry.herbelot@6wind.com/ Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: 0-day Robot <robot@bytheb.org>
Reviewer's GuideThis PR refactors the test_debug suite to replace raw fork calls with a unified process_dup-based recursion mechanism using environment variables, introduces new test registrations for panic/exit paths, updates process helper API, and lowers optimization to -O1 for gcc >= 12 in the e1000 driver build scripts to avoid crashes. Class diagram for updated process helper API and test_debug suiteclassDiagram
class ProcessHelper {
+process_dup()
+set_env_var()
+get_env_var()
}
class TestDebug {
+register_panic_test()
+register_exit_test()
-use process_dup for recursion
}
ProcessHelper <|-- TestDebug
TestDebug o-- "Environment Variables"
TestDebug o-- "Test Registration"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
WalkthroughUpdates test infrastructure: adds test_exit and test_panic APIs, integrates them into the test dispatcher, and refactors non-Windows debug tests to use process-based execution with env propagation; Windows stubs skip these tests. Minor inline spec change in process.h. Meson build scripts add GCC≥12-specific -O1 flags for e1000 driver builds. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Runner as Test Runner
participant Debug as test_debug (non-Windows)
participant Proc as process_dup Child
participant Env as Env/Exit Code
Runner->>Debug: invoke test_debug / test_exit / test_panic
alt Non-Windows path
Debug->>Proc: spawn with process_dup(test_args)
Proc-->>Env: set exit/env signaling result
Debug-->>Runner: return based on Env/exit
else Windows path
Debug-->>Runner: skip (not supported)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Update the Meson version_compare check to '>11.0' instead of '>=12.0' so it accurately reflects the "gcc > 11" condition described in the comments.
- Extract the repeated optimization-flag logic into a shared Meson function or variable to avoid duplicating the '-O1' override in both e1000/meson.build and e1000/base/meson.build.
- Use snprintf instead of sprintf for formatting into the fixed-size
buf
andcore
buffers to prevent potential overflow.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Update the Meson version_compare check to '>11.0' instead of '>=12.0' so it accurately reflects the "gcc > 11" condition described in the comments.
- Extract the repeated optimization-flag logic into a shared Meson function or variable to avoid duplicating the '-O1' override in both e1000/meson.build and e1000/base/meson.build.
- Use snprintf instead of sprintf for formatting into the fixed-size `buf` and `core` buffers to prevent potential overflow.
## Individual Comments
### Comment 1
<location> `drivers/net/intel/e1000/meson.build:28-29` </location>
<code_context>
'e1000_vf.c',
)
+# testpmd crashes with gcc > 11 with compiling with default -O3 or -O2
+if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0'))
+ base_cflags += '-O1'
+endif
</code_context>
<issue_to_address>
**suggestion:** Check for duplicate or conflicting optimization flags.
Review the build configuration to confirm that '-O1' is the effective optimization level, considering possible precedence of other flags.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0')) | ||
cflags += '-O1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Check for duplicate or conflicting optimization flags.
Review the build configuration to confirm that '-O1' is the effective optimization level, considering possible precedence of other flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
drivers/net/intel/e1000/base/meson.build (1)
25-28
: Clarify the comment and consider documenting the duplication.This block duplicates the workaround from the parent meson.build file. While this is appropriate for scoping build flags to different compilation units, the same issues apply:
- The comment states "gcc > 11" but the condition checks for
>=12.0
. Consider updating to "gcc >= 12" for consistency.- Verify whether GCC 11.x point releases also exhibit the crash issue.
Apply this diff to align the comment:
-# testpmd crashes with gcc > 11 with compiling with default -O3 or -O2 +# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2Additionally, consider adding a cross-reference comment to indicate this workaround is also applied in the parent meson.build:
-# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2 +# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2 +# Note: Same workaround is applied in parent meson.build for non-base sourcesdrivers/net/intel/e1000/meson.build (1)
27-30
: Align comment with version check
- Change the comment to:
-# testpmd crashes with gcc > 11 when compiling with default -O3 or -O2 +# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2
- No known GCC 11.x–specific e1000 PMD crashes are documented; if you encounter issues on GCC 11.x, verify your DPDK version and capture a backtrace.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
app/test/process.h
(1 hunks)app/test/test.c
(1 hunks)app/test/test.h
(1 hunks)app/test/test_debug.c
(5 hunks)drivers/net/intel/e1000/base/meson.build
(1 hunks)drivers/net/intel/e1000/meson.build
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
app/test/test.c (1)
app/test/test_debug.c (4)
test_panic
(11-16)test_panic
(51-72)test_exit
(17-22)test_exit
(93-115)
app/test/test_debug.c (2)
app/test/process.h (1)
process_dup
(72-200)lib/eal/common/eal_common_lcore.c (1)
rte_get_main_lcore
(23-26)
app/test/test.h (1)
app/test/test_debug.c (4)
test_exit
(17-22)test_exit
(93-115)test_panic
(11-16)test_panic
(51-72)
🪛 ast-grep (0.39.5)
app/test/test_debug.c
[warning] 104-104: The getenv
function returns NULL on error and this line dereferences the return value without checking for NULL.
Context: strtol(getenv("TEST_DEBUG_EXIT_VAL"), NULL, 0)
Note: [CWE-476] NULL Pointer Dereference. [REFERENCES]
- https://wiki.sei.cmu.edu/confluence/display/c/EXP34-C.+Do+not+dereference+null+pointers
(null-library-function-c)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (10)
app/test/process.h (1)
206-206
: LGTM: Inline specifier prevents multiple definition issues.Adding
inline
to a static function defined in a header prevents potential multiple definition errors if this header is included in multiple translation units, while allowing the compiler to inline the function for optimization.app/test/test.c (1)
83-84
: LGTM: Test actions correctly registered.The new test actions for
test_panic
andtest_exit
are properly integrated into the recursive test dispatcher, enabling environment-driven invocation consistent with the implementations inapp/test/test_debug.c
.app/test/test.h (1)
177-179
: LGTM: Function declarations match implementations.The declarations for
test_exit
andtest_panic
are consistent with their implementations inapp/test/test_debug.c
and follow the established pattern for test functions.app/test/test_debug.c (7)
11-22
: LGTM: Windows stubs appropriately skip unsupported tests.The stubs for
test_panic
andtest_exit
correctly returnTEST_SKIPPED
on Windows, where debug features are not supported, maintaining consistency with the existingtest_debug
stub.
40-42
: LGTM: Header includes updated for refactored implementation.The inclusion of
rte_lcore.h
(forrte_get_main_lcore()
) andprocess.h
(forprocess_dup()
) correctly supports the refactored test execution approach.
49-49
: LGTM: Shared test arguments array.The
test_args
array centralizes test argument management for bothtest_panic
andtest_exit
, initialized intest_debug()
and used in process-based test execution.
51-72
: LGTM: Refactored to use centralized process execution.The function is now public (matching the declaration in
test.h
) and usesprocess_dup
for cleaner, more maintainable process-based test execution. The logic correctly verifies that the child process exits abnormally whenrte_panic
is called.
74-91
: LGTM: Environment-based exit value propagation.The refactored implementation uses environment variables to pass the expected exit value to the child process, which is cleaner than previous approaches. The buffer size of 5 is adequate for the test values (-1 to 255).
93-115
: LGTM: Refactored with environment-based recursion.The function is now public and properly handles the recursive test case by reading
TEST_DEBUG_EXIT_VAL
from the environment and callingrte_exit
with the specified value. The NULL check at lines 102-103 ensures safe usage ofgetenv
at line 105, making the static analysis warning a false positive.
138-175
: LGTM: Test arguments properly initialized for cross-platform execution.The
test_args
array is correctly initialized with platform-specific prefixes (BSD:--no-shconf
, Linux:--file-prefix=debug
) and hugepage-aware memory configuration. The main lcore ID is properly retrieved, and all test invocations are appropriate.
NOTE: This is an auto submission for "[V2] net/intel/e1000: reduce the optimization level for gcc > 11".
See "http://patchwork.dpdk.org/project/dpdk/list/?series=36299" for details.
Summary by Sourcery
Refactor the debug test harness to use process spawning for panic and exit tests with cross-platform support, register the new test commands, and adjust the Intel e1000 driver Meson builds to use -O1 for GCC >=12 to prevent testpmd crashes.
Enhancements:
Build:
Tests:
Summary by CodeRabbit
Bug Fixes
Tests
Refactor