-
Notifications
You must be signed in to change notification settings - Fork 4
[PWCI] "net/intel/e1000: reduce the optimization level for gcc > 11" #144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Running rte_exit() in a forked process means that shared memory will be released by the child process before the parent process does the same. This issue has been seen recently when some GHA virtual machine (with some mlx5 devices) runs the debug_autotest unit test. Instead, run rte_panic() and rte_exit() from a new DPDK process spawned like for other recursive unit tests. Bugzilla ID: 1796 Fixes: af75078 ("first public release") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
The e1000 PMD stopped working under Ubuntu-24.04 (using gcc-13) when compiled with -O3 (default level for all DPDK code). There is a crash when starting testpmd: (gdb) bt #0 rte_read32_relaxed (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:290 #1 rte_read32 (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:345 #2 e1000_read_addr (addr=0x1100800e00) at ../sources/drivers/net/intel/e1000/base/e1000_osdep.h:106 #3 e1000_id_led_init_generic (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/base/e1000_mac.c:1844 #4 0x000062aaf653c85f in e1000_init_hw_82540 (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/base/e1000_82540.c:308 #5 0x000062aaf6db8227 in em_hardware_init (hw=hw@entry=0x1586788c0) at ../sources/drivers/net/intel/e1000/em_ethdev.c:920 #6 0x000062aaf65340ff in em_hw_init (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/em_ethdev.c:445 #7 eth_em_dev_init (eth_dev=eth_dev@entry=0x62aaff346000 <rte_eth_devices>) at ../sources/drivers/net/intel/e1000/em_ethdev.c:314 #8 0x000062aaf6db8b71 in rte_eth_dev_pci_generic_probe (private_data_size=11240, dev_init=0x62aaf6db8310 <eth_em_dev_init>, pci_dev=0x62ab2853dd90) at ../sources/lib/ethdev/ethdev_pci.h:150 #9 eth_em_pci_probe (pci_drv=<optimized out>, pci_dev=0x62ab2853dd90) at ../sources/drivers/net/intel/e1000/em_ethdev.c:365 #10 0x000062aaf646adf5 in rte_pci_probe_one_driver (dr=dr@entry=0x62aaf82d8020 <rte_em_pmd>, dev=dev@entry=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:299 #11 0x000062aaf6a15f7d in pci_probe_all_drivers (dev=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:383 #12 pci_probe () at ../sources/drivers/bus/pci/pci_common.c:410 #13 0x000062aaf7a485f3 in rte_bus_probe () at ../sources/lib/eal/common/eal_common_bus.c:84 #14 0x000062aaf670585d in rte_eal_init (argc=argc@entry=146, argv=argv@entry=0x7fffca468898) at ../sources/lib/eal/linux/eal.c:1253 The crash is linked to the use of gcc-13: uner Ubuntu-24.04 testpmd compiled with gcc-11 from the same DPDK tree works as expected. The perfect solution would be for someone to investigate why the PMD crashes. However, this depends on Maintainer availability. A less-perfect solution is to reduce the optimization level (like another proposal for net/qede: see Link). Note: if more regressions are seen in less-frequently used PMDs, maybe we should switch the default optimization level to -O1, (tree-wide) and only rise the optimization level for actively maintained PMDs, which are proven to work as expected with higher optimization levels. Link: http://patches.dpdk.org/project/dpdk/patch/20250909054023.3263401-1-thierry.herbelot@6wind.com/ Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: 0-day Robot <robot@bytheb.org>
Reviewer's GuideThis PR modifies the Intel e1000 driver Meson builds to reduce the optimization level to -O1 for GCC ≥12 and refactors the debug test suite to replace fork-based panic/exit checks with a recursive process_dup mechanism, adds Windows test stubs, and inlines the get_current_prefix helper. Class diagram for updated test debug suiteclassDiagram
class TestDebug {
+process_dup()
+get_current_prefix()
-fork-based panic/exit checks (removed)
}
class WindowsTestStub {
+stub methods for Windows
}
TestDebug --> WindowsTestStub : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
WalkthroughAdds two new test actions (test_panic, test_exit) and exposes their declarations. Refactors non-Windows test_debug flow to use a process helper and env-driven recursion; provides Windows stubs. Inlines a Linux-only helper in process.h. Updates e1000 Meson builds to apply -O1 for gcc>=12. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Tester as test_debug (non-Windows)
participant Env as Environment
participant Proc as process_dup (child)
participant Actions as test_* actions
Tester->>Env: Read do_recursive_call/action
alt first run (no env)
Tester->>Proc: spawn child with env(action=...)
Proc->>Actions: dispatch action (e.g., test_panic/test_exit)
Actions-->>Proc: return/exit
Proc-->>Tester: exit status
else recursive run
Tester->>Actions: call mapped action directly
Actions-->>Tester: return/exit
end
note over Actions: New actions: test_panic, test_exit
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- This PR mixes debug-test refactoring and build-optimization tweaks—consider splitting into two focused PRs for easier review.
- Rather than inserting empty-string placeholders into test_args, build the argv array dynamically so you only pass the flags you actually need.
- In your meson.build changes, append optimization flags using list notation (e.g. cflags += ['-O1']) to ensure Meson treats them as single flags.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- This PR mixes debug-test refactoring and build-optimization tweaks—consider splitting into two focused PRs for easier review.
- Rather than inserting empty-string placeholders into test_args, build the argv array dynamically so you only pass the flags you actually need.
- In your meson.build changes, append optimization flags using list notation (e.g. cflags += ['-O1']) to ensure Meson treats them as single flags.
## Individual Comments
### Comment 1
<location> `drivers/net/intel/e1000/meson.build:28-29` </location>
<code_context>
'e1000_vf.c',
)
+# testpmd crashes with gcc > 11 with compiling with default -O3 or -O2
+if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0'))
+ base_cflags += '-O1'
+endif
</code_context>
<issue_to_address>
**suggestion:** Check for possible override of existing optimization flags.
Ensure that higher optimization flags are removed from 'cflags' before adding '-O1' to avoid unintended compiler behavior.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0')) | ||
cflags += '-O1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Check for possible override of existing optimization flags.
Ensure that higher optimization flags are removed from 'cflags' before adding '-O1' to avoid unintended compiler behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/test/test_debug.c (1)
102-105
: Consider caching getenv result (optional).The static analysis tool flags a potential NULL dereference at Line 105, but this is a false positive—Line 102 panics if
getenv
returns NULL. However, for slightly better clarity and to silence the warning, you could cache the result:+ const char *exit_val_str = getenv("TEST_DEBUG_EXIT_VAL"); + if (!exit_val_str) - if (!getenv("TEST_DEBUG_EXIT_VAL")) rte_panic("No exit value set in env\n"); - exit_val = strtol(getenv("TEST_DEBUG_EXIT_VAL"), NULL, 0); + exit_val = strtol(exit_val_str, NULL, 0);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
app/test/process.h
(1 hunks)app/test/test.c
(1 hunks)app/test/test.h
(1 hunks)app/test/test_debug.c
(5 hunks)drivers/net/intel/e1000/base/meson.build
(1 hunks)drivers/net/intel/e1000/meson.build
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
app/test/test.c (1)
app/test/test_debug.c (4)
test_panic
(11-16)test_panic
(51-72)test_exit
(17-22)test_exit
(93-115)
app/test/test.h (1)
app/test/test_debug.c (4)
test_exit
(17-22)test_exit
(93-115)test_panic
(11-16)test_panic
(51-72)
app/test/test_debug.c (2)
app/test/process.h (1)
process_dup
(72-200)lib/eal/common/eal_common_lcore.c (1)
rte_get_main_lcore
(23-26)
🪛 ast-grep (0.39.5)
app/test/test_debug.c
[warning] 104-104: The getenv
function returns NULL on error and this line dereferences the return value without checking for NULL.
Context: strtol(getenv("TEST_DEBUG_EXIT_VAL"), NULL, 0)
Note: [CWE-476] NULL Pointer Dereference. [REFERENCES]
- https://wiki.sei.cmu.edu/confluence/display/c/EXP34-C.+Do+not+dereference+null+pointers
(null-library-function-c)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (10)
app/test/test.h (1)
177-177
: LGTM! Clean API additions.The new public function declarations for
test_exit
andtest_panic
properly expose the test entry points, aligning with the implementations intest_debug.c
and the environment-action mappings intest.c
.Also applies to: 179-179
app/test/test.c (1)
83-84
: LGTM! Environment-action mappings correctly wired.The new mappings for
test_panic
andtest_exit
are consistent with existing patterns and correctly reference the newly exposed functions fromtest.h
.drivers/net/intel/e1000/meson.build (1)
27-30
: LGTM! GCC optimization workaround correctly implemented.The version check (
>=12.0
) correctly addresses the testpmd crash issue with GCC > 11 at higher optimization levels. This mitigation is consistent with the similar change inbase/meson.build
.drivers/net/intel/e1000/base/meson.build (1)
25-28
: LGTM! Consistent GCC workaround.This optimization adjustment mirrors the pattern in the parent
meson.build
and correctly mitigates the testpmd crash issue on GCC >= 12.app/test/process.h (1)
206-206
: LGTM! Proper inline declaration for header-defined helper.Changing to
static inline
prevents multiple definition issues when this header is included in multiple translation units, which aligns with the new usage pattern whereprocess.h
is now included intest_debug.c
.app/test/test_debug.c (5)
10-22
: LGTM! Windows stub implementations.The Windows-specific stubs properly skip unsupported debug tests with clear messaging, maintaining cross-platform compatibility.
40-42
: LGTM! Clean refactor to process-based testing.The include changes and
test_args
array introduction properly support the transition from fork-based toprocess_dup
-based multi-process testing.Also applies to: 49-49
51-72
: LGTM! test_panic correctly refactored.The function now properly uses
process_dup
with environment-driven control flow and is correctly exposed as a public function to align with the declaration intest.h
.
93-115
: LGTM! test_exit correctly refactored.The function now uses environment-driven recursion control and is properly exposed as public. The logic correctly handles multiple exit value test cases.
140-165
: LGTM! Proper test initialization.The test argument setup correctly handles both FreeBSD (no-shconf) and Linux (file-prefix) cases, and appropriately configures hugepage options based on availability.
NOTE: This is an auto submission for "net/intel/e1000: reduce the optimization level for gcc > 11".
See "http://patchwork.dpdk.org/project/dpdk/list/?series=36298" for details.
Summary by Sourcery
Refactor the debug test suite to use recursive process_dup calls instead of fork, improve cross-platform skipping and environment-based test flow, and reduce GCC optimization levels to O1 for the Intel e1000 driver when built with GCC>=12 to avoid testpmd crashes.
Bug Fixes:
Enhancements:
Build:
Tests:
Summary by CodeRabbit