Skip to content

Conversation

ovsrobot
Copy link
Owner

@ovsrobot ovsrobot commented Oct 6, 2025

NOTE: This is an auto submission for "[V2] net/intel/e1000: reduce the optimization level for gcc > 11".

See "http://patchwork.dpdk.org/project/dpdk/list/?series=36299" for details.

Summary by Sourcery

Refactor the debug test harness to use process spawning for panic and exit tests with cross-platform support, register the new test commands, and adjust the Intel e1000 driver Meson builds to use -O1 for GCC >=12 to prevent testpmd crashes.

Enhancements:

  • Replace fork-based panic and exit tests with a recursive process_dup mechanism using environment variables
  • Add Windows and FreeBSD-specific handling to skip unsupported debug tests or adjust command-line prefixes
  • Register test_panic and test_exit commands in the test command list

Build:

  • Lower the optimization level to -O1 for Intel e1000 driver builds on GCC versions 12 and above in Meson build files

Tests:

  • Add cross-platform stub implementations for test_panic and test_exit on Windows to skip tests
  • Inline get_current_prefix in process.h and update environment-variable logic for recursive test invocation

Summary by CodeRabbit

  • Bug Fixes

    • Improved stability for Intel e1000 driver when built with GCC 12+, reducing crashes by adjusting optimization settings.
  • Tests

    • Added new test actions: test_panic and test_exit.
    • Windows builds now safely skip these tests.
  • Refactor

    • Streamlined debug test execution across platforms with process-based execution and environment propagation for exit values, preserving behavior on non-Windows.

david-marchand and others added 2 commits October 5, 2025 18:26
Running rte_exit() in a forked process means that shared memory will be
released by the child process before the parent process does the same.
This issue has been seen recently when some GHA virtual machine (with
some mlx5 devices) runs the debug_autotest unit test.

Instead, run rte_panic() and rte_exit() from a new DPDK process spawned
like for other recursive unit tests.

Bugzilla ID: 1796
Fixes: af75078 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
The e1000 PMD stopped working under Ubuntu-24.04 (using gcc-13) when
compiled with -O3 (default level for all DPDK code). There is a crash
when starting testpmd:

> (gdb) bt
> #0  rte_read32_relaxed (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:290
> #1  rte_read32 (addr=0x1100800e00) at ../sources/lib/eal/include/generic/rte_io.h:345
> #2  e1000_read_addr (addr=0x1100800e00) at ../sources/drivers/net/intel/e1000/base/e1000_osdep.h:106
> #3  e1000_id_led_init_generic (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/base/e1000_mac.c:1844
> #4  0x000062aaf653c85f in e1000_init_hw_82540 (hw=0x1586788c0)
>     at ../sources/drivers/net/intel/e1000/base/e1000_82540.c:308
> #5  0x000062aaf6db8227 in em_hardware_init (hw=hw@entry=0x1586788c0)
>     at ../sources/drivers/net/intel/e1000/em_ethdev.c:920
> #6  0x000062aaf65340ff in em_hw_init (hw=0x1586788c0) at ../sources/drivers/net/intel/e1000/em_ethdev.c:445
> #7  eth_em_dev_init (eth_dev=eth_dev@entry=0x62aaff346000 <rte_eth_devices>)
>     at ../sources/drivers/net/intel/e1000/em_ethdev.c:314
> #8  0x000062aaf6db8b71 in rte_eth_dev_pci_generic_probe (private_data_size=11240,
>     dev_init=0x62aaf6db8310 <eth_em_dev_init>, pci_dev=0x62ab2853dd90) at ../sources/lib/ethdev/ethdev_pci.h:150
> #9  eth_em_pci_probe (pci_drv=<optimized out>, pci_dev=0x62ab2853dd90)
>     at ../sources/drivers/net/intel/e1000/em_ethdev.c:365
> #10 0x000062aaf646adf5 in rte_pci_probe_one_driver (dr=dr@entry=0x62aaf82d8020 <rte_em_pmd>,
>     dev=dev@entry=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:299
> #11 0x000062aaf6a15f7d in pci_probe_all_drivers (dev=0x62ab2853dd90) at ../sources/drivers/bus/pci/pci_common.c:383
> #12 pci_probe () at ../sources/drivers/bus/pci/pci_common.c:410
> #13 0x000062aaf7a485f3 in rte_bus_probe () at ../sources/lib/eal/common/eal_common_bus.c:84
> #14 0x000062aaf670585d in rte_eal_init (argc=argc@entry=146, argv=argv@entry=0x7fffca468898)
>     at ../sources/lib/eal/linux/eal.c:1253

The crash is linked to the use of gcc-13: under Ubuntu-24.04 testpmd
compiled with gcc-11 from the same DPDK tree works as expected.

The perfect solution would be for someone to investigate why the
PMD crashes. However, this depends on Maintainer availability.

A less-perfect solution is to reduce the optimization level
(like another proposal for net/qede: see Link).

Note: if more regressions are seen in less-frequently used PMDs,
      maybe we should switch the default optimization level to -O1,
      (tree-wide) and only rise the optimization level for actively
      maintained PMDs, which are proven to work as expected with
      higher optimization levels.

Link: http://patches.dpdk.org/project/dpdk/patch/20250909054023.3263401-1-thierry.herbelot@6wind.com/
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
Copy link

sourcery-ai bot commented Oct 6, 2025

Reviewer's Guide

This PR refactors the test_debug suite to replace raw fork calls with a unified process_dup-based recursion mechanism using environment variables, introduces new test registrations for panic/exit paths, updates process helper API, and lowers optimization to -O1 for gcc >= 12 in the e1000 driver build scripts to avoid crashes.

Class diagram for updated process helper API and test_debug suite

classDiagram
    class ProcessHelper {
        +process_dup()
        +set_env_var()
        +get_env_var()
    }
    class TestDebug {
        +register_panic_test()
        +register_exit_test()
        -use process_dup for recursion
    }
    ProcessHelper <|-- TestDebug
    TestDebug o-- "Environment Variables"
    TestDebug o-- "Test Registration"
Loading

File-Level Changes

Change Details Files
Refactor debug tests to use process_dup and environment-driven recursion
  • Replace fork() usage with process_dup and RTE_RECURSIVE_ENV_VAR checks
  • Populate test_args array with EAL startup flags based on environment (hugepages and prefix)
  • Use TEST_DEBUG_EXIT_VAL environment variable to pass expected exit codes to child processes
  • Add test_panic and test_exit implementations for Windows and recursive branch
  • Register new test_panic and test_exit commands in test.c and test.h
app/test/test_debug.c
app/test/process.h
app/test/test.c
app/test/test.h
Lower optimization level to -O1 for gcc >= 12 in e1000 driver builds
  • Check gcc version in meson.build and append '-O1' to cflags if version >= 12
  • Apply same logic for both top-level and base e1000 driver build files
drivers/net/intel/e1000/meson.build
drivers/net/intel/e1000/base/meson.build
Convert get_current_prefix helper to inline
  • Change static char * to static inline char * for get_current_prefix in process.h
app/test/process.h

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

coderabbitai bot commented Oct 6, 2025

Walkthrough

Updates test infrastructure: adds test_exit and test_panic APIs, integrates them into the test dispatcher, and refactors non-Windows debug tests to use process-based execution with env propagation; Windows stubs skip these tests. Minor inline spec change in process.h. Meson build scripts add GCC≥12-specific -O1 flags for e1000 driver builds.

Changes

Cohort / File(s) Summary
Test actions and APIs
app/test/test.h, app/test/test.c
Declares int test_exit(void) and int test_panic(void) and registers them in the recursive test dispatcher actions array.
Debug test execution refactor
app/test/test_debug.c
Non-Windows: switches to process-based execution (process_dup), shares test args, and uses env-based propagation for exit values; adjusts includes. Windows: provides stubs for test_panic/test_exit and skips debug flow.
Process helper signature tweak
app/test/process.h
Changes get_current_prefix declaration to static inline char * under RTE_EXEC_ENV_LINUX; body unchanged.
e1000 build flags (GCC≥12)
drivers/net/intel/e1000/base/meson.build, drivers/net/intel/e1000/meson.build
Adds conditional -O1 to cflags/base_cflags when compiler is GCC version ≥12.0; no runtime code changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Runner as Test Runner
  participant Debug as test_debug (non-Windows)
  participant Proc as process_dup Child
  participant Env as Env/Exit Code

  Runner->>Debug: invoke test_debug / test_exit / test_panic
  alt Non-Windows path
    Debug->>Proc: spawn with process_dup(test_args)
    Proc-->>Env: set exit/env signaling result
    Debug-->>Runner: return based on Env/exit
  else Windows path
    Debug-->>Runner: skip (not supported)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump my paw—new tests to try,
Forks now hop to process sky.
On Windows, naps—we skip the fray,
While GCC’s breeze blows -O1 our way.
With tidy headers, prefixes fine,
I nibble carrots—green build vine. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title mentions only the e1000 driver build flag change and includes automation markers and quotes, but it does not reflect the significant test framework additions in app/test nor follow concise titling conventions. Rename the PR to a concise, descriptive title that highlights both the test API additions (test_panic/test_exit) and the driver optimization change, and remove automation tags and unnecessary quotes, or split into separate PRs focused on each area.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch series_36299

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Update the Meson version_compare check to '>11.0' instead of '>=12.0' so it accurately reflects the "gcc > 11" condition described in the comments.
  • Extract the repeated optimization-flag logic into a shared Meson function or variable to avoid duplicating the '-O1' override in both e1000/meson.build and e1000/base/meson.build.
  • Use snprintf instead of sprintf for formatting into the fixed-size buf and core buffers to prevent potential overflow.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Update the Meson version_compare check to '>11.0' instead of '>=12.0' so it accurately reflects the "gcc > 11" condition described in the comments.
- Extract the repeated optimization-flag logic into a shared Meson function or variable to avoid duplicating the '-O1' override in both e1000/meson.build and e1000/base/meson.build.
- Use snprintf instead of sprintf for formatting into the fixed-size `buf` and `core` buffers to prevent potential overflow.

## Individual Comments

### Comment 1
<location> `drivers/net/intel/e1000/meson.build:28-29` </location>
<code_context>
         'e1000_vf.c',
 )
+# testpmd crashes with gcc > 11 with compiling with default -O3 or -O2
+if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0'))
+    base_cflags += '-O1'
+endif
</code_context>

<issue_to_address>
**suggestion:** Check for duplicate or conflicting optimization flags.

Review the build configuration to confirm that '-O1' is the effective optimization level, considering possible precedence of other flags.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +28 to +29
if (cc.get_id() == 'gcc' and cc.version().version_compare('>=12.0'))
cflags += '-O1'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Check for duplicate or conflicting optimization flags.

Review the build configuration to confirm that '-O1' is the effective optimization level, considering possible precedence of other flags.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
drivers/net/intel/e1000/base/meson.build (1)

25-28: Clarify the comment and consider documenting the duplication.

This block duplicates the workaround from the parent meson.build file. While this is appropriate for scoping build flags to different compilation units, the same issues apply:

  1. The comment states "gcc > 11" but the condition checks for >=12.0. Consider updating to "gcc >= 12" for consistency.
  2. Verify whether GCC 11.x point releases also exhibit the crash issue.

Apply this diff to align the comment:

-# testpmd crashes with gcc > 11 with compiling with default -O3 or -O2
+# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2

Additionally, consider adding a cross-reference comment to indicate this workaround is also applied in the parent meson.build:

-# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2
+# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2
+# Note: Same workaround is applied in parent meson.build for non-base sources
drivers/net/intel/e1000/meson.build (1)

27-30: Align comment with version check

  • Change the comment to:
-# testpmd crashes with gcc > 11 when compiling with default -O3 or -O2
+# testpmd crashes with gcc >= 12 when compiling with default -O3 or -O2
  • No known GCC 11.x–specific e1000 PMD crashes are documented; if you encounter issues on GCC 11.x, verify your DPDK version and capture a backtrace.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76393b9 and d9fdda9.

📒 Files selected for processing (6)
  • app/test/process.h (1 hunks)
  • app/test/test.c (1 hunks)
  • app/test/test.h (1 hunks)
  • app/test/test_debug.c (5 hunks)
  • drivers/net/intel/e1000/base/meson.build (1 hunks)
  • drivers/net/intel/e1000/meson.build (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
app/test/test.c (1)
app/test/test_debug.c (4)
  • test_panic (11-16)
  • test_panic (51-72)
  • test_exit (17-22)
  • test_exit (93-115)
app/test/test_debug.c (2)
app/test/process.h (1)
  • process_dup (72-200)
lib/eal/common/eal_common_lcore.c (1)
  • rte_get_main_lcore (23-26)
app/test/test.h (1)
app/test/test_debug.c (4)
  • test_exit (17-22)
  • test_exit (93-115)
  • test_panic (11-16)
  • test_panic (51-72)
🪛 ast-grep (0.39.5)
app/test/test_debug.c

[warning] 104-104: The getenv function returns NULL on error and this line dereferences the return value without checking for NULL.
Context: strtol(getenv("TEST_DEBUG_EXIT_VAL"), NULL, 0)
Note: [CWE-476] NULL Pointer Dereference. [REFERENCES]
- https://wiki.sei.cmu.edu/confluence/display/c/EXP34-C.+Do+not+dereference+null+pointers

(null-library-function-c)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (10)
app/test/process.h (1)

206-206: LGTM: Inline specifier prevents multiple definition issues.

Adding inline to a static function defined in a header prevents potential multiple definition errors if this header is included in multiple translation units, while allowing the compiler to inline the function for optimization.

app/test/test.c (1)

83-84: LGTM: Test actions correctly registered.

The new test actions for test_panic and test_exit are properly integrated into the recursive test dispatcher, enabling environment-driven invocation consistent with the implementations in app/test/test_debug.c.

app/test/test.h (1)

177-179: LGTM: Function declarations match implementations.

The declarations for test_exit and test_panic are consistent with their implementations in app/test/test_debug.c and follow the established pattern for test functions.

app/test/test_debug.c (7)

11-22: LGTM: Windows stubs appropriately skip unsupported tests.

The stubs for test_panic and test_exit correctly return TEST_SKIPPED on Windows, where debug features are not supported, maintaining consistency with the existing test_debug stub.


40-42: LGTM: Header includes updated for refactored implementation.

The inclusion of rte_lcore.h (for rte_get_main_lcore()) and process.h (for process_dup()) correctly supports the refactored test execution approach.


49-49: LGTM: Shared test arguments array.

The test_args array centralizes test argument management for both test_panic and test_exit, initialized in test_debug() and used in process-based test execution.


51-72: LGTM: Refactored to use centralized process execution.

The function is now public (matching the declaration in test.h) and uses process_dup for cleaner, more maintainable process-based test execution. The logic correctly verifies that the child process exits abnormally when rte_panic is called.


74-91: LGTM: Environment-based exit value propagation.

The refactored implementation uses environment variables to pass the expected exit value to the child process, which is cleaner than previous approaches. The buffer size of 5 is adequate for the test values (-1 to 255).


93-115: LGTM: Refactored with environment-based recursion.

The function is now public and properly handles the recursive test case by reading TEST_DEBUG_EXIT_VAL from the environment and calling rte_exit with the specified value. The NULL check at lines 102-103 ensures safe usage of getenv at line 105, making the static analysis warning a false positive.


138-175: LGTM: Test arguments properly initialized for cross-platform execution.

The test_args array is correctly initialized with platform-specific prefixes (BSD: --no-shconf, Linux: --file-prefix=debug) and hugepage-aware memory configuration. The main lcore ID is properly retrieved, and all test invocations are appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants