[PWCI] "Optimization Summary for RISC-V rte_memcpy" #195

ovsrobot · 2025-10-16T10:34:10Z

Auto-submission for "http://patchwork.dpdk.org/project/dpdk/list/?series=36374"

Summary by CodeRabbit

Release Notes

New Features
- DMA enqueue/dequeue operations support for batch DMA processing.
- Automatic lcore ID remapping to contiguous ranges with new EAL option.
- Multiple command file support in testpmd with echo/no-echo variants.
- RISC-V V-ISA optimized memcpy and prefetch extension detection.
- Argparse library: CORELIST type and getopt-style argument reordering.
- Graph dispatch model callback notifications on packet enqueue.
API Changes
- pcapng interface registration now requires explicit link type parameter.
- DMA silent mode configuration migrated to flags-based approach.
Bug Fixes
- Timing-safe memory comparisons in cryptographic operations.

After an upgrade to MinGW version 13, some compilation errors appear: drivers/bus/pci/windows/pci.c:362:58: error: 'GUID_DEVCLASS_NETUIO' undeclared drivers/bus/pci/windows/pci_netuio.c:57:39: error: 'GUID_DEVINTERFACE_NETUIO' undeclared The cause is MinGW has set NTDDI_VERSION to the highest version without defining the expected NetUIO constants. The case MinGW64 is added to define NetUIO constants. Some comments are improved to better track includes requirements. Fixes: 6605c7f ("bus/pci: fix build with Windows SDK >= 10.0.20253") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

After an upgrade to MinGW version 13, compilation breaks: drivers/net/mlx5/windows/mlx5_ethdev_os.c:285:69: error: 'dev_link.<U1000>.<Uaf00>.link_autoneg' may be used uninitialized This is because link_autoneg is never set in mlx5_link_update(). It can be set to the previous value (no change). Also it does not make sense to check this value to return the update status as it does not change. Fixes: 6fbd737 ("net/mlx5: support link update on Windows") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

After an upgrade to MinGW version 13, compilation breaks: In function 'rte_bbdev_queue_ops_dump': lib/bbdev/rte_bbdev.c:1269:63: error: '%s' directive argument is null [-Werror=format-overflow=] fprintf(f, " Enqueue Status Counters %s %" PRIu64 "\n", The enqueue status string may be null if the index is too high, because RTE_BBDEV_ENQ_STATUS_SIZE_MAX is defined to include padding for future enum insertion. This padding case must be checked to avoid printing a dump of a non-existing status. The type of the variable i is also changed to the enum required by the function rte_bbdev_enqueue_status_str(). Fixes: 353e363 ("bbdev: add queue debug dump") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

Add enqueue/dequeue operations that use struct rte_dma_op to communicate with the DMA device. These operations need to be enabled at DMA device configuration time by setting the flag rte_dma_conf::enable_enq_deq if the device supports RTE_DMA_CAPA_OPS_ENQ_DEQ capability. When the DMA device is configured with RTE_DMA_CFG_FLAG_ENQ_DEQ flag, the enqueue/dequeue operations should be used to perform DMA operations. All other operations i.e., rte_dma_copy, rte_dma_copy_sg, rte_dma_fill, rte_dma_submit, rte_dma_completed, rte_dma_completed_status are not supported. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Add enqueue dequeue operations tests. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

This patch adds an option to measure performance of enq/deq operations in the benchmark app. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Implement DMA enqueue/dequeue operations when application enables it via configuration. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Migrate all invocations of rte_event_dma_adapter_op API to rte_dma_op. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Amit Prakash Shukla <amitprakashs@marvell.com>

Stop dmadev may fail if there are pending DMA transfers, we need make sure there are no pending DMA transfers when stop. This commit uses following scheme: 1. flag stop proc so that new request will not process. 2. setting drop flag for all descriptor to quick complete. 3. waiting dmadev to complete. Fixes: 3c5f5f0 ("dma/hisilicon: add control path") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

In the runtest() function, it will return error if the test return value is less than 0. But the 'copy' and 'error_handling' testcase may return 1 if failed because its internal use or(||) operation. This commit fix it by treating non-zero as error in runtest() function. Fixes: 1b86a66 ("test/dma: add more comprehensive copy tests") Fixes: 99d7ec4 ("test/dma: add failure handling tests") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

The default ringsize is 512, some DMA devices may not support, this commit support auto adapt ringsize. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

There is no standard, cross-platform function to get the basename of a file path across all the supported DPDK platforms, Linux, BSD and Windows. Both Linux and BSD have a "basename" function in standard library, except: * Linux has two different basename functions, a POSIX version (which may or may not modify args), and a GNU one which is guaranteed *not* to modify the input arg and returns pointer to internal storage. * FreeBSD has just the one basename function, but, to be different, it is guaranteed *always* to modify the argument and re-use it for output. * Windows just doesn't have a basename function, but provides _split_path as a similar function, but with many differences over basename, e.g. splitting off extension, returning empty basename if path ends in "/" etc. etc. Therefore, rather than just trying to implement basename for windows, which opens the question as to whether to emulate GNU and *never* modify arg, or emulate BSD and *always* modify arg, this patchset introduces "rte_basename" which should have defined behaviour on all platforms. The patch also introduces a set of test cases to confirm consistent behaviour on all platforms too. The behaviour is as in doxygen docs. Essentially: - does not modify input path buffer - returns output in a separate output buffer - uses snprintf and strlcpy style return value to indicate truncation Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

The echoing of commands contained in a file loaded via the "load" command on testpmd CLI is governed by whether or not a cmdline-file or cmdline-file-noecho had been passed to testpmd on the commandline. Remove the use of a global setting for this, and explicitly add a "load_echo" command to match the "load" command. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

While testpmd allows a set of pre-prepared commands to be passed into it at startup via the "cmdline-file" (and cmdline-file-noecho) parameters, this is currently limited to a single file. By extending this support to allow the parameter to be passed multiple (up to 16) times, we enable users to have a library of pre-canned cmdline files and pass multiple of these in to the same run, e.g. have one cmdline file per NIC feature and enable multiple features by passing in multiple filenames. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Two small improvements for the cmdline file processing in testpmd. * Now that we support multiple files, change the prompt to indicate what file is currently being processed, and just print an EOF message when done. * When not echoing, the "Read" verb in the message "Read CLI commands..." is a little ambiguous, as it could mean "I have read", or "Go and read", i.e. job done or job about to start. Tweak the text to "Finished reading" which is unambiguous. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Replace use of the Linux/BSD-only basename function with "rte_basename". Since the DPDK basename function is guaranteed to do a copy into the destination buffer of known length, we can sometimes simplify the code, or make it safer, as part of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Use the new "rte_basename" function which has common behaviour on all supported platforms. As extra benefit, it removes the need for recreating the socket path twice, since rte_basename guarantees not to modify its argument. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Even though the idxd_bus.c file is linux-only, we can simplify the code through using rte_basename rather than just basename. Since rte_basename does not modify its parameter, we can eliminate the strdup and free, and since it also handles NULL sanely, we can remove an extra NULL check. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Previously, the AMD power monitor implementation did not enable the timeout, causing the lcore to remain in a wait state until an external monitoring event occurred or an interrupt was received. This patch enables the timeout-based exit condition, allowing the lcore to automatically wake up after the specified period. The maximum supported timeout value is 2^32 - 1. Fixes: c7ed1ce ("eal/x86: add power intrinsics for AMD") Cc: stable@dpdk.org Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

The unit test for EAL init was passing in a NULL argv[0] value which is invalid. Update to have a valid string for argv[0]. Fixes: 104a92b ("app: add reentrancy tests") Fixes: 978ead0 ("test/func_reentrancy: fix EAL init call") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

Introduce a string define for the currently running OS, or execution environment. Originally, with old make build system, CONFIG_RTE_EXEC_ENV used to hold this name string, but the variable seems to have been missed in the meson build system, until commit cadb255 ("eal: add OS defines for C conditional checks") which introduced the RTE_EXEC_ENV for a different purpose. Now we can fix the docs with the new name reference. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

Make the function to print out the help text for an argparse object a public function, which takes as a new parameter the file stream on which to print. This can be used in future to allow application to extend their own help information. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

When the arguments passed to argparse include -h/--help then usage information is automatically printed. Provide the capability for the user to supply their own help function for this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

The different value types supported by the argparse library are not discussed in the documentation, so add some reference material about them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

Core lists are widely used in DPDK, so add support for parsing them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

Allow argparse to behave like getopt in ignoring non-flag arguments and moving them to the end of the list. This will allow use in DPDK for EAL args where, for example: ./dpdk-test -l 1,2 lcores_autotest --no-huge -- argparse_autotest will return from getopt processing in rte_eal_init as: ./dpdk-test -l 1,2 --no-huge -- lcores_autotest argparse_autotest with EAL init function then replacing the "--" with argv[0] before returning to the app with the return value of 5, for the 5 args (including the "--") processed. Since this is a significant behavioural change for argparse, which, unlike getopt, has the idea of positional arguments, we place this behaviour behind a new flag for argparse. If this new "ignore_non_flag_args" option is set, then positional arguments are disallowed and getopt behaviour is used for non-flag args. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

When passing an optional parameter to a argparse argument, the only way to do so has been to pass the argument using "=arg" on the end of the option, for example "--foo=bar". However, for short options, i.e. those of the form "-x", this requirement is unusual as getopt also supports taking additional characters after the option as the parameter, e.g. "-xarg". Add support for this form of parameter for short options to argparse. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

To simplify future rework of the EAL arg handling, add a long-option equivalent for each short option that doesn't already have one. When updating the docs with the new long options, standardize the format of options which have both short and long variants, and drop the deprecated service-coremask option from the docs, rather than adding its new long option. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>

Create eal_option_list.h, containing all the possible EAL parameters, and basic info about them, such as type, whether they take a parameter or not. Each entry is defined using a macro, which will be then interpreted when the file is included. First time this header in included in the eal_common_options.c file, the macros are defined in such a way as to define field elements for an "eal_init_args" structure, where each value is either a string type, if it takes a parameter, or boolean type if it doesn't. For those elements that take multiple values, i.e. are passed multiple times, we put them in a TAILQ. The second time of inclusion is for TAILQ initialization, as it is required before using such object. The third time of inclusion, the macros are defined so as to define the arguments in an rte_argparse structure for EAL. For the basic string and boolean types, we just store the values in the appropriate field in the previous defined "eal_init_args" structure. For the list elements, we use the argparse callback to process those elements, adding them to the TAILQ as they are encountered. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

DPDK traditionally has iterated through all args and processed them as they appear in the commandline. The arg processing logic can be simplified if instead we initially gather all arguments into a structure which is then processed with the arguments dealt with in a fixed/known order. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

When rte_eal_init fails part way through, any saved EAL arguments need to be freed, and the run_once flag needs to be set back to zero again. The former task was never done on failure, and the latter was only done on some occasions. Rework the error handling to always go to an err_out label where cleanup is done. To prevent memory leaks from the saved arguments when eal_init is called twice, the check for multiple calls must be done first before the argument saving and parsing is done. This patch modifies all three eal.c files. Windows doesn't actually need changes, since there are no args saved, and no run_once sentinel value, but updating it keeps it consistent with FreeBSD and Linux versions. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Remove the separate function to check combinations of cmdline parameters. Instead, just do those checks when parsing the parameters since we have all info about what parameters are provided at that point. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Use a utility function and macro to simplify the code for checking for conflicting cmdline options. The checking can also be done at the initial argument collating stage, shortening the argument processing function which is very much on the long side. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

The existing "available_cores" function in eal_common_options.c has general code for converting a set of cores to a printable string. Generalize this code into an "eal_cpuset_to_str" function which takes a cpuset as parameter, and create a new "available_cores" function using this new utility function. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Use the new cpuset to string function to replace other thread-specific functions which did similar work. Also use it for additional debug logs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Rather than working off arrays of lcore values, update the coremask handling to use rte_cpuset_t. The use of the array provides no additional value for coremasks, since the order of the bits set will always be from lowest core number to highest, making the tracking of what core was 0, 1, 2 etc. pointless. As part of this change, move the prototype for the parse_coremask function from rte_eal.h to eal_options.h, for two reasons: 1. This is an internal function that should not be used by apps and so there is no need to have it in a public header. 2. If the compilation command is run without the _GNU_SOURCE flag, we miss definitions from sched.h and so get build errors for the rte_cpuset_t type when compiling examples against an installed DPDK. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

Add the new "--remap-lcore-ids" or "-R" option to signify that lcore ids are to start at a low range. This takes an optional parameter to specify the value to start from. Initially, use this to adjust the core mappings when using a coremask. As well as remapping within the range of RTE_MAX_LCORE, to allow use of cores >=RTE_MAX_LCORE we need to expand the internal arrays of cores to be CPU_SETSIZE big, and limit the checks for RTE_MAX_LCORE to only cases where we are not remapping. This requires an additional parameter to the internal API rte_eal_parse_coremask, having a small knock-on effect on DLB driver as the only other user of this API. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

Allow the "-R" flag to be used with core lists so that we can specify lists with core ids >= RTE_MAX_LCORE in them, e.g. -l 140-148. In order to do so, we limit the use of the "@" symbol for per-core remapping when specifying a core list for automatic/global remapping, rather than dealing with the situation of some cores being manually remapped and others automatically. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

When run without an coremask or core list parameters, EAL detects the working set of cores allowed for the current process and uses that as a core list for use. Currently, this does not work for cores which are numbered >= RTE_MAX_LCORE. We can fix this by allowing the remap option to also be used with these auto-detected cores. In the process we simplify the code by taking advantage of that fact that we know up-front when processing parameters if any core parameter is passed, so we no longer need a flag and to auto-detect at the end. Instead, we take advantage of the fact that our core-arg processing routines now all work with cpusets, to have a simple tri-state option for remapping (if we don't have manual remapping done by --lcores-parameter): * parse coremask to cpuset * parse core list to cpuset * query OS for current thread's cpuset Once that is done, we have common code for processing, and optionally remapping, the resulting cpuset. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

The new function rte_memeq_timingsafe() is based on FreeBSD API to avoid timing side-channel attacks. Bugzilla ID: 1773 Signed-off-by: Kai Ji <kai.ji@intel.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

Replace memcmp() with rte_memeq_timingsafe() in cryptographic authentication verification operations across ipsec-mb drivers. Note: OpenSSL crypto driver already uses CRYPTO_memcmp() which provides equivalent timing attack resistance and is left unchanged. Note: scheduler driver memcmp stays unchanged as its not secret data comparison and actually faster with no timing attack risk. Bugzilla ID: 1773 Signed-off-by: Kai Ji <kai.ji@intel.com>

Allow the user to specify protocol link type when creating pcapng files. This change is needed to specify the protocol type in the pcapng file, DLT_EN10MB specifies ethernet packets only. This will allow dissectors for other protocols to be used on files generated by pcapng. Includes a breaking change to rte_pcapng_add_interface to add link_type parameter. Existing calls to the function have been updated to pass DLT_EN10MB for the link type argument. Fixes: d1da6d0 ("pcapng: require per-interface information") Signed-off-by: Dylan Schneider <schneide@qti.qualcomm.com> Acked-by: Stephen Hemminger <stephen@networplumber.org>

rte_graph_model_mcore_dispatch_core_bind relied on rte_lcore_is_enabled. Yet, "rte_lcore_is_enabled" only checks for EAL threads, which forbids external threads (NON EAL) to run a part of the graph. Verify if the lcore role is not "ROLE_OFF", and return relevant error code otherwise. Signed-off-by: Christophe Fontaine <cfontain@redhat.com>

In dispatch mode, the assumption is that the graphs will be polled continuously: there is no way to know when packets are transferred from one graph to another. Introduce a callback to be notified when packets are enqueue in the work queue: this can be useful if we want to wake up a thread. This sleep/wake up mechanism is deferred to the application. Signed-off-by: Christophe Fontaine <cfontain@redhat.com>

This patch implements RISC-V vector intrinsics to accelerate memory copy operations for byte range (129~1600). Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>

Benchmark results show 10~15% reduction in execution time for data sizes (129~1024) Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>

sourcery-ai

Sorry @ovsrobot, your pull request is larger than the review limit of 150000 diff characters

coderabbitai · 2025-10-16T10:34:46Z

Walkthrough

This pull request introduces comprehensive support for DMA enqueue/dequeue operations throughout the library and application layers, refactors EAL argument parsing to use the argparse library, enhances argparse with CORELIST value type and non-flag argument handling, updates the pcapng API to accept configurable link types, implements timing-safe memory comparison functions, and adds RISC-V vector instruction-based memcpy optimizations.

Changes

Cohort / File(s)	Summary
DMA Enqueue/Dequeue Operations `lib/dmadev/rte_dmadev.h`, `lib/dmadev/rte_dmadev.c`, `lib/dmadev/rte_dmadev_core.h`, `lib/dmadev/rte_dmadev_trace*.h`, `lib/dmadev/rte_dmadev_trace_points.c`	New `rte_dma_op` structure, `RTE_DMA_CAPA_OPS_ENQ_DEQ` capability flag, `RTE_DMA_CFG_FLAG_ENQ_DEQ` configuration flag, and inline APIs `rte_dma_enqueue_ops`/`rte_dma_dequeue_ops` for batch DMA operation submission/completion. Replaces hardcoded `enable_silent` with flag-based configuration. Adds trace points for enqueue/dequeue operations.
DMA Driver Updates `drivers/dma/cnxk/cnxk_dmadev.c`, `drivers/dma/cnxk/cnxk_dmadev_fp.c`, `drivers/dma/dpaa/dpaa_qdma.c`, `drivers/dma/dpaa2/dpaa2_qdma.c`, `drivers/dma/hisilicon/hisi_dmadev.c`, `drivers/dma/idxd/idxd_bus.c`	Implements ENQ_DEQ support in CNXK driver with ops array tracking and CN10K-specific paths. Updates DPAA and DPAA2 drivers to use flag-based silent mode configuration. Adds stop processing flag and graceful stop routine to Hisilicon driver. Simplifies path prefix handling in IDXD driver using `rte_basename`.
DMA Test & Benchmark `app/test-dma-perf/benchmark.c`, `app/test-dma-perf/main.c`, `app/test-dma-perf/main.h`, `app/test-dma-perf/config.ini`, `app/test/test_dmadev.c`, `app/test/test_dmadev_api.c`	Adds DMA OPS support to test-dma-perf with new `do_dma_enq_deq_mem_copy` work function, per-worker `dma_ops` arrays, and configuration option `use_enq_deq_ops`. Introduces `test_enq_deq_ops` test function with ENQ_DEQ capability checks and mempool-based ops allocation. Updates device configuration to use flags instead of `enable_silent`.
Argparse Library Enhancements `lib/argparse/rte_argparse.h`, `lib/argparse/rte_argparse.c`	Adds `RTE_ARGPARSE_VALUE_TYPE_CORELIST` for CPU set parsing, new `ignore_non_flag_args` flag for getopt-style argument reordering, custom `print_help` callback support, and stream-based `rte_argparse_print_help` public API. Refactors help output to use FILE* stream parameter. Implements non-flag argument collection and end-of-argv repositioning.
EAL Argument Parsing Refactoring `lib/eal/common/eal_common_options.c`, `lib/eal/common/eal_option_list.h`, `lib/eal/common/eal_options.h`, `lib/eal/common/eal_private.h`, `lib/eal/common/eal_common_thread.c`, `lib/eal/include/rte_eal.h`	Migrates EAL option parsing from getopt-based to argparse-based system. Introduces `eal_collate_args` and `eal_cpuset_to_str` utilities. Updates `rte_eal_parse_coremask` signature from `int cores` array to `rte_cpuset_t cpuset` bitmap. Centralizes option definition via macro-driven `eal_option_list.h`. Refactors core/lcore mapping to use cpuset instead of array-based indices.
EAL Platform Initialization `lib/eal/freebsd/eal.c`, `lib/eal/linux/eal.c`, `lib/eal/windows/eal.c`	Adds re-entry guards via `run_once` flag. Introduces early argument saving and collation. Centralizes error handling to unified `err_out` cleanup path. Updates log messages for consistency and adds explicit `strerror` output in error cases.
String & Basename Utilities `lib/eal/include/rte_string_fns.h`, `lib/eal/unix/rte_basename.c`, `lib/eal/windows/rte_basename.c`, `lib/eal/unix/meson.build`, `lib/eal/windows/meson.build`, `app/test/test_string_fns.c`	Implements new public API `rte_basename(path, buf, buflen)` with platform-specific Unix and Windows implementations. Adds unit test coverage for basename functionality.
Memory Safety `lib/eal/include/rte_memory.h`	Adds `rte_memeq_timingsafe` constant-time memory comparison function to prevent timing-based side-channel attacks.
Crypto Driver Timing-Safe Comparisons `drivers/crypto/ipsec_mb/pmd_aesni_gcm.c`, `drivers/crypto/ipsec_mb/pmd_aesni_mb.c`, `drivers/crypto/ipsec_mb/pmd_snow3g.c`, `drivers/crypto/ipsec_mb/pmd_zuc.c`	Replaces direct `memcmp` authentication tag verification with timing-safe `rte_memeq_timingsafe` comparisons across multiple algorithms.
Event-DMA Adapter `lib/eventdev/rte_event_dma_adapter.h`, `lib/eventdev/rte_event_dma_adapter.c`, `app/test-eventdev/test_perf_common.h`, `app/test-eventdev/test_perf_common.c`, `app/test/test_event_dma_adapter.c`	Removes `rte_event_dma_adapter_op` structure; replaces with generic `rte_dma_op` throughout adapter paths. Updates circular buffer type, mempool allocation, and enqueue/dequeue signatures.
Pcapng Link Type Configuration `lib/pcapng/rte_pcapng.h`, `lib/pcapng/rte_pcapng.c`, `lib/graph/graph_pcap.c`, `app/dumpcap/main.c`, `app/test/test_pcapng.c`	Adds `uint16_t link_type` parameter to `rte_pcapng_add_interface` API. Defines `DLT_EN10MB` macro. Updates all call sites to pass explicit link type (replacing hardcoded Ethernet type).
Testpmd Command File Enhancements `app/test-pmd/testpmd.h`, `app/test-pmd/testpmd.c`, `app/test-pmd/parameters.c`, `app/test-pmd/cmdline.c`	Replaces single cmdline file with array-based multi-file support via `struct cmdline_file_info`. Adds per-file `echo` flag control. Introduces `load_echo` command. Updates `cmdline_read_from_file` signature to accept `bool echo` parameter. Accumulates files in array with bounds checking.
Graph Dispatch Callbacks `lib/graph/rte_graph.h`, `lib/graph/rte_graph_worker_common.h`, `lib/graph/graph.c`, `lib/graph/rte_graph_model_mcore_dispatch.c`	Adds `packets_enqueued_cb` callback type and dispatch notification support via `notify_cb` and `cb_priv` fields in graph dispatch structure. Invokes callback after node enqueue completion.
RISC-V Memcpy Optimization `lib/eal/riscv/include/rte_memcpy.h`	Implements vector instruction-based memcpy using RISC-V V-ISA with prefetching, dynamic VLENB detection, and initialization-time algorithm selection. Adds `riscv_vlenb()` for runtime feature detection.
RISC-V Platform Configuration `config/riscv/meson.build`	Adds prefetch extension (zicbop) detection for RISC-V with compiler version checks (GCC ≥14.1.0, Clang ≥18.1.0). Sets `RTE_RISCV_FEATURE_PREFETCH` flag when available.
Platform & Build Configuration `config/meson.build`, `lib/meson.build`, `lib/eal/meson.build`	Sets `RTE_EXEC_ENV_NAME` string configuration. Adds 'argparse' to `always_enable` library list. Replaces 'log' build dependency with 'argparse' for EAL.
Thread & MWAIT Updates `lib/eal/common/eal_common_lcore.c`, `lib/eal/common/eal_common_memory.c`, `lib/eal/common/eal_common_proc.c`, `lib/eal/linux/eal_memory.c`, `lib/eal/x86/rte_power_intrinsics.c`	Refactors thread affinity dumping to use `eal_cpuset_to_str`. Updates lcore output formatting. Adjusts memory error messages. Modifies `amd_mwaitx` to use non-zero timeout parameter values.
Event Adapter & Event Device `app/test/test_argparse.c`, `app/test/test_func_reentrancy.c`	Adds comprehensive corelist and ignore_non_flag_args test coverage. Marks unused function parameters with `__rte_unused`.
Documentation & Release Notes `doc/guides/contributing/design.rst`, `doc/guides/linux_gsg/eal_args.include.rst`, `doc/guides/prog_guide/argparse_lib.rst`, `doc/guides/prog_guide/dmadev.rst`, `doc/guides/prog_guide/eventdev/event_dma_adapter.rst`, `doc/guides/rel_notes/release_25_11.rst`, `doc/guides/testpmd_app_ug/run_app.rst`, `doc/guides/testpmd_app_ug/testpmd_funcs.rst`, `doc/guides/tools/dmaperf.rst`	Updates macro documentation (RTE_EXEC_ENV to RTE_EXEC_ENV_NAME). Expands EAL argument documentation with new lcore remapping option and combined short/long forms. Adds argparse CORELIST type documentation, help/usage customization, and non-flag argument handling. Adds DMA enqueue/dequeue operations usage examples. Updates testpmd cmdline file documentation.
Miscellaneous `.mailmap`, `benchmark_report.txt`, `drivers/bus/pci/windows/pci.c`, `drivers/bus/pci/windows/pci_netuio.h`, `drivers/event/dlb2/dlb2_priv.h`, `drivers/event/dlb2/pf/base/dlb2_resource.c`, `drivers/net/mlx5/windows/mlx5_ethdev_os.c`, `lib/bbdev/rte_bbdev.c`	Adds mailmap entries. Includes benchmark report data. Reorders Windows PCI bus includes. Updates DLB2 coremask parsing to use cpuset bitmaps. Removes autoneg comparison in MLX5 link detection. Refactors bbdev loop index type to enum.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

This is a substantial, multi-faceted change spanning core library refactoring (EAL argument parsing, DMA framework), new feature implementation (DMA OPS support, argparse enhancements), driver-level changes across multiple DMA/event/crypto drivers, and interconnected public API modifications. The changes demand understanding of: (1) EAL initialization and argument parsing flows; (2) DMA device operation enqueue/dequeue semantics; (3) argparse library architecture and its integration into EAL; (4) driver-specific OPS implementations across CNXK/DPAA/Hisilicon; (5) pcapng API contract changes; (6) event-adapter operation structure migration. High heterogeneity of logic across subsystems (no single repetitive pattern), complex state management in drivers, and interconnected API changes amplify review complexity.

Poem

🐰 Hop along the DMA path, ops enqueued with care,
Argparse parses corelists, fairness everywhere,
Timing-safe secrets guarded, no side-channel leaks,
EAL reborn refactored, stronger every week!
Graph callbacks notify, RISC-V vectors fly,
Twenty-five-eleven blooms—what a lovely sky! ✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title Check	⚠️ Warning	The title focuses solely on an “Optimization Summary for RISC-V rte_memcpy” while the actual changeset spans multiple subsystems (DMA OPS support, pcapng API extensions, CLI enhancements, documentation updates, and more) far beyond RISC-V memcpy. This makes the title misleading and insufficiently descriptive of the PR’s scope.	Revise the title to capture the key areas affected by this patch series—such as DMA enqueue/dequeue support, pcapng link-type extension, CLI and config enhancements, and RISC-V memcpy optimization—to provide a clear summary of the PR’s scope.
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.14% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch series_36374

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 26

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

app/test/test_dmadev.c (1)
95-101: Use correct vchan when reading stats

rte_dma_stats_get(dev_id, 0, &stats) ignores the vchan under test. Use vchan, not 0.

Apply this diff:
-        rte_dma_stats_get(dev_id, 0, &stats);
+        rte_dma_stats_get(dev_id, vchan, &stats);
drivers/dma/cnxk/cnxk_dmadev.c (1)
53-57: Memory leak: c_desc.ops never freed (and error path not fully cleaned).

cnxk_dmadev_vchan_free() frees only compl_ptr; ops array leaks on reconfigure/close.

vchan_setup() error path frees compl_ptr but doesn’t null it.

Fix by freeing ops in vchan_free and nulling compl_ptr in the error path.
@@
 	for (; i < num_vchans; i++) {
 		dpi_conf = &dpivf->conf[i];
 		rte_free(dpi_conf->c_desc.compl_ptr);
 		dpi_conf->c_desc.compl_ptr = NULL;
+		rte_free(dpi_conf->c_desc.ops);
+		dpi_conf->c_desc.ops = NULL;
 	}
@@
-	if (dpi_conf->c_desc.ops == NULL) {
+	if (dpi_conf->c_desc.ops == NULL) {
 		plt_err("Failed to allocate for ops array");
-		rte_free(dpi_conf->c_desc.compl_ptr);
+		rte_free(dpi_conf->c_desc.compl_ptr);
+		dpi_conf->c_desc.compl_ptr = NULL;
 		return -ENOMEM;
 	}
Also applies to: 276-282
app/test-dma-perf/benchmark.c (1)
989-1043: Free dma_ops and underlying ops blob in cleanup.

Prevent leaks by freeing both the pointer array and ops storage.
@@
 	/* free sges for mbufs */
 	rte_free(src_sges);
 	src_sges = NULL;
 
 	rte_free(dst_sges);
 	dst_sges = NULL;
 
+	/* free DMA ops (if allocated for OPS path) */
+	if (dma_ops != NULL) {
+		/* first element points to the start of the contiguous ops blob */
+		if (dma_ops[0] != NULL)
+			rte_free(dma_ops[0]);
+		rte_free(dma_ops);
+		dma_ops = NULL;
+	}
+
 	/* free the worker parameters */
 	for (i = 0; i < nb_workers; i++) {
 		rte_free(lcores[i]);
 		lcores[i] = NULL;
 	}

🧹 Nitpick comments (33)

drivers/bus/pci/windows/pci_netuio.h (1)

8-8: LGTM! MinGW-w64 compatibility addition is appropriate.

The addition of || defined(__MINGW64__) ensures that the GUID definitions are included when building with MinGW-w64, even when targeting sufficiently recent Windows versions. This is a reasonable compatibility fix since MinGW-w64 headers may not provide these definitions.

Minor nitpick: There's an extra space before defined(__MINGW64__).
lib/bbdev/rte_bbdev.c (1)
1250-1250: Consider using an integer type for the loop variable.

While using enum rte_bbdev_enqueue_status as the loop index is more self-documenting, incrementing an enum with i++ is not idiomatic C. Although enums are treated as integers and this works correctly (especially with the NULL guard at lines 1269-1270), it can create values outside the defined enum range during iteration.

For better clarity and type safety, consider one of these alternatives:

Option 1 (recommended): Use an integer type and cast when calling the function
-	enum rte_bbdev_enqueue_status i;
+	unsigned int i;
 	struct rte_bbdev *dev = get_dev(dev_id);
 
 	VALID_DEV_OR_RET_ERR(dev, dev_id);
Option 2: Explicitly cast in the loop
-	for (i = 0; i < RTE_BBDEV_ENQ_STATUS_SIZE_MAX; i++) {
-		const char *status_str = rte_bbdev_enqueue_status_str(i);
+	for (i = (enum rte_bbdev_enqueue_status)0; 
+	     i < RTE_BBDEV_ENQ_STATUS_SIZE_MAX; 
+	     i = (enum rte_bbdev_enqueue_status)((int)i + 1)) {
+		const char *status_str = rte_bbdev_enqueue_status_str((enum rte_bbdev_enqueue_status)i);
Also applies to: 1267-1267
app/test/test_func_reentrancy.c (1)
69-73: Consider using a simple string literal for the program name.

The use of __FILE__ as the program name expands to the full source file path (e.g., /path/to/app/test/test_func_reentrancy.c), which is unconventional for argv[0]. While this doesn't affect the test's functionality (since it expects rte_eal_init to fail), using a clearer, more conventional name would improve readability.

Apply this diff for better clarity:
 test_eal_init_once(__rte_unused void *arg)
 {
-	char appname[] = __FILE__;
 	unsigned lcore_self =  rte_lcore_id();
-	char *argv[] = { appname, NULL };
+	char *argv[] = { "test_eal_init", NULL };
drivers/event/dlb2/pf/base/dlb2_resource.c (1)
925-948: Consider consolidating redundant counter variables.

The local variable set_count is always incremented in lockstep with hw->num_prod_cores at lines 947-948. Since they track the same value, you could simplify the code by using hw->num_prod_cores directly as the index.

Current code:
hw->prod_core_list[set_count++] = i;
hw->num_prod_cores++;
Could be simplified to:
hw->prod_core_list[hw->num_prod_cores++] = i;
This eliminates the extra variable and makes the code more concise without changing behavior.
drivers/dma/hisilicon/hisi_dmadev.h (1)
144-144: Document the SQE_DROP_FLAG purpose and behavior.

The new SQE_DROP_FLAG lacks documentation explaining its purpose and how the hardware interprets it. Based on the stop function usage (line 466 in hisi_dmadev.c), it appears to signal the hardware to quickly complete/discard operations, but this should be explicitly documented.

Add a comment explaining the flag:
+/* SQE_DROP_FLAG signals hardware to quickly complete/discard the operation */
 #define SQE_DROP_FLAG	BIT(4)
config/riscv/meson.build (1)
151-159: Align indentation with the V extension block.

The zicbop detection block uses inconsistent indentation compared to the V extension block above (lines 135-146). Specifically:

Line 152 uses 4-space continuation vs 8-space in line 136

Line 153 uses 2-space indent vs 4-space in line 137

Line 154 uses 6-space continuation vs 12-space in line 138

For consistency and maintainability, match the indentation pattern used in the V extension block.

Apply this diff to fix the indentation:
 if (riscv_extension_macros and
-    (cc.get_define('__riscv_zicbop', args: machine_args) != ''))
-  if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0'))
-      or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0')))
-      message('Compiling with the zicbop extension')
-      machine_args += ['-DRTE_RISCV_FEATURE_PREFETCH']
-  else
-    warning('Detected zicbop extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)')
-  endif
+        (cc.get_define('__riscv_zicbop', args: machine_args) != ''))
+    if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0'))
+            or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0')))
+        message('Compiling with the zicbop extension')
+        machine_args += ['-DRTE_RISCV_FEATURE_PREFETCH']
+    else
+        warning('Detected zicbop extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)')
+    endif
 endif
lib/eal/riscv/include/rte_memcpy.h (3)
262-269: Prefer copying only the misalignment (dstofss) instead of a fixed 64B prologue

The alignment prologue copies 64 bytes but advances pointers by dstofss (≤63), causing a redundant 64B load/store when only dstofss bytes are needed. It’s safe but wasteful.

Proposed change:
     dstofss = (uintptr_t)dst & ALIGNMENT_MASK_64;
     if (dstofss > 0) {
         dstofss = 64 - dstofss;
         n -= dstofss;
-        _rte_mov64(dst, src);
+        _rte_mov(dst, src, (uint32_t)dstofss);
         src = (const uint8_t *)src + dstofss;
         dst = (uint8_t *)dst + dstofss;
     }
23-25: Don’t disable the entire RVV path if PREFETCH is unavailable

Undefining RTE_RISCV_FEATURE_V when RTE_RISCV_FEATURE_PREFETCH is absent disables otherwise valid RVV copies on platforms lacking prefetch support.

Suggestion:

Keep RTE_RISCV_FEATURE_V intact.

Make prefetch helpers no-ops when RTE_RISCV_FEATURE_PREFETCH is undefined (e.g., provide empty inline functions/macros).
This preserves RVV acceleration without relying on prefetch availability.

356-362: RTE_INIT in a header creates per-TU constructors; consider centralizing init

Placing RTE_INIT in a header produces one constructor per translation unit, increasing init overhead and binary size.

Move the feature detection routine to a single C file (e.g., riscv eal source) and expose the result via an inline getter or extern variable used by the inline fast paths.
drivers/dma/idxd/idxd_bus.c (1)
263-268: Consider using PATH_MAX instead of magic number.

The refactoring successfully eliminates dynamic allocation, but the fixed buffer size of 256 is a magic number. Consider using PATH_MAX (already available via the <libgen.h> include) for better clarity and consistency with POSIX standards.

Apply this diff to use PATH_MAX:
-	char prefix[256];
+	char prefix[PATH_MAX];
 	int retval = 0;
 	size_t prefixlen;
 
 	prefixlen = rte_basename(rte_eal_get_runtime_dir(), prefix, sizeof(prefix));
doc/guides/testpmd_app_ug/testpmd_funcs.rst (1)

70-73: Clarify echo/noecho semantics and align examples

Please add one sentence stating the behavioral difference:

--cmdline-file echoes each command before execution; --cmdline-file-noecho suppresses it.

Similarly, load echoes; load_echo explicitly echoes each command.

Consider adding a brief “with echo” vs “no echo” example, so users see expected console output.

Also applies to: 77-77, 98-100, 110-111

doc/guides/linux_gsg/eal_args.include.rst (1)

70-91: Document how remapping affects main/service lcore options

Please note explicitly that when -R/--remap-lcore-ids is used, --main-lcore and --service-corelist values refer to remapped lcore-ids (not physical CPUs).

Keep the restriction note, but also add a short example showing --service-corelist after remap.

Also applies to: 92-98

drivers/dma/cnxk/cnxk_dmadev.h (1)

91-97: Confirm ring mask semantics and ops array sizing

max_cnt is used as a mask (head/tail & max_cnt). Please ensure:

max_cnt is (ring_size - 1) and ring_size is a power of two.

c_desc->ops is allocated for ring_size entries (max_cnt + 1).

If not already guaranteed at setup, add an assert or comment to prevent future misuse.

lib/eal/common/eal_options.h (1)

23-25: Include header for rte_cpuset_t to avoid ordering issues

This header declares rte_eal_parse_coremask using rte_cpuset_t but does not include a header that defines it. Add an include (e.g., rte_common.h or rte_os.h) here to prevent compilation-order pitfalls in future includes.
lib/graph/rte_graph_worker_common.h (1)
71-73: Use uintptr_t for cb_priv for pointer-size portability

If cb_priv carries a pointer/cookie, use uintptr_t instead of uint64_t to avoid assumptions on pointer size across architectures.
-            packets_enqueued_cb notify_cb; /**< Callback when packet crosses lcores. */
-            uint64_t cb_priv;       /**< Opaque parameter for notify_cb. */
+            packets_enqueued_cb notify_cb; /**< Callback when packet crosses lcores. */
+            uintptr_t cb_priv;      /**< Opaque parameter for notify_cb. */
app/test-pmd/testpmd.c (1)

112-114: Multi-file cmdline processing looks solid.

Globals and loop usage are correct; error path breaks out cleanly. If partial success is acceptable, consider continuing on errors and reporting all failures.

Also applies to: 4583-4589
doc/guides/prog_guide/dmadev.rst (1)
111-142: Consider clarifying alignment requirements in the example.

The documentation and code example are clear and comprehensive. However, the example at line 123 uses rte_zmalloc without specifying alignment:
op = rte_zmalloc(sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), 0);
Given that the structure includes trailing scatter-gather elements, it would be clearer to specify the alignment explicitly or add a comment explaining the alignment requirements, especially since the mempool creation at line 2253 in app/test-eventdev/test_perf_common.c uses a similar pattern.

Apply this diff to make alignment explicit:
-   op = rte_zmalloc(sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), 0);
+   op = rte_zmalloc(NULL, sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2),
+                    RTE_CACHE_LINE_SIZE);
lib/graph/graph.c (1)

348-349: Consider error code semantics.

The code uses ENOLINK (Link has been severed) for an invalid lcore. While technically correct, EINVAL (Invalid argument) might be more conventional and clearer for "invalid lcore" errors.

The validation now checks rte_lcore_has_role(lcore, ROLE_OFF) to detect disabled lcores, which is appropriate for rejecting invalid core bindings.
app/test/test_dmadev_api.c (1)
703-759: Consider adding clarifying comments for the ops memory management flow.

The test allocates an ops structure, enqueues it, sets the pointer to NULL, dequeues it, and then frees it. While this pattern correctly tests that dequeue returns the operation pointer, the ops = NULL assignment at line 743 may not be immediately clear to future readers.

Consider adding a comment before line 743 to clarify the intent:
 	rte_delay_us_sleep(TEST_WAIT_US_VAL);
 
+	/* Set ops to NULL to verify that dequeue properly returns the operation pointer */
 	ops = NULL;
 	/* Dequeue operations */
app/test/test_dmadev.c (3)
1075-1085: Pool name collisions across devices/runs

Hardcoded names "pkt_pool"/"ops_pool" can collide when multiple devices run the test. Include dev_id/vchan in names.

Apply this diff:
-    pkt_pool = rte_pktmbuf_pool_create("pkt_pool", info.max_desc * 2, 0, 0,
+    char pkt_pool_name[32];
+    snprintf(pkt_pool_name, sizeof(pkt_pool_name), "pkt_pool_%d_%u", dev_id, vchan);
+    pkt_pool = rte_pktmbuf_pool_create(pkt_pool_name, info.max_desc * 2, 0, 0,
                                        CPY_LEN + RTE_PKTMBUF_HEADROOM, rte_socket_id());
@@
-    ops_pool = rte_mempool_create("ops_pool", info.max_desc,
+    char ops_pool_name[32];
+    snprintf(ops_pool_name, sizeof(ops_pool_name), "ops_pool_%d_%u", dev_id, vchan);
+    ops_pool = rte_mempool_create(ops_pool_name, info.max_desc,
                                   sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2),
                                   0, 0, NULL, NULL, NULL, NULL, rte_socket_id(), 0);
1118-1126: Confirm submit semantics for enqueue_ops/dequeue_ops

No rte_dma_submit() is called between enqueue and dequeue. Some PMDs may require submit even in ENQ_DEQ mode. Confirm API guarantees or add submit.

If required, add after completing an enqueue burst:
+            if (rte_dma_submit(dev_id, vchan) < 0)
+                ERR_RETURN("Error with rte_dma_submit()\n");
Please confirm with dmadev ENQ/DEQ API docs or driver behavior.

Also applies to: 1134-1138

1169-1194: prepare_enq_deq_ops: add minimal capacity check

If max_desc <= BURST_SIZE the test logic will fail (see “max” calc). Add a guard and skip.

Apply this diff:
     int ret = rte_dma_info_get(dev_id, &info);
     if (ret != 0)
         ERR_RETURN("Error with rte_dma_info_get()\n");
 
     qconf.direction = RTE_DMA_DIR_MEM_TO_MEM;
     qconf.nb_desc = info.max_desc;
+    if (qconf.nb_desc <= BURST_SIZE)
+        ERR_RETURN("Insufficient descriptors for ENQ/DEQ test (need > %u, have %u)\n",
+                   BURST_SIZE, qconf.nb_desc);
drivers/dma/cnxk/cnxk_dmadev.c (1)
452-464: Typo in function name: cnxk_damdev_burst_capacity → cnxk_dmadev_burst_capacity.

Purely cosmetic, but aligns with existing naming (dmadev).
-static uint16_t
-cnxk_damdev_burst_capacity(const void *dev_private, uint16_t vchan)
+static uint16_t
+cnxk_dmadev_burst_capacity(const void *dev_private, uint16_t vchan)
@@
-	dev->fp_obj->burst_capacity = cnxk_damdev_burst_capacity;
+	dev->fp_obj->burst_capacity = cnxk_dmadev_burst_capacity;
Also applies to: 555-556
drivers/dma/cnxk/cnxk_dmadev_fp.c (1)
692-729: Mask nb_src/nb_dst to hardware limits (consistency with other paths).

Other paths use CNXK_DPI_MAX_POINTER masking. Enqueue should do the same to avoid invalid cmd lengths.
-		src = op->nb_src;
-		dst = op->nb_dst;
+		src = op->nb_src & CNXK_DPI_MAX_POINTER;
+		dst = op->nb_dst & CNXK_DPI_MAX_POINTER;
@@
-		src = op->nb_src;
-		dst = op->nb_dst;
+		src = op->nb_src & CNXK_DPI_MAX_POINTER;
+		dst = op->nb_dst & CNXK_DPI_MAX_POINTER;
Also applies to: 753-787
lib/eal/common/eal_common_options.c (1)
1996-2001: Guard against NULL from available_cores() in error log.

available_cores() may return NULL; passing it to “%s” is unsafe.
-			char *available = available_cores();
-			EAL_LOG(ERR, "invalid coremask or core-list parameter, please check specified cores are part of %s",
-					available);
-			free(available);
+			char *available = available_cores();
+			EAL_LOG(ERR,
+				"invalid coremask or core-list parameter, please check specified cores are part of %s",
+				available ? available : "(no available cores)");
+			if (available)
+				free(available);
lib/dmadev/rte_dmadev.c (1)
804-817: Add name for new capability RTE_DMA_CAPA_OPS_ENQ_DEQ in dump.

Currently prints “unknown”.
 	} capa_names[] = {
 		{ RTE_DMA_CAPA_MEM_TO_MEM,  "mem2mem" },
@@
 		{ RTE_DMA_CAPA_OPS_COPY_SG, "copy_sg" },
 		{ RTE_DMA_CAPA_OPS_FILL,    "fill"    },
+		{ RTE_DMA_CAPA_OPS_ENQ_DEQ, "ops_enq_deq" },
 	};
app/test-dma-perf/benchmark.c (1)
409-462: Align throughput counter with completions (optional).

Other paths count completions; here total_cpl is updated on enqueue. Consider incrementing on dequeues to keep consistency.
-			while (enq < sz) {
+			while (enq < sz) {
 				do {
 					deq = rte_dma_dequeue_ops(dev_id, 0, op, DEQ_SZ);
-					tdeq += deq;
+					tdeq += deq;
+					worker_info->total_cpl += deq;
 				} while (deq);
@@
-			tenq += enq;
-
-			worker_info->total_cpl += enq;
+			tenq += enq;
lib/dmadev/rte_dmadev.h (5)
282-293: Config flags addition is good, but update related docs

Flags look fine. Please also update any lingering references to struct fields (e.g., “silent_mode”) to point to RTE_DMA_CFG_FLAG_SILENT in this header.

360-362: Public struct change may break ABI/API

Replacing former config fields with a flags uint64_t changes the layout of public struct rte_dma_conf. Confirm ABI policy/major version window and update release notes accordingly.

805-861: Public rte_dma_op FAM should use [] instead of [0]

Zero-length arrays are non‑standard. Prefer a C99 flexible array member for portability.

Apply this diff:
-    struct rte_dma_sge src_dst_seg[0];
+    struct rte_dma_sge src_dst_seg[];
1221-1229: Doc nit: refer to the new config flag name

Replace “rte_dma_conf::enable_enq_deq” with “RTE_DMA_CFG_FLAG_ENQ_DEQ in rte_dma_conf::flags” to match the new API.

1261-1264: Doc nit: refer to the new config flag name

Replace “rte_dma_conf::enable_enq_deq” with “RTE_DMA_CFG_FLAG_ENQ_DEQ in rte_dma_conf::flags”.
lib/argparse/rte_argparse.c (1)
711-717: argv compaction: consider null-terminating the handled slice

After moving args, set argv[ret] = NULL to terminate the handled portion. Not required for correctness here, but improves safety for downstream consumers that expect a NULL sentinel.

Apply this diff:
-    ret = j; /* only return args actually handled */
+    ret = j; /* only return args actually handled */
+    argv[ret] = NULL;
     /* Then put contents of the args_to_move array into the argv in the space left. */
Also applies to: 801-813

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b403dd and b1502b3.

📒 Files selected for processing (91)

.mailmap (2 hunks)
app/dumpcap/main.c (1 hunks)
app/test-dma-perf/benchmark.c (9 hunks)
app/test-dma-perf/config.ini (2 hunks)
app/test-dma-perf/main.c (3 hunks)
app/test-dma-perf/main.h (1 hunks)
app/test-eventdev/test_perf_common.c (2 hunks)
app/test-eventdev/test_perf_common.h (2 hunks)
app/test-pmd/cmdline.c (4 hunks)
app/test-pmd/parameters.c (1 hunks)
app/test-pmd/testpmd.c (2 hunks)
app/test-pmd/testpmd.h (3 hunks)
app/test/process.h (1 hunks)
app/test/test_argparse.c (5 hunks)
app/test/test_dmadev.c (7 hunks)
app/test/test_dmadev_api.c (12 hunks)
app/test/test_event_dma_adapter.c (3 hunks)
app/test/test_func_reentrancy.c (1 hunks)
app/test/test_pcapng.c (2 hunks)
app/test/test_string_fns.c (2 hunks)
benchmark_report.txt (1 hunks)
config/meson.build (1 hunks)
config/riscv/meson.build (1 hunks)
doc/guides/contributing/design.rst (1 hunks)
doc/guides/linux_gsg/eal_args.include.rst (5 hunks)
doc/guides/prog_guide/argparse_lib.rst (5 hunks)
doc/guides/prog_guide/dmadev.rst (1 hunks)
doc/guides/prog_guide/eventdev/event_dma_adapter.rst (2 hunks)
doc/guides/rel_notes/release_25_11.rst (3 hunks)
doc/guides/testpmd_app_ug/run_app.rst (1 hunks)
doc/guides/testpmd_app_ug/testpmd_funcs.rst (3 hunks)
doc/guides/tools/dmaperf.rst (3 hunks)
drivers/bus/pci/windows/pci.c (1 hunks)
drivers/bus/pci/windows/pci_netuio.h (1 hunks)
drivers/crypto/ipsec_mb/pmd_aesni_gcm.c (2 hunks)
drivers/crypto/ipsec_mb/pmd_aesni_mb.c (2 hunks)
drivers/crypto/ipsec_mb/pmd_snow3g.c (1 hunks)
drivers/crypto/ipsec_mb/pmd_zuc.c (1 hunks)
drivers/dma/cnxk/cnxk_dmadev.c (8 hunks)
drivers/dma/cnxk/cnxk_dmadev.h (2 hunks)
drivers/dma/cnxk/cnxk_dmadev_fp.c (13 hunks)
drivers/dma/dpaa/dpaa_qdma.c (1 hunks)
drivers/dma/dpaa2/dpaa2_qdma.c (1 hunks)
drivers/dma/hisilicon/hisi_dmadev.c (4 hunks)
drivers/dma/hisilicon/hisi_dmadev.h (2 hunks)
drivers/dma/idxd/idxd_bus.c (1 hunks)
drivers/event/dlb2/dlb2_priv.h (1 hunks)
drivers/event/dlb2/pf/base/dlb2_resource.c (1 hunks)
drivers/net/mlx5/windows/mlx5_ethdev_os.c (1 hunks)
lib/argparse/rte_argparse.c (14 hunks)
lib/argparse/rte_argparse.h (3 hunks)
lib/bbdev/rte_bbdev.c (2 hunks)
lib/dmadev/rte_dmadev.c (5 hunks)
lib/dmadev/rte_dmadev.h (4 hunks)
lib/dmadev/rte_dmadev_core.h (2 hunks)
lib/dmadev/rte_dmadev_trace.h (1 hunks)
lib/dmadev/rte_dmadev_trace_fp.h (1 hunks)
lib/dmadev/rte_dmadev_trace_points.c (1 hunks)
lib/eal/common/eal_common_lcore.c (2 hunks)
lib/eal/common/eal_common_memory.c (1 hunks)
lib/eal/common/eal_common_options.c (20 hunks)
lib/eal/common/eal_common_proc.c (1 hunks)
lib/eal/common/eal_common_thread.c (1 hunks)
lib/eal/common/eal_option_list.h (1 hunks)
lib/eal/common/eal_options.h (1 hunks)
lib/eal/common/eal_private.h (1 hunks)
lib/eal/common/eal_thread.h (1 hunks)
lib/eal/freebsd/eal.c (13 hunks)
lib/eal/include/rte_eal.h (0 hunks)
lib/eal/include/rte_memory.h (1 hunks)
lib/eal/include/rte_string_fns.h (1 hunks)
lib/eal/linux/eal.c (14 hunks)
lib/eal/linux/eal_memory.c (1 hunks)
lib/eal/meson.build (1 hunks)
lib/eal/riscv/include/rte_memcpy.h (2 hunks)
lib/eal/unix/meson.build (1 hunks)
lib/eal/unix/rte_basename.c (1 hunks)
lib/eal/windows/eal.c (9 hunks)
lib/eal/windows/meson.build (1 hunks)
lib/eal/windows/rte_basename.c (1 hunks)
lib/eal/x86/rte_power_intrinsics.c (1 hunks)
lib/eventdev/rte_event_dma_adapter.c (8 hunks)
lib/eventdev/rte_event_dma_adapter.h (0 hunks)
lib/graph/graph.c (2 hunks)
lib/graph/graph_pcap.c (1 hunks)
lib/graph/rte_graph.h (2 hunks)
lib/graph/rte_graph_model_mcore_dispatch.c (1 hunks)
lib/graph/rte_graph_worker_common.h (1 hunks)
lib/meson.build (1 hunks)
lib/pcapng/rte_pcapng.c (2 hunks)
lib/pcapng/rte_pcapng.h (3 hunks)

💤 Files with no reviewable changes (2)

lib/eventdev/rte_event_dma_adapter.h
lib/eal/include/rte_eal.h

🧰 Additional context used

🧬 Code graph analysis (25)

lib/eal/common/eal_private.h (1)

lib/eal/common/eal_common_options.c (2)

eal_collate_args (218-256)

eal_cpuset_to_str (1619-1685)

lib/eal/windows/rte_basename.c (1)

lib/eal/unix/rte_basename.c (1)

rte_basename (14-37)

app/test/test_string_fns.c (2)

lib/eal/unix/rte_basename.c (1)

rte_basename (14-37)

lib/eal/windows/rte_basename.c (1)

rte_basename (9-53)

lib/dmadev/rte_dmadev_trace_fp.h (1)

lib/eal/include/rte_trace_point.h (2)

void (319-347)

void (349-363)

lib/eal/common/eal_common_lcore.c (1)

lib/eal/common/eal_common_options.c (1)

eal_cpuset_to_str (1619-1685)

app/test-pmd/testpmd.c (1)

app/test-pmd/cmdline.c (3)

i (2355-2380)

i (6373-6375)

cmdline_read_from_file (14199-14253)

drivers/dma/cnxk/cnxk_dmadev.h (1)

drivers/dma/cnxk/cnxk_dmadev_fp.c (3)

cnxk_dma_ops_enqueue (678-738)

cn10k_dma_ops_enqueue (740-787)

cnxk_dma_ops_dequeue (789-821)

app/test/test_func_reentrancy.c (1)

lib/eal/include/rte_lcore.h (1)

rte_lcore_id (77-81)

lib/eal/unix/rte_basename.c (1)

lib/eal/windows/rte_basename.c (1)

rte_basename (9-53)

lib/argparse/rte_argparse.h (1)

lib/argparse/rte_argparse.c (1)

rte_argparse_print_help (903-919)

lib/graph/graph.c (1)

lib/graph/rte_graph_model_mcore_dispatch.c (1)

graph_sched_wq_create (9-62)

lib/eal/freebsd/eal.c (3)

lib/eal/linux/eal.c (1)

rte_eal_init_alert (482-485)

lib/eal/windows/eal.c (1)

rte_eal_init_alert (91-95)

lib/eal/common/eal_common_options.c (7)

eal_save_args (319-323)

eal_save_args (360-406)

eal_collate_args (218-256)

eal_parse_log_options (1727-1769)

eal_parse_args (1894-2244)

eal_clean_saved_args (325-329)

eal_clean_saved_args (408-428)

lib/eal/include/rte_string_fns.h (2)

lib/eal/unix/rte_basename.c (1)

rte_basename (14-37)

lib/eal/windows/rte_basename.c (1)

rte_basename (9-53)

app/test/test_dmadev_api.c (1)

lib/dmadev/rte_dmadev.h (2)

rte_dma_enqueue_ops (1241-1258)

rte_dma_dequeue_ops (1276-1293)

lib/eal/linux/eal.c (4)

lib/eal/freebsd/eal.c (2)

rte_eal_init_alert (400-403)

rte_config_init (291-330)

lib/eal/windows/eal.c (1)

rte_eal_init_alert (91-95)

lib/eal/common/eal_common_options.c (10)

eal_save_args (319-323)

eal_save_args (360-406)

eal_collate_args (218-256)

eal_parse_log_options (1727-1769)

eal_parse_args (1894-2244)

eal_plugins_init (553-557)

eal_plugins_init (715-751)

eal_option_device_parse (456-474)

eal_clean_saved_args (325-329)

eal_clean_saved_args (408-428)

lib/eal/common/eal_common_lcore.c (1)

rte_eal_cpu_init (150-232)

app/test/test_argparse.c (1)

lib/argparse/rte_argparse.c (2)

rte_argparse_parse (922-961)

rte_argparse_parse_type (964-979)

drivers/dma/cnxk/cnxk_dmadev.c (2)

drivers/dma/cnxk/cnxk_dmadev_fp.c (7)

cnxk_dmadev_copy (242-296)

cnxk_dmadev_copy_sg (298-352)

cn10k_dmadev_copy (354-398)

cn10k_dmadev_copy_sg (400-441)

cnxk_dma_ops_enqueue (678-738)

cnxk_dma_ops_dequeue (789-821)

cn10k_dma_ops_enqueue (740-787)

drivers/common/cnxk/roc_model.h (1)

roc_model_is_cn10k (139-149)

app/test-pmd/testpmd.h (1)

app/test-pmd/cmdline.c (1)

cmdline_read_from_file (14199-14253)

app/test/test_dmadev.c (3)

lib/mempool/rte_mempool.c (1)

rte_mempool_create (984-1017)

lib/mbuf/rte_mbuf.h (2)

rte_pktmbuf_alloc_bulk (1006-1046)

rte_pktmbuf_free (1492-1504)

lib/dmadev/rte_dmadev.h (2)

rte_dma_enqueue_ops (1241-1258)

rte_dma_dequeue_ops (1276-1293)

lib/eal/common/eal_options.h (4)

lib/eal/common/eal_common_options.c (12)

eal_parse_log_options (1727-1769)

eal_parse_args (1894-2244)

eal_option_device_parse (456-474)

eal_adjust_config (2289-2305)

eal_cleanup_config (2279-2287)

eal_plugins_init (553-557)

eal_plugins_init (715-751)

eal_save_args (319-323)

eal_save_args (360-406)

eal_clean_saved_args (325-329)

eal_clean_saved_args (408-428)

handle_eal_info_request (338-358)

lib/eal/linux/eal.c (1)

eal_proc_type_detect (348-372)

lib/eal/freebsd/eal.c (1)

eal_proc_type_detect (264-288)

lib/eal/windows/eal.c (1)

eal_proc_type_detect (46-76)

app/test-dma-perf/benchmark.c (1)

lib/dmadev/rte_dmadev.h (2)

rte_dma_enqueue_ops (1241-1258)

rte_dma_dequeue_ops (1276-1293)

lib/eal/windows/eal.c (1)

lib/eal/common/eal_common_options.c (8)

eal_save_args (319-323)

eal_save_args (360-406)

eal_collate_args (218-256)

eal_parse_log_options (1727-1769)

eal_parse_args (1894-2244)

eal_option_device_parse (456-474)

eal_clean_saved_args (325-329)

eal_clean_saved_args (408-428)

lib/eal/common/eal_common_thread.c (1)

lib/eal/common/eal_common_options.c (1)

eal_cpuset_to_str (1619-1685)

lib/eal/riscv/include/rte_memcpy.h (1)

lib/eal/arm/include/rte_memcpy_32.h (1)

rte_memcpy (297-301)

lib/eal/common/eal_common_options.c (3)

lib/argparse/rte_argparse.c (3)

rte_argparse_print_help (903-919)

rte_argparse_parse (922-961)

rte_argparse_parse_type (964-979)

lib/eal/linux/eal.c (2)

i (847-874)

eal_proc_type_detect (348-372)

lib/eal/common/eal_common_lcore.c (1)

rte_get_next_lcore (106-123)

coderabbitai · 2025-10-16T10:43:59Z

app/test-pmd/cmdline.c

+int
+cmdline_read_from_file(const char *filename, bool echo)
 {
 	struct cmdline *cl;
 	int fd = -1;
+	int ret = 0;

 	/* cmdline_file_new does not produce any output
 	 * so when echoing is requested we open filename directly
 	 * and then pass that to cmdline_new with stdout as the output path.
 	 */
-	if (!echo_cmdline_file) {
+	if (!echo) {
 		cl = cmdline_file_new(main_ctx, "testpmd> ", filename);
 	} else {
+		/* use basename(filename) as prompt */
+		char prompt[32] = "[] ";
+
+		rte_basename(filename, &prompt[1], sizeof(prompt) - strlen(prompt));
+		strlcat(prompt, "] ", sizeof(prompt));
+
 		fd = open(filename, O_RDONLY);
 		if (fd < 0) {
 			fprintf(stderr, "Failed to open file %s: %s\n",
 				filename, strerror(errno));
-			return;
+			return -1;
 		}

-		cl = cmdline_new(main_ctx, "testpmd> ", fd, STDOUT_FILENO);
+		cl = cmdline_new(main_ctx, prompt, fd, STDOUT_FILENO);
 	}
 	if (cl == NULL) {
 		fprintf(stderr,
 			"Failed to create file based cmdline context: %s\n",
 			filename);
+		ret = -1;
 		goto end;
 	}

 	cmdline_interact(cl);
-	cmdline_quit(cl);
+	/* when done, if we have echo, we only need to print end of file,
+	 * but if no echo, we need to use printf and include the filename.
+	 */
+	if (echo)
+		cmdline_printf(cl, "<End-Of-File>\n");
+	else
+		printf("Finished reading CLI commands from %s\n", filename);

+	cmdline_quit(cl);
 	cmdline_free(cl);

-	printf("Read CLI commands from %s\n", filename);

 end:
 	if (fd >= 0)
 		close(fd);
+	return ret;
 }


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Possible FD ownership and stdout-close hazard in cmdline_read_from_file (echo path)

When echo is true, cmdline_new is created with s_out = STDOUT_FILENO and s_in = fd. Many cmdline implementations close s_in/s_out on cmdline_free. If so:

cmdline_free could close STDOUT_FILENO, breaking subsequent output.

cmdline_free may also close the input fd while this function also closes it at end, risking double-close.

Recommend dup the FDs before passing them to cmdline_new, and only close your originals here. Also avoid handing cmdline_new a raw STDOUT by duplicating it.

Suggested patch:

int cmdline_read_from_file(const char *filename, bool echo) { struct cmdline *cl; int fd = -1; + int in_fd = -1; + int out_fd = -1; int ret = 0; @@ - if (!echo) { + if (!echo) { cl = cmdline_file_new(main_ctx, "testpmd> ", filename); } else { /* use basename(filename) as prompt */ char prompt[32] = "[] "; rte_basename(filename, &prompt[1], sizeof(prompt) - strlen(prompt)); strlcat(prompt, "] ", sizeof(prompt)); fd = open(filename, O_RDONLY); if (fd < 0) { fprintf(stderr, "Failed to open file %s: %s\n", filename, strerror(errno)); return -1; } - cl = cmdline_new(main_ctx, prompt, fd, STDOUT_FILENO); + in_fd = dup(fd); + if (in_fd < 0) { + fprintf(stderr, "Failed to dup input fd for %s: %s\n", + filename, strerror(errno)); + ret = -1; + goto end; + } + out_fd = dup(STDOUT_FILENO); + if (out_fd < 0) { + fprintf(stderr, "Failed to dup stdout: %s\n", strerror(errno)); + ret = -1; + goto end; + } + cl = cmdline_new(main_ctx, prompt, in_fd, out_fd); + if (cl == NULL) { + fprintf(stderr, + "Failed to create file based cmdline context: %s\n", + filename); + ret = -1; + goto end; + } } - if (cl == NULL) { - fprintf(stderr, - "Failed to create file based cmdline context: %s\n", - filename); - ret = -1; - goto end; - } @@ - cmdline_quit(cl); - cmdline_free(cl); + cmdline_quit(cl); + cmdline_free(cl); + /* cl took ownership of in_fd/out_fd (dups). Prevent closing them below. */ + in_fd = -1; + out_fd = -1; end: if (fd >= 0) close(fd); + if (in_fd >= 0) + close(in_fd); + if (out_fd >= 0) + close(out_fd); return ret; }

This prevents stdout from being closed and avoids double-closing the input stream.

To confirm whether cmdline_free closes s_in/s_out in your environment, you can grep the cmdline implementation:

🏁 Script executed:

#!/bin/bash # Inspect cmdline_free to see if it closes file descriptors rg -n -C2 -g '!**/build/**' -g '!**/node_modules/**' 'cmdline_free\s*\(' rg -n -C3 -g '!**/build/**' -g '!**/node_modules/**' 'close\s*\(\s*(.*s_in|.*s_out|.*fd' lib cmdline

Length of output: 3991

I need to examine the cmdline implementation to verify the FD handling concerns. Let me search more carefully.

--- 🏁 Script executed: ```shell #!/bin/bash # View the cmdline_free implementation cat lib/cmdline/cmdline.c | head -150 | tail -50

Length of output: 797

🏁 Script executed:

#!/bin/bash # Search for the cmdline struct definition and s_in/s_out fields rg -n -B2 -A10 'struct cmdline' lib/cmdline/cmdline.c | head -60

Length of output: 1400

🏁 Script executed:

#!/bin/bash # Search for close() calls in cmdline.c to see if FDs are closed rg -n 'close\s*\(' lib/cmdline/cmdline.c

Length of output: 46

🏁 Script executed:

#!/bin/bash # Look for s_in or s_out usage in cmdline_free or related functions rg -n -A5 'cmdline_free' lib/cmdline/cmdline.c

Length of output: 146

Avoid double-closing the input FD (no need to dup STDOUT_FILENO)
cmdline_free closes s_in if >2 but skips FDs ≤2, so passing your raw fd to cmdline_new and then calling close(fd) at the end will double-close the file. It also never closes STDOUT_FILENO, so duplicating STDOUT_FILENO isn’t needed. Fix by either dup’ing only the input FD before cmdline_new (and closing the original) or by setting fd = -1 after cmdline_free to skip the final close(fd).

🤖 Prompt for AI Agents

In app/test-pmd/cmdline.c around lines 14199-14253, the code passes the raw input fd to cmdline_new and later closes fd again, risking a double-close because cmdline_free may close s_in for FDs > 2; fix by duplicating only the input FD before calling cmdline_new (int in_dup = dup(fd)), pass in_dup to cmdline_new while immediately closing the original fd, and avoid duplicating or touching STDOUT_FILENO so it is not closed; alternatively, if you prefer minimal change, set fd = -1 after cmdline_free to prevent the final close(fd) from running.

coderabbitai · 2025-10-16T10:43:59Z

app/test/test_dmadev.c

+#define ROUNDS	   2E7
+#define CPY_LEN	   64
+	struct rte_mempool *ops_pool, *pkt_pool;


⚠️ Potential issue | 🟠 Major

ROUNDS: avoid floating literal and reduce test duration

2E7 is a double, causing implicit conversions. Also 20M/max iterations is excessive for app/test. Use an integer literal and consider a lower default.

Apply this minimal type fix:

-#define ROUNDS 2E7 +#define ROUNDS 20000000U

Optionally reduce to a smaller value or derive from capabilities to keep tests fast.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

#define ROUNDS 2E7

#define CPY_LEN 64

struct rte_mempool *ops_pool, *pkt_pool;

#define ROUNDS 20000000U

#define CPY_LEN 64

struct rte_mempool *ops_pool, *pkt_pool;

🤖 Prompt for AI Agents

In app/test/test_dmadev.c around lines 1059-1061, the macro ROUNDS is defined as the floating-point literal 2E7 which triggers implicit conversions and is unnecessarily large for tests; change ROUNDS to an integer literal (for example 20000000U) or a smaller integer default (e.g., 2000000U) and ensure any code using ROUNDS expects an integer type (use size_t/unsigned long as appropriate) to avoid floating-to-integer conversions and shorten test duration.

coderabbitai · 2025-10-16T10:44:00Z

app/test/test_dmadev.c

+	max = info.max_desc - BURST_SIZE;
+	tenq = 0;


⚠️ Potential issue | 🟠 Major

Guard against too-small descriptor rings

max = info.max_desc - BURST_SIZE can be <= 0; loops using n != max will misbehave. Add a capacity check.

Apply this diff:

- max = info.max_desc - BURST_SIZE; + if (info.max_desc <= BURST_SIZE) + ERR_RETURN("Device queue too small for BURST_SIZE=%u (max_desc=%u)\n", + BURST_SIZE, info.max_desc); + max = info.max_desc - BURST_SIZE;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

max = info.max_desc - BURST_SIZE;

tenq = 0;

if (info.max_desc <= BURST_SIZE)

ERR_RETURN("Device queue too small for BURST_SIZE=%u (max_desc=%u)\n",

BURST_SIZE, info.max_desc);

max = info.max_desc - BURST_SIZE;

tenq = 0;

🤖 Prompt for AI Agents

In app/test/test_dmadev.c around lines 1086-1087, the computation max = info.max_desc - BURST_SIZE can produce a non-positive value which breaks loops that iterate until n != max; add a capacity check immediately after reading info.max_desc to ensure info.max_desc > BURST_SIZE. If the check fails, handle it by logging/printing a clear message and skipping or aborting the test (return non-zero or exit) rather than continuing with a non-positive max; otherwise compute max as before and proceed.

coderabbitai · 2025-10-16T10:44:00Z

app/test/test_dmadev.c

+				memset((void *)(uintptr_t)ops[j]->src_dst_seg[0].addr,
+				       rte_rand() & 0xFF, CPY_LEN);
+				memset((void *)(uintptr_t)ops[j]->src_dst_seg[1].addr, 0, CPY_LEN);
+			}


⚠️ Potential issue | 🔴 Critical

IOVA used as VA for memset/memcmp — breakage when IOVA != VA

memset/memcmp on (void *)(uintptr_t)addr assumes VA==IOVA. Use mbuf data pointers.

Apply this diff:

- memset((void *)(uintptr_t)ops[j]->src_dst_seg[0].addr, - rte_rand() & 0xFF, CPY_LEN); - memset((void *)(uintptr_t)ops[j]->src_dst_seg[1].addr, 0, CPY_LEN); + char *src_va = rte_pktmbuf_mtod(mbufs[j], char *); + char *dst_va = rte_pktmbuf_mtod(mbufs[j + BURST_SIZE], char *); + memset(src_va, rte_rand() & 0xFF, CPY_LEN); + memset(dst_va, 0, CPY_LEN); @@ - /* check the data is correct */ - ssg = ops[j]->src_dst_seg[0]; - dsg = ops[j]->src_dst_seg[1]; - if (memcmp((void *)(uintptr_t)ssg.addr, (void *)(uintptr_t)dsg.addr, - ssg.length) != 0) + /* check the data is correct */ + struct rte_mbuf *src_m = (struct rte_mbuf *)(uintptr_t)ops[j]->user_meta; + struct rte_mbuf *dst_m = (struct rte_mbuf *)(uintptr_t)ops[j]->event_meta; + char *src_va2 = rte_pktmbuf_mtod(src_m, char *); + char *dst_va2 = rte_pktmbuf_mtod(dst_m, char *); + if (memcmp(src_va2, dst_va2, CPY_LEN) != 0) ERR_RETURN("Error with copy operation\n");

Also applies to: 1144-1151

🤖 Prompt for AI Agents

In app/test/test_dmadev.c around lines 1113-1116 (and similarly 1144-1151), the code casts src_dst_seg[].addr (an IOVA) to a VA for memset/memcmp which breaks when IOVA != VA; instead obtain the packet buffer's virtual data pointer (use the mbuf associated with the segment and rte_pktmbuf_mtod or rte_pktmbuf_mtod_offset to compute the VA at the segment offset), ensure CPY_LEN does not exceed the mbuf data_len, and call memset/memcmp on that VA; update both occurrences accordingly so you use mbuf data pointers rather than casting IOVA to a pointer.

coderabbitai · 2025-10-16T10:44:00Z

app/test/test_dmadev.c

+	printf("\n");
+
+	rte_mempool_free(pkt_pool);
+	rte_mempool_free(ops_pool);
+
+	return 0;


⚠️ Potential issue | 🟠 Major

Free resources on early error paths

ERR_RETURN returns immediately and leaks pkt_pool/ops_pool. Convert to a cleanup path that always frees pools.

Example pattern:

- rte_mempool_free(pkt_pool); - rte_mempool_free(ops_pool); - return 0; +out: + rte_mempool_free(pkt_pool); + rte_mempool_free(ops_pool); + return ret; +err: + ret = -1; + goto out;

Replace ERR_RETURN inside this function with setting ret and goto err.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In app/test/test_dmadev.c around lines 1159-1164, several ERR_RETURN calls return early and leak pkt_pool and ops_pool; change those ERR_RETURN occurrences to set an appropriate ret value and goto err so that the common cleanup at the end (which frees pkt_pool and ops_pool) always runs. Update the function so the err label performs rte_mempool_free(pkt_pool) and rte_mempool_free(ops_pool) only if the pools are non-NULL, and ensure all early-exit paths jump to err instead of returning directly.

coderabbitai · 2025-10-16T10:44:03Z

lib/eal/unix/rte_basename.c

+	/* basename is on the end, so if path is too long, use only last PATH_MAX bytes */
+	const size_t pathlen = strlen(path);
+	if (pathlen > PATH_MAX)
+		path = &path[pathlen - PATH_MAX];


⚠️ Potential issue | 🟡 Minor

Truncation may cut path mid-component.

When a path exceeds PATH_MAX, the code truncates to the last PATH_MAX bytes. This could split a path component (e.g., /very/long/path/dirname might become name if truncated mid-component), potentially producing incorrect basename results.

Consider documenting this edge case behavior or exploring alternative truncation strategies that preserve path component boundaries.

🤖 Prompt for AI Agents

In lib/eal/unix/rte_basename.c around lines 23 to 26, the current truncation to the last PATH_MAX bytes can cut a path component in half and yield an incorrect basename; change the truncation so it preserves component boundaries by, when pathlen > PATH_MAX, locating the first '/' character inside the retained tail (i.e., find the first '/' at or after pathlen - PATH_MAX) and set path to the character after that slash (or to the start of the retained tail if no slash is found), or alternatively document this edge-case behavior in the function comment if preserving boundaries is undesired.

coderabbitai · 2025-10-16T10:44:03Z

lib/eal/windows/rte_basename.c

+	int ret = _splitpath_s(path, NULL, 0, dir, sizeof(dir),
+			fname, sizeof(fname), ext, sizeof(ext));
+	if (ret != 0)
+		return (size_t)-1;


⚠️ Potential issue | 🟡 Minor

Error return value (size_t)-1 may cause confusion.

Returning (size_t)-1 on error results in a very large positive number (SIZE_MAX) rather than a negative error code, which may not be the expected behavior for callers checking for errors.

Consider documenting this error return value clearly in the function's public header documentation, or using a different error indication mechanism if the API allows it.

🤖 Prompt for AI Agents

In lib/eal/windows/rte_basename.c around lines 26-29 the function returns (size_t)-1 on error (which is SIZE_MAX and can be confusing); either update the function's public header/doc comment to explicitly state that the function returns SIZE_MAX on error and callers must check for that value, or change the implementation to return a more conventional error indicator (e.g., return 0 and set errno appropriately such as EINVAL) and update all callers/tests to handle the new convention; if you choose the latter, add errno.h include and ensure the header doc matches the new behavior.

coderabbitai · 2025-10-16T10:44:03Z

lib/eal/x86/rte_power_intrinsics.c

 static void amd_mwaitx(const uint64_t timeout)
 {
-	RTE_SET_USED(timeout);
 #if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__)
-	_mm_mwaitx(0, 0, 0);
+	_mm_mwaitx(2, 0, (uint32_t)timeout);
 #else
 	asm volatile(".byte 0x0f, 0x01, 0xfb;"
 			: /* ignore rflags */
 			: "a"(0), /* enter C1 */
-			"c"(0)); /* no time-out */
+			"b"((uint32_t)timeout),
+			"c"(2)); /* enable time-out */
 #endif
 }


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify timeout behavior with test coverage.

The previous implementation completely ignored the timeout parameter (hence the removed RTE_SET_USED(timeout) mentioned in the AI summary). The new implementation enables timeout functionality for AMD MWAITX. This is a significant behavioral change that requires thorough testing.

Please ensure:

Unit tests validate that MWAITX wakes up correctly when the timeout expires

Tests verify timeout precision (especially edge cases near uint32_t limits)

Tests confirm consistent timeout behavior between Intel and AMD code paths

Integration tests validate the power monitor API behavior on AMD platforms

Run the following script to check for existing test coverage:

🏁 Script executed:

#!/bin/bash # Description: Search for test coverage of amd_mwaitx and power monitor timeout functionality # Search for test files related to power intrinsics rg -t c -t h --files | rg -i 'test.*power|power.*test' # Search for test cases covering mwaitx or timeout behavior rg -n -A 10 -B 2 'amd_mwaitx|MWAITX|power.*timeout' --type c --type h -g '**/*test*' # Search for references to rte_power_monitor in test code rg -n -A 5 'rte_power_monitor\(' --type c -g '**/*test*'

Length of output: 115

🏁 Script executed:

#!/bin/bash # Examine main power test file for timeout coverage echo "=== Checking test_power.c for power monitor and timeout tests ===" cat app/test/test_power.c | head -100 echo -e "\n=== Searching for test functions related to power monitor ===" rg -n 'test_power|power_monitor' app/test/test_power.c | head -20 echo -e "\n=== Checking for timeout-related test logic ===" rg -n -C 5 'timeout' app/test/test_power.c echo -e "\n=== Checking for AMD-specific test or initialization ===" rg -n -C 3 'amd|AMD' app/test/test_power.c echo -e "\n=== Verifying if amd_mwaitx is tested elsewhere ===" rg -r 'amd_mwaitx' app/test/

Length of output: 9887

🏁 Script executed:

#!/bin/bash # Broader search for power_monitor and intrinsics testing echo "=== Search for rte_power_monitor tests ===" rg -r 'rte_power_monitor' app/test/ --type c echo -e "\n=== Search for power intrinsics test files ===" find app/test -name '*intrinsic*' -o -name '*power_intr*' echo -e "\n=== Check if power monitor timeout API is tested at all ===" rg -r 'rte_power_monitor\|rte_power_pause\|power_monitor' app/test/ --type c -C 2 echo -e "\n=== Look at other power test files for timeout coverage ===" ls -la app/test/test_power*.c echo -e "\n=== Check header for amd_mwaitx signature ===" rg -n 'amd_mwaitx' lib/eal/x86/rte_power_intrinsics.c -B 2 -A 8

Length of output: 3504

Add test coverage for AMD mwaitx timeout behavior

No existing tests cover amd_mwaitx or its timeout parameter.

Add unit tests to verify wake-up on timeout expiry, precision near uint32_t limits, and parity with the Intel path.

Add integration tests to validate rte_power_monitor timeout behavior on AMD platforms.

coderabbitai · 2025-10-16T10:44:03Z

lib/eal/x86/rte_power_intrinsics.c

-	RTE_SET_USED(timeout);
 #if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__)
-	_mm_mwaitx(0, 0, 0);
+	_mm_mwaitx(2, 0, (uint32_t)timeout);


⚠️ Potential issue | 🔴 Critical

Critical: Semantic mismatch between absolute TSC timestamp and relative timeout interval.

The amd_mwaitx function receives a 64-bit timeout parameter that appears to be an absolute TSC timestamp (based on the caller passing tsc_timestamp at line 209), consistent with Intel's umwait instruction usage (lines 56-69). However, AMD's MWAITX instruction expects a relative timeout interval in EBX, not an absolute timestamp.

Issues:

Semantic mismatch: Passing an absolute TSC timestamp as a relative interval will cause incorrect timeout behavior. For example, if the current TSC is 1,000,000,000 and the target timestamp is 1,000,100,000 (100K ticks in the future), truncating to uint32_t and using as a relative interval could produce a timeout of either ~100K ticks (if lucky with truncation) or a wildly incorrect value.

Truncation hazard: Casting uint64_t to uint32_t (lines 93, 98) discards the upper 32 bits. For absolute TSC values (typically in the billions or trillions), the truncated value will be incorrect.

Inconsistent behavior: Intel UMWAIT and AMD MWAITX paths now have different timeout semantics, breaking API uniformity.

Expected behavior:

For AMD MWAITX, you should:

Convert the absolute tsc_timestamp to a relative interval: timeout_interval = tsc_timestamp - current_tsc

Clamp to uint32_t max if the interval exceeds 32-bit range

Pass the relative interval to MWAITX

Apply this diff to fix the relative timeout calculation:

static void amd_mwaitx(const uint64_t timeout) { + uint64_t cur_tsc = rte_rdtsc(); + uint64_t interval = (timeout > cur_tsc) ? (timeout - cur_tsc) : 0; + /* MWAITX timer is 32-bit; clamp if necessary */ + uint32_t timeout_val = (interval > UINT32_MAX) ? UINT32_MAX : (uint32_t)interval; + #if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__) - _mm_mwaitx(2, 0, (uint32_t)timeout); + _mm_mwaitx(2, 0, timeout_val); #else asm volatile(".byte 0x0f, 0x01, 0xfb;" : /* ignore rflags */ : "a"(0), /* enter C1 */ - "b"((uint32_t)timeout), + "b"(timeout_val), "c"(2)); /* enable time-out */ #endif }

Additionally, add #include <rte_cycles.h> for rte_rdtsc() and #include <stdint.h> for UINT32_MAX at the top of the file if not already present.

Also applies to: 98-99

🤖 Prompt for AI Agents

In lib/eal/x86/rte_power_intrinsics.c around lines 93 and 98-99, the code currently passes an absolute 64-bit TSC timestamp (timeout) directly to AMD's _mm_mwaitx after truncating to uint32_t, but MWAITX expects a relative interval; change the logic to read the current TSC via rte_rdtsc(), compute timeout_interval = (timeout > current_tsc) ? (timeout - current_tsc) : 0, clamp the interval to UINT32_MAX if it exceeds 32-bit, and pass the resulting (uint32_t) interval to _mm_mwaitx/_mm_mwaitx_emulation; also add #include <rte_cycles.h> (for rte_rdtsc()) and #include <stdint.h> (for UINT32_MAX) at the top of the file if they are not present.

coderabbitai · 2025-10-16T10:44:03Z

lib/graph/graph.c

+	if (graph_from_id(id) == NULL) {
+		rte_errno = ENOENT;
 		goto fail;
-	if (!rte_lcore_is_enabled(lcore))
-		SET_ERR_JMP(ENOLINK, fail, "lcore %d not enabled", lcore);
+	}


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify error code consistency.

The code sets rte_errno = ENOENT when the graph is not found, but graph_from_id() at line 49 sets rte_errno = EINVAL for the same condition. This creates an inconsistency.

Consider either:

Update graph_from_id() to return ENOENT instead of EINVAL for consistency across all callers

Remove line 344 and rely on the error code set by graph_from_id()

Run this script to check if other callers rely on EINVAL from graph_from_id:

🏁 Script executed:

#!/bin/bash # Find all uses of graph_from_id and check error handling patterns ast-grep --pattern $'if (graph_from_id($$$) == NULL) { $$$ }'

Length of output: 159

--- 🏁 Script executed: ```shell #!/bin/bash # Find the definition of graph_from_id and see what it does rg -n "^[a-zA-Z_].*graph_from_id" lib/graph/graph.c | head -20

Length of output: 0

🏁 Script executed:

#!/bin/bash # Find all calls to graph_from_id (not just NULL checks) rg -n "graph_from_id\(" lib/graph/graph.c -C2

Length of output: 845

🏁 Script executed:

#!/bin/bash # Look for rte_errno assignments in graph_from_id function rg -n "EINVAL|ENOENT" lib/graph/graph.c | head -30

Length of output: 289

🏁 Script executed:

#!/bin/bash # Find the actual graph_from_id function definition with context rg -n "static.*graph_from_id|^graph_from_id" lib/graph/graph.c -A10

Length of output: 211

🏁 Script executed:

#!/bin/bash # Check if there are other error handling patterns after graph_from_id calls rg -B2 -A3 "graph_from_id" lib/graph/graph.c

Length of output: 896

Standardize missing-graph error code.
graph_from_id() (line 49) currently sets rte_errno = EINVAL when no match is found, yet the caller at lib/graph/graph.c:343-346 overrides it to ENOENT. Either update graph_from_id() to use ENOENT and remove the override, or drop the override and keep EINVAL everywhere.

🤖 Prompt for AI Agents

In lib/graph/graph.c around lines 343-346, the call path is setting conflicting errno values for a missing graph: graph_from_id() currently sets rte_errno = EINVAL on no-match while the caller here overrides it to ENOENT; standardize this by updating graph_from_id() to set rte_errno = ENOENT when no graph is found and then remove the overriding rte_errno = ENOENT at these lines (so callers rely on graph_from_id() for the canonical error code).

tmonjalo and others added 30 commits October 14, 2025 16:45

test/dma: add enqueue/dequeue operations

af5af1b

Add enqueue dequeue operations tests. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

app/dma-perf: measure enqueue/dequeue operations

e1bfc8d

This patch adds an option to measure performance of enq/deq operations in the benchmark app. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

dma/cnxk: support enqueue/dequeue operations

f0d9791

Implement DMA enqueue/dequeue operations when application enables it via configuration. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

eventdev: refactor DMA adapter operations

77c552f

Migrate all invocations of rte_event_dma_adapter_op API to rte_dma_op. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Amit Prakash Shukla <amitprakashs@marvell.com>

test/dma: adapt ring size to device capabilities

c4b0762

The default ringsize is 512, some DMA devices may not support, this commit support auto adapt ringsize. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

argparse: support core lists

d78103f

Core lists are widely used in DPDK, so add support for parsing them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>

bruce-richardson and others added 16 commits October 15, 2025 17:53

eal: use cpuset to string helper

5144ddd

Use the new cpuset to string function to replace other thread-specific functions which did similar work. Also use it for additional debug logs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

mem: introduce constant-time memory comparison

05bd97b

The new function rte_memeq_timingsafe() is based on FreeBSD API to avoid timing side-channel attacks. Bugzilla ID: 1773 Signed-off-by: Kai Ji <kai.ji@intel.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

riscv support rte_memcpy in vector

765174e

This patch implements RISC-V vector intrinsics to accelerate memory copy operations for byte range (129~1600). Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>

benchmark report for rte_memcpy

b1502b3

Benchmark results show 10~15% reduction in execution time for data sizes (129~1024) Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>

sourcery-ai bot reviewed Oct 16, 2025

View reviewed changes

coderabbitai bot reviewed Oct 16, 2025

View reviewed changes

[PWCI] "Optimization Summary for RISC-V rte_memcpy" #195

Are you sure you want to change the base?

[PWCI] "Optimization Summary for RISC-V rte_memcpy" #195

Uh oh!

Conversation

ovsrobot commented Oct 16, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

ovsrobot commented Oct 16, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 16, 2025 •

edited

Loading