-
Notifications
You must be signed in to change notification settings - Fork 4
[PWCI] "Optimization Summary for RISC-V rte_memcpy" #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
After an upgrade to MinGW version 13, some compilation errors appear: drivers/bus/pci/windows/pci.c:362:58: error: 'GUID_DEVCLASS_NETUIO' undeclared drivers/bus/pci/windows/pci_netuio.c:57:39: error: 'GUID_DEVINTERFACE_NETUIO' undeclared The cause is MinGW has set NTDDI_VERSION to the highest version without defining the expected NetUIO constants. The case MinGW64 is added to define NetUIO constants. Some comments are improved to better track includes requirements. Fixes: 6605c7f ("bus/pci: fix build with Windows SDK >= 10.0.20253") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
After an upgrade to MinGW version 13, compilation breaks: drivers/net/mlx5/windows/mlx5_ethdev_os.c:285:69: error: 'dev_link.<U1000>.<Uaf00>.link_autoneg' may be used uninitialized This is because link_autoneg is never set in mlx5_link_update(). It can be set to the previous value (no change). Also it does not make sense to check this value to return the update status as it does not change. Fixes: 6fbd737 ("net/mlx5: support link update on Windows") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
After an upgrade to MinGW version 13, compilation breaks: In function 'rte_bbdev_queue_ops_dump': lib/bbdev/rte_bbdev.c:1269:63: error: '%s' directive argument is null [-Werror=format-overflow=] fprintf(f, " Enqueue Status Counters %s %" PRIu64 "\n", The enqueue status string may be null if the index is too high, because RTE_BBDEV_ENQ_STATUS_SIZE_MAX is defined to include padding for future enum insertion. This padding case must be checked to avoid printing a dump of a non-existing status. The type of the variable i is also changed to the enum required by the function rte_bbdev_enqueue_status_str(). Fixes: 353e363 ("bbdev: add queue debug dump") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Add enqueue/dequeue operations that use struct rte_dma_op to communicate with the DMA device. These operations need to be enabled at DMA device configuration time by setting the flag rte_dma_conf::enable_enq_deq if the device supports RTE_DMA_CAPA_OPS_ENQ_DEQ capability. When the DMA device is configured with RTE_DMA_CFG_FLAG_ENQ_DEQ flag, the enqueue/dequeue operations should be used to perform DMA operations. All other operations i.e., rte_dma_copy, rte_dma_copy_sg, rte_dma_fill, rte_dma_submit, rte_dma_completed, rte_dma_completed_status are not supported. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add enqueue dequeue operations tests. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
This patch adds an option to measure performance of enq/deq operations in the benchmark app. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Implement DMA enqueue/dequeue operations when application enables it via configuration. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Migrate all invocations of rte_event_dma_adapter_op API to rte_dma_op. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Amit Prakash Shukla <amitprakashs@marvell.com>
Stop dmadev may fail if there are pending DMA transfers, we need make sure there are no pending DMA transfers when stop. This commit uses following scheme: 1. flag stop proc so that new request will not process. 2. setting drop flag for all descriptor to quick complete. 3. waiting dmadev to complete. Fixes: 3c5f5f0 ("dma/hisilicon: add control path") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
In the runtest() function, it will return error if the test return value is less than 0. But the 'copy' and 'error_handling' testcase may return 1 if failed because its internal use or(||) operation. This commit fix it by treating non-zero as error in runtest() function. Fixes: 1b86a66 ("test/dma: add more comprehensive copy tests") Fixes: 99d7ec4 ("test/dma: add failure handling tests") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
The default ringsize is 512, some DMA devices may not support, this commit support auto adapt ringsize. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There is no standard, cross-platform function to get the basename of a file path across all the supported DPDK platforms, Linux, BSD and Windows. Both Linux and BSD have a "basename" function in standard library, except: * Linux has two different basename functions, a POSIX version (which may or may not modify args), and a GNU one which is guaranteed *not* to modify the input arg and returns pointer to internal storage. * FreeBSD has just the one basename function, but, to be different, it is guaranteed *always* to modify the argument and re-use it for output. * Windows just doesn't have a basename function, but provides _split_path as a similar function, but with many differences over basename, e.g. splitting off extension, returning empty basename if path ends in "/" etc. etc. Therefore, rather than just trying to implement basename for windows, which opens the question as to whether to emulate GNU and *never* modify arg, or emulate BSD and *always* modify arg, this patchset introduces "rte_basename" which should have defined behaviour on all platforms. The patch also introduces a set of test cases to confirm consistent behaviour on all platforms too. The behaviour is as in doxygen docs. Essentially: - does not modify input path buffer - returns output in a separate output buffer - uses snprintf and strlcpy style return value to indicate truncation Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The echoing of commands contained in a file loaded via the "load" command on testpmd CLI is governed by whether or not a cmdline-file or cmdline-file-noecho had been passed to testpmd on the commandline. Remove the use of a global setting for this, and explicitly add a "load_echo" command to match the "load" command. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
While testpmd allows a set of pre-prepared commands to be passed into it at startup via the "cmdline-file" (and cmdline-file-noecho) parameters, this is currently limited to a single file. By extending this support to allow the parameter to be passed multiple (up to 16) times, we enable users to have a library of pre-canned cmdline files and pass multiple of these in to the same run, e.g. have one cmdline file per NIC feature and enable multiple features by passing in multiple filenames. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Two small improvements for the cmdline file processing in testpmd. * Now that we support multiple files, change the prompt to indicate what file is currently being processed, and just print an EOF message when done. * When not echoing, the "Read" verb in the message "Read CLI commands..." is a little ambiguous, as it could mean "I have read", or "Go and read", i.e. job done or job about to start. Tweak the text to "Finished reading" which is unambiguous. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Replace use of the Linux/BSD-only basename function with "rte_basename". Since the DPDK basename function is guaranteed to do a copy into the destination buffer of known length, we can sometimes simplify the code, or make it safer, as part of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Use the new "rte_basename" function which has common behaviour on all supported platforms. As extra benefit, it removes the need for recreating the socket path twice, since rte_basename guarantees not to modify its argument. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Even though the idxd_bus.c file is linux-only, we can simplify the code through using rte_basename rather than just basename. Since rte_basename does not modify its parameter, we can eliminate the strdup and free, and since it also handles NULL sanely, we can remove an extra NULL check. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Previously, the AMD power monitor implementation did not enable the timeout, causing the lcore to remain in a wait state until an external monitoring event occurred or an interrupt was received. This patch enables the timeout-based exit condition, allowing the lcore to automatically wake up after the specified period. The maximum supported timeout value is 2^32 - 1. Fixes: c7ed1ce ("eal/x86: add power intrinsics for AMD") Cc: stable@dpdk.org Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The unit test for EAL init was passing in a NULL argv[0] value which is invalid. Update to have a valid string for argv[0]. Fixes: 104a92b ("app: add reentrancy tests") Fixes: 978ead0 ("test/func_reentrancy: fix EAL init call") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Introduce a string define for the currently running OS, or execution environment. Originally, with old make build system, CONFIG_RTE_EXEC_ENV used to hold this name string, but the variable seems to have been missed in the meson build system, until commit cadb255 ("eal: add OS defines for C conditional checks") which introduced the RTE_EXEC_ENV for a different purpose. Now we can fix the docs with the new name reference. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Make the function to print out the help text for an argparse object a public function, which takes as a new parameter the file stream on which to print. This can be used in future to allow application to extend their own help information. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
When the arguments passed to argparse include -h/--help then usage information is automatically printed. Provide the capability for the user to supply their own help function for this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
The different value types supported by the argparse library are not discussed in the documentation, so add some reference material about them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Core lists are widely used in DPDK, so add support for parsing them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
Allow argparse to behave like getopt in ignoring non-flag arguments and moving them to the end of the list. This will allow use in DPDK for EAL args where, for example: ./dpdk-test -l 1,2 lcores_autotest --no-huge -- argparse_autotest will return from getopt processing in rte_eal_init as: ./dpdk-test -l 1,2 --no-huge -- lcores_autotest argparse_autotest with EAL init function then replacing the "--" with argv[0] before returning to the app with the return value of 5, for the 5 args (including the "--") processed. Since this is a significant behavioural change for argparse, which, unlike getopt, has the idea of positional arguments, we place this behaviour behind a new flag for argparse. If this new "ignore_non_flag_args" option is set, then positional arguments are disallowed and getopt behaviour is used for non-flag args. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When passing an optional parameter to a argparse argument, the only way to do so has been to pass the argument using "=arg" on the end of the option, for example "--foo=bar". However, for short options, i.e. those of the form "-x", this requirement is unusual as getopt also supports taking additional characters after the option as the parameter, e.g. "-xarg". Add support for this form of parameter for short options to argparse. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
To simplify future rework of the EAL arg handling, add a long-option equivalent for each short option that doesn't already have one. When updating the docs with the new long options, standardize the format of options which have both short and long variants, and drop the deprecated service-coremask option from the docs, rather than adding its new long option. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Create eal_option_list.h, containing all the possible EAL parameters, and basic info about them, such as type, whether they take a parameter or not. Each entry is defined using a macro, which will be then interpreted when the file is included. First time this header in included in the eal_common_options.c file, the macros are defined in such a way as to define field elements for an "eal_init_args" structure, where each value is either a string type, if it takes a parameter, or boolean type if it doesn't. For those elements that take multiple values, i.e. are passed multiple times, we put them in a TAILQ. The second time of inclusion is for TAILQ initialization, as it is required before using such object. The third time of inclusion, the macros are defined so as to define the arguments in an rte_argparse structure for EAL. For the basic string and boolean types, we just store the values in the appropriate field in the previous defined "eal_init_args" structure. For the list elements, we use the argparse callback to process those elements, adding them to the TAILQ as they are encountered. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
DPDK traditionally has iterated through all args and processed them as they appear in the commandline. The arg processing logic can be simplified if instead we initially gather all arguments into a structure which is then processed with the arguments dealt with in a fixed/known order. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
When rte_eal_init fails part way through, any saved EAL arguments need to be freed, and the run_once flag needs to be set back to zero again. The former task was never done on failure, and the latter was only done on some occasions. Rework the error handling to always go to an err_out label where cleanup is done. To prevent memory leaks from the saved arguments when eal_init is called twice, the check for multiple calls must be done first before the argument saving and parsing is done. This patch modifies all three eal.c files. Windows doesn't actually need changes, since there are no args saved, and no run_once sentinel value, but updating it keeps it consistent with FreeBSD and Linux versions. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Remove the separate function to check combinations of cmdline parameters. Instead, just do those checks when parsing the parameters since we have all info about what parameters are provided at that point. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Use a utility function and macro to simplify the code for checking for conflicting cmdline options. The checking can also be done at the initial argument collating stage, shortening the argument processing function which is very much on the long side. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The existing "available_cores" function in eal_common_options.c has general code for converting a set of cores to a printable string. Generalize this code into an "eal_cpuset_to_str" function which takes a cpuset as parameter, and create a new "available_cores" function using this new utility function. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Use the new cpuset to string function to replace other thread-specific functions which did similar work. Also use it for additional debug logs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Rather than working off arrays of lcore values, update the coremask handling to use rte_cpuset_t. The use of the array provides no additional value for coremasks, since the order of the bits set will always be from lowest core number to highest, making the tracking of what core was 0, 1, 2 etc. pointless. As part of this change, move the prototype for the parse_coremask function from rte_eal.h to eal_options.h, for two reasons: 1. This is an internal function that should not be used by apps and so there is no need to have it in a public header. 2. If the compilation command is run without the _GNU_SOURCE flag, we miss definitions from sched.h and so get build errors for the rte_cpuset_t type when compiling examples against an installed DPDK. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
Add the new "--remap-lcore-ids" or "-R" option to signify that lcore ids are to start at a low range. This takes an optional parameter to specify the value to start from. Initially, use this to adjust the core mappings when using a coremask. As well as remapping within the range of RTE_MAX_LCORE, to allow use of cores >=RTE_MAX_LCORE we need to expand the internal arrays of cores to be CPU_SETSIZE big, and limit the checks for RTE_MAX_LCORE to only cases where we are not remapping. This requires an additional parameter to the internal API rte_eal_parse_coremask, having a small knock-on effect on DLB driver as the only other user of this API. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
Allow the "-R" flag to be used with core lists so that we can specify lists with core ids >= RTE_MAX_LCORE in them, e.g. -l 140-148. In order to do so, we limit the use of the "@" symbol for per-core remapping when specifying a core list for automatic/global remapping, rather than dealing with the situation of some cores being manually remapped and others automatically. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When run without an coremask or core list parameters, EAL detects the working set of cores allowed for the current process and uses that as a core list for use. Currently, this does not work for cores which are numbered >= RTE_MAX_LCORE. We can fix this by allowing the remap option to also be used with these auto-detected cores. In the process we simplify the code by taking advantage of that fact that we know up-front when processing parameters if any core parameter is passed, so we no longer need a flag and to auto-detect at the end. Instead, we take advantage of the fact that our core-arg processing routines now all work with cpusets, to have a simple tri-state option for remapping (if we don't have manual remapping done by --lcores-parameter): * parse coremask to cpuset * parse core list to cpuset * query OS for current thread's cpuset Once that is done, we have common code for processing, and optionally remapping, the resulting cpuset. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The new function rte_memeq_timingsafe() is based on FreeBSD API to avoid timing side-channel attacks. Bugzilla ID: 1773 Signed-off-by: Kai Ji <kai.ji@intel.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Replace memcmp() with rte_memeq_timingsafe() in cryptographic authentication verification operations across ipsec-mb drivers. Note: OpenSSL crypto driver already uses CRYPTO_memcmp() which provides equivalent timing attack resistance and is left unchanged. Note: scheduler driver memcmp stays unchanged as its not secret data comparison and actually faster with no timing attack risk. Bugzilla ID: 1773 Signed-off-by: Kai Ji <kai.ji@intel.com>
Allow the user to specify protocol link type when creating pcapng files. This change is needed to specify the protocol type in the pcapng file, DLT_EN10MB specifies ethernet packets only. This will allow dissectors for other protocols to be used on files generated by pcapng. Includes a breaking change to rte_pcapng_add_interface to add link_type parameter. Existing calls to the function have been updated to pass DLT_EN10MB for the link type argument. Fixes: d1da6d0 ("pcapng: require per-interface information") Signed-off-by: Dylan Schneider <schneide@qti.qualcomm.com> Acked-by: Stephen Hemminger <stephen@networplumber.org>
rte_graph_model_mcore_dispatch_core_bind relied on rte_lcore_is_enabled. Yet, "rte_lcore_is_enabled" only checks for EAL threads, which forbids external threads (NON EAL) to run a part of the graph. Verify if the lcore role is not "ROLE_OFF", and return relevant error code otherwise. Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
In dispatch mode, the assumption is that the graphs will be polled continuously: there is no way to know when packets are transferred from one graph to another. Introduce a callback to be notified when packets are enqueue in the work queue: this can be useful if we want to wake up a thread. This sleep/wake up mechanism is deferred to the application. Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
This patch implements RISC-V vector intrinsics to accelerate memory copy operations for byte range (129~1600). Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>
Benchmark results show 10~15% reduction in execution time for data sizes (129~1024) Signed-off-by: Qiguo Chen <chen.qiguo@zte.com.cn> Signed-off-by: 0-day Robot <robot@bytheb.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @ovsrobot, your pull request is larger than the review limit of 150000 diff characters
WalkthroughThis pull request introduces comprehensive support for DMA enqueue/dequeue operations throughout the library and application layers, refactors EAL argument parsing to use the argparse library, enhances argparse with CORELIST value type and non-flag argument handling, updates the pcapng API to accept configurable link types, implements timing-safe memory comparison functions, and adds RISC-V vector instruction-based memcpy optimizations. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes This is a substantial, multi-faceted change spanning core library refactoring (EAL argument parsing, DMA framework), new feature implementation (DMA OPS support, argparse enhancements), driver-level changes across multiple DMA/event/crypto drivers, and interconnected public API modifications. The changes demand understanding of: (1) EAL initialization and argument parsing flows; (2) DMA device operation enqueue/dequeue semantics; (3) argparse library architecture and its integration into EAL; (4) driver-specific OPS implementations across CNXK/DPAA/Hisilicon; (5) pcapng API contract changes; (6) event-adapter operation structure migration. High heterogeneity of logic across subsystems (no single repetitive pattern), complex state management in drivers, and interconnected API changes amplify review complexity. Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 26
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
app/test/test_dmadev.c (1)
95-101
: Use correct vchan when reading statsrte_dma_stats_get(dev_id, 0, &stats) ignores the vchan under test. Use vchan, not 0.
Apply this diff:
- rte_dma_stats_get(dev_id, 0, &stats); + rte_dma_stats_get(dev_id, vchan, &stats);drivers/dma/cnxk/cnxk_dmadev.c (1)
53-57
: Memory leak: c_desc.ops never freed (and error path not fully cleaned).
- cnxk_dmadev_vchan_free() frees only compl_ptr; ops array leaks on reconfigure/close.
- vchan_setup() error path frees compl_ptr but doesn’t null it.
Fix by freeing ops in vchan_free and nulling compl_ptr in the error path.
@@ for (; i < num_vchans; i++) { dpi_conf = &dpivf->conf[i]; rte_free(dpi_conf->c_desc.compl_ptr); dpi_conf->c_desc.compl_ptr = NULL; + rte_free(dpi_conf->c_desc.ops); + dpi_conf->c_desc.ops = NULL; } @@ - if (dpi_conf->c_desc.ops == NULL) { + if (dpi_conf->c_desc.ops == NULL) { plt_err("Failed to allocate for ops array"); - rte_free(dpi_conf->c_desc.compl_ptr); + rte_free(dpi_conf->c_desc.compl_ptr); + dpi_conf->c_desc.compl_ptr = NULL; return -ENOMEM; }Also applies to: 276-282
app/test-dma-perf/benchmark.c (1)
989-1043
: Free dma_ops and underlying ops blob in cleanup.Prevent leaks by freeing both the pointer array and ops storage.
@@ /* free sges for mbufs */ rte_free(src_sges); src_sges = NULL; rte_free(dst_sges); dst_sges = NULL; + /* free DMA ops (if allocated for OPS path) */ + if (dma_ops != NULL) { + /* first element points to the start of the contiguous ops blob */ + if (dma_ops[0] != NULL) + rte_free(dma_ops[0]); + rte_free(dma_ops); + dma_ops = NULL; + } + /* free the worker parameters */ for (i = 0; i < nb_workers; i++) { rte_free(lcores[i]); lcores[i] = NULL; }
🧹 Nitpick comments (33)
drivers/bus/pci/windows/pci_netuio.h (1)
8-8
: LGTM! MinGW-w64 compatibility addition is appropriate.The addition of
|| defined(__MINGW64__)
ensures that the GUID definitions are included when building with MinGW-w64, even when targeting sufficiently recent Windows versions. This is a reasonable compatibility fix since MinGW-w64 headers may not provide these definitions.Minor nitpick: There's an extra space before
defined(__MINGW64__)
.lib/bbdev/rte_bbdev.c (1)
1250-1250
: Consider using an integer type for the loop variable.While using
enum rte_bbdev_enqueue_status
as the loop index is more self-documenting, incrementing an enum withi++
is not idiomatic C. Although enums are treated as integers and this works correctly (especially with the NULL guard at lines 1269-1270), it can create values outside the defined enum range during iteration.For better clarity and type safety, consider one of these alternatives:
Option 1 (recommended): Use an integer type and cast when calling the function
- enum rte_bbdev_enqueue_status i; + unsigned int i; struct rte_bbdev *dev = get_dev(dev_id); VALID_DEV_OR_RET_ERR(dev, dev_id);Option 2: Explicitly cast in the loop
- for (i = 0; i < RTE_BBDEV_ENQ_STATUS_SIZE_MAX; i++) { - const char *status_str = rte_bbdev_enqueue_status_str(i); + for (i = (enum rte_bbdev_enqueue_status)0; + i < RTE_BBDEV_ENQ_STATUS_SIZE_MAX; + i = (enum rte_bbdev_enqueue_status)((int)i + 1)) { + const char *status_str = rte_bbdev_enqueue_status_str((enum rte_bbdev_enqueue_status)i);Also applies to: 1267-1267
app/test/test_func_reentrancy.c (1)
69-73
: Consider using a simple string literal for the program name.The use of
__FILE__
as the program name expands to the full source file path (e.g.,/path/to/app/test/test_func_reentrancy.c
), which is unconventional for argv[0]. While this doesn't affect the test's functionality (since it expectsrte_eal_init
to fail), using a clearer, more conventional name would improve readability.Apply this diff for better clarity:
test_eal_init_once(__rte_unused void *arg) { - char appname[] = __FILE__; unsigned lcore_self = rte_lcore_id(); - char *argv[] = { appname, NULL }; + char *argv[] = { "test_eal_init", NULL };drivers/event/dlb2/pf/base/dlb2_resource.c (1)
925-948
: Consider consolidating redundant counter variables.The local variable
set_count
is always incremented in lockstep withhw->num_prod_cores
at lines 947-948. Since they track the same value, you could simplify the code by usinghw->num_prod_cores
directly as the index.Current code:
hw->prod_core_list[set_count++] = i; hw->num_prod_cores++;Could be simplified to:
hw->prod_core_list[hw->num_prod_cores++] = i;This eliminates the extra variable and makes the code more concise without changing behavior.
drivers/dma/hisilicon/hisi_dmadev.h (1)
144-144
: Document the SQE_DROP_FLAG purpose and behavior.The new
SQE_DROP_FLAG
lacks documentation explaining its purpose and how the hardware interprets it. Based on the stop function usage (line 466 in hisi_dmadev.c), it appears to signal the hardware to quickly complete/discard operations, but this should be explicitly documented.Add a comment explaining the flag:
+/* SQE_DROP_FLAG signals hardware to quickly complete/discard the operation */ #define SQE_DROP_FLAG BIT(4)
config/riscv/meson.build (1)
151-159
: Align indentation with the V extension block.The zicbop detection block uses inconsistent indentation compared to the V extension block above (lines 135-146). Specifically:
- Line 152 uses 4-space continuation vs 8-space in line 136
- Line 153 uses 2-space indent vs 4-space in line 137
- Line 154 uses 6-space continuation vs 12-space in line 138
For consistency and maintainability, match the indentation pattern used in the V extension block.
Apply this diff to fix the indentation:
if (riscv_extension_macros and - (cc.get_define('__riscv_zicbop', args: machine_args) != '')) - if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0')) - or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0'))) - message('Compiling with the zicbop extension') - machine_args += ['-DRTE_RISCV_FEATURE_PREFETCH'] - else - warning('Detected zicbop extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)') - endif + (cc.get_define('__riscv_zicbop', args: machine_args) != '')) + if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0')) + or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0'))) + message('Compiling with the zicbop extension') + machine_args += ['-DRTE_RISCV_FEATURE_PREFETCH'] + else + warning('Detected zicbop extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)') + endif endiflib/eal/riscv/include/rte_memcpy.h (3)
262-269
: Prefer copying only the misalignment (dstofss) instead of a fixed 64B prologueThe alignment prologue copies 64 bytes but advances pointers by
dstofss
(≤63), causing a redundant 64B load/store when onlydstofss
bytes are needed. It’s safe but wasteful.Proposed change:
dstofss = (uintptr_t)dst & ALIGNMENT_MASK_64; if (dstofss > 0) { dstofss = 64 - dstofss; n -= dstofss; - _rte_mov64(dst, src); + _rte_mov(dst, src, (uint32_t)dstofss); src = (const uint8_t *)src + dstofss; dst = (uint8_t *)dst + dstofss; }
23-25
: Don’t disable the entire RVV path if PREFETCH is unavailableUndefining
RTE_RISCV_FEATURE_V
whenRTE_RISCV_FEATURE_PREFETCH
is absent disables otherwise valid RVV copies on platforms lacking prefetch support.Suggestion:
- Keep
RTE_RISCV_FEATURE_V
intact.- Make prefetch helpers no-ops when
RTE_RISCV_FEATURE_PREFETCH
is undefined (e.g., provide empty inline functions/macros).
This preserves RVV acceleration without relying on prefetch availability.
356-362
: RTE_INIT in a header creates per-TU constructors; consider centralizing initPlacing
RTE_INIT
in a header produces one constructor per translation unit, increasing init overhead and binary size.Move the feature detection routine to a single C file (e.g., riscv eal source) and expose the result via an inline getter or extern variable used by the inline fast paths.
drivers/dma/idxd/idxd_bus.c (1)
263-268
: Consider using PATH_MAX instead of magic number.The refactoring successfully eliminates dynamic allocation, but the fixed buffer size of 256 is a magic number. Consider using
PATH_MAX
(already available via the<libgen.h>
include) for better clarity and consistency with POSIX standards.Apply this diff to use PATH_MAX:
- char prefix[256]; + char prefix[PATH_MAX]; int retval = 0; size_t prefixlen; prefixlen = rte_basename(rte_eal_get_runtime_dir(), prefix, sizeof(prefix));doc/guides/testpmd_app_ug/testpmd_funcs.rst (1)
70-73
: Clarify echo/noecho semantics and align examples
- Please add one sentence stating the behavioral difference:
- --cmdline-file echoes each command before execution; --cmdline-file-noecho suppresses it.
- Similarly, load echoes; load_echo explicitly echoes each command.
- Consider adding a brief “with echo” vs “no echo” example, so users see expected console output.
Also applies to: 77-77, 98-100, 110-111
doc/guides/linux_gsg/eal_args.include.rst (1)
70-91
: Document how remapping affects main/service lcore options
- Please note explicitly that when -R/--remap-lcore-ids is used, --main-lcore and --service-corelist values refer to remapped lcore-ids (not physical CPUs).
- Keep the restriction note, but also add a short example showing --service-corelist after remap.
Also applies to: 92-98
drivers/dma/cnxk/cnxk_dmadev.h (1)
91-97
: Confirm ring mask semantics and ops array sizingmax_cnt is used as a mask (head/tail & max_cnt). Please ensure:
- max_cnt is (ring_size - 1) and ring_size is a power of two.
- c_desc->ops is allocated for ring_size entries (max_cnt + 1).
If not already guaranteed at setup, add an assert or comment to prevent future misuse.
lib/eal/common/eal_options.h (1)
23-25
: Include header for rte_cpuset_t to avoid ordering issuesThis header declares rte_eal_parse_coremask using rte_cpuset_t but does not include a header that defines it. Add an include (e.g., rte_common.h or rte_os.h) here to prevent compilation-order pitfalls in future includes.
lib/graph/rte_graph_worker_common.h (1)
71-73
: Use uintptr_t for cb_priv for pointer-size portabilityIf cb_priv carries a pointer/cookie, use uintptr_t instead of uint64_t to avoid assumptions on pointer size across architectures.
- packets_enqueued_cb notify_cb; /**< Callback when packet crosses lcores. */ - uint64_t cb_priv; /**< Opaque parameter for notify_cb. */ + packets_enqueued_cb notify_cb; /**< Callback when packet crosses lcores. */ + uintptr_t cb_priv; /**< Opaque parameter for notify_cb. */app/test-pmd/testpmd.c (1)
112-114
: Multi-file cmdline processing looks solid.Globals and loop usage are correct; error path breaks out cleanly. If partial success is acceptable, consider continuing on errors and reporting all failures.
Also applies to: 4583-4589
doc/guides/prog_guide/dmadev.rst (1)
111-142
: Consider clarifying alignment requirements in the example.The documentation and code example are clear and comprehensive. However, the example at line 123 uses
rte_zmalloc
without specifying alignment:op = rte_zmalloc(sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), 0);Given that the structure includes trailing scatter-gather elements, it would be clearer to specify the alignment explicitly or add a comment explaining the alignment requirements, especially since the mempool creation at line 2253 in
app/test-eventdev/test_perf_common.c
uses a similar pattern.Apply this diff to make alignment explicit:
- op = rte_zmalloc(sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), 0); + op = rte_zmalloc(NULL, sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), + RTE_CACHE_LINE_SIZE);lib/graph/graph.c (1)
348-349
: Consider error code semantics.The code uses
ENOLINK
(Link has been severed) for an invalid lcore. While technically correct,EINVAL
(Invalid argument) might be more conventional and clearer for "invalid lcore" errors.The validation now checks
rte_lcore_has_role(lcore, ROLE_OFF)
to detect disabled lcores, which is appropriate for rejecting invalid core bindings.app/test/test_dmadev_api.c (1)
703-759
: Consider adding clarifying comments for the ops memory management flow.The test allocates an ops structure, enqueues it, sets the pointer to NULL, dequeues it, and then frees it. While this pattern correctly tests that dequeue returns the operation pointer, the
ops = NULL
assignment at line 743 may not be immediately clear to future readers.Consider adding a comment before line 743 to clarify the intent:
rte_delay_us_sleep(TEST_WAIT_US_VAL); + /* Set ops to NULL to verify that dequeue properly returns the operation pointer */ ops = NULL; /* Dequeue operations */
app/test/test_dmadev.c (3)
1075-1085
: Pool name collisions across devices/runsHardcoded names "pkt_pool"/"ops_pool" can collide when multiple devices run the test. Include dev_id/vchan in names.
Apply this diff:
- pkt_pool = rte_pktmbuf_pool_create("pkt_pool", info.max_desc * 2, 0, 0, + char pkt_pool_name[32]; + snprintf(pkt_pool_name, sizeof(pkt_pool_name), "pkt_pool_%d_%u", dev_id, vchan); + pkt_pool = rte_pktmbuf_pool_create(pkt_pool_name, info.max_desc * 2, 0, 0, CPY_LEN + RTE_PKTMBUF_HEADROOM, rte_socket_id()); @@ - ops_pool = rte_mempool_create("ops_pool", info.max_desc, + char ops_pool_name[32]; + snprintf(ops_pool_name, sizeof(ops_pool_name), "ops_pool_%d_%u", dev_id, vchan); + ops_pool = rte_mempool_create(ops_pool_name, info.max_desc, sizeof(struct rte_dma_op) + (sizeof(struct rte_dma_sge) * 2), 0, 0, NULL, NULL, NULL, NULL, rte_socket_id(), 0);
1118-1126
: Confirm submit semantics for enqueue_ops/dequeue_opsNo rte_dma_submit() is called between enqueue and dequeue. Some PMDs may require submit even in ENQ_DEQ mode. Confirm API guarantees or add submit.
If required, add after completing an enqueue burst:
+ if (rte_dma_submit(dev_id, vchan) < 0) + ERR_RETURN("Error with rte_dma_submit()\n");Please confirm with dmadev ENQ/DEQ API docs or driver behavior.
Also applies to: 1134-1138
1169-1194
: prepare_enq_deq_ops: add minimal capacity checkIf max_desc <= BURST_SIZE the test logic will fail (see “max” calc). Add a guard and skip.
Apply this diff:
int ret = rte_dma_info_get(dev_id, &info); if (ret != 0) ERR_RETURN("Error with rte_dma_info_get()\n"); qconf.direction = RTE_DMA_DIR_MEM_TO_MEM; qconf.nb_desc = info.max_desc; + if (qconf.nb_desc <= BURST_SIZE) + ERR_RETURN("Insufficient descriptors for ENQ/DEQ test (need > %u, have %u)\n", + BURST_SIZE, qconf.nb_desc);drivers/dma/cnxk/cnxk_dmadev.c (1)
452-464
: Typo in function name: cnxk_damdev_burst_capacity → cnxk_dmadev_burst_capacity.Purely cosmetic, but aligns with existing naming (dmadev).
-static uint16_t -cnxk_damdev_burst_capacity(const void *dev_private, uint16_t vchan) +static uint16_t +cnxk_dmadev_burst_capacity(const void *dev_private, uint16_t vchan) @@ - dev->fp_obj->burst_capacity = cnxk_damdev_burst_capacity; + dev->fp_obj->burst_capacity = cnxk_dmadev_burst_capacity;Also applies to: 555-556
drivers/dma/cnxk/cnxk_dmadev_fp.c (1)
692-729
: Mask nb_src/nb_dst to hardware limits (consistency with other paths).Other paths use CNXK_DPI_MAX_POINTER masking. Enqueue should do the same to avoid invalid cmd lengths.
- src = op->nb_src; - dst = op->nb_dst; + src = op->nb_src & CNXK_DPI_MAX_POINTER; + dst = op->nb_dst & CNXK_DPI_MAX_POINTER; @@ - src = op->nb_src; - dst = op->nb_dst; + src = op->nb_src & CNXK_DPI_MAX_POINTER; + dst = op->nb_dst & CNXK_DPI_MAX_POINTER;Also applies to: 753-787
lib/eal/common/eal_common_options.c (1)
1996-2001
: Guard against NULL from available_cores() in error log.available_cores() may return NULL; passing it to “%s” is unsafe.
- char *available = available_cores(); - EAL_LOG(ERR, "invalid coremask or core-list parameter, please check specified cores are part of %s", - available); - free(available); + char *available = available_cores(); + EAL_LOG(ERR, + "invalid coremask or core-list parameter, please check specified cores are part of %s", + available ? available : "(no available cores)"); + if (available) + free(available);lib/dmadev/rte_dmadev.c (1)
804-817
: Add name for new capability RTE_DMA_CAPA_OPS_ENQ_DEQ in dump.Currently prints “unknown”.
} capa_names[] = { { RTE_DMA_CAPA_MEM_TO_MEM, "mem2mem" }, @@ { RTE_DMA_CAPA_OPS_COPY_SG, "copy_sg" }, { RTE_DMA_CAPA_OPS_FILL, "fill" }, + { RTE_DMA_CAPA_OPS_ENQ_DEQ, "ops_enq_deq" }, };
app/test-dma-perf/benchmark.c (1)
409-462
: Align throughput counter with completions (optional).Other paths count completions; here total_cpl is updated on enqueue. Consider incrementing on dequeues to keep consistency.
- while (enq < sz) { + while (enq < sz) { do { deq = rte_dma_dequeue_ops(dev_id, 0, op, DEQ_SZ); - tdeq += deq; + tdeq += deq; + worker_info->total_cpl += deq; } while (deq); @@ - tenq += enq; - - worker_info->total_cpl += enq; + tenq += enq;lib/dmadev/rte_dmadev.h (5)
282-293
: Config flags addition is good, but update related docsFlags look fine. Please also update any lingering references to struct fields (e.g., “silent_mode”) to point to RTE_DMA_CFG_FLAG_SILENT in this header.
360-362
: Public struct change may break ABI/APIReplacing former config fields with a flags uint64_t changes the layout of public struct rte_dma_conf. Confirm ABI policy/major version window and update release notes accordingly.
805-861
: Public rte_dma_op FAM should use [] instead of [0]Zero-length arrays are non‑standard. Prefer a C99 flexible array member for portability.
Apply this diff:
- struct rte_dma_sge src_dst_seg[0]; + struct rte_dma_sge src_dst_seg[];
1221-1229
: Doc nit: refer to the new config flag nameReplace “rte_dma_conf::enable_enq_deq” with “RTE_DMA_CFG_FLAG_ENQ_DEQ in rte_dma_conf::flags” to match the new API.
1261-1264
: Doc nit: refer to the new config flag nameReplace “rte_dma_conf::enable_enq_deq” with “RTE_DMA_CFG_FLAG_ENQ_DEQ in rte_dma_conf::flags”.
lib/argparse/rte_argparse.c (1)
711-717
: argv compaction: consider null-terminating the handled sliceAfter moving args, set argv[ret] = NULL to terminate the handled portion. Not required for correctness here, but improves safety for downstream consumers that expect a NULL sentinel.
Apply this diff:
- ret = j; /* only return args actually handled */ + ret = j; /* only return args actually handled */ + argv[ret] = NULL; /* Then put contents of the args_to_move array into the argv in the space left. */Also applies to: 801-813
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (91)
.mailmap
(2 hunks)app/dumpcap/main.c
(1 hunks)app/test-dma-perf/benchmark.c
(9 hunks)app/test-dma-perf/config.ini
(2 hunks)app/test-dma-perf/main.c
(3 hunks)app/test-dma-perf/main.h
(1 hunks)app/test-eventdev/test_perf_common.c
(2 hunks)app/test-eventdev/test_perf_common.h
(2 hunks)app/test-pmd/cmdline.c
(4 hunks)app/test-pmd/parameters.c
(1 hunks)app/test-pmd/testpmd.c
(2 hunks)app/test-pmd/testpmd.h
(3 hunks)app/test/process.h
(1 hunks)app/test/test_argparse.c
(5 hunks)app/test/test_dmadev.c
(7 hunks)app/test/test_dmadev_api.c
(12 hunks)app/test/test_event_dma_adapter.c
(3 hunks)app/test/test_func_reentrancy.c
(1 hunks)app/test/test_pcapng.c
(2 hunks)app/test/test_string_fns.c
(2 hunks)benchmark_report.txt
(1 hunks)config/meson.build
(1 hunks)config/riscv/meson.build
(1 hunks)doc/guides/contributing/design.rst
(1 hunks)doc/guides/linux_gsg/eal_args.include.rst
(5 hunks)doc/guides/prog_guide/argparse_lib.rst
(5 hunks)doc/guides/prog_guide/dmadev.rst
(1 hunks)doc/guides/prog_guide/eventdev/event_dma_adapter.rst
(2 hunks)doc/guides/rel_notes/release_25_11.rst
(3 hunks)doc/guides/testpmd_app_ug/run_app.rst
(1 hunks)doc/guides/testpmd_app_ug/testpmd_funcs.rst
(3 hunks)doc/guides/tools/dmaperf.rst
(3 hunks)drivers/bus/pci/windows/pci.c
(1 hunks)drivers/bus/pci/windows/pci_netuio.h
(1 hunks)drivers/crypto/ipsec_mb/pmd_aesni_gcm.c
(2 hunks)drivers/crypto/ipsec_mb/pmd_aesni_mb.c
(2 hunks)drivers/crypto/ipsec_mb/pmd_snow3g.c
(1 hunks)drivers/crypto/ipsec_mb/pmd_zuc.c
(1 hunks)drivers/dma/cnxk/cnxk_dmadev.c
(8 hunks)drivers/dma/cnxk/cnxk_dmadev.h
(2 hunks)drivers/dma/cnxk/cnxk_dmadev_fp.c
(13 hunks)drivers/dma/dpaa/dpaa_qdma.c
(1 hunks)drivers/dma/dpaa2/dpaa2_qdma.c
(1 hunks)drivers/dma/hisilicon/hisi_dmadev.c
(4 hunks)drivers/dma/hisilicon/hisi_dmadev.h
(2 hunks)drivers/dma/idxd/idxd_bus.c
(1 hunks)drivers/event/dlb2/dlb2_priv.h
(1 hunks)drivers/event/dlb2/pf/base/dlb2_resource.c
(1 hunks)drivers/net/mlx5/windows/mlx5_ethdev_os.c
(1 hunks)lib/argparse/rte_argparse.c
(14 hunks)lib/argparse/rte_argparse.h
(3 hunks)lib/bbdev/rte_bbdev.c
(2 hunks)lib/dmadev/rte_dmadev.c
(5 hunks)lib/dmadev/rte_dmadev.h
(4 hunks)lib/dmadev/rte_dmadev_core.h
(2 hunks)lib/dmadev/rte_dmadev_trace.h
(1 hunks)lib/dmadev/rte_dmadev_trace_fp.h
(1 hunks)lib/dmadev/rte_dmadev_trace_points.c
(1 hunks)lib/eal/common/eal_common_lcore.c
(2 hunks)lib/eal/common/eal_common_memory.c
(1 hunks)lib/eal/common/eal_common_options.c
(20 hunks)lib/eal/common/eal_common_proc.c
(1 hunks)lib/eal/common/eal_common_thread.c
(1 hunks)lib/eal/common/eal_option_list.h
(1 hunks)lib/eal/common/eal_options.h
(1 hunks)lib/eal/common/eal_private.h
(1 hunks)lib/eal/common/eal_thread.h
(1 hunks)lib/eal/freebsd/eal.c
(13 hunks)lib/eal/include/rte_eal.h
(0 hunks)lib/eal/include/rte_memory.h
(1 hunks)lib/eal/include/rte_string_fns.h
(1 hunks)lib/eal/linux/eal.c
(14 hunks)lib/eal/linux/eal_memory.c
(1 hunks)lib/eal/meson.build
(1 hunks)lib/eal/riscv/include/rte_memcpy.h
(2 hunks)lib/eal/unix/meson.build
(1 hunks)lib/eal/unix/rte_basename.c
(1 hunks)lib/eal/windows/eal.c
(9 hunks)lib/eal/windows/meson.build
(1 hunks)lib/eal/windows/rte_basename.c
(1 hunks)lib/eal/x86/rte_power_intrinsics.c
(1 hunks)lib/eventdev/rte_event_dma_adapter.c
(8 hunks)lib/eventdev/rte_event_dma_adapter.h
(0 hunks)lib/graph/graph.c
(2 hunks)lib/graph/graph_pcap.c
(1 hunks)lib/graph/rte_graph.h
(2 hunks)lib/graph/rte_graph_model_mcore_dispatch.c
(1 hunks)lib/graph/rte_graph_worker_common.h
(1 hunks)lib/meson.build
(1 hunks)lib/pcapng/rte_pcapng.c
(2 hunks)lib/pcapng/rte_pcapng.h
(3 hunks)
💤 Files with no reviewable changes (2)
- lib/eventdev/rte_event_dma_adapter.h
- lib/eal/include/rte_eal.h
🧰 Additional context used
🧬 Code graph analysis (25)
lib/eal/common/eal_private.h (1)
lib/eal/common/eal_common_options.c (2)
eal_collate_args
(218-256)eal_cpuset_to_str
(1619-1685)
lib/eal/windows/rte_basename.c (1)
lib/eal/unix/rte_basename.c (1)
rte_basename
(14-37)
app/test/test_string_fns.c (2)
lib/eal/unix/rte_basename.c (1)
rte_basename
(14-37)lib/eal/windows/rte_basename.c (1)
rte_basename
(9-53)
lib/dmadev/rte_dmadev_trace_fp.h (1)
lib/eal/include/rte_trace_point.h (2)
void
(319-347)void
(349-363)
lib/eal/common/eal_common_lcore.c (1)
lib/eal/common/eal_common_options.c (1)
eal_cpuset_to_str
(1619-1685)
app/test-pmd/testpmd.c (1)
app/test-pmd/cmdline.c (3)
i
(2355-2380)i
(6373-6375)cmdline_read_from_file
(14199-14253)
drivers/dma/cnxk/cnxk_dmadev.h (1)
drivers/dma/cnxk/cnxk_dmadev_fp.c (3)
cnxk_dma_ops_enqueue
(678-738)cn10k_dma_ops_enqueue
(740-787)cnxk_dma_ops_dequeue
(789-821)
app/test/test_func_reentrancy.c (1)
lib/eal/include/rte_lcore.h (1)
rte_lcore_id
(77-81)
lib/eal/unix/rte_basename.c (1)
lib/eal/windows/rte_basename.c (1)
rte_basename
(9-53)
lib/argparse/rte_argparse.h (1)
lib/argparse/rte_argparse.c (1)
rte_argparse_print_help
(903-919)
lib/graph/graph.c (1)
lib/graph/rte_graph_model_mcore_dispatch.c (1)
graph_sched_wq_create
(9-62)
lib/eal/freebsd/eal.c (3)
lib/eal/linux/eal.c (1)
rte_eal_init_alert
(482-485)lib/eal/windows/eal.c (1)
rte_eal_init_alert
(91-95)lib/eal/common/eal_common_options.c (7)
eal_save_args
(319-323)eal_save_args
(360-406)eal_collate_args
(218-256)eal_parse_log_options
(1727-1769)eal_parse_args
(1894-2244)eal_clean_saved_args
(325-329)eal_clean_saved_args
(408-428)
lib/eal/include/rte_string_fns.h (2)
lib/eal/unix/rte_basename.c (1)
rte_basename
(14-37)lib/eal/windows/rte_basename.c (1)
rte_basename
(9-53)
app/test/test_dmadev_api.c (1)
lib/dmadev/rte_dmadev.h (2)
rte_dma_enqueue_ops
(1241-1258)rte_dma_dequeue_ops
(1276-1293)
lib/eal/linux/eal.c (4)
lib/eal/freebsd/eal.c (2)
rte_eal_init_alert
(400-403)rte_config_init
(291-330)lib/eal/windows/eal.c (1)
rte_eal_init_alert
(91-95)lib/eal/common/eal_common_options.c (10)
eal_save_args
(319-323)eal_save_args
(360-406)eal_collate_args
(218-256)eal_parse_log_options
(1727-1769)eal_parse_args
(1894-2244)eal_plugins_init
(553-557)eal_plugins_init
(715-751)eal_option_device_parse
(456-474)eal_clean_saved_args
(325-329)eal_clean_saved_args
(408-428)lib/eal/common/eal_common_lcore.c (1)
rte_eal_cpu_init
(150-232)
app/test/test_argparse.c (1)
lib/argparse/rte_argparse.c (2)
rte_argparse_parse
(922-961)rte_argparse_parse_type
(964-979)
drivers/dma/cnxk/cnxk_dmadev.c (2)
drivers/dma/cnxk/cnxk_dmadev_fp.c (7)
cnxk_dmadev_copy
(242-296)cnxk_dmadev_copy_sg
(298-352)cn10k_dmadev_copy
(354-398)cn10k_dmadev_copy_sg
(400-441)cnxk_dma_ops_enqueue
(678-738)cnxk_dma_ops_dequeue
(789-821)cn10k_dma_ops_enqueue
(740-787)drivers/common/cnxk/roc_model.h (1)
roc_model_is_cn10k
(139-149)
app/test-pmd/testpmd.h (1)
app/test-pmd/cmdline.c (1)
cmdline_read_from_file
(14199-14253)
app/test/test_dmadev.c (3)
lib/mempool/rte_mempool.c (1)
rte_mempool_create
(984-1017)lib/mbuf/rte_mbuf.h (2)
rte_pktmbuf_alloc_bulk
(1006-1046)rte_pktmbuf_free
(1492-1504)lib/dmadev/rte_dmadev.h (2)
rte_dma_enqueue_ops
(1241-1258)rte_dma_dequeue_ops
(1276-1293)
lib/eal/common/eal_options.h (4)
lib/eal/common/eal_common_options.c (12)
eal_parse_log_options
(1727-1769)eal_parse_args
(1894-2244)eal_option_device_parse
(456-474)eal_adjust_config
(2289-2305)eal_cleanup_config
(2279-2287)eal_plugins_init
(553-557)eal_plugins_init
(715-751)eal_save_args
(319-323)eal_save_args
(360-406)eal_clean_saved_args
(325-329)eal_clean_saved_args
(408-428)handle_eal_info_request
(338-358)lib/eal/linux/eal.c (1)
eal_proc_type_detect
(348-372)lib/eal/freebsd/eal.c (1)
eal_proc_type_detect
(264-288)lib/eal/windows/eal.c (1)
eal_proc_type_detect
(46-76)
app/test-dma-perf/benchmark.c (1)
lib/dmadev/rte_dmadev.h (2)
rte_dma_enqueue_ops
(1241-1258)rte_dma_dequeue_ops
(1276-1293)
lib/eal/windows/eal.c (1)
lib/eal/common/eal_common_options.c (8)
eal_save_args
(319-323)eal_save_args
(360-406)eal_collate_args
(218-256)eal_parse_log_options
(1727-1769)eal_parse_args
(1894-2244)eal_option_device_parse
(456-474)eal_clean_saved_args
(325-329)eal_clean_saved_args
(408-428)
lib/eal/common/eal_common_thread.c (1)
lib/eal/common/eal_common_options.c (1)
eal_cpuset_to_str
(1619-1685)
lib/eal/riscv/include/rte_memcpy.h (1)
lib/eal/arm/include/rte_memcpy_32.h (1)
rte_memcpy
(297-301)
lib/eal/common/eal_common_options.c (3)
lib/argparse/rte_argparse.c (3)
rte_argparse_print_help
(903-919)rte_argparse_parse
(922-961)rte_argparse_parse_type
(964-979)lib/eal/linux/eal.c (2)
i
(847-874)eal_proc_type_detect
(348-372)lib/eal/common/eal_common_lcore.c (1)
rte_get_next_lcore
(106-123)
int | ||
cmdline_read_from_file(const char *filename, bool echo) | ||
{ | ||
struct cmdline *cl; | ||
int fd = -1; | ||
int ret = 0; | ||
|
||
/* cmdline_file_new does not produce any output | ||
* so when echoing is requested we open filename directly | ||
* and then pass that to cmdline_new with stdout as the output path. | ||
*/ | ||
if (!echo_cmdline_file) { | ||
if (!echo) { | ||
cl = cmdline_file_new(main_ctx, "testpmd> ", filename); | ||
} else { | ||
/* use basename(filename) as prompt */ | ||
char prompt[32] = "[] "; | ||
|
||
rte_basename(filename, &prompt[1], sizeof(prompt) - strlen(prompt)); | ||
strlcat(prompt, "] ", sizeof(prompt)); | ||
|
||
fd = open(filename, O_RDONLY); | ||
if (fd < 0) { | ||
fprintf(stderr, "Failed to open file %s: %s\n", | ||
filename, strerror(errno)); | ||
return; | ||
return -1; | ||
} | ||
|
||
cl = cmdline_new(main_ctx, "testpmd> ", fd, STDOUT_FILENO); | ||
cl = cmdline_new(main_ctx, prompt, fd, STDOUT_FILENO); | ||
} | ||
if (cl == NULL) { | ||
fprintf(stderr, | ||
"Failed to create file based cmdline context: %s\n", | ||
filename); | ||
ret = -1; | ||
goto end; | ||
} | ||
|
||
cmdline_interact(cl); | ||
cmdline_quit(cl); | ||
/* when done, if we have echo, we only need to print end of file, | ||
* but if no echo, we need to use printf and include the filename. | ||
*/ | ||
if (echo) | ||
cmdline_printf(cl, "<End-Of-File>\n"); | ||
else | ||
printf("Finished reading CLI commands from %s\n", filename); | ||
|
||
cmdline_quit(cl); | ||
cmdline_free(cl); | ||
|
||
printf("Read CLI commands from %s\n", filename); | ||
|
||
end: | ||
if (fd >= 0) | ||
close(fd); | ||
return ret; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Possible FD ownership and stdout-close hazard in cmdline_read_from_file (echo path)
When echo is true, cmdline_new is created with s_out = STDOUT_FILENO and s_in = fd. Many cmdline implementations close s_in/s_out on cmdline_free. If so:
- cmdline_free could close STDOUT_FILENO, breaking subsequent output.
- cmdline_free may also close the input fd while this function also closes it at end, risking double-close.
Recommend dup the FDs before passing them to cmdline_new, and only close your originals here. Also avoid handing cmdline_new a raw STDOUT by duplicating it.
Suggested patch:
int
cmdline_read_from_file(const char *filename, bool echo)
{
struct cmdline *cl;
int fd = -1;
+ int in_fd = -1;
+ int out_fd = -1;
int ret = 0;
@@
- if (!echo) {
+ if (!echo) {
cl = cmdline_file_new(main_ctx, "testpmd> ", filename);
} else {
/* use basename(filename) as prompt */
char prompt[32] = "[] ";
rte_basename(filename, &prompt[1], sizeof(prompt) - strlen(prompt));
strlcat(prompt, "] ", sizeof(prompt));
fd = open(filename, O_RDONLY);
if (fd < 0) {
fprintf(stderr, "Failed to open file %s: %s\n",
filename, strerror(errno));
return -1;
}
- cl = cmdline_new(main_ctx, prompt, fd, STDOUT_FILENO);
+ in_fd = dup(fd);
+ if (in_fd < 0) {
+ fprintf(stderr, "Failed to dup input fd for %s: %s\n",
+ filename, strerror(errno));
+ ret = -1;
+ goto end;
+ }
+ out_fd = dup(STDOUT_FILENO);
+ if (out_fd < 0) {
+ fprintf(stderr, "Failed to dup stdout: %s\n", strerror(errno));
+ ret = -1;
+ goto end;
+ }
+ cl = cmdline_new(main_ctx, prompt, in_fd, out_fd);
+ if (cl == NULL) {
+ fprintf(stderr,
+ "Failed to create file based cmdline context: %s\n",
+ filename);
+ ret = -1;
+ goto end;
+ }
}
- if (cl == NULL) {
- fprintf(stderr,
- "Failed to create file based cmdline context: %s\n",
- filename);
- ret = -1;
- goto end;
- }
@@
- cmdline_quit(cl);
- cmdline_free(cl);
+ cmdline_quit(cl);
+ cmdline_free(cl);
+ /* cl took ownership of in_fd/out_fd (dups). Prevent closing them below. */
+ in_fd = -1;
+ out_fd = -1;
end:
if (fd >= 0)
close(fd);
+ if (in_fd >= 0)
+ close(in_fd);
+ if (out_fd >= 0)
+ close(out_fd);
return ret;
}
This prevents stdout from being closed and avoids double-closing the input stream.
To confirm whether cmdline_free closes s_in/s_out in your environment, you can grep the cmdline implementation:
🏁 Script executed:
#!/bin/bash
# Inspect cmdline_free to see if it closes file descriptors
rg -n -C2 -g '!**/build/**' -g '!**/node_modules/**' 'cmdline_free\s*\('
rg -n -C3 -g '!**/build/**' -g '!**/node_modules/**' 'close\s*\(\s*(.*s_in|.*s_out|.*fd' lib cmdline
Length of output: 3991
I need to examine the cmdline implementation to verify the FD handling concerns. Let me search more carefully.
---
🏁 Script executed:
```shell
#!/bin/bash
# View the cmdline_free implementation
cat lib/cmdline/cmdline.c | head -150 | tail -50
Length of output: 797
🏁 Script executed:
#!/bin/bash
# Search for the cmdline struct definition and s_in/s_out fields
rg -n -B2 -A10 'struct cmdline' lib/cmdline/cmdline.c | head -60
Length of output: 1400
🏁 Script executed:
#!/bin/bash
# Search for close() calls in cmdline.c to see if FDs are closed
rg -n 'close\s*\(' lib/cmdline/cmdline.c
Length of output: 46
🏁 Script executed:
#!/bin/bash
# Look for s_in or s_out usage in cmdline_free or related functions
rg -n -A5 'cmdline_free' lib/cmdline/cmdline.c
Length of output: 146
Avoid double-closing the input FD (no need to dup STDOUT_FILENO)
cmdline_free
closes s_in
if >2 but skips FDs ≤2, so passing your raw fd
to cmdline_new
and then calling close(fd)
at the end will double-close the file. It also never closes STDOUT_FILENO
, so duplicating STDOUT_FILENO
isn’t needed. Fix by either dup’ing only the input FD before cmdline_new
(and closing the original) or by setting fd = -1
after cmdline_free
to skip the final close(fd)
.
🤖 Prompt for AI Agents
In app/test-pmd/cmdline.c around lines 14199-14253, the code passes the raw
input fd to cmdline_new and later closes fd again, risking a double-close
because cmdline_free may close s_in for FDs > 2; fix by duplicating only the
input FD before calling cmdline_new (int in_dup = dup(fd)), pass in_dup to
cmdline_new while immediately closing the original fd, and avoid duplicating or
touching STDOUT_FILENO so it is not closed; alternatively, if you prefer minimal
change, set fd = -1 after cmdline_free to prevent the final close(fd) from
running.
#define ROUNDS 2E7 | ||
#define CPY_LEN 64 | ||
struct rte_mempool *ops_pool, *pkt_pool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ROUNDS: avoid floating literal and reduce test duration
2E7 is a double, causing implicit conversions. Also 20M/max iterations is excessive for app/test. Use an integer literal and consider a lower default.
Apply this minimal type fix:
-#define ROUNDS 2E7
+#define ROUNDS 20000000U
Optionally reduce to a smaller value or derive from capabilities to keep tests fast.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#define ROUNDS 2E7 | |
#define CPY_LEN 64 | |
struct rte_mempool *ops_pool, *pkt_pool; | |
#define ROUNDS 20000000U | |
#define CPY_LEN 64 | |
struct rte_mempool *ops_pool, *pkt_pool; |
🤖 Prompt for AI Agents
In app/test/test_dmadev.c around lines 1059-1061, the macro ROUNDS is defined as
the floating-point literal 2E7 which triggers implicit conversions and is
unnecessarily large for tests; change ROUNDS to an integer literal (for example
20000000U) or a smaller integer default (e.g., 2000000U) and ensure any code
using ROUNDS expects an integer type (use size_t/unsigned long as appropriate)
to avoid floating-to-integer conversions and shorten test duration.
max = info.max_desc - BURST_SIZE; | ||
tenq = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against too-small descriptor rings
max = info.max_desc - BURST_SIZE can be <= 0; loops using n != max will misbehave. Add a capacity check.
Apply this diff:
- max = info.max_desc - BURST_SIZE;
+ if (info.max_desc <= BURST_SIZE)
+ ERR_RETURN("Device queue too small for BURST_SIZE=%u (max_desc=%u)\n",
+ BURST_SIZE, info.max_desc);
+ max = info.max_desc - BURST_SIZE;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
max = info.max_desc - BURST_SIZE; | |
tenq = 0; | |
if (info.max_desc <= BURST_SIZE) | |
ERR_RETURN("Device queue too small for BURST_SIZE=%u (max_desc=%u)\n", | |
BURST_SIZE, info.max_desc); | |
max = info.max_desc - BURST_SIZE; | |
tenq = 0; |
🤖 Prompt for AI Agents
In app/test/test_dmadev.c around lines 1086-1087, the computation max =
info.max_desc - BURST_SIZE can produce a non-positive value which breaks loops
that iterate until n != max; add a capacity check immediately after reading
info.max_desc to ensure info.max_desc > BURST_SIZE. If the check fails, handle
it by logging/printing a clear message and skipping or aborting the test (return
non-zero or exit) rather than continuing with a non-positive max; otherwise
compute max as before and proceed.
memset((void *)(uintptr_t)ops[j]->src_dst_seg[0].addr, | ||
rte_rand() & 0xFF, CPY_LEN); | ||
memset((void *)(uintptr_t)ops[j]->src_dst_seg[1].addr, 0, CPY_LEN); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IOVA used as VA for memset/memcmp — breakage when IOVA != VA
memset/memcmp on (void *)(uintptr_t)addr assumes VA==IOVA. Use mbuf data pointers.
Apply this diff:
- memset((void *)(uintptr_t)ops[j]->src_dst_seg[0].addr,
- rte_rand() & 0xFF, CPY_LEN);
- memset((void *)(uintptr_t)ops[j]->src_dst_seg[1].addr, 0, CPY_LEN);
+ char *src_va = rte_pktmbuf_mtod(mbufs[j], char *);
+ char *dst_va = rte_pktmbuf_mtod(mbufs[j + BURST_SIZE], char *);
+ memset(src_va, rte_rand() & 0xFF, CPY_LEN);
+ memset(dst_va, 0, CPY_LEN);
@@
- /* check the data is correct */
- ssg = ops[j]->src_dst_seg[0];
- dsg = ops[j]->src_dst_seg[1];
- if (memcmp((void *)(uintptr_t)ssg.addr, (void *)(uintptr_t)dsg.addr,
- ssg.length) != 0)
+ /* check the data is correct */
+ struct rte_mbuf *src_m = (struct rte_mbuf *)(uintptr_t)ops[j]->user_meta;
+ struct rte_mbuf *dst_m = (struct rte_mbuf *)(uintptr_t)ops[j]->event_meta;
+ char *src_va2 = rte_pktmbuf_mtod(src_m, char *);
+ char *dst_va2 = rte_pktmbuf_mtod(dst_m, char *);
+ if (memcmp(src_va2, dst_va2, CPY_LEN) != 0)
ERR_RETURN("Error with copy operation\n");
Also applies to: 1144-1151
🤖 Prompt for AI Agents
In app/test/test_dmadev.c around lines 1113-1116 (and similarly 1144-1151), the
code casts src_dst_seg[].addr (an IOVA) to a VA for memset/memcmp which breaks
when IOVA != VA; instead obtain the packet buffer's virtual data pointer (use
the mbuf associated with the segment and rte_pktmbuf_mtod or
rte_pktmbuf_mtod_offset to compute the VA at the segment offset), ensure CPY_LEN
does not exceed the mbuf data_len, and call memset/memcmp on that VA; update
both occurrences accordingly so you use mbuf data pointers rather than casting
IOVA to a pointer.
printf("\n"); | ||
|
||
rte_mempool_free(pkt_pool); | ||
rte_mempool_free(ops_pool); | ||
|
||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Free resources on early error paths
ERR_RETURN returns immediately and leaks pkt_pool/ops_pool. Convert to a cleanup path that always frees pools.
Example pattern:
- rte_mempool_free(pkt_pool);
- rte_mempool_free(ops_pool);
- return 0;
+out:
+ rte_mempool_free(pkt_pool);
+ rte_mempool_free(ops_pool);
+ return ret;
+err:
+ ret = -1;
+ goto out;
Replace ERR_RETURN inside this function with setting ret and goto err.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/test/test_dmadev.c around lines 1159-1164, several ERR_RETURN calls
return early and leak pkt_pool and ops_pool; change those ERR_RETURN occurrences
to set an appropriate ret value and goto err so that the common cleanup at the
end (which frees pkt_pool and ops_pool) always runs. Update the function so the
err label performs rte_mempool_free(pkt_pool) and rte_mempool_free(ops_pool)
only if the pools are non-NULL, and ensure all early-exit paths jump to err
instead of returning directly.
/* basename is on the end, so if path is too long, use only last PATH_MAX bytes */ | ||
const size_t pathlen = strlen(path); | ||
if (pathlen > PATH_MAX) | ||
path = &path[pathlen - PATH_MAX]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truncation may cut path mid-component.
When a path exceeds PATH_MAX, the code truncates to the last PATH_MAX bytes. This could split a path component (e.g., /very/long/path/dirname
might become name
if truncated mid-component), potentially producing incorrect basename results.
Consider documenting this edge case behavior or exploring alternative truncation strategies that preserve path component boundaries.
🤖 Prompt for AI Agents
In lib/eal/unix/rte_basename.c around lines 23 to 26, the current truncation to
the last PATH_MAX bytes can cut a path component in half and yield an incorrect
basename; change the truncation so it preserves component boundaries by, when
pathlen > PATH_MAX, locating the first '/' character inside the retained tail
(i.e., find the first '/' at or after pathlen - PATH_MAX) and set path to the
character after that slash (or to the start of the retained tail if no slash is
found), or alternatively document this edge-case behavior in the function
comment if preserving boundaries is undesired.
int ret = _splitpath_s(path, NULL, 0, dir, sizeof(dir), | ||
fname, sizeof(fname), ext, sizeof(ext)); | ||
if (ret != 0) | ||
return (size_t)-1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value (size_t)-1 may cause confusion.
Returning (size_t)-1
on error results in a very large positive number (SIZE_MAX) rather than a negative error code, which may not be the expected behavior for callers checking for errors.
Consider documenting this error return value clearly in the function's public header documentation, or using a different error indication mechanism if the API allows it.
🤖 Prompt for AI Agents
In lib/eal/windows/rte_basename.c around lines 26-29 the function returns
(size_t)-1 on error (which is SIZE_MAX and can be confusing); either update the
function's public header/doc comment to explicitly state that the function
returns SIZE_MAX on error and callers must check for that value, or change the
implementation to return a more conventional error indicator (e.g., return 0 and
set errno appropriately such as EINVAL) and update all callers/tests to handle
the new convention; if you choose the latter, add errno.h include and ensure the
header doc matches the new behavior.
static void amd_mwaitx(const uint64_t timeout) | ||
{ | ||
RTE_SET_USED(timeout); | ||
#if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__) | ||
_mm_mwaitx(0, 0, 0); | ||
_mm_mwaitx(2, 0, (uint32_t)timeout); | ||
#else | ||
asm volatile(".byte 0x0f, 0x01, 0xfb;" | ||
: /* ignore rflags */ | ||
: "a"(0), /* enter C1 */ | ||
"c"(0)); /* no time-out */ | ||
"b"((uint32_t)timeout), | ||
"c"(2)); /* enable time-out */ | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Verify timeout behavior with test coverage.
The previous implementation completely ignored the timeout parameter (hence the removed RTE_SET_USED(timeout)
mentioned in the AI summary). The new implementation enables timeout functionality for AMD MWAITX. This is a significant behavioral change that requires thorough testing.
Please ensure:
- Unit tests validate that MWAITX wakes up correctly when the timeout expires
- Tests verify timeout precision (especially edge cases near uint32_t limits)
- Tests confirm consistent timeout behavior between Intel and AMD code paths
- Integration tests validate the power monitor API behavior on AMD platforms
Run the following script to check for existing test coverage:
🏁 Script executed:
#!/bin/bash
# Description: Search for test coverage of amd_mwaitx and power monitor timeout functionality
# Search for test files related to power intrinsics
rg -t c -t h --files | rg -i 'test.*power|power.*test'
# Search for test cases covering mwaitx or timeout behavior
rg -n -A 10 -B 2 'amd_mwaitx|MWAITX|power.*timeout' --type c --type h -g '**/*test*'
# Search for references to rte_power_monitor in test code
rg -n -A 5 'rte_power_monitor\(' --type c -g '**/*test*'
Length of output: 115
🏁 Script executed:
#!/bin/bash
# Examine main power test file for timeout coverage
echo "=== Checking test_power.c for power monitor and timeout tests ==="
cat app/test/test_power.c | head -100
echo -e "\n=== Searching for test functions related to power monitor ==="
rg -n 'test_power|power_monitor' app/test/test_power.c | head -20
echo -e "\n=== Checking for timeout-related test logic ==="
rg -n -C 5 'timeout' app/test/test_power.c
echo -e "\n=== Checking for AMD-specific test or initialization ==="
rg -n -C 3 'amd|AMD' app/test/test_power.c
echo -e "\n=== Verifying if amd_mwaitx is tested elsewhere ==="
rg -r 'amd_mwaitx' app/test/
Length of output: 9887
🏁 Script executed:
#!/bin/bash
# Broader search for power_monitor and intrinsics testing
echo "=== Search for rte_power_monitor tests ==="
rg -r 'rte_power_monitor' app/test/ --type c
echo -e "\n=== Search for power intrinsics test files ==="
find app/test -name '*intrinsic*' -o -name '*power_intr*'
echo -e "\n=== Check if power monitor timeout API is tested at all ==="
rg -r 'rte_power_monitor\|rte_power_pause\|power_monitor' app/test/ --type c -C 2
echo -e "\n=== Look at other power test files for timeout coverage ==="
ls -la app/test/test_power*.c
echo -e "\n=== Check header for amd_mwaitx signature ==="
rg -n 'amd_mwaitx' lib/eal/x86/rte_power_intrinsics.c -B 2 -A 8
Length of output: 3504
Add test coverage for AMD mwaitx timeout behavior
- No existing tests cover
amd_mwaitx
or its timeout parameter. - Add unit tests to verify wake-up on timeout expiry, precision near
uint32_t
limits, and parity with the Intel path. - Add integration tests to validate
rte_power_monitor
timeout behavior on AMD platforms.
RTE_SET_USED(timeout); | ||
#if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__) | ||
_mm_mwaitx(0, 0, 0); | ||
_mm_mwaitx(2, 0, (uint32_t)timeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Semantic mismatch between absolute TSC timestamp and relative timeout interval.
The amd_mwaitx
function receives a 64-bit timeout
parameter that appears to be an absolute TSC timestamp (based on the caller passing tsc_timestamp
at line 209), consistent with Intel's umwait
instruction usage (lines 56-69). However, AMD's MWAITX instruction expects a relative timeout interval in EBX, not an absolute timestamp.
Issues:
-
Semantic mismatch: Passing an absolute TSC timestamp as a relative interval will cause incorrect timeout behavior. For example, if the current TSC is 1,000,000,000 and the target timestamp is 1,000,100,000 (100K ticks in the future), truncating to uint32_t and using as a relative interval could produce a timeout of either ~100K ticks (if lucky with truncation) or a wildly incorrect value.
-
Truncation hazard: Casting uint64_t to uint32_t (lines 93, 98) discards the upper 32 bits. For absolute TSC values (typically in the billions or trillions), the truncated value will be incorrect.
-
Inconsistent behavior: Intel UMWAIT and AMD MWAITX paths now have different timeout semantics, breaking API uniformity.
Expected behavior:
For AMD MWAITX, you should:
- Convert the absolute
tsc_timestamp
to a relative interval:timeout_interval = tsc_timestamp - current_tsc
- Clamp to uint32_t max if the interval exceeds 32-bit range
- Pass the relative interval to MWAITX
Apply this diff to fix the relative timeout calculation:
static void amd_mwaitx(const uint64_t timeout)
{
+ uint64_t cur_tsc = rte_rdtsc();
+ uint64_t interval = (timeout > cur_tsc) ? (timeout - cur_tsc) : 0;
+ /* MWAITX timer is 32-bit; clamp if necessary */
+ uint32_t timeout_val = (interval > UINT32_MAX) ? UINT32_MAX : (uint32_t)interval;
+
#if defined(RTE_TOOLCHAIN_MSVC) || defined(__MWAITX__)
- _mm_mwaitx(2, 0, (uint32_t)timeout);
+ _mm_mwaitx(2, 0, timeout_val);
#else
asm volatile(".byte 0x0f, 0x01, 0xfb;"
: /* ignore rflags */
: "a"(0), /* enter C1 */
- "b"((uint32_t)timeout),
+ "b"(timeout_val),
"c"(2)); /* enable time-out */
#endif
}
Additionally, add #include <rte_cycles.h>
for rte_rdtsc()
and #include <stdint.h>
for UINT32_MAX
at the top of the file if not already present.
Also applies to: 98-99
🤖 Prompt for AI Agents
In lib/eal/x86/rte_power_intrinsics.c around lines 93 and 98-99, the code
currently passes an absolute 64-bit TSC timestamp (timeout) directly to AMD's
_mm_mwaitx after truncating to uint32_t, but MWAITX expects a relative interval;
change the logic to read the current TSC via rte_rdtsc(), compute
timeout_interval = (timeout > current_tsc) ? (timeout - current_tsc) : 0, clamp
the interval to UINT32_MAX if it exceeds 32-bit, and pass the resulting
(uint32_t) interval to _mm_mwaitx/_mm_mwaitx_emulation; also add #include
<rte_cycles.h> (for rte_rdtsc()) and #include <stdint.h> (for UINT32_MAX) at the
top of the file if they are not present.
if (graph_from_id(id) == NULL) { | ||
rte_errno = ENOENT; | ||
goto fail; | ||
if (!rte_lcore_is_enabled(lcore)) | ||
SET_ERR_JMP(ENOLINK, fail, "lcore %d not enabled", lcore); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Verify error code consistency.
The code sets rte_errno = ENOENT
when the graph is not found, but graph_from_id()
at line 49 sets rte_errno = EINVAL
for the same condition. This creates an inconsistency.
Consider either:
- Update
graph_from_id()
to returnENOENT
instead ofEINVAL
for consistency across all callers - Remove line 344 and rely on the error code set by
graph_from_id()
Run this script to check if other callers rely on EINVAL from graph_from_id:
🏁 Script executed:
#!/bin/bash
# Find all uses of graph_from_id and check error handling patterns
ast-grep --pattern $'if (graph_from_id($$$) == NULL) {
$$$
}'
Length of output: 159
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the definition of graph_from_id and see what it does
rg -n "^[a-zA-Z_].*graph_from_id" lib/graph/graph.c | head -20
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Find all calls to graph_from_id (not just NULL checks)
rg -n "graph_from_id\(" lib/graph/graph.c -C2
Length of output: 845
🏁 Script executed:
#!/bin/bash
# Look for rte_errno assignments in graph_from_id function
rg -n "EINVAL|ENOENT" lib/graph/graph.c | head -30
Length of output: 289
🏁 Script executed:
#!/bin/bash
# Find the actual graph_from_id function definition with context
rg -n "static.*graph_from_id|^graph_from_id" lib/graph/graph.c -A10
Length of output: 211
🏁 Script executed:
#!/bin/bash
# Check if there are other error handling patterns after graph_from_id calls
rg -B2 -A3 "graph_from_id" lib/graph/graph.c
Length of output: 896
Standardize missing-graph error code.
graph_from_id()
(line 49) currently sets rte_errno = EINVAL
when no match is found, yet the caller at lib/graph/graph.c:343-346 overrides it to ENOENT
. Either update graph_from_id()
to use ENOENT
and remove the override, or drop the override and keep EINVAL
everywhere.
🤖 Prompt for AI Agents
In lib/graph/graph.c around lines 343-346, the call path is setting conflicting
errno values for a missing graph: graph_from_id() currently sets rte_errno =
EINVAL on no-match while the caller here overrides it to ENOENT; standardize
this by updating graph_from_id() to set rte_errno = ENOENT when no graph is
found and then remove the overriding rte_errno = ENOENT at these lines (so
callers rely on graph_from_id() for the canonical error code).
Auto-submission for "http://patchwork.dpdk.org/project/dpdk/list/?series=36374"
Summary by CodeRabbit
Release Notes
New Features
API Changes
Bug Fixes