forked from openvswitch/ovs
sync #2
Merged
Conversation
When the 'mirror-to' option was conceived, it was intended to let users
supply an arbitrary port of their own to use as the mirror port rather
than always using an algorithmically defined one. However, the mirror
port code never accommodated a port that the user defined as part of an
OVS bridge. Supporting that could be useful for running against all
kinds of OVS-specific ports.

Adjust the 'mirror-to' option so that when the user specifies a port,
the utility searches through system interfaces as well as the OVS DB.
This means that we need to drop the port_exists() check for the mirror
port, but that should be okay.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-March/053508.html
Reported-at: https://issues.redhat.com/browse/FDP-1194
Reported-by: Jun Wang <junwang01@cestc.cn>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
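A rough Python sketch of the lookup order described above; the helper names (find_mirror_port, ovsdb_interface_names) and the use of ovs-vsctl are illustrative assumptions, not the actual ovs-tcpdump code:

```python
# Hypothetical sketch: accept a user-supplied mirror port if it exists
# either as a system interface or as an Interface row in the OVS DB.
import socket
import subprocess


def ovsdb_interface_names():
    """Return interface names known to the OVS DB (queried via ovs-vsctl)."""
    out = subprocess.check_output(
        ["ovs-vsctl", "--bare", "--columns=name", "list", "Interface"],
        text=True)
    return {line.strip() for line in out.splitlines() if line.strip()}


def find_mirror_port(name):
    """Accept 'name' if the kernel or the OVS DB knows about it."""
    system_ifaces = {ifname for _, ifname in socket.if_nameindex()}
    if name in system_ifaces or name in ovsdb_interface_names():
        return name
    raise ValueError("mirror port '%s' not found" % name)
```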
Load dictionary_code.txt in addition to the default dictionary. Add a
new command line argument --dictionaries to be able to specify
codespell dictionaries.

Acked-by: Salem Sol <salems@nvidia.com>
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
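A minimal sketch of how such an option could be wired up, assuming a wrapper that shells out to the codespell CLI (the real script may drive codespell differently); the --dictionaries handling below is illustrative:

```python
# Hedged sketch: pass extra dictionaries to codespell while keeping
# its built-in dictionary ('-D -') enabled.
import argparse
import subprocess

parser = argparse.ArgumentParser()
parser.add_argument("--dictionaries", nargs="*",
                    default=["dictionary_code.txt"],
                    help="extra codespell dictionary files")
parser.add_argument("files", nargs="+")
args = parser.parse_args()

cmd = ["codespell", "-D", "-"]          # '-' keeps the default dictionary.
for path in args.dictionaries:
    cmd += ["-D", path]
subprocess.run(cmd + args.files, check=False)
```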
When a new bucket is inserted into an existing group, OVS creates a new
group with just that one bucket and then copies the old buckets into
it. The problem is that the dp_hash mappings remain initialized for
that one bucket only, so no traffic can be sent to any of the old
buckets. If a bucket is removed, OVS creates an empty new group and
then copies the old buckets into it, except for the removed one.
Mappings are not updated in this case either, and the group behaves as
if it had no buckets at all.

We need to re-hash all the buckets after the copy of the old buckets.
The ofproto-provider API already has a callback for this case, but it
just wasn't implemented for ofproto-dpif. It wasn't necessary in the
past, but it became necessary when the hash map construction was moved
to the ofproto-dpif layer.

No locking in this function is required, because the new group must
not be visible during modification. It becomes visible only after the
GROUP_MOD processing is finished.

Fixes: 2e3fd24 ("ofproto-dpif: Improve dp_hash selection method for select groups")
Reported-at: openvswitch/ovs-issues#357
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
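An illustrative Python sketch of the failure mode and the fix (the real code is C in ofproto-dpif; names like build_hash_map are made up):

```python
# A select group keeps a hash -> bucket table.  If the table is built
# while the group holds only the newly inserted bucket and the old
# buckets are copied in afterwards, every hash still points at that
# one bucket, so the copy must be followed by a full re-hash.
N_HASH = 16


def build_hash_map(buckets):
    """Map every hash value onto one of the current buckets."""
    if not buckets:
        return [None] * N_HASH
    return [buckets[h % len(buckets)] for h in range(N_HASH)]


# New group created with just the inserted bucket...
group = {"buckets": ["new-bucket"]}
group["hash_map"] = build_hash_map(group["buckets"])

# ...then the old buckets are copied in: without a re-hash, all traffic
# would keep hitting "new-bucket".
group["buckets"] += ["old-bucket-1", "old-bucket-2"]
group["hash_map"] = build_hash_map(group["buckets"])
assert set(group["hash_map"]) == set(group["buckets"])
```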
While creating passive TCP connections, the library stores part of the
connection method as a bind path regardless of whether it is actually
a path. It may then attempt to unlink it on close:
  fatal-signal | WARN | could not unlink "127.0.0.1:58866"
  (No such file or directory)
Unlinking only makes sense for unix sockets, not TCP, so the variable
should only be initialized for the "punix" case. It's not a big
problem, since those files are unlikely to exist, but it generates
strange warnings in the logs.
Fixes: af35823 ("python: Add TCP passive-mode to IDL.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
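A minimal Python sketch of the intended behavior; the class and function names (PassiveStream, open_passive) are illustrative, not the actual library API:

```python
# Hedged sketch: only "punix:" targets name a filesystem path worth
# unlinking on close; passive TCP targets never set a bind path.
import os


class PassiveStream:
    def __init__(self, bind_path=None):
        self.bind_path = bind_path

    def close(self):
        if self.bind_path is not None:
            try:
                os.unlink(self.bind_path)
            except FileNotFoundError:
                pass
            self.bind_path = None


def open_passive(name):
    method, _, suffix = name.partition(":")
    bind_path = suffix if method == "punix" else None
    return PassiveStream(bind_path=bind_path)


assert open_passive("ptcp:127.0.0.1:6640").bind_path is None
assert open_passive("punix:/tmp/db.sock").bind_path == "/tmp/db.sock"
```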
The weak reference test sends two transactions and then waits for an
update and sends a third transaction. The first transaction adds a row
and the second one deletes it. The test relies on an IDL run between
the second and the third transactions to receive and process updates
from both previous transactions, and that is not guaranteed. This is
causing frequent test failures in CI.

If the server is slow to send an update for the second transaction,
the IDL run will return without receiving it, and the test application
has no way to know that it needs to wait for one more update. The
change sequence number doesn't have a meaning other than that it
changes on updates, so we can't rely on it to wait for two updates.

The strong reference test has a somewhat similar issue: it sends two
transactions and then expects both updates to be received together.
That is, again, not guaranteed. However, the difference here is that
the issue (crash) it tries to reproduce requires both updates to be
received within the same IDL run, which was not a hard requirement for
the weak reference test. The weak reference test tolerates receiving
updates in two separate IDL runs, so we could have fixed it by waiting
for an update after the first transaction without clearing the
tracking and then waiting again after the second. But such an approach
would not work for the strong reference test. And there is actually no
way for us to ensure that both updates will be received within the
same IDL run. The best we can do is to sleep for some time, hoping
that it's enough for both updates to be queued up. So, let's do that.
Tests are very simple and fast, so sleeping for 1 second should be
enough.

The 'refs to link[12]' tests are similar to the strong reference test
in that they require changes to be processed within the same IDL run.

Fixes: 02f31a1 ("ovsdb-idl: Preserve references for rows deleted in same IDL run as their insertion.")
Fixes: 91e1ff5 ("ovsdb-idl: Don't reparse orphaned rows.")
Fixes: 4102674 ("ovsdb-idl: Preserve change_seqno when deleting rows.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
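A rough sketch of the test-side workaround, with hypothetical helpers (send_transaction, idl_run) standing in for the real test harness:

```python
# Hedged sketch: there is no event meaning "two updates are pending",
# so sleep long enough for both updates to be queued, then let a
# single IDL run observe them together.
import time


def run_reference_test(send_transaction, idl_run):
    send_transaction("insert referenced row")
    send_transaction("delete referenced row")
    time.sleep(1)   # tests are small; 1 second is plenty for both updates
    idl_run()       # both updates should now arrive in this one run
```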
This addition uses the existing syscall probes to record statistics
related to the OVS 'Unreasonably long ... ms poll interval' message.
Basically, it records the min/max/average time between system poll
calls. This can be used to determine if a long poll event has occurred
during the capture.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
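An illustrative Python sketch of the statistic being collected (not the USDT/BCC script itself): track the min/max/average gap between successive poll() timestamps.

```python
# Hedged sketch: feed it one timestamp per poll() call and it keeps
# running min/max/average intervals.
class PollIntervalStats:
    def __init__(self):
        self.last = None
        self.count = 0
        self.total = 0.0
        self.min = None
        self.max = None

    def record(self, timestamp):
        if self.last is not None:
            delta = timestamp - self.last
            self.count += 1
            self.total += delta
            self.min = delta if self.min is None else min(self.min, delta)
            self.max = delta if self.max is None else max(self.max, delta)
        self.last = timestamp

    def summary(self):
        avg = self.total / self.count if self.count else 0.0
        return {"min": self.min, "max": self.max, "avg": avg}


stats = PollIntervalStats()
for ts in [0.0, 0.1, 0.35, 1.4]:    # seconds at which poll() was entered
    stats.record(ts)
print(stats.summary())              # a large max hints at a long poll event
```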
Since we use a floating user (a user with a dynamic uid) and floating
groups (groups with dynamic gids), the uid/gid of the directory may
change when bootc is used, so the uid/gid needs to be brought up to
date each time the daemon is started. ovsdb-server.service is the
correct place to do that, since ovs-vswitchd.service uses
After=ovsdb-server.service and is therefore always started after it.

See bootc-dev/bootc#673 (comment)

Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Checkpatch has the ability to scan source files rather than
exclusively scanning patch files. In this mode, some checks get
disabled (for example, all of the header and commit message parsing)
and the display modes are adjusted to simply print the file and line
rather than including the patch line details.

When checkpatch was updated for subject parsing, it inadvertently
ignored the source file mode. This means any checks run against a
source file will always print a bogus subject suggestion and warning.
Fix this by not printing these warnings when in source file mode. This
should have no effect on checkpatch against actual patch files.

Additionally, introduce two simple tests to verify that the source
file mode still functions.

Fixes: 1b8fa4a ("checkpatch: Add checks for the subject line.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
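A hedged sketch of the fix's logic with a hypothetical helper name (check_subject); the specific subject rules below are illustrative, not the exact checkpatch checks:

```python
# Subject-line checks only apply when parsing a patch or commit
# message, so skip them entirely in source-file mode.
def check_subject(subject, *, source_file_mode):
    if source_file_mode:
        return []               # plain source files have no subject line
    warnings = []
    if len(subject) > 60:
        warnings.append("subject line is too long")
    if not subject.rstrip().endswith("."):
        warnings.append("subject should end with a period")
    return warnings


print(check_subject("util.c", source_file_mode=True))    # [] - no bogus warning
print(check_subject("netdev: Fix a leak", source_file_mode=False))
```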
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
flake8 2.7.0 introduced the F824 check [0]. While the check sounds
reasonable on the surface, we have code in the repository where the
global variable is used only for access. My conclusion from evaluating
the alternative of adjusting the code to appease the check is that it
would invite future mistakes where the global variable is
unintentionally shadowed, in addition to making the code less clear
and explicit.

0: PyCQA/flake8#1974
1: PyCQA/pyflakes#825

Signed-off-by: Frode Nordahl <fnordahl@ubuntu.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
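A small example of the pattern F824 flags, for context:

```python
# F824 fires on a 'global' statement for a name that is only read,
# never assigned, in that scope.  Dropping the statement would silence
# the check but makes the module-level dependency implicit and easier
# to shadow by accident later.
counters = {}


def report():
    global counters     # F824: 'counters' is never assigned in this scope.
    for name, value in counters.items():
        print("%s: %d" % (name, value))


counters["upcalls"] = 3
report()
```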
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
All the calls to put_drop_action() first check whether it is
supported. Instead, embed the check inside the function itself.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
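A Python sketch of the refactor pattern (the real code is C in the datapath action path; drop_action_supported here is a stand-in for the actual capability probe):

```python
# Hedged sketch: the support check moves inside put_drop_action(), so
# callers no longer repeat it at every call site.
def drop_action_supported():
    return False                # stand-in for the real capability probe


def put_drop_action(actions, reason):
    if not drop_action_supported():
        return                  # silently a no-op when unsupported
    actions.append(("drop", reason))


actions = []
put_drop_action(actions, reason="flow not offloadable")
print(actions)                  # [] when drop actions are unsupported
```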
Rebalancing stats are printed once per run at the INFO log level. The
old detailed per-hash rebalancing stats are now printed at the DBG log
level.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2025-March/422028.html
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Vladislav Odintsov <vlodintsov@k2.cloud>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When the cluster falls apart, there is no leader, so there are no
heartbeats that need to be sent or received on time. In this case the
RAFT_TIMER_THRESHOLD is too small, as the process will only wake up
once per full election timer. When this happens, the cooperative
multitasking code emits warnings and traces, thinking there is a timer
overrun.

Fix that by setting the multitasking interval according to the full
election timer if there is no known leader.

Fixes: d4a1564 ("ovsdb: raft: Enable cooperative multitasking.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
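A simplified Python sketch of the interval selection (the constant and names are illustrative, not the raft.c values):

```python
# Hedged sketch: with a known leader, heartbeats must go out well
# before the election timer expires, so a small threshold is used;
# with no leader the process legitimately sleeps for a full election
# timer, so the interval stretches accordingly.
RAFT_TIMER_THRESHOLD_MS = 500   # illustrative value


def multitasking_interval_ms(election_timer_ms, have_leader):
    if have_leader:
        return RAFT_TIMER_THRESHOLD_MS
    return election_timer_ms + RAFT_TIMER_THRESHOLD_MS


print(multitasking_interval_ms(1000, have_leader=True))     # 500
print(multitasking_interval_ms(1000, have_leader=False))    # 1500
```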
These logs have proven useful while debugging complex leadership
change scenarios.

Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Consider the following scenario in a 3-node cluster:

1. node-1 goes down for some reason without leaving the cluster.
2. Some changes are committed to the log.
3. node-2 requests to leave the cluster with cluster/leave.
4. node-3 also requests to leave the cluster.
5. node-1 comes back online.

In this scenario the cluster breaks today and doesn't recover. The
reason is that both node-2 and node-3 are in the 'leaving' state, so
they do not initiate elections. node-1 is behind on log updates and
can't become a leader, but there is no leader to catch it up. In order
for the cluster to recover in this situation, one of the leaving
servers must become a leader again, then catch node-1 up with the log
updates; then node-1 can become a new leader, process both server
removal requests and become an operational cluster of 1.

The 'leaving' state is not a real state of the server in RAFT, and the
server should still participate in the cluster until the removal is
committed by the majority of the NEW configuration. This can be
achieved by allowing a server in the 'leaving' state to initiate
elections and become a leader again this way. Becoming a leader also
means that this server will need to be able to execute commands and do
all the normal tasks of a leader. Since the leader can't leave the
cluster though, this server will need to attempt to transfer
leadership again in order to actually leave. This should be done after
a leave timeout. The time between becoming a leader and transferring
the leadership will allow node-1 to get up to speed with all the
cluster updates and be ready to become a new leader.

Sending a server removal request right after transferring leadership
can't succeed, because the other server has to go through an election
in order to become a leader before it can process a removal request.
So, add a delay between the transfer and the removal request. It can
be lower than the election timer, just to avoid waiting for too long
if the election timer has a large value. But it should be randomized,
so multiple leaving servers do not just bounce the leadership without
actually getting to the server removal request.

A test is added to stress different scenarios with servers leaving
while some of the cluster members are down.

Fixes: 1b1d2e6 ("ovsdb: Introduce experimental support for clustered databases.")
Reported-at: https://issues.redhat.com/browse/FDP-662
Tested-by: Terry Wilson <twilson@redhat.com>
Acked-by: Terry Wilson <twilson@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
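A hedged Python sketch of the randomized delay described above (the bounds and cap are illustrative, not the values used in the patch):

```python
# Hedged sketch: wait a randomized interval, shorter than the election
# timer and capped for very large timers, before sending the server
# removal request, so several leaving servers don't keep bouncing
# leadership between each other.
import random

DELAY_CAP_MS = 5000             # illustrative cap


def removal_request_delay_ms(election_timer_ms):
    base = min(election_timer_ms, DELAY_CAP_MS)
    return random.uniform(base / 2, base)


print(removal_request_delay_ms(1000))   # somewhere between 500 and 1000 ms
```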
Without this patch, the chances of events being lost on highly loaded
systems increase.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Generate events for dropped upcalls and print the result.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
This patch makes it possible to display only succeeded or errored
upcalls.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
The upcall_cost.py script has a port mapping cache that can be very
useful for other scripts. Move it into a library file (usdt_lib.py).

While we're at it, refactor it into a class, since setting attributes
on functions and having them return different types depending on input
is not very clean.

This patch should not introduce any functional change.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
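A rough sketch of the kind of class such a library could expose; the name DpPortMapping and the use of ovs-dpctl are illustrative assumptions, not the actual usdt_lib.py API:

```python
# Hedged sketch: cache datapath port number -> port name mappings so
# individual scripts don't have to re-resolve them for every event.
import subprocess


class DpPortMapping:
    def __init__(self):
        self.cache = {}

    def _fill_cache(self):
        # 'ovs-dpctl show' lists lines such as "port 1: br0 (internal)".
        out = subprocess.check_output(["ovs-dpctl", "show"], text=True)
        for line in out.splitlines():
            line = line.strip()
            if line.startswith("port "):
                number, _, rest = line[len("port "):].partition(": ")
                self.cache[int(number)] = rest.split()[0]

    def get_port_name(self, port_no):
        if not self.cache:
            self._fill_cache()
        return self.cache.get(port_no)
```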
Printing just the datapath on each upcall gives little information
(most often, there will only be one well-known datapath). Instead,
print the input port name as well as the datapath.

In order to do this, refactor decode_nla to always generate the dump
that only gets printed if needed. That way it can be called earlier
on.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Drop events are generated from the kernel, not ovs-vswitchd, so
instead of relying on nla decoding, copy the skb's device name
directly into the event.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Use pcapng instead of pcap format and store the result, the key (if
available) and the input port name so they are visible in
wireshark/tshark.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
If the dp cache request misses, retry, but only a couple of times so
we don't overload the system, and only if we're not running on an
externally provided cache.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
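A minimal sketch of that retry policy; the helper names and the retry count are illustrative:

```python
# Hedged sketch: on a cache miss, refresh the cache at most a couple
# of times, and never when the cache was provided externally.
MAX_RETRIES = 2


def lookup_port_name(cache, port_no, refresh, external_cache=False):
    if port_no in cache:
        return cache[port_no]
    if external_cache:
        return None             # don't second-guess a provided cache
    for _ in range(MAX_RETRIES):
        cache.update(refresh())
        if port_no in cache:
            return cache[port_no]
    return None


# Example: the second refresh finally learns about port 3.
snapshots = iter([{1: "br0"}, {1: "br0", 3: "eth0"}])
print(lookup_port_name({}, 3, refresh=lambda: next(snapshots)))   # eth0
```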