@ovsrobot ovsrobot commented Oct 27, 2025

Auto-submission for "http://patchwork.dpdk.org/project/dpdk/list/?series=36474"

Summary by Sourcery

Add network namespace support and netlink-based control to the tap PMD, replacing direct ioctl calls and enabling dynamic interface movement handling.

New Features:

  • Introduce tap_ctrl to choose netlink or ioctl control path and implement netlink-based get/set operations for link flags, MAC address, and MTU.
  • Add tap_netns_change and subscribe to RTM_DELLINK events to detect and handle interfaces moving across network namespaces using TUNGETDEVNETNS.
  • Implement new netlink helper functions in tap_netlink.c for managing interface properties by ifindex.

Enhancements:

  • Rename ioctl_mode to ctrl_mode and replace direct tap_ioctl calls with unified tap_ctrl across the driver.
  • Initialize a netlink socket and track interface index during device creation with an ioctl fallback when netlink is unavailable.
  • Move nlsk_fd out of the HAVE_TCA_FLOWER guard, ensure proper cleanup, and consolidate netlink subscription setup in LSC interrupt handling.

Summary by CodeRabbit

Release Notes

  • New Features

    • The TAP device driver now uses a netlink-based control path as its primary method, with automatic fallback to the legacy ioctl path when netlink is unavailable
    • Added network namespace awareness to detect and handle interfaces that move across namespaces
    • Link state change handling now subscribes to RTMGRP_LINK netlink events
  • Bug Fixes

    • The netlink socket is now cleaned up when the device is closed

Add functions to get/set link flags, MAC address, and MTU using netlink
RTM_GETLINK/RTM_SETLINK messages instead of ioctl.

These will be used in the next commits for a more robust solution that
does not rely on interface names.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>

Prepare to replace ioctl with netlink by renaming enum ioctl_mode to
ctrl_mode and wrapping tap_ioctl with tap_ctrl.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>

Make netlink socket available unconditionally, not just for rte_flow.

Use netlink for get/set operations on link flags, MAC, and MTU when
available. Fall back to ioctl if netlink socket creation fails.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>

When an interface is moved to another network namespace, the kernel
sends RTM_DELLINK. Detect this case by using TUNGETDEVNETNS ioctl on the
keep-alive fd. If successful, the interface still exists but in
a different namespace.

To handle this, temporarily switch to the new namespace using setns(),
query the new ifindex, recreate netlink and LSC interrupt sockets in
that namespace, then switch back. Replace the old netlink socket with
the new one so subsequent operations work in the target namespace.

This allows the driver to track interfaces across namespace changes
without losing control.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>

@sourcery-ai

sourcery-ai bot commented Oct 27, 2025

Reviewer's Guide

This PR refactors the TAP PMD to abstract interface control via a new tap_ctrl layer that uses netlink when available (falling back to ioctl), implements support for detecting and handling network namespace moves, and adds a suite of netlink helper functions for flag, MTU, and MAC management.
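
The dispatch described above can be sketched roughly as follows. This is an illustration only, not the patch's code: `struct ctrl_ctx`, `ctrl_dispatch`, `demo_nl`, and `demo_ioctl` are hypothetical names, and the enum values merely reuse the mode names quoted in the review. The sketch assumes, as the commit message states, that the ioctl path is chosen only when no netlink socket is available.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; names are illustrative. */
enum ctrl_mode { LOCAL_AND_REMOTE, LOCAL_ONLY, REMOTE_ONLY };

struct ctrl_ctx {
    int nlsk_fd; /* netlink socket, or -1 when unavailable */
    int (*nl_op)(struct ctrl_ctx *ctx, unsigned long request,
                 enum ctrl_mode mode);
    int (*ioctl_op)(struct ctrl_ctx *ctx, unsigned long request,
                    enum ctrl_mode mode);
};

/* Prefer the netlink backend when a socket exists, else use ioctl. */
static int
ctrl_dispatch(struct ctrl_ctx *ctx, unsigned long request, enum ctrl_mode mode)
{
    assert(ctx != NULL);

    if (ctx->nlsk_fd >= 0 && ctx->nl_op != NULL)
        return ctx->nl_op(ctx, request, mode);
    if (ctx->ioctl_op != NULL)
        return ctx->ioctl_op(ctx, request, mode);
    return -ENOTSUP;
}

/* Example backends that only report which path ran (1 = netlink, 2 = ioctl). */
static int
demo_nl(struct ctrl_ctx *ctx, unsigned long request, enum ctrl_mode mode)
{
    (void)ctx; (void)request; (void)mode;
    return 1;
}

static int
demo_ioctl(struct ctrl_ctx *ctx, unsigned long request, enum ctrl_mode mode)
{
    (void)ctx; (void)request; (void)mode;
    return 2;
}
```

With `nlsk_fd` set to a valid descriptor the call goes to `demo_nl`; setting it to -1 routes the same request to `demo_ioctl`, which mirrors the fallback the guide describes.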

Sequence diagram for TAP PMD interface control abstraction (netlink fallback to ioctl)

sequenceDiagram
    participant App
    participant TAP_PMD
    participant Netlink
    participant Ioctl
    App->>TAP_PMD: Request interface control (e.g., set MAC, MTU)
    TAP_PMD->>Netlink: Try netlink operation (tap_ctrl)
    alt Netlink available
        Netlink-->>TAP_PMD: Success/failure
    else Netlink unavailable
        TAP_PMD->>Ioctl: Fallback to ioctl operation
        Ioctl-->>TAP_PMD: Success/failure
    end
    TAP_PMD-->>App: Return result

Sequence diagram for network namespace move detection and handling

sequenceDiagram
    participant Kernel
    participant TAP_PMD
    participant Netlink
    participant Namespace
    Kernel->>Netlink: RTM_DELLINK event (interface moved)
    Netlink->>TAP_PMD: Notify of RTM_DELLINK
    TAP_PMD->>Namespace: Query new netns via TUNGETDEVNETNS
    TAP_PMD->>Namespace: setns() to new namespace
    TAP_PMD->>Netlink: Recreate netlink socket in new namespace
    TAP_PMD->>Namespace: setns() back to original namespace
    TAP_PMD-->>Netlink: Update ifindex and sockets

Class diagram for TAP PMD interface control abstraction and netlink helpers

classDiagram
    class pmd_internals {
        int remote_if_index
        int if_index
        int ioctl_sock
        int nlsk_fd
        ...
    }
    class tap_ctrl {
        +tap_ctrl(pmd, request, ifr, set, mode)
    }
    class tap_nl_ctrl {
        +tap_nl_ctrl(pmd, request, ifr, set, mode)
    }
    class tap_ioctl {
        +tap_ioctl(pmd, request, ifr, set, mode)
    }
    class NetlinkHelpers {
        +tap_nl_get_link_flags()
        +tap_nl_set_link_flags()
        +tap_nl_set_link_mtu()
        +tap_nl_set_link_mac()
        +tap_nl_get_link_mac()
    }
    pmd_internals --> tap_ctrl
    tap_ctrl --> tap_nl_ctrl
    tap_ctrl --> tap_ioctl
    tap_nl_ctrl --> NetlinkHelpers

Class diagram for network namespace change handling

classDiagram
    class pmd_internals {
        int if_index
        int nlsk_fd
        int ka_fd
        char name[]
        ...
    }
    class tap_netns_change {
        +tap_netns_change(dev)
    }
    pmd_internals --> tap_netns_change
    tap_netns_change : - open("/proc/self/ns/net")
    tap_netns_change : - setns(netns_fd, CLONE_NEWNET)
    tap_netns_change : - update if_index
    tap_netns_change : - recreate netlink socket
    tap_netns_change : - setns(orig_netns_fd, CLONE_NEWNET)
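
The save, switch, work, restore sequence in this diagram can be sketched as a small helper. This is a hedged illustration under stated assumptions, not the driver's implementation: `run_in_netns`, `netns_fn`, and `count_call` are hypothetical names. It assumes a glibc system where `_GNU_SOURCE` exposes `setns()` and `CLONE_NEWNET` through `<sched.h>`, and it needs sufficient privileges to actually enter another namespace.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes setns() and CLONE_NEWNET in <sched.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical callback type: work to perform inside the target namespace. */
typedef int (*netns_fn)(void *arg);

/*
 * Run fn(arg) inside the network namespace referred to by netns_fd, then
 * return to the caller's namespace. Returns fn's result, or a negative
 * errno value when a namespace switch fails.
 */
static int
run_in_netns(int netns_fd, netns_fn fn, void *arg)
{
    int orig_fd, ret, err;

    assert(fn != NULL);

    /* Keep a handle on the current namespace so we can come back. */
    orig_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
    if (orig_fd < 0)
        return -errno;

    if (setns(netns_fd, CLONE_NEWNET) < 0) {
        err = errno; /* save before close() can overwrite it */
        close(orig_fd);
        return -err;
    }

    ret = fn(arg);

    /* Failing here leaves the calling thread in the target namespace. */
    if (setns(orig_fd, CLONE_NEWNET) < 0)
        ret = -errno;

    close(orig_fd);
    return ret;
}

/* Example callback: counts invocations through an int pointer. */
static int
count_call(void *arg)
{
    ++*(int *)arg;
    return 0;
}
```

Note that `setns()` changes the namespace of the calling thread only, so sockets opened inside the callback belong to the target namespace while other threads are unaffected.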

File-Level Changes

Change: Abstract interface control to use netlink when available
Details:
  • Rename ioctl_mode to ctrl_mode
  • Introduce tap_nl_ctrl and tap_ctrl to dispatch netlink or ioctl
  • Replace all tap_ioctl calls with tap_ctrl
  • Initialize and clean up nlsk_fd in rte_eth_tap.c and the pmd_internals struct
  • Update dev creation and close paths to handle netlink fallback
Files:
  • drivers/net/tap/rte_eth_tap.c
  • drivers/net/tap/rte_eth_tap.h

Change: Support network namespace moves for TAP devices
Details:
  • Implement tap_netns_change to detect namespace relocation via TUNGETDEVNETNS
  • Update tap_nl_msg_handler to handle RTM_DELLINK events and trigger netns change logic
  • Ensure LSC interrupt subscription includes RTMGRP_LINK for namespace events
Files:
  • drivers/net/tap/rte_eth_tap.c

Change: Add netlink helper functions for link management
Details:
  • Implement tap_nl_get/set_link_flags, tap_nl_set_link_mtu, tap_nl_get/set_link_mac in tap_netlink.c
  • Add link management prototypes to tap_netlink.h
Files:
  • drivers/net/tap/tap_netlink.c
  • drivers/net/tap/tap_netlink.h

Change: Clean up and adjust headers
Details:
  • Add linux/sched.h include in rte_eth_tap.c
  • Include net/if.h in tap_netlink.c and remove redundant includes in headers
  • Add rte_ether.h include in tap_netlink.h
Files:
  • drivers/net/tap/rte_eth_tap.c
  • drivers/net/tap/tap_netlink.c
  • drivers/net/tap/tap_netlink.h


@coderabbitai

coderabbitai bot commented Oct 27, 2025

Walkthrough

This pull request introduces a netlink-based control path abstraction layer for the TAP PMD driver, adding netlink socket management, namespace awareness, and new link management APIs while maintaining backward compatibility with existing ioctl-based operations through a fallback mechanism.

Changes

Cohort: Control path abstraction and ioctl-to-netlink wrapper
File(s): drivers/net/tap/rte_eth_tap.c
Summary: Renamed enum ioctl_mode to enum ctrl_mode; updated tap_ioctl signature to accept the new enum type; introduced tap_ctrl wrapper that selects between netlink (tap_nl_ctrl) and ioctl paths; added netlink-based implementations for SIOCSIFFLAGS, SIOCGIFFLAGS, SIOCSIFHWADDR, SIOCGIFHWADDR, and SIOCSIFMTU; updated all call sites from tap_ioctl to tap_ctrl; added namespace detection (tap_netns_change) with TUNGETDEVNETNS logic; enhanced LSC interrupt handling to subscribe to RTMGRP_LINK events; updated logging to use tap_ctrl_req2str.

Cohort: Structure reorganization
File(s): drivers/net/tap/rte_eth_tap.h
Summary: Moved nlsk_fd field outside the #ifdef HAVE_TCA_FLOWER conditional block to make netlink socket descriptor unconditionally available.

Cohort: New netlink link management APIs
File(s): drivers/net/tap/tap_netlink.c
Summary: Added five new link management functions: tap_nl_get_link_flags, tap_nl_set_link_flags, tap_nl_set_link_mtu, tap_nl_get_link_mac, tap_nl_set_link_mac; introduced tap_nl_link_cb callback and link_info_ctx structure to parse RTM_NEWLINK/RTM_SETLINK responses; implemented request/response mechanism using netlink attributes (IFLA_ADDRESS, IFLA_MTU).

Cohort: Netlink link management declarations
File(s): drivers/net/tap/tap_netlink.h
Summary: Added rte_ether.h include; declared five new public link management function prototypes; removed unused ctype.h and stdio.h includes.

Sequence Diagram

sequenceDiagram
    participant caller as PMD Control Path
    participant ctrl as tap_ctrl() wrapper
    participant nlctrl as tap_nl_ctrl() netlink
    participant ioctrl as tap_ioctl() ioctl
    participant kernel as Kernel

    caller->>ctrl: tap_ctrl(request, mode)
    
    alt nlsk_fd available
        ctrl->>nlctrl: tap_nl_ctrl(request, mode)
        nlctrl->>kernel: netlink: RTM_SETLINK / RTM_GETLINK
        kernel-->>nlctrl: response + attributes
        nlctrl-->>ctrl: result (0 or -1)
    else nlsk_fd unavailable (fallback)
        ctrl->>ioctrl: tap_ioctl(request, mode)
        ioctrl->>kernel: ioctl(SIOCSIFFLAGS / SIOCGIFMTU / ...)
        kernel-->>ioctrl: result
        ioctrl-->>ctrl: result (0 or -1)
    end
    
    ctrl-->>caller: return status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • rte_eth_tap.c: Extensive refactoring affecting numerous call sites across MAC, MTU, flags, and promisc operations; requires careful verification of netlink/ioctl fallback logic and namespace handling (tap_netns_change).
  • Resource lifecycle: nlsk_fd finalization order and error path handling need careful inspection.
  • New netlink functions in tap_netlink.c: Five new implementations with callback-based parsing; verify proper error handling (ENODEV) and attribute extraction logic.
  • Integration verification: Ensure all updated call sites correctly pass updated enum ctrl_mode parameters and maintain backward compatibility.
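
To make the attribute extraction mentioned above concrete, here is a minimal sketch of walking the attributes of an RTM_NEWLINK message. It is an assumption-laden illustration rather than the patch's tap_nl_link_cb: `struct link_info` and `parse_newlink` are hypothetical names, and only IFLA_MTU and IFLA_ADDRESS are handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical result container; the patch's link_info_ctx may differ. */
struct link_info {
    unsigned int flags; /* ifi_flags from the ifinfomsg header */
    uint32_t mtu;       /* IFLA_MTU, 0 if absent */
    uint8_t mac[6];     /* IFLA_ADDRESS, valid only when has_mac is set */
    int has_mac;
};

/* Extract flags, MTU and MAC from one RTM_NEWLINK message. */
static int
parse_newlink(struct nlmsghdr *nh, struct link_info *out)
{
    struct ifinfomsg *ifi;
    struct rtattr *rta;
    int len;

    assert(nh != NULL && out != NULL);

    if (nh->nlmsg_type != RTM_NEWLINK ||
        nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
        return -1;

    memset(out, 0, sizeof(*out));
    ifi = NLMSG_DATA(nh);
    out->flags = ifi->ifi_flags;

    /* Attributes follow the fixed ifinfomsg header. */
    rta = IFLA_RTA(ifi);
    len = (int)(nh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi)));
    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        size_t plen = RTA_PAYLOAD(rta);

        if (rta->rta_type == IFLA_MTU && plen >= sizeof(out->mtu)) {
            memcpy(&out->mtu, RTA_DATA(rta), sizeof(out->mtu));
        } else if (rta->rta_type == IFLA_ADDRESS &&
                   plen == sizeof(out->mac)) {
            memcpy(out->mac, RTA_DATA(rta), sizeof(out->mac));
            out->has_mac = 1;
        }
    }
    return 0;
}
```

The length checks before each `memcpy` are the part worth reviewing in any real implementation, since a short or malformed attribute must not be read past its payload.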

Poem

🐰 Control flows split like rabbit paths—netlink takes the lead,
Yet fallback trails ensure we reach our destination at speed.
Namespaces shift, sockets dance in sync,
Link flags and MAC addresses, linked through netlink.
Flexibility hops forward, compatibility stays behind!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Docstring Coverage
Status: ⚠️ Warning
Explanation: Docstring coverage is 27.59%, which is insufficient. The required threshold is 80.00%.
Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

Check name: Description Check
Status: ✅ Passed
Explanation: Check skipped - CodeRabbit’s high-level summary is enabled.

Check name: Title Check
Status: ✅ Passed
Explanation: The PR title "add network namespace support" refers to a real and significant aspect of the changeset. The pull request does introduce namespace awareness functionality through tap_netns_change and TUNGETDEVNETNS logic that detects and adapts to interface moves across network namespaces. However, this feature is one component of a larger refactoring that primarily centers on introducing a netlink-based control path abstraction to replace the ioctl-focused flow, including new tap_nl_ctrl and tap_ctrl wrappers and five new public APIs for netlink-based link management. The title captures a meaningful part of the change but does not highlight the most substantial architectural transformation occurring in this changeset.

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and they look great!



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
drivers/net/tap/tap_netlink.h (1)

13-13: Header API additions are fine; consider normalizing ifindex types

Prototypes look correct. For consistency with kernel structs, consider using int for ifindex to match struct ifinfomsg.ifi_index across header/impl in a follow-up.

Also applies to: 41-47

drivers/net/tap/tap_netlink.c (1)

416-471: Netlink link helpers LGTM

Parsing and RTM_[GET|SET]LINK usage are correct; errno propagation is sane. Minor nit: consider sizing the attribute buffer via RTA_SPACE(...) instead of a fixed 64 bytes for future extensibility.

Also applies to: 486-521, 538-563, 577-608, 623-658, 673-704
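
As an illustration of the RTA_SPACE(...) sizing nit, the sketch below builds an RTM_SETLINK request that sets IFLA_MTU by ifindex, with the attribute area sized from the payload rather than a fixed byte count. `struct mtu_req` and `build_set_mtu_req` are hypothetical names, and the message is only constructed, not sent; the real helpers in tap_netlink.c may be laid out differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: header, ifinfomsg, then exactly one attribute. */
struct mtu_req {
    struct nlmsghdr nh;
    struct ifinfomsg ifi;
    /* Sized from the payload instead of a fixed number of bytes. */
    char attrs[RTA_SPACE(sizeof(uint32_t))];
};

/* Fill req with an RTM_SETLINK message setting IFLA_MTU on ifindex. */
static void
build_set_mtu_req(struct mtu_req *req, int ifindex, uint32_t mtu)
{
    struct rtattr *rta;

    assert(req != NULL);

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_type = RTM_SETLINK;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req->ifi.ifi_family = AF_UNSPEC;
    req->ifi.ifi_index = ifindex;

    /* The attribute area starts right after the aligned ifinfomsg. */
    rta = (struct rtattr *)req->attrs;
    rta->rta_type = IFLA_MTU;
    rta->rta_len = RTA_LENGTH(sizeof(mtu));
    memcpy(RTA_DATA(rta), &mtu, sizeof(mtu));

    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifi)) +
                        RTA_SPACE(sizeof(mtu));
}
```

Adding a second attribute later would then mean extending `attrs` by another `RTA_SPACE(...)` term, which keeps the buffer size tied to what is actually written.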

drivers/net/tap/rte_eth_tap.c (1)

1905-1952: Make tap_lsc_intr_handle_set idempotent when set==1

If an FD is already present, setting a new one should first clean the old one to avoid leaks. You can optionally add:

@@ static int tap_lsc_intr_handle_set(struct rte_eth_dev *dev, int set)
-    if (set) {
+    if (set) {
+        if (rte_intr_fd_get(pmd->intr_handle) >= 0)
+            (void)tap_lsc_intr_handle_set(dev, 0);
         /*
          * Subscribe to RTMGRP_LINK ...
          */

This complements the netns change fix and hardens repeated enable paths.
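
The idea behind this suggestion, releasing any previously installed descriptor before arming a new one, can be shown in isolation. The sketch below is generic and does not use the DPDK interrupt API: `struct lsc_slot`, `lsc_arm`, `lsc_disarm`, and `fd_is_open` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical holder for a single event fd; -1 means "not armed". */
struct lsc_slot {
    int fd;
};

/* Report whether fd currently refers to an open descriptor. */
static int
fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Disarm: close the fd if one is present. Safe to call repeatedly. */
static void
lsc_disarm(struct lsc_slot *s)
{
    assert(s != NULL);
    if (s->fd >= 0) {
        close(s->fd);
        s->fd = -1;
    }
}

/* Arm with new_fd, releasing any previously installed fd first. */
static void
lsc_arm(struct lsc_slot *s, int new_fd)
{
    assert(s != NULL);
    if (s->fd == new_fd)
        return; /* already armed with this descriptor */
    lsc_disarm(s);
    s->fd = new_fd;
}
```

Because arming always goes through the disarm path, calling it twice with different descriptors cannot leave the first one open.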

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 39b54f2 and 18bd908.

📒 Files selected for processing (4)
  • drivers/net/tap/rte_eth_tap.c (24 hunks)
  • drivers/net/tap/rte_eth_tap.h (1 hunks)
  • drivers/net/tap/tap_netlink.c (2 hunks)
  • drivers/net/tap/tap_netlink.h (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
drivers/net/tap/tap_netlink.h (1)
drivers/net/tap/tap_netlink.c (5)
  • tap_nl_get_link_flags (486-521)
  • tap_nl_set_link_flags (538-562)
  • tap_nl_set_link_mtu (577-608)
  • tap_nl_set_link_mac (673-704)
  • tap_nl_get_link_mac (623-658)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (2)
drivers/net/tap/rte_eth_tap.h (1)

80-80: Unconditional nlsk_fd field looks good

Initialized to -1 and properly closed in tap_dev_close; no ABI concerns for internal struct. Please confirm no stale references remain under HAVE_TCA_FLOWER blocks.

drivers/net/tap/rte_eth_tap.c (1)

931-933: Control-path migration to tap_ctrl is consistent

Call sites correctly switch from ioctl to tap_ctrl with proper mode selection; fallbacks are preserved. After error-propagation fix in tap_nl_ctrl, behavior parity with ioctl is maintained.

Also applies to: 941-943, 1337-1345, 1359-1370, 1388-1399, 1418-1429, 1448-1459, 1475-1516, 1763-1764, 2201-2253, 2281-2306
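
The mode selection referred to here (local first, then remote, stopping after a local failure) follows the pattern of the quoted tap_nl_ctrl set cases. A minimal sketch, assuming hypothetical names `apply_by_mode`, `link_op`, and `record_ifindex`, with the enum reusing the mode names from the diff:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode names taken from the diff; the numeric values are an assumption. */
enum ctrl_mode { LOCAL_AND_REMOTE, LOCAL_ONLY, REMOTE_ONLY };

/* Hypothetical per-interface operation: returns 0 or a negative value. */
typedef int (*link_op)(unsigned int ifindex, void *arg);

/* Apply op to the local and/or remote interface according to mode. */
static int
apply_by_mode(enum ctrl_mode mode, unsigned int if_index,
              unsigned int remote_if_index, link_op op, void *arg)
{
    bool local = (mode == LOCAL_ONLY || mode == LOCAL_AND_REMOTE);
    bool remote = remote_if_index != 0 &&
                  (mode == REMOTE_ONLY || mode == LOCAL_AND_REMOTE);
    int ret = 0;

    assert(op != NULL);

    if (local) {
        ret = op(if_index, arg);
        if (ret < 0)
            return ret; /* do not touch the remote after a local failure */
    }
    if (remote)
        ret = op(remote_if_index, arg);
    return ret;
}

/* Example op: accumulates the ifindexes it was called with. */
static int
record_ifindex(unsigned int ifindex, void *arg)
{
    *(unsigned int *)arg += ifindex;
    return 0;
}
```

A remote ifindex of 0 is treated as "no remote interface", so REMOTE_ONLY with no remote configured becomes a no-op that returns 0.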

#include <net/if.h>
#include <linux/if_tun.h>
#include <linux/if_ether.h>
#include <linux/sched.h>

⚠️ Potential issue | 🔴 Critical

Include the correct header for setns()

setns() is declared in <sched.h>, not <linux/sched.h>. Add <sched.h> to avoid an implicit declaration (often a build error with -Werror). Keeping <linux/sched.h> only for CLONE_NEWNET is unnecessary: on glibc systems, <sched.h> provides both the setns() prototype and CLONE_NEWNET when _GNU_SOURCE is defined.

Apply:

-#include <linux/sched.h>
+#include <sched.h>
🤖 Prompt for AI Agents
In drivers/net/tap/rte_eth_tap.c around line 40, the code currently includes
<linux/sched.h> but setns() is declared in <sched.h>; add an #include <sched.h>
and remove the #include <linux/sched.h> (or at minimum replace it) so the
setns() prototype is available and CLONE_NEWNET is still defined on glibc
systems, avoiding implicit-declaration build errors.

Comment on lines +778 to +859
static int
tap_nl_ctrl(struct pmd_internals *pmd, unsigned long request,
            struct ifreq *ifr, int set, enum ctrl_mode mode)
{
    bool remote = pmd->remote_if_index && (mode == REMOTE_ONLY || mode == LOCAL_AND_REMOTE);
    struct rte_ether_addr *mac;
    int ret = 0;

    switch (request) {
    case SIOCSIFFLAGS:
        if (mode == LOCAL_ONLY || mode == LOCAL_AND_REMOTE) {
            ret = tap_nl_set_link_flags(pmd->nlsk_fd, pmd->if_index,
                                        ifr->ifr_flags, set);
            if (ret < 0)
                return ret;
        }
        if (remote)
            ret = tap_nl_set_link_flags(pmd->nlsk_fd, pmd->remote_if_index,
                                        ifr->ifr_flags, set);
        break;

    case SIOCGIFFLAGS:
        if (mode == REMOTE_ONLY && pmd->remote_if_index) {
            unsigned int flags = 0;
            ret = tap_nl_get_link_flags(pmd->nlsk_fd, pmd->remote_if_index, &flags);
            if (ret == 0)
                ifr->ifr_flags = flags;
        } else {
            unsigned int flags = 0;
            ret = tap_nl_get_link_flags(pmd->nlsk_fd, pmd->if_index, &flags);
            if (ret == 0)
                ifr->ifr_flags = flags;
        }
        break;

    case SIOCGIFHWADDR:
        mac = (struct rte_ether_addr *)ifr->ifr_hwaddr.sa_data;
        if (mode == REMOTE_ONLY && pmd->remote_if_index) {
            ret = tap_nl_get_link_mac(pmd->nlsk_fd, pmd->remote_if_index, mac);
            if (ret == 0)
                ifr->ifr_hwaddr.sa_family = AF_LOCAL;
        } else {
            ret = tap_nl_get_link_mac(pmd->nlsk_fd, pmd->if_index, mac);
            if (ret == 0)
                ifr->ifr_hwaddr.sa_family = AF_LOCAL;
        }
        break;

    case SIOCSIFHWADDR:
        mac = (struct rte_ether_addr *)ifr->ifr_hwaddr.sa_data;
        if (mode == LOCAL_ONLY || mode == LOCAL_AND_REMOTE) {
            ret = tap_nl_set_link_mac(pmd->nlsk_fd, pmd->if_index, mac);
            if (ret < 0)
                return ret;
        }
        if (remote)
            ret = tap_nl_set_link_mac(pmd->nlsk_fd, pmd->remote_if_index, mac);
        break;

    case SIOCSIFMTU:
        if (mode == LOCAL_ONLY || mode == LOCAL_AND_REMOTE) {
            ret = tap_nl_set_link_mtu(pmd->nlsk_fd, pmd->if_index, ifr->ifr_mtu);
            if (ret < 0)
                return ret;
        }
        if (remote)
            ret = tap_nl_set_link_mtu(pmd->nlsk_fd, pmd->remote_if_index, ifr->ifr_mtu);
        break;

    default:
        TAP_LOG(WARNING, "%s: unsupported netlink request", pmd->name);
        return -EINVAL;
    }

    if (ret < 0) {
        TAP_LOG(DEBUG, "%s: netlink %s failed: %s(%d)", pmd->name,
                tap_ctrl_req2str(request), strerror(errno), errno);
        return -errno;
    }

    return 0;
}

⚠️ Potential issue | 🟠 Major

Normalize error returns to negative errno in tap_nl_ctrl

Early returns propagate -1 instead of -errno, unlike tap_ioctl. This breaks callers that rely on negative errno codes.

Apply:

@@
-            if (ret < 0)
-                return ret;
+            if (ret < 0)
+                return -errno;
@@
-            if (ret < 0)
-                return ret;
+            if (ret < 0)
+                return -errno;
@@
-            if (ret < 0)
-                return ret;
+            if (ret < 0)
+                return -errno;

Optionally, capture errno into a local before returning to avoid clobbering.
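
The "capture errno into a local" advice can be illustrated with a small wrapper. This is a generic sketch rather than the proposed patch: `call_and_normalize`, `errno_op`, `fail_enodev`, and `succeed` are hypothetical names, and the `fprintf` call stands in for any logging that might modify errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical operation following the "return -1 and set errno" convention. */
typedef int (*errno_op)(void);

/* Call op and translate failure into a negative errno value. */
static int
call_and_normalize(errno_op op)
{
    int err;

    assert(op != NULL);

    if (op() < 0) {
        err = errno; /* capture immediately after the failing call */
        /* Logging (or any other call) may overwrite errno from here on. */
        fprintf(stderr, "operation failed: %d\n", err);
        return -err;
    }
    return 0;
}

/* Example op that fails with ENODEV. */
static int
fail_enodev(void)
{
    errno = ENODEV;
    return -1;
}

/* Example op that succeeds. */
static int
succeed(void)
{
    return 0;
}
```

Callers then see `-ENODEV` regardless of what the logging path did to errno in between.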

🤖 Prompt for AI Agents
In drivers/net/tap/rte_eth_tap.c around lines 778 to 859, the function
tap_nl_ctrl currently returns ret directly on early error (e.g. "if (ret < 0)
return ret;"), which propagates -1 instead of a canonical negative errno; change
those early returns to capture errno into a local int (e.g. int err = errno;)
and return -err; update every place inside the switch that does "if (ret < 0)
return ret;" (SIOCSIFFLAGS, SIOCSIFHWADDR, SIOCSIFMTU cases) to use this
pattern, and if any code calls other functions that may clobber errno before
returning, capture errno immediately after the failing call before invoking any
other function.

Comment on lines +1783 to +1863
tap_netns_change(struct rte_eth_dev *dev)
{
    struct pmd_internals *pmd = dev->data->dev_private;
    int netns_fd, orig_netns_fd, new_nlsk_fd;

    netns_fd = ioctl(pmd->ka_fd, TUNGETDEVNETNS);
    if (netns_fd < 0) {
        TAP_LOG(INFO, "%s: interface deleted", pmd->name);
        return 0;
    }

    /* Interface was moved to another namespace */
    pmd->if_index = 0;

    /* Save current namespace */
    orig_netns_fd = open("/proc/self/ns/net", O_RDONLY);
    if (orig_netns_fd < 0) {
        TAP_LOG(ERR, "%s: failed to open original netns: %s",
                pmd->name, strerror(errno));
        close(netns_fd);
        return -1;
    }

    /* Switch to new namespace */
    if (setns(netns_fd, CLONE_NEWNET) < 0) {
        TAP_LOG(ERR, "%s: failed to enter new netns: %s",
                pmd->name, strerror(errno));
        close(netns_fd);
        close(orig_netns_fd);
        return -1;
    }

    /*
     * Update ifindex by querying interface name.
     * The interface now has a new ifindex in the new namespace.
     */
    pmd->if_index = if_nametoindex(pmd->name);

    /* Recreate netlink socket in new namespace */
    new_nlsk_fd = tap_nl_init(0);

    /* Recreate LSC interrupt netlink socket in new namespace */
    rte_intr_callback_unregister_pending(pmd->intr_handle, tap_dev_intr_handler, dev, NULL);
    if (tap_lsc_intr_handle_set(dev, 1) < 0)
        TAP_LOG(WARNING, "%s: failed to recreate LSC interrupt socket",
                pmd->name);

    /* Switch back to original namespace */
    if (setns(orig_netns_fd, CLONE_NEWNET) < 0)
        TAP_LOG(ERR, "%s: failed to return to original netns: %s",
                pmd->name, strerror(errno));

    close(orig_netns_fd);
    close(netns_fd);

    if (pmd->if_index == 0) {
        TAP_LOG(WARNING, "%s: interface moved to another namespace, "
                "failed to get new ifindex",
                pmd->name);
        if (new_nlsk_fd >= 0)
            close(new_nlsk_fd);
        return -1;
    }

    if (new_nlsk_fd < 0) {
        TAP_LOG(WARNING, "%s: failed to recreate netlink socket in new namespace",
                pmd->name);
        return -1;
    }

    /* Close old netlink socket and replace with new one */
    if (pmd->nlsk_fd >= 0)
        tap_nl_final(pmd->nlsk_fd);
    pmd->nlsk_fd = new_nlsk_fd;

    TAP_LOG(INFO, "%s: interface moved to another namespace, new ifindex: %u",
            pmd->name, pmd->if_index);

    return 0;
}
#endif

⚠️ Potential issue | 🔴 Critical

Namespace switch: close and re-arm LSC netlink socket to avoid FD leaks; replace non-portable API

  • The sequence re-registers LSC in the new netns without closing the previous intr FD, leaking the old netlink socket and risking stale events.
  • rte_intr_callback_unregister_pending(...) is non-standard across DPDK versions; prefer the proven unregister loop used elsewhere in this file.

Fix by explicitly disabling LSC before enabling it again:

@@ static int tap_netns_change(struct rte_eth_dev *dev)
-    /* Recreate LSC interrupt netlink socket in new namespace */
-    rte_intr_callback_unregister_pending(pmd->intr_handle, tap_dev_intr_handler, dev, NULL);
-    if (tap_lsc_intr_handle_set(dev, 1) < 0)
+    /* Recreate LSC interrupt netlink socket in new namespace */
+    (void)tap_lsc_intr_handle_set(dev, 0); /* closes old FD and unregisters */
+    if (tap_lsc_intr_handle_set(dev, 1) < 0)
         TAP_LOG(WARNING, "%s: failed to recreate LSC interrupt socket",
                 pmd->name);

This both unregisters the callback and closes the prior FD (clean path), preventing leaks.

🤖 Prompt for AI Agents
drivers/net/tap/rte_eth_tap.c around lines 1783-1863: the LSC netlink socket and
interrupt callback are re-created in the new netns without properly
disabling/unregistering the old callback or closing its FD, and the code uses
the non-portable rte_intr_callback_unregister_pending API; fix by first
disabling LSC on the device in the original namespace (unregister the callback
using the same unregister loop/pattern used elsewhere in this file to remove all
callbacks), close the old LSC netlink FD, then create the new netlink socket in
the new namespace and register the LSC callback; ensure error paths close
new_nlsk_fd when failing and that pmd->intr_handle state reflects the change to
avoid FD leaks or duplicate callbacks.
