Skip to content

fix nl_addr leaks detected by valgrind#480

Merged
rubensfig merged 3 commits intomainfrom
jogo_valgrind_addr_leaks
Aug 15, 2025
Merged

fix nl_addr leaks detected by valgrind#480
rubensfig merged 3 commits intomainfrom
jogo_valgrind_addr_leaks

Conversation

@KanjiMonster
Copy link
Copy Markdown
Contributor

@KanjiMonster KanjiMonster commented Aug 12, 2025

Running valgrind, it detected we are leaking allocated nl_addrs at various places following the pattern:

auto p = nl_addr_alloc(...);
nl_addr_parse(<address>, <family>, &p);
std::unique_ptr<nl_addr, decltype(&nl_addr_put)> tm_addr(p, nl_addr_put);

The core issue here is that nl_addr_parse() does its own allocation and updates pwith a pointer to it, and thus the initial allocation behindp is leaked.

Since we always parse static addresses, replace those dynamic allocations with statically allocated addresses in the constructor.

while fixing this, also fix the IPv6 multicast prefix from ff80::/10 to ff00::/8.

Valgrind complains about memory being leaked by nl_bridge::mdb_to_set():

==2768== 36 bytes in 1 blocks are definitely lost in loss record 806 of 1,550
==2768==    at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768==    by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768==    by 0x24BEB1: basebox::nl_bridge::mdb_to_set(rtnl_mdb*, std::set<std::tuple<unsigned int, unsigned short, rofl::caddress_ll>, std::less<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> >, std::allocator<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> > >*) (nl_bridge.cc:1082)
==2768==    by 0x24C3EB: basebox::nl_bridge::mdb_update(rtnl_mdb*, rtnl_mdb*) (nl_bridge.cc:1116)
==2768==    by 0x23C3D7: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:939)
==2768==    by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x61ED649: start_thread (pthread_create.c:442)
==2768==    by 0x626E583: clone (clone.S:100)
==2768==
==2768== 36 bytes in 1 blocks are definitely lost in loss record 807 of 1,550
==2768==    at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768==    by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768==    by 0x56BE8A7: nl_addr_parse (addr.c:448)
==2768==    by 0x24BEF2: basebox::nl_bridge::mdb_to_set(rtnl_mdb*, std::set<std::tuple<unsigned int, unsigned short, rofl::caddress_ll>, std::less<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> >, std::allocator<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> > >*) (nl_bridge.cc:1092)
==2768==    by 0x24C3EB: basebox::nl_bridge::mdb_update(rtnl_mdb*, rtnl_mdb*) (nl_bridge.cc:1116)
==2768==    by 0x23C3D7: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:939)
==2768==    by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x61ED649: start_thread (pthread_create.c:442)
==2768==    by 0x626E583: clone (clone.S:100)

The core issue is that we first alloc a nl_addr, and then call
nl_addr_parse(), which will then alloc its own nl_addr, replacing the
value of p, leaking the original allocated nl_addr.

      auto p = nl_addr_alloc(4);
      nl_addr_parse("224.0.0.0/24", AF_INET, &p);

Only then we create unique_ptr<> for the second allocation, so the first
is never free'd.

In the IPv6 case, we then call nl_addr_parse() a second time, now
replacing p again with a newly allocated nl_addr(). But we never update
the unique_ptr<>, so the second allocated address is again leaked.

Since we parse static addresses, replace those dynamic allocations with
statically allocated addresses in the constructor.

This sidesteps the issue of leaking them, and avoids having to allocate
and free addresses for every mdb entry just to compare them.

Fixes: 30e4b45 ("nl_bridge: ensure we do not create multicast groups for 224.0.0.X")
Fixes: 8c53f64 ("nl_bridge: do not install link local multicast entries")
Fixes: 4ced3ce ("nl_bridge: create mdb groups for IPv6 link local and greater scopes")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Valgrind complains about memory being leaked by nl_l3::add_l3_neigh():

==2768== 144 bytes in 4 blocks are definitely lost in loss record 1,229 of 1,550
==2768==    at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768==    by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768==    by 0x25C4F9: basebox::nl_l3::add_l3_neigh(rtnl_neigh*) (nl_l3.cc:749)
==2768==    by 0x234540: basebox::cnetlink::route_neigh_apply(basebox::nl_obj const&) (cnetlink.cc:1259)
==2768==    by 0x23C3EA: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:921)
==2768==    by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x61ED649: start_thread (pthread_create.c:442)
==2768==    by 0x626E583: clone (clone.S:100)

as well as nl_l3::add_l3_addr():

==2768== 275 bytes in 1 blocks are definitely lost in loss record 1,331 of 1,550
==2768==    at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768==    by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768==    by 0x25EC80: basebox::nl_l3::add_l3_addr(rtnl_addr*) (nl_l3.cc:249)
==2768==    by 0x25F543: basebox::nl_l3::init() (nl_l3.cc:153)
==2768==    by 0x291E51: basebox::controller::handle_desc_stats_reply(rofl::crofdpt&, rofl::cauxid const&, rofl::openflow::cofmsg_desc_stats_reply&) (controller.cc:235)
==2768==    by 0x58860BE: rofl::crofdpt::handle_recv(rofl::crofchan&, rofl::crofconn&, rofl::openflow::cofmsg*) (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x588A4D4: ??? (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58AA21A: rofl::crofconn::handle_rx_messages() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768==    by 0x61ED649: start_thread (pthread_create.c:442)
==2768==    by 0x626E583: clone (clone.S:100)

The core issue is that we first alloc a nl_addr, and then call
nl_addr_parse(), which will then alloc its own nl_addr, replacing the
value of p, leaking the original allocated nl_addr.

    auto p = nl_addr_alloc(255);
    nl_addr_parse("127.0.0.0/8", AF_INET, &p);
    std::unique_ptr<nl_addr, decltype(&nl_addr_put)> lo_addr(p, nl_addr_put);

Only then we create unique_ptr<> for the second allocation, so the first
is never free'd.

Since we parse static addresses, replace those dynamic allocations with
statically allocated addresses in the constructor.

This sidesteps the issue of leaking them, and avoids having to allocate
and free addresses for every neighbor or assigned address just to
compare them.

Fixes: a4dc683 ("Fix broken lo (#111)")
Fixes: c2fbb1d ("Configure IP addresses on top of bridges (#109)")
Fixes: 0a5c7c1 ("bug: fix late nexthop registration")
Fixes: f407098 ("feat: Added  IPv6 neighbor, address and routing functions (#137)")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
The IPv6 multicast prefix is ff00::/8, not ff80::/10 [1].

[1] https://www.rfc-editor.org/rfc/rfc2373#section-2.7

Fixes: 0a5c7c1 ("bug: fix late nexthop registration")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
@KanjiMonster KanjiMonster force-pushed the jogo_valgrind_addr_leaks branch from aed0bc3 to 77e0098 Compare August 14, 2025 07:12
@KanjiMonster KanjiMonster marked this pull request as ready for review August 14, 2025 11:53
@KanjiMonster KanjiMonster requested a review from rubensfig August 14, 2025 11:53
@rubensfig rubensfig merged commit 5692f72 into main Aug 15, 2025
4 checks passed
@rubensfig rubensfig deleted the jogo_valgrind_addr_leaks branch August 15, 2025 05:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants