fix nl_addr leaks detected by valgrind#480
Merged
Conversation
Valgrind complains about memory being leaked by nl_bridge::mdb_to_set():
==2768== 36 bytes in 1 blocks are definitely lost in loss record 806 of 1,550
==2768== at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768== by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768== by 0x24BEB1: basebox::nl_bridge::mdb_to_set(rtnl_mdb*, std::set<std::tuple<unsigned int, unsigned short, rofl::caddress_ll>, std::less<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> >, std::allocator<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> > >*) (nl_bridge.cc:1082)
==2768== by 0x24C3EB: basebox::nl_bridge::mdb_update(rtnl_mdb*, rtnl_mdb*) (nl_bridge.cc:1116)
==2768== by 0x23C3D7: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:939)
==2768== by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x61ED649: start_thread (pthread_create.c:442)
==2768== by 0x626E583: clone (clone.S:100)
==2768==
==2768== 36 bytes in 1 blocks are definitely lost in loss record 807 of 1,550
==2768== at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768== by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768== by 0x56BE8A7: nl_addr_parse (addr.c:448)
==2768== by 0x24BEF2: basebox::nl_bridge::mdb_to_set(rtnl_mdb*, std::set<std::tuple<unsigned int, unsigned short, rofl::caddress_ll>, std::less<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> >, std::allocator<std::tuple<unsigned int, unsigned short, rofl::caddress_ll> > >*) (nl_bridge.cc:1092)
==2768== by 0x24C3EB: basebox::nl_bridge::mdb_update(rtnl_mdb*, rtnl_mdb*) (nl_bridge.cc:1116)
==2768== by 0x23C3D7: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:939)
==2768== by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x61ED649: start_thread (pthread_create.c:442)
==2768== by 0x626E583: clone (clone.S:100)
The core issue is that we first alloc a nl_addr, and then call
nl_addr_parse(), which will then alloc its own nl_addr, replacing the
value of p, leaking the original allocated nl_addr.
auto p = nl_addr_alloc(4);
nl_addr_parse("224.0.0.0/24", AF_INET, &p);
Only then we create unique_ptr<> for the second allocation, so the first
is never free'd.
In the IPv6 case, we then call nl_addr_parse() a second time, now
replacing p again with a newly allocated nl_addr(). But we never update
the unique_ptr<>, so the second allocated address is again leaked.
Since we parse static addresses, replace those dynamic allocations with
statically allocated addresses in the constructor.
This sidesteps the issue of leaking them, and avoids having to allocate
and free addresses for every mdb entry just to compare them.
Fixes: 30e4b45 ("nl_bridge: ensure we do not create multicast groups for 224.0.0.X")
Fixes: 8c53f64 ("nl_bridge: do not install link local multicast entries")
Fixes: 4ced3ce ("nl_bridge: create mdb groups for IPv6 link local and greater scopes")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Valgrind complains about memory being leaked by nl_l3::add_l3_neigh():
==2768== 144 bytes in 4 blocks are definitely lost in loss record 1,229 of 1,550
==2768== at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768== by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768== by 0x25C4F9: basebox::nl_l3::add_l3_neigh(rtnl_neigh*) (nl_l3.cc:749)
==2768== by 0x234540: basebox::cnetlink::route_neigh_apply(basebox::nl_obj const&) (cnetlink.cc:1259)
==2768== by 0x23C3EA: basebox::cnetlink::handle_wakeup(rofl::cthread&) (cnetlink.cc:921)
==2768== by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x61ED649: start_thread (pthread_create.c:442)
==2768== by 0x626E583: clone (clone.S:100)
as well as nl_l3::add_l3_addr():
==2768== 275 bytes in 1 blocks are definitely lost in loss record 1,331 of 1,550
==2768== at 0x484A208: calloc (vg_replace_malloc.c:1328)
==2768== by 0x56BE091: nl_addr_alloc (addr.c:189)
==2768== by 0x25EC80: basebox::nl_l3::add_l3_addr(rtnl_addr*) (nl_l3.cc:249)
==2768== by 0x25F543: basebox::nl_l3::init() (nl_l3.cc:153)
==2768== by 0x291E51: basebox::controller::handle_desc_stats_reply(rofl::crofdpt&, rofl::cauxid const&, rofl::openflow::cofmsg_desc_stats_reply&) (controller.cc:235)
==2768== by 0x58860BE: rofl::crofdpt::handle_recv(rofl::crofchan&, rofl::crofconn&, rofl::openflow::cofmsg*) (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x588A4D4: ??? (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58AA21A: rofl::crofconn::handle_rx_messages() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58DD082: rofl::cthread::handle_wakeup() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x58DD717: rofl::cthread::run_loop() (in /usr/lib/librofl_common.so.0.1.1)
==2768== by 0x61ED649: start_thread (pthread_create.c:442)
==2768== by 0x626E583: clone (clone.S:100)
The core issue is that we first alloc a nl_addr, and then call
nl_addr_parse(), which will then alloc its own nl_addr, replacing the
value of p, leaking the original allocated nl_addr.
auto p = nl_addr_alloc(255);
nl_addr_parse("127.0.0.0/8", AF_INET, &p);
std::unique_ptr<nl_addr, decltype(&nl_addr_put)> lo_addr(p, nl_addr_put);
Only then we create unique_ptr<> for the second allocation, so the first
is never free'd.
Since we parse static addresses, replace those dynamic allocations with
statically allocated addresses in the constructor.
This sidesteps the issue of leaking them, and avoids having to allocate
and free addresses for every neighbor or assigned address just to
compare them.
Fixes: a4dc683 ("Fix broken lo (#111)")
Fixes: c2fbb1d ("Configure IP addresses on top of bridges (#109)")
Fixes: 0a5c7c1 ("bug: fix late nexthop registration")
Fixes: f407098 ("feat: Added IPv6 neighbor, address and routing functions (#137)")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
The IPv6 multicast prefix is ff00::/8, not ff80::/10 [1]. [1] https://www.rfc-editor.org/rfc/rfc2373#section-2.7 Fixes: 0a5c7c1 ("bug: fix late nexthop registration") Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
aed0bc3 to
77e0098
Compare
rubensfig
approved these changes
Aug 15, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Running valgrind, it detected we are leaking allocated
nl_addrs at various places following the pattern:The core issue here is that
nl_addr_parse() does its own allocation and updatespwith a pointer to it, and thus the initial allocation behindp is leaked.Since we always parse static addresses, replace those dynamic allocations with statically allocated addresses in the constructor.
while fixing this, also fix the IPv6 multicast prefix from
ff80::/10toff00::/8.