Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bgp filter fun (backport #15466) #15550

Closed
wants to merge 3 commits into from

Conversation

mergify[bot]
Copy link

@mergify mergify bot commented Mar 14, 2024

see individual commits

a) Allows unsuppress-maps to be part of the filtering system that would allow outgoing data to a peer.
b) Fix memory leak associated with late decision not to send data to peer in bgp


This is an automatic backport of pull request #15466 done by Mergify.

Currently in subgroup_default_originate the attr.aspath
is set in bgp_attr_default_set, which hashs the aspath
and creates a refcount for it.  If this is a withdraw
the subgroup_announce_check and bgp_adj_out_set_subgroup
is called which will intern the attribute.  This will
cause the the attr.aspath to be set to a new value
finally at the bottom of the function it intentionally
uninterns the aspath which is not the one that was
created for this function.  This reduces the other
aspath's refcount by 1 and if a clear bgp * is issued
fast enough the aspath for that will be removed
and the system will crash.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit e613e12)
If unsuppress-map is setup for outgoing peers, consider that
policy is being applied as for RFC 8212.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6814401)
Customer has this valgrind trace:

Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
  0 in community_new ../bgpd/bgp_community.c:39
  1 in community_uniq_sort ../bgpd/bgp_community.c:170
  2 in route_set_community ../bgpd/bgp_routemap.c:2342
  3 in route_map_apply_ext ../lib/routemap.c:2673
  4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
  5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
  6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
  7 in hash_walk ../lib/hash.c:285
  8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
  9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
 10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
 11 in bgp_process_wq ../bgpd/bgp_route.c:3221
 12 in work_queue_run ../lib/workqueue.c:282

The above leak detected by valgrind was from a screenshot so I copied it
by hand.  Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists.  Code should be able to be followed from there to here.

What is happening:

There is a route-map being applied that modifes the outgoing community
to a peer.  This is saved in the attr copy created in
subgroup_process_announce_selected.  This community pointer is not
interned.  So the community->refcount is still 0.  Normally when
a prefix is announced, the attr and the prefix are placed on a
adjency out structure where the attribute is interned.  This will
cause the community to be saved in the community hash list as well.
In a non-normal operation when the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is just dropped too, leading to situations
where the memory is leaked.  The usage of bgp suppress-fib would
would be a case where the community is caused to be leaked.
Additionally the previous commit where an unsuppress-map is used
to modify the outgoing attribute but since unsuppress-map was
not considered part of outgoing policy the attribute would be dropped as
well.  This pointer drop also extends to any dynamically allocated
memory saved by the attribute pointer that was not interned yet as well.

So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.

Fixes: #15459
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit addff17)

# Conflicts:
#	bgpd/bgp_route.c
Copy link
Author

mergify bot commented Mar 14, 2024

Cherry-pick of addff17 has failed:

On branch mergify/bp/stable/8.5/pr-15466
Your branch is ahead of 'origin/stable/8.5' by 2 commits.
  (use "git push" to publish your local commits)

You are currently cherry-picking commit addff17a5.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   bgpd/bgp_conditional_adv.c
	modified:   bgpd/bgp_updgrp.h
	modified:   bgpd/bgp_updgrp_adv.c

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   bgpd/bgp_route.c

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@ton31337
Copy link
Member

Closing this, created a manual backport here #15568.

@ton31337 ton31337 closed this Mar 18, 2024
@ton31337 ton31337 deleted the mergify/bp/stable/8.5/pr-15466 branch March 18, 2024 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants