New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
isisd: fix crash when deactivating ISIS adjacency on the interface. #15993
base: master
Are you sure you want to change the base?
Conversation
1. When the command "no <ip|ipv6> router isis WORD" is executed on the interface, it invokes list_delete_all_node to iterate and release the memory of all nodes in the cirtcuit->u.bc.adjdb[1] linked list. However, the nodes are not unlinked during this traversal process, leading to the call of *list->del to delete the data of the linked list nodes. 2. For ISIS, deleting the data of the linked list nodes is done by calling isis_delete_adj. Subsequently, isis_level2_adj_up will be called to iterate and query the cirtcuit->u.bc.adjdb[1] linked list. If there are many neighbors on this interface, accessing the memory of the released linked list nodes may occur. 3. Not limited to ISIS, if the linked list is not unlinked during the deletion of all nodes in process 1, *list->del should not be allowed to iterate through the list again. Signed-off-by: zhou-run <166502045+zhou-run@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you show the backtrace?
In my opinion the problem is in isis_circuit_up()
when we create a list we don't specify circuit->u.bc.adjdb[0]->del = isis_delete_adj;
.
The issue lies in
The backtrace provided above pertains to version 8.2.2, but it seems that the same issue exists in the code of the master branch as well. |
Yes, but the list needs to have a deletion hook when it's created. So it might be an issue in timing (or not). |
In addition to the list deletion operation |
Someone clearly understanding the IS-IS code should take a look here. /cc @odd22 @louis-6wind @pguibert6WIND maybe? |
I'll take a look at it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void isis_circuit_down(struct isis_circuit *circuit)
{
[...]
} else if (circuit->circ_type == CIRCUIT_T_BROADCAST) {
struct list *adj_list;
struct listnode *node;
if (circuit->u.bc.adjdb[0]) {
adj_list = list_new();
isis_adj_build_up_list(circuit->u.bc.adjdb[0],
adj_list);
for (ALL_LIST_ELEMENTS_RO(adj_list, node, adj))
isis_log_adj_change(
adj, adj->adj_state,
ISIS_ADJ_DOWN,
"circuit is being brought down");
list_delete(&adj_list);
}
if (circuit->u.bc.adjdb[1]) {
adj_list = list_new();
isis_adj_build_up_list(circuit->u.bc.adjdb[1],
adj_list);
for (ALL_LIST_ELEMENTS_RO(adj_list, node, adj))
isis_log_adj_change(
adj, adj->adj_state,
ISIS_ADJ_DOWN,
"circuit is being brought down");
list_delete(&adj_list);
}
The problem is that we are building a temporary link-list adj_list
containing the up adjacencies. The temporary list is deleted and their nodes are freed but still referenced in u.bc.adjdb[X]
. This is a heap-after-free issue.
There's no need to use a temporary link-list. The problem should be fixed in all places where isis_adj_build_up_list()
is used to build a temporary list.
@@ -331,7 +331,7 @@ void list_delete_all_node(struct list *list) | |||
next = node->next; | |||
if (*list->del) | |||
(*list->del)(node->data); | |||
listnode_free(list, node); | |||
list_delete_node(list, node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void list_delete_node(struct list *list, struct listnode *node)
{
frrtrace(2, frr_libfrr, list_delete_node, list, node);
if (node->prev)
node->prev->next = node->next;
else
list->head = node->next;
if (node->next)
node->next->prev = node->prev;
else
list->tail = node->prev;
list->count--;
listnode_free(list, node);
}
list_delete_node()
updates the link-list and free the node with listnode_free(). This is correct but, since all the nodes are deleted, we do not need to do that. listnode_free() is sufficient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list_delete_node()
updates the link-list and free the node with listnode_free(). This is correct but, since all the nodes are deleted, we do not need to do that. listnode_free() is sufficient
I don't think so. Refer to the backtrace above, (*list->del)()=>...=>isis_level2_adj_up()
. If the memory of each node in the list is not unlinked before being released, the operation to delete node data (*list->del)()
should not be allowed to continue traversing this linked list, as it would otherwise access released node memory. However, the restriction on allowing (*list->del)()
to traverse the list is inherently unreasonable.
bool isis_level2_adj_up(struct isis_area *area)
{
struct listnode *node, *cnode;
struct isis_circuit *circuit;
struct list *adjdb;
struct isis_adjacency *adj;
if (area->is_type == IS_LEVEL_1)
return false;
for (ALL_LIST_ELEMENTS_RO(area->circuit_list, cnode, circuit)) {
if (circuit->circ_type == CIRCUIT_T_BROADCAST) {
adjdb = circuit->u.bc.adjdb[1];
if (!adjdb || !adjdb->count)
continue;
for (ALL_LIST_ELEMENTS_RO(adjdb, node, adj)) { //here heap-after-free
if (adj->level != ISIS_ADJ_LEVEL1
&& adj->adj_state == ISIS_ADJ_UP)
return true;
}
} else if (circuit->circ_type == CIRCUIT_T_P2P
&& circuit->u.p2p.neighbor) {
adj = circuit->u.p2p.neighbor;
if (adj->level != ISIS_ADJ_LEVEL1
&& adj->adj_state == ISIS_ADJ_UP)
return true;
}
}
return false;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your fix avoids this crash and could avoid some other potential crashes with link-lists by cleaning up the memory before freeing it. But it masks the actual issue which is the heap-after-free.
You have noticed there is heap-after-free in isis_level2_adj_up(). I agree with you. However, it is not supposed to happen. You are not supposed to access memory zones that could have been allocated to something else. It can lead to unexpected behaviors.
If you avoid using temporary list, you won't have the crash anymore and you will have fixed the root cause.
Deleting the temporary link-list |
you are right. I have misinterpreted the backtrace. I have just loaded frr-8.2.2 tag to interpret the backtrace. I see that the issue is because in
This is not a case of heap-use-after but I am not sure we should allow accessing a list being deleted. |
The issue is in
|
list_delete_all_node
to iterate and release the memory of all nodes in thecirtcuit->u.bc.adjdb[1]
linked list. However, the nodes are not unlinked during this traversal process, leading to the call of*list->del
to delete the data of the linked list nodes.isis_delete_adj
. Subsequently,isis_level2_adj_up
will be called to iterate and query thecirtcuit->u.bc.adjdb[1]
linked list. If there are many neighbors on this interface, accessing the memory of the released linked list nodes may occur.*list->del
should not be allowed to iterate through the list again.Signed-off-by: zhou-run zhou.run@h3c.com