Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modified down rules to pre-down rules to ensure that default route is… #3853

Merged
merged 6 commits into from
Jan 17, 2020
Merged

modified down rules to pre-down rules to ensure that default route is… #3853

merged 6 commits into from
Jan 17, 2020

Conversation

kannankvs
Copy link
Collaborator

- What I did
When management VRF is deleted, zebra was crashing as specified in the gitissue3798. As discussed in the issue, it has got two parts. One part is to fix the crash in FRR, which will happen in the future. Other part is to ensure that default route is deleted just before interface is brought down by using "pre-down" commands instead of "down" commands, which will avoid the zebra crash. Irrespective of this change, FRR will be fixed to avoid the crash in zebra code.

- How I did it
Changed "down" commands to "pre-down" commands.

- How to verify it
Executed management VRF test cases specified in the sonic-mgmt PR1210 and verified that they are passing.

- Description for the changelog
Changed the commands given under "down" to "pre-down"

@kannankvs
Copy link
Collaborator Author

retest please

@kannankvs
Copy link
Collaborator Author

@lguohan , @jleveque : Is it possible to review this and mark it for merging to 201911?

up ip link set dev lo-m master mgmt
down ip link delete dev lo-m
down ip link delete dev lo-m
{% endif %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicated condition here if, should remove one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tylerlinp : Which is duplicated condition? We modified only the order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition from line 8 to 20, so line 12 and 19 is duplicated. And it is better to add a block mgmt.

{% if (MGMT_VRF_CONFIG) and (MGMT_VRF_CONFIG['vrf_global']['mgmtVrfEnabled'] == "true") %}
pre-down ip link set dev eth0 nomaster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this? It is different from original style. And it may be wrong, in kernel handling set nomaster is including event down, so here do a down in pre-down. I think cgdelete is enough.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tylerlinp : How come nomaster is doing an event down? Yes, we also expect the cgdelete to handle to take care of nomaster. But, similar to pre-down for route delete, we thought that "explicitly setting to nomaster" is better/safe for applications like zebra/dhcp_relay to handle this "nomaster" event separately. Anyway, this "nomaster" is not making any difference in the behavior in applications dhcp_relay and hence we think that it is safe to remove this.
@prsunny , @jleveque : Any comments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be exactly, linux kernel do change master just like event down, will flush neighbors and routes. If set nomaster before delete routes in kernel(here they all in pre-down, it make troubles), kernel will clear routes in handling down, and commands to delete doutes will do nothing, and there is no deldoute netlink, and our fix will fail again.

	case NETDEV_DOWN:
		fib_disable_ip(dev, event, false);
		break;
	case NETDEV_CHANGE:
		flags = dev_get_flags(dev);
		if (flags & (IFF_RUNNING | IFF_LOWER_UP))
			fib_sync_up(dev, RTNH_F_LINKDOWN);
		else
			fib_sync_down_dev(dev, event, false);
		/* fall through */
	case NETDEV_CHANGEMTU:
		rt_cache_flush(net);
		break;
	case NETDEV_CHANGEUPPER:
		info = ptr;
		/* flush all routes if dev is linked to or unlinked from
		 * an L3 master device (e.g., VRF)
		 */
		if (info->upper_dev && netif_is_l3_master(info->upper_dev))
			fib_disable_ip(dev, NETDEV_DOWN, true);
		break;

At least here cannot use pre-down. But now that there is no up ip link set dev eth0 master mgmt how comes down ip link set dev eth0 nomaster? I think this is not safe but dangerous.

@@ -68,6 +69,7 @@ iface eth0 inet dhcp
vrf mgmt
up cgcreate -g l3mdev:mgmt
up cgset -r l3mdev.master-device=mgmt mgmt
pre-down ip link set dev eth0 nomaster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as line 57.

@kannankvs
Copy link
Collaborator Author

@tylerlinp : We removed the nomaster as discussed. Assuming that cgdelete takes care of it. Request you to review and approve the changes.

@prsunny
Copy link
Contributor

prsunny commented Dec 17, 2019

@tylerlinp , can you check the changes?

@tylerlinp
Copy link
Contributor

@prsunny I had checked a few days ago and a suggestion was not replied.

The condition from line 8 to 20, so line 12 and 19 is duplicated. And it is better to add a block mgmt.

@kannankvs
Copy link
Collaborator Author

@prsunny I had checked a few days ago and a suggestion was not replied.

The condition from line 8 to 20, so line 12 and 19 is duplicated. And it is better to add a block mgmt.

@tylerlinp : Addressed the comment. Please check and approve.
@prsunny : Can you please mark it for cherrypicking to 201911 release?

@kannankvs
Copy link
Collaborator Author

retest vsimage please

@prsunny prsunny merged commit d150721 into sonic-net:master Jan 17, 2020
@qiluo-msft
Copy link
Collaborator

Retest mellanox please

@wendani
Copy link
Contributor

wendani commented Feb 5, 2020

Request for 201911 branch

abdosi pushed a commit that referenced this pull request Feb 14, 2020
#3853)

* modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down
pphuchar pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Mar 9, 2020
sonic-net#3853)

* modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 24, 2020
sonic-net#3853)

* modified down rules to pre-down rules to ensure that default route is deleted just before interface is made down
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants