Default LACP rate is set incorrectly both on VMs and DUT. #143

Closed
antonpatenko opened this issue Mar 21, 2017 · 9 comments

@antonpatenko
Contributor

antonpatenko commented Mar 21, 2017

The LACP rate on testbed devices is expected to be "slow". That is the case on the DUT, but on the VMs the rate is set to "fast".

  • Steps to reproduce:

1.1) SSH to PTF host.
1.2) Run tcpdump on eth3 ("tcpdump -i eth3") and verify that LACP PDUs arrive every second.

2.1) SSH to ARISTA03T2 (as "admin").
2.2) Run the following commands:
- enable
- show running-config interfaces ethernet1
- show running-config interfaces ethernet2
2.3) In the output of the last two commands, verify that the LACP rate is set to "fast".

3.1) SSH to DUT.
3.2) Type the following command:
docker exec -i teamd teamdctl Ethernet0 state
3.3) In the output of that command, verify that the "fast rate" value is "no". This means that the LACP rate on the DUT is slow (at least according to teamdctl).
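
Step 3.3 can be automated with a small parser. This is a minimal sketch, assuming the `teamdctl ... state` text contains a line of the form `fast rate: yes` or `fast rate: no`; the function name `parse_fast_rate` and the trimmed `SAMPLE` text are hypothetical and not taken from a real DUT.

```python
import re


def parse_fast_rate(state_text):
    """Return True/False for the 'fast rate' field, or None if it is absent."""
    match = re.search(r"^\s*fast rate:\s*(yes|no)\s*$", state_text, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "yes"


# Hypothetical, trimmed sample of the state output (assumed layout).
SAMPLE = """\
runner:
  active: yes
  fast rate: no
"""

if __name__ == "__main__":
    print(parse_fast_rate(SAMPLE))  # False, i.e. the DUT is configured as slow
```

The text could be collected on the DUT with the command from step 3.2 and passed to `parse_fast_rate`.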

@stcheng
Contributor

stcheng commented Mar 21, 2017

I think we need to update the t1-lag-spine.j2 template to remove the lacp rate fast line.
That line is introduced during deployment.
@antonpatenko could you reconfigure the interfaces on the VM with no lacp rate fast and check again? If that fixes it, we can update the j2 template.

@pavel-shirshov
Contributor

We can't change the testbed setting from LACP fast to LACP slow. We have to support LACP fast, because our T1 devices use LACP fast mode, not slow.
So for fast-reboot we probably need to switch from LACP fast to LACP slow on the DUT and its neighbors, perform the fast-reboot, and then switch the device configuration back.

@liatgrozovik

We need a decision on what to do with this issue. Currently, following the requirements we were given for the test, it is not working as expected, so we need to understand how you would like to proceed. @antonpatenko please check stcheng's suggestion and report whether it works; then we should decide how to proceed.

@lguohan
Contributor

lguohan commented Mar 23, 2017

The LACP PDUs you receive at step 1.2 are the PDUs sent by the DUT; according to the LACP protocol, their rate is defined by the Arista VM. In this case, since the Arista VM is configured as fast, the observed PDU rate should be fast too.

For fast reboot, we have to switch the T1 from fast to slow, and that should be the T1 default value.

I think we should disable LACP fast on the Arista VM.

However, the SONiC DUT sometimes has to talk to a neighbor that is configured as LACP fast, so in Anton's test we need to change the neighbor LACP rate to fast and do a check.
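
The point about who defines the PDU rate can be sketched as follows. In LACP, each side transmits at the rate its partner requests, with the standard periods being 1 second (fast) and 30 seconds (slow). This is only an illustration: the names `tx_interval` and `pdu_intervals` are hypothetical, and the sketch ignores transient behavior while the link is negotiating.

```python
FAST_PERIODIC_TIME = 1   # seconds between PDUs when the partner asks for fast
SLOW_PERIODIC_TIME = 30  # seconds between PDUs when the partner asks for slow


def tx_interval(partner_rate):
    """Interval at which a device sends PDUs, given the PARTNER's configured rate."""
    if partner_rate == "fast":
        return FAST_PERIODIC_TIME
    if partner_rate == "slow":
        return SLOW_PERIODIC_TIME
    raise ValueError("partner_rate must be 'fast' or 'slow'")


def pdu_intervals(dut_rate, vm_rate):
    """Each side's transmit interval is driven by the other side's setting."""
    return {
        "dut_sends_every_s": tx_interval(vm_rate),
        "vm_sends_every_s": tx_interval(dut_rate),
    }


if __name__ == "__main__":
    # Situation from the report: DUT configured slow, Arista VM configured fast.
    print(pdu_intervals(dut_rate="slow", vm_rate="fast"))
    # {'dut_sends_every_s': 1, 'vm_sends_every_s': 30}
```

Under these assumptions, the one-second PDUs seen at step 1.2 are consistent with a slow DUT answering a VM that is configured as fast.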

@antonpatenko
Contributor Author

Looks like Shuotian's suggestion changed the situation.
The packets below were captured on the DUT after switching the rate on the VM's LAG member.

16:52:59 LACPv1, length 110
16:53:03 LACPv1, length 110
16:53:29 LACPv1, length 110
16:53:33 LACPv1, length 110

As you can see, some packets are sent faster than the slow rate requires, but in general this looks better.
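
The gaps in the capture above can be computed with a short script. This is a sketch that uses only the four timestamps shown; the helper name `gaps` is hypothetical. It prints the gaps between consecutive packets and between every second packet.

```python
from datetime import datetime

# Timestamps copied from the capture above.
TIMESTAMPS = ["16:52:59", "16:53:03", "16:53:29", "16:53:33"]


def gaps(stamps, stride=1):
    """Seconds between each packet and the packet `stride` positions later."""
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[stride:])]


if __name__ == "__main__":
    print(gaps(TIMESTAMPS))     # [4, 26, 4]
    print(gaps(TIMESTAMPS, 2))  # [30, 30]
```

One possible reading is that the capture holds two interleaved streams, each sending every 30 seconds (for example two LAG members, or both directions on one link). The capture alone does not show which, so this would need to be confirmed by filtering on the source address.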

@stcheng
Contributor

stcheng commented Mar 23, 2017

@antonpatenko we could do a one-time experiment to count how many packets are received over a sufficiently long period and see whether the LACP rate is slow. Meanwhile, I think we will need a script to toggle the LACP rate on the T1 Arista devices so that we can continue the fast boot test.
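
The counting experiment could look roughly like this. It is a sketch, assuming the standard 30-second slow period; the function names, the `senders` parameter, and the `tolerance` of two packets are hypothetical choices, not part of any existing test.

```python
SLOW_PERIODIC_TIME = 30  # seconds, standard LACP slow period


def expected_pdu_count(window_s, period_s, senders=1):
    """Approximate number of PDUs expected in a capture window."""
    return senders * (window_s // period_s)


def looks_slow(observed, window_s, senders=1, tolerance=2):
    """True if the observed count is close to what the slow rate would produce."""
    expected = expected_pdu_count(window_s, SLOW_PERIODIC_TIME, senders)
    return abs(observed - expected) <= tolerance


if __name__ == "__main__":
    # Over a 10-minute capture, one sender at the slow rate yields about 20 PDUs,
    # while the fast rate (1 second) would yield about 600.
    print(expected_pdu_count(600, 30))  # 20
    print(looks_slow(21, 600))          # True
    print(looks_slow(600, 600))         # False
```

A longer window makes the slow and fast counts easier to tell apart, since the occasional extra PDU sent on a state change matters less.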

@liatgrozovik

It is not clear why the fast boot test configuration is relevant here.
This is the T1 LAG topology, maybe the T0 topology, and currently it does not work correctly.
@stcheng you defined the test requirements and the test is written according to them, but the topology or the DUT does not work as expected. We need a clear definition of how to proceed.

@pavel-shirshov
Contributor

@liatgrozovik Agreed.
We could change the t1-lag topology configuration. I'll prepare a PR for the sonic-mgmt repo.

@daall
Contributor

daall commented Apr 28, 2020

Fixed by #154.

@daall daall closed this as completed Apr 28, 2020