Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Evpn Warmreboot] Added Dependancy check logic in VrfMgr #1466

Merged
merged 9 commits into from
Jan 8, 2021

Conversation

nkelapur
Copy link
Contributor

@nkelapur nkelapur commented Oct 13, 2020

Added changes to handel dependency check in VrfMgr for warmreboot

Why I did it

This was done to ensure for EVPN warm-reboot the order of data replay to kernel is maintained across various submodules and the kernel programming will be successful.

How I verified it

Verified with EVPN warmreboot

Details if related

More details in warmreboot section of EVPN VXLAN HLD
sonic-net/SONiC#437

@ghost
Copy link

ghost commented Oct 13, 2020

CLA assistant check
All CLA requirements met.

@lguohan lguohan added the evpn label Nov 4, 2020
@nkelapur nkelapur marked this pull request as ready for review November 17, 2020 06:20
@nkelapur
Copy link
Contributor Author

Dependant on #1276

There is no reconcile operation required in vrfmgrd and intfmgrd
Hence immediately after replay these are marked as reconciled
@prsunny
Copy link
Collaborator

prsunny commented Dec 25, 2020

retest this please

1 similar comment
@srj102
Copy link
Contributor

srj102 commented Dec 25, 2020

retest this please

@@ -46,7 +46,7 @@ def swss_check_RestoreCount(dvs, state_db, restore_count):
if fv[0] == "restore_count":
assert int(fv[1]) == restore_count[key] + 1
elif fv[0] == "state":
assert fv[1] == "reconciled"
assert fv[1] == "reconciled" or fv[1] == "disabled"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to check for "disabled"? If not, please remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we still need "disabled" to handle the cold restart scenario, specially for docker restart

cfgmgr/intfmgr.cpp Outdated Show resolved Hide resolved
@@ -626,6 +655,7 @@ bool IntfMgr::doIntfAddrTask(const vector<string>& keys,
void IntfMgr::doTask(Consumer &consumer)
{
SWSS_LOG_ENTER();
static bool replayNotDone = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please rename to "replayDone" and set to false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure .. will change it

@nkelapur
Copy link
Contributor Author

retest this please

1 similar comment
@nkelapur
Copy link
Contributor Author

retest this please

@prsunny
Copy link
Collaborator

prsunny commented Dec 30, 2020

retest this please

8 similar comments
@prsunny
Copy link
Collaborator

prsunny commented Dec 31, 2020

retest this please

@nkelapur
Copy link
Contributor Author

retest this please

@srj102
Copy link
Contributor

srj102 commented Jan 1, 2021

retest this please

@nkelapur
Copy link
Contributor Author

nkelapur commented Jan 3, 2021

retest this please

@nkelapur
Copy link
Contributor Author

nkelapur commented Jan 4, 2021

retest this please

@nkelapur
Copy link
Contributor Author

nkelapur commented Jan 4, 2021

retest this please

@nkelapur
Copy link
Contributor Author

nkelapur commented Jan 6, 2021

retest this please

@prsunny
Copy link
Collaborator

prsunny commented Jan 8, 2021

retest this please

@liushilongbuaa
Copy link
Contributor

retest vs please

@prsunny
Copy link
Collaborator

prsunny commented Jan 8, 2021

retest this please

@prsunny prsunny merged commit 7e3b2c6 into sonic-net:master Jan 8, 2021
DavidZagury pushed a commit to DavidZagury/sonic-swss that referenced this pull request Mar 4, 2021
)

* [Evpn Warmreboot] Added Dependancy check logic in VrfMgr
This was done to ensure for EVPN warm-reboot the order of data replay to kernel is maintained across various submodules and the kernel programming will be successful.

* Marking Vrfmgrd and Intfmgrd to reconcile immediately after replay
There is no reconcile operation required in vrfmgrd and intfmgrd, hence immediately after replay these are marked as reconciled
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
#### What I did
Modified pre-check for file `/host/warmboot/issu_bank.txt` in `warm-reboot` script.
Added automated recovery of  `/host/warmboot/issu_bank.txt`

#### How I did it
Modified the `warm-reboot` script.

#### How to verify it
Need to somehow corrupt the `/host/warmboot/issu_bank.txt` (list below) and then run the `warm-reboot` command.
For testing I have modified the `warm-reboot` script - added corruption command before `check_issu_bank_file` function call.

1. Remove `issu_bank.txt`
2. Clear `issu_bank.txt`
3. Change characters count in `issu_bank.txt`
4. Write invalid content into `issu_bank.txt`

#### Previous command output (if the output of a command-line utility has changed)
```
root@arc-switch1041:~# warm-reboot
(/host/warmboot/issu_bank.txt) does NOT exist or empty ...
To recover (/host/warmboot/issu_bank.txt) file, do the following:
$ docker exec -it syncd sx_api_dbg_generate_dump.py
$ docker exec -it syncd cat /tmp/sdkdump | grep 'ISSU Bank'
Command above will print the VALUE of ISSU BANK - 0 or 1, use this VALUE in the next command
$ printf VALUE > /host/warmboot/issu_bank.txt
```

#### New command output (if the output of a command-line utility has changed)
```
root@arc-switch1041:~# warm-reboot
(/host/warmboot/issu_bank.txt) does NOT exist or empty ...
Recovering the (/host/warmboot/issu_bank.txt) file
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants