SAI_STATUS_TABLE_FULL and swss:orchagent shutdown #2125

Open
loshihyu opened this issue Oct 6, 2018 · 5 comments

@loshihyu
Contributor

loshihyu commented Oct 6, 2018

We are testing SONiC by adding 8k+ IPv4 routes on a Broadcom BCM56960 switch. CRM showed 8192 ipv4_routes available, and we could add up to 8188 routes. When adding the 8189th route, we hit the following syslog errors (see "* syslog:" below): the add fails with SAI_STATUS_TABLE_FULL, and syncd calls exit_and_notify() to shut down orchagent running in the swss container. We have to run "config reload" or reboot the system to recover.

image

I saw that a similar issue was opened before:
[syncd][topology t0] exit_and_notify after processing the event of SAI_STATUS_TABLE_FULL #654
sonic-net/sonic-mgmt#654

Is there any way to avoid hitting this condition, e.g. by having RouteOrch::addRoute check the available routes before actually adding a route? It looks like SAI_STATUS_TABLE_FULL and the resulting orchagent shutdown would apply to every resource listed in CRM once more than the allowed amount is used, e.g. ipv6_route, ipv4_neighbor, etc. (see below). Is there any plan to enhance this and avoid shutting down orchagent in the SAI_STATUS_TABLE_FULL case?

image

Thanks!

Wilson
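The pre-check asked about above can be sketched roughly as follows. This is only an illustration of the idea, not SONiC code: `RouteTable`, its `available` property, and `add_route_checked` are hypothetical names, and the real orchagent and CRM interfaces are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    """Hypothetical stand-in for a hardware route table with a fixed capacity."""
    capacity: int
    routes: set = field(default_factory=set)

    @property
    def available(self) -> int:
        # Analogous to a CRM "available" counter; not the real CRM API.
        return self.capacity - len(self.routes)


def add_route_checked(table: RouteTable, prefix: str) -> bool:
    """Add a route only if the table reports free space; never raise on full."""
    if prefix in table.routes:
        return True  # already programmed, nothing to do
    if table.available <= 0:
        return False  # reject instead of pushing the request to hardware
    table.routes.add(prefix)
    return True


# Small capacity so the behavior at the limit is easy to see.
table = RouteTable(capacity=4)
results = [add_route_checked(table, f"10.0.{i}.0/24") for i in range(6)]
print(results)          # [True, True, True, True, False, False]
print(table.available)  # 0
```

A guard like this only helps if the reported available count is accurate, so it would presumably need to be combined with graceful handling of a table-full status from the hardware.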

@prsunny
Contributor

prsunny commented Oct 6, 2018

As per the current design, orchagent crashes when there is a "table full" error. This is the expected behavior. However, in this case, it looks like the CRM available count is returning an incorrect value (4 instead of 0). The available count is returned by the SAI vendor based on their table size. If this is happening consistently, we would need to take this up with Broadcom.
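To make the discrepancy concrete, here is a small simulation of the numbers reported in this issue: 8192 advertised entries, with the 8189th add rejected. It assumes, purely for illustration, that the usable table size is 8188. `FakeAsic`, `Status`, and `program_routes` are hypothetical names and do not correspond to SAI or orchagent code. The sketch also shows one possible alternative to exiting, which is to park the rejected route in a pending list.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 0
    TABLE_FULL = 1


class FakeAsic:
    """Hypothetical ASIC model whose usable size is smaller than advertised."""

    def __init__(self, advertised: int, usable: int):
        self.advertised = advertised
        self.usable = usable
        self.used = 0

    def reported_available(self) -> int:
        # What a CRM-style counter would show, based on the advertised size.
        return self.advertised - self.used

    def create_route(self, prefix: str) -> Status:
        if self.used >= self.usable:
            return Status.TABLE_FULL
        self.used += 1
        return Status.SUCCESS


def program_routes(asic, prefixes):
    """Program routes; park rejected ones in a pending list instead of exiting."""
    pending = []
    for prefix in prefixes:
        if asic.create_route(prefix) is Status.TABLE_FULL:
            pending.append(prefix)
    return pending


asic = FakeAsic(advertised=8192, usable=8188)
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(8189)]
pending = program_routes(asic, prefixes)
print(asic.used, asic.reported_available(), len(pending))  # 8188 4 1
```

In this model the counter still shows 4 available entries when the table is effectively full, which matches the "4 instead of 0" observation above.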

@loshihyu
Contributor Author

loshihyu commented Oct 6, 2018

Is there any plan to avoid crashing orchagent on this "table full" error? We will work with Broadcom if the incorrect CRM available count becomes an issue for us. Thanks for your prompt follow-up, Sunny!

@prsunny
Contributor

prsunny commented Oct 6, 2018

Not planned for any immediate release!

@loshihyu
Contributor Author

loshihyu commented Oct 8, 2018

OK, Sunny, could you help add this as a critical issue to be fixed soon and let us know when the fix will be available? SONiC no longer works after orchagent crashes, and this has a critical impact on users. Thanks!

@lguohan
Collaborator

lguohan commented Oct 12, 2018

Please enable ALPM so that you do not hit the routing table issue in the near future.

stepanblyschak added a commit to stepanblyschak/sonic-buildimage that referenced this issue Apr 22, 2022
```
22a388b [show] fix get routing stack routine (sonic-net#2137)
cb3a047 Support option --ports of config qos reload for reloading ports' QoS and buffer configuration to default (sonic-net#2125)
154a801 Enhance "config interface type/advertised-type" to be blocked on RJ45 ports  (sonic-net#2112)
3732ac5 Add CLI for route flow counter feature (sonic-net#2031)
29771e7 [techsupport] improve robustness (sonic-net#2117)
f9dc681 [intfutil] Display RJ45 port and portchannel speed in 'M' instead of 'G' when it's <= 1000M (sonic-net#2110)
781ae9f [config] Do not enable pfcwd for BmcMgmtToRRouter (sonic-net#2136)
23e9398 [scripts/fast-reboot] Shutdown remaining containers through systemd (sonic-net#2133)
576c9ef [scripts/fast-reboot] stop timers in advance (sonic-net#2131)
4dad79c bugfix: incorrect command for portchannel creation (sonic-net#2134)
c17b1f4 [show][muxcable] Decrease the timeout for show mux status/hwmode (sonic-net#2130)
49d61f8 [scripts/fast-reboot] cleanup (sonic-net#2132)
52ca324 [config/config_mgmt.py]: Fix dpb issue with upper case mac in (sonic-net#2066)
9e2fbf4 Update db_migrator to support `pfcwd_sw_enable` (sonic-net#2087)
4010bd0 FGNHG CLI changes (sonic-net#1588)
6bd54d0 Fix 'show mac' output when FDB entry for default vlan is None instead of 1 (sonic-net#2126)
```

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
liat-grozovik pushed a commit that referenced this issue May 12, 2022
288c2d8 Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)" (#2161)
bce4694 [autoneg] add support for remote speed advertisement (#2124)
a73f156 [show][vrf]Fixing show vrf to include vlan subinterface (#2158)
7a06457 [auto_ts] Enable register/de-register auto_ts config for APP Extension (#2139)
083ebcc Add transceiver-info items advertised for cmis-supported moddules (#2135)
0811214 Validate destination port is not LAG (#2053)
6ab1c51 [minigraph]  Consume golden_config_db.json while loading minigraph (#2140)
c37a957 [Kdump] Remove the duplicate logic if Kdump was disabled (#2128)
1143869 Ordering fix for sfpshow eeprom (#2113)
fdb79b8 Allow fw update for other boot type against on the previous "none" boot fw update (#2040)
a54a091 [GCU] Supressing YANG errors from libyang while sorting (#1991)
fbfa8bc [GCU] Enabling AddRack and adding RemoveRack tests (#2143)
d012be9 [Command-Reference] Add CLI docs for route flow counter (#2069)
8c07d59 [Mellanox] [reboot] [asan] stop asan-enabled containers on reboot (#2107)
697aae3 Fix speed parsing when speed is NOT fetched from APPL_DB (#2138)
22a388b [show] fix get routing stack routine (#2137)
cb3a047 Support option --ports of config qos reload for reloading ports' QoS and buffer configuration to default (#2125)
154a801 Enhance "config interface type/advertised-type" to be blocked on RJ45 ports  (#2112)
3732ac5 Add CLI for route flow counter feature (#2031)
29771e7 [techsupport] improve robustness (#2117)
f9dc681 [intfutil] Display RJ45 port and portchannel speed in 'M' instead of 'G' when it's <= 1000M (#2110)
781ae9f [config] Do not enable pfcwd for BmcMgmtToRRouter (#2136)
23e9398 [scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)
576c9ef [scripts/fast-reboot] stop timers in advance (#2131)
4dad79c bugfix: incorrect command for portchannel creation (#2134)
c17b1f4 [show][muxcable] Decrease the timeout for show mux status/hwmode (#2130)
49d61f8 [scripts/fast-reboot] cleanup (#2132)
52ca324 [config/config_mgmt.py]: Fix dpb issue with upper case mac in (#2066)
9e2fbf4 Update db_migrator to support `pfcwd_sw_enable` (#2087)
4010bd0 FGNHG CLI changes (#1588)
6bd54d0 Fix 'show mac' output when FDB entry for default vlan is None instead of 1 (#2126)
liushilongbuaa pushed a commit to liushilongbuaa/sonic-buildimage that referenced this issue Jun 20, 2022
…anch

Related work items: #52, #71, #73, #75, #77, sonic-net#1306, sonic-net#1588, sonic-net#1991, sonic-net#2031, sonic-net#2040, sonic-net#2053, sonic-net#2066, sonic-net#2069, sonic-net#2087, sonic-net#2107, sonic-net#2110, sonic-net#2112, sonic-net#2113, sonic-net#2117, sonic-net#2124, sonic-net#2125, sonic-net#2126, sonic-net#2128, sonic-net#2130, sonic-net#2131, sonic-net#2132, sonic-net#2133, sonic-net#2134, sonic-net#2135, sonic-net#2136, sonic-net#2137, sonic-net#2138, sonic-net#2139, sonic-net#2140, sonic-net#2143, sonic-net#2158, sonic-net#2161, sonic-net#2233, sonic-net#2243, sonic-net#2250, sonic-net#2254, sonic-net#2260, sonic-net#2261, sonic-net#2267, sonic-net#2278, sonic-net#2282, sonic-net#2285, sonic-net#2288, sonic-net#2289, sonic-net#2292, sonic-net#2294, sonic-net#8887, sonic-net#9279, sonic-net#9390, sonic-net#9511, sonic-net#9700, sonic-net#10025, sonic-net#10322, sonic-net#10479, sonic-net#10484, sonic-net#10493, sonic-net#10500, sonic-net#10580, sonic-net#10595, sonic-net#10628, sonic-net#10634, sonic-net#10635, sonic-net#10644, sonic-net#10670, sonic-net#10691, sonic-net#10716, sonic-net#10731, sonic-net#10750, sonic-net#10751, sonic-net#10752, sonic-net#10761, sonic-net#10769, sonic-net#10775, sonic-net#10776, sonic-net#10779, sonic-net#10786, sonic-net#10792, sonic-net#10793, sonic-net#10800, sonic-net#10806, sonic-net#10826, sonic-net#10839, sonic-net#10840, sonic-net#10842, sonic-net#10844, sonic-net#10847, sonic-net#10849, sonic-net#10852, sonic-net#10865, sonic-net#10872, sonic-net#10877, sonic-net#10886, sonic-net#10889, sonic-net#10903, sonic-net#10904, sonic-net#10905, sonic-net#10913, sonic-net#10914, sonic-net#10916, sonic-net#10919, sonic-net#10925, sonic-net#10926, sonic-net#10929, sonic-net#10933, sonic-net#10934, sonic-net#10937, sonic-net#10941, sonic-net#10947, sonic-net#10952, sonic-net#10953, sonic-net#10957, sonic-net#10959, sonic-net#10971, sonic-net#10972, sonic-net#10980
3 participants