caclmgrd core dump file generated after loading scale ipv6 cacl rules #10883

Closed
ZhaohuiS opened this issue May 19, 2022 · 10 comments
Labels
MSFT Triaged this issue has been triaged

Comments

@ZhaohuiS
Contributor

ZhaohuiS commented May 19, 2022

Description

During the test test_cacl_application.py::test_cacl_scale_rules_ipv6, after applying the scale ipv6 rules, there are no ip6tables rules added.
From syslog, many syncd error logs appear, syncd crashed, and all containers restarted except database.

Even when the ip6tables rules are added successfully, a core dump file for caclmgrd is generated under /var/core during the test; the core file is attached.

Steps to reproduce the issue:

  1. Apply the scale_ipv6_rules.json file with the command acl-loader update full /tmp/scale_ipv6_rules.json
  2. Run ip6tables -S to check whether the ipv6 rules were added
  3. Check whether there is a new core file for caclmgrd under /var/core
  4. Check whether containers were restarted
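
Step 3 can be scripted. The helper below is only a sketch: it assumes core files are named process.epoch.pid.core or process.epoch.pid.core.gz, as the attachment caclmgrd.1652945008.68160.core.gz suggests, and the function name and the since parameter are made up for illustration.

```python
import os
import re

# Assumed naming, based on the attachment in this issue:
#   <process>.<epoch seconds>.<pid>.core or .core.gz
CORE_RE = re.compile(r"^(?P<proc>[^.]+)\.(?P<ts>\d+)\.(?P<pid>\d+)\.core(\.gz)?$")


def find_cores(core_dir, process, since=0):
    """Return core file names for `process` whose embedded timestamp is >= since."""
    found = []
    for name in sorted(os.listdir(core_dir)):
        m = CORE_RE.match(name)
        if m and m.group("proc") == process and int(m.group("ts")) >= since:
            found.append(name)
    return found
```

Calling find_cores("/var/core", "caclmgrd", since=start) with the epoch time recorded before step 1 would list only the cores produced during the run.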

Describe the results you received:

A core dump file is generated for caclmgrd under /var/core.
Error logs attached as cacl_ipv6_crash.log

Containers restarted:

admin@str-e1031-acs-3:~$ docker ps
CONTAINER ID        IMAGE                             COMMAND                  CREATED             STATUS              PORTS               NAMES
22b72452e1d3        docker-snmp:latest                "/usr/local/bin/supe…"   20 hours ago        Up 13 minutes                           snmp
37551648166d        docker-sonic-telemetry:latest     "/usr/local/bin/supe…"   20 hours ago        Up 13 minutes                           telemetry
aaa7a3a893f6        docker-router-advertiser:latest   "/usr/bin/docker-ini…"   20 hours ago        Up 13 minutes                           radv
a4bc99b6844d        docker-dhcp-relay:latest          "/usr/bin/docker_ini…"   20 hours ago        Up 13 minutes                           dhcp_relay
dec32bd00823        docker-lldp:latest                "/usr/bin/docker-lld…"   20 hours ago        Up 13 minutes                           lldp
40357149d286        docker-syncd-brcm:latest          "/usr/local/bin/supe…"   20 hours ago        Up 13 minutes                           syncd
f7eb9b8020cc        docker-teamd:latest               "/usr/local/bin/supe…"   20 hours ago        Up 13 minutes                           teamd
b089bdc62205        docker-orchagent:latest           "/usr/bin/docker-ini…"   20 hours ago        Up 13 minutes                           swss
573bf27961f9        docker-fpm-frr:latest             "/usr/bin/docker_ini…"   20 hours ago        Up 13 minutes                           bgp
7736c1e8ba96        docker-platform-monitor:latest    "/usr/bin/docker_ini…"   20 hours ago        Up 14 minutes                           pmon
0e33eb24db20        docker-database:latest            "/usr/local/bin/dock…"   20 hours ago        Up 7 hours                              database

Describe the results you expected:

Output of show version:

admin@str-e1031-acs-3:/var/log$ show version

SONiC Software Version: SONiC.20201231.66
Distribution: Debian 10.12
Kernel: 4.19.0-12-2-amd64
Build commit: 929ec21596
Build date: Wed May  4 00:35:07 UTC 2022
Built by: cloudtest@5ff90133c000006

Platform: x86_64-cel_e1031-r0
HwSKU: Celestica-E1031-T48S4
ASIC: broadcom
ASIC Count: 1
Serial Number: R0882F2B039723BY000014
Uptime: 10:46:01 up  8:57,  2 users,  load average: 2.46, 2.29, 2.22

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-platform-monitor    20201231.66         0b8a7d2ace20        544MB
docker-platform-monitor    latest              0b8a7d2ace20        544MB
docker-fpm-frr             20201231.66         c0563027e246        391MB
docker-fpm-frr             latest              c0563027e246        391MB
docker-teamd               20201231.66         dc94f0153a56        373MB
docker-teamd               latest              dc94f0153a56        373MB
docker-orchagent           20201231.66         74919164c1b8        389MB
docker-orchagent           latest              74919164c1b8        389MB
docker-sonic-telemetry     20201231.66         03f8622c7ff8        451MB
docker-sonic-telemetry     latest              03f8622c7ff8        451MB
docker-syncd-brcm          20201231.66         d567887d229b        654MB
docker-syncd-brcm          latest              d567887d229b        654MB
docker-snmp                20201231.66         e7461f9aaacc        405MB
docker-snmp                latest              e7461f9aaacc        405MB
docker-lldp                20201231.66         8828a0d657f6        402MB
docker-lldp                latest              8828a0d657f6        402MB
docker-database            20201231.66         887bf7078876        361MB
docker-database            latest              887bf7078876        361MB
docker-router-advertiser   20201231.66         97f8fed9a890        362MB
docker-router-advertiser   latest              97f8fed9a890        362MB
docker-dhcp-relay          20201231.66         4430b31fb785        378MB
docker-dhcp-relay          latest              4430b31fb785        378MB

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

scal_ipv6_rules.txt
caclmgrd.1652945008.68160.core.gz
cacl_ipv6_crash.log

@ZhaohuiS ZhaohuiS changed the title many containers restart after loading scale ipv6 cacl rules caclmgrd core dump file generated after loading scale ipv6 cacl rules May 19, 2022
@zhangyanzhao zhangyanzhao added Triaged this issue has been triaged MSFT labels May 25, 2022
@ZhaohuiS
Contributor Author

Paste core file decode information here:

root@str-e1031-acs-3:/var/core# gdb /usr/bin/python3 caclmgrd.1653912439.175656.core
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python3...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 175656]
[New LWP 178562]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/python3 /usr/local/bin/caclmgrd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fa891fc3382 in swss::DBConnector::hgetall<std::insert_iterator<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > (this=<optimized out>, key=..., result=...) at /usr/include/c++/8/bits/basic_string.h:2290
2290    /usr/include/c++/8/bits/basic_string.h: No such file or directory.
[Current thread is 1 (Thread 0x7fa89304c740 (LWP 175656))]
(gdb) 
(gdb) bt
#0  0x00007fa891fc3382 in swss::DBConnector::hgetall<std::insert_iterator<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > (this=<optimized out>, key=..., result=...) at /usr/include/c++/8/bits/basic_string.h:2290
#1  0x00007fa891ea10da in std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > swss::DBConnector::hgetall<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /lib/x86_64-linux-gnu/libswsscommon.so.0
#2  0x00007fa891e9e5c1 in swss::ConfigDBConnector_Native::get_table(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
   from /lib/x86_64-linux-gnu/libswsscommon.so.0
#3  0x00007fa891f9a37f in _wrap_ConfigDBConnector_Native_get_table (args=<optimized out>, kwargs=<optimized out>) at /usr/include/c++/8/bits/basic_string.h:936
#4  0x00000000005d88a4 in _PyMethodDef_RawFastCallKeywords ()
#5  0x000000000054b330 in ?? ()
#6  0x00000000005524cd in _PyEval_EvalFrameDefault ()
#7  0x00000000005d91fc in _PyFunction_FastCallKeywords ()
#8  0x000000000054b1f0 in ?? ()
#9  0x00000000005524cd in _PyEval_EvalFrameDefault ()
#10 0x000000000054c328 in _PyEval_EvalCodeWithName ()
#11 0x00000000005d94f2 in _PyFunction_FastCallKeywords ()
#12 0x000000000054b1f0 in ?? ()
#13 0x00000000005524cd in _PyEval_EvalFrameDefault ()
#14 0x00000000005d91fc in _PyFunction_FastCallKeywords ()
#15 0x000000000054e7e0 in _PyEval_EvalFrameDefault ()
#16 0x00000000005d91fc in _PyFunction_FastCallKeywords ()
#17 0x000000000054e5ac in _PyEval_EvalFrameDefault ()
#18 0x000000000054bcc2 in _PyEval_EvalCodeWithName ()
#19 0x000000000054e0a3 in PyEval_EvalCode ()
#20 0x0000000000630ce2 in ?? ()
#21 0x0000000000630d97 in PyRun_FileExFlags ()
#22 0x00000000006319ff in PyRun_SimpleFileExFlags ()
#23 0x000000000065432e in ?? ()
#24 0x000000000065468e in _Py_UnixMain ()
#25 0x00007fa89307309b in __libc_start_main (main=0x4bc560 <main>, argc=2, argv=0x7fff5c5b4fe8, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fff5c5b4fd8) at ../csu/libc-start.c:308
#26 0x00000000005e0e8a in _start ()
(gdb) 

@ZhaohuiS
Contributor Author

It crashed at line 863 of caclmgrd:
if self.config_db_map[namespace].get_table(self.ACL_TABLE)[acl_table]["type"] == self.ACL_TABLE_TYPE_CTRLPLANE:

Even with the following guard added, it crashes in the same way.
if self.config_db_map[namespace].get_table(self.ACL_TABLE):

The content of previous acl_table is:

{'DATA_INGRESS_IPV4_TEST': {'policy_desc': 'DATA_INGRESS_IPV4_TEST', 'ports': ['Ethernet46', 'Ethernet32', 'Ethernet12', 'Ethernet33', 'Ethernet45', 'Ethernet41', 'Ethernet34', 'Ethernet21', 'Ethernet31', 'Ethernet25', 'Ethernet5', 'Ethernet20', 'Ethernet3', 'Ethernet16', 'PortChannel102', 'Ethernet2', 'Ethernet7', 'Ethernet42', 'Ethernet17', 'Ethernet30', 'Ethernet22', 'Ethernet10', 'Ethernet36', 'Ethernet24', 'Ethernet29', 'PortChannel103', 'Ethernet38', 'Ethernet4', 'Ethernet40', 'Ethernet47', 'Ethernet6', 'Ethernet13', 'Ethernet18', 'Ethernet15', 'PortChannel101', 'Ethernet37', 'Ethernet26', 'PortChannel104', 'Ethernet28', 'Ethernet19', 'Ethernet14', 'Ethernet1', 'Ethernet11', 'Ethernet8', 'Ethernet39', 'Ethernet44', 'Ethernet23', 'Ethernet35', 'Ethernet0', 'Ethernet27', 'Ethernet9', 'Ethernet43'], 'stage': 'ingress', 'type': 'L3'}, 'EVERFLOW': {'policy_desc': 'EVERFLOW', 'ports': ['PortChannel101', 'PortChannel102', 'PortChannel103', 'PortChannel104', 'Ethernet0', 'Ethernet1', 'Ethernet2', 'Ethernet3', 'Ethernet4', 'Ethernet5', 'Ethernet6', 'Ethernet7', 'Ethernet8', 'Ethernet9', 'Ethernet10', 'Ethernet11', 'Ethernet12', 'Ethernet13', 'Ethernet14', 'Ethernet15', 'Ethernet16', 'Ethernet17', 'Ethernet18', 'Ethernet19', 'Ethernet20', 'Ethernet21', 'Ethernet22', 'Ethernet23', 'Ethernet24', 'Ethernet25', 'Ethernet26', 'Ethernet27', 'Ethernet28', 'Ethernet29', 'Ethernet30', 'Ethernet31', 'Ethernet32', 'Ethernet33', 'Ethernet34', 'Ethernet35', 'Ethernet36', 'Ethernet37', 'Ethernet38', 'Ethernet39', 'Ethernet40', 'Ethernet41', 'Ethernet42', 'Ethernet43', 'Ethernet44', 'Ethernet45', 'Ethernet46', 'Ethernet47'], 'stage': 'ingress', 'type': 'MIRROR'}, 'EVERFLOWV6': {'policy_desc': 'EVERFLOWV6', 'ports': ['PortChannel101', 'PortChannel102', 'PortChannel103', 'PortChannel104', 'Ethernet0', 'Ethernet1', 'Ethernet2', 'Ethernet3', 'Ethernet4', 'Ethernet5', 'Ethernet6', 'Ethernet7', 'Ethernet8', 'Ethernet9', 'Ethernet10', 'Ethernet11', 'Ethernet12', 'Ethernet13', 'Ethernet14', 
'Ethernet15', 'Ethernet16', 'Ethernet17', 'Ethernet18', 'Ethernet19', 'Ethernet20', 'Ethernet21', 'Ethernet22', 'Ethernet23', 'Ethernet24', 'Ethernet25', 'Ethernet26', 'Ethernet27', 'Ethernet28', 'Ethernet29', 'Ethernet30', 'Ethernet31', 'Ethernet32', 'Ethernet33', 'Ethernet34', 'Ethernet35', 'Ethernet36', 'Ethernet37', 'Ethernet38', 'Ethernet39', 'Ethernet40', 'Ethernet41', 'Ethernet42', 'Ethernet43', 'Ethernet44', 'Ethernet45', 'Ethernet46', 'Ethernet47'], 'stage': 'ingress', 'type': 'MIRRORV6'}, 'NTP_ACL': {'policy_desc': 'NTP_ACL', 'services': ['NTP'], 'stage': 'ingress', 'type': 'CTRLPLANE'}, 'SNMP_ACL': {'policy_desc': 'SNMP_ACL', 'services': ['SNMP'], 'stage': 'ingress', 'type': 'CTRLPLANE'}, 'SSH_ONLY': {'policy_desc': 'SSH_ONLY', 'services': ['SSH'], 'stage': 'ingress', 'type': 'CTRLPLANE'}}

From the decoded backtrace, it seems to crash at auto const& entry = client.hgetall<map<string, string>>(key); in function ConfigDBConnector_Native::get_table. However, the message "/usr/include/c++/8/bits/basic_string.h: No such file or directory" looks incorrect.

map<string, map<string, string>> ConfigDBConnector_Native::get_table(string table)
{
    auto& client = get_redis_client(m_db_name);
    string pattern = to_upper(table) + m_table_name_separator + "*";
    const auto& keys = client.keys(pattern);
    map<string, map<string, string>> data;
    for (auto& key: keys)
    {
        auto const& entry = client.hgetall<map<string, string>>(key);
        size_t pos = key.find(m_table_name_separator);
        string row;
        if (pos == string::npos)
        {
            continue;
        }
        row = key.substr(pos + 1);
        data[row] = entry;
    }
    return data;
}
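
For readers who work mostly on the Python side of caclmgrd, the loop above can be mirrored roughly as follows. This is an illustrative sketch only: FakeRedis is a hypothetical in-memory stand-in (not the swsscommon or redis-py API), and the "|" separator is an assumption about the value of m_table_name_separator.

```python
from fnmatch import fnmatchcase


class FakeRedis:
    """Hypothetical in-memory stand-in for the Redis client used by get_table."""

    def __init__(self, hashes):
        self._hashes = hashes  # key -> dict of field/value pairs

    def keys(self, pattern):
        return [k for k in self._hashes if fnmatchcase(k, pattern)]

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


def get_table(client, table, sep="|"):
    """Mirror of the C++ loop: one hgetall per key matching TABLE<sep>*."""
    data = {}
    for key in client.keys(table.upper() + sep + "*"):
        entry = client.hgetall(key)  # the call the backtrace points at
        pos = key.find(sep)
        if pos == -1:
            continue
        data[key[pos + 1:]] = entry  # row name is everything after the first separator
    return data
```

A single table read therefore issues one request per matching key on the same client, so a large rule set means many consecutive calls on that connection.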

@liuh-80
Contributor

liuh-80 commented Jun 7, 2022

Can't reproduce this issue with the latest 202012 build:

https://sonic-build.azurewebsites.net/ui/sonic/pipelines/142/builds/105963/artifacts/198518?branchName=202012&artifactName=sonic-buildimage.vs

admin@vlab-01:~$ show version

SONiC Software Version: SONiC.202012.105963-5d2ae332d
Distribution: Debian 10.12
Kernel: 4.19.0-12-2-amd64
Build commit: 5d2ae33
Build date: Fri Jun 3 15:31:21 UTC 2022
Built by: AzDevOps@sonic-build-workers-001KMH

admin@vlab-01:~$ acl-loader update full ./scale_ipv6.json
admin@vlab-01:~$ ls /var/core
admin@vlab-01:~$ sudo iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-A INPUT -s 127.0.0.1/32 -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 0 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 3 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 11 -j ACCEPT
-A INPUT -p udp -m udp --dport 67:68 -j ACCEPT
-A INPUT -p udp -m udp --dport 546:547 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 179 -j ACCEPT
-A INPUT -p tcp -m tcp --sport 179 -j ACCEPT
-A INPUT -d 10.1.0.32/32 -j DROP
-A INPUT -d 10.250.0.0/32 -j DROP
-A INPUT -d 192.168.0.1/32 -j DROP
-A INPUT -d 10.0.0.56/32 -j DROP
-A INPUT -d 10.0.0.58/32 -j DROP
-A INPUT -d 10.0.0.60/32 -j DROP
-A INPUT -d 10.0.0.62/32 -j DROP
-A INPUT -m ttl --ttl-lt 2 -j ACCEPT
-A INPUT -j DROP
admin@vlab-01:~$


admin@vlab-01:~$ sudo docker ps
CONTAINER ID        IMAGE                                COMMAND                  CREATED             STATUS              PORTS               NAMES
588ddb3d5abb        docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   6 minutes ago       Up 3 minutes                            mgmt-framework
792a362086ce        docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   6 minutes ago       Up 3 minutes                            telemetry
88096a1fef86        docker-snmp:latest                   "/usr/local/bin/supe…"   6 minutes ago       Up 3 minutes                            snmp
f69bad37e3c1        docker-platform-monitor:latest       "/usr/bin/docker_ini…"   8 minutes ago       Up 3 minutes                            pmon
e0cecfbba95a        docker-router-advertiser:latest      "/usr/bin/docker-ini…"   9 minutes ago       Up 3 minutes                            radv
f99380455ec7        docker-lldp:latest                   "/usr/bin/docker-lld…"   9 minutes ago       Up 3 minutes                            lldp
b429272c7016        docker-dhcp-relay:latest             "/usr/bin/docker_ini…"   9 minutes ago       Up 3 minutes                            dhcp_relay
ad7301bfd135        docker-gbsyncd-vs:latest             "/usr/local/bin/supe…"   9 minutes ago       Up 3 minutes                            gbsyncd
3e8053c3f05a        docker-syncd-vs:latest               "/usr/local/bin/supe…"   9 minutes ago       Up 3 minutes                            syncd
c94d0427b04b        docker-teamd:latest                  "/usr/local/bin/supe…"   9 minutes ago       Up 3 minutes                            teamd
960718fe8ac5        docker-orchagent:latest              "/usr/bin/docker-ini…"   3 days ago          Up 3 minutes                            swss
6e52ee6eabc9        docker-fpm-frr:latest                "/usr/bin/docker_ini…"   3 days ago          Up 3 minutes                            bgp
be2d6ac77868        docker-database:latest               "/usr/local/bin/dock…"   3 days ago          Up 9 minutes                            database
admin@vlab-01:~$

@liuh-80
Contributor

liuh-80 commented Jun 7, 2022

Confirmed this issue only happens on real hardware; will debug with hardware later.

@liuh-80
Contributor

liuh-80 commented Jun 17, 2022

Created a draft PR for this issue: sonic-net/sonic-swss-common#634
Installed a locally built package for this issue on the test device and could not reproduce the issue with manual testing.

@liuh-80
Contributor

liuh-80 commented Jun 20, 2022

After offline discussion, caclmgrd needs a code change:

  1. For performance reasons, libswsscommon is not thread-safe by design.
  2. caclmgrd shares a config DB connection across threads, so it needs a code change to fix this issue.
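
To illustrate why sharing one connection between threads is hazardous, here is a deterministic sketch. It does not use libswsscommon: PipelinedConnection is a hypothetical model of a client whose replies come back in request order, and the interleaving is written out by hand rather than produced by real threads. Whether this exact failure mode is what produced the core dump is not established in this thread.

```python
from collections import deque


class PipelinedConnection:
    """Hypothetical model: replies are returned in the order requests were sent."""

    def __init__(self, store):
        self._store = store
        self._pending = deque()

    def send(self, key):
        self._pending.append(self._store[key])

    def recv(self):
        return self._pending.popleft()


store = {
    "ACL_TABLE|SSH_ONLY": {"type": "CTRLPLANE"},
    "ACL_RULE|SSH_ONLY|RULE_1": {"PRIORITY": "9999"},
}

# Shared connection: callers A and B interleave on the same object.
shared = PipelinedConnection(store)
shared.send("ACL_TABLE|SSH_ONLY")         # A sends its request
shared.send("ACL_RULE|SSH_ONLY|RULE_1")   # B sends its request
reply_for_b = shared.recv()               # B reads first and receives A's reply
reply_for_a = shared.recv()               # A receives B's reply

# Separate connections: each caller only ever sees its own reply.
conn_a = PipelinedConnection(store)
conn_b = PipelinedConnection(store)
conn_a.send("ACL_TABLE|SSH_ONLY")
conn_b.send("ACL_RULE|SSH_ONLY|RULE_1")
own_b = conn_b.recv()
own_a = conn_a.recv()
```

With the shared object, each caller ends up holding data shaped for the other request; with one connection per caller, the same interleaving is harmless.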

ZhaohuiS added a commit to sonic-net/sonic-mgmt that referenced this issue Sep 22, 2022
What is the motivation for this PR?
test_cacl_scale_rules_ipv4 and test_cacl_scale_rules_ipv6 always fail because of this issue:
sonic-net/sonic-buildimage#10883

How did you do it?
Skip these 2 cases until the issue is fixed.

How did you verify/test it?
Run cacl/test_cacl_application.py::test_cacl_scale_rules_ipv4 or cacl/test_cacl_application.py::test_cacl_scale_rules_ipv6

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
wangxin pushed a commit to sonic-net/sonic-mgmt that referenced this issue Oct 10, 2022
Azarack pushed a commit to Azarack/sonic-mgmt that referenced this issue Oct 17, 2022
allen-xf pushed a commit to allen-xf/sonic-mgmt that referenced this issue Oct 28, 2022
@ZhaohuiS
Contributor Author

ZhaohuiS commented Apr 3, 2023

Will enhance the caclmgrd code so that multiple threads no longer access the same DB connector.
The sub-thread will create its own DB connector and use it to connect to the DB and to read and write.
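
The planned change can be sketched as below. ConfigDBStub is a hypothetical stand-in so the example runs without SONiC installed; in caclmgrd the real connector class and its arguments would be used instead. The only point is that the connector is created inside the thread's target function rather than handed over from the main thread.

```python
import threading


class ConfigDBStub:
    """Hypothetical stand-in for a config DB connector (not the swsscommon API)."""

    def __init__(self, tables):
        self._tables = tables
        self.owner = threading.get_ident()  # thread that created this connector

    def get_table(self, name):
        # Guard that makes cross-thread use fail loudly in this sketch.
        if threading.get_ident() != self.owner:
            raise RuntimeError("connector used from a different thread")
        return dict(self._tables.get(name, {}))


TABLES = {"ACL_TABLE": {"SSH_ONLY": {"type": "CTRLPLANE"}}}


def worker(results):
    db = ConfigDBStub(TABLES)  # new connector, owned by this thread
    results.append(db.get_table("ACL_TABLE"))


main_db = ConfigDBStub(TABLES)  # stays with the main thread
results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
t.join()
```

The ownership check is not part of any real connector; it is there only to make accidental sharing visible when experimenting with the pattern.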

ZhaohuiS added a commit to sonic-net/sonic-host-services that referenced this issue Jun 9, 2023
Fix the issue sonic-net/sonic-buildimage#10883.
For performance reasons, libswsscommon is not thread-safe by design.
caclmgrd shares a config DB connection across threads, so change it to use a new DB connector in the child thread.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
@ZhaohuiS
Contributor Author

Fixed it in sonic-net/sonic-host-services#62

yejianquan pushed a commit to sonic-net/sonic-mgmt that referenced this issue Jun 14, 2023
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>

Approach
What is the motivation for this PR?
Since sonic-net/sonic-buildimage#10883 was fixed, enable scale cacl test cases.

How did you do it?
Remove skipped section of scale cases from conditional mark file.
There is one more Status column, so the expected table length is increased from 6 to 7.

admin@str-e1031-acs-1:~$ show acl rule
Table    Rule    Priority    Action    Match    Status
-------  ------  ----------  --------  -------  --------
admin@str-e1031-acs-1:~$ 

How did you verify/test it?
Run tests/cacl/test_cacl_application.py
@ZhaohuiS
Contributor Author

This fix hasn't been ported to 202012 and 202205 yet; reopening to make sure the scale cases are skipped on the feature branches.
Will close it once it's ported to the feature branches.

cacl/test_cacl_application.py::test_cacl_scale_rules_ipv4:
  skip:
    reason: "caclmgrd may crash after loading scale ipv4 cacl rules."
    conditions: #10883

cacl/test_cacl_application.py::test_cacl_scale_rules_ipv6:
  skip:
    reason: "caclmgrd may crash after loading scale ipv6 cacl rules."
    conditions: #10883

@ZhaohuiS ZhaohuiS reopened this Jun 14, 2023
qiluo-msft pushed a commit that referenced this issue Jul 4, 2023

Cherry pick PR for sonic-net/sonic-host-services#62

#### Why I did it
Fix the issue #10883.

##### Work item tracking
- Microsoft ADO **(17795594)**:

#### How I did it
For performance reasons, libswsscommon is not thread-safe by design.
caclmgrd shares a config DB connection across threads, so change it to use a new DB connector in the child thread.

#### How to verify it
Load scale ipv4/ipv6 rules and verify whether caclmgrd crashes.
ZhaohuiS added a commit to ZhaohuiS/sonic-host-services that referenced this issue Jul 8, 2023
@ZhaohuiS
Contributor Author

The fix was merged into the latest 202012 and 202305 images. Closing.

@ZhaohuiS ZhaohuiS reopened this Aug 29, 2023
yejianquan pushed a commit to sonic-net/sonic-mgmt that referenced this issue Aug 30, 2023
Description of PR
Summary:
Since sonic-net/sonic-buildimage#10883 was fixed, enable scale cacl test cases.

Cherry pick #8592

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
yejianquan pushed a commit to sonic-net/sonic-mgmt that referenced this issue Aug 30, 2023
Description of PR
Summary:
Since sonic-net/sonic-buildimage#10883 was fixed, enable scale cacl test cases.

Approach
What is the motivation for this PR?
Cherry pick #8592

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
mrkcmo pushed a commit to Azarack/sonic-mgmt that referenced this issue Oct 3, 2023
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this issue Jan 25, 2024