Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Napalm net.cli race condition #55332

Open
danfoster opened this issue Nov 15, 2019 · 7 comments
Open

Napalm net.cli race condition #55332

danfoster opened this issue Nov 15, 2019 · 7 comments
Labels
Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged
Milestone

Comments

@danfoster
Copy link

Description of Issue

If you run 2 parallel net.cli commands, the output can get mixed up. There is also a change it can break the napalm proxy-minion if the wrong process reads consumes the prompt.

In the example below I run the following two loops:

while [ true ]; do salt 'mlnx1.switch' net.cli "show version"; done
while [ true ]; do salt 'mlnx1.switch' net.cli "show int status"; done

This quickly reproduces the issue, but a more common, real world example is when the minion grains gets refreshed at the same time as applying a state.

In the above example, you would expect the first command to produce the following output:

            Product name:      Onyx
            Product release:   3.8.2004
            Build ID:          #1-dev
            Build date:        2019-09-23 14:19:47
            Target arch:       x86_64
            Target hw:         x86_64
            Built by:          jenkins@7ae5fd122b61
            Version summary:   X86_64 3.8.2004 2019-09-23 14:19:47 x86_64

            Product model:     x86onie
            Host ID:           B8599F0E752C
            System serial num: MT1917K00431
            System UUID:       946e383c-69ed-11e9-8000-b8599f63da40

            Uptime:            7d 23h 1m 28.512s
            CPU load averages: 3.01 / 3.05 / 3.07
            Number of CPUs:    4
            System memory:     2628 MB used / 5175 MB free / 7803 MB total
            Swap:              0 MB used / 0 MB free / 0 MB total

And the 2nd command:

           --------------------------------------------------------------------------------------------------------------------------------

            Port                Operational state   Admin                         Speed           Description
            --------------------------------------------------------------------------------------------------------------------------------
            mgmt0               Up                  Enabled                       1000Mb/s (auto) -
            Eth1/1              Down                Enabled                       Unknown         -
            Eth1/2              Down                Enabled                       Unknown         -
            Eth1/3              Down                Enabled                       Unknown         -
            Eth1/4              Down                Enabled                       Unknown         -
            Eth1/5              Down                Enabled                       Unknown         -
            Eth1/6              Down                Enabled                       Unknown         -
            Eth1/7              Down                Enabled                       Unknown         -
            Eth1/8              Down                Enabled                       Unknown         -
            Eth1/9              Down                Enabled                       Unknown         -
            Eth1/10             Down                Enabled                       Unknown         -
            Eth1/11             Down                Enabled                       Unknown         -
            Eth1/12             Down                Enabled                       Unknown         -
            Eth1/13             Down                Enabled                       Unknown         -
            Eth1/14             Down                Enabled                       Unknown         -
            Eth1/15             Down                Enabled                       Unknown         -
            Eth1/16             Down                Enabled                       Unknown         -
            Eth1/17             Down                Enabled                       Unknown         -
            Eth1/18             Down                Enabled                       Unknown         -
            Eth1/19             Up                  Enabled                       40G             -
            Eth1/20             Up                  Enabled                       40G             -
            Eth1/21             Down                Enabled                       Unknown         -
            Eth1/22             Down                Enabled                       Unknown         -

In the file attached, I show an example of the output of the proxy-minion in debug mode when the two commands are getting merged.

  • The first command is alloctaed JID 20191115154146169012.
  • The 2nd command is allocated JID: 20191115154146711930

20191115154146169012 returns:

[DEBUG   ] minion return: {u'fun_args': [u'show version'], u'jid': u'20191115154146169012', u'return': {u'comment': u'', u'result': True, u'out': {u'show version':
    u'us\n--------------------------------------------------------------------------------------------------------------------------------\n\nPort                Operational state   Admin                         Speed
    Description                             \n--------------------------------------------------------------------------------------------------------------------------------\nmgmt0               Up
    Enabled                       1000Mb/s (auto) -\nEth1/1              Down                Enabled                       Unknown         -                                       \nEth1/2              Down
    Enabled                       Unknown         -                                       \nEth1/3              Down                Enabled                       Unknown         -                                       \nEth1/
    4              Down                Enabled                       Unknown         -                                       \nEth1/5              Down                Enabled
    Unknown         -                                       \nEth1/12             Down                Enabled                       Unknown         -                                       \nEth1/13             Down
    Enabled                       Unknown         -                                       \nEth1/14             Down                Enabled                       Unknown         -                                       \nEth1/
    15             Down                Enabled                       Unknown         -                                       \nEth1/16             Down                Enabled
    Unknown         -                                       \nEth1/17             Down                Enabled                       Unknown         -                                       \nEth1/18             Down
    Enabled                       Unknown         -                                       \n\n'}}, u'retcode': 0, u'success': True, u'fun': u'net.cli'}

20191115154146711930 returns:

[DEBUG   ] minion return: {u'fun_args': [u'show int status'], u'jid': u'20191115154146711930', u'return': {u'comment': u'', u'result': True, u'out': {u'show int status': u'
    Down                Enabled                       Unknown         -                                       \nEth1/7              Down                Enabled
    Unknown         -                                       \nEth1/8              Down                Enabled                       Unknown         -                                       \nEth1/9              Down
    Enabled                       Unknown         -                                       \nEth1/10             Down                Enabled                       Unknown         -                                       \nEth1/
    11             Down                Enabled                       Unknown         -                                       \nEth1/19             Up                  Enabled
    40G             -                                       \nEth1/20             Up                  Enabled                       40G             -                                       \nEth1/21             Down
    Enabled                       Unknown         -                                       \nEth1/22             Down                Enabled                       Unknown         -                                       \n'}},
    u'retcode': 0, u'success': True, u'fun': u'net.cli'}

Versions Report

# salt --versions-report
Salt Version:
           Salt: 2019.2.2

Dependency Versions:
           cffi: 1.12.3
       cherrypy: unknown
       dateutil: 2.7.2
      docker-py: 3.5.0
          gitdb: 2.0.5
      gitpython: 2.1.11
          ioflo: Not Installed
         Jinja2: 2.9.6
        libgit2: Not Installed
        libnacl: 1.6.1
       M2Crypto: 0.31.0
           Mako: 1.0.12
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: 1.2.5
      pycparser: 2.19
       pycrypto: 2.6.1
   pycryptodome: 3.7.3
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  7 2019, 00:51:29)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: 2.0.5
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.7.1908 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-1062.1.2.el7.x86_64
         system: Linux
        version: CentOS Linux 7.7.1908 Core

Using the onyx napalm driver (https://github.com/napalm-automation-community/napalm-onyx), but I don't think this should make a difference?

napalm_bug.txt

@Ch3LL
Copy link
Contributor

Ch3LL commented Jan 6, 2020

ping @saltstack/team-network any ideas here?

@Ch3LL Ch3LL added Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged and removed needs-triage labels Jan 6, 2020
@Ch3LL Ch3LL added this to the Blocked milestone Jan 6, 2020
@stale
Copy link

stale bot commented Feb 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Feb 5, 2020
@danfoster
Copy link
Author

I'm still interested in how we can debug this further and hopefully solve it, so commenting to prevent the stalebot from closing this issue.

@stale
Copy link

stale bot commented Feb 5, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale stale bot removed the stale label Feb 5, 2020
@stale
Copy link

stale bot commented Mar 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Mar 6, 2020
@danfoster
Copy link
Author

danfoster commented Mar 12, 2020

I'm still interested in how we can debug this further (anti stale-bot comment)

@stale
Copy link

stale bot commented Mar 12, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale stale bot removed the stale label Mar 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged
Projects
No open projects
[Test] Triage
  
Needs triage
Development

No branches or pull requests

3 participants