Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

salt command does not really retry messages #56015

Open
sindhukothe opened this issue Jan 29, 2020 · 1 comment
Open

salt command does not really retry messages #56015

sindhukothe opened this issue Jan 29, 2020 · 1 comment
Labels
Bug broken, incorrect, or confusing behavior severity-low 4th level, cosemtic problems, work around exists
Projects
Milestone

Comments

@sindhukothe
Copy link
Contributor

sindhukothe commented Jan 29, 2020

Description of Issue

When a salt command is issued e.g.salt '*'test.ping and if the master is busy, it is supposed to retry 3 times (by default). But the salt command does not send other messages on the bus to master

Steps to Reproduce Issue

Not sure how to make master busy not to respond the messages in 5 seconds, But I am attaching the relevant code snippets here.

Versions Report

Salt Version:
           Salt: 2019.2.2

Dependency Versions:
           cffi: 1.12.3
       cherrypy: unknown
       dateutil: 2.7.2
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.6
        libgit2: 0.27.7
        libnacl: Not Installed
       M2Crypto: 0.33.0
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: 2.17
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.27.4
         Python: 2.7.16 (default, Jan 14 2020, 10:29:59)
   python-gnupg: 0.4.2
         PyYAML: 5.1.1
          PyZMQ: 16.0.3
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.1.6

System Versions:
           dist:
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-957.5.1.el7.x86_64
         system: Linux
        version: Not Installed


@sindhukothe
Copy link
Contributor Author

In file https://github.com/saltstack/salt/blob/master/salt/transport/zeromq.py under class AsyncReqMessageClient:

    def send(self, message, timeout=None, tries=3, future=None, callback=None, raw=False):

sends the message to the master.
It then waits for 5 seconds and calls timeout_message() function.
timeout_message function then calls the self.send() second time with the same message.
But because the self.send_queue already has 1 element in it, following code in send() blocks the 2nd message from dequeueing and actually sending on the bus.

    def send(self, message, timeout=None, tries=3, future=None, callback=None, raw=False):
....
....
        if len(self.send_queue) == 0:
            self.io_loop.spawn_callback(self._internal_send_recv)

I think the solution is to remove from send_queue as soon as it is sent on the bug and not after the future is complete in _internal_send_recv()

@Akm0d Akm0d added Bug broken, incorrect, or confusing behavior P4 Priority 4 and removed needs-triage labels Feb 7, 2020
@Akm0d Akm0d added this to the Approved milestone Feb 7, 2020
@Akm0d Akm0d added this to Considering in Sodium Feb 7, 2020
@sagetherage sagetherage removed this from Considering in Sodium Apr 24, 2020
@sagetherage sagetherage removed the P4 Priority 4 label Jun 3, 2020
@sagetherage sagetherage added the severity-low 4th level, cosemtic problems, work around exists label Jul 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior severity-low 4th level, cosemtic problems, work around exists
Projects
No open projects
Aluminium
Awaiting triage
Development

No branches or pull requests

3 participants