[BUG] 3003 salt-minion msgpack deserialization failure when running salt-master custom runner in high concurrency. #60094

Open
ninoY25 opened this issue Apr 28, 2021 · 3 comments
Assignees
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt msgpack Phosphorus v3005.0 Release code name and version severity-high 2nd top severity, seen by most users, causes major problems

Comments

@ninoY25

ninoY25 commented Apr 28, 2021

Description
I wrote a custom runner module that uploads a bash script to the master and runs it on multiple minions using the cmd.script module. While load testing this runner, I hit a strange failure: when I run multiple instances of the custom runner in parallel, one of the minions raises Unpack failed: incomplete input. With the salt-minion log_level_logfile set to trace, I can see that the incoming message is complete when it first arrives from the master. The msgpack deserialization failure only shows up afterwards, when the process manager forks a process to execute the job.

salt-master custom runner script.py
import logging
import os
import errno

__virtualname__ = 'script'


def __virtual__():
    return True


log = logging.getLogger(__name__)
SCRIPT_DICT = '/var/cache/salt/scripts'


def run(file_name,
        file_content,
        minion_ids,
        runas=None,
        env=None,
        saiga_jid=''):
    """
    A function to run script for saiga

    CLI Example::

    salt-run script.run file_name="helloworld.sh" file_content="echo \$key" minion_ids=minion_id1,minion_id2 runas=root env="{'key': 'value'}"
    """
    try:
        res = _make_file(file_name, file_content)
        log.debug(res)
    except Exception as e:
        return {
            'retcode': 2,
            'error_msg': 'create file failed, error_msg: %s' % str(e)
        }

    try:
        result_jid, minions = _execute_scripts(
            file_name, minion_ids, runas, env)
    except Exception as e:
        return {
            'retcode': 3,
            'error_msg': 'execute scripts failed, error_msg: %s' % str(e)
        }

    return {
        'retcode': 0,
        'result': {
            'jid': result_jid,
            'minions': minions
        }
    }


def _make_file(file_name, file_content):
    import salt.utils.files
    import salt.utils.stringutils  # to_str() is used below

    filepath = '%s/%s' % (SCRIPT_DICT, file_name)
    if not os.path.exists(os.path.dirname(filepath)):
        try:
            os.makedirs(os.path.dirname(filepath))
        except OSError as exc:
            if exc.errno != errno.EEXIST:
                raise

    contents = []
    for line in file_content:
        contents.append("{}\n".format(line))
    with salt.utils.files.fopen(filepath, "w") as ofile:
        ofile.write(salt.utils.stringutils.to_str("".join(contents)))
    return 'Wrote {} lines to "{}"'.format(len(contents), filepath)


def _execute_scripts(file_name, minion_ids, runas, env):
    import salt.client
    import salt.utils.minions

    salt_filepath = 'salt://%s' % file_name
    client = salt.client.LocalClient(__opts__["conf_file"])
    ckminion = salt.utils.minions.CkMinions(__opts__)
    minions = ckminion.check_minions(
        minion_ids,
        'list',
        greedy=False
    )['minions']
    result_jid = client.cmd_async(tgt=minion_ids,
                                  fun="cmd.script",
                                  arg=[salt_filepath],
                                  kwarg={
                                      'runas': runas,
                                      'env': env
                                  },
                                  tgt_type='list')
    return result_jid, minions
salt-minion log for the msgpack failure
2021-04-21 19:01:17,476 [salt.payload     :131 ][CRITICAL][9002] Could not deserialize msgpack message. This often happens when trying to read a file not in binary mode. To see message payload, enable debug logging and retry. Exception: Unpack failed: incomplete input
2021-04-21 19:01:17,477 [salt.payload     :133 ][DEBUG   ][9002] Msgpack deserialization failure on message: <EF><BF><BD><EF><BF><BD>tgt_type<EF><BF><BD>list<EF><BF><BD>jid<EF><BF><BD>20210421110117018015<EF><BF><BD>tgt<EF><BF>
<BD>^@x<EF><BF><BD>^@$033fda5d-8cc0-4a57-8879-b57833f35b08<EF><BF><BD>^@$bf921cd5-ac3a-4dc7-917b-719052d396c6<EF><BF><BD>^@$8c5358af-9e66-4c01-8e57-2ec77e6d3df9<EF><BF><BD>^@$ac93959a-4d9e-4a66-8f2b-62f4b4029eec<EF><BF><BD>^@$690c0a8c-095d-4642-8f89-19a0e0147f32<EF><BF><BD>^@$abb48b03-e2c9-4745-8610-9c571e55adea<EF><BF><BD>^@$f53f3bbc-f078-4f0d-8198-2c1c1f02569b<EF><BF><BD>^@$42b88c2c-2a4e-4a8f-a633-51e4d5259e76<EF><BF><BD>^@$c5284f34-97aa-41a0-994b-b3ebdaecabe7<EF><BF><BD>^@$c3391f11-370d-4393-8867-0477fb069930<EF><BF><BD>^@$0b9e7a8d-d834-47f5-a011-5d80d2871ce7<EF><BF><BD>^@$afea455e-2b44-4bdc-bccc-aa012a58dbe5<EF><BF><BD>^@$5b0d83a8-86df-4ec7-ae62-85289a7ba94b<EF><BF><BD>^@$d06bea71-24ca-4f52-aaa0-7a5ff7090683<EF><BF><BD>^@$9456a5b9-af32-419e-b986-4689fae85634<EF><BF><BD>^@$90b3a370-f2d5-4757-9975-23b4d4630b58<EF><BF><BD>^@$03e32cd4-e2fb-40ab-97e8-eab293598a21<EF><BF><BD>^@$858f4a60-eb10-44c8-9085-28d4805a8468<EF><BF><BD>^@$85bed53a-4611-4b48-8e0d-a6bfb954ec8c<EF><BF><BD>^@$41516d80-32ec-4e9d-8880-363ddef64d0a<EF><BF><BD>^@$c77bea32-1cff-43d0-8b6f-2182044b1be7<EF><BF><BD>^@$8f0fa6fe-722d-4713-8750-44e193ac61e0<EF><BF><BD>^@$9a0ecc02-db7b-432f-8466-255ff45f22bf<EF><BF><BD>^@$e4f8b0b1-d2a6-485c-8418-2ce50068366d<EF><BF><BD>^@$1bdd3cee-353b-4c43-9e97-679517bb497e<EF><BF><BD>^@$ff588eab-be02-44f6-9ef3-67f253c0e52d<EF><BF><BD>^@$8bd1d636-7150-4669-8995-3181ddc4c96f<EF><BF><BD>^@$5404a218-376e-4f2b-9232-d3f6d7a0e4c0<EF><BF><BD>^@$5813023b-6344-4b57-a3c3-c414662220f9<EF><BF><BD>^@$076ccc10-cabb-4776-84d8-93c6c334201b<EF><BF><BD>^@$a4c34c24-6a0d-474c-965c-bc0d351711fe<EF><BF><BD>^@$7f07ac43-e007-4297-9d9f-aecdb2709520<EF><BF><BD>^@$5909f602-59c7-47db-a365-fe34add54b2b<EF><BF><BD>^@$d96765e3-0d97-4547-ab52-4a20c26a66b5<EF><BF><BD>^@$5ce7c047-238c-44fd-aae6-5fd3b7aff15a<EF><BF><BD>^@$c7b42b1e-e3e1-4047-af22-d47ce891d0e0<EF><BF><BD>^@$8a2c4207-c8e5-43af-bfe9-937b92d29fdb<EF><BF><BD>^@$3398d456-8ad6-4c2f-a521-dec7fc2e210a<EF><BF><BD>^@$357df518-22b8-4fcb-a7f7-f6a86d18ada3<EF>
<BF><BD>^@$9ada5a66-9c47-42cd-bfce-99cc1a078b15<EF><BF><BD>^@$730716a9-7afb-4473-a68c-948233dc3ce9<EF><BF><BD>^@$3d3c3b07-1b12-45f3-94fe-a57b8470bf94<EF><BF><BD>^@$16dd5916-15b7-4f6b-9654-7a6fe19f8f86<EF><BF><BD>^@$13a816ec-efd0-43a7-aaf5-2894652ae101<EF><BF><BD>^@$e2db5dd3-a8d7-4d18-9291-3a236162bc37<EF><BF><BD>^@$6bea95c8-250b-4c5c-8e8a-80e03e593348<EF><BF><BD>^@$e74c7748-7836-4d0b-a829-34abe10e93ce<EF><BF><BD>^@$ee04417c-255a-49ad-a532-12e4560c79c7<EF><BF><BD>^@$482e405a-52d7-4a3e-a039-e5134e573404<EF><BF><BD>^@$d7a14ba6-d152-400f-b0a6-615e824aa860<EF><BF><BD>^@$9fe71805-5f42-4ec2-abb8-e2a180da51bd<EF><BF><BD>^@$0292c9a1-f03e-4f0d-8594-ff2dc7080102<EF><BF><BD>^@$ad93bfc3-cda5-47ef-a375-8d6253f28472<EF><BF><BD>^@$bc17ce00-bbaf-45f1-b13f-dd718ef22b3a<EF><BF><BD>^@$e20b7301-eb5b-46b0-8eff-e39822db5458<EF><BF><BD>^@$159545b1-6056-4271-8151-27c3540e7db1<EF><BF><BD>^@$12fa7dd1-05ab-4979-b122-8b15388f891a<EF><BF><BD>^@$0db30562-c0ef-49aa-a82e-074fb569e869<EF><BF><BD>^@$438521c2-8be9-4fac-bded-a7ca630d2571<EF><BF><BD>^@$9865fc81-db67-410c-94d1-afaceb5521ea<EF><BF><BD>^@$f338a84c-53d6-4908-a48b-221cf0836e1e<EF><BF><BD>^@$6e0f995c-44b4-4dd3-86a9-cd28f168f170<EF><BF><BD>^@$2f5f37f4-da51-4ed7-9707-4aa7448b98f6<EF><BF><BD>^@$23c3c24f-76c1-4c9f-862f-fc3c82cae311<EF><BF><BD>^@$95883e68-baf1-48af-8897-0b3292e8d3b8<EF><BF><BD>^@$27c435b8-e07f-43f8-9030-d8a42d38fb86<EF><BF><BD>^@$9942e6f7-e3e2-4b7a-a6d6-32bb62524647<EF><BF><BD>^@$0ce31c7c-4b75-4800-8e6a-e063eb0e5abc<EF><BF><BD>^@$ce9a5f22-2443-4fb3-87b5-0dfb016fbf3b<EF><BF><BD>^@$1cc83b1e-1ce7-435c-9efb-5975b7f0fea7<EF><BF><BD>^@$e43a2d6f-b171-43d8-ba12-7be669ebb727<EF><BF><BD>^@$0cc5d1b5-b83a-46b7-bfa7-585f74c7675f
<EF><BF><BD>^@$baf393e7-caa1-4e44-b9dd-92e3d7d7f043<EF><BF><BD>^@$70dd057c-65b4-42c6-b31b-f105a5b52e3d<EF><BF><BD>^@$ef95b4f0-e9f0-4ce6-a3a1-a2e1f15662cc<EF><BF><BD>^@$86abf25c-a249-440f-9a3a-5490ef38727a<EF><BF><BD>^@$de8d30f4-814d-4f6e-8e9c-4662c7e27e6f<EF><BF><BD>^@$85f92425-a9eb-4e70-95fd-d52512d81b6b<EF><BF><BD>^@$f2c2e43a-461a-4168-9e29-b8eba81ac961<EF><BF><BD>^@$e000885d-a17e-45b6-847a-2b7c3909e2d4<EF><BF><BD>^@$86e72073-4563-492c-b7b6-e7691c0a4bfd
<EF><BF><BD>^@$c154f26a-70a0-4cc1-b711-31d9d239ba5a<EF><BF><BD>^@$5e96f3af-0470-4359-aa67-c8e47f257432<EF><BF><BD>^@$b4b8b6e6-b4cb-4054-bda6-a7e6aed4ecef<EF><BF><BD>^@$2be43752-a622-412a-9363-0b466e377c8f<EF><BF><BD>^@$6cd1f903-f3e5-4107-a2e2-f62abeba2dd0<EF><BF><BD>^@$75a6e34c-b35d-4011-bc18-b190eece1a3c<EF><BF><BD>^@$ef182283-ce4c-4e0a-ba4e-a66db7e547ab<EF><BF><BD>^@$47e74215-246e-4b2c-857d-b5e42cd7eaae<EF><BF><BD>^@$58851bed-8c73-4d24-a245-35c9421c8294
<EF><BF><BD>^@$eb2d5c15-652a-474a-9028-79e27883bdcb<EF><BF><BD>^@$bdd8a136-8113-4263-9ec8-b69bf1a6d0a2<EF><BF><BD>^@$031a9508-53d5-4b74-8a93-807aac2ed167<EF><BF><BD>^@$2b0b11dd-1902-4f14-b814-d1442e44af5a<EF><BF><BD>^@$9917504c-8c6d-471e-84b6-a843c3706db4<EF><BF><BD>^@$1825f1ad-73f3-4442-aeab-d7e4becac82a<EF><BF><BD>^@$e9dc558d-498f-4e52-84b2-ef51a245775a<EF><BF><BD>^@$cca1108f-c11e-4a43-ba33-488b859e17e6<EF><BF><BD>^@$f296cd79-b5f0-4b08-a2a2-765a26812efa
<EF><BF><BD>^@$22251dc1-c409-4da8-9c76-094c2875c85a<EF><BF><BD>^@$4fa45f60-0bd4-4978-8626-bf1637404e72<EF><BF><BD>^@$1cf7504c-287b-45ef-9000-eaba219bd4a0<EF><BF><BD>^@$91352229-5e6e-491e-a492-ccda77304fc4<EF><BF><BD>^@$f534eaa5-5508-4ae3-857e-aba7e
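To illustrate what "incomplete input" means at the wire level, here is a minimal standard-library sketch of decoding a single msgpack str 16 item (marker 0xda followed by a 2-byte big-endian length). It is not Salt's or msgpack's implementation, and the helper names are hypothetical. The marker bytes in the log above were replaced by `<EF><BF><BD>`, so the exact format is an assumption; the `^@$` sequence before each ID would be consistent with a length of 0x0024 = 36, the length of a UUID string. The logged payload stops in the middle of a UUID, although the report alone does not show whether the payload itself or only the log line was cut there.

```python
import struct


class IncompleteInput(Exception):
    """The buffer ends before the declared item length is satisfied."""


def encode_str16(text):
    """Encode text as a msgpack str 16 item: 0xda, 2-byte length, UTF-8 data."""
    data = text.encode("utf-8")
    return struct.pack(">BH", 0xDA, len(data)) + data


def read_str16(buf, offset=0):
    """Decode one str 16 item from buf; return (text, next_offset)."""
    if len(buf) - offset < 3:
        raise IncompleteInput("header truncated")
    marker, length = struct.unpack_from(">BH", buf, offset)
    if marker != 0xDA:
        raise ValueError("not a str 16 marker: 0x%02x" % marker)
    start = offset + 3
    end = start + length
    if end > len(buf):
        # The header promises more bytes than the buffer actually holds.
        raise IncompleteInput(
            "declared %d bytes, only %d available" % (length, len(buf) - start)
        )
    return buf[start:end].decode("utf-8"), end
```

Feeding `read_str16` a complete item returns the string, while the same bytes with the tail cut off raise `IncompleteInput`, which is the same class of condition the unpacker reports in the log.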

However, if I remove the process_count_max setting on every minion, the msgpack deserialization failure no longer occurs. It also does not occur if I reduce the concurrency.
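A load test of the kind described above could be sketched as follows. This is only an assumption about how the runner might be invoked in parallel: the `build_command` and `run_parallel` helpers are hypothetical, the argument names mirror the CLI example in the runner's docstring, and the number of workers is an arbitrary choice rather than the value used in the original test.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(file_name, file_content, minion_ids):
    """Build the salt-run argv for one script.run call (mirrors the CLI example)."""
    return [
        "salt-run",
        "script.run",
        "file_name=%s" % file_name,
        "file_content=%s" % file_content,
        "minion_ids=%s" % ",".join(minion_ids),
    ]


def run_parallel(commands, workers):
    """Run every command concurrently and return the exit codes in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(subprocess.call, commands))


# Example usage on the master (file names and minion IDs are placeholders):
#   cmds = [build_command("helloworld-%d.sh" % i, "echo hello",
#                         ["minion_id1", "minion_id2"]) for i in range(50)]
#   print(run_parallel(cmds, workers=50))
```

Lowering `workers` is one way to check the observation that the failure disappears at lower concurrency.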

Setup
salt-master setup

apt-get install salt-master

salt-master config

file_roots:
  base:
     - /var/cache/salt/scripts

salt-master machine info

num_cpus:
    16
mem_total:
    63399
os:
    Debian
os_family:
    Debian
osarch:
    amd64
oscodename:
    buster
osfinger:
    Debian-10
osfullname:
    Debian
osmajorrelease:
    10
osrelease:
    10
osrelease_info:
    - 10

salt-minion setup

apt-get install salt-minion

salt-minion config

master: {master_ip}
process_count_max: 10

salt-minion machine info

num_cpus:
    2
mem_total:
    3955
os:
    Debian
os_family:
    Debian
osarch:
    amd64
oscodename:
    stretch
osfinger:
    Debian-9
osfullname:
    Debian
osmajorrelease:
    9
osrelease:
    9.13
osrelease_info:
    - 9
    - 13

Salt-master Versions Report

salt-master --versions-report
Salt Version:
          Salt: 3003

Dependency Versions:
          cffi: Not Installed
      cherrypy: 8.9.1
      dateutil: 2.7.3
     docker-py: Not Installed
         gitdb: 2.0.5
     gitpython: 2.1.11
        Jinja2: 2.11.2
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.5.6
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 3.10.1
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.7.3 (default, Jul 25 2020, 13:03:44)
  python-gnupg: Not Installed
        PyYAML: 5.3.1
         PyZMQ: 17.1.2
         smmap: 2.0.5
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.1

System Versions:
          dist: debian 10 buster
        locale: utf-8
       machine: x86_64
       release: 4.19.0-13-amd64
        system: Linux
       version: Debian GNU/Linux 10 buster

Salt-minion Versions Report

salt-minion --versions-report
Salt Version:
          Salt: 3003

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.5.3
     docker-py: 1.9.0
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.9.4
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.5.6
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.5.3 (default, Nov 18 2020, 21:09:16)
  python-gnupg: Not Installed
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.1

System Versions:
          dist: debian 9 stretch
        locale: ANSI_X3.4-1968
       machine: x86_64
       release: 4.9.0-14-amd64
        system: Linux
       version: Debian GNU/Linux 9 stretch
@ninoY25 ninoY25 added Bug broken, incorrect, or confusing behavior needs-triage labels Apr 28, 2021
@welcome

welcome bot commented Apr 28, 2021

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

@ninoY25 ninoY25 closed this as completed Apr 28, 2021
@ninoY25 ninoY25 changed the title [BUG] [BUG] salt-minion msgpack deserialization failure when running salt-master custom runner in high concurrency. May 10, 2021
@ninoY25 ninoY25 reopened this May 10, 2021
@EmberLevy EmberLevy added Core relates to code central or existential to Salt severity-high 2nd top severity, seen by most users, causes major problems and removed needs-triage labels May 11, 2021
@EmberLevy EmberLevy added this to the Approved milestone May 11, 2021
@sagetherage sagetherage changed the title [BUG] salt-minion msgpack deserialization failure when running salt-master custom runner in high concurrency. [BUG] 3003 salt-minion msgpack deserialization failure when running salt-master custom runner in high concurrency. May 11, 2021
@vincentor

Same situation for me. Hope someone can fix it ASAP.

@sagetherage sagetherage added the Phosphorus v3005.0 Release code name and version label Jun 18, 2021
@sagetherage sagetherage modified the milestones: Approved, Phosphorus Jun 18, 2021
@frebib
Contributor

frebib commented Oct 4, 2021

We can reproduce this as far back as 2019.2 with msgpack 1.x on occasion

@twangboy twangboy added the Sulfur v3006.0 release code name and version label Mar 2, 2022
@waynew waynew removed the Sulfur v3006.0 release code name and version label Dec 16, 2022