
pillar.data and state.highstate do not always update the pillar on the minion #5716

Closed
Mrten opened this issue Jun 25, 2013 · 22 comments
Assignees: terminalmage
Labels: Bug, fixed-pls-verify, severity-medium

Comments

@Mrten (Contributor, Author) commented Jun 25, 2013

I had a pillar that generated an error (the test pillar from bug #5608; it's called list-all-minions).

Here I remove the reference to that pillar from the top file:

root@salt-master:/home/salt/conf/machine/monitoring/munin-master# vi /home/salt/pillar/top.sls

And here I run highstate on that minion:

root@salt-master:/home/salt/conf/machine/monitoring/munin-master# salt monitoring state.highstate test=true
monitoring:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS list-all-minions failed, render error:


[snip same traceback as in #5608]

pillar.data works:

root@salt-master:/home/salt/conf/machine/monitoring/munin-master# salt monitoring pillar.data
monitoring:
    ----------
    allhosts:
        ----------
        bulk:
[lots of private pillar data]

But state.highstate still does not work:

root@salt-master:/home/salt/conf/machine/monitoring/munin-master# salt monitoring state.highstate test=true
monitoring:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS list-all-minions failed, render error:

[same traceback as in bug #5608]

Is this related to test=true looking at another cache? No, without test=true I still get the error.

I thought you had said pillar.data always updates the cache on the minion...

@Mrten (Contributor, Author) commented Jun 25, 2013

saltutil.refresh_pillar did the trick.
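
For anyone else hitting this, the commands amount to (same minion as in the log above):

    salt monitoring saltutil.refresh_pillar
    salt monitoring state.highstate test=true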

@basepi (Contributor) commented Jun 25, 2013

I was under the impression that pillar.data did a pillar refresh first. Odd. We'll look into it.

@Mrten (Contributor, Author) commented Jun 25, 2013

I was even under the impression that state.highstate would do that. If you need the pillar to reproduce this, you'll find the code in #5608.

@basepi (Contributor) commented Jun 25, 2013

state.highstate definitely should. Is it not? Because if that's the case this is high severity.

@Mrten (Contributor, Author) commented Jun 25, 2013

Well, not according to my copy/pasted shell log from above. Should've mentioned that I only inserted some comments into something that I copied verbatim from my terminal.

@basepi (Contributor) commented Jun 25, 2013

Well, I've put this at the top of my to-do list, we'll definitely look into it.

@Mrten (Contributor, Author) commented Jun 25, 2013

Hope you can reproduce it... I've found no logs of significance today (i.e., /var/log/salt/* is silent today on both the master and the minion).

@basepi (Contributor) commented Jun 25, 2013

Ya, me too. =\ We'll see.

@smithjm commented Aug 9, 2013

I can confirm this behavior with 0.16.2. Buggy pillars remain in the cache, breaking highstate, until "salt '*' saltutil.refresh_pillar" is run, at which point things are fine. Easy enough to work around once you know the issue is there, but looking forward to 0.17 and the fix :-)

@JasonSwindle (Contributor)

I was just hit by this... I even blew away my pillar folder and it was still matching broken items in the cache.

@ghost assigned terminalmage Sep 9, 2013
@terminalmage (Contributor)

I took a look at this on CentOS 6 today, using the latest from the git develop branch, and was unable to reproduce the error. After intentionally triggering a yaml render error in my pillar SLS, I fixed it and ran a highstate, and the pillar data was updated before the highstate was run.

Can anyone who experienced this before test again against git?

@ruimarinho

@terminalmage the original issue I was seeing (from issue #6083) is apparently still present on git develop (0.17.0-364-gfd4866a).

$ salt -v 'staging1' state.highstate

staging1:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS web.projects.myproject failed, render error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/utils/templates.py", line 63, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python2.7/dist-packages/salt/utils/templates.py", line 128, in render_jinja_tmpl
    output = jinja_env.from_string(tmplstr).render(**unicode_context)
  File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 894, in render
    return self.environment.handle_exception(exc_info, True)
  File "<template>", line 179, in top-level template code
UndefinedError: 'dict object' has no attribute 'ec2_public-hostname'

After running salt 'staging1' saltutil.sync_all, I get this output:

staging1:
    ----------
    grains:
        - grains.ec2_info
    modules:
    outputters:
    renderers:
    returners:
    states:

This confirms the custom grain is synced successfully after a manual run. I can then execute salt -v 'staging1' saltutil.refresh_pillar followed by salt -v 'staging1' state.highstate without any more issues. Not sure if it is relevant or not, but I'm still using salt-cloud 0.8.9 to bootstrap the AWS machines.
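
To spell out the recovery sequence (same minion as above; these are just the commands already mentioned, run in order):

    salt 'staging1' saltutil.sync_all
    salt -v 'staging1' saltutil.refresh_pillar
    salt -v 'staging1' state.highstate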

@RoboTeddy

I'm having the same problem, also with a custom grain. [this is on 0.16.4]

These steps should (hopefully) reproduce the problem:

  1. Create a custom grain. Mine looks a bit like this:

     def revision():
         parts = __opts__.get('id', '').split('.')
         return {'revision': (parts[1] if len(parts) > 1 else '')}

  2. Create a pillar SLS that references the custom grain, e.g.:

     revision: {{ grains['revision'] }}

  3. Bring up a new minion (I used salt-cloud):

     sudo salt-cloud -p medium_aws xyzzy

  4. Run state.highstate:

     sudo salt xyzzy state.highstate

Observe error:

xyzzy:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS 'mypillar' failed, render error:
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/salt/utils/templates.py", line 63, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/pymodules/python2.7/salt/utils/templates.py", line 116, in render_jinja_tmpl
    output = jinja_env.from_string(tmplstr).render(**context)
  File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 894, in render
    return self.environment.handle_exception(exc_info, True)
  File "<template>", line 1, in top-level template code
UndefinedError: 'dict object' has no attribute 'revision'
  5. After wiping away your tears, observe that syncing grains is productive:

     sudo salt xyzzy saltutil.sync_grains

     xyzzy:
         - grains.revision

(if the custom grain had already been synced, it wouldn't be listed in the output)

The fact that this sync actually sent a custom grain implies that state.highstate attempted to render pillar before custom grains had been synced. That might be why it failed: in this case, the pillar tree depends on a custom grain that hasn't been synced to the minion yet. With this order of operations, there's no way to have a good outcome (if my understanding of Salt is correct): if the custom grain hasn't even touched the minion, there's no way the correct grain value can be rendered into pillar.

(At this point, there's still some bad pillar or grain cache somewhere, and state.highstate fails with the same error as in (4). But now that the grain has been synced, running saltutil.refresh_pillar resolves the problem, and state.highstate works.)
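
Until the ordering is fixed, a sequence along these lines (same minion name as in the steps above) should get a freshly bootstrapped minion into a good state:

    sudo salt xyzzy saltutil.sync_grains
    sudo salt xyzzy saltutil.refresh_pillar
    sudo salt xyzzy state.highstate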

@terminalmage (Contributor)

OK, giving this another look now.

@terminalmage (Contributor)

I was able to replicate this using the procedure @RoboTeddy laid out last night. It's clear that we need to sync the grains before compiling the pillar, but I'm looking into the best way of doing this.

@ruimarinho

Great work @terminalmage. That solves a long-standing bug for me. 👍

@terminalmage (Contributor)

For those interested, the proposed fix is in #7630, though it needs to be discussed, refined and reviewed before it is ultimately merged.

@Mrten (Contributor, Author) commented Oct 5, 2013

Not to throw a wrench in the process, but I don't think I was using custom grains yet when I reported this. I did have a custom state and a custom module, but I don't think they were involved in the pillar I mentioned.

@RoboTeddy

Yeah, the issue being laid out / fixed here is probably better described by #6083. It looks like there might be more than one bug that causes pillar rendering to fail. @Mrten I bet the bug you ran into would get solved if there were specific reproduction steps for it.

@thatch45 (Contributor)

Looks like this is fixed in the pull request; please re-open if that's not the case.

@ruimarinho

@thatch45, I can confirm the pull request from @terminalmage fixed this issue. Thank you!

@basepi (Contributor) commented Oct 21, 2013

Awesome!
