
grains: jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'os' when rendering Pillar #59205

Open
a-wildman opened this issue Dec 30, 2020 · 28 comments

@a-wildman

Description
Having just upgraded to 3002.2, we are seeing random instances where Pillar fails to render due to core grains (specifically 'os') not being defined.

Exception in master log is as follows:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 260, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 505, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'os'
2020-12-29 19:08:01,847 [salt.pillar      :888 ][CRITICAL][3653] Rendering SLS 'global.os_defaults' failed, render error:
Jinja variable 'dict object' has no attribute 'os'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 498, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 1, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'os'

Setup
The Pillar SLS being rendered which fails with the above is quite straightforward:

$ cat pillar/global/os_defaults.sls 
{% if grains['os'] == 'MacOS' %}
homedirbase: '/Users'
roothomedir: '/var/root'
{% elif grains['os'] == 'Ubuntu' %}
homedirbase: '/home'
roothomedir: '/root'
{% endif %}
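
As an aside, a defensive variant of this pillar (a sketch only; using grains.get with an empty default avoids the hard render failure, but it does not address why the core grain goes missing in the first place) would look like:

{% set os = grains.get('os', '') %}
{% if os == 'MacOS' %}
homedirbase: '/Users'
roothomedir: '/var/root'
{% elif os == 'Ubuntu' %}
homedirbase: '/home'
roothomedir: '/root'
{% endif %}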

Steps to Reproduce the behavior
Unfortunately I haven't found a way to reproduce this reliably. We have a few hundred minions, and since the upgrade (we were previously on 2017.7.8) we see the failure at random: most of the time rendering succeeds, but when it doesn't, forcing a saltutil.refresh_grains followed by saltutil.refresh_pillar does the trick, though the problem can crop up again after an indeterminate amount of time.

I can attest that the 'os' grain is the only grain that fails to be defined, and this is only observed on MacOS minions.

Expected behavior
grains['os'] to reliably evaluate within Pillar SLS definitions

Versions Report

salt --versions-report

master:

Salt Version:
          Salt: 3002.2
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.6.1
     docker-py: 2.5.1
         gitdb: 2.0.3
     gitpython: 2.1.8
        Jinja2: 2.10
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.5.6
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.4.7
        pygit2: Not Installed
        Python: 3.6.9 (default, Jul 17 2020, 12:50:27)
  python-gnupg: 0.4.1
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: 2.0.3
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.5
 
System Versions:
          dist: ubuntu 18.04 Bionic Beaver
        locale: UTF-8
       machine: x86_64
       release: 4.15.0-112-generic
        system: Linux
       version: Ubuntu 18.04 Bionic Beaver

minions:

Salt Version:
          Salt: 3002.2
 
Dependency Versions:
          cffi: 1.12.2
      cherrypy: unknown
      dateutil: 2.8.0
     docker-py: Not Installed
         gitdb: 2.0.6
     gitpython: 2.1.15
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: 1.0.7
       msgpack: 1.0.0
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.19
      pycrypto: Not Installed
  pycryptodome: 3.9.8
        pygit2: Not Installed
        Python: 3.7.4 (default, Nov 16 2020, 16:27:58)
  python-gnupg: 0.4.4
        PyYAML: 5.3.1
         PyZMQ: 18.0.1
         smmap: 3.0.2
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.1
 
System Versions:
          dist: darwin 19.6.0 
        locale: UTF-8
       machine: x86_64
       release: 19.6.0
        system: Darwin
       version: 10.15.7 x86_64
@garethgreenaway
Contributor

@saltstack/team-macos FYI, any thoughts on this one?

@a-wildman
Author

This looks very related to #59015 -- we're only setting custom grains (in /etc/salt/minion) on our macOS minions

@wwalker

wwalker commented Jan 19, 2021

I'm seeing this exact problem (edit: on Linux).

The problem does not exist in salt 3001.3.

I upgraded to 3002.2 and any state.* call fails.

Then I removed the grains block from my minion config file:

vagrant@test-salt-01:~$ cat /etc/salt/minion
grains:
  productname: vagrant
master: 172.28.128.229
vagrant@test-salt-01:~$ sudo vim /etc/salt/minion
vagrant@test-salt-01:~$ cat /etc/salt/minion
master: 172.28.128.229

and it works again.
captured-the-issue.txt

@peasejay

peasejay commented Feb 10, 2021

We just ran into a similar problem on an Ubuntu 18.04 instance running the 3002.2 minion:

2021-02-10 01:49:14,625 [salt.utils.templates:273 ][ERROR   ][25866] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 498, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 6, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'host'

--versions-report for the affected minion is below

Salt Version:
          Salt: 3002.2
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.6.1
     docker-py: 2.5.1
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.1
  msgpack-pure: Not Installed
  mysql-python: 1.3.10
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.4.7
        pygit2: Not Installed
        Python: 3.6.9 (default, Oct  8 2020, 12:12:24)
  python-gnupg: 0.4.1
        PyYAML: 5.1.2
         PyZMQ: 18.1.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2
System Versions:
          dist: ubuntu 18.04 Bionic Beaver
        locale: UTF-8
       machine: x86_64
       release: 4.15.0-135-generic
        system: Linux
       version: Ubuntu 18.04 Bionic Beaver

In our case, the master was able to load the grains correctly from the minion when running salt 'targetname' grains.items

    host:
        localhost

but acted as if the grain was undefined when I ran orchestrations. We tried lots of things to cajole it: restarting the master and minion services, running saltutil.sync_all on master and minion, removing the minion's key from the master and re-adding it.

The pillar .sls attempting to use the grain is straightforward. Interestingly, the error occurs for us on the host grain but not on the id grain used on the line before:

      appname: {{ grains['id'] }}
      display_name: {{ grains['host'] }}

As with the original post, saltutil.refresh_grains + saltutil.refresh_pillar fixed the condition for us - thanks for that.

The only additional detail I can offer is that the failing instance had recently been cloned onto a different VM instance/architecture by our IT team, so perhaps the Salt master's grains cache became confused by some detail of the updated architecture on the minion machine.

Our minion files use custom grains (which hadn't changed after the machine was cloned).

@grem11n

grem11n commented Apr 12, 2021

Looks like we are facing the same issue with Salt 3003 and Ubuntu 20.04, although in our case the core grain that fails to render is oscodename.

Here is a version report:

Salt Version:
          Salt: 3003

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.2
     docker-py: 3.3.0
         gitdb: 2.0.6
     gitpython: 3.0.7
        Jinja2: 2.11.3
       libgit2: Not Installed
      M2Crypto: 0.31.0
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.8.5 (default, Jan 27 2021, 15:41:15)
  python-gnupg: 0.4.5
        PyYAML: 3.12
         PyZMQ: 18.1.1
         smmap: 2.0.5
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2

System Versions:
          dist: ubuntu 20.04 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-1041-aws
        system: Linux
       version: Ubuntu 20.04 focal

state.highstate on a minion running Ubuntu 20 and Salt Minion 3002.6 results in:

    Pillar failed to render with the following messages:
----------
    Specified SLS 'base.None' in environment 'base' is not available on the salt master

base is also the name of a pillar here. That pillar is:

include:
  - base.{{ grains.get('oscodename') }}
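
The 'base.None' presumably comes from grains.get('oscodename') returning None when the grain is missing at render time, which is then interpolated into the include path. A guarded variant (a sketch for illustration only; it skips the include rather than fixing the missing grain) would be:

{% set codename = grains.get('oscodename') %}
{% if codename %}
include:
  - base.{{ codename }}
{% endif %}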

There are several errors in the master logs as well, which look like Jinja errors:

render error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 1088, in render
    return self._module
  File "<template>", line 13, in root
  File "/usr/local/lib/python3.8/dist-packages/jinja2/runtime.py", line 747, in _fail_with_undefined_er
ror
    _log_message(self)
jinja2.exceptions.UndefinedError: 'salt.loader_context.NamedLoaderContext object' has no attribute 'reg
ion'

@danielrobbins

danielrobbins commented Apr 22, 2021

Does anyone have a reliable way to reproduce this, or is it random?

If it is reproducible, then I would like to bisect the Salt code to narrow down which change caused this regression. We know it used to work, and we know it now works unreliably at best in some cases.

If anyone has a scenario where they can reliably reproduce the failure, please include any specific details. I will then try to reproduce here, and then narrow it down. Include any details related to the setup. The simpler the reproduction setup, the better.

Thanks in advance.

@rasstef

rasstef commented Apr 30, 2021

I'm having the same problem with Salt 3002.6+ds-1 on Ubuntu 20.04.2 LTS.

I have a macro that contains

{% macro verify_grain(grain) -%}
  {%- set local_grain = grains[grain] -%}
  ...
  {{ local_grain }}
  ...

and every time it is used it returns an error:

salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'oscodename'
...
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'fqdn_ip4'
...
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'os'

This is a blocker.

@rasstef

rasstef commented Apr 30, 2021

I dove into this and found the following:

I have some grains defined in /etc/salt/minion.d/:

cat /etc/salt/minion.d/grains.conf
grains:
  platform: mobile
  salt_initialized: False
  salt_complete: False
  salt_debug: False
  ...

And I found that the grains dictionary contains only those grains defined in the grains.conf file. The "core" grains such as os are missing. Yet salt-call grains.items returns them all.

This behaviour was different in 3002.1+ds-1.

@rasstef

rasstef commented May 10, 2021

Hello, could anybody fix this bug please? Because of this we cannot install the critical security patches provided by 3002.6

@danielrobbins

If anyone has any information on how to reproduce this issue reliably, please post. We will need to attempt to reproduce this issue before we can fix it. If it can't be reliably reproduced, we will at least need to get an environment set up where it can be reproduced randomly, and then try to track it down from there.

@grem11n

grem11n commented May 13, 2021

@danielrobbins, in my case it was a minion started with Salt Cloud in the AWS public cloud (simple EC2 machines). Both master and minion were running version 3002 on Ubuntu 20.04 with the default Python 3 interpreter. Unfortunately, I cannot disclose concrete details about the setup (e.g. config files, states, etc.). Hopefully this information helps.

This bug doesn't occur with versions 3000 and 3003 (both master and minion should be on 3003). So whoever else encounters this bug can try upgrading their masters and minions to 3003.

@waynew
Contributor

waynew commented May 13, 2021

FWIW, I don't think the original error was due to the os grain not being defined - at least grains['os'] would not produce that error message. That message would be produced if grains were a dictionary and it were accessed like grains.os. 🤔

More generally, thing = {}; thing.os is more or less what triggered this exception. Exactly why, I don't know for sure (yet?) 🤔 :suspect:

@rasstef

rasstef commented May 18, 2021

I simplified my setup as much as I could. With that, I can reproduce the bug with the following simple files:

# cat /etc/salt/master
fileserver_backend:
  - roots
file_roots:
  base:
    - /srv/salt
reactor:
  - 'salt/minion/*/start':
    - '/srv/reactor/initial_state_apply.sls'



# cat /etc/salt/minion
master: saltmaster001
saltenv: base
pillarenv_from_saltenv: True
pillarenv: base
start_event_grains:
  - salt_initialized
  - salt_complete
mine_interval: 2
mine_functions:
  network.ip_addrs:
    - interface: ens3
  network.get_hostname: []
grains:
  saltenv: base
  shell: /bin/bash
  salt_initialized: True
  salt_complete: True
  domain: mo-staging00-nonprod.ams1.cloud
  environment: staging
  region: ams1



# cat /srv/salt/top.sls
base:
  '*':
    - consul



# cat /srv/salt/consul/init.sls
{% from "consul/settings.jinja" import consul with context %}
/etc/consul/config.json:
  file.serialize:
    - user: consul
    - group: consul
    - mode: '0600'
    - dataset: {{ consul.config }}
    - formatter: json
    - makedirs: True
    - watch_in:
      - service: consul
    - require:
        - user: consul



# cat /srv/salt/consul/settings.jinja
{% set consul = salt['pillar.get']('consul', None) %}
{% if not consul %}
  {{ salt.test.exception("consul pillar data missing!") }}
{% endif %}
{% set pillar_cluster_name = consul['cluster']['name'] %}
{% set master_dict = salt['mine.get']('G@role:consul and G@cluster:{name}'.format(name=pillar_cluster_name), 'network.ip_addrs', 'compound') %}
{% set master_nodes = master_dict.keys() %}
{% if consul.auto_bootstrap -%}
    {% do consul.config.update({'retry_join': master_nodes}) %}
{% else %}
    {% do consul.config.update({'retry_join': consul.config.retry_join}) %}
{% endif %}
{% do consul.config.update({'retry_join_wan': consul.config.get('retry_join_wan', [])}) %}



# cat /srv/salt/consul/files/config.json.sls
{% from "consul/settings.jinja" import consul with context %}
{{ consul.config | json(indent=4) }}



# cat /srv/pillar/top.sls
base:
  '*':
    - base.consul



# cat /srv/pillar/base/consul.sls
{%- set datacenter = [ grains['region'], grains['environment'] ] %}
{%- set domain = grains['domain'] %}
{%- set interface_address = grains['fqdn_ip4'] %}

consul:
  cluster:
    name: consul

  auto_bootstrap: false

  config:
    advertise_addr: "{{ interface_address }}"
    bind_addr: "0.0.0.0"
    client_addr: "0.0.0.0"
    datacenter: {{ datacenter|join("-") }}
    encrypt: XXXXXXXXXXXXX
    data_dir: /var/lib/consul
    domain: {{ domain }}
    retry_join_wan: []
    retry_join:
      - consul001.{{ domain }}
    enable_debug: false
    enable_syslog: true
    log_level: info
    server: false
    ui: false



# cat /srv/reactor/initial_state_apply.sls
sync_grains:
  local.saltutil.sync_all:
    - tgt: {{ data['id'] }}

{% set salt_initialized = data['grains']['salt_initialized'] %}
{% set salt_complete = data['grains']['salt_complete'] %}
{% if not salt_complete %}
{% if salt_initialized %}
state_apply_1:
  local.state.apply:
    - tgt: {{ data['id'] }}
    - kwarg:
        queue: True
{% else %}
{% if data['id'].startswith('saltmaster') %}
state_apply_2:
  local.state.apply:
    - tgt: {{ data['id'] }}
    - arg:
      - saltstack.master
    - kwarg:
        queue: True
{% else %}
state_apply_3:
  local.state.apply:
    - tgt: {{ data['id'] }}
    - arg:
      - saltstack
    - kwarg:
        queue: True
{% endif %}
{% endif %}
{% endif %}

Then install the packages and the error pops up immediately:

# apt install salt-common=3002.6+ds-1 salt-master=3002.6+ds-1 salt-minion=3002.6+ds-1
# salt-key -Ay
# cat /var/log/salt/master
2021-05-18 17:15:16,855 [salt.utils.parsers:1104][WARNING ][191500] Master received a SIGTERM. Exiting.
2021-05-18 17:15:30,052 [salt.utils.templates:274 ][ERROR   ][193494] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 501, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 3, in <module>
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'fqdn_ip4'

With saltstack 3002.1+ds-1 the error does not show up.

@rasstef

rasstef commented May 18, 2021

This zip contains the config files listed above:
salt-bug.zip

@rasstef

rasstef commented May 18, 2021 via email

Contributor

@rasstef Using the zipfile you provided in an Ubuntu container, I haven't seen the error that you were seeing. Are you running a particular state or Salt command when you see the issue? Thanks!

@rasstef

rasstef commented May 19, 2021

Ok, I didn't explain it in full:

  • state.apply is executed automatically when the minion is started (or its key is added), triggered by the event salt/minion/*/start; this is defined in /etc/salt/master and the reactor in /srv/reactor/initial_state_apply.sls.
  • the error occurs only in the state.apply run triggered by the reactor, not when state.apply is executed on the command line. That is why the error log ends up in /var/log/salt/master.
  • the grain complained about (fqdn_ip4) does actually exist, as you can verify by running e.g.
    salt-call grains.get fqdn_ip4
  • Conclusion: some code change in 3002.6+ds-1 (or an earlier release - in 3002.1+ds-1 the bug was not present) causes the reactor to lose access to the core grains (os, oscodename, fqdn_ip4, ...); the grains defined in /etc/salt/minion, by contrast, still seem to be known.

I haven't checked the intermediate versions (3002.2 ... 3002.5), so I cannot tell in which version the bug was actually introduced, but I can do that now if you'd like. I would just need access to the Focal packages; on http://repo.saltstack.com/py3/ubuntu/20.04/amd64/3002/ I can only see the 3002.6 versions.

@rasstef

rasstef commented May 19, 2021

Ah, this reactor is already sufficient to trigger the error:

cat /srv/reactor/initial_state_apply.sls
sync_grains:
  local.saltutil.sync_all:
    - tgt: {{ data['id'] }}

Just restart the minion and there we go...

# cat /var/log/salt/master
2021-05-18 17:15:16,855 [salt.utils.parsers:1104][WARNING ][191500] Master received a SIGTERM. Exiting.
2021-05-18 17:15:30,052 [salt.utils.templates:274 ][ERROR   ][193494] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 501, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 3, in <module>
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'fqdn_ip4'

@rasstef

rasstef commented May 19, 2021

I was able to simplify the code that causes the exception even further. Please find it here:
saltstack_3002.6.zip

Just restart the minion and have a look into /var/log/salt/master.

It all depends on the grain defined in the first line of /srv/pillar/base/consul.sls.
For core grains such as oscodename, fqdn_ip4, or cpuarch the exception

jinja2.exceptions.UndefinedError: 'dict object' has no attribute

is raised; for grains defined in /etc/salt/minion (saltenv, environment, ...) no exception is raised.
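
To make the distinction concrete, changing only that first line flips the behavior (the variable name here is illustrative):

{# raises jinja2.exceptions.UndefinedError when rendered via the reactor (core grain): #}
{%- set value = grains['oscodename'] %}
{# renders fine (grain defined statically in /etc/salt/minion): #}
{%- set value = grains['environment'] %}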

@rasstef

rasstef commented May 25, 2021

@sagetherage When will the version with the bugfix be available? Could you provide a patch once you have it, please? We need this urgently....

@sagetherage
Contributor

@rasstef
@waynew is looking into this and hasn't had time to give an ETA on a patch yet, but we are working towards it and discussed it this week. He will give more info soon.

@waynew
Contributor

waynew commented May 26, 2021

Well, I can reliably reproduce the symptom, but not the cause.

Specifically, I just added {% set x = grains.pop('os') %} and then I see the exact same error message. I'm betting that grains.pop doesn't exist anywhere in your states, though, or you wouldn't see the error intermittently.

Given it's an intermittent issue that can be solved with a refresh_grains and refresh_pillar, it makes me wonder if one of the intermittent pillar/grain refreshes is failing. I'll take a look and see if I can figure out where it might be failing.

@waynew
Contributor

waynew commented May 27, 2021

After some further research - I was able to get this to happen by blocklisting
the os grain and enabling grains caching.

If you add these lines to your minion config:

grains_cache: True
grains_blacklist:
  - os

And then run salt-call --local pillar.items you should see the error appear.
Remove the grains_blacklist option and re-run, and the problem should remain.
Once grains_cache is removed, however, then it should start working
again.

I'm not sure how or why this is happening, for your current setup, but so far
that's the only thing that I'm seeing that would lead to this type of behavior.
If the grains cache is empty or fails to load, it should be reloading the
grains.

If you encounter this again naturally, what is the output of salt 'broken-minion' grains.items, or salt-call --local -g? Does it look like full grains output?
Is it only missing the os grain? I'm assuming (and hoping) that once the
minion is broken it stays broken until you force a refresh...

If this solves the problem, great! But if not, aside from the -g output, does
restarting the minion service retain the broken behavior? If so, excellent! You
can turn on trace debugging and you should see "Loading X grain" or "Filtering
X grain" in the minion log. Probably egrep "(Loading|Filtering)\s+\w+\s+grain" /var/log/salt/minion >/tmp/graindebug_log.txt would get you the appropriate
information.

Let us know, thanks!

@rasstef

rasstef commented May 27, 2021

Dear waynew,

Thanks a lot for that @waynew, but I don't see how this could be related to the problem we have.
In our case, all grains seem to exist if I run salt 'broken-minion' grains.items or salt-call ... or whatever, including the grains reported as non-existent in /var/log/salt/master:

Jinja variable 'dict object' has no attribute 'fqdn_ip4'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 501, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 1, in <module>
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'fqdn_ip4'

(here: fqdn_ip4, but it turned out that it affects absolutely all grains, except the few defined in /etc/salt/minion)

# cat /etc/salt/minion
master: saltmaster001
saltenv: base
pillarenv_from_saltenv: True
pillarenv: base
grains:
  saltenv: base
  shell: /bin/bash
  environment: staging

All other grains would cause a jinja2.exceptions.UndefinedError: 'dict object' has no attribute error.

To summarize:

according to salt-call --local -g the grains (e.g. os, fqdn_ip4) DO exist; according to /var/log/salt/master the same grains do NOT exist, which is obviously a contradiction.

@miraznawaz

miraznawaz commented Jun 28, 2021

I am facing a similar issue in my project.
Salt and salt-minion version: 3003

Error log :

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/utils/templates.py", line 497, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3.6/site-packages/jinja2/environment.py", line 1090, in render
    self.environment.handle_exception()
  File "/usr/lib/python3.6/site-packages/jinja2/environment.py", line 832, in handle_exception
    reraise(*rewrite_traceback_stack(source=source))
  File "/usr/lib/python3.6/site-packages/jinja2/_compat.py", line 28, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 12, in top-level template code
  File "/usr/lib/python3.6/site-packages/jinja2/sandbox.py", line 384, in getitem
    return obj[argument]
jinja2.exceptions.UndefinedError: 'salt.loader_context.NamedLoaderContext object' has no attribute 'raw'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/utils/templates.py", line 261, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3.6/site-packages/salt/utils/templates.py", line 504, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'salt.loader_context.NamedLoaderContext object' has no attribute 'raw'

[CRITICAL] Rendering SLS 'base:common.glexSetup' failed: Jinja variable 'salt.loader_context.NamedLoaderContext object' has no attribute 'raw'

[Mon Jun 28 00:49:45 GMT 2021] (from ./prep_deploy_abc.sh) executed abc.up

Succeeded: 36 (changed=19)
Failed:    32
-------------
Total states run:     68
Total run time:    9.912 s

    - Rendering SLS 'base:common.glexSetup' failed: Jinja variable 'salt.loader_context.NamedLoaderContext object' has no attribute 'raw'

@waynew @rasstef @sagetherage

I'd appreciate it if you could help me resolve this error occurring in our Salt deployment.

@sagetherage
Contributor

We are looking into this and it is a bit difficult to nail down. @waynew may not have made progress yet, and I don't know that he has had much time to look deeply into it either, but he will continue. @waynew, any more comments?

@waynew
Contributor

waynew commented Jun 28, 2021

@rasstef have you tried using the grains_blacklist option?

@waynew
Contributor

waynew commented Jun 29, 2021

@miraznawaz that error looks like it's from a different problem - you may find some help in one of our other community resources: scroll down on https://saltproject.io/
