What happened?
Description
Two asymmetries in the pgjsonb writer functions cause database failures to be either propagated to callers that do not expect them or dropped without enough context.
returner(ret) only catches salt.exceptions.SaltMasterError, which _get_serv raises on psycopg2.OperationalError during connect. A psycopg2.DatabaseError during the actual cur.execute(...) into salt_returns (foreign-key violation, NOT NULL violation, deadlock, DataError) propagates to the caller. The publish path in master.py:_prep_pub is wrapped in except Exception → log.critical, so it absorbs the error. But the syndic-aggregate path at master.py:1699-1701 is not wrapped:
if load.get("load") and self.opts["master_job_cache"]:
fstr = "{}.save_load".format(self.opts["master_job_cache"])
self.mminion.returners[fstr](load["jid"], load["load"]) # unhandled
So a single bad row in syndic aggregation can escape into a master subprocess.
event_return(events) has no local error handling at all. The event-bus caller in EventReturn._flush_event_single already wraps it in except Exception → log.error, which keeps the master alive, but the entire event_queue is dropped on any failure with no local context — operators see a generic stacktrace and no indication of how many events were lost.
There is a smaller leftover too: event_return passes the events list as the positional ret argument to _get_serv(events, commit=True). The ret parameter is meant to be the return-dict of one returner call (a dict carrying optional ret_config / ret_kwargs); passing a list happens to behave like ret=None because salt.returners.get_returner_options does in checks that are permissive on a list, but it is misleading copy-paste from returner(ret).
Setup
Any deployment using master_job_cache: pgjsonb or event_return: pgjsonb.
Steps to Reproduce the behavior
For the syndic path:
- Set
master_job_cache: pgjsonb and run a master in syndic mode.
- Have a syndic publish a return whose row would violate a constraint on
salt_returns (e.g. wrong encoding in return, oversized id).
- The
psycopg2.DataError raised by cur.execute propagates out of pgjsonb.returner, through salt-master's syndic-handler, with no log line that names the offending jid/id.
For the events path:
- Set
event_return: pgjsonb and event_return_queue: 5000.
- During a flush, raise any
DatabaseError (e.g. take down salt_events's primary key with a manual schema change).
- All 5000 buffered events are dropped; the only log line is the generic
Could not store events - returner '...' raised exception: ... from EventReturn.
Expected behavior
- A
psycopg2.DatabaseError during cur.execute should not propagate out of pgjsonb.returner regardless of which call path used it.
- Each drop should be logged with enough context to correlate it to the lost payload — jid/id for returns, batch size for events.
event_return should not pretend its events list is a return-dict.
Additional context
The accompanying PR adds a psycopg2.DatabaseError catch to both functions that logs through log.exception (so the traceback is preserved) and tightens the existing SaltMasterError messages to include the same context. While there it also drops event_return's misleading positional events argument to _get_serv. Four behavioural unit tests cover both error paths and the corrected event_return happy path (none existed before).
Builds on #69049 (which already routed _get_serv / _purge_jobs / _archive_jobs through Salt's logger) and should be merged after it.
Type of salt install
Official deb
Major version
3006.x, 3007.x
What supported OS are you seeing the problem on? Can select multiple. (If bug appears on an unsupported OS, please open a GitHub Discussion instead)
debian-11, debian-12
salt --versions-report output
salt --versions-report
Salt Version:
Salt: 3007.13
Python Version:
Python: 3.10.19 (main, Feb 5 2026, 07:05:38) [GCC 11.2.0]
Dependency Versions:
cffi: 2.0.0
cherrypy: unknown
cryptography: 42.0.5
dateutil: 2.8.2
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.6
libgit2: 1.9.1
looseversion: 1.3.0
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.7
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 24.0
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.19.1
pygit2: 1.18.2
python-gnupg: 0.5.2
PyYAML: 6.0.1
PyZMQ: 25.1.2
relenv: 0.22.3
smmap: Not Installed
timelib: 0.3.0
Tornado: 6.5.4
ZMQ: 4.3.4
Salt Extensions:
saltext.vault: 1.5.0
Salt Package Information:
Package Type: onedir
System Versions:
dist: debian 12.13 bookworm
locale: utf-8
machine: x86_64
release: 6.12.73+deb12-amd64
system: Linux
version: Debian GNU/Linux 12.13 bookworm
What happened?
Description
Two asymmetries in the pgjsonb writer functions cause database failures to be either propagated to callers that do not expect them or dropped without enough context.
returner(ret)only catchessalt.exceptions.SaltMasterError, which_get_servraises onpsycopg2.OperationalErrorduring connect. Apsycopg2.DatabaseErrorduring the actualcur.execute(...)intosalt_returns(foreign-key violation, NOT NULL violation, deadlock, DataError) propagates to the caller. The publish path inmaster.py:_prep_pubis wrapped inexcept Exception → log.critical, so it absorbs the error. But the syndic-aggregate path atmaster.py:1699-1701is not wrapped:So a single bad row in syndic aggregation can escape into a master subprocess.
event_return(events)has no local error handling at all. The event-bus caller inEventReturn._flush_event_singlealready wraps it inexcept Exception → log.error, which keeps the master alive, but the entireevent_queueis dropped on any failure with no local context — operators see a generic stacktrace and no indication of how many events were lost.There is a smaller leftover too:
event_returnpasses the events list as the positionalretargument to_get_serv(events, commit=True). Theretparameter is meant to be the return-dict of one returner call (adictcarrying optionalret_config/ret_kwargs); passing a list happens to behave likeret=Nonebecausesalt.returners.get_returner_optionsdoesinchecks that are permissive on a list, but it is misleading copy-paste fromreturner(ret).Setup
Any deployment using
master_job_cache: pgjsonborevent_return: pgjsonb.Steps to Reproduce the behavior
For the syndic path:
master_job_cache: pgjsonband run a master in syndic mode.salt_returns(e.g. wrong encoding inreturn, oversizedid).psycopg2.DataErrorraised bycur.executepropagates out ofpgjsonb.returner, throughsalt-master's syndic-handler, with no log line that names the offending jid/id.For the events path:
event_return: pgjsonbandevent_return_queue: 5000.DatabaseError(e.g. take downsalt_events's primary key with a manual schema change).Could not store events - returner '...' raised exception: ...fromEventReturn.Expected behavior
psycopg2.DatabaseErrorduringcur.executeshould not propagate out ofpgjsonb.returnerregardless of which call path used it.event_returnshould not pretend its events list is a return-dict.Additional context
The accompanying PR adds a
psycopg2.DatabaseErrorcatch to both functions that logs throughlog.exception(so the traceback is preserved) and tightens the existingSaltMasterErrormessages to include the same context. While there it also dropsevent_return's misleading positionaleventsargument to_get_serv. Four behavioural unit tests cover both error paths and the correctedevent_returnhappy path (none existed before).Builds on #69049 (which already routed
_get_serv/_purge_jobs/_archive_jobsthrough Salt's logger) and should be merged after it.Type of salt install
Official deb
Major version
3006.x, 3007.x
What supported OS are you seeing the problem on? Can select multiple. (If bug appears on an unsupported OS, please open a GitHub Discussion instead)
debian-11, debian-12
salt --versions-report output
salt --versions-report Salt Version: Salt: 3007.13 Python Version: Python: 3.10.19 (main, Feb 5 2026, 07:05:38) [GCC 11.2.0] Dependency Versions: cffi: 2.0.0 cherrypy: unknown cryptography: 42.0.5 dateutil: 2.8.2 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 3.1.6 libgit2: 1.9.1 looseversion: 1.3.0 M2Crypto: Not Installed Mako: Not Installed msgpack: 1.0.7 msgpack-pure: Not Installed mysql-python: Not Installed packaging: 24.0 pycparser: 2.21 pycrypto: Not Installed pycryptodome: 3.19.1 pygit2: 1.18.2 python-gnupg: 0.5.2 PyYAML: 6.0.1 PyZMQ: 25.1.2 relenv: 0.22.3 smmap: Not Installed timelib: 0.3.0 Tornado: 6.5.4 ZMQ: 4.3.4 Salt Extensions: saltext.vault: 1.5.0 Salt Package Information: Package Type: onedir System Versions: dist: debian 12.13 bookworm locale: utf-8 machine: x86_64 release: 6.12.73+deb12-amd64 system: Linux version: Debian GNU/Linux 12.13 bookworm