
[DOCS] Improvements to core Salt doc topics requested by VMware customers #64473

Open
barbaricyawps opened this issue Jun 12, 2023 · 4 comments
Labels
Documentation Relates to Salt documentation severity-high 2nd top severity, seen by most users, causes major problems time-estimate-sprint VMware


@barbaricyawps (Contributor)

The VMware customer support team requested some improvements to core Salt documentation that is causing a lot of confusion among VMware customers. They asked a Salt engineer to review a few of these docs and tell us which elements are out of date, missing information, or incorrect:

  • Troubleshooting the Salt Master
    • We just need a quick pass on this to validate that the content is still accurate.
    • To give an example of something that we think may be outdated or no longer necessary: the section about too many files is still a concern, but not as important a concern as it used to be.
  • High availability features in Salt
    • This could be good for a general review.
    • One thing we probably need to add is that minions now send events back to all masters. This is relatively new functionality, and it needs to be called out specifically since it appears to be undocumented.
  • Using Salt at scale
    • This also just needs a general review for validation.
    • For example, there are recommended configuration options related to SOC that might be irrelevant.

After a Salt engineer did a review, he made these recommendations:

  • For Troubleshooting the Salt Master, I think the entire "Too Many Files" section can get axed. I also think the section below it, "Master Stops Responding", can get axed, especially if other team members agree they can go away.
  • The high availability features of Salt section is okay for now. As for events getting sent to both masters, that might be a bug rather than a feature, or if it is a feature, it's one I'm not aware of. The rest of that doc looks accurate to me.
  • The scaling doc looks okay, but I'm sure it can be improved. I think we'll have some more options on both of those in a future release.
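For context on the multi-master events question above, this is a minimal sketch of the relevant minion configuration (hostnames are hypothetical). With a list of masters, the minion connects to all of them; the open question in this thread is whether it also returns events to every connected master or only to the one that issued a job.

```yaml
# /etc/salt/minion -- hostnames are hypothetical examples
master:
  - salt-master-1.example.com
  - salt-master-2.example.com

# In the default (non-failover) multi-master mode, the minion
# maintains connections to all listed masters. Whether events are
# published back to all of them is the behavior under discussion.
```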
@barbaricyawps added the Documentation and needs-triage labels on Jun 12, 2023
@barbaricyawps self-assigned this on Jun 12, 2023
@barbaricyawps added the severity-high, time-estimate-sprint, and VMware labels and removed the needs-triage label on Jun 12, 2023
@OrangeDog (Contributor)

Just having a quick look myself and there are immediate problems. For example, Using Salt at scale says:

If many pillars are encrypted using [the] gpg renderer, it is possible to cache GPG data. To do this, see the set of master configuration options which are prefixed with gpg_cache.

Yet there are none listed at Configuring the Salt Master.
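For reference, the options the scaling doc appears to point at do exist elsewhere in Salt's example master config. The names below are my reading of that example config, and the defaults shown are assumptions that should be verified against the release in use:

```yaml
# Assumed gpg_cache options from the example master config --
# verify names and defaults against your Salt release.
gpg_cache: False            # enable caching of decrypted GPG pillar data
gpg_cache_ttl: 86400        # cache lifetime in seconds (24 hours)
gpg_cache_backend: disk     # backend used to store the decrypted data
```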

@barbaricyawps (Contributor, Author)

Thanks for the great feedback, @OrangeDog . If you go through those docs and notice anything else, please let me know. I plan to actively work on this fairly soon.

@OrangeDog (Contributor) commented Jun 14, 2023

It is actually covered in the example config. Though note it's largely a copy-paste of the pillar_cache docs, so it has a bunch of errors, like claiming "a fresh pillar is recompiled and stored". That example is also not up to date with other fixes in the main docs, e.g. it says "Same defaults are provided" instead of "Sane defaults are provided". When it's documented correctly, it should be clear what the cache is keyed on (I assume it's the ciphertext, so the same secret only needs to be decrypted once if it's used in multiple minions' pillars, and changing the secret doesn't have any stale-data issues).
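A tiny sketch of the keying behavior assumed in the parenthetical above (hypothetical helper names, not Salt's actual implementation): if the cache key is derived from the ciphertext itself, a secret reused across many minions' pillars is decrypted only once, and a changed secret automatically produces a new key, so there is no stale-data problem.

```python
import hashlib

# Hypothetical ciphertext-keyed cache; the key is a digest of the
# ciphertext, so identical secrets share one entry and changed
# secrets naturally miss the cache.
_gpg_cache = {}

def decrypt_cached(ciphertext: bytes, decrypt) -> bytes:
    key = hashlib.sha256(ciphertext).hexdigest()
    if key not in _gpg_cache:
        _gpg_cache[key] = decrypt(ciphertext)
    return _gpg_cache[key]

calls = []
def fake_decrypt(ct: bytes) -> bytes:
    calls.append(ct)          # record each real decryption
    return ct[::-1]           # stand-in for actual GPG decryption

plain1 = decrypt_cached(b"secret-A", fake_decrypt)
plain2 = decrypt_cached(b"secret-A", fake_decrypt)  # served from cache
```

In this sketch, `fake_decrypt` runs once for the two lookups of the same ciphertext; a different ciphertext would trigger a second decryption with its own cache entry.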

And speaking of pillar_cache, the security warnings seem to be written by someone lacking understanding:

Secondly, and perhaps more importantly, this means that unencrypted pillars will be accessible to any process which can examine the memory of the salt-master! This may represent a substantial security risk.

That is true even if a cache is not being used. Something with access to Salt's memory can read any and all unencrypted data that Salt is currently using, including all pillars and all private keys. This documentation seems to present the memory cache as fundamentally less secure than the file cache, when in reality the file cache will be less secure in most threat models.

Further, as the ciphertext and the decryption key are both going to be on the same disk in most cases, neither caching method makes much difference to overall security. Similarly, for remote pillar data (e.g. sdb, vault), the credentials needed to read the secrets are already on the disk. Anyone who can read the master's disk or memory can exfiltrate all pillar data without a cache being used; it only makes it slightly easier.
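For readers following along, the pillar_cache options under discussion look roughly like this in the master config (values are illustrative; check the docs for your release):

```yaml
# Assumed pillar_cache master options -- values are illustrative.
pillar_cache: True          # cache rendered pillars on the master
pillar_cache_ttl: 3600      # seconds before a cached pillar is recompiled
pillar_cache_backend: disk  # 'disk' or 'memory'; the security trade-off
                            # debated above applies to both backends
```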

@barbaricyawps (Contributor, Author)

As always, this is great feedback, @OrangeDog . Thank you!
