Skip to content

Use Adler32 algorithm to compute string checksums#48812

Merged
cachedout merged 7 commits into
saltstack:developfrom
isbm:isbm-hash-randomisation-python3
Aug 27, 2018
Merged

Use Adler32 algorithm to compute string checksums#48812
cachedout merged 7 commits into
saltstack:developfrom
isbm:isbm-hash-randomisation-python3

Conversation

@isbm
Copy link
Copy Markdown
Contributor

@isbm isbm commented Jul 28, 2018

What does this PR do?

An alternative to #46649

Previous Behavior

server_id is changing due to Python 3 is salting hash function. A workaround is to shell-out Python with environment variable PYTHONHASHSEED=0. Unfortunately, this is the only option.

However, what if we use Adler-32 instead of hash? This would avoid shell-out Python.
@terminalmage what do you think?

Tests written?

Should apply existing

@rallytime rallytime requested a review from terminalmage July 30, 2018 14:46
@terminalmage terminalmage added the pending-discussion The issue or pull request needs more discussion before it can be closed or merged label Jul 31, 2018
Copy link
Copy Markdown
Contributor

@terminalmage terminalmage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to this change, but if we are going to make it then I think we need to at minimum be very explicit about the change in the release notes, due to the fact that this will change a grain on which users may be relying.

@thatch45 @cachedout @rallytime thoughts? Should we gate this behind a config item and make it the new default method after a couple release cycles? Or, should we make it the default right away?

blind-review

@cachedout
Copy link
Copy Markdown
Contributor

I really don't like doing this sort of thing. We're going to change all of these hashes out from under people but for no reason other than it's only just a fraction of a bit faster and the code is slightly cleaner and I just don't feel like that's worth the disruption to users.

@rallytime
Copy link
Copy Markdown
Contributor

I agree with @cachedout here. I don't think this is something that we want to disrupt upgrades for with users, IMO.

@cachedout
Copy link
Copy Markdown
Contributor

We didn't get a response from @isbm so I'm closing this for now. Bo, if you like we can discuss this more on our next call. :)

@cachedout cachedout closed this Aug 17, 2018
@isbm
Copy link
Copy Markdown
Contributor Author

isbm commented Aug 20, 2018

@cachedout @rallytime I was on the vacations in the woods off the grid, and now this PR is just closed. 😢

The part that changing hashes probably isn't that much problematic, but I've got the point. So why we indeed do not gate this behind the option and retire then it later, just like @terminalmage said? In the meanwhile firing up another interpreter for just hash is really what has to be done due to impossibility of doing better. Which is fine (-ish, because on the Mainframes with the VM of Ferrari-price RAM), but still how we can go away from this hack in the future clean?

Another problem with the hack is that Adler32 is fixed, finished algorithm. It won't change. But are we sure Python next versions won't turn off that environment variable at all and we have them this problem back? I would still look for the option to gating this behind the option & press release this.

@rallytime
Copy link
Copy Markdown
Contributor

I can re-open this for now, certainly. We can discuss this tomorrow in our meeting. :)

@rallytime rallytime reopened this Aug 23, 2018
@isbm isbm requested a review from a team as a code owner August 24, 2018 13:32
@isbm
Copy link
Copy Markdown
Contributor Author

isbm commented Aug 27, 2018

As per conversation, a valid concern from @cachedout for not to update defaults abruptly has been addressed as per suggestion from @terminalmage . I also added choices between Adler32 (going to be default) and CRC32. The latter is slower (the slowness is not any significant for this task though) but much more reliable for collisions. However, returns a bigger number then Adler32. Current default is OFF and the Python shell-out workaround is used. In two versions this workaround will be retired. This way concern has been solved.

@rallytime what are the next steps here?

@terminalmage I think I probably/likely still need to add some better docs in press-release, or? Any suggestions?

@cachedout cachedout merged commit a813cac into saltstack:develop Aug 27, 2018
@isbm isbm deleted the isbm-hash-randomisation-python3 branch November 7, 2018 15:21
agraul added a commit to agraul/salt that referenced this pull request Jan 27, 2025
Generate the same numeric value across all Python versions and platforms

Re-add getting hash by Python shell-out method

Add an option to choose between default hashing, Adler32 or CRC32 algorithms

Set default config option  for server_id hashing to False on minion

Choose CRC method, default to faster but less reliable "adler32", if crc is in use

Add warning for Sodium.

Move server_id deprecation warning to reduce log spamming (bsc#1135567) (bsc#1135732)

Remove deprecated warning that breaks miniion execution when "server_id_use_crc" opts are missing

BACKPORT-UPSTREAM=saltstack#48812
DOWNSTREAM-REF=openSUSE/salt#159
DOWNSTREAM-REF=openSUSE/salt@73e357d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pending-discussion The issue or pull request needs more discussion before it can be closed or merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants