feat: support sha2 and sha2_hex with digest size of 256 #63

Merged
merged 5 commits into from
Apr 6, 2024

@ifm-pgarner ifm-pgarner commented Mar 27, 2024

We're using Snowflake's sha2 function in a project that uses fakesnow for testing (it's great, thank you!)

But duckdb does not have a sha2 function; instead it has a sha256 function.

sha256('foo') in duckdb is the same as Snowflake's sha2('foo', 256) (or you can omit the length arg and it defaults to 256) https://docs.snowflake.com/en/sql-reference/functions/sha2#arguments

So this PR adds a transform for these function calls.
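As a rough illustration of the mapping described above, here is a minimal string-level sketch. The helper name `rewrite_sha2_call` and its parameters are hypothetical; the actual transform in this PR operates on the parsed SQL expression tree rather than on strings, and leaving unsupported digest sizes untouched is an assumption made for the sketch.

```python
from typing import Optional


def rewrite_sha2_call(name: str, arg_sql: str, digest_size: Optional[int] = None) -> str:
    """Hypothetical helper: rewrite a Snowflake sha2/sha2_hex call for duckdb."""
    # Snowflake defaults the digest size to 256 when it is omitted.
    size = 256 if digest_size is None else digest_size
    if name.lower() in ("sha2", "sha2_hex") and size == 256:
        # duckdb's sha256 returns the same hex digest.
        return f"sha256({arg_sql})"
    # Assumption: anything else is passed through unchanged.
    args = arg_sql if digest_size is None else f"{arg_sql}, {digest_size}"
    return f"{name}({args})"


print(rewrite_sha2_call("sha2", "'foo'", 256))
print(rewrite_sha2_call("sha2_hex", "'foo'"))
print(rewrite_sha2_call("sha2", "'foo'", 512))
```

Working on the expression tree, as the PR does, avoids the quoting and nesting pitfalls that a string-based rewrite like this one would run into on real SQL.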

Owner

@tekumara tekumara left a comment

looks great, thank you for this!

.transform(values_columns)
.sql()
== "INSERT INTO table1 (name) SELECT SHA2_HEX('foo', 256, 'wtf')"
)
Owner

Would you mind also adding a test to test_fakes.py? This will guarantee integrated test coverage that includes duckdb's execution.

Contributor Author

added now, thanks

@tekumara changed the title from "feat: add a sha2 -> sha256 transform" to "feat: support sha2 and sha2_hex with digest size of 256" on Mar 28, 2024
@anentropic
Contributor

I realised I could use sha2_binary and store my hashes in a 32-byte column instead of a varchar(64) for the hexdigest.

duckdb doesn't have that function, but unhex(sha256(...)) seems to give the same result:

In [24]: duckdb.sql("SELECT unhex(sha256('Snowflake'))").show()
┌────────────────────────────────────────────────────────────────────────────────────────┐
│                               unhex(sha256('Snowflake'))                               │
│                                          blob                                          │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ \x1D\xBDY\xF6a\xD6\x8B\x90rO!\x08C\x96\xB8eIqs\xE4\xD2qOM\x91\xCF\x05\xFA_\xC5\xE1\x8D │
└────────────────────────────────────────────────────────────────────────────────────────┘


In [25]: sha256(b"Snowflake").digest()
Out[25]: b'\x1d\xbdY\xf6a\xd6\x8b\x90rO!\x08C\x96\xb8eIqs\xe4\xd2qOM\x91\xcf\x05\xfa_\xc5\xe1\x8d'

so I have added that transform as well
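The equivalence shown above can also be checked without a database: un-hexing a hex digest yields the raw digest bytes. A minimal sketch using only Python's standard hashlib (variable names are illustrative, and this does not exercise duckdb or Snowflake themselves):

```python
import hashlib

message = b"Snowflake"

# Hex form: what a hex-returning sha256 function produces (64 characters).
hex_digest = hashlib.sha256(message).hexdigest()

# Binary form: the raw 32-byte digest, as a sha2_binary-style function would return.
raw_digest = hashlib.sha256(message).digest()

# Decoding the hex string is the plain-Python analogue of unhex(...).
unhexed = bytes.fromhex(hex_digest)

assert unhexed == raw_digest
print(len(hex_digest), len(unhexed))  # 64 hex characters vs 32 bytes
```

This is why a 32-byte binary column is enough for the binary form, while the hex form needs varchar(64).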

@tekumara
Owner

tekumara commented Apr 6, 2024

Perfect, thank you!

@tekumara tekumara merged commit ce345e9 into tekumara:main Apr 6, 2024
1 check passed
3 participants