-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Go][Parquet] writer.WriteBatch silently fail when using encryption #14940
Comments
I'll take a look at this today |
Thank you! Let me know if I can contribute in anyway. |
@ahmb84 I don't think you pushed your updated version of the test. At least I'm not able to easily find your updated version of the test in your fork, could you give me a direct link to the commit? |
@zeroshade my bad forgot to commit yesterday. Just pushed, you can find the changed test file here: https://github.com/ahmb84/arrow/blob/5a651992d5ca9db0a5eefc2569276cdfe3ccf038/go/parquet/encryption_read_config_test.go |
@ahmb84 Thanks for your great reproducer, i was able to narrow down the issue and not only fix it, but fix things to properly propagate that error through and added checks in the tests to check for any error returned from |
@zeroshade thank you for the quick solution! I'll see the PR, make sure that everything works well, and get back to you asap - probably tomorrow. |
Hello @zeroshade, sorry for the delay. I've been able to test your PR and it's working well. Thank you! |
In theory it should be entirely possible to do so, I'll have to take a look to see why it's panic'ing. Could you file another GH issue to track that? If you could provide the stack trace of the panic that would also be extremely helpful. |
Missed a `default` causing a fall through to always return an error that wasn't being returned through and checked. Test updated to validate number of values/rows written. Error is now propagated through if encountered. * Closes: #14940 Authored-by: Matt Topol <zotthewizard@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade thanks for your feedback! I'll open another issue with the stack trace. |
Describe the bug, including details regarding any error messages, version, and platform.
Hello, I am trying to leverage the go arrow library to encrypt/decrypt parquet files.
Unfortunately, it seems that the encrypted values are not written in the parquet file when calling the WriteBatch function.
I have written a test function that writes and then reads a parquet file. The function is working correctly without encryption, but when I add encryption, the values for the encrypted columns do not exist. This is confirmed by two reasons:
After noticing this behavior and testing different configurations (serial/buffer, encoding, compression), I've looked at the test suite and noticed what I assume is a bug.
I’ve run the test suite and all the tests pass. The decryptions work perfectly for the files coming from the repo https://github.com/apache/parquet-testing, but for the file created in the previous test (the encryption one), the test is actually silently failing.
The number of rows is zero, even if it should be 50. The test does check if the number of rows read equals the number of rows in the stats, but it does not check if there are rows at all.
I am assuming that something is wrong with the flush when using encryption.
I have forked the repo and updated the test to include a check for the number of rows. You can find the forked repo here => https://github.com/ahmb84/arrow/tree/parquet-go-add-test-for-number-of-rows-in-decryption
Let me know if my assumptions are correct or if I am missing anything. Thank you!
Component(s)
Go, Parquet
The text was updated successfully, but these errors were encountered: