Refactor encryptor and decryptor interfaces to enable parallel decrypt in external DBPA #246
sofia-tekdatum merged 7 commits into main
argmarco-tkd
left a comment
Overall LGTM. Thanks for this!
Left a couple of comments, not blockers (but one of them may spark a separate discussion on performance).
  ::arrow::util::span<const uint8_t> aad,
- ::arrow::util::span<uint8_t> ciphertext) {
+ ::arrow::util::span<uint8_t> ciphertext,
+ std::unique_ptr<EncodingProperties> encoding_properties) {
This LGTM, but thinking about performance: is passing a pointer to EncodingProperties or passing it by value more efficient? (This can be left for a later pass, but I decided to ask given that we're changing the interface.)
I'd rather keep it this way to explicitly transfer ownership of the pointer. The move here transfers only the pointer; it does not deep-copy the EncodingProperties it points to.
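To make the ownership point concrete, here is a minimal sketch. It assumes a stand-in EncodingProperties struct and a hypothetical SketchEncryptor, neither of which is the real class from this PR. It only illustrates that moving a std::unique_ptr hands over the same heap object rather than copying it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real EncodingProperties class.
struct EncodingProperties {
  std::map<std::string, std::string> attrs;
};

// Hypothetical encryptor: taking the unique_ptr by value makes the
// ownership transfer visible in the signature.
struct SketchEncryptor {
  std::unique_ptr<EncodingProperties> held;

  void Encrypt(std::unique_ptr<EncodingProperties> encoding_properties) {
    // Only the pointer changes hands here; the map inside is not copied.
    held = std::move(encoding_properties);
  }
};

// Returns true when the callee ends up owning the very same object and the
// caller's pointer has been emptied by the move.
bool MoveKeepsSameObject() {
  auto props = std::make_unique<EncodingProperties>();
  props->attrs["page_encoding"] = "PLAIN";
  const EncodingProperties* before = props.get();

  SketchEncryptor encryptor;
  encryptor.Encrypt(std::move(props));

  return props == nullptr && encryptor.held.get() == before;
}
```

Passing by value would instead copy the whole map on every call unless the caller remembers to move, which is the cost the question above was getting at.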
// Some Encryptors and Decryptors may need to understand the page encoding before the
// encryption process. This method will be called from the Encrypt and Decrypt
// WithManagedBuffer methods.
std::unique_ptr<EncodingProperties> UpdateEncodingProperties(
nit: "BuildEncodingProperties" or similar may be a better name for this.
sofia-tekdatum
left a comment
Iterated over the PR, PTAL.
114db2b to 0ef0003
argmarco-tkd
left a comment
Overall LGTM, thanks! Left a question on a change in file_deserialize_test.cc.
  compression_codec_(compression_codec) {}

- [[nodiscard]] bool CanCalculateLengths() const override { return true; }
+ [[nodiscard]] bool CanCalculateLengths() const override { return false; }
Why did we need to make this change?
Because otherwise the regular Decrypt would be called. That path is not sent any EncodingProperties, so they would be null and the test would fail.
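A rough sketch of the dispatch this reply describes, under the assumption that the caller picks the decrypt path from CanCalculateLengths(). SketchDecryptor, DecryptPage and the returned strings are hypothetical and exist only for illustration; the real reader code may be structured differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real EncodingProperties class.
struct EncodingProperties {
  std::string page_encoding;
};

// Hypothetical decryptor exposing the two paths mentioned in the thread.
class SketchDecryptor {
 public:
  explicit SketchDecryptor(bool can_calculate_lengths)
      : can_calculate_lengths_(can_calculate_lengths) {}

  bool CanCalculateLengths() const { return can_calculate_lengths_; }

  // Regular path: no EncodingProperties are passed in.
  std::string Decrypt() const { return "Decrypt(no properties)"; }

  // Managed-buffer path: the caller hands over the EncodingProperties.
  std::string DecryptWithManagedBuffer(
      std::unique_ptr<EncodingProperties> props) const {
    if (props == nullptr) return "error: null properties";
    return "DecryptWithManagedBuffer(" + props->page_encoding + ")";
  }

 private:
  bool can_calculate_lengths_;
};

// Assumed caller logic: use the regular path only when lengths can be
// calculated up front, otherwise build the properties and pass them along.
std::string DecryptPage(const SketchDecryptor& decryptor) {
  if (decryptor.CanCalculateLengths()) {
    return decryptor.Decrypt();
  }
  auto props = std::make_unique<EncodingProperties>();
  props->page_encoding = "PLAIN";
  return decryptor.DecryptWithManagedBuffer(std::move(props));
}
```

With the flag returning true, the sketch never reaches the branch that supplies EncodingProperties, which mirrors why the test decryptor was switched to return false.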
  int32_t InvokeExternalEncrypt(::arrow::util::span<const uint8_t> plaintext,
                                ::arrow::ResizableBuffer* ciphertext,
-                               std::map<std::string, std::string> encoding_attrs);
+                               std::unique_ptr<EncodingProperties> encoding_properties);
Refactor the encryptor and decryptor interfaces to enable parallel processing in decrypt.
Remove UpdateEncodingProperties as a method and add it to the helpers.
Remove the restriction on using ExternalDBPA with the multi-thread flag.
Update the tests.
All tests pass locally, and the base and canonical apps run successfully.