-
Notifications
You must be signed in to change notification settings - Fork 3.7k
GH-37891: [C++][Parquet] Refine several classes in Parquet encryption #46202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @wgtmac this lgtm but I don't know whether this affect py or other interfaces
I've reverted the incorrect change within key_wrapping_test.cc. The file_system there shouldn't be moved. |
Are these the only places that we can apply it? Any other similar place in the parquet subdirectory? |
I'm sure there are more (e.g. FileKeyWrapper). Further changes can be included in this PR if that is desirable. |
Yes, I think we should not create similar PRs for a single issue. |
b3597e8
to
591c82a
Compare
@wgtmac should we take an another round in this? |
@kapoisu Were you planning to add more changes to this PR? Otherwise I think it can be merged. |
I shared some additional opinions in the original issue, but haven’t received a response from the original author yet. I’d like to get some feedback from others on that first. Besides, the original issue isn’t limited to parquet namespace. It just happens that my current modifications are still within that scope. If we try to cover all the necessary changes for the entire project in this PR, it might result in at least hundreds of commits to review. |
I noticed that the FileKeyUnwrapper class has public constructors that accept either owned or non-owned pointer to KeyToolkit, originating from PR #40329. Internally, the implementation only uses the non-owned pointer, but it keeps a copy of shared_ptr when the owned version is used, to avoid dropping the reference count to 0. If this class is intended for internal use only, there doesn’t seem to be a strong reason to keep the owned versions. Removing them would make the semantics clearer, as users would then understand they are responsible for managing the lifetime of the object. It would also simplify future additions by avoiding the need to introduce two constructors for every new parameter combination. @adamreeve any thoughts on this? |
This sounds good to me, but given this is technically a public class, maybe we should first mark the constructors that accept an owned pointer as deprecated and then remove them in a later release? Although I doubt this change would affect many, if any, people so I wouldn't be against just removing them. |
* Pass std::string by value to constructors and Make(). * Pass std::shared_ptr<::arrow::fs::FileSystem> by value to constructors and Make(). * Use emplace() instead of insert() in AddKeyMaterial() because the parameters can be moved.
* Pass movable objects by value to functions that keep a copy of them. * Update the call sites accordingly.
* Pass movable objects by value to functions that keep a copy of them. * Remove unnecessary const member variable that makes the class non- copyable and non-movable.
* Pass movable objects by value to functions that keep a copy of them. * Replace const std::shared_ptr<T>& with const T& in cases where the function doesn't manage or share ownership. * Update the call sites accordingly.
* Pass movable objects by value to functions that keep a copy of them. * Remove unnecessary const member variables that make the class non- copyable and non-movable. * Remove unnecessary constructors and destructors that can be generated automatically by compilers.
* Pass movable objects by value to functions that keep a copy of them. * Remove unnecessary const member variables that make the class non- copyable and non-movable. * Replace const std::shared_ptr<T>& with const T& in cases where the function doesn't manage or share ownership. * Update the call sites accordingly.
* Pass movable objects to functions that keep a copy of them. * Remove unnecessary const member variables that make the class non- copyable and non-movable. * Remove unnecessary constructors and destructors that can be generated automatically by compilers.
* Pass movable objects by value to functions that keep a copy of them. * Update the call sites accordingly.
* Pass movable objects to functions that keep a copy of them. * Remove unnecessary destructors that can be generated automatically by compilers.
I've rebased on git main, will merge if CI passes |
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 0bd4e73. There were 118 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
What changes are included in this PR?
Are these changes tested?
Yes, by existing tests.
Are there any user-facing changes?
No.