New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PARQUET-1477: Thrift crypto updates #124
PARQUET-1477: Thrift crypto updates #124
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2: optional binary footer_key_metadata | ||
|
||
/** Offset of encrypted Parquet footer **/ | ||
3: required i64 footer_offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'footer_offset' field still exists in the diagram at the section 'Encrypted footer mode' of the design document (https://github.com/ggershinsky/parquet-format/blob/p1232-encryption-docs/Encryption.md).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an outdated link. the current doc is at
https://github.com/apache/parquet-format/blob/encryption/Encryption.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the Parquet-1178(https://docs.google.com/document/d/1T89G7xR0zHFV1f2pjTO28jtfVm8qoNVGEJQ70Rsk-bY) to let it point to the current doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do.
* Retrieval metadata of key used for signing the footer. | ||
* Used only in encrypted files with plaintext footer. | ||
*/ | ||
9: optional binary footer_signing_key_metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not mentioned in the diagram 'Plaintext footer mode' of the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
/** In files encrypted with AAD prefix without storing it, | ||
* readers must supply the prefix **/ | ||
3: optional bool supply_aad_prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are 'supply_aad_prefix' and 'aad_prefix'/'aad_file_unique' mutual exclusive? If yes, should we make it as union?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are not.
@@ -863,23 +869,27 @@ struct ColumnIndex { | |||
} | |||
|
|||
struct AesGcmV1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AesGcmV1 and AesGcmCtrV1 seems identical. If they are same, can we just use one structure instead of duplicating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we use them to specify two different algorithms. since we dont use enums, its a union with two structs
- the design is signed off. we shouldn't change it, lets only make sure the Thrift pr conforms to the merged design.
@shangxinli, please have a look at the new doc for encryption. If you want to know more about some of the choices in that doc, please look at #114 where most of the discussion happened. Let's focus in these pull requests on getting the code to match the doc. |
Thanks. I just looked at the new doc and reviewed the pr again. It looks good to me. I also tested it with a spark application by encrypting a column and decrypt it. It works. |
@shangxinli Thanks for the review and test, sounds good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me.
Will wait an additional 24 hours and merge to the feature branch.
No description provided.