Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PARQUET-1477: Thrift crypto updates #124

Merged

Conversation

ggershinsky
Copy link
Contributor

No description provided.

Copy link

@shangxinli shangxinli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2: optional binary footer_key_metadata

/** Offset of encrypted Parquet footer **/
3: required i64 footer_offset

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'footer_offset' field still exists in the diagram at the section 'Encrypted footer mode' of the design document (https://github.com/ggershinsky/parquet-format/blob/p1232-encryption-docs/Encryption.md).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an outdated link. the current doc is at
https://github.com/apache/parquet-format/blob/encryption/Encryption.md

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the Parquet-1178(https://docs.google.com/document/d/1T89G7xR0zHFV1f2pjTO28jtfVm8qoNVGEJQ70Rsk-bY) to let it point to the current doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do.

* Retrieval metadata of key used for signing the footer.
* Used only in encrypted files with plaintext footer.
*/
9: optional binary footer_signing_key_metadata

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not mentioned in the diagram 'Plaintext footer mode' of the design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/** In files encrypted with AAD prefix without storing it,
* readers must supply the prefix **/
3: optional bool supply_aad_prefix
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are 'supply_aad_prefix' and 'aad_prefix'/'aad_file_unique' mutual exclusive? If yes, should we make it as union?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are not.

@@ -863,23 +869,27 @@ struct ColumnIndex {
}

struct AesGcmV1 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AesGcmV1 and AesGcmCtrV1 seems identical. If they are same, can we just use one structure instead of duplicating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • we use them to specify two different algorithms. since we dont use enums, its a union with two structs
  • the design is signed off. we shouldn't change it, lets only make sure the Thrift pr conforms to the merged design.

@ggershinsky
Copy link
Contributor Author

@rdblue @zivanfi Please check and merge this pull request. Then I'll be able to send the next one (to parquet-mr). Without Thrift merged in parquet-format/encryption branch, the mr pull requests won't pass the Travis check.

@rdblue
Copy link
Contributor

rdblue commented Jan 28, 2019

@shangxinli, please have a look at the new doc for encryption. If you want to know more about some of the choices in that doc, please look at #114 where most of the discussion happened. Let's focus in these pull requests on getting the code to match the doc.

@shangxinli
Copy link

@shangxinli, please have a look at the new doc for encryption. If you want to know more about some of the choices in that doc, please look at #114 where most of the discussion happened. Let's focus in these pull requests on getting the code to match the doc.

Thanks. I just looked at the new doc and reviewed the pr again. It looks good to me. I also tested it with a spark application by encrypting a column and decrypt it. It works.

@ggershinsky
Copy link
Contributor Author

@shangxinli Thanks for the review and test, sounds good.

Copy link
Contributor

@gszadovszky gszadovszky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.
Will wait an additional 24 hours and merge to the feature branch.

@gszadovszky gszadovszky merged commit f3527ef into apache:encryption Mar 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants