Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add inline descriptor Protobuf bytes decoder #13192

Merged
merged 5 commits into from
Oct 11, 2022

Conversation

jon-wei
Copy link
Contributor

@jon-wei jon-wei commented Oct 6, 2022

This PR adds a new ProtobufBytesDecoder implementation that allows the user to pass inline the contents of a Protobuf descriptor file, encoded as a Base64 string.

This can be useful when the user does not have the ability/convenience of placing files on the filesystems of the Druid cluster nodes.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@abhishekagarwal87 abhishekagarwal87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though please fix the CI failures. I bet there are coverage checks failing.

{
DynamicSchema dynamicSchema;
try {
byte[] decodedDesc = StringUtils.decodeBase64String(descriptorString);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of decode failure, IllegalArgumentException is thrown. We should catch it and turn it into a ParseException

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the descriptor can't be decoded, then we can't proceed, since we won't be able to parse any records, so I don't think this should be a parse exception

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But other exceptions caught by the try-catch here are turned into ParseException.

My concern is that, if the descriptorString is not a valid base64 string, the thrown IllegalArgumentException is not a user-face exception that maybe it's not intuitive for user to realize it's due to the wrong input instead they might think there's bug at the Druid server side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is not a kind of exception that we can just move over. Though @FrankChen021 makes a good point of surfacing a nicer user-friendly error and throwing an IAE isn't a good way of getting that done.
@jon-wei - What error does the user get with this change if the descriptor string is not valid / can't be decoded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to catch IllegalArgumentException and throw a new exception indicating Base64 encoding was not valid.

For valid Base64 and invalid descriptor the existing block

    catch (Descriptors.DescriptorValidationException e) {
      throw new ParseException(null, e, "Invalid descriptor string: " + descriptorString);
    }

handles that

import java.util.Objects;
import java.util.Set;

public class InlineDescriptorProtobufBytesDecoder implements ProtobufBytesDecoder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is very similar to FileBasedProtobufBytesDecoder. I'm wondering if we can abstract some code so that these two classes can share them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to share a base calss

@JsonProperty("protoMessageType") String protoMessageType
)
{
this.descriptorString = descriptorString;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to check if it's null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added null checks

Copy link
Member

@FrankChen021 FrankChen021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jon-wei for your work.

…ruid/data/input/protobuf/InlineDescriptorProtobufBytesDecoder.java

Co-authored-by: Frank Chen <frankchen@apache.org>
@jon-wei jon-wei merged commit 9b8e69c into apache:master Oct 11, 2022
@kfaraz kfaraz added this to the 25.0 milestone Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants