-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-26414][hive] Hive dialect supports macro #19561
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luoyuxia Thanks for your contribution, I left some comments.
private Operation convertCreateMacro(HiveParserASTNode ast) throws SemanticException { | ||
String macroName = ast.getChild(0).getText(); | ||
if (FunctionUtils.isQualifiedFunctionName(macroName)) { | ||
throw new SemanticException("Temporary macro cannot be created with a qualified name."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the method FunctionUtils.isQualifiedFunctionName
, this error message may be not cleared for user? so I think we can improve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
@@ -139,6 +139,13 @@ public void testFunction() throws Exception { | |||
assertDDLType(HiveASTParser.TOK_SHOWFUNCTIONS, "show functions"); | |||
} | |||
|
|||
@Test | |||
public void testMacro() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be better if we add a test that create a mcro with qualified name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may not need for macro with a qualified name will fail in parse phase.
The antrl segment is like follows:
KW_CREATE KW_TEMPORARY KW_MACRO Identifier LPAREN columnNameTypeList? RPAREN expression -> ^(TOK_CREATEMACRO Identifier columnNameTypeList? expression)
* serialized to string and held on in the HiveFunctionWrapper. | ||
*/ | ||
public HiveFunctionWrapper(String className, UDFType serializableInstance) { | ||
this.className = className; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can call it directly this(className);
} catch (ClassNotFoundException e) { | ||
throw new FlinkHiveUDFException( | ||
String.format("Failed to deserialize function %s", className), e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Failed to deserialize function %s."
throw new SemanticException("Temporary macro cannot be created with a qualified name."); | ||
} | ||
|
||
List<FieldSchema> arguments = getColumns((HiveParserASTNode) ast.getChild(1), true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The arguments
readability is not good? IMO, it refers the all columns of the table? In other words, we should add an annotation about it.
return actualColumnNames; | ||
} | ||
|
||
private void getMacroColumnData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can return a Tuple2 here directly? BTW, maybe we can merge the method getMacroColumnData
and getBody
into one, then return a Tuple3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return a Tuple2 is good to me. But I would like separate the getMacroColumnData
and getBody
,
} | ||
Set<String> expectedColumnNames = new LinkedHashSet<>(macroColumnNames); | ||
if (!expectedColumnNames.equals(actualColumnNames)) { | ||
throw new SemanticException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String.format("Expected columns [%s], but found [%s].", expectedColumnNames, actualColumnNames) here ?
throw new SemanticException( | ||
"Expected columns " + expectedColumnNames + " but found " + actualColumnNames); | ||
} | ||
if (expectedColumnNames.size() != macroColumnNames.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about expectedColumnNames.size() > macroColumnNames.size()
or expectedColumnNames.size() < macroColumnNames.size()
case, whether the error message is the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expectedColumnNames.size() > macroColumnNames.size()
will nerver happen for expectedColumnNames
is a set of macroColumnNames
.
for (FieldSchema argument : arguments) { | ||
TypeInfo columnType = TypeInfoUtils.getTypeInfoFromTypeString(argument.getType()); | ||
rowResolver.put( | ||
"", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.apache.commons.lang3.StringUtils.EMPTY?
rowResolver.put( | ||
"", | ||
argument.getName(), | ||
new ColumnInfo(argument.getName(), columnType, "", false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above
a35efa3
to
a24ba17
Compare
@lsyldliu Thanks for your review. I have adrressed your comments. |
a24ba17
to
9739bf9
Compare
@flinkbot run azure |
serializableInstance instanceof Serializable, | ||
String.format( | ||
"The UDF %s should be an instance of Serializable.", | ||
serializableInstance.getClass().getName())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check the class name is equal to the class of serializableInstance
.
try { | ||
return (UDFType) | ||
SerializationUtilities.deserializeObject( | ||
udfSerializedString, (Class<Serializable>) getUDFClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is problematic in the case the hive connector is dynamically loaded, i.e. in the user classloader instead of the current class loader. We should use the user classloader to load classes. Could you create a JIRA issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, I have created FLINK-28430 for it.
private Operation convertCreateMacro(HiveParserASTNode ast) throws SemanticException { | ||
String macroName = ast.getChild(0).getText(); | ||
if (FunctionUtils.isQualifiedFunctionName(macroName)) { | ||
throw new SemanticException("The name of the temporary macro can't contain `.`."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also print the wrong function name. For example:
CREATE TEMPORARY MACRO doesn't allow "." character in the macro name, but the name is "%s"
.
Besides, is there a test for the exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I add the test for the exception in HiveDialectITCase#testMacro just now.
5764a8b
to
e7e1338
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What is the purpose of the change
Hive dialect supports macro.
Brief change log
create/drop macro
macro
to invoking Hive's GenericUDFMacroVerifying this change
Added test in ...link-connector-hive/src/test/java/org/apache/flink/connectors/hive/HiveDialectITCase.java#testMacro
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation