[FLINK-28096][hive] Hive dialect support set variables #20000

Merged 3 commits on Aug 1, 2022

Conversation

luoyuxia
Contributor

What is the purpose of the change

Make the Hive dialect support setting variables with the SET command.

Brief change log

  • For a SET command, extract the key and value, then set the value according to the kind of variable the key refers to.
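A minimal sketch of the dispatch described in the change log, not the code merged in this PR. It assumes the `hivevar:` and `hiveconf:` prefixes seen in the tests under review and treats an unprefixed key as a Hive configuration key. The class name `SetCommandSketch`, both method names, and the use of plain maps in place of HiveConf are hypothetical; the `system:` and `metaconf:` prefixes are left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the Flink or Hive implementation. */
final class SetCommandSketch {
    static final String HIVEVAR_PREFIX = "hivevar:";
    static final String HIVECONF_PREFIX = "hiveconf:";

    /** Splits "key=value" on the first '=' and trims both sides. */
    static String[] splitKeyValue(String body) {
        int eq = body.indexOf('=');
        if (eq < 0) {
            throw new UnsupportedOperationException(
                    "Unsupported SET command which misses '=': " + body);
        }
        return new String[] {body.substring(0, eq).trim(), body.substring(eq + 1).trim()};
    }

    /** Routes the value to the store selected by the key's prefix. */
    static void setVariable(
            Map<String, String> conf, Map<String, String> hiveVars, String key, String value) {
        if (key.startsWith(HIVEVAR_PREFIX)) {
            // hivevar:name -> Hive variable "name"
            hiveVars.put(key.substring(HIVEVAR_PREFIX.length()), value);
        } else if (key.startsWith(HIVECONF_PREFIX)) {
            // hiveconf:name -> configuration key "name"
            conf.put(key.substring(HIVECONF_PREFIX.length()), value);
        } else {
            // Assumption: an unprefixed key is a configuration key.
            conf.put(key, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        Map<String, String> hiveVars = new HashMap<>();
        String[] kv = splitKeyValue("hivevar:a=1");
        setVariable(conf, hiveVars, kv[0], kv[1]);
        System.out.println(hiveVars);
    }
}
```

Splitting on the first '=' only means a value may itself contain '=' characters.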

Verifying this change

Unit tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? N/A

@flinkbot
Collaborator

flinkbot commented Jun 17, 2022

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

Member

@fsk119 fsk119 left a comment

Thanks for your contribution. I left some comments.

if (!key.equals("silent")) {
    HiveParserSetProcessor.setVariable(hiveConf, hiveVariables, key, value);
}
return new NopOperation();
Member

We'd better also set the config in the table config, because users may set some optimization options here.

import static org.apache.hadoop.hive.conf.SystemVariables.METACONF_PREFIX;
import static org.apache.hadoop.hive.conf.SystemVariables.SYSTEM_PREFIX;

/** Counterpart of hive's {@link org.apache.hadoop.hive.ql.processors.SetProcessor}. */
Member

Sorry, I made a mistake. It's better to use {@link SetProcessor}.

@wuchong
Member

wuchong commented Jul 22, 2022

The compile phase failed.

@luoyuxia luoyuxia force-pushed the FLINK-28096-1 branch 3 times, most recently from 95d6432 to 432037b Compare July 24, 2022 08:13
Member

@fsk119 fsk119 left a comment

Thanks for the update. I left some comments. It looks good in general.

@@ -82,6 +88,7 @@ public class HiveParser extends ParserImpl {
    private static final Method getCurrentTSMethod =
            HiveReflectionUtils.tryGetMethod(
                    SessionState.class, "getQueryCurrentTimestamp", new Class[0]);
    private static final String HIVE_VARIABLE_PREFIX = "__hive.variable__";
Member

It's better to move this to HiveInternalOptions and explain it in the option description.

I think it's better to use __hive.variables__? WDYT?
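To illustrate what a reserved prefix such as `__hive.variable__` makes possible, here is a rough sketch that keeps Hive variables in a flat string-to-string config and recovers them later by scanning for the prefix. This is an assumption about the intent, not the PR's code: the class `HiveVariableStoreSketch`, its two methods, and the plain `Map` standing in for the table config are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; a plain Map stands in for the table config. */
final class HiveVariableStoreSketch {
    // Prefix taken from the diff under review.
    static final String HIVE_VARIABLE_PREFIX = "__hive.variable__";

    /** Stores a Hive variable in the flat config under the reserved prefix. */
    static void putVariable(Map<String, String> config, String name, String value) {
        config.put(HIVE_VARIABLE_PREFIX + name, value);
    }

    /** Rebuilds the Hive variable map from all entries carrying the prefix. */
    static Map<String, String> extractVariables(Map<String, String> config) {
        Map<String, String> variables = new HashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            if (entry.getKey().startsWith(HIVE_VARIABLE_PREFIX)) {
                String name = entry.getKey().substring(HIVE_VARIABLE_PREFIX.length());
                variables.put(name, entry.getValue());
            }
        }
        return variables;
    }
}
```

Because the variables live in the config rather than in a parser instance, a newly created parser could rebuild them with `extractVariables`.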

processSetCmd(
hiveConf, statement.substring(commandTokens[0].length()).trim()));
} else {
throw new UnsupportedOperationException();
Member

nit: It's better to tell users what is unsupported.

if (part[0].equals("silent")) {
    throw new UnsupportedOperationException("Unsupported command 'set silent'.");
}
HiveSetProcessor.setVariable(hiveConf, tableConfig, hiveVariables, part[0], part[1]);
Member

It's better to let SessionContext do the set.

Comment on lines 283 to 285
String option = HiveSetProcessor.getVariable(hiveConf, hiveVariables, nwcmd);
// for the variable
throw new UnsupportedOperationException("Unsupported SET command which misses '='.");
Member

Do we need this? I think we can remove it.

Comment on lines 252 to 254
String options =
        HiveSetProcessor.dumpOptions(
                hiveConf.getChangedProperties(), hiveConf, hiveVariables);
Member

Remove this if we don't support it.

hiveConf.verifyAndSet(key, value);
}

public static String getVariable(
Member

Do you want to support showing variables using SET? If not, I think we can add this when it is needed.

assertThat(hiveCatalog.getHiveConf().get("yyy")).isEqualTo("5");
// test set hivevar:
tableEnv.executeSql("set hivevar:a=1");
tableEnv.executeSql("set hiveconf:zzz=${hivevar:a}");
Member

I think we should create a new SQL parser to test whether the test config retains the hive-vars.

For example,

        tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
        tableEnv.executeSql("show tables");
        tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
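The statement `set hiveconf:zzz=${hivevar:a}` in the test under review depends on `${hivevar:...}` references being replaced before the value is stored. The following is only a sketch of that substitution under simplifying assumptions: it handles the `hivevar:` namespace alone, leaves unknown variables as written, and does not resolve nested references. The class `HiveVarSubstitutionSketch` and its method are hypothetical and are not taken from Hive or Flink.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Hive's variable substitution code. */
final class HiveVarSubstitutionSketch {
    // Matches ${hivevar:name} and captures "name".
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    /** Replaces each ${hivevar:name} with its value; unknown names are left untouched. */
    static String substitute(String text, Map<String, String> hiveVars) {
        Matcher matcher = REFERENCE.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = hiveVars.get(matcher.group(1));
            String replacement = value != null ? value : matcher.group(0);
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

With the variable `a` set to `1`, `substitute("set hiveconf:zzz=${hivevar:a}", vars)` yields `set hiveconf:zzz=1`.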

@luoyuxia luoyuxia force-pushed the FLINK-28096-1 branch 3 times, most recently from 79fbc1e to cbb3ce2 Compare July 27, 2022 07:02
Comment on lines +937 to +998
tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
tableEnv.executeSql("show tables");
tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
tableEnv.executeSql("set hiveconf:zzz1=${hivevar:a}");
Member

  1. Use a SET statement to switch the dialect?
  2. Set a Flink config under the Hive dialect and check that it works.

Contributor Author

1: We can't use a SET statement to switch the dialect directly, because the SetOperation is executed in SqlClient.
2: Flink's SetOperation is executed in SqlClient, so we can't set it in tableEnv. But the test set.q in flink-sql-client already covers setting a Flink config under the Hive dialect.

Member

@wuchong wuchong left a comment

LGTM.

Do you have other concerns? @fsk119

@wuchong wuchong merged commit a6a7063 into apache:master Aug 1, 2022
huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
4 participants