load hive-site.xml for flink catalog#1527
load hive-site.xml for flink catalog#1527zhangjun0x01 wants to merge 30 commits intoapache:masterfrom
Conversation
| private void loadHiveConf(Configuration configuration, Map<String, String> properties) { | ||
| String hiveConfPath = properties.get(HIVE_SITE_PATH); | ||
| Path path = new Path(hiveConfPath); | ||
| if (hiveConfPath.startsWith("hdfs")) { |
There was a problem hiding this comment.
Could we support more than just HDFS (like s3)?
There was a problem hiding this comment.
yes ,you are right ,we should support more storage to store hive configuration file
There was a problem hiding this comment.
Could we also use a named constant for these string literals for the filesystem scheme? Maybe HDFS_SCHEME in this case would work.
| return createCatalog(name, properties, clusterHadoopConf()); | ||
| Configuration configuration = clusterHadoopConf(); | ||
| String catalogType = properties.get(ICEBERG_CATALOG_TYPE); | ||
| if (catalogType.equals("hive")) { |
There was a problem hiding this comment.
Same note here about a named constant for this string literal.
|
|
||
| private void download(Configuration configuration, Path hdfsHiveSitePath) { | ||
| try { | ||
| File tmpFile = File.createTempFile("hive-site.xml-", ""); |
There was a problem hiding this comment.
Is the temp file hive-site.xml- supposed to have that trailing dash? Is this possibly a typo?
If it's not a typo and it's used to distinguish from the actual hive-site.xml, perhaps we could use a name that clarifies more the intent?
There was a problem hiding this comment.
My initial idea was to add a prefix to distinguish it from the actual hive-site.xml. We can add a .tmp suffix to make users understand that it is only a temporary file
There was a problem hiding this comment.
That sounds much better to me. .tmp definitely better conveys the intent to me. Thanks for updating that!
Co-authored-by: Marton Bod <mbod@cloudera.com>
|
@zhangjun0x01, looks like this is picking up changes from other PRs. Could you fix that, please? |
|
|
||
| public static final String HIVE_SITE_PATH = "hive-site-path"; | ||
| public static final String HIVE_SITE_SCHEMA_FILE = "file"; | ||
| public static final String HIVE_SITE_SCHEMA_HDFS = "hdfs"; |
There was a problem hiding this comment.
Possible nit / typo / my own confusion on the above two: I think the term is SCHEME and not SCHEMA, but could be one of those instances where the term is overloaded. I've always heard of that part of the URI referred to as scheme.
There was a problem hiding this comment.
I see in your getSchema function that the call to the Hadoop fs API uses scheme so I think it is scheme.
String schema = path.toUri().getScheme();
#1437
In order to be compatible with various flink submission modes, FlinkCatalogFactory provides a variable hive-site-path to specify the path of hive-site.xml, which can be a local path or an hdfs path.