[HUDI-2325] Add hive sync support to kafka connect #3660
@@ -30,6 +30,7 @@
 import org.apache.hudi.common.util.SerializationUtils;
 import org.apache.hudi.common.util.StringUtils;
 import org.apache.hudi.connect.ControlMessage;
+import org.apache.hudi.connect.writers.KafkaConnectConfigs;
 import org.apache.hudi.exception.HoodieException;
 import org.apache.hudi.keygen.BaseKeyGenerator;
 import org.apache.hudi.keygen.CustomAvroKeyGenerator;
@@ -63,6 +64,7 @@
 public class KafkaConnectUtils {

   private static final Logger LOG = LogManager.getLogger(KafkaConnectUtils.class);
+  private static final String HOODIE_CONF_PREFIX = "hoodie.";

   public static int getLatestNumPartitions(String bootstrapServers, String topicName) {
     Properties props = new Properties();
@@ -85,9 +87,15 @@ public static int getLatestNumPartitions(String bootstrapServers, String topicNa
    *
    * @return
    */
-  public static Configuration getDefaultHadoopConf() {
+  public static Configuration getDefaultHadoopConf(KafkaConnectConfigs connectConfigs) {
     Configuration hadoopConf = new Configuration();
     hadoopConf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
+    connectConfigs.getProps().keySet().stream().filter(prop -> {
+      // In order to prevent printing unnecessary warn logs, here filter out the hoodie
+      // configuration items before passing to hadoop/hive configs
+      return !prop.toString().startsWith(HOODIE_CONF_PREFIX);
+    }).forEach(prop -> {
+      hadoopConf.set(prop.toString(), connectConfigs.getProps().get(prop.toString()).toString());
+    });
Comment on lines +92 to +98

Reviewer: I see similar code in

Author: I could, but I have to convert the Kafka map configs to Hadoop configs. Also, we can change this logic later if required. For instance, if we want Hadoop confs in Kafka Connect to just start with "conf.hadoop.", we can do that independently. WDYT?

Reviewer: Got it. Then let's keep this method.
     return hadoopConf;
   }
As a follow-up, maybe we can add all this Kafka-related environment setup, including the schema registry, to the docker demo, to make it easier for users to try out.
ack.