Closed
Labels
kind:feature (Feature Requests), kind:new provider request (label to mark request for adding new provider), needs-triage (label for new issues that we didn't triage yet)
Description
Apache XTable translates metadata among data lake table formats, allowing users to read from a data lake with tools that don't have native support for its format.
XTable can be run with a command like:
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml [--hadoopConfig hdfs-site.xml] [--convertersConfig converters.yaml] [--icebergCatalogConfig catalog.yaml]
An Airflow operator can be created to wrap this command and accept each XTable config either as a path to a YAML file or as a dict.
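As a rough illustration of the file-or-dict idea, the sketch below builds the argument list for the command above. It is not an existing provider API: the names `build_xtable_command` and `_materialize`, the parameter names, and the generated file names are all hypothetical. Dict configs are written out as JSON on the assumption that the XTable YAML loader accepts it (YAML 1.2 is designed as a superset of JSON); a real provider would more likely use a YAML library. The Hadoop config is XML, so the sketch only accepts a file path for it.

```python
import json
import os
from typing import List, Mapping, Optional, Union

# A config is either a path to an existing file or an in-memory dict.
ConfigInput = Union[str, Mapping]


def _materialize(config: ConfigInput, workdir: str, filename: str) -> str:
    """Return a file path for `config`; dicts are written into `workdir`."""
    if isinstance(config, str):
        return config  # already a file path, pass it through unchanged
    path = os.path.join(workdir, filename)
    with open(path, "w", encoding="utf-8") as fh:
        # Assumption: JSON output is accepted where a YAML file is expected.
        json.dump(config, fh, indent=2)
    return path


def build_xtable_command(
    jar_path: str,
    dataset_config: ConfigInput,
    workdir: str,
    hadoop_config: Optional[str] = None,
    converters_config: Optional[ConfigInput] = None,
    iceberg_catalog_config: Optional[ConfigInput] = None,
) -> List[str]:
    """Build the `java -jar ...` argument list shown in the issue."""
    cmd = [
        "java", "-jar", jar_path,
        "--datasetConfig",
        _materialize(dataset_config, workdir, "dataset_config.yaml"),
    ]
    if hadoop_config is not None:
        cmd += ["--hadoopConfig", hadoop_config]
    if converters_config is not None:
        cmd += ["--convertersConfig",
                _materialize(converters_config, workdir, "converters.yaml")]
    if iceberg_catalog_config is not None:
        cmd += ["--icebergCatalogConfig",
                _materialize(iceberg_catalog_config, workdir, "catalog.yaml")]
    return cmd
```

An operator's `execute` method could create a temporary directory, call this helper, and pass the resulting list to `subprocess.run(cmd, check=True)`. Returning a list rather than a shell string avoids quoting problems with paths.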
Use case/motivation
AWS provides an example XTableOperator for XTable. This blog has a good explanation of the open table formats XTable supports. However, this example operator is essentially an MVP and is packaged as an MWAA plugin. We can create an Apache XTable provider to make it available to more Airflow users and to accept more flexible user input.
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct