Doc: fix error and enhanced iceberg catalog description for the flink DataStream API #2389
Conversation
| For an unpartitioned iceberg table, its data will be completely overwritten by `INSERT OVERWRITE`. |
| ## Reading with DataStream |
| ## Iceberg Operation with DataStream API |
I'd suggest naming this title "Access iceberg table in Java API".
I agree. Adding a section on accessing the Iceberg Java API could be a separate level-2 section, so there would be no need to change the heading level of all the remaining headings in this doc. I think that would be much better because it requires fewer changes.
Well, I will change this later.
| ## Reading with DataStream |
| ## Iceberg Operation with DataStream API |
| ### Load Iceberg Catalog |
Here I think we should explain the background of why we introduce the CatalogLoader & TableLoader: flink operators need to access the iceberg table, but Catalog & Table are not serializable because they depend on resources that cannot be serialized (such as a Connection). So we have to introduce the loaders to hold the configuration required to initialize the Catalog and Table.
We'd also better list the common CatalogLoaders and explain what each of them means:
a. HiveCatalogLoader;
b. HadoopCatalogLoader;
c. CustomCatalogLoader.
Ditto for TableLoader.
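To make the suggestion concrete, here is a minimal sketch of what the three loader flavors could look like with the iceberg-flink `CatalogLoader` factory methods. The factory names (`hive`, `hadoop`, `custom`) follow the `CatalogLoader` API as I understand it; the catalog names, warehouse path, and `com.example.MyCatalog` class are placeholder assumptions, not values from this PR.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.flink.CatalogLoader;

public class CatalogLoaderSketch {
  public static void main(String[] args) {
    Map<String, String> properties = new HashMap<>();
    properties.put("warehouse", "hdfs://nn:8020/warehouse/path"); // placeholder path

    // a. Hive catalog: table metadata is tracked by a Hive Metastore.
    CatalogLoader hiveLoader =
        CatalogLoader.hive("hive_catalog", new Configuration(), properties);

    // b. Hadoop catalog: metadata lives directly under a warehouse directory.
    CatalogLoader hadoopLoader =
        CatalogLoader.hadoop("hadoop_catalog", new Configuration(), properties);

    // c. Custom catalog: any Catalog implementation, named by its class.
    // "com.example.MyCatalog" is a hypothetical implementation class.
    CatalogLoader customLoader =
        CatalogLoader.custom("my_catalog", properties, new Configuration(),
            "com.example.MyCatalog");
  }
}
```

The loaders carry only serializable configuration, so they can ship with the flink job graph and re-create the Catalog on each task manager.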
| #### Load Hadoop Catalog |
| ```java |
| Map<String, String> properties = new HashMap<>(); |
Let's remove the indentation from the following Java statements here.
| properties.put("property-version", "1"); |
| properties.put("warehouse", "hdfs://nn:8020/warehouse/path"); |
| CatalogLoader catalogLoader = CatalogLoader.hadoop(HADOOP_CATALOG, new Configuration(), properties); |
We need a sentence to explain how to create a TableLoader and load the iceberg table.
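As a sketch of what such a sentence could demonstrate, a `TableLoader` can be built either from a Hadoop table path or from a catalog plus a table identifier. This assumes the iceberg-flink `TableLoader` API (`fromHadoopTable`, `fromCatalog`, `open`, `loadTable`); the path and the `db.tbl` identifier are placeholders.

```java
import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.flink.CatalogLoader;
import org.apache.iceberg.flink.TableLoader;

public class TableLoaderSketch {
  public static void main(String[] args) {
    // Option 1: load a table directly from a Hadoop-table location.
    TableLoader fromPath = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");

    // Option 2: load a table through a catalog and a table identifier.
    CatalogLoader catalogLoader =
        CatalogLoader.hadoop("hadoop_catalog", new Configuration(), new HashMap<>());
    TableLoader fromCatalog =
        TableLoader.fromCatalog(catalogLoader, TableIdentifier.of("db", "tbl"));

    // A loader must be opened before the underlying Table can be loaded.
    fromPath.open();
    Table table = fromPath.loadTable();
  }
}
```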
| ```java |
| StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(); |
| TableLoader tableLoader = TableLoader.fromHadooptable("hdfs://nn:8020/warehouse/path"); |
| TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path"); |
Thanks for the fix!
| CatalogLoader catalogLoader = CatalogLoader.hive(HIVE_CATALOG, new Configuration(), properties); |
| ``` |
| *Note*: The following are examples of Load Hadoop Catalog. |
Nit: "The following are examples of Load Hadoop Catalog." -> "The following takes loading a hadoop table as an example to demonstrate how to use the Java API to build flink DataStream jobs."
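For reference, a sketch of the kind of DataStream job the suggested wording points at: reading an iceberg table into a `DataStream<RowData>` via the `FlinkSource` builder. This assumes the iceberg-flink `FlinkSource.forRowData()` builder API; the table path is a placeholder and the job is only illustrative.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.RowData;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.source.FlinkSource;

public class IcebergReadJobSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();
    TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");

    // Batch read of the table's current snapshot; switch to
    // .streaming(true) for a continuous read of incremental snapshots.
    DataStream<RowData> stream = FlinkSource.forRowData()
        .env(env)
        .tableLoader(tableLoader)
        .streaming(false)
        .build();

    stream.print();
    env.execute("Read iceberg table with the DataStream API");
  }
}
```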
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |