@XuQianJin-Stars
Contributor

Fix an error and enhance the Iceberg catalog description for the Flink DataStream API.

@github-actions github-actions bot added the docs label Mar 28, 2021
For an unpartitioned iceberg table, its data will be completely overwritten by `INSERT OVERWRITE`.

## Reading with DataStream
## Iceberg Operation with DataStream API
Member

I'd suggest naming this section "Access iceberg table in Java API".

Contributor

I agree. And adding a section on accessing the Iceberg Java API can be a separate level-2 section, so there would be no need to change the heading level of all the remaining headings in this doc. I think that would be much better because it is fewer changes.

Contributor Author

> I agree. And adding a section on accessing the Iceberg Java API can be a separate level-2 section, so there would be no need to change the heading level of all the remaining headings in this doc. I think that would be much better because it is fewer changes.

Well, I will change this later.


## Reading with DataStream
## Iceberg Operation with DataStream API
### Load Iceberg Catalog
Member

Here I think we should give the background on why we introduced CatalogLoader and TableLoader: Flink operators need to access the Iceberg table, but Catalog and Table are not serializable because they depend on resources that cannot be serialized (such as a Connection). So we have to introduce the loaders to hold the configuration required to initialize the Catalog and Table.

We should also list the general CatalogLoaders and explain what each one means:

a. HiveCatalogLoader ;
b. HadoopCatalogLoader ;
c. CustomCatalogLoader.

Ditto for TableLoader.
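
To make the serialization point concrete for the docs, here is a minimal sketch using only the JDK. `FakeCatalog`, `FakeCatalogLoader`, and `roundTrip` are hypothetical names made up for illustration; they are not Iceberg classes and this is not how Iceberg implements its loaders. It only shows the general pattern: the loader is serializable and carries configuration, while the non-serializable object is created on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class LoaderSketch {

  // Hypothetical stand-in for a Catalog or Table: it would hold live
  // resources (connections, file systems), so it is NOT Serializable.
  static class FakeCatalog {
    final String warehouse;

    FakeCatalog(String warehouse) {
      this.warehouse = warehouse;
    }
  }

  // Hypothetical loader: Serializable, and it carries only the
  // configuration needed to build the catalog later.
  static class FakeCatalogLoader implements Serializable {
    private static final long serialVersionUID = 1L;
    private final HashMap<String, String> properties;

    FakeCatalogLoader(Map<String, String> properties) {
      this.properties = new HashMap<>(properties);
    }

    // Builds the non-serializable object on whichever JVM calls this.
    FakeCatalog loadCatalog() {
      return new FakeCatalog(properties.get("warehouse"));
    }
  }

  // Serializes and deserializes the loader, roughly what a framework
  // does when it ships an operator to a worker.
  static FakeCatalogLoader roundTrip(FakeCatalogLoader loader)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(loader);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (FakeCatalogLoader) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> properties = new HashMap<>();
    properties.put("warehouse", "hdfs://nn:8020/warehouse/path");

    // Only the loader crosses the serialization boundary.
    FakeCatalogLoader shipped = roundTrip(new FakeCatalogLoader(properties));

    // The catalog itself is created after deserialization.
    FakeCatalog catalog = shipped.loadCatalog();
    System.out.println(catalog.warehouse);
  }
}
```

Trying to pass a `FakeCatalog` through `roundTrip` instead would fail with `NotSerializableException`, which is the situation the loaders are meant to avoid.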

#### Load Hadoop Catalog

```java
Map<String, String> properties = new HashMap<>();
Member

Let's remove the indentation from the following Java statements.

properties.put("property-version", "1");
properties.put("warehouse", "hdfs://nn:8020/warehouse/path");
CatalogLoader catalogLoader = CatalogLoader.hadoop(HADOOP_CATALOG, new Configuration(), properties);
Member

We need a sentence here explaining how to create the TableLoader and load the Iceberg table.

```java
StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();
- TableLoader tableLoader = TableLoader.fromHadooptable("hdfs://nn:8020/warehouse/path");
+ TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");
Member

Thanks for the fix!

CatalogLoader catalogLoader = CatalogLoader.hive(HIVE_CATALOG, new Configuration(), properties);
```

*Note*: The following are examples of Load Hadoop Catalog.
Member

Nit: "The following are examples of Load Hadoop Catalog." -> "The following takes loading a Hadoop table as an example to demonstrate how to use the Java API to build Flink DataStream jobs."

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 17, 2024
@github-actions

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Jul 24, 2024

3 participants