Skip to content

[core] add file-format-per-level for paimon#1500

Merged
JingsongLi merged 2 commits intoapache:masterfrom
zhangjun0x01:file-format-per-level
Jul 20, 2023
Merged

[core] add file-format-per-level for paimon#1500
JingsongLi merged 2 commits intoapache:masterfrom
zhangjun0x01:file-format-per-level

Conversation

@zhangjun0x01
Copy link
Contributor

Purpose

Linked issue: #1474

Tests

org.apache.paimon.flink.CatalogTableITCase#testFileFormatPerLevel

API and Format

Documentation

private void updateFormatByLevel(int level) {
String fileFormat = levelFormats.get(level);
if (null != fileFormat) {
pathFactory.setFormatIdentifier(fileFormat);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not provide setFormatIdentifier, This is difficult to maintain.

A better way is to pass parameters when you need to do something, rather than setting its class variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I udpate it by construct a new DataFilePathFactory

@zhangjun0x01 zhangjun0x01 force-pushed the file-format-per-level branch from bf01d54 to 118f4dd Compare July 6, 2023 10:00
private final RowType recordType;
private FormatWriterFactory writerFactory;
@Nullable private TableStatsExtractor tableStatsExtractor;
private DataFilePathFactory pathFactory;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should maintain a Map<String, Factories> field to deal with different formats.

suggestedFileSize);
}

private void updateFormatByLevel(int level) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This style is not good because it dynamically decontaminates class member variables, which can lead to all sorts of problems.

In general, we recommend a functional style, which means that class members should ideally be immutable, with at most some cache stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated it.

@zhangjun0x01 zhangjun0x01 force-pushed the file-format-per-level branch 2 times, most recently from 7969c33 to 91d2926 Compare July 12, 2023 15:14
@zhangjun0x01 zhangjun0x01 marked this pull request as draft July 12, 2023 15:14
@zhangjun0x01 zhangjun0x01 force-pushed the file-format-per-level branch 3 times, most recently from 5187edd to 03e738d Compare July 13, 2023 15:02
@zhangjun0x01 zhangjun0x01 marked this pull request as ready for review July 14, 2023 01:24
@zhangjun0x01 zhangjun0x01 force-pushed the file-format-per-level branch from 03e738d to d0a9b0f Compare July 18, 2023 01:32
Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that there is still some work needed here to make the code clearer, such as putting format related things into one class in write and introducing a LazyMap class to avoid creating all the structures. In addition, the handling of default format should also be placed inside this class, rather than outside as it is now.

@zhangjun0x01
Copy link
Contributor Author

ok,I will deal it later

@JingsongLi JingsongLi force-pushed the file-format-per-level branch from e28222f to 9bb8c5e Compare July 20, 2023 07:02
@JingsongLi JingsongLi force-pushed the file-format-per-level branch from 9bb8c5e to ef1af77 Compare July 20, 2023 07:30
@JingsongLi
Copy link
Contributor

I added a commit ef1af77 to improve code structure

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit 755e9d8 into apache:master Jul 20, 2023
JingsongLi pushed a commit that referenced this pull request Jul 20, 2023
@zhangjun0x01
Copy link
Contributor Author

I added a commit ef1af77 to improve code structure

thank you very much .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants