
[feature](load) support compressed JSON format data for broker load#30801

Closed
elvestar wants to merge 1 commit into apache:master from elvestar:compressed-json


elvestar commented Feb 4, 2024

Proposed changes

Issue Number: close #xxx

We have a large amount of GZIP-compressed JSON data stored in HDFS, but Doris currently does not support loading compressed JSON data.

Changes:

  1. NewJsonReader (in new_json_reader.cpp) now constructs the _decompressor during initialization, based on the compression type specified for the load, and uses it to initialize the LineReader.
  2. Refactored CsvReader's _create_decompressor() method: the code that constructs the _decompressor from _file_compress_type or _file_format_type was extracted into a factory method, create_decompressor(), on the Decompressor class, so that NewJsonReader can reuse it.
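
The refactoring described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CompressType`, the simplified `Status`, `GzipDecompressor`, and `JsonReaderSketch` are hypothetical stand-ins for the real Doris types, and the out-parameter uses `std::unique_ptr` instead of the `Decompressor**` seen in the diff to keep the example short.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real Doris Status and TFileCompressType are richer.
enum class CompressType { PLAIN, GZ, BZ2 };

struct Status {
    bool ok_flag;
    std::string msg;
    static Status OK() { return {true, ""}; }
    static Status Error(std::string m) { return {false, std::move(m)}; }
    bool ok() const { return ok_flag; }
};

class Decompressor {
public:
    virtual ~Decompressor() = default;
    virtual std::string name() const = 0;
    // Factory method: every reader calls this instead of keeping its own switch.
    static Status create_decompressor(CompressType type,
                                      std::unique_ptr<Decompressor>* out);
};

class GzipDecompressor : public Decompressor {
public:
    std::string name() const override { return "gzip"; }
};

Status Decompressor::create_decompressor(CompressType type,
                                         std::unique_ptr<Decompressor>* out) {
    switch (type) {
    case CompressType::PLAIN:
        out->reset();  // uncompressed input needs no decompressor
        return Status::OK();
    case CompressType::GZ:
        *out = std::make_unique<GzipDecompressor>();
        return Status::OK();
    default:
        return Status::Error("unsupported compress type");
    }
}

// Hypothetical reader that builds its decompressor during initialization.
class JsonReaderSketch {
public:
    Status init(CompressType type) {
        return Decompressor::create_decompressor(type, &_decompressor);
    }
    bool is_compressed() const { return _decompressor != nullptr; }

private:
    std::unique_ptr<Decompressor> _decompressor;
};
```

A CSV reader could call the same factory from its own initialization, which is the reuse the second change aims for.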


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR


github-actions bot left a comment


clang-tidy made some suggestions

return st;
}

Status Decompressor::create_decompressor(TFileCompressType::type type, Decompressor** decompressor) {

warning: method 'create_decompressor' can be made static [readability-convert-member-functions-to-static]

Suggested change
- Status Decompressor::create_decompressor(TFileCompressType::type type, Decompressor** decompressor) {
+ static Status Decompressor::create_decompressor(TFileCompressType::type type, Decompressor** decompressor) {

@elvestar (Contributor, Author):

I have already declared the method as static in the header file, and this warning is likely a false positive.
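
For context on why the suggested change would not apply as written: in C++, `static` on a member function belongs on the in-class declaration only, and repeating it on an out-of-class definition is a compile error. A minimal sketch of the pattern, using a hypothetical `Codec` class rather than the actual Decompressor code:

```cpp
#include <string>

class Codec {
public:
    // `static` goes here, on the declaration inside the class.
    static std::string kind_name(int kind);
};

// Out-of-class definition: the `static` keyword must not be repeated.
// Writing `static std::string Codec::kind_name(int kind)` would fail to compile.
std::string Codec::kind_name(int kind) {
    return kind == 0 ? "plain" : "compressed";
}
```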

return Status::OK();
}

Status Decompressor::create_decompressor(TFileFormatType::type type, Decompressor** decompressor) {

warning: method 'create_decompressor' can be made static [readability-convert-member-functions-to-static]

Suggested change
- Status Decompressor::create_decompressor(TFileFormatType::type type, Decompressor** decompressor) {
+ static Status Decompressor::create_decompressor(TFileFormatType::type type, Decompressor** decompressor) {

@elvestar (Contributor, Author):

I have already declared the method as static in the header file, and this warning is likely a false positive.

elvestar closed this Feb 4, 2024
elvestar deleted the compressed-json branch February 4, 2024 08:31