@HuangZhenQiu
Contributor

What is the purpose of the change

Allow passing the Hadoop configuration into the Parquet reader.

Brief change log

  • Change AvroParquetRecordFormat to pass the Hadoop configuration to the AvroParquetReader builder.

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Collaborator

flinkbot commented Apr 5, 2024

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

return new AvroParquetRecordReader<E>(
        AvroParquetReader.<E>builder(new ParquetInputFile(stream, fileLen))
                .withDataModel(getDataModel())
                .withConf(HadoopUtils.getHadoopConfiguration(config))
Contributor

What happens here if the optional flink-hadoop-fs is not on the classpath?

Contributor Author

It is already included in flink-runtime (https://github.com/apache/flink/blob/master/flink-runtime/pom.xml#L103), so it should not be a concern, right?
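
For readers following the classpath question above, here is a minimal sketch of how one could probe whether an optional class is loadable at runtime. It is not part of this PR and not how Flink handles it; the class name OptionalDependencyCheck and the helper tryLoad are hypothetical, and org.apache.hadoop.conf.Configuration is used only as an example of a class that may or may not be present.

```java
import java.util.Optional;

public class OptionalDependencyCheck {

    /**
     * Hypothetical helper: returns the class if it can be loaded,
     * or an empty Optional if it is missing from the classpath.
     */
    public static Optional<Class<?>> tryLoad(String className) {
        try {
            // initialize=false: only test for presence, do not run static initializers.
            return Optional.of(
                    Class.forName(
                            className,
                            false,
                            OptionalDependencyCheck.class.getClassLoader()));
        } catch (ClassNotFoundException | LinkageError e) {
            // The class itself, or something it links against, is not available.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Example class name only; the result depends on the runtime classpath.
        String hadoopConf = "org.apache.hadoop.conf.Configuration";
        System.out.println(hadoopConf + " present: " + tryLoad(hadoopConf).isPresent());
    }
}
```

Without such a guard, code that references a missing class directly would typically fail with a NoClassDefFoundError when that code path is first executed, which is why it matters whether the dependency is guaranteed to be there.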

@mbalassi self-requested a review April 5, 2024 15:58
Contributor

@mbalassi left a comment

Thanks for addressing my concern, looks good.

@mbalassi
Contributor

Merging this given no objections. We had an offline discussion with @HuangZhenQiu to see if a test could be added to cover this with reasonable effort. We did not find a straightforward approach at this time.

@mbalassi merged commit c0891cf into apache:master Apr 10, 2024
hanyuzheng7 pushed a commit to hanyuzheng7/flink that referenced this pull request May 6, 2024
apache#24623)

Co-authored-by: Peter (ACS) Huang <zhenqiu_huang@apple.com>