
[FLINK-39530][s3] Replace S3 compatible auto-detection with explicit chunked-encoding and checksum-validation config options #28010

Merged
gaborgsomogyi merged 1 commit into apache:release-2.3 from gaborgsomogyi:FLINK-39530-2.3 on Apr 24, 2026

@gaborgsomogyi
Contributor

What is the purpose of the change

This is a backport of #28009.

The native S3 connector previously treated any custom s3.endpoint as an indicator of an S3-compatible server, silently enabling path-style access and disabling chunked encoding and checksum validation. This caused legitimate AWS regional endpoints (e.g. s3.us-west-2.amazonaws.com) to be misconfigured with no workaround available.
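The failure mode described above can be sketched as follows. This is a simplified model for illustration only, not the connector's actual code: the class `LegacyEndpointHeuristic` and its `detect` method are hypothetical names, and the logic assumes the old heuristic keyed purely on whether a custom endpoint was set, as the description states.

```java
// Illustrative model of the removed heuristic; names are hypothetical.
public class LegacyEndpointHeuristic {

    /** Flags that the old auto-detection derived from the endpoint alone. */
    public static final class Flags {
        public final boolean pathStyleAccess;
        public final boolean chunkedEncoding;
        public final boolean checksumValidation;

        Flags(boolean pathStyleAccess, boolean chunkedEncoding, boolean checksumValidation) {
            this.pathStyleAccess = pathStyleAccess;
            this.chunkedEncoding = chunkedEncoding;
            this.checksumValidation = checksumValidation;
        }
    }

    /** Any non-empty custom endpoint was treated as an S3-compatible server. */
    public static Flags detect(String endpoint) {
        boolean customEndpoint = endpoint != null && !endpoint.trim().isEmpty();
        // Custom endpoint: path-style on, chunked encoding and checksum validation off.
        return new Flags(customEndpoint, !customEndpoint, !customEndpoint);
    }

    public static void main(String[] args) {
        // A legitimate AWS regional endpoint is misclassified as S3-compatible.
        Flags aws = detect("s3.us-west-2.amazonaws.com");
        System.out.println("pathStyleAccess=" + aws.pathStyleAccess
                + " chunkedEncoding=" + aws.chunkedEncoding
                + " checksumValidation=" + aws.checksumValidation);
    }
}
```

Because the decision depended only on the endpoint being set, a user pointing at an AWS regional endpoint had no option to restore the AWS defaults.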

This PR removes the auto-detection entirely and introduces two explicit config options:

  • s3.chunked-encoding.enabled (default: true)
  • s3.checksum-validation.enabled (default: true)

The existing s3.path-style-access option already served as an explicit knob. Users targeting S3-compatible servers (MinIO, Ceph, etc.) now configure all three flags explicitly.
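As a rough sketch of what "configure all three flags explicitly" might look like for an S3-compatible server, the snippet below collects the settings in a plain map. The option keys come from this PR; the endpoint URL is a made-up placeholder, and the chosen values simply mirror what the removed auto-detection used to apply. Whether a particular server actually needs chunked encoding or checksum validation disabled is an assumption to verify against that server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class S3CompatibleConfigExample {

    /** Builds explicit settings for an S3-compatible server (values are assumptions). */
    public static Map<String, String> forS3CompatibleServer(String endpoint) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("s3.endpoint", endpoint);
        // Previously implied by any custom endpoint; now set by the user.
        conf.put("s3.path-style-access", "true");
        conf.put("s3.chunked-encoding.enabled", "false");
        conf.put("s3.checksum-validation.enabled", "false");
        return conf;
    }

    public static void main(String[] args) {
        // Hypothetical MinIO endpoint, used only for illustration.
        forS3CompatibleServer("http://minio.example.internal:9000")
                .forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

For an AWS regional endpoint, only `s3.endpoint` would need to be set, since both new options default to true.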

Additionally, S3ClientProvider now stores and exposes all directly configurable flags (pathStyleAccess, chunkedEncoding, checksumValidation, maxConnections, maxRetries) for testability, and the test suite was refactored to assert actual config values rather than fs != null.
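The difference between asserting `fs != null` and asserting actual config values can be illustrated with a small stand-in. `ProviderStandIn` and its accessors are hypothetical and do not reflect the real `S3ClientProvider` API; only the option keys and their `true` defaults are taken from this PR.

```java
import java.util.Collections;
import java.util.Map;

public class ExplicitFlagAssertions {

    /** Hypothetical stand-in that keeps the resolved flags so tests can read them. */
    public static final class ProviderStandIn {
        private final boolean chunkedEncoding;
        private final boolean checksumValidation;

        public ProviderStandIn(Map<String, String> conf) {
            // Both options default to true, as stated in this PR.
            this.chunkedEncoding =
                    Boolean.parseBoolean(conf.getOrDefault("s3.chunked-encoding.enabled", "true"));
            this.checksumValidation =
                    Boolean.parseBoolean(conf.getOrDefault("s3.checksum-validation.enabled", "true"));
        }

        public boolean isChunkedEncoding() { return chunkedEncoding; }

        public boolean isChecksumValidation() { return checksumValidation; }
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Defaults: a non-null check would pass here even if a flag were wrong.
        ProviderStandIn defaults = new ProviderStandIn(Collections.emptyMap());
        check(defaults.isChunkedEncoding(), "chunked encoding should default to true");
        check(defaults.isChecksumValidation(), "checksum validation should default to true");

        // Overriding one option must not affect the other.
        ProviderStandIn custom = new ProviderStandIn(
                Collections.singletonMap("s3.chunked-encoding.enabled", "false"));
        check(!custom.isChunkedEncoding(), "chunked encoding should be disabled");
        check(custom.isChecksumValidation(), "checksum validation should stay enabled");
    }
}
```

A test that only checks for a non-null file system would not catch a flag that was silently flipped, which is the class of bug this PR addresses.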

Brief change log

  • Removed automatic S3-compatible server detection
  • Added s3.chunked-encoding.enabled (default: true)
  • Added s3.checksum-validation.enabled (default: true)
  • Test simplification
  • Additional tests for coverage

Verifying this change

Existing and newly added tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: yes

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable
Was generative AI tooling used to co-author this PR?
  • Yes

Generated-by: Claude Code

…chunked-encoding and checksum-validation config options
@gaborgsomogyi
Contributor Author

cc @mbalassi @ferenc-csaky

@flinkbot
Collaborator

flinkbot commented Apr 23, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

Contributor

@Samrat002 left a comment


Considering CI is green

@github-actions (bot) added the target:release-2.3 and community-reviewed labels on Apr 23, 2026
@gaborgsomogyi merged commit 2cf4c36 into apache:release-2.3 on Apr 24, 2026
