Skip to content

[FLINK-39201] Skip null values during length enforcement#27730

Merged
fapaul merged 1 commit intoapache:masterfrom
fapaul:flink-39201
Mar 3, 2026
Merged

[FLINK-39201] Skip null values during length enforcement#27730
fapaul merged 1 commit intoapache:masterfrom
fapaul:flink-39201

Conversation

@fapaul
Copy link
Contributor

@fapaul fapaul commented Mar 3, 2026

What is the purpose of the change

CharLengthConstraint.enforce() and BinaryLengthConstraint.enforce() crash with a NullPointerException when a nullable CHAR/VARCHAR/BINARY/VARBINARY column
contains a null value. The root cause is that both methods access the field value (e.g. rowData.getString(idx).numChars() / rowData.getBinary(idx).length) without
first checking rowData.isNullAt(idx). The NotNullConstraint only guards NOT NULL columns, so nullable columns with length constraints are unprotected.

Brief change log

- Added a `rowData.isNullAt(fieldIdx)` guard in `CharLengthConstraint.enforce()` to skip null values before calling `getString()`
- Added a `rowData.isNullAt(fieldIdx)` guard in `BinaryLengthConstraint.enforce()` to skip null values before calling `getBinary()`
- Added two semantic test programs (`CHAR_LENGTH_TRIM_PAD_WITH_NULLABLE_COLUMNS`, `BINARY_LENGTH_TRIM_PAD_WITH_NULLABLE_COLUMNS`) in `ConstraintEnforcerTestPrograms`

to verify null values in nullable columns pass through correctly with TRIM_PAD enforcement

Verifying this change

This change added tests and can be verified as follows:

- Added `CHAR_LENGTH_TRIM_PAD_WITH_NULLABLE_COLUMNS` semantic test program that feeds rows with null values into nullable `CHAR(8)`/`CHAR(6)`/`VARCHAR(6)` columns

with TRIM_PAD enforcement enabled, verifying that null fields pass through unchanged while non-null fields are correctly trimmed/padded
- Added BINARY_LENGTH_TRIM_PAD_WITH_NULLABLE_COLUMNS semantic test program that feeds rows with null values into nullable BINARY(8)/BINARY(6)/VARBINARY(6)
columns with TRIM_PAD enforcement enabled, verifying the same behavior for binary types

Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): **no**
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
- The serializers: **no**
- The runtime per-record code paths (performance sensitive): **yes** (adds a null check in the sink constraint enforcement hot path, but it is a single `isNullAt`

call which is negligible)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

Documentation

- Does this pull request introduce a new feature? **no**
- If yes, how is the feature documented? **not applicable**

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 3, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@AHeise AHeise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@gustavodemorais gustavodemorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix LGTM!

Optional suggestion: the fix protects both TRIM_PAD and ERROR code paths, correct? But I see the tests only cover TRIM_PAD. Adding an ERROR mode test would ensure no regression

@github-actions github-actions bot added the community-reviewed PR has been reviewed by the community. label Mar 3, 2026
@fapaul fapaul merged commit 9da7811 into apache:master Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-reviewed PR has been reviewed by the community.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants