[FLINK-39185][table] Introduce BITMAP type for Table API/SQL #27778

Open
dylanhz wants to merge 2 commits into apache:master from dylanhz:FLINK-39185

dylanhz (Contributor) commented Mar 17, 2026

What is the purpose of the change

This pull request adds Table API/SQL support for the BITMAP data type introduced in FLIP-556. It integrates BITMAP into Flink's type system, internal data format, planner, and code generation, enabling BITMAP columns to be used in SQL queries and Table API programs.

This is the third PR in the FLIP-556 series:

  • PR 1 (FLINK-39183): Parser support
  • PR 2 (FLINK-39184): DataStream API support (flink-core)
  • PR 3 (FLINK-39185): Table API/SQL support (this PR)

Brief change log

Suggested review order:

  1. LogicalType system: Added BitmapType, LogicalTypeRoot.BITMAP, LogicalTypeFamily.EXTENSION, visitor support, cast rules, and type parsing
  2. DataType / API layer: Added DataTypes.BITMAP(), registered type mappings in ClassDataTypeConverter, TypeInfoDataTypeConverter, and ValueDataTypeConverter
  3. Internal data format: Extended RowData/ArrayData with getBitmap(), implemented in BinaryRowData/BinaryArrayData/GenericRowData/GenericArrayData/NestedRowData; added BinarySegmentUtils.readBitmap() and BinaryWriter.writeBitmap()
  4. Planner integration: Added BitmapRelDataType, integrated into FlinkTypeFactory (bidirectional conversion between BitmapType and BitmapRelDataType), extended CodeGenUtils for code generation, and updated ExpressionReducer
  5. Cast rules: Added BitmapToStringCastRule and BitmapToBinaryCastRule (with trim/pad semantics); restricted CAST(x AS BITMAP) in SqlCastFunction
  6. Data converters: Added BitmapBitmapConverter, DataFormatConverters.BitmapConverter, and JSON serialization/deserialization for BitmapType
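Item 5 mentions trim/pad semantics for BitmapToBinaryCastRule. As a rough illustration only, the sketch below shows what fitting a serialized byte array to a fixed target length could look like in plain Java. The class TrimPadSketch and the method trimOrPad are hypothetical names, not code from this PR, and the actual rule may behave differently (for example, in how BINARY and VARBINARY targets are distinguished).

```java
import java.util.Arrays;

/** Hypothetical helper, not part of the PR: fits a byte array to a target length. */
final class TrimPadSketch {

    private TrimPadSketch() {}

    /**
     * Returns a copy of {@code bytes} that is exactly {@code targetLength} long:
     * longer input is truncated, shorter input is right-padded with zero bytes.
     */
    static byte[] trimOrPad(byte[] bytes, int targetLength) {
        if (bytes == null) {
            // Assumption: a null input stays null, mirroring SQL NULL propagation.
            return null;
        }
        // Arrays.copyOf truncates or zero-pads as needed.
        return Arrays.copyOf(bytes, targetLength);
    }
}
```

With this sketch, a three-byte input cast to length 2 keeps the first two bytes, and the same input cast to length 5 gains two trailing zero bytes.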

Verifying this change

This change added tests and can be verified as follows:

  • BitmapSemanticTest: End-to-end integration tests for BITMAP in SQL/Table API, covering source/sink roundtrip, projection, filtering, UDF invocation, and UDAF aggregation
  • BinaryRowDataTest / BinaryArrayDataTest: Unit tests for BITMAP read/write in binary row and array formats
  • RowDataTest: Verifies BITMAP field access and FieldGetter in RowData
  • DataTypesTest: Verifies DataTypes.BITMAP() resolution and class mapping
  • LogicalTypesTest: Tests BitmapType properties, serialization string, and cast compatibility
  • ProjectionCodeGeneratorTest: Verifies BITMAP field projection in generated code
  • TypeInferenceExtractorTest: Tests type inference for UDFs that accept/return BITMAP, including rejection of custom Bitmap implementations

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes (RowData, ArrayData, DataTypes, BinaryWriter)
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes (new getBitmap/writeBitmap code paths, but only activated for BITMAP type columns)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? not documented (documentation will be added when the full BITMAP type support is complete, including built-in functions)

flinkbot (Collaborator) commented Mar 17, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

dylanhz marked this pull request as ready for review March 19, 2026 12:04

lincoln-lil (Contributor) left a comment

@dylanhz Thanks for working on this! I've left some comments. For the tests, can you add cases covering cast call results, e.g., CAST(bitmap AS STRING) and CAST(bitmap AS VARBINARY)?

Another question, on the Python side: should we adapt the BITMAP type in PythonTableUtils.converter()?

* <p>The serializable string representation of this type is {@code BITMAP}.
*/
@PublicEvolving
public final class BitmapType extends LogicalType {

Add serialVersionUID
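
For context, a minimal sketch of what this request amounts to, using a hypothetical stand-in class rather than the real BitmapType. The class name, its single field, and the value 1L are all assumptions made for illustration; the PR may choose differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for a serializable type class; not the real BitmapType. */
final class SerializableTypeSketch implements Serializable {

    // An explicit UID pins the serialized form instead of relying on the
    // compiler-derived default, which can change when the class is modified.
    private static final long serialVersionUID = 1L;

    private final boolean nullable;

    SerializableTypeSketch(boolean nullable) {
        this.nullable = nullable;
    }

    boolean isNullable() {
        return nullable;
    }

    /** Serializes and deserializes a value with plain Java serialization. */
    static SerializableTypeSketch roundTrip(SerializableTypeSketch value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SerializableTypeSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```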

}

/** Gets a copied bitmap. Returns null if {@code other} is null. */
static Bitmap from(Bitmap other) {

We should at least add Javadoc here to clearly note that currently only RoaringBitmapData is supported.
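
One way the requested Javadoc and a matching guard might look, sketched with hypothetical stand-in types (BitmapSketch and SetBackedBitmap) rather than the real Bitmap and RoaringBitmapData. Whether the real from() throws for unsupported implementations is an assumption here; the diff fragment does not show its body.

```java
import java.util.Collection;
import java.util.TreeSet;

/** Hypothetical stand-in for the Bitmap interface in the diff; not the real API. */
interface BitmapSketch {

    boolean contains(int value);

    /**
     * Gets a copied bitmap. Returns null if {@code other} is null.
     *
     * <p>Only {@code SetBackedBitmap} is supported in this sketch; any other
     * implementation is rejected (an assumption about the desired behavior).
     */
    static BitmapSketch from(BitmapSketch other) {
        if (other == null) {
            return null;
        }
        if (!(other instanceof SetBackedBitmap)) {
            throw new IllegalArgumentException(
                    "Unsupported bitmap implementation: " + other.getClass().getName());
        }
        // Copy the backing set so later changes to 'other' do not leak into the copy.
        return new SetBackedBitmap(((SetBackedBitmap) other).values);
    }
}

/** Hypothetical single supported implementation, playing the role of RoaringBitmapData. */
final class SetBackedBitmap implements BitmapSketch {

    final TreeSet<Integer> values;

    SetBackedBitmap(Collection<Integer> initial) {
        this.values = new TreeSet<>(initial);
    }

    void add(int value) {
        values.add(value);
    }

    @Override
    public boolean contains(int value) {
        return values.contains(value);
    }
}
```

Stating the restriction in the Javadoc and enforcing it in the method keeps the documented contract and the runtime behavior aligned, so callers passing a custom implementation fail early with a clear message.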
