Summary
Bitmap should support the same segment lifecycle used by vector-style distributed indexing:
- create_index_uncommitted(fragment_ids=...) builds a durable, uncommitted physical segment.
- merge_existing_index_segments(segments) optionally consolidates Bitmap segments.
- commit_existing_index_segments(index_name, column, segments) publishes the final physical segments as one logical Bitmap index.
This work must be additive. Existing merged Bitmap indices and the legacy distributed Bitmap workflow based on merge_index_metadata(index_uuid, "BITMAP") must continue to work.
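The three-step lifecycle in the summary can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: ToyDataset, Segment, and lookup are hypothetical names invented for illustration, the bitmaps are plain Python sets, and nothing here reflects the real on-disk format or the real signatures beyond the function names and parameters quoted above.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one physical Bitmap segment."""
    segment_id: str          # independent physical segment UUID
    fragment_ids: frozenset  # fragments this segment covers
    bitmaps: dict            # value -> set of row ids (toy bitmap)


class ToyDataset:
    """Hypothetical in-memory dataset; not the real API surface."""

    def __init__(self, fragments):
        # fragments: {fragment_id: [(row_id, value), ...]}
        self.fragments = fragments
        self.committed = {}  # index_name -> (column, [Segment, ...])

    def create_index_uncommitted(self, fragment_ids):
        # Build one segment over the given fragments; nothing is published yet.
        bitmaps = {}
        for fid in fragment_ids:
            for row_id, value in self.fragments[fid]:
                bitmaps.setdefault(value, set()).add(row_id)
        return Segment(str(uuid.uuid4()), frozenset(fragment_ids), bitmaps)

    def merge_existing_index_segments(self, segments):
        # Optional step: consolidate several segments into a new one.
        merged, covered = {}, set()
        for seg in segments:
            covered |= seg.fragment_ids
            for value, rows in seg.bitmaps.items():
                merged.setdefault(value, set()).update(rows)
        return Segment(str(uuid.uuid4()), frozenset(covered), merged)

    def commit_existing_index_segments(self, index_name, column, segments):
        # Publish the final physical segments as one logical index.
        self.committed[index_name] = (column, list(segments))

    def lookup(self, index_name, value):
        # A query unions the matching bitmap from every committed segment.
        _, segments = self.committed[index_name]
        rows = set()
        for seg in segments:
            rows |= seg.bitmaps.get(value, set())
        return rows


ds = ToyDataset({0: [(0, "a"), (1, "b")], 1: [(2, "a")], 2: [(3, "b"), (4, "a")]})
s0 = ds.create_index_uncommitted(fragment_ids=[0])
s1 = ds.create_index_uncommitted(fragment_ids=[1])
s2 = ds.create_index_uncommitted(fragment_ids=[2])
merged = ds.merge_existing_index_segments([s0, s1])
ds.commit_existing_index_segments("col_idx", "col", [merged, s2])
```

In this sketch the merge step is optional: committing [s0, s1, s2] directly would answer lookups identically, only with three physical segments behind the logical index instead of two.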
Compatibility boundary
Do not change the legacy meaning of index_uuid in the existing Bitmap distributed path. Today it is a shared merge directory / job id for part_*_bitmap_page_lookup.lance files, not a physical segment id. The new canonical segment path must use independent physical segment UUIDs.
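The difference between the two identifier schemes can be shown with a small sketch. The helper names are hypothetical, and the assumption that the * in part_*_bitmap_page_lookup.lance is filled with a shard id is a guess made for illustration, not a statement about the actual file layout.

```python
import uuid


def legacy_part_paths(index_uuid, shard_ids):
    """Legacy path: all workers write part files under one shared index_uuid.

    Assumption: the wildcard in the file name is a shard id.
    """
    return [f"{index_uuid}/part_{sid}_bitmap_page_lookup.lance" for sid in shard_ids]


def new_segment_ids(num_segments):
    """Canonical path: each physical segment gets its own independent UUID."""
    return [str(uuid.uuid4()) for _ in range(num_segments)]


job_id = str(uuid.uuid4())  # shared merge directory / job id, not a segment id
parts = legacy_part_paths(job_id, shard_ids=[0, 1, 2])
segments = new_segment_ids(3)
```

Here every legacy part file shares the same job_id prefix, while the three new segment ids have nothing in common. Code that treats a legacy index_uuid as a physical segment id would conflate the two.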
Tasks
Suggested order
- Build canonical Bitmap segments.
- Commit and query multiple Bitmap segments.
- Merge Bitmap segments.
- Wire Bitmap into IndexSegmentBuilder.
- Preserve and document the legacy path.
- Discuss deprecation and default migration separately.
Non-goals for this tracking issue
- Do not remove merge_index_metadata(index_uuid, "BITMAP").
- Do not remove BitmapParameters.shard_id.
- Do not change ordinary create_scalar_index("col", "BITMAP") into a multi-segment build by default.
- Do not change Python defaults before a separate deprecation / migration discussion.