[client][server] Support dynamically modifying table.auto-increment.cache-size via ALTER TABLE #3005

@matrixsparse

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Background

Currently, table.auto-increment.cache-size can only be set at table creation and cannot be changed afterward. In production, the optimal cache size varies across data ingestion phases, but users cannot adjust it without recreating the table, which is costly and disruptive.

Use Case

In our production environment, we have a typical two-phase data ingestion pattern for auto-increment primary key tables:

Phase 1: Initial Batch Loading

  • Need to backfill millions of historical records
  • Set table.auto-increment.cache-size = 100000 to minimize RPC calls to the coordinator and maximize write throughput
  • Large cache size significantly improves batch ingestion performance

Phase 2: Real-time Streaming (After Initial Load)

  • Daily ingestion volume: ~1M records/day
  • A large cache size (100K) becomes wasteful and causes:
    • Unnecessary memory consumption on tablet servers
    • Larger cache invalidation overhead if the coordinator restarts
    • Potentially larger ID gaps during failures

Current Workaround

  • Option 1: Keep the suboptimal large cache size permanently (wasteful of memory and resources)
  • Option 2: Recreate the table (costly and disruptive to production workloads)
    • Production tables may have existing partitions, materialized views, permissions, and synchronization tasks
    • Recreating the table leads to:
      • Partition data re-import
      • Business interruption
      • Extremely high metadata migration costs

Solution

Support dynamically modifying the cache size via ALTER TABLE:

-- Increase cache size for batch loading
ALTER TABLE my_table SET ('table.auto-increment.cache-size' = '100000');

-- Reduce cache size for real-time streaming
ALTER TABLE my_table SET ('table.auto-increment.cache-size' = '5000');

Anything else?

Implementation Plan

I've investigated the codebase and identified the key components that need to be modified:

Key Files to Modify:

  1. CoordinatorService.alterTable()

    • Add validation logic for table.auto-increment.cache-size changes
    • Trigger cache refresh on tablet servers after metadata update
  2. MetadataManager.alterTableProperties()

    • Persist the new cache size to ZooKeeper
    • Validate the new value is within acceptable bounds
  3. AutoIncrementManager

    • Add method to update cache size dynamically
    • Invalidate the old SequenceGenerator and create a new one with the updated cache size
  4. BoundedSegmentSequenceGenerator

    • The actual sequence generator that holds the cacheSize field
    • Needs to support runtime cache size updates
  5. CoordinatorEventProcessor.postAlterTableProperties()

    • Notify tablet servers to refresh auto-increment cache with new size
  6. TableDescriptorValidation

    • Add validation for the auto-increment cache size range
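The range validation mentioned for MetadataManager and TableDescriptorValidation could look roughly like the sketch below. The class name, method name, and the bounds (1 to 1,000,000) are placeholders for discussion, not existing code or agreed limits.

```java
import java.util.Map;

/** Hypothetical validator; the name and bounds are illustrative, not taken from the codebase. */
final class AutoIncrementCacheSizeValidator {
    static final String KEY = "table.auto-increment.cache-size";
    // Assumed bounds, to be settled during design review.
    static final long MIN_CACHE_SIZE = 1L;
    static final long MAX_CACHE_SIZE = 1_000_000L;

    private AutoIncrementCacheSizeValidator() {}

    /** Returns the parsed cache size, or throws if it is missing, malformed, or out of range. */
    static long validate(Map<String, String> alteredProperties) {
        String raw = alteredProperties.get(KEY);
        if (raw == null) {
            throw new IllegalArgumentException("Property " + KEY + " is not set");
        }
        long value;
        try {
            value = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid value for " + KEY + ": " + raw, e);
        }
        if (value < MIN_CACHE_SIZE || value > MAX_CACHE_SIZE) {
            throw new IllegalArgumentException(
                    KEY + " must be in [" + MIN_CACHE_SIZE + ", " + MAX_CACHE_SIZE + "], got " + value);
        }
        return value;
    }
}
```

Rejecting bad values before the metadata is persisted would keep an invalid size from ever reaching ZooKeeper or the tablet servers.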

Implementation Challenges:

  • AutoIncrementManager and SequenceGenerator are created during table initialization with a fixed cacheSize
  • Need to implement a thread-safe cache size update without disrupting ongoing writes
  • Must handle the transition period while the old cache is being exhausted
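One possible way to address these challenges is to let the new size take effect only when the next segment is fetched, so the current segment is drained first. The sketch below is a simplified stand-in, not the actual BoundedSegmentSequenceGenerator: the class name, the `segmentAllocator` callback (standing in for the coordinator RPC), and the method names are all hypothetical.

```java
import java.util.function.LongUnaryOperator;

/** Hypothetical sketch of a sequence generator whose segment size can change at runtime. */
final class ResizableSegmentSequence {
    /** Stand-in for the coordinator RPC: reserves `size` IDs and returns the first one. */
    private final LongUnaryOperator segmentAllocator;
    /** Read when a new segment is fetched; volatile so updates are visible across threads. */
    private volatile long cacheSize;
    private long next;   // next ID to hand out, guarded by this
    private long limit;  // exclusive end of the current segment, guarded by this

    ResizableSegmentSequence(long initialCacheSize, LongUnaryOperator segmentAllocator) {
        if (initialCacheSize <= 0) {
            throw new IllegalArgumentException("cacheSize must be positive");
        }
        this.cacheSize = initialCacheSize;
        this.segmentAllocator = segmentAllocator;
    }

    /** Records the new size; it applies from the next segment fetch onward. */
    void updateCacheSize(long newCacheSize) {
        if (newCacheSize <= 0) {
            throw new IllegalArgumentException("cacheSize must be positive");
        }
        this.cacheSize = newCacheSize;
    }

    /** Returns the next ID, fetching a new segment once the current one is exhausted. */
    synchronized long nextId() {
        if (next >= limit) {
            long size = cacheSize; // read once so start and limit use the same size
            next = segmentAllocator.applyAsLong(size);
            limit = next + size;
        }
        return next++;
    }
}
```

With this approach writers never block on a resize and no already-reserved IDs are discarded, at the cost that a reduction only takes effect after the current (possibly large) segment is used up. Whether that delay is acceptable, or the old segment should instead be dropped immediately, is an open design question.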

Proposed Timeline:

  • Week 1: Implement ALTER TABLE validation and metadata persistence
  • Week 2: Add dynamic cache size update mechanism in AutoIncrementManager
  • Week 3: Add tablet server coordination and comprehensive tests
  • Week 4: Address review feedback and finalize

I'm actively working on this feature and will provide a POC within the next week. Happy to discuss the design details and adjust based on community feedback!

Willingness to contribute

  • I'm willing to submit a PR!
