Skip to content

[refactor] Introduce ConnectorObjectFactory#397

Merged
ZacBlanco merged 1 commit intobytedance:mainfrom
yingsu00:connector_refactor_2
Apr 3, 2026
Merged

[refactor] Introduce ConnectorObjectFactory#397
ZacBlanco merged 1 commit intobytedance:mainfrom
yingsu00:connector_refactor_2

Conversation

@yingsu00
Copy link
Copy Markdown
Contributor

@yingsu00 yingsu00 commented Mar 16, 2026

What problem does this PR solve?

Issue Number: Partially resolves #250

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

Introduce ConnectorObjectFactory

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    Click to view Benchmark Results
    Paste your google-benchmark or TPC-H results here.
    Before: 10.5s
    After:   8.2s  (+20%)
    
  • Negative Impact: Explained below (e.g., trade-off for correctness).

Release Note

Please describe the changes in this PR

Release Note: None

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)

    Click to view Breaking Changes
    Breaking Changes:
    - Description of the breaking change.
    - Possible solutions or workarounds.
    - Any other relevant information.
    

@yingsu00 yingsu00 force-pushed the connector_refactor_2 branch 7 times, most recently from 53a4015 to 665fe8d Compare March 19, 2026 12:21
@yingsu00
Copy link
Copy Markdown
Contributor Author

@ZacBlanco can you please review?

@ZacBlanco ZacBlanco self-requested a review March 19, 2026 16:58
Copy link
Copy Markdown
Collaborator

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small nits and a question or two on the API design. Otherwise I am excited for this!

Comment thread bolt/connectors/Connector.h Outdated
Comment thread bolt/connectors/Connector.h
Comment thread bolt/connectors/ConnectorObjectFactory.cpp
Comment thread bolt/connectors/hive/tests/HiveObjectFactoryTest.cpp Outdated
Comment thread bolt/connectors/ConnectorObjectFactory.h Outdated
Comment thread bolt/connectors/ConnectorObjectFactory.h Outdated
@yingsu00 yingsu00 force-pushed the connector_refactor_2 branch 6 times, most recently from 54e0e06 to 0edffd1 Compare April 1, 2026 12:15
Copy link
Copy Markdown
Collaborator

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more minor things

Comment thread bolt/exec/TableScan.cpp
hiveSplit->length != std::numeric_limits<uint64_t>::max()) {
preloadBytes = static_cast<int64_t>(hiveSplit->length);
}
int64_t preloadBytes = split->splitSizeBytes();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with this change we need to implement the splitSizeBytes() method inside of HiveConnectorSplit?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread bolt/connectors/hive/HiveConnector.cpp Outdated
Comment thread bolt/connectors/hive/HiveDataSink.h
Comment thread bolt/connectors/ConnectorObjectFactory.cpp
Comment thread bolt/connectors/ConnectorObjectFactory.h Outdated
@yingsu00 yingsu00 force-pushed the connector_refactor_2 branch 6 times, most recently from 78f090a to 0f51182 Compare April 3, 2026 15:55
…ns API

- Add ConnectorObjectFactory abstract interface with factory methods for
  creating connector objects (splits, column handles, table handles, etc.)
- Add ConnectorOptions base class and DynamicConnectorOptions subclass for
  typed factory method options; add makeOptions() helper
- Add HiveObjectFactory implementing ConnectorObjectFactory for Hive connector
- Add splitSizeBytes() virtual method to ConnectorSplit base class
- Add registerObjectFactory() to ConnectorFactory for factory registration
- Use ConnectorObjectFactory in TableScan operator
@yingsu00 yingsu00 force-pushed the connector_refactor_2 branch from 0f51182 to aa9feb0 Compare April 3, 2026 16:11
@yingsu00
Copy link
Copy Markdown
Contributor Author

yingsu00 commented Apr 3, 2026

I would prefer to have instances of connectorObjectFactory associated already with a connectorId, r

Done.

@ZacBlanco Thank you for your thorough review. I've addressed your comments. Would you please review again?

@ZacBlanco ZacBlanco added this pull request to the merge queue Apr 3, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 3, 2026
@ZacBlanco ZacBlanco added this pull request to the merge queue Apr 3, 2026
Merged via the queue into bytedance:main with commit 634cdc1 Apr 3, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature][Design] Refactor Bolt connectors to remove Hive coupling and prepare for pluggable connectors (Bolt) Background

2 participants