Title
Redesign IO format hierarchy before adding new DataFrame sources
Problem
The current hierarchy of SupportedCodeGenerationFormat and SupportedDataFrameFormat is hard to extend for new IO sources.
It mostly works for the existing file-based formats, but adding non-file or metadata-rich sources such as JDBC requires workarounds and duplicated logic.
This makes future source integration harder and riskier.
Expected
Redesign the format hierarchy so new DataFrame sources can be added consistently.
Design scope
Clarify how the hierarchy should support:
- file-based formats
- non-file sources such as JDBC
- metadata-based sources
- code generation support
- shared capabilities between sources
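One possible shape for such a hierarchy, given only as a starting point for discussion: a minimal sketch that models each item in the scope as a small capability interface instead of a single inheritance chain. Every name below (DataFrameSource, FileBased, ConnectionBased, ProvidesSchema, SupportsCodeGeneration, SourceRegistry, readSnippet) is hypothetical and does not exist in the current code base; the sketch does not describe how SupportedCodeGenerationFormat or SupportedDataFrameFormat work today.

```kotlin
// Hypothetical sketch only: none of these types exist in the library.

// Root of the proposed hierarchy: anything that can produce a DataFrame.
interface DataFrameSource {
    val name: String
}

// Capability: the source is a file recognized by its extension.
interface FileBased : DataFrameSource {
    val extensions: Set<String>
    fun acceptsExtension(extension: String): Boolean =
        extension.lowercase() in extensions
}

// Capability: the source is addressed by a connection URL (for example JDBC).
interface ConnectionBased : DataFrameSource {
    fun acceptsUrl(url: String): Boolean
}

// Capability: the schema can be obtained from metadata without reading the data.
interface ProvidesSchema : DataFrameSource {
    fun columnNames(location: String): List<String>
}

// Capability: code generation. The default implementation is shared, so
// individual sources do not need to duplicate it.
interface SupportsCodeGeneration : DataFrameSource {
    // Placeholder text; a real generator would emit Kotlin declarations.
    fun readSnippet(location: String): String = "read(\"$name\", \"$location\")"
}

object Csv : FileBased, SupportsCodeGeneration {
    override val name = "csv"
    override val extensions = setOf("csv", "tsv")
}

object Jdbc : ConnectionBased, ProvidesSchema, SupportsCodeGeneration {
    override val name = "jdbc"
    override fun acceptsUrl(url: String) = url.startsWith("jdbc:")

    // Stub: a real implementation would query the database metadata.
    override fun columnNames(location: String): List<String> = emptyList()
}

// Looks sources up by capability instead of by position in a class hierarchy.
class SourceRegistry(private val sources: List<DataFrameSource>) {
    fun forFile(extension: String): DataFrameSource? =
        sources.filterIsInstance<FileBased>()
            .firstOrNull { it.acceptsExtension(extension) }

    fun forUrl(url: String): DataFrameSource? =
        sources.filterIsInstance<ConnectionBased>()
            .firstOrNull { it.acceptsUrl(url) }

    fun codeGenerating(): List<SupportsCodeGeneration> =
        sources.filterIsInstance<SupportsCodeGeneration>()
}

fun main() {
    val registry = SourceRegistry(listOf(Csv, Jdbc))
    println(registry.forFile("CSV")?.name)
    println(registry.forUrl("jdbc:h2:mem:test")?.name)
    println(registry.codeGenerating().map { it.name })
}
```

In this sketch a new source declares only the capabilities it has, so a JDBC-like source does not need to pretend to be file-based. Whether capabilities should be interfaces, sealed types, or something else is exactly the question this issue is meant to settle.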
Acceptance criteria
- Current limitations of SupportedCodeGenerationFormat and SupportedDataFrameFormat are documented
- New hierarchy/design is proposed and agreed
- JDBC use case is covered by the design
- Adding a new IO source has a clear extension path
- Existing formats continue to work
- Migration impact for existing APIs is checked before 1.0
Motivation
This should be done before 1.0 because the format hierarchy affects internal architecture and future source integrations.
After 1.0, changing this model may be harder due to API and compatibility constraints. Several new IO sources are expected, so the extension point should be clarified before release.