Remove storage_format config option from LanceDB design#696

Closed
Copilot wants to merge 10 commits into dtj-lance-design from copilot/sub-pr-683

Contributor

Copilot AI commented Feb 12, 2026

Change Description

Per review feedback, this change removes the storage_format configuration flag from the LanceDB integration design. The flag was previously agreed to be unnecessary: new writes always use Lance, and format selection on read should rely solely on auto-detection based on disk state.

Solution Description

Simplified writer selection (Step 3):

  • create_results_writer() now always returns ResultDatasetWriter (Lance format)
  • Removed conditional logic checking config["results"]["storage_format"]
  • InferenceDataSetWriter remains for legacy compatibility but is not the default

Reader selection (unchanged):

  • Auto-detects format by checking for lance_db/results.lance directory
  • Falls back to InferenceDataSet for .npy files
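
The reader-side auto-detection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the design document: the function name detect_results_format is hypothetical, it returns a format label instead of constructing a reader (the reader constructors are not shown in this PR), and it assumes legacy .npy files sit directly in the result directory.

```python
from pathlib import Path

# Relative location of the Lance dataset, as named in the description above.
LANCE_RELATIVE_PATH = Path("lance_db") / "results.lance"


def detect_results_format(result_dir):
    """Return "lance" or "npy" depending on what is present on disk.

    Hypothetical helper: the real reader selection may differ.
    """
    result_dir = Path(result_dir)

    # A Lance dataset is stored as a directory, so check with is_dir().
    if (result_dir / LANCE_RELATIVE_PATH).is_dir():
        return "lance"

    # Assumption: legacy results are .npy files directly in result_dir.
    if any(result_dir.glob("*.npy")):
        return "npy"

    raise FileNotFoundError(f"No Lance or .npy results found in {result_dir}")
```

A caller would map "lance" to the Lance-backed reader and "npy" to InferenceDataSet. Checking for the Lance directory first means a result directory that contains both layouts is read as Lance.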

Design document updates:

  • Step 6 Phase 2: Dropped the planned step to remove the storage_format config, since the option was never added
  • Resolved Ambiguities A6: Now describes the writer selection approach instead of config key naming

Before:

def create_results_writer(original_dataset, result_dir):
    storage_format = original_dataset.config["results"].get("storage_format", "lance")
    if storage_format == "npy":
        return InferenceDataSetWriter(original_dataset, result_dir)
    else:
        return ResultDatasetWriter(original_dataset, result_dir)

After:

def create_results_writer(original_dataset, result_dir):
    # Always use Lance format for new writes
    return ResultDatasetWriter(result_dir)

Code Quality

  • I have read the Contribution Guide and agree to the Code of Conduct
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

gitosaurus and others added 8 commits February 11, 2026 13:43
* Use dedicated lance_db subdirectory for LanceDB storage

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>
* Update lance_design.md terminology to match actual codebase

Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update design for LanceDB integration in Hyrax" to "Remove storage_format config option from LanceDB design" Feb 12, 2026
Copilot AI requested a review from gitosaurus February 12, 2026 01:02
@gitosaurus
Contributor

Conflicting separate solution. No final changes.

gitosaurus closed this Feb 12, 2026
gitosaurus deleted the copilot/sub-pr-683 branch February 12, 2026 01:14
2 participants