IvfModel.save/load and PqModel.save/load do not support storage_options #6311

@hushengquan

Problem

IvfModel.save() / IvfModel.load() and PqModel.save() / PqModel.load() accept a uri parameter that can point to cloud storage (e.g. s3://, gs://), but they do not accept storage_options. The underlying LanceFileReader and LanceFileWriter both already support a storage_options parameter for passing credentials and other backend-specific options, but the model save/load methods never forward it.

This means users can currently authenticate only through environment variables (AWS_ACCESS_KEY_ID, etc.) or a pre-configured credentials file (~/.aws/credentials). This becomes a problem when:

  • Working with multiple object storage instances that require different credentials (e.g. reading centroids from one S3 bucket and writing the codebook to another).
  • Running in environments where setting environment variables is not desirable or possible.
  • Needing to pass endpoint overrides for S3-compatible storage (e.g. MinIO).

Expected Behavior

IvfModel.save/load and PqModel.save/load should accept an optional storage_options dict and pass it through to LanceFileWriter / LanceFileReader, consistent with the rest of the Lance Python API (e.g. lance.dataset(), lance.write_dataset()).

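from lance.indices.ivf import IvfModel
from lance.indices.pq import PqModel

# Credentials for the bucket the model is read from.
storage_options = {
    "aws_access_key_id": "AKIA...",
    "aws_secret_access_key": "...",
    "region": "us-east-1",
}

# Separate credentials for the destination bucket (placeholder values).
other_storage_options = {
    "aws_access_key_id": "...",
    "aws_secret_access_key": "...",
    "region": "...",
}

# Proposed usage: storage_options is not accepted by these methods today.
ivf = IvfModel.load("s3://bucket/ivf.lance", storage_options=storage_options)
ivf.save("s3://another-bucket/ivf.lance", storage_options=other_storage_options)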
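
The pass-through requested above could look roughly like the sketch below. It is a minimal illustration only: StubFileWriter and ModelSketch are hypothetical stand-ins written for this example, not Lance's actual LanceFileWriter, IvfModel, or PqModel, and the keyword-only storage_options signature is an assumption about how the change might be shaped.

```python
from typing import Dict, Optional


class StubFileWriter:
    """Hypothetical stand-in for a file writer; it only records its arguments."""

    def __init__(self, uri: str, storage_options: Optional[Dict[str, str]] = None):
        self.uri = uri
        self.storage_options = storage_options


class ModelSketch:
    """Hypothetical stand-in for IvfModel / PqModel."""

    def save(
        self, uri: str, *, storage_options: Optional[Dict[str, str]] = None
    ) -> StubFileWriter:
        # Forward the options unchanged. Passing None keeps the current
        # behavior (environment variables or a credentials file).
        writer = StubFileWriter(uri, storage_options=storage_options)
        # A real implementation would write the model data here.
        return writer


# Explicit options reach the writer; omitting them leaves the default in place.
writer = ModelSketch().save(
    "s3://bucket/ivf.lance", storage_options={"region": "us-east-1"}
)
default_writer = ModelSketch().save("s3://bucket/ivf.lance")
```

Defaulting storage_options to None would keep existing callers working unchanged, and load could presumably forward the same argument to the reader in the same way.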
