LanceFragment.to_batches not respecting filter kwarg #2778

Open
tonyf opened this issue Aug 22, 2024 · 1 comment
Labels: bug, good first issue

Comments

tonyf commented Aug 22, 2024

It looks like LanceFragment.to_batches isn't respecting the filter kwarg: rows that do not match the predicate are still returned.

Repro

import lance

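ds = lance.dataset(path)  # path: location of an existing Lance dataset
fragments = ds.get_fragments()

# Read only the first batch from the first fragment, asking for "test" rows only.
for batch in fragments[0].to_batches(
    batch_size=1,
    filter="split == 'test'",
    columns=["image", "split"],
    with_row_id=True,
    batch_readahead=8,
):
    break

print(batch)

Output (the returned row has split == "train", which the filter should have excluded):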
pyarrow.RecordBatch
image: binary
split: string
----
image: ...
split: ["train"]


tonyf commented Aug 23, 2024

This seems to happen only with the legacy storage format. With the stable format, the filter currently appears to be applied correctly.

eddyxu added the bug and good first issue labels on Aug 23, 2024