bug: FTS gives duplicate results if searching over multiple columns #3188

@wjones127

I'm noticing that a full-text search returns the same row twice when more than one column has an inverted index. Is this expected? @BubbleCal

This is using pylance v0.19.2.

import lance
import pyarrow as pa

assert lance.__version__ == '0.19.2'

data = [
    {"animal": "domestic rabbit", "description": "Eating carrots"},
]
data = pa.Table.from_pylist(data)

ds = lance.write_dataset(data, "./demo_fts", mode="overwrite")
ds.create_scalar_index("animal", index_type="INVERTED")
ds.create_scalar_index("description", index_type="INVERTED")

ds.to_table(full_text_query="domestic eating").to_pandas()

Output:

            animal     description    _score
0  domestic rabbit  Eating carrots  0.287682
1  domestic rabbit  Eating carrots  0.287682
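
Until the duplication is resolved, one possible client-side workaround is to collapse identical rows after the query returns. The sketch below is not a fix for the index behavior and is not part of the lance API: `dedupe_rows` is a hypothetical helper, and it assumes the results have been converted to a list of dicts (for example with `Table.to_pylist()`), that the score column is named `_score` as in the output above, and that all cell values are hashable.

```python
def dedupe_rows(rows, score_key="_score"):
    """Collapse rows that match on every column except the score.

    Hypothetical helper: keeps the highest-scoring copy of each row.
    """
    best = {}
    for row in rows:
        # A row's identity is every (column, value) pair except the score.
        key = tuple(sorted((k, v) for k, v in row.items() if k != score_key))
        if key not in best or row[score_key] > best[key][score_key]:
            best[key] = row
    # Return the surviving rows with the highest score first.
    return sorted(best.values(), key=lambda r: r[score_key], reverse=True)


# The two rows reported in the output above.
rows = [
    {"animal": "domestic rabbit", "description": "Eating carrots", "_score": 0.287682},
    {"animal": "domestic rabbit", "description": "Eating carrots", "_score": 0.287682},
]
unique_rows = dedupe_rows(rows)
print(unique_rows)
```

Note that this also merges rows that are genuinely distinct in the dataset but have identical contents, so keying on a row identifier would be safer if one is available.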

Labels

bug: Something isn't working
