Multifields and arrays #464

Open
usamec opened this issue Jan 13, 2019 · 3 comments

Comments

@usamec

usamec commented Jan 13, 2019

How can tantivy currently support:
a) multifields (aka multiple analyzers for one field, ES equivalent https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html)
b) arrays (multiple values in one field, ES equivalent https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html).

Are there plans for this, or should it be handled in a layer above?

My original use case is that a document might have multiple authors (thus an array of strings). I want to analyze each author name both by splitting it into words (for easy search by given name or family name) and by keeping it whole (for exact search/faceting on full names).

@fulmicoton
Collaborator

a) At some point we need to introduce the concept of a mapping. Right now, the "input" schema is the same as the index schema, so you cannot index a given field several times. Supporting that is a wanted feature.
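
Until such a mapping exists, the fan-out can be done in the layer above, as the question suggests. The following is a minimal std-only sketch of that idea, not tantivy code: `IndexedDoc`, `map_authors`, and the field names `author_words` and `author_exact` are all hypothetical. It assumes the index schema declares two separate fields, one tokenized and one kept raw, and that the application fills both from the same input value.

```rust
use std::collections::BTreeMap;

/// Hypothetical index-side document: field name -> list of indexed terms.
type IndexedDoc = BTreeMap<String, Vec<String>>;

/// Fan one logical `author` input out into two index fields:
/// `author_words` holds lowercased, whitespace-split tokens, and
/// `author_exact` holds each full name untouched.
fn map_authors(authors: &[&str]) -> IndexedDoc {
    let mut doc = IndexedDoc::new();
    for &name in authors {
        // Tokenized copy, for search by given name or family name.
        doc.entry("author_words".to_string())
            .or_default()
            .extend(name.split_whitespace().map(|word| word.to_lowercase()));
        // Raw copy, for exact match or faceting on the whole name.
        doc.entry("author_exact".to_string())
            .or_default()
            .push(name.to_string());
    }
    doc
}
```

Calling `map_authors(&["Jane Doe", "John Smith"])` yields one document with four terms under `author_words` and two under `author_exact`. Queries then pick the field that matches the kind of search they need.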

b) arrays... I would have to read the Elasticsearch documentation a little more to be certain that it is not hiding something complicated, but in tantivy, arrays of ints and arrays of strings are called multivalued fields.
They work out of the box for int and string.

Objects do not work at all, but introducing a "schemaless" JSON-like field is a wanted feature. It will come with the same pitfalls as the disclaimer in the doc you shared: search will not work as expected.
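
The pitfall behind that disclaimer can be illustrated with a small std-only sketch. It is a toy model, not tantivy or Elasticsearch code, and `flatten` and `matches_all` are hypothetical helpers. It assumes an array of objects is flattened into one multivalued field per key, so the link between values that came from the same object is lost.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Flatten an array of objects into one multivalued field per key,
/// discarding which object each value came from.
fn flatten(objects: &[&[(&str, &str)]]) -> BTreeMap<String, BTreeSet<String>> {
    let mut fields: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for object in objects {
        for (key, value) in object.iter() {
            fields
                .entry(key.to_string())
                .or_default()
                .insert(value.to_string());
        }
    }
    fields
}

/// True if every (key, value) pair of the query appears somewhere in the
/// flattened document, regardless of which object it originally belonged to.
fn matches_all(
    fields: &BTreeMap<String, BTreeSet<String>>,
    query: &[(&str, &str)],
) -> bool {
    query.iter().all(|(key, value)| {
        fields
            .get(*key)
            .map_or(false, |values| values.contains(*value))
    })
}
```

With the objects `{first: John, last: Smith}` and `{first: Alice, last: White}`, a query for `first = Alice AND last = Smith` matches the flattened document even though no single object has both values.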

Lucene and ES also have another feature, called nested documents I think. This is more complicated, and I don't think it will happen any time soon.

@arifd

arifd commented Jul 13, 2023

Hi @fulmicoton,

> They work out of the box for int and string

Could you point me to where/how I can use arrays/multivalued fields?

Looking at the Value kinds, I don't see it.

I would very much like to attach a Vec of u64s or Strings to a document, and tell the Searcher to return only documents where that Vec field contains 'x'.

@PSeitz
Contributor

PSeitz commented Jul 14, 2023

You can just add the field multiple times in the Document.

        // Repeating the same field adds one more value to it for this document.
        index_writer.add_document(doc!(
            date_field => DateTime::from_timestamp_secs(1000),
            date_field => DateTime::from_timestamp_secs(1001),
        ))?;
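
To see why this answers the "field contains 'x'" question, here is a minimal std-only sketch of a toy inverted index. It is not tantivy's implementation, and `ToyIndex`, `add_document`, and `containing` are hypothetical names. It assumes each value added to a field becomes its own term, so a lookup on any single value finds the document. In tantivy itself the lookup would presumably be a term query on the field.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy inverted index for a single field: term -> ids of documents containing it.
#[derive(Default)]
struct ToyIndex {
    postings: BTreeMap<u64, BTreeSet<u32>>,
}

impl ToyIndex {
    /// Index one document; every value of the field becomes its own term,
    /// as if the field had been added to the document once per value.
    fn add_document(&mut self, doc_id: u32, values: &[u64]) {
        for &value in values {
            self.postings.entry(value).or_default().insert(doc_id);
        }
    }

    /// Ids of documents whose field contains `value`, in ascending order.
    fn containing(&self, value: u64) -> Vec<u32> {
        self.postings
            .get(&value)
            .map(|ids| ids.iter().copied().collect::<Vec<u32>>())
            .unwrap_or_default()
    }
}
```

A document indexed with the values `[1000, 1001]` is returned for a lookup on either `1000` or `1001`, which is the "contains" behavior asked about above.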
