Faceted search not working with docker image v0.4.0 #76

Closed
klausondrag opened this issue Mar 7, 2020 · 2 comments

@klausondrag

Hi. I've been trying out bayard and it's great so far. One thing I've noticed is that faceted search doesn't seem to be working. I'm using this docker image. It is tagged as v0.4.0 and was pushed 2 months ago. According to CHANGES.md, faceted search was implemented in v0.3.0, so I would assume it's already available in the docker container.

Perhaps my understanding of faceted search is wrong. Here's a minimal example:
data.jsonl (inspired by the tantivy examples):

{"_id": "1", "name": "Cat", "category": ["/Felidae/Felinae/Felis"]}
{"_id": "2", "name": "Canada lynx", "category": ["/Felidae/Felinae/Lynx"]}
{"_id": "3", "name": "Cheetah", "category": ["/Felidae/Felinae/Acinonyx"]}
{"_id": "4", "name": "Tiger", "category": ["/Felidae/Pantherinae/Panthera"]}
{"_id": "5", "name": "Lion", "category": ["/Felidae/Pantherinae/Panthera"]}
{"_id": "6", "name": "Jaguar", "category": ["/Felidae/Pantherinae/Panthera"]}
{"_id": "7", "name": "Sunda clouded leopard", "category": ["/Felidae/Pantherinae/Neofelis"]}
{"_id": "8", "name": "Fossa", "category": ["/Eupleridae/Cryptoprocta"]}

schema.json:

[
  {
    "name": "_id",
    "type": "text",
    "options": {
      "indexing": {
        "record": "basic",
        "tokenizer": "raw"
      },
      "stored": true
    }
  },
  {
    "name": "name",
    "type": "text",
    "options": {
      "indexing": {
        "record": "position",
        "tokenizer": "en_stem"
      },
      "stored": false
    }
  },
  {
    "name": "category",
    "type": "hierarchical_facet"
  }
]

Then, through the web api, I request the following:

curl -X GET 'http://localhost:8000/index/search?query=cat&from=0&limit=10&facet_field=category&facet_prefix=/Felidae/Felinae'

which results in

{
  "count": 1,
  "docs": [
    {
      "fields": {
        "_id": [
          "1"
        ],
        "category": [
          "/Felidae/Felinae/Felis"
        ]
      },
      "score": 2.016771
    }
  ],
  "facet": {
    "category": {
      "/Felidae/Felinae/Felis": 1
    }
  }
}

This is what I expect because I'm searching in the correct category. However, searching in a different category yields the same document:

curl -X GET 'http://localhost:8000/index/search?query=cat&from=0&limit=10&facet_field=category&facet_prefix=/Eupleridae'
{
  "count": 1,
  "docs": [
    {
      "fields": {
        "_id": [
          "1"
        ],
        "category": [
          "/Felidae/Felinae/Felis"
        ]
      },
      "score": 2.016771
    }
  ],
  "facet": {
    "category": {}
  }
}

I would expect 0 documents to be returned, since no element has the name "cat" in the category "/Eupleridae".

I also noticed that "facet" is filled differently but I'm not sure how to interpret that.

This is just a minimal example. With more data, I've queried for terms that exist in a category, but other elements were still returned. Am I misunderstanding faceted search, using bayard incorrectly, relying on an unreleased feature, or is this indeed a bug?

@mosuka

mosuka commented Mar 10, 2020

@klausondrag
Thanks for reporting this.
I'll check later.

@mosuka mosuka self-assigned this Mar 10, 2020
@BlackGlory

tantivy's FacetCollector only collects the relevant facet counts; it does not filter the hits. If you want to narrow the query results, you should specify the facet in the query itself.
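To make the distinction concrete, here is a small pure-Python model of the two behaviours, using the documents from data.jsonl in this issue. It does not call bayard or tantivy. The helper names (`matches_query`, `facet_counts`, `filter_by_facet`) are hypothetical, and the word-match "search" is only a stand-in for real full-text scoring. It is a sketch of the idea, assuming the collector tallies the direct children of the prefix among the hits.

```python
from collections import Counter

# Documents copied from data.jsonl in this issue.
DOCS = [
    {"_id": "1", "name": "Cat", "category": ["/Felidae/Felinae/Felis"]},
    {"_id": "2", "name": "Canada lynx", "category": ["/Felidae/Felinae/Lynx"]},
    {"_id": "3", "name": "Cheetah", "category": ["/Felidae/Felinae/Acinonyx"]},
    {"_id": "4", "name": "Tiger", "category": ["/Felidae/Pantherinae/Panthera"]},
    {"_id": "5", "name": "Lion", "category": ["/Felidae/Pantherinae/Panthera"]},
    {"_id": "6", "name": "Jaguar", "category": ["/Felidae/Pantherinae/Panthera"]},
    {"_id": "7", "name": "Sunda clouded leopard", "category": ["/Felidae/Pantherinae/Neofelis"]},
    {"_id": "8", "name": "Fossa", "category": ["/Eupleridae/Cryptoprocta"]},
]


def matches_query(doc, term):
    """Hypothetical stand-in for full-text search: whole-word match on name."""
    return term.lower() in doc["name"].lower().split()


def is_under(facet, prefix):
    """True if facet is a strict descendant of prefix in the hierarchy."""
    return facet.startswith(prefix.rstrip("/") + "/")


def facet_counts(hits, prefix):
    """Count-only behaviour: tally direct children of prefix among the hits.

    The list of hits itself is left untouched.
    """
    depth = len(prefix.strip("/").split("/")) + 1
    counts = Counter()
    for doc in hits:
        for facet in doc["category"]:
            if is_under(facet, prefix):
                child = "/" + "/".join(facet.strip("/").split("/")[:depth])
                counts[child] += 1
    return dict(counts)


def filter_by_facet(hits, prefix):
    """Filtering behaviour: keep only hits with a facet under prefix."""
    return [d for d in hits if any(is_under(f, prefix) for f in d["category"])]


hits = [d for d in DOCS if matches_query(d, "cat")]

# Counting never shrinks the hit list; only the tallies depend on the prefix.
felinae_counts = facet_counts(hits, "/Felidae/Felinae")
eupleridae_counts = facet_counts(hits, "/Eupleridae")

# Filtering is a separate step that has to be part of the query.
eupleridae_hits = filter_by_facet(hits, "/Eupleridae")
```

In this model the "Cat" document stays in `hits` for both prefixes while the counts change, which mirrors the two responses in the report. Getting zero documents for /Eupleridae requires the separate filtering step, which in bayard would have to be expressed in the query rather than through facet_prefix.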

@mosuka mosuka closed this as completed Aug 3, 2022