This repository has been archived by the owner on Sep 27, 2022. It is now read-only.

Results from find_by_fuzzy_name often include nil entries #18

Closed
airblade opened this issue Sep 27, 2013 · 8 comments

@airblade

I am trying out fuzzily v0.3.0 on a small (~360) set of records, in Rails 3.0.x.

When I search for a string, many of the results in the default set of 10 are nil. Is this expected? It makes pagination rather difficult.

Also the matches are extremely fuzzy, i.e. results often don't contain all the letters of the search string. For example aceven returns Ace Ventura as expected, but also Agentur Nina Klein – which is not what I want. Am I doing something wrong or does fuzzily not require all the letters in the search string to be present in the results, in the same order?

Thanks in advance.

@airblade
Author

Regarding my second point, I re-read the README and the first paragraph states that it finds misspelled needles. Oops. So don't worry about that ;)

@mezis
Owner

mezis commented Sep 27, 2013

No worries. Even though the matches are fuzzy (that's the point), they always should be ordered by best match—did you bump into a counter-example?

@airblade
Author

The matches are indeed ordered by best match. But many of the result elements are nil. Is that correct behaviour?

Secondly, I search fuzzily across multiple model types and combine the results. While each type's matches are ordered by best match within that type, I can't find a way to get at the match score so I can order across all the types' results. Is this possible?
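To illustrate why a shared score is needed for cross-type ordering, here is a minimal plain-Ruby sketch. The `trigrams` and `score` helpers are hypothetical, and the padding and normalisation rules are assumptions rather than fuzzily's actual ones; the score is simply the number of trigrams shared with the query.

```ruby
require 'set'

# Hypothetical helper: split a string into overlapping 3-character chunks.
# Padding and normalisation are assumptions, not fuzzily's exact rules.
def trigrams(text)
  padded = "  #{text.downcase.gsub(/[^a-z0-9]+/, ' ').strip} "
  (0..padded.length - 3).map { |i| padded[i, 3] }.to_set
end

# Assumed score: how many trigrams the candidate shares with the query.
def score(query, candidate)
  (trigrams(query) & trigrams(candidate)).size
end

# Two unrelated "model types", represented here as plain lists of names.
people    = ['Ace Ventura', 'Agentur Nina Klein']
companies = ['Ace Events Ltd', 'Klein & Sons']

query  = 'aceven'
tagged = people.map { |n| [:person, n] } + companies.map { |n| [:company, n] }

# Score every candidate the same way, then sort the combined list, best first.
ranked = tagged
  .map { |type, name| [score(query, name), type, name] }
  .sort_by { |s, _type, _name| -s }

ranked.each { |s, type, name| puts "#{s} #{type} #{name}" }
```

Under these assumed rules, "Agentur Nina Klein" still gets a small non-zero score because it shares the leading trigram with the query, which is consistent with it showing up as a weak match.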


@airblade
Author

It turns out the nil results are caused by a mismatch of scopes.

I'm fuzzy-searching children in a has_many relation: `parent.children.find_by_fuzzy_name 'some text'`. Fuzzily searches all its trigrams, then maps each result to its owner, a "child" record. However, not every child with a matching name belongs to the parent; when it doesn't, the owner lookup, which runs within the scope of the relation, returns nil.

For example:

# fuzzily supports this:
Employee.find_by_fuzzy_name 'joe bloggs'
# but we also need this:
@company.employees.find_by_fuzzy_name 'joe bloggs'
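The mechanism can be reproduced without a database. The following is only a simulation in plain Ruby: the `Employee` and `Trigram` structs, the scores, and `owner_in_scope` are all made up for illustration and are not fuzzily's internals.

```ruby
# Plain-Ruby stand-ins for the records involved (hypothetical).
Employee = Struct.new(:id, :company_id, :name)
Trigram  = Struct.new(:owner_id, :score)

EMPLOYEES = [
  Employee.new(1, 10, 'Joe Bloggs'),
  Employee.new(2, 20, 'Joe Blogs'),
  Employee.new(3, 10, 'Jo Bloggs')
]

# Pretend these are the trigram matches, best first, across ALL employees.
MATCHES = [Trigram.new(1, 9), Trigram.new(2, 8), Trigram.new(3, 7)]

# Owner lookup restricted to a scope, the way a has_many relation restricts it.
def owner_in_scope(scope, owner_id)
  scope.find { |e| e.id == owner_id }
end

company_10 = EMPLOYEES.select { |e| e.company_id == 10 }

# Each match is mapped to its owner; employee 2 is outside the scope, so nil.
results = MATCHES.map { |m| owner_in_scope(company_10, m.owner_id) }
names   = results.map { |e| e && e.name }
p names
```

Calling `results.compact` drops the nil entries, but a page that was limited to 10 matches can then come back with fewer than 10 records, which is why pagination gets awkward.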

This can be fixed by changing Fuzzily to chain its trigram search off the searchable record, instead of starting at the trigram end and working back to the searchable record.

Would you be interested in a patch along these lines?
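As a rough sketch of the idea, again in plain Ruby rather than ActiveRecord: start from the scoped records, keep only those that have a trigram match, and rank them. The `fuzzy_within` helper and the `scores` hash are hypothetical stand-ins for what a join against the trigram table would provide.

```ruby
Employee = Struct.new(:id, :company_id, :name)

employees = [
  Employee.new(1, 10, 'Joe Bloggs'),
  Employee.new(2, 20, 'Joe Blogs'),
  Employee.new(3, 10, 'Jo Bloggs')
]

# Hypothetical trigram scores keyed by employee id (higher is better).
scores = { 1 => 9, 2 => 8, 3 => 7 }

# Start from the scope, keep matching records only, then rank and limit.
def fuzzy_within(scope, scores, limit)
  scope.select { |e| scores.key?(e.id) }
       .sort_by { |e| -scores[e.id] }
       .first(limit)
end

company_10 = employees.select { |e| e.company_id == 10 }
found = fuzzy_within(company_10, scores, 10)
p found.map(&:name)
```

Because the limit is applied after the scope has been taken into account, every slot in the result holds a real record and no nil entries can appear.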

Regarding ranking results across types: I found a way to do this - see my comment in #16.

@airblade
Author

airblade commented Oct 1, 2013

I fixed this here: airblade@dc8138d.

It also solves your N+1 problem where every result is mapped in turn to its owner.
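For anyone unfamiliar with the term, the difference can be shown with a toy query counter. `FakeTable` is a hypothetical in-memory stand-in for the owners' table, not anything from fuzzily or ActiveRecord.

```ruby
# Hypothetical in-memory "table" that counts how many queries it receives.
class FakeTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows # { id => name }
    @queries = 0
  end

  # One query per record.
  def find(id)
    @queries += 1
    @rows[id]
  end

  # One query for the whole batch.
  def find_all(ids)
    @queries += 1
    @rows.values_at(*ids)
  end
end

table = FakeTable.new({ 1 => 'Joe Bloggs', 2 => 'Joe Blogs', 3 => 'Jo Bloggs' })
matched_ids = [1, 2, 3]

# Mapping each match to its owner in turn: N lookups for N matches.
one_by_one      = matched_ids.map { |id| table.find(id) }
per_row_queries = table.queries

# Loading all owners together: a single lookup.
batched       = table.find_all(matched_ids)
batch_queries = table.queries - per_row_queries

puts "per-row: #{per_row_queries}, batched: #{batch_queries}"
```

Both approaches return the same records; only the number of round trips differs, and that number grows with the size of the result set in the per-row case.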

@mezis
Owner

mezis commented Dec 1, 2013

Thanks for this @airblade. I'll work my way through the scoring issue first.

@mezis
Owner

mezis commented Dec 1, 2013

While I agree that the mapping can cause a cumbersome N+1, your proposed code doesn't work with Rails 2.3, which I still want to support (it's used in production in plenty of places).
The scoping/chaining is obviously a problem too (you can't easily chain find_by_fuzzy_name with another scope).

I'll keep working on this, no promises though ;)

@mezis
Owner

mezis commented Dec 1, 2013

A problem with your approach (using JOINs) is that joining tends to break typical use cases with ActiveRecord.
In your example,

`@company.employees.find_by_fuzzy_name('joe bloggs').count`

(a reasonable query) blows up badly. Not the fault of your code, but a classic with AR: #count and GROUP BY tend to not mix very well (since you'd need subqueries to do that in SQL, and AR is a bit too dumb).
