TNTSearch returns no results for partial queries. #15

Closed
awgv opened this issue May 29, 2016 · 8 comments
Comments

@awgv

awgv commented May 29, 2016

Hello,

I have a table of ~10k airports with additional data. I have a dynamic search input and want to select an airport in Berlin. If I request "b", I get one result, "Vernon B. C." (a city in the USA); "be" gets me only "Nossi-Be" (Madagascar); "ber"/"berl"/"berli" return no results; and only the full city name "berlin" finds and returns the 6 Berlin airports. The same goes for every other query. I saw TNTSearch working very differently on the demo page. Any ideas on what I might be doing wrong?

I'm indexing city names like so:

        $tnt = new TNTSearch;

        // I get "SQLSTATE[HY000] [14] unable to open database file"
        // error unless I load a configuration.
        $tnt->loadConfig([
            'driver'    => config('tntsearch.driver'),
            'host'      => config('tntsearch.host'),
            'database'  => config('tntsearch.database'),
            'username'  => config('tntsearch.username'),
            'password'  => config('tntsearch.password'),
            'storage'   => config('tntsearch.storage')
        ]);

        $indexer = $tnt->createIndex('airports.index');
        $indexer->query('SELECT id, city_name_ru, city_name_en FROM airports;');
        $indexer->run();

And here's my controller:

    public function searchCityWithAnAirport(Request $request)
    {
        $tnt = new TNTSearch;

        $tnt->loadConfig([
            'driver'    => config('tntsearch.driver'),
            'host'      => config('tntsearch.host'),
            'database'  => config('tntsearch.database'),
            'username'  => config('tntsearch.username'),
            'password'  => config('tntsearch.password'),
            'storage'   => config('tntsearch.storage')
        ]);

        $tnt->selectIndex('airports.index');
        $tnt->asYouType = true;
        $result = $tnt->searchBoolean($request->input('query'), 10);

        return $result;
    }

In case you have time to test it yourself, here's a Laravel migration file, and here's a CSV file with data.

@nticaric
Contributor

The demo page uses the search method, not searchBoolean, so try:

$tnt->asYouType = true;
$result = $tnt->search($request->input('query'), 10);

Does this solve your issue?

@awgv
Author

awgv commented May 29, 2016

Thank you, it did. I've just realized that I had tried the search() method but forgot to combine it with the asYouType property.

@awgv awgv closed this as completed May 29, 2016
@nticaric
Contributor

Great! Can you share some performance data? How long does the indexing take, and how fast are the search queries? I'm just eager to know how it works on use cases other than mine :)

@awgv
Author

awgv commented May 29, 2016

I'm developing it locally on Homestead, so I guess it doesn't matter right now. I'll remember to send you an email or mention it here when the project goes live in a couple of months.

@nticaric
Contributor

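Ok, great. Oh yeah, and don't forget to order the results correctly. Something like:

// Build a comma-separated id list; an array cannot be interpolated into the SQL string directly.
$ids = implode(',', array_map('intval', $result['ids']));
$airports = Airport::whereIn('id', $result['ids'])->orderByRaw("FIELD(id, $ids)")->get();

because MySQL doesn't keep the order when you use whereIn.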

And don't forget to star the package ;)
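
FIELD() is a MySQL function, so on other databases the ranking order can also be restored in PHP after the rows are fetched. The following is a minimal sketch under that assumption; `orderByRankedIds` is a hypothetical helper (not part of TNTSearch or Laravel), and the sample ids and rows are made up for illustration.

```php
<?php
// Hypothetical helper: sorts rows so that their ids follow the order
// of $rankedIds (e.g. the 'ids' array returned by a search).
function orderByRankedIds(array $rows, array $rankedIds): array
{
    // Map each id to its rank: [id => position in the ranking].
    $position = array_flip($rankedIds);

    usort($rows, function (array $a, array $b) use ($position): int {
        // Rows whose id is missing from the ranking sort last.
        $pa = $position[$a['id']] ?? PHP_INT_MAX;
        $pb = $position[$b['id']] ?? PHP_INT_MAX;
        return $pa <=> $pb;
    });

    return $rows;
}

// Made-up example: ids in ranking order, rows in arbitrary database order.
$rankedIds = [42, 7, 19];
$rows = [
    ['id' => 7,  'city_name_en' => 'Berlin'],
    ['id' => 19, 'city_name_en' => 'Bern'],
    ['id' => 42, 'city_name_en' => 'Bergen'],
];

$ordered = orderByRankedIds($rows, $rankedIds);
echo implode(',', array_column($ordered, 'id')), PHP_EOL; // prints "42,7,19"
```

Sorting in PHP costs an extra pass over the rows, which is negligible for the ten results requested in the controller above.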

@jonstavis

jonstavis commented Jul 6, 2016

I'm running into a similar issue and am wondering if you could offer some advice.

I have two rows in my db that are getting indexed, with column values

  • 3-(Trimethoxysilyl)propyl Acrylate
  • Propylparaben

When I search for 'propyl' I only get the first result. When I search for 'propylp' I get the second result.

I looked at the query being executed when asYouType is true and see this if I run it directly against the search index:

sqlite> select * from wordlist where term like 'propyl%' order by length(term) asc, num_hits DESC limit 1;
1168|propyl|1|1

If I modify the query slightly I see this, which looks like each is being treated as a separate word with 1 hit only:

sqlite> select * from wordlist where term like 'propyl%' order by length(term) asc, num_hits DESC;
1168|propyl|1|1
442|propylparaben|1|1

Is there something that could be happening when building the index that omits the 'propylparaben' hit from the 'propyl' wordlist since there is a space immediately following 'propyl'?

Thanks!

@nticaric
Contributor

nticaric commented Jul 6, 2016

The words propyl and propylparaben are two different words that share the same base, propyl. If we omitted the LIMIT clause, as in your second example, we could run into performance problems with a larger dataset. Instead, we return a single word that matches your base propyl: going by the ORDER BY in your query, the shortest match, with the hit count as the tie-breaker.

If you think you'll have a small dataset and won't run into performance problems, you can query the index's wordlist table directly.
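
As a rough sketch of what querying the wordlist directly could look like, the snippet below runs the same prefix query with a larger, but still bounded, LIMIT through PDO. It uses an in-memory SQLite database as a stand-in for the index file; the column names `num_docs` and `id` are assumptions inferred from the four-column output shown earlier, and `termsWithPrefix` is a hypothetical helper, not a TNTSearch method.

```php
<?php
// Stand-in for the index file: an in-memory SQLite database.
// The wordlist column layout is an assumption, not taken from TNTSearch.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE wordlist (id INTEGER PRIMARY KEY, term TEXT, num_hits INTEGER, num_docs INTEGER)');
$pdo->exec("INSERT INTO wordlist VALUES (1168, 'propyl', 1, 1), (442, 'propylparaben', 1, 1), (7, 'acrylate', 1, 1)");

// Hypothetical helper: returns up to $limit terms starting with $prefix,
// shortest first, then by hit count (the ordering seen in the thread).
// Note: '%' and '_' inside $prefix are not escaped in this sketch.
function termsWithPrefix(PDO $pdo, string $prefix, int $limit = 10): array
{
    $stmt = $pdo->prepare(
        'SELECT term FROM wordlist WHERE term LIKE :prefix
         ORDER BY length(term) ASC, num_hits DESC LIMIT :limit'
    );
    $stmt->bindValue(':prefix', $prefix . '%', PDO::PARAM_STR);
    $stmt->bindValue(':limit', $limit, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$terms = termsWithPrefix($pdo, 'propyl');
echo implode(', ', $terms), PHP_EOL; // prints "propyl, propylparaben"
```

Keeping a LIMIT, just a larger one, bounds the cost on bigger datasets while still surfacing more than one matching term.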

@jonstavis

jonstavis commented Jul 6, 2016

Thanks for your response. I managed to come up with a solution that involves overriding several of the methods in TNTSearch.php. Is there a cleaner way to do this through the existing API?

sleepless pushed a commit to sleepless/tntsearch that referenced this issue Oct 25, 2017
* commit '4a992f620049380e6133f9c2a0d9a1d04bab84ff':
  Changed popmaterial-bulkorder@dfd.de password and server