Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Succeed a search even with spelling errors #14428

Merged
merged 54 commits into from Dec 24, 2019
Merged

Conversation

@Lathanao
Copy link
Contributor

Lathanao commented Jun 29, 2019

Questions Answers
Branch? develop
Description? Implementation of the Levenshtein algorithm in the default search provider.
Type? improvement
Category? CO
BC breaks? no
Deprecations? no
Fixed ticket? #14588
How to test? Just run a vanilla Presta with the updated search provider, search something. Spelling errors are welcome.

This change is Reviewable

@Lathanao Lathanao requested a review from PrestaShop/prestashop-core-developers as a code owner Jun 29, 2019
@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Jun 29, 2019

Hi,
Here, just an implementation of the Levenshtein algorithm.
https://www.php.net/manual/en/function.levenshtein.php

The goal is to find some products with the search functionality. Even if you get some miss spelling in the search input.

That works perfectly with the vanilla Prestashop with default products.

Tooltip_007

For example, you search "ummi" instead of "hummi", search controller will be able to find the closest word, here "hummingbird", and will display all hummingbird products in the result.

Some precisions about the changes:

a. In case there are many close words, we take the word with the heaviest word of the list.
b. The algorithm is really fast, only 0.005 - 0.010 second to check 2000 words.
b. The algorithm could find some words, really far for the original.
c. Later in a another commit, we could improvement again the script, for example, by adding the possibility to modify the coefficients of the length of words we want to compare in the BO.
d. works with long queries with many words

Regards,
Lathanao

@Lathanao Lathanao changed the title CO: Succeed a search even with CO: Succeed a search even with spelling errors Jun 29, 2019

if ($distance[$a['word']] != $distance[$b['word']]) {
return $distance[$a['word']] < $distance[$b['word']] ? $a : $b;
} else {

This comment has been minimized.

Copy link
@PierreRambaud

PierreRambaud Jul 2, 2019

Contributor

useless else statement

@PierreRambaud

This comment has been minimized.

Copy link
Contributor

PierreRambaud commented Jul 2, 2019

Good idea, wdty @colinegin

@colinegin colinegin self-assigned this Jul 2, 2019
@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Jul 18, 2019

Hey!
Do I have any chance of it being merged?
Do you have something to do more ?
Regards,
Lathanao

@PierreRambaud PierreRambaud changed the title CO: Succeed a search even with spelling errors Succeed a search even with spelling errors Jul 18, 2019
@PierreRambaud

This comment has been minimized.

Copy link
Contributor

PierreRambaud commented Jul 18, 2019

@Lathanao Sorry, we're still waiting for PM to get their feedback.
But I add a label for our QA team to begin tests.

Copy link
Contributor

matks left a comment

Hi, I'm really worried about the SQL cost of this PR for huge catalogs (as reported in #14766) :/ I'm not sure we can merge it as it is right now.

Can we give some time for @matthieu-rolland to explore the possibilities he suggested in #14766 to reduce the possible performance stress on small servers ?

@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Jul 22, 2019

Hey !
No problem for that, I understand.

At the beginning, the $lenghtWordCoefMin and $lenghtWordCoefMax was some variable, depending how many word the database currently have.

I made a commit yesterday. You can see the switch. It's probably not the best way, cause the code is a bit dirty.

I will be very glad if I could create functionalities as mentioned in #14766 and merge after validation.

I check this this week if I can do something better.

@LouiseBonnard

This comment has been minimized.

Copy link
Contributor

LouiseBonnard commented Aug 6, 2019

Hey @Lathanao, did you have time to check the requested changes? ;-)

@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Aug 7, 2019

Hi @LouiseBonnard ,
Yes, I saw requested changes. I'm sorry, I was a little busy last week. I try to do something for Sunday evening.
Regards.

@LouiseBonnard

This comment has been minimized.

Copy link
Contributor

LouiseBonnard commented Aug 7, 2019

@Lathanao, no worries, we were just wondering about it, thanks for the reply. ;-)

@PierreRambaud PierreRambaud force-pushed the Lathanao:patch-3 branch from 4d35894 to 5e11456 Dec 17, 2019
@PierreRambaud

This comment has been minimized.

Copy link
Contributor

PierreRambaud commented Dec 19, 2019

Pull request rebased with a little refactoring :)

@Hlavtox

This comment has been minimized.

Copy link
Contributor

Hlavtox commented Dec 19, 2019

@PierreRambaud @matthieu-rolland Guys, please, before you finally merge it, test that it doesn't find half of the shop for every query... 😄

In PS's current form, if you search "red car", it finds all things that are red and all cars. (Unless you go back in time a bit, as I proposed in my PR: #16023 😎)

@Hlavtox

This comment has been minimized.

Copy link
Contributor

Hlavtox commented Dec 19, 2019

And one more thing. @Lathanao

Guys, can you please test this scenario?
Make a product called Prestashop S-100.
Does the algorithm find it, if you search Prestashop S 100 or Prestashop S100?

Otherwise, this will be super awesome!!!!!!!

@PierreRambaud

This comment has been minimized.

Copy link
Contributor

PierreRambaud commented Dec 20, 2019

@Hlavtox I'm on it to integrate your commits to this pull request.

@PierreRambaud

This comment has been minimized.

Copy link
Contributor

PierreRambaud commented Dec 20, 2019

@Hlavtox Wasn't your original commit, but I make sure you're the author of the changes I've done to make it work 😉

Plus, I tried with a specific name like S layer or Slayer and everything is ok.
Watch out if you use S-100 or S 100 because the search system is also looking for words in description and attributes values, so you'll see result for the attribute size 100 as example :)

It's something we can improve later, with a on/off button to search only on the title, or on everything. But this pull request must be merge 👍

Copy link
Contributor

matthieu-rolland left a comment

Thank you @Hlavtox , @PierreRambaud for making this PR even better ! 👍
And of course @Lathanao for the amazing work ! 🚀

And last but not least @LouiseBonnard , @matks , @colinegin and the whole PM team for your participation 👍

There will be optimizations in the near future, but this is an awesome start :) Let's pray for the QA validation now 😅

@Robin-Fischer-PS

This comment has been minimized.

Copy link

Robin-Fischer-PS commented Dec 24, 2019

Hi @Lathanao , @matthieu-rolland !

This time, it all seems to be good for me, so it's QA ✔️ !

Thanks for the hard work on this PR everyone :)

@Robin-Fischer-PS Robin-Fischer-PS added this to the 1.7.7.0 milestone Dec 24, 2019
@jolelievre jolelievre merged commit f4ab4b2 into PrestaShop:develop Dec 24, 2019
2 checks passed
2 checks passed
PrettyCI Code formatting
Details
Travis CI - Pull Request Build Passed
Details
@rdy4ever

This comment has been minimized.

Copy link
Contributor

rdy4ever commented Dec 25, 2019

I would've exepected this search to show "hummingbird"
image
.. or this to show "the adventure.."
image

But is seems common misspellings don't work

@Hlavtox

This comment has been minimized.

Copy link
Contributor

Hlavtox commented Dec 25, 2019

@rdy4ever @Lathanao

Good point, could this be integrated? 😎

@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Dec 25, 2019

Hey @rdy4ever @Hlavtox

Does your Prestashop works on the right branch?
Following your comments, I was worry. I just reinstalled a fresh Prestashop [Lathanao:patch-3]

I get :
image
image

With a fresh Prestashop, no need to set anything, Fuzzy search works fine without any setup in BO.
So, for me, everything is fine.

Did you check if your branch get last commits?

@Lathanao

This comment has been minimized.

Copy link
Contributor Author

Lathanao commented Dec 25, 2019

Just test PrestaShop:develop
Works fine too :
image

@Hlavtox

This comment has been minimized.

Copy link
Contributor

Hlavtox commented Dec 25, 2019

@Lathanao I didn't test it yet, just commented on @rdy4ever :-D

If it works, awesome. I will put it on our store after public release and I will instruct employees to record any use cases that could happen and would not find what they want. Maybe there will be some room for improvement. But so far, it looks perfect. 🚀

@rdy4ever

This comment has been minimized.

Copy link
Contributor

rdy4ever commented Dec 26, 2019

Hi @Lathanao
Yes, I have merged upstream/develop into my local develop, just before testing it. I didn't perform any database updates though, is there anything I need to change in the database?

@rdy4ever

This comment has been minimized.

Copy link
Contributor

rdy4ever commented Dec 26, 2019

@Lathanao My bad. I only put 5 at maximum word length in config. With 30 it works great. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.