Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default search is divided into exact search and wildcard search #3073

Merged
merged 2 commits into from Feb 19, 2019

Conversation

@ltclm
Copy link
Contributor

@ltclm ltclm commented Jan 15, 2019

@davidoesch @gjn @danduk82
the standard sphinx query will be converted into a sphinx multi-query [1].
It's still one network request, but there are two queries executed on sphinx side.
The standard query has been splitted into a

  • exact match query, returning first result of each origin only
  • wildcard query (same configuration as before)

triggered by geoadmin/mf-geoadmin3#4287

Testlink

The results of the exact match query (one per origin) will be prepended to the results of the wildcard query.
p.e search for

Entle
Dom
Rotten
Ricken

The results of the exact match search will be on top of the list now.

question:

  • how should we merge the results of the two searches?

a) search1 (exact) results on top of search2 (wildcard) results current implementation in this pr
p.e.

search1->gg25
search1->gazetteer
search2->gg25
search2->gazetteer
search2->gazetteer
search2->gazetteer
search2->address
search2->address
search2->address
search2->address

b) search1 (exact) results on top of each search2 (wildcard) results origin
p.e

search1->gg25
search2->gg25
search1->gazetteer
search2->gazetteer
search2->gazetteer
search2->gazetteer
search2->address
search2->address
search2->address
search2->address

We will also have to do some yandex load tests an monitor the impact on sphinx of this change.

TODO:

  • Take a decision how to merge the results (exact on top of wildcard) how many of the exact search results should be used (currently one per origin)

"Take a decision how to merge the results (exact on top of wildcard) how many of the exact search results should be used (currently one per origin)"
-> All exact results are shown in top. User gets what he expects

  • load tests
  • customer feedback (chmobil, others?)

[1] http://sphinxsearch.com/docs/current/multi-queries.html

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 15, 2019

Needs to be approved by SCHWEIZ Mobil as well

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 15, 2019

Let me know when I cans send the testlink to schweizmobil c2c
I made some test and it seems to work

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

First @gjn @davidoesch have to decide how the results are merged.
Then, i will modify the code and do some load tests.
Then we can send the testlink.

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 15, 2019

"Take a decision how to merge the results (exact on top of wildcard) how many of the exact search results should be used (currently one per origin)"
-> All exact results are shown in top. User gets what he expects

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

ok, so for the merge it will be option a).

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

@danduk82 @davidoesch
i did some load and performance tests with yandex:

the new multi-query requests are two times slower than the old requests (85ms -> 160ms)
the capacity stays the same, first 500 errors occur with 20 requests/s

see here https://docs.google.com/document/d/1qO3Si9xWCKIaca0pvhoZw7IJZGVPapbCG8wDkkoC73o/edit?usp=sharing

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 15, 2019

ouch... performance goes a long way.
if we reduce the exact results to max of 10 ?

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

the exact search query already returns a maximum of one object per rank/origin, thus 7 objects (zipcode,gg25,district,kantone,gazetteer,address,parcel)
self.sphinx.SetGroupBy('origin', sphinxapi.SPH_GROUPBY_ATTR, '@group desc').
We have to scan the index twice instead of once, that explains pretty much the loss in performance i guess.

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

well, the difference in the performance is obviously not that big, i only ran the benchmark test for one minute. I will repeat the test with a longer period and update the ticket.

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 15, 2019

so the performances with a longer test benchmark period are quite equal, see results of test 5 rps / 300s in the spreadsheet. Also the 2nd run in the test 5 rps / 60s more or less the same performance.

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 15, 2019

Will do a 1:1 with u tomorrow

Loading

@ltclm ltclm force-pushed the dev_ltclm_exact_search branch from 1543237 to 8f32515 Jan 16, 2019
@danduk82
Copy link
Contributor

@danduk82 danduk82 commented Jan 16, 2019

@ltclm , where do you take the numbers 85ms -> 160ms?
from what I see in your test results, median times and 90percentile are slower, ok, but not that significantly. More 30% than 200% slower.

Loading

@danduk82
Copy link
Contributor

@danduk82 danduk82 commented Jan 16, 2019

@davidoesch can you check with CHmobil?

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 16, 2019

@ltclm , where do you take the numbers 85ms -> 160ms?
from what I see in your test results, median times and 90percentile are slower, ok, but not that significantly. More 30% than 200% slower.

from 1st run from 5 rps / 60s test

new has  mean : 169.65505723905721
old has mean: 85.58455555555555

Loading

@danduk82
Copy link
Contributor

@danduk82 danduk82 commented Jan 16, 2019

I don't trust mean values, median and percentiles are much more reliable.

Loading

@danduk82
Copy link
Contributor

@danduk82 danduk82 commented Jan 16, 2019

and test 2 shows exactly the opposite behavior, do you know why? some sort of caching?

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Jan 16, 2019

I'll test it side by side with clm -- i do trust rather what I see on the screen that the stats ;-)

Chmobil infomred they will test

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Jan 17, 2019

fyi i have changed the search behaviour to:

  • exact search: first 10 results will be prepended to
  • wildcard search (unchanged)

this will be easier to explain and to tweak than a groupby

Loading

@ltclm ltclm force-pushed the dev_ltclm_exact_search branch from f6af9f3 to c1e37fb Jan 17, 2019
@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Feb 14, 2019

LGTM we have a go from schweizmobil

Loading

@ltclm ltclm added this to the 2019-02-27 milestone Feb 14, 2019
@ltclm ltclm force-pushed the dev_ltclm_exact_search branch from c1e37fb to 867c5d9 Feb 14, 2019
@danduk82 danduk82 merged commit 2138be5 into master Feb 19, 2019
1 check passed
Loading
@danduk82 danduk82 deleted the dev_ltclm_exact_search branch Feb 19, 2019
@gjn
Copy link
Contributor

@gjn gjn commented Mar 4, 2019

Almost 2 weeks live with this now (released on 22nd of February). And here's the reality:

image

Almost no detectable/significant change in average, 50th percentile and 90th percentile response times on the location search. I think that the 2 requests are executed in parallel and sphinx can handle it quite well.

I guess the change could lead to a lower ceiling (number of total concurrent requests) we can handle, so a spike will lead to sphinx to break earlier, but standard traffic, it handles equally well.

Thoughts? @danduk82 @ltclm @davidoesch

Loading

@ltclm
Copy link
Contributor Author

@ltclm ltclm commented Mar 5, 2019

@danduk82 @davidoesch
i did some load and performance tests with yandex:
the new multi-query requests are two times slower than the old requests (85ms -> 160ms)
the capacity stays the same, first 500 errors occur with 20 requests/s
see here https://docs.google.com/document/d/1qO3Si9xWCKIaca0pvhoZw7IJZGVPapbCG8wDkkoC73o/edit?usp=sharing

We did some load tests with yandex before putting this change to production. the max number of requests/sec stays the same : 20 req/sec.

Loading

@davidoesch
Copy link
Contributor

@davidoesch davidoesch commented Mar 5, 2019

so far no negative feedback from customers

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants