
matcher.rs#search_frecent is very slow #589

Closed
grigoryk opened this issue Jan 30, 2019 · 6 comments

@grigoryk (Contributor)

Together with @csadilek, we sat down to figure out what's causing mozilla-mobile/reference-browser#395, and we narrowed it down to regressive cases of search_frecent.

The happy case is when we hit the requested limit of results (#322) and stop asking matchers for results. We're seeing fast-enough performance there. On my test device (Nexus 6P), a search_frecent call that is able to bail out early (because it has accumulated enough results) takes around 20-30ms.

In the regressive case, when there are either no results or relatively few (under the limit), search_frecent takes around 10x as long as the happy case. I've seen timings go into 300ms territory on the Nexus 6P.

This slowness then causes problems upstream, and we end up with a very sluggish interface when, for example, the user is typing a URL they've never visited before. Reported delays are 1-2s between keystrokes.

I'm timing the call to match_with_limit specifically, not the individual matchers. Currently I don't know whether a single matcher is causing the slowdown, or whether it's death by five papercuts.

We've also observed a cascading slowdown if the user is typing quickly, with the delay growing somewhere in the JNA machinery. Rust calls plus JNA processing may take up to 700-1000ms in very regressive cases. JNA processing goes back to normal if you slow down the rate of calls. I'm not yet sure what's causing this, but I think that if we fix the underlying slowness it won't be a problem. The time it takes for the underlying match_with_limit to complete is mostly constant and a function of its input.

@thomcc (Contributor) commented Jan 31, 2019

Every couple of weeks I spend some time after work trying to micro-optimize this. I'm able to get it maybe 10-15% faster, but that's all. That isn't enough of a gain to fix these issues, and it usually requires introducing a bunch of unsafe into autocomplete_match's implementation. Fundamentally, we can't put an index on this, so SQLite has to run autocomplete_match on a lot of rows in order to find what it needs.
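To make the "can't put an index on this" point concrete, here's a minimal sketch (using Python's stdlib sqlite3 purely for illustration; the real code is Rust/rusqlite, and the matching logic and table shape here are simplified stand-ins): once the WHERE clause is an opaque application-defined function, SQLite's planner has no choice but a full scan.

```python
# Sketch: a custom match function like AUTOCOMPLETE_MATCH is opaque to the
# query planner, so no index can be used and every candidate row is examined.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (url TEXT, title TEXT, frecency INTEGER)")
conn.execute("CREATE INDEX idx_frecency ON moz_places(frecency)")

def autocomplete_match(search, url, title):
    # Stand-in for the real matching logic: case-insensitive substring match.
    s = search.lower()
    return s in (url or "").lower() or s in (title or "").lower()

conn.create_function("AUTOCOMPLETE_MATCH", 3, autocomplete_match)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT url FROM moz_places WHERE AUTOCOMPLETE_MATCH(?, url, title)",
    ("examp",),
).fetchall()
# The plan's detail column reports a SCAN of moz_places: the index on
# frecency is useless for this predicate.
print(plan)
```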

FTS seems to be the way forward here. I've sketched out what this looks like before.

That said, the following will also help:

  1. Landing cancellable autocomplete, e.g. the mostly untested / unfinished https://github.com/mozilla/application-services/tree/kotlin-places-autocomplete/ branch
  2. It sounds like you're blocking user input waiting for autocomplete results. You really shouldn't do that.
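On point 2, the usual fix is to debounce and cancel: each keystroke cancels the pending query, so only the latest search term ever runs, and input is never blocked. A hypothetical sketch (the class and names here are illustrative, not from the application-services or Android API):

```python
# Sketch: debounced, cancellable autocomplete dispatch. Each new keystroke
# cancels the query scheduled for the previous (now stale) search term.
import threading

class Debouncer:
    def __init__(self, delay_s, action):
        self.delay_s = delay_s  # quiet period before the query actually runs
        self.action = action    # called with the final search text
        self._timer = None

    def keystroke(self, text):
        # Cancel the query pending for the previous keystroke, if any.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.delay_s, self.action, args=(text,))
        self._timer.start()

results = []
d = Debouncer(0.05, lambda text: results.append(text))
for term in ["e", "ex", "exa", "examp"]:
    d.keystroke(term)   # each call cancels the previous pending "query"
d._timer.join()         # wait for the surviving timer (for demonstration)
print(results)          # only the final term ran: ['examp']
```

The same idea is what the kotlin-places-autocomplete branch linked above aims at, with the addition of interrupting a query that has already started on the Rust side.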

@rfk (Contributor) commented Jan 31, 2019

Naive question: what does desktop do to make this fast?

@thomcc (Contributor) commented Jan 31, 2019

@rfk It's about 2x as fast as ours, AFAICT (it's really hard to measure, though, since everything happens on a background process that is responsible for all storage queries). Most of it you don't notice because desktop doesn't block input waiting on autocomplete results. Desktop also doesn't offer a way for us to get EXPLAIN QUERY PLAN output, short of a custom build that inserts it into the C++ and dumps the results.

Even after micro-optimizing AUTOCOMPLETE_MATCH somewhat (since it's the only part where the problem is clearly 'Rust code being slow'), to the point where that function is no longer the bulk of the time spent, we're still way slower than desktop.

The only real difference, AFAICT, is that we're using SQLCipher and not SQLite. We've fixed the most egregious issue there, at least for unencrypted connections (the fact that it spammed mlock/munlock calls), but it's entirely possible it's doing other, less obvious things to slow us down (I haven't profiled without SQLCipher recently).

Beyond that, our schema also differs from desktop's in a moderate number of very small ways. It's also possible we aren't updating frecency well enough; desktop uses frecency to exclude certain rows (some of that is fixed by #429).

Anyway, I said earlier that the solution here is FTS. It's not:

SQLite's FTS only supports full word/token matches or prefix matches (and it won't even do prefix matches well by default, unless you tell it to store an additional prefix index). E.g. it can never match ample against example.com. That seems like a dealbreaker for us. It could make the 'fast' queries faster, but we'd still want to run AUTOCOMPLETE_MATCH whenever FTS failed to find anything.
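The limitation is easy to reproduce (sketch below uses Python's stdlib sqlite3 with FTS5, assuming it's compiled in, as it is in most builds): the default tokenizer splits example.com into the tokens example and com, so a prefix query like examp* matches, but the mid-token string ample never can.

```python
# Sketch: SQLite FTS matches whole tokens or token prefixes, never infixes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url)")
conn.execute("INSERT INTO pages(url) VALUES ('example.com')")

# 'example.com' is tokenized as ['example', 'com'], so:
prefix = conn.execute("SELECT url FROM pages WHERE pages MATCH 'examp*'").fetchall()
infix  = conn.execute("SELECT url FROM pages WHERE pages MATCH 'ample'").fetchall()

print(prefix)  # [('example.com',)] - token-prefix queries work
print(infix)   # [] - text starting mid-token can never match
```

(Without the prefix= option on the table, prefix queries still work but via a slower range scan of the term index, which is the "won't do well by default" part above.)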

There's still some stuff we can do to speed this up though.

  1. Right now we (mimicking desktop) run the query twice when no matches (or too few matches) are found. The first run only matches where the search tokens appear at the start of words; the second run matches them anywhere in the word. We don't need to do this. If we really find it valuable, we can do the 'start of word' filtering in memory or something.

  2. There are also some micro-optimizations I've applied before (which I almost certainly have in a git stash) to speed AUTOCOMPLETE_MATCH up somewhat:

    • We don't need to take ownership of each string anymore (I landed changes to rusqlite that allow borrowing strings out of rows).
    • We spend the bulk of our time in Unicode character decoding right now, since we're using the Rust stdlib. Desktop has some custom code for this; it wasn't hard to port, but it saved less than I'd hoped, and when I wrote it I still thought the real solution here was FTS.
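Point 1 can be sketched like this (illustrative only; helper names are hypothetical, not from matcher.rs, and the real ranking is more involved): run the broad "match anywhere" query once, then do the word-start preference on the already-small result set in memory.

```python
# Sketch: replace the two-pass query (word-start pass, then match-anywhere
# pass) with one broad query plus in-memory word-start ranking.
import re

def starts_a_word(token, text):
    # True if `token` appears at the beginning of a word in `text`,
    # i.e. at the start of the string or after a non-word character.
    return re.search(r"(?:^|\W)" + re.escape(token), text.lower()) is not None

def rank_matches(token, rows):
    # `rows` already matched `token` anywhere (the single broad query);
    # prefer word-start hits, keep the rest after them.
    word_start = [r for r in rows if starts_a_word(token, r)]
    anywhere = [r for r in rows if r not in word_start]
    return word_start + anywhere

rows = ["mozilla.org/example", "example.com", "myexample.net"]
print(rank_matches("example", rows))
# ['mozilla.org/example', 'example.com', 'myexample.net']
# ('myexample.net' ranks last: 'example' only appears mid-word there.)
```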

@thomcc (Contributor) commented Jan 31, 2019

That said, again, this is absolutely not instantly responsive on desktop either (they just don't block the UI), and even if we had implemented FTS, blocking the UI while waiting for autocomplete results would still not be a good strategy.

IMO there are some perf issues here we can attempt to address, but it's a bug that fenix/ref browser is doing this.

@thomcc (Contributor) commented Jan 31, 2019

I was actually able to do a better job here than expected (even without semi-gross hacks) in #590.

@thomcc (Contributor) commented Feb 4, 2019

Fixed in #590

@thomcc closed this as completed Feb 4, 2019