Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow first read when results.count is big (problem for 20k and above) #4886

Closed
fun2C0d3 opened this issue Apr 21, 2017 · 5 comments
Closed
Assignees
Labels

Comments

@fun2C0d3
Copy link

fun2C0d3 commented Apr 21, 2017

Goals

Efficiently searching for strings in a database large database containing strings.

Expected Results

Getting a Results object with all the matching results. Iterating over a few results should be fast since everything is lazy loaded.

Actual Results

When the returned Results object contains many values reading the first value (results.first/results[0]) takes a long time. This is when the query is actually executed right?
No matter how many results are found I only want to iterate over a few of the first results, do some work and then update the user UI.

So the problem is: When reading objects from results the FIRST read is very slow. About 6 seconds on an iPhone 5 if results.count = 80k. All subsequent reads takes less than a millisecond each.

Steps to Reproduce

Requires a database with many values. In my case I have about 500k MyClass objects, which only contains a string that also is the primaryKey.

let results = realm.objects(MyClass.self).filtered("aStringInMyClass BEGINSWITH[c] 's'").sorted(byKeyPath: "aStringInMyClass")
//For my dataset the above query gives results.count = 80k

let firstResult = results.first //This takes a long time when results contains many values.

Code Sample

let results = realm.objects(MyClass.self).filtered("aStringInMyClass BEGINSWITH[c] 's'").sorted(byKeyPath: "aStringInMyClass")

let firstResult = results.first //This takes a long time when results contains many values.

Version of Realm and Tooling

  • Realm framework version: 2.6.1
  • Realm Object Server version: ?
  • Xcode version: 8.3
  • iOS/OSX version: 9.3.5
  • Dependency manager + version: ?
@karagraysen
Copy link

Hey @fun2C0d3. Thanks for reaching out about this. Wanted to let you know that we've seen your issue and that someone will follow-up with more information soon.

@jpsim
Copy link
Contributor

jpsim commented Apr 21, 2017

Getting a Results object with all the matching results. Iterating over a few results should be fast since everything is lazy loaded.

Lazy loading helps here, but it doesn't make expensive operations magically disappear. For example, the fact that high-level "heavy" Objective-C or Swift model objects aren't created until accessed probably shaves off an order of magnitude of time to run the query, but it doesn't avoid the need to perform a case-insensitive string comparison operation on 500,000 strings, followed by sorting all 80,000 results.

No matter how many results are found I only want to iterate over a few of the first results, do some work and then update the user UI.

Because you're sorting the Results, Realm can't just terminate the query search after it finds a small number of objects matching the predicate, as that might not be the first match after the sort operation. So all 80,000 objects must be sorted before Realm can even serve you one of them. Skipping the sort operation would drastically improve the performance here.

let unsortedResults = realm.objects(MyClass.self).filter("aStringInMyClass BEGINSWITH[c] 's'")

// "cheap" as it just performs a "find first"
let anyResult = unsortedResults.first

// "expensive" as it exhaustively searches all 500,000 objects, then sorts all 80,000 results
let sortedResults = unsortedResults.sorted(byKeyPath: "aStringInMyClass")
let firstResult = sortedResults.first

// "cheap" as the search & sort has already happened
let lastResult = sortedResults.last

You have a few options to keep your app fast and/or responsive when dealing with large amounts of data.

1. Perform Queries Asynchronously

This won't speed up the query, but it will avoid blocking the thread on which the results are needed.

let unsortedResults = realm.objects(MyClass.self)
                           .filter("aStringInMyClass BEGINSWITH[c] 's'")
                           .sorted(byKeyPath: "aStringInMyClass")

// Query still hasn't been evaluated at this point, because of its "lazy-loading" property

let token = unsortedResults.addNotificationBlock { _ in
  // synchronous access to the Results, but only after it's been
  // calculated & prepared on a background thread
  unsortedResults.first
}

// Query still hasn't been evaluated at this point, because you still haven't accessed it synchronously.
// The notification block is invoked on the same thread on which it was added,
// but only when the Results have been calculated & prepared on a background thread.

// Later... (token must be retained at least until the initial notification is delivered)

token.stop()

2. Shard Your Data

Either by splitting up your 500,000 objects into different Realms, models, or properties. The grouping of this sharding is highly dependent on what makes sense semantically with the data you're storing/querying.

3. Precompute Your Queries

Realm's List type is inherently ordered, so you could store the same data as returned by the query+sort operation already ordered in a List property.

@fun2C0d3
Copy link
Author

Thanks for a very detailed response! 👍
Removing the sort does improve the query time allot. Now more like 0.5-1 second instead of 5.

Is it correct that queried elements will be returned sorted by their primaryKey?
That seems to be the case for me and the behaviour is consistent as far as I can tell.

@kishikawakatsumi
Copy link
Contributor

@fun2C0d3

Is it correct that queried elements will be returned sorted by their primaryKey?

No, it isn't. The order of Results isn't guaranteed. The order changes each time an object is added or deleted, for example.

I do not know if it is suitable for your use case though, but you can use List to save the object keeping the order. The List object keeps the order of each element.

@TimOliver
Copy link
Contributor

Hopefully that answered your questions @fun2C0d3! Please let us know if you have any additional follow-up!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

7 participants