Eagerly fill marks and matches cache #75
Conversation
and progressively fill/remove items instead of completely regenerating the whole cache.
by passing in a guess for the index.
Instead of copying the result data, we exchange it for an empty array and return the old result data. The caller can do the potentially slow copy if necessary, or handle the new data differently, skipping a perhaps unnecessary copy. Currently we still do the copy, but this allows us to optimize that step later. To reflect the new behavior, the methods dealing with the exchange have been renamed accordingly. There's also a small change in that the pointer parameters are turned into references.
We cannot properly guess a good start_index the way glogg can, since glogg uses a single-threaded linear search. But we can collapse two loops when we receive new matches: the first loop, which sorts the new matches into matching_lines_, and the second, which puts them into the filteredItemsCache_.
except when we absolutely can't, i.e. an empty container.
I've adapted this to the latest changes in a separate branch, gin-ahirsch-eager-filtered-items-cache-klogg (a055fe6). It seems to work fine; however, I notice some decrease in search performance on my computer. For the original build with the lazy cache:
For the build with the eager cache:
So the lazy cache runs at about 2.4M lines/s, while the eager cache gives 2.2M lines/s for the same file and regular expression.
I guess the commit "Let LogFilteredDataWorkerThread::updateSearchResult() move the result" could maybe affect cache locality negatively. Could you perhaps try running the search on that commit and compare it to the previous one? If so, we can maybe mitigate the cost somewhat. In any case, UI latency should improve since we can be smarter about updating the
In case moving the
Edit:
There's a stale comment in your merge-commit: `klogg/src/logdata/src/logfiltereddata.cpp`, line 525 in a055fe6.
Thanks for your research. I'll need to do more profiling. Also, my benchmarks are not very accurate, and improving UI latency might be worth some minor decrease in search speed.
The performance hit was also caused by merging new search results in the UI thread. I've switched back to providing all search data as well as the new lines for incremental cache filling. Thanks to immer's persistent structures, this does not involve too much data copying.
Another good test case for this is searching for a very common line (I use a 1 GB log file and a line that matches 6867600 times). Without eager cache filling, the search gradually gets slower. With this PR it progresses at an almost constant speed. Thanks again 👍
As promised in #37, here's the rebased work.
For some reason I also need to apply the following patch, otherwise my item is corrupted. I think this occurs sometimes when adding marks. While this does fix it, I don't see why the previous code would be wrong. Maybe it's just something else on my setup triggering the failure, so I didn't include it. Make sure to play with the cache a little! Well, I'm sure you will anyway, to evaluate whether you want this change or not.