
rewrite selector compiler for minimongo #480

Closed · wants to merge 13 commits into base: devel
2 participants
@Ed-von-Schleck
Contributor

Ed-von-Schleck commented Nov 14, 2012

I've written a new selector compiler that works without eval. The main point is speed: on my sluggish netbook the tests for selector_compiler run roughly 3x as fast, and I expect it to be even faster in production, once the JIT kicks in for often-used functions.
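
An eval-free compiler like the one described here might, at its simplest, interpret the selector object directly at match time by dispatching on operator names through a lookup table. The sketch below is purely illustrative (the names `OPERATORS` and `evaluateSelector` are hypothetical, not the branch's actual code), and covers only a handful of operators:

```javascript
// Hypothetical sketch of an eval-free selector evaluator: instead of
// compiling a selector to a source string and eval'ing it, walk the
// selector object at match time, dispatching on operator names through
// a lookup table. Names are illustrative, not the branch's code.
var OPERATORS = {
  $gt: function (docValue, operand) { return docValue > operand; },
  $lt: function (docValue, operand) { return docValue < operand; },
  $ne: function (docValue, operand) { return docValue !== operand; },
  $in: function (docValue, operand) { return operand.indexOf(docValue) !== -1; }
};

function evaluateSelector(selector, doc) {
  for (var key in selector) {
    var criterion = selector[key];
    var docValue = doc[key];
    if (criterion !== null && typeof criterion === 'object') {
      // Operator object like {$gt: 5}: every operator must match.
      for (var op in criterion) {
        var fn = OPERATORS[op];
        if (!fn || !fn(docValue, criterion[op]))
          return false;
      }
    } else if (docValue !== criterion) {
      // Literal value: simple equality check.
      return false;
    }
  }
  return true;
}
```

The trade-off relative to an eval-based compiler is that the table lookups and shape checks happen on every match, rather than once at compile time.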

All the tests pass, but they are very incomplete, so I will probably be working on those next (depending on the time I find).

I know it's not in a mergeable state right now (both feature-wise and quality-wise), but I thought I'd let you know what I'm working on, and I'd really like to hear your comments.

I'm closing #477 and #476 as a result (at least until I'm sure that those fast paths are still helpful with the new implementation).

@glasser

Member

glasser commented Nov 15, 2012

This looks like a great start! Let us know when it's ready to fully review and merge.

My main thought: let's make sure that we're benchmarking the right thing. There are two operations whose speed can be affected here: compiling a selector and executing it. For, e.g., an observed cursor, the execution may happen many, many more times than the compilation. My impression was that some of the advantage of eval is that the generated selector function could end up being very simple, so execution could be fast. Now, without a benchmark, who's to say if that's actually true... and it looks like your code shouldn't add TOO much work to execution time.

But my point is just: let's make sure we're not just trying to optimize compilation in a vacuum without also looking at the effect on execution!
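
The point about weighting execution over compilation could be captured in a small harness along these lines (a sketch; `benchmark` and `naiveCompile` are made-up names, and the trivial compiler exists only to exercise the harness):

```javascript
// Illustrative benchmark harness: compile each selector once, then run
// the compiled matcher many times, so that execution cost dominates the
// way it would for an observed cursor. `compile` is a stand-in for
// whichever compiler is under test.
function benchmark(compile, selector, docs, executionsPerCompile) {
  var t0 = Date.now();
  var matcher = compile(selector);        // compilation, measured once
  var tCompile = Date.now() - t0;

  var t1 = Date.now();
  for (var i = 0; i < executionsPerCompile; i++) {
    for (var j = 0; j < docs.length; j++)
      matcher(docs[j]);                   // execution, measured many times
  }
  var tExecute = Date.now() - t1;

  return { compileMs: tCompile, executeMs: tExecute };
}

// Trivial equality-only compiler, used only to exercise the harness.
function naiveCompile(selector) {
  return function (doc) {
    for (var key in selector)
      if (doc[key] !== selector[key]) return false;
    return true;
  };
}
```

Varying `executionsPerCompile` is then a direct way to model the compile/execute ratio of different workloads.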

@Ed-von-Schleck

Contributor

Ed-von-Schleck commented Nov 15, 2012

I'm testing this mostly with my app, which I believe is a fairly standard user of minimongo. I simply profile actual use of the site and check which functions take up a lot of time. With this change, _compileSelector obviously all but disappears from the profile, and _evaluateSelector (the new main function introduced in this branch) shows up, but typically with 2x-15x less execution time.

I agree with your concerns about observed cursors, which call _evaluateSelector repeatedly. My app uses them rarely. Maybe the typical usage is different from mine, but I really can't tell (a benchmark package with both synthetic microbenchmarks and real-life applications would be great!). I'm not overly concerned, though: if a selector function is repeatedly called, the JS engine should mark it as hot and JIT it.

Now, actually proving that a function is easily JITable is a different thing, I know. Once I'm sure it works correctly (right now I'm expanding the tests before I continue working on this branch), I'll try to make sure that the hottest functions are monomorphic and such. I'm not an outright expert on optimizing JS, but there seems to be a lot of reading material on it, so I'm optimistic :)

@Ed-von-Schleck

Contributor

Ed-von-Schleck commented Nov 15, 2012

On repeatedly calling selectors: long term, the best strategy might be something I vaguely recall from the CouchDB implementation. A cursor call always saves the result set together with the selector. Whenever the collection changes, the saved selector functions are run over the new/changed docs and the result set is updated (this can be done lazily, the next time the cursor is called). Removed docs are removed from all result sets, of course.

That would make observed cursors really cheap, and would also greatly help the typical case of most queries being the same (check whether a cursor with the same selector exists -> use it, together with its result set). Result sets could be precalculated or cached on the server, making startup faster. Memory might be an issue (but result sets could be discarded after some time).
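
The scheme recalled here (each live query keeps its matcher and its result set, and document changes are pushed through every live query) might be sketched like this. This is only an illustration of the idea, not CouchDB's or Meteor's actual implementation:

```javascript
// Sketch of a cached-cursor idea: each live query keeps its matcher and
// its result set; document changes are pushed through every live query
// instead of re-running the selector over the whole collection.
function LiveQuery(matcher) {
  this.matcher = matcher;
  this.results = {};     // id -> doc, for docs currently matching
}

// Pass doc === null to signal that the document was removed.
LiveQuery.prototype.onDocumentChanged = function (id, doc) {
  var wasIn = Object.prototype.hasOwnProperty.call(this.results, id);
  var isIn = doc !== null && this.matcher(doc);
  if (isIn) this.results[id] = doc;           // added or updated
  else if (wasIn) delete this.results[id];    // removed or no longer matches
};
```

As the follow-up comment notes, ordered observes complicate this picture considerably, since batches of changes affect ordering together.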

Then again, although I'd love to implement something like that, I doubt that I have the time to do it (gotta earn money). Might be an idea for someone, though ...

@glasser

Member

glasser commented Nov 15, 2012

Yes, that's roughly how cursors work now, client-side. The extra rub, though, is that there are roughly two modes: one for applying one change at a time to all results (which is basically what you described), and another for applying a batch of many changes. The reason is that since we can observe ordered lists, getting an optimal set of "moved" callbacks requires processing all the changes that can affect ordering at once.

@Ed-von-Schleck

Contributor

Ed-von-Schleck commented Nov 16, 2012

Should have looked at the code before I started rambling... and I hadn't thought about ordered lists at all. Hmm. I'll keep this in mind when doing tests and benchmarks. Thanks for the hint!

Ed-von-Schleck added some commits Nov 16, 2012

Fixed bug where $and/$or/$nor accepted empty arrays
MongoDB doesn't accept them, so we follow its example. Also fixed the test
for that.
Replaced _compileSelector logic with non-eval'd code
A few tests still fail, and the original implementation is still in place.
Removed old implementation
The _f object remains because it is used elsewhere.
Passes all tests
The tests are incomplete, I know. The next thing I have to do is probably
to expand them. Also, the $where implementation is sadly missing
(it shouldn't be a big deal to add, though).
@Ed-von-Schleck

Contributor

Ed-von-Schleck commented Nov 18, 2012

I rebased the branch on #485 because the test coverage was not good enough before (maybe I should have merged instead, but I didn't think that through).

I'm quite confident that my implementation is faster than the original. The common cases (_id check or empty selector) are handled very quickly.

Also, addressing values in arrays (e.g. people.2.name) is supported, as is $elemMatch.
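
Resolving a dotted path with numeric components like `people.2.name` can be done with a simple walk, since JavaScript indexes arrays and objects with the same bracket syntax. This is an illustrative sketch (`lookupPath` is a made-up name, not the branch's actual helper, and it ignores MongoDB's array-branching semantics):

```javascript
// Illustrative dotted-path lookup supporting numeric array indexes, as in
// the selector key "people.2.name". Not the actual helper from the branch.
function lookupPath(doc, path) {
  var parts = path.split('.');
  var value = doc;
  for (var i = 0; i < parts.length; i++) {
    if (value === null || value === undefined)
      return undefined;
    value = value[parts[i]];  // works for object keys and array indexes alike
  }
  return value;
}
```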

There is just one remaining issue: I didn't find any documentation on how MongoDB handles comparisons ($lt, $gt) between objects. I made my implementation match the tests, but I'd feel better if I knew how these comparisons are defined. If you could give me a hint on that, I'd be really thankful.
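
For reference, MongoDB's documentation (at least in later versions) defines a total order across BSON types, so values of different types compare by a fixed type rank before their values are compared. A heavily simplified sketch of that idea, with made-up ranks covering only a few common types:

```javascript
// Hedged sketch of MongoDB-style cross-type comparison: BSON defines a
// total order across types (see the MongoDB docs on comparison/sort
// order), so mixed-type values compare by type rank first. These ranks
// cover only a few types; the real BSON ordering has many more.
function typeRank(v) {
  if (v === null || v === undefined) return 0;
  if (typeof v === 'number') return 1;
  if (typeof v === 'string') return 2;
  if (Array.isArray(v)) return 4;
  if (typeof v === 'object') return 3;   // plain objects rank below arrays
  if (typeof v === 'boolean') return 5;
  return 6;
}

// Returns negative / zero / positive, like a sort comparator.
function compareValues(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;         // different types: rank decides
  if (ra === 1) return a - b;            // numbers
  if (ra === 2) return a < b ? -1 : (a > b ? 1 : 0);  // strings
  return 0;   // simplification: real BSON compares objects field-by-field
}
```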

Other than that, I believe the branch is ready for pulling.

@glasser

Member

glasser commented Dec 18, 2012

Hi Ed! Sorry this took too long, but I finally got around to reviewing your patch (I spent most of today on it!) You can take a look at the 'pr-480' branch to see what I did.

The first commit there is your changes, squashed together (and also removing some dead code from the original compiler): c5f4d5e

The next commit there is some cleanup I did while reviewing it: 8180d8e
Mostly, I wanted to avoid your top-level try/catch which would mask any TypeErrors caused by programming errors. In doing that, I discovered a few issues, like matching RegExp objects against arrays.

After that, I decided to benchmark this. You can see the benchmark I chose on the branch as 47b31c8 (though I really just kept it in a git stash and popped it onto various versions of the code as I worked). I ran tests for the minimongo package only and looked just at the time for the selector_compiler test. This benchmark commit makes it run the test 100 times, and for each match check, it compiled the selector once and executed it 100 times. I figured this would show a reasonable tradeoff between compilation and evaluation time.

Unfortunately, at least in Chrome on my Mac, this benchmark showed that your code was a slowdown. The test took about 2 seconds with the "old compiler" code but about 7.5 seconds with the "Ed compiler" (or, well, Ed compiler with my modifications). Yes, the compiler was definitely faster --- and I think your code was more readable and maintainable --- but when combined with evaluation time, that definitely seems like a slowdown.

I was inspired to keep trying, though. The old compiler managed to hardcode a pretty simple function without too many nested function calls, but used the (slow) eval. Your compiler did very little compilation, so evaluation was still full of table lookups. What if we tried something in the middle? I did another pass: ce204e0. This avoided using eval, like yours did, but tried to make as many choices as possible at compile time rather than at match time. It performed a little better than yours (around 5.5 seconds) but still not as well as the current code.
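
That middle ground, inspecting the selector once and composing plain closures so the matcher does no shape-dispatch at execution time, might look roughly like the sketch below (illustrative only, not the code from the commit discussed above, and limited to equality, $gt and $lt):

```javascript
// Illustrative "middle ground" compiler: inspect the selector once, at
// compile time, and compose plain closures, so the returned matcher does
// no table lookups or branching on selector shape per match. Not the
// actual code from the commit discussed above.
function compileSelector(selector) {
  var clauses = [];
  Object.keys(selector).forEach(function (key) {
    var criterion = selector[key];
    if (criterion !== null && typeof criterion === 'object') {
      Object.keys(criterion).forEach(function (op) {
        var operand = criterion[op];
        if (op === '$gt')
          clauses.push(function (doc) { return doc[key] > operand; });
        else if (op === '$lt')
          clauses.push(function (doc) { return doc[key] < operand; });
        else
          throw new Error("unsupported operator in sketch: " + op);
      });
    } else {
      clauses.push(function (doc) { return doc[key] === criterion; });
    }
  });
  // The matcher just runs the prebuilt clause list.
  return function (doc) {
    for (var i = 0; i < clauses.length; i++)
      if (!clauses[i](doc)) return false;
    return true;
  };
}
```

The compile-time work here is deciding which closures to build; per match, only the closures themselves run.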

I'm not sure what to do next. I don't want to merge your compiler or mine, since their main inspiration is speed and they don't seem to improve speed in my benchmarks.

Maybe my benchmark is bad? Or maybe there are browsers where one of our compilers is so much faster than the old compiler that it makes up for the decrease in speed on Chrome? (ie, if it takes another browser from "incredibly slow" to "reasonably fast" while only dropping Chrome from "very fast" to "pretty fast", maybe that's good enough?)

Another possibility is that we should just do a function cache like you suggested. Did you say you already implemented one? A cache plus the eval compiler could be the best case (except for code clarity and extensibility :( )
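
A function cache along these lines might be sketched as a memoized compile keyed by the serialized selector (illustrative names; a real version would also need bounded size, as noted below, and a key scheme that handles values `JSON.stringify` can't represent):

```javascript
// Sketch of a compiled-selector cache: memoize compilation, keyed by the
// serialized selector. A production version would need eviction to keep
// the cache to a fixed size, plus keys safe for non-JSON values.
function makeCachingCompiler(compile) {
  var cache = {};
  var stats = { hits: 0, misses: 0 };
  function cachedCompile(selector) {
    var key = JSON.stringify(selector);
    if (Object.prototype.hasOwnProperty.call(cache, key)) {
      stats.hits++;
      return cache[key];
    }
    stats.misses++;
    return (cache[key] = compile(selector));
  }
  cachedCompile.stats = stats;
  return cachedCompile;
}
```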

In any case, I was inspired to do an implementation of "foo.1.bar" indexing for the old eval compiler, which I think works.

Thanks for putting the time into this --- I hope we can get some real speedups for minimongo out of this work.

@glasser

Member

glasser commented Dec 18, 2012

I'm going to close this for now, but hopefully we can get some more work done along these lines (eg, curious about the function cache that you wrote...)

@glasser glasser closed this Dec 18, 2012

@glasser

Member

glasser commented Dec 18, 2012

BTW, I took a quick stab at doing something that was just like the old eval compiler, except that it cached the results of eval (i.e., it called _exprForSelector every time and used the result as a cache key, but only in cases where literals was empty). This didn't affect performance on Chrome and sped up Firefox 3.6 (a token "old browser") by about 15%... which isn't enough to justify the additional complexity (especially since a real version would have to do something to keep the cache to a fixed size).

@Ed-von-Schleck

Contributor

Ed-von-Schleck commented Dec 20, 2012

Thanks for all the feedback! When time permits, I'm going to look more closely at this after Christmas.

One main motivation for my patch (which I didn't mention before, because as of now it is little more than a brainfart) was to make it easier to add indexes to minimongo. I have a rough plan for how to do this, and compiled selectors make it harder than it needs to be.

In this plan, I have indexes as hashmaps (plus ordered arrays for range lookups). Any query is first pre-filtered with the indexes, then the remaining query (if there even is one) is run through the standard selector evaluator. This makes selector function caching (a.k.a. a compiled selector) both a lot less useful and complicated to maintain (an index may be created after a query has been made, invalidating the compiled selector function).

If you are interested in indexes for minimongo, I would be happy to draft
out my plan in more detail and start to work on an implementation.


@glasser

Member

glasser commented Feb 5, 2013

Hi @Ed-von-Schleck! The original inspiration for this change was performance, and so my original decision not to accept the pull request was based on my benchmark not showing a performance improvement (and showing a performance regression, though possibly not one observable in real use).

I was disappointed, though, because your version was definitely more maintainable.

Over on the ddp-pre1 branch, @sixolet has extended Meteor to handle more types throughout the system. This meant that the eval-based compiler was more problematic: any additional types (ObjectID, Binary, etc) would have to know how to be serialized for eval, just because of the selector compiler implementation. So this was the bit of inspiration needed to accept the pull request and merge in your compiler (with some changes from me, plus a similar implementation of the sort function compiler). baceda9 is the merge commit.

So congratulations! This is one of the larger changes we've been able to merge from our open source community so far.
