
Disappointing read performance #20

Closed

wereHamster opened this issue May 17, 2014 · 14 comments
@wereHamster

I did a simple read benchmark with criterion and the results are very disappointing. The numbers indicate that the driver can only sustain ~25 reads/s (or ~40ms per query). The benchmark code is here.

The hardware I was running the benchmark on is a Lenovo x220, i7-2640M 2.80GHz, 16G RAM.

@wereHamster
Author

I should add:

  • RethinkDB Server 1.12.4
  • GHC 7.6.3
  • Haskell rethinkdb driver 1.8.0.5

@AtnNn
Owner

AtnNn commented May 19, 2014

Thanks for the details. It may be related to alphaHeavy/protobuf#4.

RethinkDB 1.13 will deprecate the protobuf-based protocol. I don't think I will fix this issue directly, but instead switch to the new protocol when that version comes out.

@NathanHowell

I wouldn't expect protobuf to have that much of an impact on runtime unless you have profiling results that say otherwise. I would look at disabling Nagle's algorithm and `Handle` buffering, though:

```haskell
hSetBuffering h NoBuffering
setSocketOption s NoDelay 1
```
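For context, both suggestions would be applied when the socket is first opened. A minimal sketch, assuming `network` and `System.IO`; `connectNoDelay` is a made-up helper for illustration, not part of the rethinkdb package:

```haskell
-- Open a TCP connection with Nagle's algorithm disabled and Handle
-- buffering turned off, so small query payloads hit the wire immediately.
import Network.Socket
import System.IO

connectNoDelay :: HostName -> ServiceName -> IO Handle
connectNoDelay host port = do
    addr:_ <- getAddrInfo (Just defaultHints { addrSocketType = Stream })
                          (Just host) (Just port)
    s <- socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr)
    setSocketOption s NoDelay 1   -- disable Nagle: don't wait to coalesce small writes
    connect s (addrAddress addr)
    h <- socketToHandle s ReadWriteMode
    hSetBuffering h NoBuffering   -- don't batch writes in the Handle layer either
    return h
```

Without both settings, a small request can sit in a buffer (or wait for an ACK under Nagle) for tens of milliseconds, which matches the ~40 ms per query observed above.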

@AtnNn
Owner

AtnNn commented Oct 8, 2014

Upgrading to RethinkDB 1.15, adding `hSetBuffering h NoBuffering`, and switching to the JSON API instead of protobuf does not fix this issue. Queries still seem to take 40 ms.

@wereHamster
Author

I did some benchmarks with my own driver (https://github.com/wereHamster/rethinkdb-client-driver). It uses the JSON API.

Looking up a single empty object ({ "id": "x" }) in an otherwise empty table is very fast, about 2.5s per 10k iterations (0.25ms per single call). I also plan to test something which doesn't hit the database (such as ADD 1 2 or even CONSTANT 1) to see how much overhead there is on the protocol level. I already have a few tests (https://github.com/wereHamster/rethinkdb-client-driver/blob/master/test/Test.hs#L73-L88), I just need to wrap them in criterion.

Once the object gets bigger, the lookup takes orders of magnitude longer. 30s / 10k iterations (3ms per call) and more. I stopped testing after that. I don't know whether the time is spent on the client parsing the response or in the server.
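One way to tell the two apart is to time the client-side JSON decode by itself, against a synthetic document of the same shape, with no server involved. A rough sketch assuming aeson (the driver's own decoder may differ; field names and values are made up):

```haskell
-- Build an n-field JSON object as a ByteString and decode it locally,
-- isolating client-side parse cost from the server round trip.
import Data.Aeson (Value, decode)
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.List (intercalate)

bigDoc :: Int -> BL.ByteString
bigDoc n = BL.pack $
    "{" ++ intercalate ","
             [ "\"field" ++ show i ++ "\":\"some value\"" | i <- [1..n] ]
        ++ "}"

parses :: Int -> Bool
parses n = case decode (bigDoc n) :: Maybe Value of
    Just _  -> True
    Nothing -> False
```

Wrapping `decode (bigDoc 1000)` in a criterion benchmark would show whether the extra milliseconds per call come from parsing the larger response or from the server side.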

@AtnNn
Owner

AtnNn commented Oct 9, 2014

@wereHamster Awesome. Those are the kind of times I would expect.

I believe my code in Network.hs is doing something wrong. When I run multiple queries in parallel, the average time goes down a lot. I will be doing some profiling.
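That parallel speed-up can be quantified by timing n copies of an action run concurrently and comparing the per-action average with the sequential latency. A rough sketch using only base's `forkIO` and `MVar`s; `avgConcurrent` is a made-up helper, and the action would be a driver query in the real measurement:

```haskell
-- Run n copies of an IO action concurrently and report the average
-- wall-clock seconds per action. If each request carries a fixed stall
-- (e.g. a Nagle/delayed-ACK interaction), overlapping the requests makes
-- the per-action average drop far below the sequential latency.
import Control.Concurrent (forkIO, threadDelay, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, forM_)
import Data.Time.Clock (getCurrentTime, diffUTCTime)

avgConcurrent :: Int -> IO () -> IO Double
avgConcurrent n act = do
    t0   <- getCurrentTime
    done <- replicateM n newEmptyMVar
    forM_ done $ \v -> forkIO (act >> putMVar v ())
    mapM_ takeMVar done               -- wait for every copy to finish
    t1   <- getCurrentTime
    return (realToFrac (diffUTCTime t1 t0) / fromIntegral n)
```

A per-query average that shrinks roughly linearly with the number of concurrent queries is the signature of a fixed per-request stall rather than server-side work.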

@codedmart
Collaborator

@AtnNn could this be a bug in Network using connectTo? The mongo driver suffers from a similar lag as well and they are using connectTo from Network. It looks like @wereHamster is using createSocket for his connection. This is all just speculation right now as I have not looked into it and I could be way off. I will look into it more after the weekend.

@wereHamster
Author

connectTo is a simple wrapper around getaddrinfo and the socket syscall. I don't believe that it makes any difference.

@codedmart
Collaborator

@wereHamster Thanks! Maybe I should read through the code more before speculating or commenting 😄

@codedmart
Collaborator

@AtnNn for the record I did try setting the socket option as @NathanHowell mentioned:

```haskell
setSocketOption s NoDelay 1
```

That didn't seem to have any effect either.

@AtnNn
Owner

AtnNn commented Nov 1, 2014

In a quick test, adding setSocketOption s NoDelay 1 gives me 309.9 μs instead of 41 ms for point gets.

@codedmart
Collaborator

Hmm... maybe I did something wrong.

@AtnNn
Owner

AtnNn commented Nov 1, 2014

With this change, I was able to get over 4000 reads per second on a single connection by running bench point-get. It would be possible to get even more by using multiple connections.

[chart: reads per second]

I've pushed my changes to master. All tests pass on GHC 7.8. I'll release this and @codedmart's other changes after testing with GHC 7.6.

@AtnNn AtnNn closed this as completed Nov 2, 2014
@wereHamster
Author

I don't think the improvement is because of the NoDelay socket option. I don't use it in my driver, and my driver is even faster on the same point-get benchmark (~170 µs vs. ~250 µs on my laptop). I would attribute the improvement to switching to the JSON protocol.

Good job nonetheless. My web app feels quite a bit faster now.
