
feat: support for multiple gRPC channels #125

Merged
merged 3 commits from perf-tuning into main on Jul 20, 2022
Conversation

cprice404
Collaborator

This commit is the result of several hours of experimentation
with gRPC tuning settings and running load tests with varying
numbers of gRPC channels.

See results here:
#120

The tl;dr is:

  • the python gRPC client seems to be tuned reasonably well out-of-the-box
    and I wasn't able to demonstrate any obvious perf improvements by
    using multiple channels or tweaking gRPC configs.
  • the javascript SDK performs slightly better than the python one, which
    was surprising to me. My best guess based on the data is that something
    about the async / network I/O frameworks in nodejs is able to do a tiny
    bit better job utilizing the CPU core, and that gives it a slight edge.

Member

@eaddingtonwhite eaddingtonwhite left a comment


Nice perf-testing write-up in the ticket. This looks like a good base we can iterate on in future PRs 👍 🚢

# We are hard-coding the value for now, because we haven't yet designed the API for
# users to use to configure tunables:
# https://github.com/momentohq/dev-eco-issue-tracker/issues/85
_NUM_CLIENTS = 1
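For context, a minimal stand-alone sketch of the multiple-clients approach that this hard-coded value feeds into (all class names here are illustrative stand-ins, not the SDK's real types): the top-level client owns `_NUM_CLIENTS` independent data clients, each of which would hold its own gRPC channel, and dispatches calls to them round-robin.

```python
import itertools

class _FakeDataClient:
    """Stand-in for a per-channel gRPC data client (illustrative only)."""
    def __init__(self, index: int):
        self.index = index

    def get(self, cache_name: str, key: str) -> str:
        return f"client-{self.index}:get:{cache_name}:{key}"

# Hard-coded for now, pending a user-facing API for tunables.
_NUM_CLIENTS = 1

class SimpleCacheClient:
    """Dispatches each call to the next client in round-robin order."""
    def __init__(self, num_clients: int = _NUM_CLIENTS):
        self._clients = [_FakeDataClient(i) for i in range(num_clients)]
        self._round_robin = itertools.cycle(self._clients)

    def get(self, cache_name: str, key: str) -> str:
        return next(self._round_robin).get(cache_name, key)
```

With `num_clients=3`, successive calls land on clients 0, 1, 2, 0, … — each client (and therefore each channel) sees an even share of the traffic.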
Contributor


Thank you for sharing the performance numbers and an amazing deep dive.

Just a curiosity question: is there a reason we chose to create multiple clients rather than actually implementing channel pooling?

https://grpc.io/docs/guides/performance/#general and the related issue grpc/grpc#21386

I just want to make sure that gRPC won't do any sort of optimization given that the channels have the same arguments. If you have verified that the connections won't be reused, then I guess this is a fine approach too.

Collaborator Author


Thanks for bringing this up.

The main reason I'm creating multiple clients is just that it seemed like an easier implementation path.

When you say "implementing channel pooling" - do you just mean creating more than one channel inside of the grpc_manager class and then round-robining over those instead of doing the round-robin out here? If we have any reason to believe that there would be different performance characteristics for that approach then it would be worth trying out. Let me know if you have thoughts.

I didn't put a ton of thought into this, but from a customer API / UX perspective, I kind of like this approach because it would allow advanced customers to experiment with creating their own pool of MomentoCache objects without having to think about or tune the number of gRPC channels that each client uses.

re: "channels have the same arguments" - yes I read that in the docs and I was concerned about it, but I wasn't sure what to think about it for 2 reasons:

  1. creating multiple channels in the JS SDK very clearly made a difference so I kind of assumed the python one would handle the channel creation the same way, but that may be a faulty assumption.
  2. That mention in the docs about channel arguments talks about "using a use-specific channel arg
     such as channel number", but if you look at the grpc source where they steer you to read about
     config flags, the only setting I could find that looked related was the channel_id setting, and
     the comments in the code make it sound like it is only used for Objective-C. I didn't see any
     other settings that looked like they would cover what was being hinted at by the docs you
     linked above.

I can try out that channel id setting pretty easily, no harm in that.
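The docs' suggested trick — giving each channel a distinct channel argument so gRPC cannot collapse the pool onto one shared connection — could be tried roughly like this. This is a hedged sketch: the arg name is an illustrative placeholder (not a real gRPC flag), and the channel factory is injected so the sketch stays self-contained.

```python
from typing import Callable, List, Sequence, Tuple

ChannelOptions = Sequence[Tuple[str, int]]

def make_channel_pool(
    create_channel: Callable[[str, ChannelOptions], object],
    endpoint: str,
    pool_size: int,
) -> List[object]:
    """Create pool_size channels, each carrying a distinct dummy channel
    arg, per the gRPC performance guide's suggestion for keeping channels
    from sharing a single underlying connection. The arg name below is
    made up purely for illustration."""
    return [
        create_channel(endpoint, [("illustrative.channel_number", i)])
        for i in range(pool_size)
    ]
```

With grpc installed, `create_channel` could be something like `lambda ep, opts: grpc.insecure_channel(ep, options=opts)`; whether the distinct arg actually prevents connection reuse in the Python implementation is exactly the open question in this thread.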

If you see another setting in that list that looks like it would be suitable for that task I would definitely be interested in your thoughts there.

And lastly if you have any experience messing with their instrumentation / census / channelz stuff and you know an easy way for me to verify whether it is optimizing out the additional channels, I'd also love to know more about that! If not I can circle back around in the future and dig into that more.

Contributor


I was always thinking this would be done at a lower level, so I had created this abstraction so that we don't have to leak the number of clients, etc., into the higher-level code as we are doing now:

https://github.com/momentohq/client-sdk-python/blob/main/src/momento/aio/_scs_grpc_manager.py
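The lower-level alternative being described — pooling channels inside the gRPC manager so callers never see the pool — might look roughly like this (a hypothetical sketch with made-up names; a channel factory is injected so the sketch stays independent of grpc itself):

```python
import itertools
from typing import Callable, List

class PooledGrpcManager:
    """Hypothetical manager that owns a pool of channels and hands them
    out round-robin, so higher-level code never sees the pool size."""
    def __init__(self, make_channel: Callable[[int], object], pool_size: int):
        self._channels: List[object] = [make_channel(i) for i in range(pool_size)]
        self._cycle = itertools.cycle(self._channels)

    def channel(self) -> object:
        # Callers just ask for "a channel"; the rotation is internal.
        return next(self._cycle)
```

In the real SDK, `make_channel` would wrap the channel construction currently done in `_scs_grpc_manager.py`; the point of the design is that `SimpleCacheClient` would stay a single object and never know the pool exists.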

Collaborator Author


Yep, that's definitely an option, and I'm not 100% certain that the approach I've taken here is better. But it does allow customers to construct multiple instances of the client directly, if they opt out of using SimpleCacheClient. And in languages like Python/JS, where you really have to spawn multiple processes to utilize multiple CPU cores, we can't share the gRPC channels across processes anyway. So I was thinking it's more natural to abstract away the gRPC part by keeping the relationship 1-1 there, and allow customers to create multiple instances of the client directly if they want more control than what SimpleCacheClient gives them.

I could be persuaded that it would be a better UX to push the abstraction down a level further so I would be happy to hear more of your thoughts about it. cc: @kvcache

Contributor


Unfortunately I don't have any practical knowledge about instrumentation.

If the JS approach worked fine, then hopefully this will work for other languages too 🤞, and we probably don't need channel pooling at the gRPC level. However, there are these random comments that make me scratch my head a bit about whether all of this is language-specific: grpc/grpc#20985 (comment)

The documentation about channel pooling has been really hard to navigate. I really appreciate all your work that went into this.

Collaborator Author


Thanks! I'll do a quick test using that channel_id setting to see if it makes any difference.

Collaborator Author


I did some runs where I set that channel_id field and didn't see any difference in performance, so I'm putting a pin in this for now. It may well be worth exploring in the future, especially if we have time to dig into the gRPC instrumentation to see what kind of insights it gives us.

Base automatically changed from python-retries to main July 20, 2022 21:54
@cprice404
Collaborator Author

rebased onto latest main

@cprice404 cprice404 merged commit 3606a77 into main Jul 20, 2022
@cprice404 cprice404 deleted the perf-tuning branch July 20, 2022 22:07