Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API proposal : pooled / leased connections as a secondary API #886

Open
mgravell opened this issue Jul 10, 2018 · 17 comments
Open

API proposal : pooled / leased connections as a secondary API #886

mgravell opened this issue Jul 10, 2018 · 17 comments

Comments

@mgravell
Copy link
Collaborator

mgravell commented Jul 10, 2018

Motivations:

  • provide safe access to blocking operations (BRPOPLPUSH, etc)
  • provide "pure" access to WATCH/MULTI/EXEC
  • we already expose the user to the risk of these operations via Execute; this would hopefully stop people doing dangerous things on the main connections

Counter-motivations:

  • Lua scripts are better solutions for most (but not all) scenarios that previously would have been WATCH/MULTI/EXEC

Note:

If implemented (needs consideration), the multiplexer would take care of the socket IO, but they would be stored separately and would not participate in the multiplexed command stream. This would be in addition to the regular API and would just work inside the existing context.

Questions:

  • what does this mean for cluster? is it assumed that all operations on a lease will participate on a single shard and will therefore be single-server?
  • how is primary/replica selected
  • what does the lease API look like - presumably using?
  • what happens when the pool is exhausted? is it an "await (async) an item from the pool, or timeout after x"?
  • where would it attach; IServer ? and if so, do we need a new GetServer mechanism that makes this painless?
  • how complex does the pool management need to be? just a max? is it enabled at all if not configured?
  • do timeouts apply on blocking operations? if so, how?

Timescales:

It won't be 2.0

Likelihood:

Thinking about it; it is tempting.

@mgravell
Copy link
Collaborator Author

mgravell commented Jul 10, 2018

Usage idea:

// args for sharding and primary/replica selection (key is just
// representative - and optional?)
IServer server = conn.GetServer(sampleKey, flags); 

// other args here not shown: db number, async state, etc
using(ILeasedDatabase db = await server.LeaseAsync(timeout)) 
{
  // stuff
  await db.SomeBlockingOperationAsync(..);
  // more stuff
}

Or maybe just one-step that?

using(ILeasedDatabase db = await conn.LeaseAsync(sampleKey, flags, timeout))
{
   // ...
}

with:

interface ILeasedDatabase : IDatabase, IDisposable
{
    // TODO: add blocking APIs

    // TODO: add "pure" WATCH/EXEC APIs
}

@NickCraver
Copy link
Collaborator

For clarity: is the concern mainly client (pipeline) blocking operations due to large load, or things that are meant to be server-blocking (atomic), the latter of which are going to block due to the nature of redis anyway? I'm assuming all the wins are in the former, in which case: ...maybe? But I see less utility for it.

That being said, most of our direct concerns in this area (around connections) at Stack fall info:

  • Lots of data in an op (e.g. a large string, like with leagues)
  • Lots of pipe contention for massive op counts (e.g. API throttles)

I'm not sure how it'd help the second case, but a similar approach as our "bulky" could be natively handled by the former. But, counter to that use case, is it worth automatic control and generation of bulk connections over our explicit usage today? I'm leaning towards "it's not", due to increased complexity and uncertain expectations vs. today. A single "bulky" on-demand (in addition to infrastructure and subscription) connections could satisfy that, but it's such a narrow use case...again, worth it?

I think the biggest issue I have is this changing from a very predictable 2 connections to n and breaking all expectations. That worries me for our use case...but I totally admit we're a crazy outlier and that's likely not an issue in general. To that end, I'd say: we'd want to be able to disable such behavior...but then we don't be dogfooding it either.

...I'm not sure I helped here, but that's current thoughts after a first pass.

@mgravell
Copy link
Collaborator Author

@NickCraver the motivation isn't data volume / load related; it is semantic related. There are a good few operations that are designed to be connection-blocking, so we don't expose them today. Similarly, there are some WATCH scenarios that we can't implement, because they require ad-hoc commands inside a WATCH region (but before a MULTI), which we disallow.

@aensidhe
Copy link

aensidhe commented Jul 10, 2018

  1. IDatabase is compatible with IDatabaseAsync. I can shoot myself into leg passing ILeasedDatabase into usual async code.
  2. Why do we need separate API for creating additional connection? Why creating a separate multiplexer and caching it is worse? Chances are that I'll need it several times during lifetime of my web code. If code is in some tool, I can create as many multiplexers as I want.

My opinion that this API is unnecessary.

PS: yes, I can still shoot myself into leg by passing IDatabaseAsync from "special multiplexer" into usual code, but that is ok for me.

@NickCraver
Copy link
Collaborator

NickCraver commented Jul 10, 2018

@mgravell I'm not sure I understand the intent of where you're picturing the leasing/blocking. It's a close in concept to a transaction (IMO), or (the way I read it) we'd be blocking every command on the pipe behind that using, effectively making it lock() semantics w.r.t. the pipeline. Can you elaborate on the examples a bit more on where you expect the actual leasing/blocking to start and end? If it's not on a separate connection (pooled, dedicated, whatever), which most of my concerns are around, we're still back to blocking the existing primary connection (the primary reason, AFAIK, we disallow these today). Am I way off here?

@mgravell
Copy link
Collaborator Author

@aensidhe

  1. doesn't shoot you in the leg - it just means it runs on a separate connection
  2. because establishing connections and doing all the handshaking is more expensive than you might think; and also: we don't currently expose those APIs for you to use

@mgravell
Copy link
Collaborator Author

mgravell commented Jul 10, 2018

@NickCraver the leased connection here is a dedicated socket - it is a leased redis connection, essentially; "interactive" only (no subscription functionality); we would hand it back to some notional pool when disposed, etc (so from a redis perspective: it stays alive)

@NickCraver
Copy link
Collaborator

Okay so if I do this:

db.StringSet("mykey", "value");
using(ILeasedDatabase db = await server.LeaseAsync(timeout)) 
{
  var val = await db.StringGet("mykey");
}

...then val can be null, because we're got 2 connections and physical pipelines - so if the first is backed up a bit, it won't have issued the write to redis before the GET on the second, right? This is less likely to happen if using the shared SocketManager writer pools, but still. I think there's a lot of non-intuitive behaviors that could arise here: races, exhaustion of the pool (in loops/concurrency) or unexpected timeouts (assuming that's a concept that applies here), and other related bits when using these with any decent frequency.

Thoughts on those kinds of things?

@aensidhe
Copy link

aensidhe commented Jul 10, 2018

@mgravell

  1. Yeah and that can be unexpected. Can't think about anything bad right now, so ok, let it be ok.
  2. I know about cost of establishing of connection, but why is pooling connection is better than thinking ahead and putting an additional multiplexer ahead of time into DI container or in separate global variable, etc? This is very edge use case. Will you recycle pooled connection? How and when will you do it?

It's not a "user way of think", I'm writing a similar package for another database. And I think that if user want a separate connection from time to time he can cache that separate connection somewhere near main connection.

And other concern: IDatabase and IDatabaseAsync are lightweight wrappers to get and throw away here and there, if I'm remember it correctly. Making a compatible interface that can create a connection per allocation of that interface can be a problem.

@mgravell
Copy link
Collaborator Author

mgravell commented Jul 10, 2018

Okay so if I do this:

No, there is no conflict there; StringSet blocks until the response comes back from the server; likewise await StringSetAsync. There is a thread-race if you do:

var x = dbA.StringSetAsync();
var y = dbB.StringGetAsync();
await x;
var s = await y;

but.... at that point you've gone out of your way to introduce a problem (@NickCraver )

@NickCraver
Copy link
Collaborator

Ah sorry I should amend the example (was on mobile), I don't think the FireAndForget is that uncommon/going out of the way - I think that's fairly standard fare on a set operations.

I'm just not (yet) convinced we've gotten enough feedback wanting this to justify spending much time on it, but I don't feel strongly against either. FWIW, some of the feedback like WAIT was around wanting things on that pipeline to finish (which is a separate issue from this anyway). I can recall a few issues related here, but IIRC they were all isolated.

My main concern is still around any sort of automatic socket pooling which makes things less predictable. That would feel (at this particular moment) like a step backwards. Right now failures there are failures on that multiplexer and it's not too hard to reason about what went wrong. When 1 (or many) in a pool of several start having issues, things get messy to figure out...that's the part that scares me, given we just got sockets decently stable and are still working on hangs (which to be fair may be unrelated).

To be fair, it's entirely possible after we're stable on v2 I'd feel a lot more confident in diving into this with a different perspective on the complexity/debug risk vs. payoff.

@mgravell
Copy link
Collaborator Author

Fair enough; totally agree that it isn't a priority today - happy to shelve indefinitely, I just want to open the dialogue.

@Plasma
Copy link

Plasma commented Jul 26, 2018

Just wanted to chime in as to why having a separate connection, even for non-blocking operation use, can make sense some times.

At the moment the recommendation is to use a single, shared ConnectionMultiplexer across all threads.

This usually works well, but I've observed these side effects in production:

  1. Because all commands from all threads go to a single C# queue and TCP socket, you get "head of line blocking" behind a buffer that may be filled with more commands from Thread A (perhaps megabytes in terms of bandwidth to process) in front of the one you're wanting to transmit that is more time sensitive, and smaller in size, from Thread B.

Imagine for example, Thread A has written several megabytes worth of SET commands due to a background task, and Thread B has a simple GET command pending at the end of the shared muxer queue. Thread B must wait for Thread A's workload to be emptied first.

The delays I am referring to here are milliseconds, and in one case 100s of milliseconds, due to CPU heavy redis commands used (eg sorted set intersects), but I'm mostly pointing it out in terms of "jitter" observed for random requests.

  1. I acknowledge that Redis is single threaded, and processes requests sequentially, but it is worth noting that Redis attempts to treat each client/TCP connection fairly [1].

  2. Imagine we utilized two connections ("clients") to Redis instead, and Thread A is doing its bandwidth-heavy, batch based processing on one connection, and Thread B uses another connection that is intended to be for more real-time request/response patterns.

If both are sent at the same moment, and my assumption is the local app server's TCP framework will also do interleaving, then Redis will switch/interleave processing requests from the TCP sockets between both connections, so while partially receiving/processing Thread A's workload (the multi-megabyte SET commands), Redis will process Thread B's simple workload quickly, then switch back to continuing to process Thread A workload.

The end result here is that Thread B gets a faster response time as its connection's workload was interleaved, on Redis's side, because it was a separate TCP connection, and it wasn't behind the workload of Thread A by sharing the same TCP socket or local C# command queue.

[1] https://redis.io/topics/clients

@mgravell
Copy link
Collaborator Author

mgravell commented Jul 26, 2018 via email

@prat0088
Copy link

Is this still on the table? Is there anything I can help out with?

@mgravell
Copy link
Collaborator Author

mgravell commented Feb 15, 2020 via email

@DeepWorksStudios
Copy link

Any updates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants