Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for semaphore #7

Closed
clement911 opened this issue Mar 13, 2017 · 10 comments
Closed

Support for semaphore #7

clement911 opened this issue Mar 13, 2017 · 10 comments

Comments

@clement911
Copy link

This may be a bit out there for a "lightweight" library but have you thought of implementing a distributed counting semaphore?

In my scenario, I have a sql azure database with many customers. Customers can have a lot of data.
The .net code is hosted in several azure web instances.

I regularly want to process every customer data. However, I don't want to process all customers at the same time because it would overload the database.
I also don't want to process each customer serially (1 by 1) because it would take too long.
Ideally I would use a distributed semaphore with a count of N to process N customers at a time...

It could be built on top of SqlDistributedLock and I wonder if you've come across this scenario before?
Do you see that fitting within the vision of your library?

@madelson
Copy link
Owner

Any thoughts on what the implementation would look like? I can imagine using a global (##) temp table to store the counter state. What's less clear is how to implement the wait that happens when you try to remove the last ticket from the semaphore but can't. Obviously this could be done via sleeping with WAITFOR, but is there a cleaner mechanism by which anyone returning a ticket would wake up a waiter?

@clement911
Copy link
Author

Hmm I wasn't thinking about the overload with a timeout because I usually want to acquire the lock immediately or not at all. But I think we can handle that...

So, here are my thoughts on how to implement it.

For the user, it could look like this:

var semaphore = new SqlDistributedSemaphore("MyLock", connectionString, maxCount: 3);

using (semaphore.Acquire())
{
    // this block of code is protected by the lock!
}

Internally, the library will try to acquire a SqlDistributedLock with a name of "MyLock" + i.
Pseudo code:

for (int i = 1; i <= maxCount; i++)
{
     exclusiveLock = new SqlDistributed(lockName + i.ToString()).Aqcuire(...);
     if (exclusiveLock != null)
          return exclusiveLock;
}
return null;

If the user provided a timeout I guess we could try to acquire all locks in parallel, using the timeout provided. Of course the code would need to make sure that if multiple locks were acquired, all but 1 are released immediately... Also, if the connection is owned by the user that could be tricky because as far as I know SqlConnection is not thread safe.
Alternatively we could try each lock serially with a timeout of (timeout / maxCount). Not very elegant...

I realize I'm leaving out a lot of details here but I hope this makes sense from a high level view...

This post describes a similar technique, although the loop happens in the sql code...

@madelson
Copy link
Owner

Thanks @clement911 I can see that approach working well for many scenarios. Two things that would worry me about using this for a general-purpose API:

  • Were someone to create a semaphore with hundreds or thousands of tickets, the runtime for acquire scales very poorly
  • This works well with TryAcquire() semantics or a timeout of zero. However, for more traditional approaches where you want to wait to acquire the semaphore, we basically end up sleeping in a loop. This means executing a ton of additional queries as we wait.

Given these caveats, I don't think it makes sense to use the algorithm for a general-purpose semaphore API, especially since as you point out it only takes a few lines to implement it on to of the existing APIs. I'll keep thinking about whether there's a way to work around these limitations.

@clement911
Copy link
Author

Those are fair points. Thank you for your feedback and feel free to close the issue if you don't see it making it into the library.

@alastairtree
Copy link

What about using a global temp table with three columns LockName, LockHolder and optionally an expiry. Then you could insert into this table in an atomic manner to aquire the lock, and delete to release it. insert should be blocked when the rowcount reaches the max clients. Something like the below. Should scale pretty well I expect.

CREATE TABLE ##locks(Lockname varchar(20), LockHolder varchar(20), Expiry datetime2)

-- Try aquire the lock
DECLARE @LockName varchar(20) = 'MySharedLock'
DECLARE @LockHolder varchar(20) = 'Me'
DECLARE @MaxLocks int = 3

INSERT INTO ##locks(LockName, LockHolder, Expiry) 
SELECT @LockName, @LockHolder, DATEADD(hour,1,getdate())
WHERE NOT EXISTS (
	SELECT 1
	FROM ##locks
	WHERE LockName = @LockName 
		AND Expiry > GETDATE()
	GROUP BY LockName
	HAVING count(*) = @MaxLocks
)

IF(@@ROWCOUNT = 1)
	PRINT 'Lock Aquired'
ELSE 
	PRINT 'Lock denied'

-- and later release the lock
DELETE FROM ##locks
WHERE LockName = @LockName and LockHolder = @LockHolder

@madelson
Copy link
Owner

madelson commented Dec 3, 2017

Hi @alastairtree thanks for commenting.

I definitely think that a global temporary table would be part of the solution.

One of the big challenges, however, is implementing blocking behavior (Acquire vs. TryAcquire). In your example, failure to acquire the lock returns lock denied immediately, but in many cases you want to have your process block until it gains entry (or a timeout elapses). While this can be done by spinning in a busy wait, it's not ideal since it consumes resources for each waiter and doesn't guarantee any sort of fairness (one spinning thread could repeatedly get unlucky).

My current thinking on this would be something like the following:

if (try acquire one of the N sempahore slots) { return success; }

acquire waiter lock
{
     while (timeout has not elapsed) 
     {
          if (try acquire one of the N sempahore slots) { return success; }
     }
}

return failure

The idea here is that the waiter lock (a regular mutex lock) allows most waiters to block while a single waiter goes through the busy wait cycle. Since waiters queue on the wait lock, that means that the semaphore will be at least as fair as normal SQL server mutex. Still not perfect (we have a busy wait), but that's the best I have so far.

@madelson
Copy link
Owner

madelson commented Feb 8, 2018

This is available as of version 1.4

@clement911
Copy link
Author

Wooo thanks for that! That looks very solid!

@alastairtree
Copy link

Looks great! Can you comment on how you managed the fairness? Was it as above?

@madelson
Copy link
Owner

madelson commented Feb 8, 2018

@alastairtree the algorithm I ended up with is essentially this (implemented in SQL):

// the idea of the preamble is to make sure we never hit a case where a lock acquisition with a timeout
// (e. g. TryAcquire) returns false due to timing out on the busy wait lock (below) even though there
// were tickets available
preambleLock = name + '_preamble'
lock (preambleLock) // no blocking inside this; so we can wait for it regardless of timeout
{
     waiterCount = SELECT COUNT(*) FROM tempdb.sys.tables WHERE name like '##' + name + 'marker%';
     var markerTable = '##' + name + 'marker' + GUID;
     CREATE TABLE markerTable;

     if (waiterCount < maxCount) { // space available; just take it
         for (i = 0; i < maxCount; ++i) {
             ticketLock = name + i;
             if (tryAcquire(ticketLock)) { return (ticketLock, markerTable); }
         }
         throw "should have acquired";
     }
}

// the idea of the busy wait lock is that threads queue up here to do a spin wait.
// This both ensures fairness (one spinner at a time) and should prevent too much CPU
// load with many waiters (all but one blocked)
busyWaitLock = name + '_busyWait'
if (tryLock(busyWaitLock, timeout)) {
    while (!timeout expired) {
           for (var i = 0; i < maxCount; ++i) {
                 ticketLock = name + i;
                 if (tryAcquire(ticketLock)) { return (ticketLock, markerTable); }
           }
    }
}

// failed acquire
DROP TABLE markerTable;

To clean up, you just drop the marker temp table along with the ticket lock you took. The full code is in https://github.com/madelson/DistributedLock/blob/master/DistributedLock/Sql/SqlSemaphore.cs#L347

There are some additional complexities there dealing with cancellation and lock reentrance.

Give it a try and let me know how it goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants