Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method RemoveOrUpdate to ConcurrentDictionary #23644

Open
alewmt opened this issue Sep 26, 2017 · 31 comments
Open

Add method RemoveOrUpdate to ConcurrentDictionary #23644

alewmt opened this issue Sep 26, 2017 · 31 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Milestone

Comments

@alewmt
Copy link

alewmt commented Sep 26, 2017

It would be very useful to have the ability to conditionally remove or update ConcurrentDictionary entry. It already has AddOrUpdate. Right now I can't use ConcurrentDictionary in ref counting scenarios, because I can't check reference count and remove it if value is 0 and forced to use common Dictionary with lock.

@jnm2
Copy link
Contributor

jnm2 commented Sep 26, 2017

With AddOrUpdate you don't get to decide whether to add or update. It adds if the key doesn't exist and updates if it does.

With your RemoveOrUpdate, it sounds like you want a way to decide whether to remove or update. Why is this necessary for reference counting? When the count gets to zero, what's the issue you're facing with leaving zero in the dictionary?

@Clockwork-Muse
Copy link
Contributor

To clarify @jnm2, you should be able to counterfeit "removal" by checking count during the "get" update and recreate the item if the count is 0 (presumably the item was previously disposed when the count was set to 0).

@jnm2 -
Separately, I wonder if a bool TryRemove(TKey key, out TValue value, Func<TValue, bool> currentValuePredicate) overload would be useful?

@alewmt
Copy link
Author

alewmt commented Sep 26, 2017

In the case of AddOrUpdate decision based on existence of an entry. I suggest symmetric functionality that requires decision based on a value of an entry.
I can’t just leave zeroes. It causes OutOfMemoryException.

dictionary.AddOrUpdate(id, new ValueTuple(1, new object()), (k,v) => new ValueTuple(v.Item1 + 1, v.Item2));
// ...
dictionary.RemoveOrUpdate(id, (k,v) => v.Item1 == 1/*remove entry if condition is true*/,  (k,v) => new ValueTuple(v.Item1 - 1, v.Item2)); 

ConcurrentDictionary with current API is useless for this case.

@Clockwork-Muse
Copy link
Contributor

Since I doubt you have that many ValueTuples, what's the type of v, and can you set it to null?

@alewmt
Copy link
Author

alewmt commented Sep 26, 2017

In my case actual type of v is AsyncLock
How can it help if I set it to null?

@jnm2
Copy link
Contributor

jnm2 commented Sep 26, 2017

What if you use SemaphoreSlim which provides WaitAsync which my guess would be that AsyncLock is built on? That way you move the reference counting into the object and don't need a tuple.

@jnm2
Copy link
Contributor

jnm2 commented Sep 26, 2017

I wonder if I can bother @StephenCleary - what's the best way to implement a keyed semaphore? Seems like mixing and matching synchronization primitives may not be ideal.

@Clockwork-Muse
Copy link
Contributor

Hm, the other option would be just to remove the entry (capture the value during the initial update to check later), than add/update it again if it turned out to have been updated in between the update-for-decrement and remove. Of course, that would mean the reference count could go negative, but we don't particularly care what the actual value is, so long as it's not zero

@alewmt
Copy link
Author

alewmt commented Sep 27, 2017

@jnm2 Shared object actually is not shared. This is asp.net core app, so every request restores personal object from DB. There are many of them point to the same DB entity simultaneously. So there is no way to move sync object to the Object.

@alewmt
Copy link
Author

alewmt commented Sep 27, 2017

@Clockwork-Muse If you remove an entry thinking that reference count is zero, but it is not. And after that somebody will create new entry, you will have two objects with two different AsyncLocks.

@jnm2
Copy link
Contributor

jnm2 commented Sep 27, 2017

@alewmt I meant stop using AsyncLock and use SemaphoreSlim instead- the rest stays the same.

@alewmt
Copy link
Author

alewmt commented Sep 27, 2017

@jnm2 Tuple with counter is still required. Otherwise impossible to know when to delete entry

@Clockwork-Muse
Copy link
Contributor

@alewmt - no, you're right, that was a silly proposal. I guess I was just thinking of the reference count, which wouldn't be helpful.

@jnm2 - unfortunately, lock-taken-count (or really, "number of people waiting to enter the semaphore remaining count", given you'd need to check the result of Release()) isn't the same thing as reference count.

@jnm2
Copy link
Contributor

jnm2 commented Sep 27, 2017

Ah, got it. You might be referencing it even though you don't need to use it right then.

@StephenCleary
Copy link
Contributor

I have always avoided keyed semaphores. In every scenario I've been in where I've wanted one, I take a step back and reconsider the design at a higher level. Every time (so far), I've been able to use an alternative design that does not require a keyed semaphore, and it's usually a cleaner design, too.

@alewmt
Copy link
Author

alewmt commented Sep 30, 2017

@StephenCleary Simultaneously received requests require some synchronization. I'm curious what is wrong with keyed semaphores?

@StephenCleary
Copy link
Contributor

As you have been experiencing, semaphore management is a pain. And there's almost always a better place for it that doesn't require keyed semaphores.

With the example you provided, you have a request context object and then a keyed semaphore that maps those requests to semaphores. Well, why not have the request context object contain its own semaphore? It's a cleaner design and removes the need for a keyed semaphore in the first place. Have the context contain its own components rather than looking them up in a shared, static dictionary.

@alewmt
Copy link
Author

alewmt commented Oct 17, 2017

@StephenCleary I’m not sure how it can help abandon keyed semaphores. Two request contexts needs the same semaphore. Certainly, it is possible to move a dictionary of semaphores from application layer to somewhere in a pipeline infrastructure. It just hides usage of keyed semaphore. Did I miss something?

@StephenCleary
Copy link
Contributor

@alewmt I thought you meant that each request would have a semaphore.

If you are talking about some other resource, then just move the semaphore into whatever that resource is. So instead of locking a keyed semaphore based on some lookup, you'd just lock the resource's semaphore directly.

@alewmt
Copy link
Author

alewmt commented Oct 17, 2017

@StephenCleary In my case it is impossible. The resource is an orm entity for DB row. It is instantiated on every request.

@StephenCleary
Copy link
Contributor

@alecont I see. I've never been in that situation, but I can see how it's possible, depending on your db concurrency strategy.

I'd say that modifying ConcurrentDictionary is probably not the best approach, though. I'd recommend using a custom dictionary-like mapping type: a refcount mapping or weak mapping would work. Neither of those facilities exist in the BCL AFAIK.

@jnm2
Copy link
Contributor

jnm2 commented Oct 17, 2017

When I've been in this scenario, I use a database-table-backed mutex implementation that not only locks within the same process but any other instance running on the same machine or a different machine.

Acquiring, maintaining and releasing the lock takes I/O which is slower than concurrency primitives, but in my case, I specifically needed cross-process and cross-machine locking of resources by row ID.
(If you're on SQL Server and you don't want to devote a whole database table to atomic lock-tracking, you can also use the string-based sp_getapplock.)

(Of course, none of this will help you if you're not interested in locking across multiple instances of your application.)

@alewmt
Copy link
Author

alewmt commented Oct 17, 2017

@StephenCleary Thank you for your recommendation.
@jnm2 I don't neet to lock across applications.
In my case semaphore is an optimization. There were too many DbUpdateConcurrencyException and I moved to pessimistic locking.

@Clockwork-Muse
Copy link
Contributor

@alewmt -

I don't neet to lock across applications.
In my case semaphore is an optimization. There were too many DbUpdateConcurrencyException and I moved to pessimistic locking.

Um what? The DbUpdateConcurrencyException is most probably from something else updating the rows, not your current thread/task, right? Either it's a different application (or different instances of the current application), and you're going to need to share locks externally, or you've got multiple tasks/threads in the application attempting to update the same row, and you might be better served to change it so you're not sharing the same item (partition the range, say).

@jnm2
Copy link
Contributor

jnm2 commented Oct 17, 2017

@alewmt If your application is a website, and you're relying on only a single instance of it ever running (per database), and you ever want to scale it horizontally, you're going to have to use some central system like the database for pessimistic locking or you'll have concurrency bugs again. Or eventual consistency, but that also solves your problem by not needing mutexes.

Also say it's website running on IIS. By default (I think) IIS starts up a new instance of your application before shutting down the last one so that recycling doesn't interrupt web traffic.

If your application is a desktop or console app, @Clockwork-Muse's point is exactly right. How are you guaranteeing that no two instances of your app are sharing the same DB?

@Clockwork-Muse
Copy link
Contributor

I was thinking of the web situation, too, really.

If you need to lock a resource, it's usually better to do it at the resource level, not the thing that's running, because otherwise there's a possibility that something previously unanticipated is allowed access to the resource, and now you're in trouble.
(The actual line can be hard or difficult to find, and usually warrants an abstraction. For instance, in @jnm2 's database example, if I anticipated multiple different applications accessing the DB I'd probably make the abstraction point a set of views and stored procedures in the database, to guarantee everybody follows the rules. If only one program used it - possibly with multiple instances - the abstraction point would be the application)

@ianhays
Copy link
Contributor

ianhays commented Oct 23, 2017

I'd say that modifying ConcurrentDictionary is probably not the best approach, though. I'd recommend using a custom dictionary-like mapping type: a refcount mapping or weak mapping would work. Neither of those facilities exist in the BCL AFAIK.

There's also https://github.com/dotnet/corefx/issues/24770 which I think is reasonable, though maybe not exactly the same scenario.

@weitzhandler
Copy link
Contributor

Related: reactiveui/splat#370 (comment)

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@layomia layomia modified the milestones: 5.0.0, Future Jun 24, 2020
@ghost
Copy link

ghost commented Oct 28, 2021

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of the experimental issue cleanup initiative we are currently trialing in a limited number of areas. Please share any feedback you might have in the linked issue.

@alewmt
Copy link
Author

alewmt commented Oct 29, 2021

up

@ghost ghost removed the no-recent-activity label Oct 29, 2021
@eiriktsarpalis
Copy link
Member

What would be the proposed API shape for such a method? Does the proposed method account for cases where it might make sense to add, update or remove an entry depending on the current state?

Note that a transaction on a given key could support four possible outcomes, depending on the current state of the dictionary:

Precondition Outcome Operation Name
Key not found Key removed No-Op
Key not found Key added Add
Key found Key removed Remove
Key found Key added Update

Assuming we had some form of optional type it might be possible to expose a method that supports all four combinations dynamically:

public partial class ConcurrentDictionary<TKey, TValue>
{
    public Option<TValue> Transact(TKey key, Func<TKey, Option<TValue>, Option<TValue>> updater);
    public Option<TValue> Transact<TState>(TKey key, Func<TState, TKey, Option<TValue>, Option<TValue>> updater, TState state);
}

which could be used as follows (using pseudo C# DU syntax):

var dictionary = new ConcurrentDictionary<string, int>();
dictionary.Transact("key", value => value switch
    {
        case None => None, // attempting to remove a key that doesn't exist, do not make any changes
        case Some { Value: <= 1 } => Option<T>.None, // value decreased to zero, remove key
        case Some { Value: int v } => Option<T>.Some(--v), // update entry with decremented counter
    });

This pattern might not be possible until C# receives discriminated union support but I think it might be worth the wait before we do something in this space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Projects
None yet
Development

No branches or pull requests

10 participants