[API Proposal]: Add GetOrAddAsync to ConcurrentDictionary #83636

Open
jairbubbles opened this issue Mar 18, 2023 · 34 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Collections
Milestone

Comments

@jairbubbles

jairbubbles commented Mar 18, 2023

EDIT: removed AddOrUpdateAsync based on comments in the thread (see @theodorzoulias's comment) and changed the return type to ValueTask.

Background and motivation

ConcurrentDictionary.GetOrAdd is pretty useful for storing results in a thread-safe way. But when your code is async, it's missing an overload that accepts an async factory.

The current proposal is pretty straightforward: the idea is to introduce GetOrAddAsync methods whose factory returns a Task<TValue> rather than a TValue.

To be more consistent I also propose to add AddOrUpdateAsync overloads.

API Proposal

namespace System.Collections.Concurrent;

public class ConcurrentDictionary<TKey, TValue> : IDictionary<TKey, TValue>, IDictionary, IReadOnlyDictionary<TKey, TValue> where TKey : notnull
{
    public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
+    public ValueTask<TValue> GetOrAddAsync(TKey key, Func<TKey, Task<TValue>> valueFactory)
    public TValue GetOrAdd<TArg>(TKey key, Func<TKey, TArg, TValue> valueFactory, TArg factoryArgument)
+    public ValueTask<TValue> GetOrAddAsync<TArg>(TKey key, Func<TKey, TArg, Task<TValue>> valueFactory, TArg factoryArgument)
}

API Usage

var results = new ConcurrentDictionary<int, string>();
await Parallel.ForEachAsync(Enumerable.Range(1, 1000), async (i, _) =>
{
    var result = await results.GetOrAddAsync(i, GetSomethingAsync);
});

async Task<string> GetSomethingAsync(int i)
{
    await Task.Delay(10);
    return "Test";
}

Alternative Designs

  • Use an extension method as the base class exposes everything that's needed. (see @stephentoub's comment)

  • Add another class that would be async by nature?

  • Use a Dictionary<TKey, Task<TValue>> rather than a Dictionary<TKey, TValue> (see @Clockwork-Muse's comment)

  • Do not use ConcurrentDictionary when your code is async? Though I would add that when migrating code from sync to async you don't want to change everything.

Risks

I'm not sure it scales well to add an overload for each sync method.

@jairbubbles jairbubbles added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 18, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 18, 2023
@ghost

ghost commented Mar 18, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

ConcurrentDictionary.GetOrAdd is pretty useful to store results in a thread safe way. But when your code is async, it's missing an overload to pass it a task.

The current proposal is pretty straightforward, the idea is to introduce GetOrAddAsync methods that would use a Task<TValue> rather than a Value.

To be more consistent I also propose to add AddOrUpdateAsync overloads.

API Proposal

namespace System.Collections.Concurrent;

public class ConcurrentDictionary<TKey, TValue> : IDictionary<TKey, TValue>, IDictionary, IReadOnlyDictionary<TKey, TValue> where TKey : notnull
{
    public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
+    public Task<TValue> GetOrAddAsync(TKey key, Func<TKey, Task<TValue>> valueFactory)
    public TValue GetOrAdd<TArg>(TKey key, Func<TKey, TArg, TValue> valueFactory, TArg factoryArgument)
+    public Task<TValue> GetOrAddAsync<TArg>(TKey key, Func<TKey, TArg, Task<TValue>> valueFactory, TArg factoryArgument)
...
  public TValue AddOrUpdate<TArg>(TKey key, Func<TKey, TArg, TValue> addValueFactory, Func<TKey, TValue, TArg, TValue> updateValueFactory, TArg factoryArgument)
+  public Task<TValue> AddOrUpdateAsync<TArg>(TKey key, Func<TKey, TArg, Task<TValue>> addValueFactory, Func<TKey, TValue, TArg, Task<TValue>> updateValueFactory, TArg factoryArgument)
  public TValue AddOrUpdate(TKey key, Func<TKey, TValue> addValueFactory, Func<TKey, TValue, TValue> updateValueFactory)
+  public Task<TValue> AddOrUpdateAsync(TKey key, Func<TKey, Task<TValue>> addValueFactory, Func<TKey, TValue, Task<TValue>> updateValueFactory)
  public TValue AddOrUpdate(TKey key, TValue addValue, Func<TKey, TValue, TValue> updateValueFactory)
+  public Task<TValue> AddOrUpdateAsync(TKey key, TValue addValue, Func<TKey, TValue, Task<TValue>> updateValueFactory)
}

API Usage

var results = new ConcurrentDictionary<int, string>();
await Parallel.ForEachAsync(Enumerable.Range(1, 1000), async (i, _) =>
{
    await results.GetOrAddAsync(i, GetSomethingAsync);
});

async Task<string> GetSomethingAsync(int i)
{
    await Task.Delay(10);
    return "Test";
}

Alternative Designs

Add another class that would be async by nature?

Do not use ConcurrentDictionary when your code is async? Though I would add that when migrating code from sync to async you don't want to change everything.

Risks

I'm not sure it scales well to add an overload for each sync method.

Author: jairbubbles
Assignees: -
Labels:

api-suggestion, area-System.Collections

Milestone: -

@jairbubbles jairbubbles changed the title [API Proposal]: Add GetOrAddAsync to ConcurrentDictionary [API Proposal]: Add GetOrAddAsync/AddOrUpdateAsync to ConcurrentDictionary Mar 18, 2023
@MichalPetryka
Contributor

So those overloads would complete synchronously in case the factories aren't called, is that right? In such case, those methods should return ValueTask<T>, not Task<T> since synchronous completion will be the more common case.

@Clockwork-Muse
Contributor

.... There's nothing preventing using ConcurrentDictionary in async code, and the existing methods are non-blocking (or locks are held for a vanishingly short period of time). What benefit do you see async methods bringing?
Note that for at least GetOrAdd, simply storing the Task<T> allows you to produce equivalent behavior. For AddOrUpdateAsync it might also be equivalent, depending on what you envision the "locking strategy" to be.

@MichalPetryka
Contributor

.... There's nothing preventing using ConcurrentDictionary in async code, and the existing methods are non-blocking (or locks are held for a vanishingly short period of time). What benefit do you see async methods bringing? Note that for at least GetOrAdd, simply storing the Task<T> allows you to produce equivalent behavior. For AddOrUpdateAsync it might also be equivalent, depending on what you envision the "locking strategy" to be.

The async here is for the case when the factory callback performs async operations like web requests.

@Clockwork-Muse
Contributor

From the documentation for AddOrUpdate():

For modifications and write operations to the dictionary, ConcurrentDictionary<TKey,TValue> uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) The addValueFactory and updateValueFactory delegates may be executed multiple times to verify the value was added or updated as expected. However, they are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, AddOrUpdate is not atomic with regards to all other operations on the ConcurrentDictionary<TKey,TValue> class.

.... which means that, if we keep this behavior, then simply storing the Task would be equivalent.

The async here is for the case when the factory callback performs async operations like web requests.

Do note that because of the above, whatever web requests you're making would need to be idempotent (even in the case that these methods were implemented)

@theodorzoulias
Contributor

theodorzoulias commented Mar 18, 2023

What is the expected behavior of the proposed APIs? Are the valueFactory, addValueFactory, updateValueFactory delegates expected to be invoked just once, or you are expecting to see multiple concurrent invocations in case multiple asynchronous execution flows call the GetOrAddAsync and AddOrUpdateAsync APIs concurrently for the same key? Specifically with the AddOrUpdateAsync, are you expecting that a single call to AddOrUpdateAsync might invoke the updateValueFactory delegate multiple times by the current async flow alone?

@jairbubbles
Author

So those overloads would complete synchronously in case the factories aren't called, is that right? In such case, those methods should return ValueTask, not Task since synchronous completion will be the more common case.

Wasn't clear to me if I should use a ValueTask or a Task, now it is. Thx!

.... There's nothing preventing using ConcurrentDictionary in async code, and the existing methods are non-blocking (or locks are held for a vanishingly short period of time). What benefit do you see async methods bringing?

It's mainly to have a nice syntax, similar to the sync one. BTW you'll want to await if you need the result right away:

var result = await cache.GetOrAddAsync(key, GetSomethingAsync);

Note that for at least GetOrAdd, simply storing the Task allows you to produce equivalent behavior.

You're right, that's a good counter argument to not add those overloads 🤔

What is the expected behavior of the proposed APIs? Are the valueFactory, addValueFactory, updateValueFactory delegates expected to be invoked just once, or you are expecting to see multiple concurrent invocations in case multiple asynchronous execution flows call the GetOrAddAsync and AddOrUpdateAsync APIs concurrently for the same key? Specifically with the AddOrUpdateAsync, are you expecting that a single call to AddOrUpdateAsync might invoke the updateValueFactory delegate multiple times by the current async flow alone?

I didn't think much about AddOrUpdateAsync as my use case is for GetOrAddAsync. In this case I would expect the same behavior as the sync method: if two concurrent invocations add the same key, the valueFactory will be called twice.
Usually, we store a Lazy<TValue> when we want to prevent that, but I think we would be missing an AsyncLazy then 🤔

@davidfowl
Member

I've had this gist, that I've been using in various personal projects for a while.

@jairbubbles
Author

jairbubbles commented Mar 18, 2023

With AsyncLazy it's not bad:

using System.Collections.Concurrent;
using DotNext.Threading;

var results = new ConcurrentDictionary<int, AsyncLazy<string>>();

await Parallel.ForEachAsync(Enumerable.Range(1, 100), async (i, cancellationToken) =>
{
    var result = await results.GetOrAdd(0, x => new AsyncLazy<string>(async t =>
    {
        Console.WriteLine($"Compute {x}");
        await Task.Delay(10, t);
        return $"Result {x}";
    })).WithCancellation(cancellationToken);
    Console.WriteLine(result);
});

@davidfowl Could you explain why in your gist you don't insert in the dictionary the factory task directly?

EDIT: It's fine with Lazy too:

using System.Collections.Concurrent;

var results = new ConcurrentDictionary<int, Lazy<Task<string>>>();

await Parallel.ForEachAsync(Enumerable.Range(1, 100), async (i, cancellationToken) =>
{
    var result = await results.GetOrAdd(0, x => new Lazy<Task<string>>(async () =>
    {
        Console.WriteLine($"Compute {x}");
        await Task.Delay(10);
        return $"Test {x}";
    })).Value;
    Console.WriteLine(result);
});

@jairbubbles
Author

jairbubbles commented Mar 18, 2023

Humm it's linked to "//The factory method will only run once." ? You return the task so that callers can await this one directly?

Aren't you missing a lock so that only one TaskCompletionSource is created?

EDIT: ok I see the TryAdd is locking so it's fine.
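For readers without the gist at hand, here is a minimal sketch of the general "run the factory once per key" idea being discussed. It is not the gist itself: the helper name GetOrAddOnceAsync, the LoadAsync factory, and the gate are all hypothetical, and it uses the GetOrAdd(key, value) overload to publish a TaskCompletionSource's task so that only the caller whose task was stored runs the factory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var cache = new ConcurrentDictionary<string, Task<string>>();
var gate = new TaskCompletionSource<bool>();
int factoryCalls = 0;

// Hypothetical slow factory: counts invocations and waits until the gate opens.
async Task<string> LoadAsync(string key)
{
    Interlocked.Increment(ref factoryCalls);
    await gate.Task;
    return $"value for {key}";
}

// Three callers ask for the same key before the first load has finished.
Task<string>[] callers =
{
    GetOrAddOnceAsync(cache, "k", LoadAsync),
    GetOrAddOnceAsync(cache, "k", LoadAsync),
    GetOrAddOnceAsync(cache, "k", LoadAsync),
};
gate.SetResult(true);
string[] results = await Task.WhenAll(callers);
Console.WriteLine($"factory calls: {factoryCalls}, first result: {results[0]}");

// Assumed helper (not a framework API): publish a placeholder task, and let only
// the caller whose placeholder was stored run the factory.
static async Task<TValue> GetOrAddOnceAsync<TKey, TValue>(
    ConcurrentDictionary<TKey, Task<TValue>> d,
    TKey key,
    Func<TKey, Task<TValue>> valueFactory)
    where TKey : notnull
{
    // Fast path: a task for this key is already stored.
    if (d.TryGetValue(key, out var existing))
        return await existing;

    var tcs = new TaskCompletionSource<TValue>(TaskCreationOptions.RunContinuationsAsynchronously);
    Task<TValue> stored = d.GetOrAdd(key, tcs.Task);

    // Another caller published its task first: await that one instead.
    if (!ReferenceEquals(stored, tcs.Task))
        return await stored;

    try { tcs.SetResult(await valueFactory(key)); }
    catch (Exception ex) { tcs.SetException(ex); } // the faulted task stays in the dictionary
    return await tcs.Task;
}
```

In this sketch a failed load stays cached as a faulted task, so whether and when to evict it remains a design choice for the caller.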

@theodorzoulias
Contributor

theodorzoulias commented Mar 18, 2023

Imagine that you have a ConcurrentDictionary<string, int>, and one asynchronous flow that increments the value of a key every second:

ConcurrentDictionary<string, int> dict = new();
PeriodicTimer timer = new(TimeSpan.FromSeconds(1));
while (await timer.WaitForNextTickAsync())
    dict.AddOrUpdate("Key1", 1, (_, existing) => existing + 1);

Then you add another asynchronous flow, that uses the proposed API AddOrUpdateAsync. This one decrements the value of the same key once, after awaiting an async operation that has a duration of 2 seconds:

await dict.AddOrUpdateAsync("Key1", 1, async (_, existing) =>
{
    await Task.Delay(TimeSpan.FromSeconds(2));
    return existing - 1;
});

The second asynchronous flow will deadlock, because the AddOrUpdateAsync will never complete. After the updateValueFactory delegate completes, the AddOrUpdateAsync will find that the "Key1" now has a different value than before, so it will invoke the updateValueFactory again and again for an eternity. Isn't this a problem for the proposed API?
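To make the retry behavior concrete, here is a minimal sketch of how such a method could be composed from TryGetValue/TryAdd/TryUpdate, assuming it keeps the optimistic retry loop of the sync AddOrUpdate. The helper name AddOrUpdateAsyncSketch is hypothetical, and the competing timer flow is simulated by bumping the value inside the factory for the first three attempts so that the run terminates.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<string, int>();
dict["Key1"] = 1;
int factoryCalls = 0;

int result = await AddOrUpdateAsyncSketch(dict, "Key1", 1, async (key, existing) =>
{
    factoryCalls++;
    // Simulate the other flow incrementing the value while this factory is "awaiting".
    if (factoryCalls <= 3)
        dict.AddOrUpdate(key, 1, (_, v) => v + 1);
    await Task.Yield();
    return existing - 1;
});

Console.WriteLine($"result: {result}, update factory calls: {factoryCalls}");

// Assumed composition (not a framework API): retry until the value observed before
// the await is still the stored value when the update is attempted.
static async ValueTask<TValue> AddOrUpdateAsyncSketch<TKey, TValue>(
    ConcurrentDictionary<TKey, TValue> d,
    TKey key,
    TValue addValue,
    Func<TKey, TValue, Task<TValue>> updateValueFactory)
    where TKey : notnull
{
    while (true)
    {
        if (d.TryGetValue(key, out var existing))
        {
            TValue updated = await updateValueFactory(key, existing);
            // Succeeds only if nobody changed the value during the await.
            if (d.TryUpdate(key, updated, existing))
                return updated;
        }
        else if (d.TryAdd(key, addValue))
        {
            return addValue;
        }
    }
}
```

With a writer that changes the value more often than the factory can complete, the TryUpdate in this loop would keep failing, which is the non-terminating scenario described above.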

@stephentoub
Member

stephentoub commented Mar 18, 2023

I don't foresee us adding async methods to ConcurrentDictionary. The existing helpers like GetOrAdd are there only because they represent a significant use case, but they can also be implemented on top of the primitive TryGetValue/TryAdd/TryRemove/TryUpdate methods, as can the proposed methods, which can also use those same helpers. For example, the proposed public Task<TValue> GetOrAddAsync(TKey key, Func<TKey, Task<TValue>> valueFactory) can be implemented as a trivial extension like:

public static async ValueTask<TValue> GetOrAddAsync<TKey, TValue>(this ConcurrentDictionary<TKey, TValue> d, TKey key, Func<TKey, ValueTask<TValue>> valueFactory) =>
    d.TryGetValue(key, out TValue value) ? value : d.GetOrAdd(key, await valueFactory(key));

If you want different semantics, obviously you can compose it however you'd like.
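As an illustration of the semantics this composition gives, the sketch below runs the same logic as a plain local helper (not an extension method, only to keep the sample a single top-level program). The Factory function and the gate are hypothetical. Two callers request the same key while the factory is still pending: both invoke the factory, but only one value ends up stored and both callers observe it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var cache = new ConcurrentDictionary<string, int>();
var gate = new TaskCompletionSource<bool>();
int factoryCalls = 0;

// Hypothetical factory: waits on the gate so both callers are in flight together.
async ValueTask<int> Factory(string key)
{
    int call = Interlocked.Increment(ref factoryCalls);
    await gate.Task;
    return call * 10;
}

ValueTask<int> first = GetOrAddAsync(cache, "k", Factory);
ValueTask<int> second = GetOrAddAsync(cache, "k", Factory);
gate.SetResult(true);
int a = await first;
int b = await second;

Console.WriteLine($"factory calls: {factoryCalls}, same value for both callers: {a == b}");

// Same body as the one-liner above, written as a local helper for this sample.
static async ValueTask<TValue> GetOrAddAsync<TKey, TValue>(
    ConcurrentDictionary<TKey, TValue> d,
    TKey key,
    Func<TKey, ValueTask<TValue>> valueFactory)
    where TKey : notnull =>
    d.TryGetValue(key, out var value) ? value : d.GetOrAdd(key, await valueFactory(key));
```

This mirrors the sync GetOrAdd behavior: the factory can run once per concurrent caller, and the losing results are discarded.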

@davidfowl
Member

davidfowl commented Mar 19, 2023

This is a very common caching use case (that's why the MemoryCache has an extension method), especially when it prevents the cache stampede problem. I don't know if that means we should add it to concurrent dictionary, but I'd reckon it's common enough to put somewhere in the core libraries.

@jairbubbles
Author

jairbubbles commented Mar 19, 2023

This is a very common caching use case (that's why the MemoryCache has an extension method), especially when it prevents the cache stampede problem.

So you would go with a locking strategy similar to the one in your gist then?

I feel like it's a nicer behavior but it's not consistent with the original GetOrAdd.

I don't know if that means we should add it to concurrent dictionary, but I'd reckon it's common enough to put somewhere in the core libraries.

Yep, that's the initial goal of my proposal: make it built-in. An extension method would be fine too, just a little less discoverable.

When migrating code from sync to async, you have a lot of modifications to make, and I believe it's important that the framework helps with the migration path. Async is viral, so when you start modifying one method you end up modifying a dozen other methods, and if it ends up touching too much code, you can get lazy and do "sync over async".

The second asynchronous flow will deadlock, because the AddOrUpdateAsync will never complete.

Maybe I should remove AddOrUpdateAsync from the proposal? I added it when looking at the code but it seems it's a more complicated use case.

Should I just edit the description / title?

@stephentoub
Member

stephentoub commented Mar 19, 2023

This is a very common caching use case (that's why the MemoryCache has an extension method), especially when it prevents the cache stampede problem

Your gist's solution to that is a completely different method than what's being proposed and by its very nature isn't one that can be added to CD but would need to remain an extension method as it operates on a specific TValue only. If there's such an alternate proposal to discuss, we can, but that's not what this issue was proposing nor what I was answering. It also is one specific take on how to address that problem, and means that you don't get the alternate behavior you might want sometimes, of actually allowing concurrent adds to compete, not serializing behind a potential failure, not failing all concurrent requests when the first fails, etc. And its behavior diverges from the sync counterpart beyond just using asynchrony, and such divergence is something we strive to avoid as it means the behavior of code changes just by going async.

@jairbubbles jairbubbles changed the title [API Proposal]: Add GetOrAddAsync/AddOrUpdateAsync to ConcurrentDictionary [API Proposal]: Add GetOrAddAsync to ConcurrentDictionary Mar 19, 2023
@theodorzoulias
Contributor

Maybe I should remove AddOrUpdateAsync from the proposal? I added it when looking at the code but it seems it's a more complicated use case.

Should I just edit the description / title?

I am not sure what is the protocol here about updating API proposals, but my guess is that removing the AddOrUpdateAsync by editing the body and the title of the proposal should be OK.

The GetOrAddAsync is surely less problematic than the AddOrUpdateAsync, because the valueFactory is invoked at most once by each asynchronous flow. Personally I am not an enthusiastic supporter of this API either. It is likely to be used by people who want an asynchronous cache, and get a suboptimal behavior they don't really want. I think that most potential users of the proposed GetOrAddAsync would much prefer to use @davidfowl's gist instead, with a ConcurrentDictionary<TKey, Task<TValue>>. Filling the cache by invoking the valueFactory only once, regardless of how many execution flows request the value of the same key at the same time, should be the desirable behavior in most cases IMHO.

@jairbubbles
Author

I just edited the description / title.

As for using a ConcurrentDictionary<TKey, Task<TValue>>, I agree that it's a good alternative design that makes my proposal less attractive. But as I took the time to write it up, I still like it a bit 😊

It's just a little less obvious when migrating code from sync to async and if the dictionary is still used by some sync code you would need to wrap with a Task.

@Clockwork-Muse
Contributor

The GetOrAddAsync is surely less problematic than the AddOrUpdateAsync, because the valueFactory is invoked at most once by each asynchronous flow.

That's not a given, and the existing sync GetOrAdd methods do not make such a guarantee.

@theodorzoulias
Contributor

That's not a given, and the existing sync GetOrAdd methods do not make such a guarantee.

Based on the source code, it is. First the TryGetValueInternal is called. In case the result is false, then the valueFactory and the TryAddInternal are invoked in succession, and that's it. There is no while (true) loop in there, like in the AddOrUpdate method. The documentation is ambiguous though (emphasis added):

Since a key/value can be inserted by another thread while valueFactory is generating a value, you cannot trust that just because valueFactory executed, its produced value will be inserted into the dictionary and returned. If you call GetOrAdd simultaneously on different threads, valueFactory may be called multiple times, but only one key/value pair will be added to the dictionary.

The "called multiple times" could mean either multiple times on the current thread, or multiple times on multiple threads (but at most once per thread). In reality the second meaning is the correct one.
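The flow described above can be sketched with public members only. This is an approximation, not the actual source: the helper name GetOrAddSketch is hypothetical, and GetOrAdd(key, value) stands in for the internal TryAddInternal call. A lost race is simulated by having the factory itself insert a competing value before returning.

```csharp
using System;
using System.Collections.Concurrent;

var d = new ConcurrentDictionary<string, string>();
int factoryCalls = 0;

string result = GetOrAddSketch(d, "k", key =>
{
    factoryCalls++;
    // Simulate another thread winning the race while this factory is running.
    d.TryAdd(key, "from another thread");
    return "mine";
});

Console.WriteLine($"result: {result}, factory calls: {factoryCalls}");

// Approximation of the described flow: one lookup, one factory call, one add attempt.
static TValue GetOrAddSketch<TKey, TValue>(
    ConcurrentDictionary<TKey, TValue> dict,
    TKey key,
    Func<TKey, TValue> valueFactory)
    where TKey : notnull
{
    if (dict.TryGetValue(key, out var existing))
        return existing;

    // Invoked at most once per call: there is no retry loop here.
    TValue candidate = valueFactory(key);

    // Adds the candidate, or returns whatever another caller stored in the meantime.
    return dict.GetOrAdd(key, candidate);
}
```

The factory ran once and its value was discarded, which matches the "at most once per thread, but possibly on several threads" reading.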

@Clockwork-Muse
Contributor

Ah, I missed the "each asynchronous flow" part of your comment. So yes.

That doesn't really solve anything, though - generally the problem is going to be that it gets invoked multiple times at all, not 'single time per thread'.

@theodorzoulias
Contributor

@Clockwork-Muse pardon me a small digression. Some time ago we had a long discussion about the expected effects of a delegate failure. What would be your expectation in case the valueFactory throws an exception? Would you assume that the ConcurrentDictionary<K,V> is corrupted after the error?

@Clockwork-Muse
Contributor

Uh.
Maybe.
Things start getting weird once multiple threads get involved. And the closer you swing such a collection to a database the closer it should be to having "I/O" exceptions. In particular, the moment you start making web requests you have I/O, and should react accordingly.

Normally it's not an issue for ConcurrentDictionary because the lambdas are supposed to be short-lived and do minimal work, and any exceptions thrown are going to be panic-worthy (ie, OutOfMemoryException) or unrecoverable anyways (ArgumentNullException).

@theodorzoulias
Contributor

theodorzoulias commented Mar 20, 2023

@Clockwork-Muse yep, it would be unusual to use the valueFactory of the GetOrAdd for something heavier than creating an empty List<T> or a new SemaphoreSlim. But the GetOrAddAsync API proposed here, with its asynchronous valueFactory, is practically asking to be used for I/O work, which not only typically comes with much higher latency, but also with the risk of exogenous exceptions that make things messy. In many cases handling these exceptions inside the valueFactory would be problematic, and letting them propagate through the GetOrAddAsync would raise the question about the integrity of the internal state of the ConcurrentDictionary<K,V>. Which makes an argument for not adding this API to the collection.

@jairbubbles
Author

In many cases handling these exceptions inside the valueFactory would be problematic

I guess it depends on the use case, you might want to put a special value in the dictionary when it fails. For instance you get a FileNotFoundException, the resource is not available and you store a null. It's really a design choice.

and letting them propagate through the GetOrAddAsync would raise the question about the integrity of the internal state of the

If the valueFactory throws an exception, it would just not be added to the dictionary, would it?

@Clockwork-Muse
Contributor

@jairbubbles - Maybe. Depending on implementation of the collection, if the factory throws an error it could permanently corrupt the collection in subtle (or hopefully not-so subtle) ways. This is almost always the case for single-threaded collections that accept lambdas (since usually if those fail your entire program is borked anyways).

The problem for a collection like this is what happens if the collection is implemented in such a way that while the first call is running a second call comes in and gets the task from the first call (as it does in that example gist). The item has been added, so it now returns an exception for that item when you await it.

Is that important? 🤷 What do you want the application to do?
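A small sketch of that scenario, assuming a ConcurrentDictionary<string, Task<string>> and a hypothetical always-failing factory (FailingFactory): the faulted task stays in the dictionary, so a later caller gets the same exception without the factory running again, until the entry is explicitly evicted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var cache = new ConcurrentDictionary<string, Task<string>>();
int factoryCalls = 0;

// Hypothetical factory standing in for a failing I/O call.
async Task<string> FailingFactory(string key)
{
    Interlocked.Increment(ref factoryCalls);
    await Task.Yield();
    throw new InvalidOperationException($"could not load {key}");
}

Task<string> first = cache.GetOrAdd("k", FailingFactory);
Task<string> second = cache.GetOrAdd("k", FailingFactory); // finds the stored task

bool sameTask = ReferenceEquals(first, second);
int failures = 0;
foreach (Task<string> t in new[] { first, second })
{
    try { await t; }
    catch (InvalidOperationException) { failures++; }
}

// One possible policy: evict the entry only if it still holds the faulted task,
// so that a later call can retry.
bool removed = cache.TryRemove(new KeyValuePair<string, Task<string>>("k", first));

Console.WriteLine($"factory calls: {factoryCalls}, same task: {sameTask}, failures: {failures}, evicted: {removed}");
```

Whether to evict, keep, or replace the failed entry is exactly the application-level decision raised above; the sketch only shows one option.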

@jairbubbles
Author

The problem for a collection like this is what happens if the collection is implemented in such a way that while the first call is running a second call comes in and gets the task from the first call (as it does in that example gist).

The gist is for a ConcurrentDictionary<Tkey, Task<TValue>> if you store a TValue, I don't think you can't have multiple callers waiting on the same task.

Is that important? 🤷 What do you want the application to do?

My use case was pretty simple and switching to a ConcurrentDictionary<Tkey, Task<TValue>> was perfectly fine but I don't believe it's a very obvious change to make. So I would say the current proposal is really about simplifying the life of developers that are migrating code from sync to async.

So the real question would be: do we want to help users in that migration path? Is it important to make their life easier? I would say it's the essence of .NET 😊

@Clockwork-Muse
Contributor

I don't think you can't have multiple callers waiting on the same task.

The gist provided earlier explicitly sets things up so multiple callers will wait on the same task. You're right you can't have multiple initiations of the same task, but (for Task, not ValueTask) multiple waiters is allowed and sometimes (as here) very much encouraged.
For that matter, any use of ConcurrentDictionary<Tkey, Task<TValue>> will require multiple callers waiting on the same Task, the only question really being "what is the state of the stored task?".

@theodorzoulias
Contributor

theodorzoulias commented Mar 21, 2023

I don't think you can't have multiple callers waiting on the same task.

This is correct. Waiting the same Task on multiple threads is OK. The only issue is that in case the task fails, the stack trace of the exception can be messed up.

My use case was pretty simple and switching to a ConcurrentDictionary<Tkey, Task<TValue>> was perfectly fine

Hmm, why did you prefer David Fowler's gist over Stephen Toub's one-liner? Stephen Toub's GetOrAddAsync is exactly what you have asked for, and it has also significantly less code than the gist. If you don't think that it's good enough for you, why would it be good enough for anyone else? 🤔

@jairbubbles
Author

jairbubbles commented Mar 22, 2023

My current use case is very simple, all the parallel tasks are inserting with their own key so I don't need David Fowler's gist.

With no extension method one can write:

var results = new ConcurrentDictionary<int, Task<string>>();
await Parallel.ForEachAsync(Enumerable.Range(1, 100), async (i, cancellationToken) =>
{
	await results.GetOrAdd(i, async x => 
	{
		await Task.Delay(10);
		return $"Test {x}";
	});
});

(see https://dotnetfiddle.net/OcMxbM)

With Stephen Toub's one liner I can write:

var results = new ConcurrentDictionary<int, string>();
await Parallel.ForEachAsync(Enumerable.Range(1, 100), async (i, cancellationToken) =>
{
	await results.GetOrAddAsync(i, async x => 
	{
		await Task.Delay(10);
		return $"Test {x}";
	});
});

(see https://dotnetfiddle.net/cE1Ma6)

The original code was sync and looked like that:

var results = new ConcurrentDictionary<int, string>();

Parallel.ForEach(Enumerable.Range(1, 100), i =>
{
	results.GetOrAdd(i, x => 
	{
		Thread.Sleep(10);
		return $"Test {x}";
	});
});

(see https://dotnetfiddle.net/TNptVk)

With the second option I don't change results so I don't need to modify other parts of the application which were using the results.

@theodorzoulias
Contributor

With the second option I don't change results so I don't need to modify other parts of the application which were using the results.

So, assuming that the goal was to migrate from sync to async as smoothly as possible, why did you switch from the original simple ConcurrentDictionary<int, string> to the more complex ConcurrentDictionary<int, Task<string>>? Or you haven't switched yet, and you are still thinking of it?

@jairbubbles
Author

In fact my first migration was like that:

var results = new ConcurrentDictionary<int, string>();
await Parallel.ForEachAsync(Enumerable.Range(1, 100), async (i, cancellationToken) =>
{
    var result = await DoSomethingAsync(i);
    results.TryAdd(i, result);
});

As I don't need to get, I know it's not in the dictionary.

It's a bit of a digression, but the more I think about it, the more I feel we should be able to get a result back from await Parallel.ForEachAsync. Pseudocode like this would express my intent better:

var results = await Parallel.ForEachAsync(Enumerable.Range(1, 100), DoSomethingAsync);

// And results would wrap that in a dictionary so that I can access the result efficiently

@theodorzoulias
Contributor

This has been proposed (and rejected) in: Provide a result returning Parallel.ForEachAsync overload.

@jairbubbles
Author

😮

But it was returning an IEnumerable... not very efficient for consumption afterwards. I'll create a new proposal! 😉

@theodorzoulias
Contributor

theodorzoulias commented Mar 22, 2023

That proposal includes Parallel.ForEachAsync overloads that return both Task<IReadOnlyList<TResult>> and IAsyncEnumerable<TResult>. Which is problematic, because the C# compiler can't resolve overloads that differ only on their return type. I guess that Task<IReadOnlyList<TResult>> is just a fancy way to say Task<TResult[]>. Microsoft won't add this API until/unless there is a huge demand for it. I guess submitting new proposals is one way to manifest such a demand, since old proposals are locked after a short period of inactivity, and people can't upvote or comment on them.

@layomia layomia removed the untriaged New issue has not been triaged by the area owner label May 10, 2023
@layomia layomia added this to the Future milestone May 10, 2023

7 participants