Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add task pool #37

Open
wants to merge 22 commits into
base: master
Choose a base branch
from
Open

Add task pool #37

wants to merge 22 commits into from

Conversation

Azer0s
Copy link
Contributor

@Azer0s Azer0s commented Mar 13, 2022

This is more of a prototype right now but it is already a bit faster than the normal lop.Map implementation.

image

Running on M1 Macbook Pro 2021

@Azer0s Azer0s mentioned this pull request Mar 13, 2022
@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 13, 2022

Currently I have a return slice and just map the values with a function, I will change this however to let the users manage return data themselves

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 13, 2022

image

Alright, done. Made it a bit faster even

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 13, 2022

I have done some benchmarks and will commit them later. As of right now, the task pool implementation is about 4x faster.

Copy link
Owner

@samber samber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contrib @Azer0s !

I made a few comments

parallel/slice.go Outdated Show resolved Hide resolved
parallel/task_pool.go Outdated Show resolved Hide resolved
@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 13, 2022

Refactored everything according to comments. Shall I add some doc in README? @samber

@samber
Copy link
Owner

samber commented Mar 13, 2022

Yes please!

Can you update the benchmark at the bottom of README?

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 14, 2022

@samber Done. Could you look through the documentation I added?

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 14, 2022

@samber I added another helper function called With. This is useful for the situations described in the readme and for some situations with the taskpool.

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 27, 2022

please merge @samber

@Bin-Huang
Copy link
Contributor

Bin-Huang commented Mar 28, 2022

Good job. But maybe the API that implemented with method With() is a little complex to use?

I think expandable optional parameters might be more suitable:

func Map[T any, R any](collection []T, iteratee func(T, int) R, options ...*ParallelOption) []R {
}

// Usage Cases:

// set max concurrency count with option
parallel.Map(collection, callback, parallel.Option().Concurrency(20))

// normal calling
parallel.Map(collection, callback)

// expandable to more options in future
options := parallel.Option()
options.Concurrency(10)
options.Timeout(2 * time.Second)
options.Retries(3)
parallel.Map(collection, callback, options)

By the way, this design with optional parameters was well-practiced in other language communities:

Node.js p-map

await pMap(array, callback, { concurrency: 20 })

Node.js bluebird

await Bluebird.map(array, callback, { concurrency: 20 })

Node.js prray

await Prray.from(array).mapAsync(callback, { concurrency: 20 })

@Azer0s
Copy link
Contributor Author

Azer0s commented Mar 28, 2022

@Bin-Huang Ok I found a solution that imo is a bit more elegant.
Kind of a compromise between your and my solution.

Could you take a look? @Bin-Huang @samber

@Bin-Huang
Copy link
Contributor

@Azer0s @samber
I also wrote some code this weekend, maybe we can share more ideas on the code.

#81

for i, item := range collection {
go func(_item T, _i int) {
res := iteratee(_item, _i)
var DefaultPoolSize = runtime.NumCPU() / 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it runtime.NumCPU() can be 1, then DefaultPoolSize will be zero?

Copy link
Contributor

@Bin-Huang Bin-Huang Apr 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know some use cases will cause confusing default behavior about default pool size.

Let says someone wants to request 10 URLs in parallel on a device had 2 CPUs, he will write some code just like that:

responses := parallel.Map(URLs, request)

In this case, requesting a URL is only an IO-intensive task, so for the best performance and efficiency, it should requesting 10 URLs at the same time, just like the code above looks like for the package user.

But DefaultPoolSize is working internally and it is 1 (half of the CPU number), so actually, only one requesting task is started at the same time. And in this case, someone will confused about why the code will be slower 10x times.

The task pool can improve performance only if the developer knows what they doing, and adjusts the pool size thoughtfully for the special scene. A global default pool size can't improve performance in any scene, and often it's worse (if it can, the Golang GMP will make it internally).

I recommend removing the default pool size, and only using the task pool when the concurrency limit option is set.

@yujiachen-y
Copy link

I think this PR is very useful, are we planing to merge it? 😂

@brettmorien
Copy link

Would love to see this feature. Bit of a blocker to have unlimited goroutines getting spun up when using parallel functions.

@kcmvp
Copy link

kcmvp commented May 2, 2024

These methods are very useful, when those are be merged ?

@kcmvp
Copy link

kcmvp commented May 6, 2024

@Azer0s thank your great contribution very much! I like these feature as well. Could please resolve the conflicts and merge it ? @samber

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants