Add option to limit number of concurrent tasks. #2
Conversation
…t to twice the number of CPU cores.
Oh, and I only applied this to NSArray+map for now, but the same applies to all other methods.
Nice, it's good to get real-world results. I was led to believe that GCD was smarter than that about spawning threads. I didn't expect it to take memory use into consideration, but I did believe it factored CPU resources and availability into its decision making. Did you have any alternatives to the term

This is where a benchmark in the project would come in handy. Part of me feels that I'd prefer not to impose a default cap on the number of simultaneous invocations (2 × #cores), unless a benchmark showed it to be better (for non-memory-intensive tasks). I only say this because, all things being equal, I'd rather defer to GCD to handle concurrency, since that is its purpose for being. For tasks that take up large amounts of memory, maybe it should be up to the developer to call a method that explicitly limits the number of tasks. Even without this change, they could always use a semaphore in the calling code to achieve the limit. I'm just thinking aloud; I'm not opposed to a default limit. On the other hand, perhaps the best way to handle this is to get rid of the vanilla methods (
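The semaphore-in-the-calling-code idea could be sketched roughly like this (the `map:` method signature and `ExpensiveTransform` are assumptions for illustration, not the lib's actual API):

```objc
// Rough sketch, assuming a concurrent -map: method on NSArray. A counting
// semaphore created with N permits lets at most N transforms run at once.
dispatch_semaphore_t gate = dispatch_semaphore_create(8); // e.g. 2 x cores
NSArray *results = [inputs map:^id(id object) {
    dispatch_semaphore_wait(gate, DISPATCH_TIME_FOREVER);
    id mapped = ExpensiveTransform(object); // hypothetical memory-hungry work
    dispatch_semaphore_signal(gate);
    return mapped;
}];
```

One caveat: because the wait happens inside the dispatched block, GCD may still spawn threads that immediately block on the semaphore, so this caps the memory used by the work itself rather than the number of threads.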
After glancing at the GCD docs, ‘operation’ might be more appropriate.
On second thought, I believe you’re right.
I was also considering this, but unless the code path that has no limit still imposes some performance penalty, I felt that the lib should acknowledge the potential issue and provide a solution.
Nah, I would keep the sugar and default to no limit, as you suggested. The good thing about a lib without a lot of methods is that people with the memory issue will be able to easily find the method that allows them to limit the number of operations.
Haha, I did too and saw that 'task' gets used quite frequently. The docs for
Do you want to finish this patch, or do you want someone else to? I only ask because I presume that managing CocoaPods would make delegation your M.O.
I was reading the man page for dispatch_apply, and came across the section on "striding" for performance tuning. I was unfamiliar with striding; it reminds me of loop unrolling. I wonder how it relates, if at all, to the task limits discussed here. It seems like striding could potentially address the performance troubles you ran into, by serializing execution in small doses, but it doesn't give as direct control as a semaphore-based limit. On first impression, it seems like it could be a worthwhile addition to the interface, but adding both a task limit and striding seems clunky. Just mentioning it, no response required.
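For reference, the striding pattern from the dispatch_apply man page looks roughly like this (the counts and `ProcessElement` function are illustrative, not part of the lib):

```objc
// Striding: submit count/stride blocks, each processing `stride` elements
// serially, to amortize the per-block dispatch overhead.
size_t count = 10000;
size_t stride = 16;
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_apply(count / stride, queue, ^(size_t idx) {
    size_t i = idx * stride;
    size_t end = i + stride;
    for (; i < end; i++) {
        ProcessElement(i); // hypothetical per-element work
    }
});
// Pick up the remainder when count is not a multiple of stride.
for (size_t i = count - (count % stride); i < count; i++) {
    ProcessElement(i);
}
```

As noted, this reduces dispatch overhead but doesn't directly bound how many elements are in flight the way a semaphore does.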
After posting my response, I realised that the better argument for ‘operation’ over ‘task’ is that the lib name uses the label ‘operations’ :)
Hah, yeah that’s definitely true for the CP stuff, but in this case I’m really open to creating the patch. But if, in the meantime, someone else wants to implement it, I’m totally fine with that as well. I have no preference.
I think it’s better to wait for someone to come along that actually has a situation where dispatching operations is more costly than combining a couple of operations.
Great, I wasn't sure so I asked.
Busted, but you're right, YAGNI, damnit!
Btw, in case you have access to the as of yet REDACTED APIs, check the memory-related additions to GCD ;) Which also raises the question: should we then still add a built-in facility for this, or explain it in docs or a sample?
I had skimmed right past those, thanks for the heads up. Interesting additions. Honestly, I don't know the answer to that question. I'd want to run some experiments. I'm thinking maybe our API needs to offer a way to allow callers to dynamically throttle the internal
Let’s close this for now and see if others come up with similar questions.
I'll create an issue to track it, and mention the issue in the README. Thanks for highlighting the issue.
As I’m in full preparation mode for WWDC, I did not have time to make this into a full-fledged patch yet, but I did want to open up discussion on it now.
This current patch limits the number of concurrent tasks to a given number and defaults to twice the number of CPU cores. (Somewhat arbitrary, but likely a good default.)
The reason for this is that GCD will happily spawn as many threads as needed, but it does not take into account other stress on the machine, e.g. memory usage. So if each task takes up a lot of memory, your system comes to a grinding halt before the work ever finishes.
I’m thinking that it would be nice to be able to set a global default, or if not specified, default to something like twice the CPU cores. It would also be nice to be able to specify that a job does not need to be limited, but it should probably be done per job (not global). I.e. the shorthand methods all check the global default value, whereas the verbose methods /need/ a value, which could be `0`, meaning ‘no limit’.
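As a rough sketch of that behaviour (names are illustrative, not the actual patch), the default could be derived from NSProcessInfo, with a semaphore gating how many blocks are in flight:

```objc
// Sketch only: cap in-flight tasks at `limit`, defaulting to 2 x active
// cores. A limit of 0 would mean 'no limit' and skip the semaphore entirely.
NSUInteger limit = 2 * [[NSProcessInfo processInfo] activeProcessorCount];
dispatch_semaphore_t gate = dispatch_semaphore_create(limit);
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_group_t group = dispatch_group_create();
for (id object in objects) {
    // Waiting *before* submitting keeps GCD from spawning threads that
    // would immediately block on the semaphore.
    dispatch_semaphore_wait(gate, DISPATCH_TIME_FOREVER);
    dispatch_group_async(group, queue, ^{
        ProcessObject(object); // hypothetical memory-hungry task
        dispatch_semaphore_signal(gate);
    });
}
dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
```

Gating submission rather than execution means at most `limit` tasks ever exist at once, which is what matters for the memory-pressure scenario described above.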