Page 660: The warning of what to watch out for under the "Parallel Algorithms Are Not Magic" sounds probably wrong/misleading in part. #141
From the text:
I'm no expert in concurrency or parallelism, but two parts of this section sound likely to be wrong or misleading.
From what I understand, mutable state in a function object should only create the potential for race conditions if that state is shared. Internal mutable state that isn't visible outside the function object seems like it should be fine. As worded, though, the text sounds like no mutable state whatsoever should ever be used inside a function object, which would be very limiting. Your example of a race for std::transform also uses shared mutable state, not just ordinary internal mutable state, which lines up with my suspicion here.
That said, I'm no expert in this area, so I might be wrong. I don't think so, though: the quoted text sounds fishy and doesn't line up with what concurrency and parallelism are supposed to be like, from what I understand.
Thus the two problems in the above text:
(1) "any algorithm that passes a function object to the algorithm" is not in fact a red flag, in and of itself, if my understanding is correct.
(2) The other sentence should probably be changed to read as follows (or similar):
Correct me if I'm wrong. I'm not 100% sure, but I'm fairly confident.
Hi @WraithGlade, thanks for posting!
You are absolutely correct that mutable state isn't an issue if it isn't shared. But by passing this mutable state via a function object into a parallel algorithm, you are sharing that mutable state. The parallel algorithm will dispatch work to executing threads that will share the function object you've passed. It's possible to do this correctly e.g. by using locks or atomics, but you need to pay very careful attention--hence the "red flag" idiom.
Does that make sense?
Hey Josh, thanks for the feedback and help.
I tested what you said, though, and my compiler returned results that don't match. The results imply that the function object is independently instantiated in some sense, and hence that only captured references, shared static variables, and other transitively shared objects are actually shared. The algorithm seems to create separate copies of the function object for each thread as needed; no state changes made from within std::transform ever reach the function object that was passed in.
Here's the code I used to test it:
This code produces the following output on my computer:
I'm running Windows 7 64 bit, Visual Studio 2019, C++17, in debug mode, in x64 (i.e. 64 bit) mode.
Am I still missing something? Are these results specific to my computer?
If not, and these results indicate what they seem to, then your statement of how it works appears to be wrong. In that case only shared mutable state can cause problems, not just any mutable state in the function object, and the book text should be updated to reflect that and avoid confusion.
Concurrency and parallelism can be very tricky and subtle though. I could be wrong. I had trouble finding a clear answer to this question on the internet.
One strange thing I observed, though: when I defined a constructor for SqFuncObj (which I later deleted from the above code) that used std::cout to print a message every time a SqFuncObj was constructed, only a single such message ever appeared. That makes it seem like only one instance was constructed, and yet std::transform seems to keep each operator()'s state totally independent regardless. Perhaps it is doing a memcpy that bypasses the constructor, or perhaps something else. What are your thoughts on this?
Hi @WraithGlade, thanks for the code example! You've definitely uncovered some nuance here with function object state.
The parallel algorithms require function objects to be copyable. (You can verify this by deleting the copy constructor for your SqFuncObj and watching the compilation fail.)
I think it's a relatively niche application to have purely local state on a function object that you've passed into a parallel algorithm. Maybe you'd want this if you were caching calculations within a thread, making network requests, etc.
It seems more likely that a function object would want to modify some non-local state. Looking to your example, you've illustrated just this point: num_of_calls only produces a meaningful total if it's shared, non-local state, since otherwise each thread increments its own copy and the count never reaches the caller's object.
All this is to say that I would still call this a red flag, in the sense that the programmer should be on high alert that they don't introduce race conditions. But I like your proposed modification, since the text is currently misleading on the point of function object local state:
In a future revision, I think it would be nice to have a paragraph or two summarizing our discussion here, though!
What do you think?
Another thing I tried was changing the operator() function to this:
... which does indeed result in some pretty weird stuff being printed to std::cout. For a moment I thought this proved that it actually isn't safe after all. A moment later, though, it occurred to me that the output would be weird regardless of the state of num_of_calls, because std::cout (and the associated console) counts as shared mutable state.
Hmm. Still not sure if the num_of_calls variable is safe or not really. Concurrency is subtle stuff sometimes.
The upside of being able to have mutable state in the passed-in function object is that each copy could use it to help with its own computations. Perhaps it would be better to just keep any such state in the local scope of the function call, but it would be nice to know exactly what you can get away with in these parallel algorithm calls.
What of the num_of_calls variable though?
If it were unsafe to have that mutable state attached to the function object, I would expect num_of_calls to end up with some random high value. Even with the assembly instructions interleaved in weird ways, I'd think it would be extremely likely to at least increase past 0.
Instead though, it always ends up as 0, which is more consistent with the theory that copies of the function object are being created than that it is shared mutable state.
Maybe I'll eventually be able to find an official authoritative source on exactly what the behavior here should be somewhere online, but I haven't been able to find one yet.
Oh, and I should mention that I'm not actually trying to compute num_of_calls in my code (not really); I'm just using it to test the claim about shared mutable state being unstable in this context.
It is clear to me that regardless of whether num_of_calls is thread-safe, it can't possibly return a reliable count of the calls made unless we use locks or other synchronization mechanisms.
My code is just intended to figure out whether mutable state internal to the function object is thread-safe, or whether we cannot safely use such constructs at all.
Hey again Josh, I just saw your new edit you made to your second response.
Thank you very much for the clarifications. The comment about deleting the copy constructor is what finally made it click for me. For some reason I hadn't considered using the copy constructor to test whether the function object was actually being copied; I was trying to test copying from the default constructor instead, which in hindsight doesn't make much sense.
Now I understand the nuances well enough to feel comfortable using these parallel algorithms from the standard library. And, on further consideration, I do think you're right that this is a fairly niche use case in many contexts, but it's nice to know that it's there. It also makes sense, from the pass-by-value copy semantics of function object parameters, that it would be designed this way.
I think adding a new paragraph as you've suggested in your comment edit sounds good. It will probably help some readers get a clearer understanding of it. Where there's one confused person, there are bound to be others.
Anyway: Thanks a bunch for the help. I think I understand it significantly better now.