# Rethink "write to array element" interface? #17999
Can you show an example program where you would want this to occur?

I have for a long time been thinking that we should have user-facing data structures that are aware of the MCM. What that means to me specifically is that in a case like this, the data structure can cache something, but it hooks the read/write fence operations that we are doing anyway (conceptually always, but in practice for cache-remote these actually turn into some kind of call). In that event, these cached values can be "written back" on the fence that causes them to need to be stored.

Besides those kinds of approaches, the other potential direction I can imagine, which is very different from what we have today, is to have some sort of "reference" record type that the array API can provide and return. But it would go beyond just …

---

> `A[i] = 42;` where `A`'s domain map is: …
I find that a really intriguing idea, particularly if the pattern can be made reasonably accessible/straightforward for a user to write and reason about. The one slight hesitation I have about it is whether it would be too lazy/passive for some cases that would want to use more of an ("eager") write-through approach rather than an ("eventual") write-back approach.

This is more where my own thinking has been, where I'd been wondering whether we'd want to support a paren-less …

---

To be clear, the MCM-aware user-facing data structure idea would combine with some sort of reference record type, where the data structure author, in implementing the reference record, would have the opportunity to implement immediate write-through or eventual/fence-driven write-back. I might be too limited in my thinking, but I think a combination of implicit conversion (for reads) and …

So anyway, I wonder if a reasonable next step would be to sketch out some reference records and start to develop some language support ideas for them, and see if we can get the job of implementing them not to be too hard.

---

I mostly understood this, but don't understand how write-through would work. I was imagining it would need to be lazy/eventual write-back, on the assumption that the thing that would trigger the "write*" method to be called would be the compiler generating a memory fence/flush-style operation according to the MCM. So that suggested to me that I couldn't write the type to do it eagerly, but would be at the mercy of when the compiler believed my write should/could occur?

---
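To make the eager/eventual distinction concrete, here is a sketch in Python (used only for illustration; `WriteThroughRef`, `WriteBackRef`, and the explicit `flush` call are invented names — in the design discussed here, the flush would instead be driven by the compiler-generated MCM fence):

```python
class WriteThroughRef:
    """Eagerly forwards every write to the backing store ("eager" write-through)."""
    def __init__(self, store, i):
        self.store, self.i = store, i

    def write(self, value):
        self.store[self.i] = value  # the store sees the write immediately


class WriteBackRef:
    """Caches the write; the store only sees it at the next flush
    ("eventual" write-back, where flush stands in for the MCM fence)."""
    def __init__(self, store, i):
        self.store, self.i = store, i
        self.pending = None

    def write(self, value):
        self.pending = value        # deferred: nothing hits the store yet

    def flush(self):
        if self.pending is not None:
            self.store[self.i] = self.pending
            self.pending = None


store = [0, 0]
WriteThroughRef(store, 0).write(42)  # store updated immediately
wb = WriteBackRef(store, 1)
wb.write(7)                          # store still holds the old value...
old = store[1]
wb.flush()                           # ...until the fence-driven flush
```

The sketch shows why the write-through variant is easy to write by hand, while the write-back variant is at the mercy of whoever calls `flush` — which is exactly the concern raised above about the compiler deciding when the write occurs.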
Oh, I think you're probably right about the implicit conversion for reads; that's a good point. I hadn't thought of that, though you've probably suggested it before in similar conversations. W.r.t.:

```chapel
proc foo(x: int) {
  x = 42;
}
foo(A[i]);
```

where, if `A[i]` returned some sort of special record type, that record type itself can't be passed into `foo()`, the …

FWIW, I've started sketching out some of these patterns using records within the context of arrays, but have been running into problems getting it threaded through all of the array machinery in the internal modules and compiler. The current barrier being that …

---

I'm imagining that the record author can implement …

---

Well, we could imagine replacing the …

---
@mppf: Thanks for chewing on this with me.
Are you imagining that this choice between write-through or write-back would be opted into via some sort of high-level bit or pragma or annotation on the …?

Meanwhile, this idea:

> …

is feeling a bit like the "implicit conversion in …

---
That they'd explicitly create code to save the value somewhere, for write-back; and that they'd start the write immediately, for write-through.

Regarding …

Revisiting the example, though:

```chapel
proc foo(ref x: int) {
  x = 42;
}
foo(A[i]);
// Q: What can A[i] return to make this work?
```

I wonder if it would be acceptable to lean on other language features (namely, return intent overloading and …):

```chapel
proc foo(out x: int) {
  x = 42;
}
foo(getAElt(i));

proc getAElt(i: int) ref {
  // return some sort of type that exists only to be written to with `=`
  return new writeHandler(A, i);
}

proc getAElt(i: int) const ref {
  // do the regular thing we do today
  return A[i];
}
```

What seems good about this: …

What isn't working: …

---
Remove trivial shouldReturnRvalueByConstRef() routine [trivial, not reviewed]

This routine was defined to always return 'param true', so I just removed it and all where clauses referring to it. I happened across this while helping others navigate ways of creating an array implementation that returns classes instead of eltTypes while exploring workarounds for issue #17999.

---
As an update on this: one of the groups who was trying to do this has gotten a prototype working that is promising, but not quite as seamless as ideal. The approach taken was essentially: …

The main downside of this approach so far is that when the array access is on a RHS, like …

This is sufficient for experimentation, but obviously not for the long term. It also causes problems for composition. For example, other array operations that call dsiAccess() or read the array need the extra parentheses as well (but only for the cases where the array is implemented by this type). Briefly, I was thinking "Oh, but we could have the …

Of these two, the second seems attractive in terms of its orthogonality with paren-ful `this` methods, but more than a little problematic since it's so syntactically invisible (when would naming such an object not result in a read? How would other things be done with the object? [edit: Or are these non-issues? Is the idea that any non-`ref` access to the class would resolve to such a read? I feel this would benefit from more thought]). The first seems attractive conceptually, but challenging to know how to express. On the other hand, if we did come up with a way of expressing it, it might eliminate what is currently a subtle and fairly unique case (return intent overloading). Or maybe there's some other cool solution that I haven't thought of yet.

---
Coming back to one of the original challenges:

> …

Here, we can return a …

So, this gets me thinking that maybe we could keep the …; for example:

```chapel
proc getRefToElt(i) ref : eltType { ... }

proc getArrayElt(i, ref _realRef = getRefToElt(i), _scope = new endOfRef(_realRef)) ref : eltType {
  return _realRef;
}

proc getArrayElt(i) const ref : eltType {
  ...
}

proc userCode() {
  ref x = getArrayElt(1);
  // creates temporary _scope to be deinited at end of block
  x = 12;
  // now _scope is deinited and finalization can occur

  getArrayElt(2) = 24; // temporary _scope is deinited after this line
}
```

I think the main thing I am worried about with the above approach is that we wouldn't be able to write something returning a …:

```chapel
proc getRef() ref {
  return getArrayElt(1); // uh-oh, _scope will be deinited here
}
```

So I guess that means that we can't really split the scoping from the …

Separately, one thing I find a bit unsatisfying about the return intent overloading is that it requires that the collection default-init a new element in order to return it by …

Anyway, regardless of the …

Continuing along this line, I think the main choice ahead of us is, for a reference to …

---
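The `_scope`/`endOfRef` trick above can be mimicked in Python, where a `with` block makes the finalization point explicit (a sketch with invented names; `WriteBackGuard` plays the role of the temporary `_scope` whose deinit commits the write):

```python
class WriteBackGuard:
    """Holds a working copy of an element and commits it to the backing
    store when the guarded scope ends (the analog of _scope's deinit)."""
    def __init__(self, store, i):
        self.store, self.i = store, i
        self.value = store[i]            # local working copy of the element

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.store[self.i] = self.value  # "deinit": finalize the write


store = [0]
with WriteBackGuard(store, 0) as elt:
    elt.value = 12
    before = store[0]  # still the old value: write-back hasn't happened yet
# guard's scope ends here, so the write is committed
```

This also illustrates the hazard discussed above: if the guard object escapes the `with` block (the Python analog of returning the reference past `_scope`'s deinit), later writes to `elt.value` are never committed.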
This got me thinking about an alternative to return intent overloading to achieve the same goal (at least, for arrays). I call the idea "compound operators"; I don't remember if it's come up before or not, but I'm pretty excited about the possibility.

We want to be able to differentiate patterns like this:

```chapel
A[i] = f(args);
```

from patterns like this:

```chapel
g(A[j]);
```

What if we could express …

Here I am imagining that the array implementation for `A` could provide:

```chapel
// this is a "compound operator"
// it matches code that *syntactically* has the form A[i] = rhs
// by defining a function body for the combination of proc this and operator =
operator myarray.this(i: int).=(rhs) {
  // a "for instance" implementation doing something nontrivial
  lockElt(i);
  getOrCreateEltRef(i) = rhs; // assign element
  unlockElt(i);
  // could do "write back" here, too
}

// this is the regular access operator that is used if the above doesn't match
proc myarray.this(i: int) {
  haltIfBoundsCheckFails(i);
  return getEltRef(i);
}
```

(As an aside, I'm tempted to also allow …)

What I think is interesting about this idea is that it seems general enough to help with a number of other issues: …

What are some issues with this approach? …

Despite these issues, I think the idea could go somewhere useful.

---
This makes me think of Python's …

and …

You do miss out on the ability to take a …:

```chapel
var view = new view(arr, 10:20, 3);
foo(view);

proc foo(arr) {
  var x = arr[0];
  arr[10] = x * x;
}
```

The caching discussion makes me think of aggregators and managers, where you might imagine something like:

```chapel
manage writeBack(arr, bufsize=16) as carr {
  carr[10] = 20;
  // reading arr[10] would give the old value
  carr[20] = 30;
} // end of scope flushes writes

// replicator
manage replicateWrites(arr1, arr2) as rarr {
  rarr[10:20] = 20; // writes to arr1 and arr2
}
```

The rewrites to avoid temporaries in …

---
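For comparison, Python distinguishes these two syntactic patterns natively: `a[i] = v` dispatches to the container's `__setitem__` method, while reads go through `__getitem__`, so the container sees the whole write as one call — a rough analog of the compound-operator idea (`LoggedArray` is an invented example):

```python
class LoggedArray:
    """a[i] = v dispatches to __setitem__, so the container intercepts the
    write as a single operation (where it could lock, aggregate, or
    write back); plain reads dispatch to __getitem__ instead."""
    def __init__(self, data):
        self.data = list(data)
        self.writes = []            # record of intercepted writes

    def __getitem__(self, i):       # used for reads, e.g. g(a[j])
        return self.data[i]

    def __setitem__(self, i, v):    # used only for the a[i] = v form
        self.writes.append((i, v))  # hook point: lock/aggregate/write-back
        self.data[i] = v


a = LoggedArray([0, 0, 0])
a[1] = 10   # intercepted as a single write operation
x = a[1]    # plain read, no write hook involved
```

Note that the limitation discussed earlier in the thread shows up here too: `foo(a[1])` passes a value to the callee, so there is no way for the callee to write through it back into the container.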
I looked at that page but I'm not quite following what is happening there. Is there compile-time evaluation going on? Or just AST transformation?

There was also a lightning talk at LLVM-HPC 2021 about this, by Alister Johnson: "Automatic and Customizable Code Rewriting and Refactoring with Clang". They showed this example, which specifies how to rewrite a CUDA kernel launch as a HIP one:

```cpp
[[clang::matcher("kernel_call")]]
auto cuda_kernel(int numBlocks, int numThreads) {
  auto arg1, arg2;
  {
    kernel<<<numBlocks, numThreads>>>(arg1, arg2);
  }
}

[[clang::replace("kernel_call")]]
auto hip_kernel(int numBlocks, int numThreads) {
  auto arg1, arg2;
  {
    hipLaunchKernelGGC(kernel, numBlocks, numThreads, 0, 0, arg1, arg2);
  }
}
```

What I found exciting about this talk is that it's a pretty simple interface for a user to write some AST match & transform. (The …)

We could consider using a very general strategy along either of these lines for the specific problem in this issue. I think the main challenge to doing something extremely general is that it can interfere with a programmer's ability to understand their code. At least with the combinations-of-operators ideas, the behavior of the code at the call site is still (arguably) understandable, without knowing all of the details of all of the implemented combinations. In contrast, with a general rewriting strategy, we'd have to be more aggressive about saying something like "Use this feature very carefully or users of your library will be confused".

---
Off-issue, @bradcray was asking whether some of the solutions here could help with optimizing operations on bigints (where the main issue is wanting to avoid extra allocations, i.e., reusing existing bigints for the results). How would the "compound operators" idea described above in #17999 (comment) handle this? The BigInteger module would have some sort of proc/operator/something that is a single call that the compiler will make for …

Note that we already effectively have both the compound version and the non-compound version of operators like …

---
I wonder if we can do this today with assignments participating in overloads, so that:

```chapel
x = f(a, b);
// looks for overload
f(a, b, out x); // maybe this needs to be out or ref or ref out?

A[i] = b;
// looks for overload
operator=(b, ref A, i);
```

---

Right, but what bugs me about that is that it's asymmetric. In the first case, we have a special …

---
I'm imagining it would just be one possible overload and it wouldn't "mean" anything in particular; that is up to the definition. I would think we'd also need …

---

Good question, and I think in what I've proposed, that would only look for an overload if …

EDIT: an extra thought is that in the bigint example, the function is already written like …

---
TLDR: While smart references seem like a valuable idea, I think they probably aren't the most natural solution to this problem. Since most of the fine details here concern the collection, we should probably just let the collection itself sort things out with a "compound expression" operator instead of delegating to a smart reference and creating a dependency.

I've read some of the backlog here and it seems like thoughts have gone in one of two directions: …

I have been thinking about this w.r.t. context managers and aggregation (apparently I should have paid more attention to this thread...). I've entertained the idea of something like:

```chapel
record smartRef {
  // What information goes in here to make this usable? Is it data-structure dependent? E.g...
  // Need to know the key to commit the write later? Same for value?
  type K, V;
  type M;
  var ptrSlot: c_ptr(M.slotType); // Get a pointer to the actual slot in the map...or some sort of aggregation buffer.
  // We store the key and value here rather than in ourselves? Implementation detail...
  // This is pretty much the exact same setup as what Brad described above a few months ago.

  // Use 'this' for reads, but it's clunky...
  operator this(...) {}

  // Use 'operator=' for "writes"...
  operator =(lhs, rhs) {
    // Communicate with the map to do some sort of aggregation...
  }

  // But now how do I let this reference behave like plain-old-references? I tried forwarding, no luck.
  forwarding v;
}

// Q: can a smart reference make this be both task-safe and handle aggregation?
manage myMap.guard(k) as v do v = 1;
```

This is pretty much what Brad described in #17999 (comment). This code is all over the place, but basically I wanted to try and spruce up the …
While I still think smart references could be a useful tool, and it would be nice for the language to be able to support them, I think having a sort of "compound expression", e.g.:

```chapel
manage myMap.aggregate(style, bufsize, moreConfigStuff) do
  forall i in 0..hi do
    myMap[i] += 1;
```

Additionally, we can still introduce managers to allow a collection to adopt more coarse-grained locking, and can combine that with aggregation as well.

---
One thing occurred to me about the compound operators idea. Suppose you want to do something like this to your array element:

```chapel
A[i] = f(A[i]);
```

Today we can write this like so, to do the array access only once:

```chapel
ref elt = A[i];
elt = f(elt);
```

That might be important for some atomicity reason. (E.g., if the elements are atomic values, we can imagine something like …)

So, that is where I would imagine an … The update method is something like this:

```chapel
A.update(i, lambda (ref elt) {elt = f(elt);});
// Or, if we combine the update with the compound operator, we could write
A[i].update(lambda (ref elt) {elt = f(elt);});
```

The context manager way of doing the same thing looks like this:

```chapel
manage A.element(i) as elt {
  elt = f(elt);
}
```

@dlongnecke-cray - I am curious, can a context manager that returns …

Anyway, I think that the context manager could return a regular reference to …

But one of the main points of the context manager or update function here is that the data structure knows when the process of updating the element is done, so that it can know when to do the "write back" (from the issue's original post). So anyway, that leads me towards thinking: …

I think an interesting thought experiment is to think about how we might be able to extend the compound operator idea to unpredictable combinations.

---
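A minimal Python sketch of the update-with-a-function variant (names invented): the collection performs the single access itself, holds its lock for the duration of the callback, and therefore knows exactly when the read-modify-write is done — the write-back point:

```python
import threading


class Updatable:
    """Collection-controlled read-modify-write: the data structure decides
    when to lock, performs one access, and knows when the update is done."""
    def __init__(self, data):
        self.data = list(data)
        self.lock = threading.Lock()

    def update(self, i, fn):
        with self.lock:                    # atomicity around the whole RMW
            self.data[i] = fn(self.data[i])
        # lock released: this is where a "write back" could be triggered


a = Updatable([3])
a.update(0, lambda elt: elt * 2 + 1)  # single access, atomic update
```

The design point this illustrates is the one made above: unlike returning a bare reference, the callback form gives the collection an unambiguous "update finished" moment.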
@mppf The managed block can access …

Thanks for pointing out that we still need …

I still do like the compound operator idea for the simple cases, though. I am an advocate for having as much of the locking and aggregation details in the data structure as possible, which is why I was trying to push back a little against the smart reference idea (because it creates a complex relationship between the data structure and another type).

---

Actually, I'm wondering if this matters in a world where there are no other existing references to …:

```chapel
// Here we lock on 'k(ey)', only one task sees `ref v` at any time
manage m.element(k) as ref v do v += 1;
```

If …

---

The only thing I can think of right now is some way of wrapping the RHS expression evaluation into some sort of critical section (like an implicit …).

---
It would indeed be nice if the unordered-forall optimization / automatic aggregation optimization could apply. But I imagine that these would need to be communicating with the data structure (as I would imagine that the data structure needs to own and manage the aggregation buffers, etc.). As a result, maybe it would take a different form in the implementation. For example, the compiler might add an argument to the compound operator calls / the enter&leaveThis calls (or something?) to indicate that the loop pattern implies that the order of these operations doesn't matter. But anyway, the alternative is for the compiler to create the code to be done in an unordered/aggregated way (which is more what #12306 gets to). To my tastes at the moment, I think it would be better for the data structure to be involved, but to benefit from the conclusions of the compiler's analysis (basically because the data structure knows more about what is going on than the compiler does). I'm not super confident that either approach is the best/right one, though.

---
I am idly wondering if we would be able to remove the return-intent-overloading feature entirely, if we had some success solving this issue in a different way. |
This is kind of re-hashing things already discussed above, but I wanted to post a comment summarizing how I currently feel about the compound operators idea. I am still pretty optimistic about it. I think its main drawback is that it's not inherently clear where to draw the line between combinations that are handled normally through multiple calls and combinations that are a compound operator. That comes up in a few places: 1) understanding some code calling (say) array operators/functions/methods; 2) when implementing an array type or some more general data structure, deciding which things need to be supported as compound operators. Due to those limitations, I don't think the compound operators idea can by itself address the histogram race condition (see issues #19102 and #8339). However, I still think that the compound operators idea can be very valuable to us. We just have to decide where to draw the line, so to speak, in terms of which things are implemented as compound operations for array types / more general collections. So, my straw proposal is: …

---
In two separate contexts that I've been working with recently, I've been facing challenges with how we implement writes to array elements, for cases like this:

```chapel
A[i] = j;
```

Specifically, the current interface for writing to an array essentially accesses into the array and returns a reference to the array element that should be written. This works fine for traditional/simple arrays, but for some more complicated approaches that we might like to express, it raises challenges: …

I'm looking for inspiration as to how we might be able to adjust our array machinery to support these cases, ideally without too much fuss for the array developer.