No-op Intrinsic in Exo #565
Comments
From the GPU side:
|
I think it is possible that |
Just a general comment on my original post: I suggested above that the reserved name should be |
skeqiqevian added a commit that referenced this issue Aug 21, 2024
Implement a new scheduling op to insert no-op function calls anywhere, addressing #565. This scheduling op will be useful for inserting prefetching and potentially synchronization primitives arbitrarily. For now, it only works if the body of the proc is a single `pass` statement, but we could potentially extend this in the future.
Co-authored-by: Yuka Ikarashi <yukatkh@gmail.com>
There have been various situations where we would like to insert no-ops into Exo programs. GPU synchronization (#547) is one example. Others arise on CPUs (prefetching, barriers, pause, etc.), and some are architecture-independent (for example, inserting debug information into programs).
I think there are a few questions to answer with regard to this feature:
I think it is useful to discuss what we might want from no-ops from different perspectives, so that we arrive at a general design that is not overfit to one particular use case.
prefetching:
Here are the answers to the questions above driven by ideas I have had to support prefetching on CPUs and some of the issues that show up:
`no-op` that can be used as a name to a proc call. This proc can accept an arbitrary number of arguments.
`insert_no-op(Proc, Gap, *NewExprs)` which inserts a `no-op` call at the gap cursor with the arguments being the provided list of new expressions.
`replace`d by some instruction and so those instructions will be checked. It doesn't matter what the backend checks do for the arguments of an unreplaced no-op call here since the no-op call won't be generated either way. In any case, we don't really know the type and memory of the no-op proc.
Here is how I envision being able to insert prefetches:
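As a rough illustration of that flow, here is a toy model in plain Python. It does not use Exo's real API: `Call`, `insert_noop`, `replace_noop`, and `codegen` are hypothetical names invented for this sketch, the body is just a list of C-like statement strings, and the reserved name is spelled `noop` because a hyphen is not valid in a Python identifier. It only shows the three steps discussed above: insert a no-op call at a gap, replace it with an instruction, and drop any no-op that was never replaced at code generation.

```python
from dataclasses import dataclass

NOOP = "noop"  # stand-in for the reserved name (hypothetical spelling)


@dataclass(frozen=True)
class Call:
    """A call statement in the toy IR: a callee name plus argument expressions."""
    name: str
    args: tuple


def insert_noop(body, gap, *new_exprs):
    """Return a new body with a no-op call inserted at index `gap`."""
    if not 0 <= gap <= len(body):
        raise IndexError(f"gap {gap} is outside the body")
    return body[:gap] + [Call(NOOP, tuple(new_exprs))] + body[gap:]


def replace_noop(body, index, instr_name):
    """Swap the no-op at `index` for a named instruction, keeping its arguments."""
    stmt = body[index]
    if not isinstance(stmt, Call) or stmt.name != NOOP:
        raise ValueError("statement at index is not a no-op call")
    return body[:index] + [Call(instr_name, stmt.args)] + body[index + 1:]


def codegen(body):
    """Emit one line per statement, dropping any no-op that was never replaced."""
    lines = []
    for stmt in body:
        if isinstance(stmt, Call):
            if stmt.name == NOOP:
                continue  # an unreplaced no-op generates no code
            lines.append(f"{stmt.name}({', '.join(stmt.args)});")
        else:
            lines.append(stmt)
    return lines


body = ["x = A[i];", "y = x * 2;"]
with_noop = insert_noop(body, 1, "&A[i + 8]")
with_prefetch = replace_noop(with_noop, 1, "prefetch")
```

In this model `codegen(with_noop)` emits the same lines as the original body, while `codegen(with_prefetch)` emits the extra `prefetch` call between the two statements. The arguments are kept as opaque strings, which sidesteps rather than answers the typing questions raised in this issue.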
This gets at another problem with the memory operand issue I mentioned earlier. What should the precision of this operand be? It is not really any of the types we have. This definition will throw an error on another precision. Should I implement an instruction for each precision? Should there be a way to talk about memory addresses in Exo?
You can also add some nice helpers in the stdlib on top of the mechanism above:
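One possible shape for such a helper, again as a toy sketch in plain Python rather than Exo's stdlib. All names here (`insert_noop`, `replace_noop`, `prefetch_ahead`, the `noop` spelling, and the tuple-based statements) are assumptions made for illustration. The helper simply composes the two primitives: insert a no-op on an address some distance ahead, then replace it with a prefetch instruction.

```python
NOOP = "noop"  # hypothetical spelling of the reserved name


def insert_noop(body, gap, *args):
    """Primitive 1 (toy): insert a ("noop", args) call at index `gap`."""
    return body[:gap] + [(NOOP, args)] + body[gap:]


def replace_noop(body, index, instr):
    """Primitive 2 (toy): rename the no-op at `index` to `instr`, keeping its arguments."""
    name, args = body[index]
    if name != NOOP:
        raise ValueError("can only replace a no-op call")
    return body[:index] + [(instr, args)] + body[index + 1:]


def prefetch_ahead(body, gap, buffer, index_var, distance, instr="prefetch"):
    """Helper: insert a no-op on buffer[index_var + distance], then replace it with `instr`."""
    address = f"&{buffer}[{index_var} + {distance}]"
    return replace_noop(insert_noop(body, gap, address), gap, instr)


loop_body = [("load", ("A", "i")), ("store", ("B", "i"))]
scheduled = prefetch_ahead(loop_body, 0, "A", "i", 16)
```

Keeping the helper as a thin composition means any checking stays in the two primitives, and the helper only decides which address expression to pass.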