(1) Get rid of the option to share merge buffers, (2) refactor tracking merge buffer events
-- formerly `~into_merge_buffer:Streaming` would not generate an event,
but it should, to prevent overwriting the source.
(2) will be continued: prohibiting overwriting until the routine using the streamed merge buffer finishes.
arrayjit/lib/anatomy_of_a_backend.md: 3 additions & 4 deletions
@@ -144,9 +144,6 @@ When using the default stream, CUDA would predictably write to the standard outp
OCANNL expects backends to implement FIFO queue scheduling, and an event mechanism for synchronizing between streams (and ideally devices), matching the CUDA specification. On top of events, OCANNL implements per-tensor-node synchronization. 1/3rd of the `device` fields have to do with synchronization:
- (** The tensor node that was most recently scheduled to be in the cross-stream merge buffer,
-     and its readiness event. *)
shared_writer_streams :
(('buffer_ptr, 'dev, 'runner, 'event) stream * 'event) list Hashtbl.M(Tnode).t;
(** The streams that most recently have been scheduled to update (write to) a
@@ -162,7 +159,7 @@ OCANNL expects backends to implement FIFO queue scheduling, and an event mechani
events are removed opportunistically. *)
```

- and 1/3rd of the stream fields also:
+ and some stream fields also:

```ocaml
updating_for : 'event Hashtbl.M(Tnode).t;
@@ -175,6 +172,8 @@ and 1/3rd of the stream fields also:
removed opportunistically. *)
```

+ While we never share merge buffers across streams, there is always an event associated with an occupied merge buffer. Its primary use is for tracking the merge buffer's stream as a reader on the source stream.
+

Besides routines, calling `from_host`, `to_host`, `device_to_device` from a backend puts the corresponding tasks on the device's queue. Both invoking a routine and calling these copying functions will perform the necessary event creations and synchronizations to ensure that when scheduling writing into an array precedes scheduling reading from it, the actual writing also precedes the actual reading.
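
The ordering guarantee described above can be illustrated with a toy model. The sketch below is not OCANNL code: `task`, `stream`, `schedule_write`, `schedule_read` and `run_all` are hypothetical names, streams are simulated as in-process FIFO queues, and tensor nodes are identified by plain strings. It only assumes that an event is recorded right after a scheduled write, and that a read scheduled later on another stream first waits for that event.

```ocaml
(* Toy model, not the OCANNL API: all names here are illustrative. *)
type event = { mutable recorded : bool }

(* A stream holds a FIFO queue of tasks. *)
type task = Run of (unit -> unit) | Record of event | Wait of event
type stream = { queue : task Queue.t }

let make_stream () = { queue = Queue.create () }

(* Most recent write event per tensor node (nodes are plain strings here). *)
let last_write : (string, event) Hashtbl.t = Hashtbl.create 8

(* Schedule a write, then an event recorded once the write has run. *)
let schedule_write s node f =
  let ev = { recorded = false } in
  Queue.add (Run f) s.queue;
  Queue.add (Record ev) s.queue;
  Hashtbl.replace last_write node ev

(* Schedule a read that first waits for the node's latest write event. *)
let schedule_read s node f =
  (match Hashtbl.find_opt last_write node with
   | Some ev -> Queue.add (Wait ev) s.queue
   | None -> ());
  Queue.add (Run f) s.queue

(* Execute the front task of a stream; return false if empty or blocked. *)
let step s =
  match Queue.peek_opt s.queue with
  | None -> false
  | Some (Wait ev) when not ev.recorded -> false
  | Some task ->
      ignore (Queue.pop s.queue);
      (match task with
       | Run f -> f ()
       | Record ev -> ev.recorded <- true
       | Wait _ -> ());
      true

(* Keep polling the streams until none of them can make progress. *)
let rec run_all streams = if List.exists step streams then run_all streams

let log = ref []
let writer = make_stream ()
let reader = make_stream ()

let () =
  schedule_write writer "x" (fun () -> log := "write x" :: !log);
  schedule_read reader "x" (fun () -> log := "read x" :: !log);
  (* The reader stream is polled first, yet it stalls until the write is done. *)
  run_all [ reader; writer ];
  List.iter print_endline (List.rev !log)
(* prints "write x" then "read x" *)
```

Polling the reader stream first is deliberate: the reader stalls on its `Wait` task until the writer's event is recorded, so the order of scheduling, not the order of polling, determines the order of execution.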