arrayjit/lib/anatomy_of_a_backend.md
5 additions & 5 deletions
@@ -204,22 +204,22 @@ Unless disabled via setting `automatic_host_transfers` to false, `arrayjit` auto
 - `prepare_read` for synchronization and `to_host` transfers right before a host array is read,
 - `prepare_write` for synchronization right before a host array is written to,
-- `host_read_by_devices` for tracking which devices have scheduled transferring the data already.
+- `devices_not_lagging_host` for tracking which devices have scheduled transferring the data already, or don't need transferring because they computed or scheduled computing the data themselves.

-Since currently the tagging is per-device, for per-stream tensor nodes might need supplementary `from_host` (or `device_to_device`) calls in rare situations.
+Since currently the tagging is per-device, for per-stream, tensor nodes might need supplementary `from_host` (or `device_to_device`) calls in rare situations.

 There are three code components to the automation.

 - Within `Tnode`:
-  - The helper function `do_read` unconditionally invokes synchronization code, and if `automatic_host_transfers` invokes data transfer code, as stored in the `prepare_read` field of a node; then clears the field.
+  - The helper function `do_read` unconditionally invokes synchronization code, and if `automatic_host_transfers` it invokes data transfer code, as stored in the `prepare_read` field of a node; then clears the field.
   - The helper function `do_write` unconditionally invokes synchronization code as stored in the `prepare_write` field of a node, then clears the field.
   - `do_read` is invoked from `points_1d`, `points_2d`, `get_value`, `get_values` of `Tnode`; and also from `to_dag` and `print` of `Tensor`.
   - `do_write` is invoked from `set_value`, `set_values`.
   - `Tnode` exposes `prepare_read` and `prepare_write` for updating the fields: only the new data transfer is preserved, but the synchronization codes are combined.
 - Within `Backends.Add_buffer_retrieval_and_syncing`:
   - The `update_writer_event` helper adds the after-modification event to synchronization and sets data transfer to `to_host` from the stream, using `prepare_read`. This happens for `device_to_device` and `sync_routine` (after scheduling the routine) scheduling calls, and independently of `automatic_host_transfers`.
-  - Moreover, `sync_routine`, before scheduling the routine and only if `automatic_host_transfers`, directly schedules `from_host` for input nodes that are not tagged with the device (via `host_read_by_devices`). Note that input nodes are the "read only" and "read before write" nodes that are not constants.
+  - Moreover, `sync_routine`, before scheduling the routine and only if `automatic_host_transfers`, directly schedules `from_host` for input nodes that are not tagged with the device (via `devices_not_lagging_host`). Note that input nodes are the "read only" and "read before write" nodes that are not constants.
 - Within `Backends.Raise_backend.alloc_if_needed`:
-  - If `automatic_host_transfers` and the node allocated for the context is a constant, `alloc_if_needed` directly schedules `from_host` for the node regardless of whether it is tagged with the device (via `host_read_by_devices`); it does add the device tag to the node (if missing).
+  - If `automatic_host_transfers` and the node allocated for the context is a constant, `alloc_if_needed` directly schedules `from_host` for the node regardless of whether it is tagged with the device (via `devices_not_lagging_host`); it does add the device tag to the node (if missing).

 **Note:** we do **not** invoke `Tnode.do_read` from within `Backends.Add_buffer_retrieval_and_syncing.from_host`, since to adequately handle such transfers one should deliberately use `device_to_device` functions. This can lead to confusing behavior, in particular observing (or not) a tensor node (on host) can change later computations by inserting (or not) an additional `to_host` before a `from_host`. This aspect of the design might change in the future.
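The `Tnode` side of the automation described in this hunk can be modelled in a few lines of OCaml. This is a minimal sketch, not the `arrayjit` implementation: the record types, the global `automatic_host_transfers` reference, and the representation of synchronization and transfer code as closures are all assumptions made for illustration. Only the names `prepare_read`, `prepare_write`, `do_read`, `do_write` and the described behavior come from the text: always synchronize, transfer only when automation is enabled, clear the field afterwards, and on update combine synchronization while keeping only the newest transfer.

```ocaml
(* Hypothetical model of the host-side bookkeeping; types are assumptions. *)
type prepare = {
  sync : unit -> unit;      (* synchronization code, always run on read *)
  transfer : unit -> unit;  (* data transfer code, e.g. a scheduled to_host *)
}

type tnode = {
  mutable prepare_read : prepare option;
  mutable prepare_write : (unit -> unit) option;  (* synchronization only *)
}

(* Stand-in for the [automatic_host_transfers] setting. *)
let automatic_host_transfers = ref true

(* Synchronize unconditionally, transfer only if automation is on,
   then clear the field. *)
let do_read tn =
  match tn.prepare_read with
  | None -> ()
  | Some p ->
      p.sync ();
      if !automatic_host_transfers then p.transfer ();
      tn.prepare_read <- None

(* Synchronize, then clear the field. *)
let do_write tn =
  match tn.prepare_write with
  | None -> ()
  | Some sync ->
      sync ();
      tn.prepare_write <- None

(* Updating the read field: synchronization codes are combined,
   but only the new data transfer is preserved. *)
let prepare_read tn ~sync ~transfer =
  let sync =
    match tn.prepare_read with
    | None -> sync
    | Some old -> (fun () -> old.sync (); sync ())
  in
  tn.prepare_read <- Some { sync; transfer }

(* Updating the write field: synchronization codes are combined. *)
let prepare_write tn ~sync =
  let sync =
    match tn.prepare_write with
    | None -> sync
    | Some old -> (fun () -> old (); sync ())
  in
  tn.prepare_write <- Some sync

(* Small demonstration: record which closures actually run. *)
let log = ref []
let note s () = log := s :: !log

let () =
  let tn = { prepare_read = None; prepare_write = None } in
  prepare_read tn ~sync:(note "sync1") ~transfer:(note "to_host1");
  prepare_read tn ~sync:(note "sync2") ~transfer:(note "to_host2");
  do_read tn;
  (* Prints sync1, sync2, to_host2: both syncs ran, only the latest transfer. *)
  List.iter print_endline (List.rev !log)
```

In this model a second `do_read` on the same node is a no-op, because the field has been cleared. Setting `automatic_host_transfers` to `false` still runs the synchronization closures but skips the transfer.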