# lib/syntax_extensions.md
Examples of the axis spec syntax (a usage sketch follows the list):
- `...|...->... => 0`: reduce all axes of the argument into a single number. Useful e.g. for reducing losses to a single number.
- `...|... => 0`, `...->... => 0`, `... => 0` do the same but will fail if the argument has axes of the kind for which the ellipsis is missing.
- `...->... => ...->...`, `...|... => ...|...`, `... => ...`: fully pointwise but will fail if the argument has axes of the kind for which the ellipsis is missing.
- `...|...->... => ...->...`: reduce the batch axes into the result.
- `2...|...->... => ...|...->...`: slice the tensor at dimension 2 of the leftmost batch axis. Note that the tensor operation `@|` implements slicing at the leftmost batch axis for arbitrary dimension.
- `...|... => ...|...2`: expand the tensor by putting the argument at leftmost output dimension 2 of the result (and reduce input axes if any). `rhs ++ "...|... => ...|...2"` will fill the other cells of the new tensor with zeroes; `[%cd lhs =:* rhs ~logic:"...|... => ...|...2"]` will fill the other cells of `lhs` with the neutral element of the assignment (reduction) operator, here ones.
- `ijk => kji`: reverse the three output axes; fails if the argument has any other axes.
- `ijk => ki`: as above but also reduce the second-leftmost output axis.
- `..v..|...ijk => ..v..kji`: reverse the three rightmost output axes, reduce any other output axes, pointwise for batch axes, pairing the batch axes with the leftmost output axes of the result. Fails if the argument has input axes.
- `2..v..|... => ..v..`: slice the tensor at dimension 2 of the leftmost batch axis, reduce all its output axes, preserve its other batch axes as output axes. Fails if the argument has input axes.
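A hedged OCaml sketch of a few of these specs, assuming the unary einsum operator `++` from the expansion example above is available through `TDSL.O`; the stand-in tensor `t` and the way it is created via `TDSL.param` are illustrative assumptions, not prescribed usage:

```ocaml
(* Hedged sketch: applying some of the axis specs above via the unary
   einsum operator [++]. Creating the stand-in tensor [t] with
   TDSL.param is an assumption made for illustration only. *)
let _axis_spec_demo =
  let t = TDSL.param "t" in
  TDSL.O.(
    let total = t ++ "...|...->... => 0" in (* reduce all axes to a scalar *)
    let flipped = t ++ "ijk => kji" in (* reverse the three output axes *)
    let expanded = t ++ "...|... => ...|...2" in (* place [t] at output dim 2 *)
    (total, flipped, expanded))
```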
## Further features of the syntax extension %cd

If you recall, inline declared param tensors get lifted out of functions except...

```ocaml
let mlp_layer ~config =
  let w = TDSL.param "w" and b = TDSL.param ~output_dims:[ config.hid_dim ] "b" in
  fun x -> TDSL.O.(w * x + b)
```
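A hedged usage sketch (the name `two_layer_mlp` is illustrative): since `w` and `b` are bound before the returned `fun x -> ...`, every application of a given layer shares one `w`/`b` pair, while each call to `mlp_layer` creates a fresh pair:

```ocaml
(* Hedged sketch: two calls to [mlp_layer] yield two layers with
   independent "w" and "b" params; applications of the same layer share
   its params. [two_layer_mlp] is an illustrative name. *)
let two_layer_mlp ~config =
  let layer1 = mlp_layer ~config in
  let layer2 = mlp_layer ~config in
  fun x -> layer2 (layer1 x)
```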

```ocaml
type comp = {
  (* ...fields elided in this excerpt; they include asgns and
     embedded_nodes, discussed below... *)
}
```

The tensor nodes that are in `asgns` but not in `embedded_nodes`, and are on-device, must already be present in contexts with which the computation is linked. Such non-embedded nodes can be seen as inputs to the computation -- except that for `backprop` code of a tensor, they are actually the outputs! Embedded nodes are closely related to _rootness_ -- when a node has not been used in the code of another tensor, it is a root (a forward root for value nodes and a backprop root for grad nodes). `embedded_nodes` were roots the first time they were used in `asgns`. Parameters, as created by `Tensor.param`, are not embedded in the code that uses them and thus will not be in `embedded_nodes` of the forward and backprop code over the parameters; however, they will constitute the `embedded_nodes` of the `Tensor.init_params` code.
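To pin down the bookkeeping, here is a toy model of the embedded vs. non-embedded distinction. It is a deliberate simplification with string node names and a `mentioned` field standing for the nodes appearing in `asgns`; it is not OCANNL's actual `comp` representation:

```ocaml
(* Toy illustration only, not the library's real data structure. *)
module NodeSet = Set.Make (String)

type toy_comp = { mentioned : NodeSet.t; embedded_nodes : NodeSet.t }

(* Nodes used but not embedded are the computation's "inputs": they must
   already be present in the context the computation is linked with. *)
let inputs c = NodeSet.diff c.mentioned c.embedded_nodes

let () =
  let forward =
    {
      mentioned = NodeSet.of_list [ "w"; "b"; "x"; "y" ];
      embedded_nodes = NodeSet.of_list [ "y" ] (* y was a root here *);
    }
  in
  (* Prints b, w, x: the params and input the linked context must supply. *)
  NodeSet.iter print_endline (inputs forward)
```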