
Commit c007bde

Documentation fixes
1 parent e1a8c30 commit c007bde

5 files changed: +22 / -20 lines changed

README.md

Lines changed: 13 additions & 11 deletions

@@ -36,23 +36,25 @@ OCANNL is sponsored by [Ahrefs](https://ocaml.org/success-stories/peta-byte-scal
 
 ## Usage
 
-Starting from OCANNL 0.5.2, the CUDA backend requires at least CUDA version 12.8. The Metal backend requires at least MSL version 3.1.
+The CUDA backend requires at least CUDA version 12.8. The Metal backend requires at least MSL version 3.1.
 
 [API documentation entry point](https://ahrefs.github.io/ocannl/dev/).
 
 A possible route to learning OCANNL:
 
 1. Read [the introductory slides](https://ahrefs.github.io/ocannl/docs/basics_backprop_training_codegen.html).
-2. Get some basic grasp of the aims and design of the project by reading or skimming files in [test/](test/).
-3. Read the syntax extensions documentation [docs/syntax_extensions.md](docs/syntax_extensions.md).
-4. Read the introductory part of the shape inference documentation [docs/shape_inference.md](docs/shape_inference.md).
-5. Read the configuration documentation [ocannl_config.example](ocannl_config.example).
-6. Improve your understanding by reading or skimming: [lib/shape.mli](lib/shape.mli), [lib/tensor.mli](lib/tensor.mli), [lib/operation.ml](lib/operation.ml), [arrayjit/lib/backend_intf.ml](arrayjit/lib/backend_intf.ml), [lib/train.ml](lib/train.ml), and [lib/nn_blocks.ml](lib/nn_blocks.ml).
-7. Read [docs/anatomy_of_a_backend.md](arrayjit/lib/anatomy_of_a_backend.md).
-8. Read the implementation overview:
-   1. Shape inference details [docs/shape_inference.md](docs/shape_inference.md).
-   2. Backend-independent optimizations [docs/lowering_and_inlining.md](arrayjit/lib/lowering_and_inlining.md) -- _lowering_ means translating (compiling) from the high-level representation (as assignments) to the low-level representation.
-   3. More documentation to come.
+2. Read [the migration guide](docs/migration_guide.md).
+3. Soon: [shapes and the generalized einsum beginner-to-advanced slides](https://ahrefs.github.io/ocannl/docs/shapes_and_einsum.html).
+4. Read the syntax extensions documentation [docs/syntax_extensions.md](docs/syntax_extensions.md).
+5. Read the introductory part of the shape inference documentation [docs/shape_inference.md](docs/shape_inference.md).
+6. Read the NN building blocks file [lib/nn_blocks.ml](lib/nn_blocks.ml).
+7. Skim the configuration documentation [ocannl_config.example](ocannl_config.example).
+8. Improve your understanding by reading or skimming: [lib/shape.mli](lib/shape.mli), [lib/tensor.mli](lib/tensor.mli), [lib/operation.ml](lib/operation.ml), [arrayjit/lib/backend_intf.ml](arrayjit/lib/backend_intf.ml), [lib/train.ml](lib/train.ml).
+9. Read [docs/anatomy_of_a_backend.md](arrayjit/lib/anatomy_of_a_backend.md).
+10. Read the implementation overview:
+    1. The various tests.
+    2. Shape inference details [docs/shape_inference.md](docs/shape_inference.md).
+    3. Backend-independent optimizations [docs/lowering_and_inlining.md](arrayjit/lib/lowering_and_inlining.md) -- _lowering_ means translating (compiling) from the high-level representation (as assignments) to the low-level representation.
 
 ### Using the tracing debugger with CUDA computations

arrayjit.opam

Lines changed: 1 addition & 1 deletion

@@ -10,7 +10,7 @@ authors: ["Lukasz Stafiniak"]
 license: "BSD-2-Clause"
 tags: ["deeplearning" "array" "jit" "CUDA" "Metal"]
 homepage: "https://github.com/lukstafi/ocannl"
-doc: "https://github.com/lukstafi/ocannl/blob/master/README.md"
+doc: "https://ahrefs.github.io/ocannl/docs/"
 bug-reports: "https://github.com/lukstafi/ocannl/issues"
 depends: [
   "ocaml" {>= "5.3.0"}

docs/migration_guide.md

Lines changed: 6 additions & 6 deletions

@@ -32,9 +32,10 @@ This is why pooling needs a dummy constant kernel - to carry shape info between
 | `F.dropout(x, p=0.5)` | `dropout ~rate:0.5 () ~train_step x` | Needs train_step for PRNG |
 | `F.relu(x)` | `relu x` | Direct function application |
 | `F.softmax(x, dim=-1)` | `softmax ~spec:"... | ... -> ... d" () x` | Specify axes explicitly |
-| `torch.matmul(a, b)` | `a * b` or `a +* "...; ... => ..." b` | Einsum for complex cases |
+| `torch.matmul(a, b)` | `a * b` or `a +* "..b.. -> ..a..; ..b.. => ..a.." b` | Einsum for complex cases |
 | `x.mean(dim=[1,2])` | `x ++ "... | h, w, c => ... | 0, 0, c" ["h"; "w"] /. (dim h *. dim w)` | Sum then divide |
-| `x.sum(dim=-1)` | `x ++ "... | ... d => ... | 0"` | Reduce by summing |
+| `x.sum(dim=-1, keepdim=True)` | `x ++ "... | ... d => ... | ... 0"` | Reduce by summing |
+| `x.sum(dim=-1, keepdim=False)` | `x ++ "... | ... d => ... | ..."` | Reduce by summing |
 
 ## Tensor Creation Patterns
 

@@ -138,7 +139,7 @@ OCANNL's einsum has two syntax modes:
 
 2. **Multi-character mode**:
    - Triggered by ANY comma in the spec
-   - Trailing commas ignored
+   - Trailing commas ignored (can be used to trigger multi-char mode)
    - Identifiers can be multi-character (e.g., `height`, `width`)
    - Must be separated by non-alphanumeric: `,` `|` `->` `;` `=>`
   - Makes convolution syntax less confusing: `stride*out+kernel`

@@ -152,9 +153,8 @@ OCANNL's einsum has two syntax modes:
 
 ### Row Variables
 - `...` context-dependent ellipsis: expands to `..batch..` in batch position, `..input..` before `->`, `..output..` after `->`
-- `..b..` for batch axes (arbitrary number)
-- `..ic..`, `..oc..` for input/output channels (can be multi-dimensional)
-- `..spatial..` for spatial dimensions
+- Single-char mode example: `..b..|` for batch axes (arbitrary number)
+- Multi-char mode examples: `h, w, ..ic..`, `h, w, ..oc..` for input/output channels (can be multi-dimensional), `..spatial.., channel` for spatial dimensions
 
 ## Common Gotchas and Solutions
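For readers skimming the migration table above, a minimal OCaml sketch of how the two new `x.sum` rows differ. It assumes OCANNL's `%op` syntax extension and an already-defined tensor `x`; the binding names are illustrative, and the einsum specs are copied verbatim from the table, so treat this as a sketch rather than verified API usage.

```ocaml
(* Minimal sketch, assuming the OCANNL [%op] syntax extension is available and
   [x] is an existing tensor; the specs below are the ones from the table. *)

(* Like x.sum(dim=-1, keepdim=True): the reduced last output axis is kept,
   written as the fixed index 0 in the result spec. *)
let%op sum_keepdim x = x ++ "... | ... d => ... | ... 0"

(* Like x.sum(dim=-1, keepdim=False): the reduced axis is dropped from the
   result spec entirely. *)
let%op sum_last x = x ++ "... | ... d => ... | ..."
```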

dune-project

Lines changed: 1 addition & 1 deletion

@@ -23,7 +23,7 @@
 
 (license "BSD-2-Clause")
 
-(documentation https://github.com/lukstafi/ocannl/blob/master/README.md)
+(documentation https://ahrefs.github.io/ocannl/docs/)
 
 ; We give up on npy / ocannl_npy for now.

neural_nets_lib.opam

Lines changed: 1 addition & 1 deletion

@@ -10,7 +10,7 @@ authors: ["Lukasz Stafiniak"]
 license: "BSD-2-Clause"
 tags: ["deeplearning" "tensor" "backprop" "jit" "CUDA" "Metal"]
 homepage: "https://github.com/lukstafi/ocannl"
-doc: "https://github.com/lukstafi/ocannl/blob/master/README.md"
+doc: "https://ahrefs.github.io/ocannl/docs/"
 bug-reports: "https://github.com/lukstafi/ocannl/issues"
 depends: [
   "ocaml" {>= "5.3.0"}
