
Commit b933682

slides-shapes_and_einsum.md, collab with Claude Opus
This was driven by Claude, but I corrected things and wrote examples 1 and 2 in `Advanced: Building Custom Operations`; after some back-and-forth, Claude wrote example 3.

Prompt: Let's write a user-centered introduction to how shapes work in OCANNL. Let's put the slides in docs/slides-shapes_and_einsum.md, and write them using slipshow navigation metadata as described in docs/CLAUDE.md. The slides should take a user from beginner to advanced in making full use of shape inference and generalized einsum notation when building neural network models. They should end up aware of how projections work, and of how to lean on shape inference or row variables / ellipsis notation so as not to commit to dimension sizes, or for example to the number of batch axes, unnecessarily. They should learn when to use the dedicated einsum operators `++`, `+*`, `@^^`, `@^+` (these operators are translated by syntax extensions to `einsum1`, `einsum`, `einmax1`, `tropical`). They should be able to use what they learned to construct a new model not yet in lib/nn_blocks.ml. Consider these sources of information: docs/syntax_extensions.md, docs/shape_inference.md, lib/nn_blocks.ml, lib/shape.mli, selected parts of lib/operation.ml, and selected parts of docs/slides-basics_backprop_training_codegen.md. Let me also provide some points that might not be stated sufficiently explicitly in other documentation:

1. The split of axes into kinds does not enforce semantics, because the generalized einsum notation can make arbitrary use of the axes. However, it offers expressivity gains.
2. It's important to stress the syntactic difference from NumPy: since we use `->` to separate input and output axes, it cannot also separate the argument tensor(s) from the result tensor -- thus `=>` is used to the left of the result tensor; similarly, we use `;` to separate arguments and (in multi-char mode) `,` to separate identifiers within a row.
3. Remember to use kind separators where you intend to use the distinct axis kinds, e.g. use `|` after batch axes.
4. To trigger multi-char mode there must be a comma in the spec; it can be a trailing comma, e.g. "input->output, => output->input".
5. A reminder that, as defined in lib/operation.ml, `*` stands for tensor multiplication and `*.` stands for pointwise multiplication when working with tensor expressions (rather than low-level assignments in the `%cd` syntax).
6. The user can define operations analogous to the `einsum1` and `einsum` operations in lib/operation.ml, for example with the max operator as the accumulation operator -- this is not so scary: operations can easily be added by users even outside lib/operation.ml.
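(For illustration here, not part of the commit: a minimal OCaml sketch of points 2, 4, and 5 above, assuming the `%op` convention from the OCANNL README where string literals introduce learnable parameters; the function names are hypothetical.)

```ocaml
(* Point 4: the trailing comma switches the spec to multi-char mode, and
   [=>] (not [->]) introduces the result row.  [++] is einsum1. *)
let%op transpose t = t ++ "input->output, => output->input"

(* Point 5: [*] is tensor multiplication (contracting input axes against
   output axes), while [*.] is pointwise multiplication.  The parameters
   "w" and "b" get their shapes from shape inference. *)
let%op affine x = ("w" * x) + "b"
let%op gate a b = a *. b
```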
Summary 1 by Claude: I've created comprehensive slides on shapes and einsum notation in OCANNL. The slides progress from beginner to advanced concepts.

Structure:
1. Introduction - Why shape inference matters
2. Basics - NumPy differences, axis kinds
3. Row variables - Flexible broadcasting with ... and ..var..
4. Multi-character mode - When to use commas
5. Einsum operators - ++, +*, @^^, @^+
6. Convolutions - Using the + syntax
7. Dimension capture - Extracting sizes for computation
8. Custom operations - Building beyond built-ins
9. Practical patterns - Real-world examples
10. Shape inference - How it works under the hood
11. Projections - Loop generation details
12. Tips and exercise - Best practices with hands-on example

Key features of the slides:
- Uses slipshow navigation metadata (pauses, up/down, IDs)
- Progresses from simple to complex concepts
- Includes practical examples from nn_blocks.ml
- Emphasizes the "principle of least commitment"
- Covers all the points you mentioned (axis kinds, row variables, syntactic differences, etc.)
- Ends with an exercise building custom attention

The slides teach users to leverage OCANNL's powerful shape inference rather than over-specifying dimensions, and to use row variables for maximum flexibility when building neural network models.

Prompt 2: We also have the documentation docs/migration_guide.md. Review your work in light of that file. Note that a user is likely to read both files, although I am not sure which order to recommend -- I'm leaning toward recommending the migration guide after the slides we are currently working on. Note that you made some mistakes that I corrected. Apply any updates you deem are still helpful.

Summary 2 by Claude: Corrections applied:
1. Fixed the ellipsis examples to properly show axis kinds (batch | input -> output)
2. Corrected *. for pointwise multiplication (not *, which is tensor multiplication)
3. Added clarification of single-char vs multi-char row variable syntax (..v.. vs ..var..)
4. Fixed the matrix multiplication examples to properly show axis kinds

Enhancements from the migration guide:
1. Added a "Common Pitfalls" section highlighting operator differences
2. Included the 0.5 + 0.5 trick for creating shape-inferred constants
3. Added a connection to the migration guide in "Next Steps"
4. Included "The OCANNL Way" philosophy section emphasizing the difference from PyTorch/TF

Recommendation for reading order: I suggest users read these slides first, then the migration guide. The slides provide conceptual understanding of shapes and einsum from first principles, while the migration guide offers practical translation patterns for those coming from other frameworks. The slides build intuition that makes the migration guide's examples clearer.

The slides now accurately represent OCANNL's shape system with:
- Proper understanding of the three axis kinds
- Clear examples of single-char vs multi-char einsum modes
- Practical patterns from actual usage in nn_blocks.ml
- Warnings about common mistakes
- Connections to further resources

Summary 3 by Claude: I've replaced the problematic "Custom Max-Pool" example with the "Reduce Last N Dimensions" example. This new example:
1. Shows genuine programmatic spec building - you can't write a literal spec when n is a runtime parameter
2. Has a real use case - reducing an arbitrary number of trailing dimensions is useful for various pooling and aggregation operations
3. Is easy to understand - the example comment shows exactly what spec gets generated for n=3
4. Uses single-char mode appropriately - generating sequential letters (a, b, c, ...) for the dimension variables

The example clearly demonstrates when and why you'd need to build specs programmatically rather than using literal strings!
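(For illustration, not quoted from the slides: the kind of programmatic spec building Summary 3 describes can be sketched in plain OCaml as below; the exact spec format the slides generate is not shown in this commit message, so the string produced here is an assumption.)

```ocaml
(* Build an einsum spec that reduces the last [n] trailing output axes,
   generating sequential single-char dimension variables a, b, c, ...
   For n = 3 this yields "... | ... abc => ... | ...": the axes named
   a, b, c appear only to the left of [=>], so they are reduced. *)
let reduce_last_n_spec n =
  let letters = String.init n (fun i -> Char.chr (Char.code 'a' + i)) in
  Printf.sprintf "... | ... %s => ... | ..." letters
```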
Summary 4: I've updated the "Principle of least commitment" section to accurately reflect how OCANNL handles the different axis kinds:
1. Made it clear that "...|...->..." is needed for truly arbitrary shapes (all three axis kinds)
2. Added context for when simpler patterns work:
   - "...->..." for parameters (which typically don't have batch axes)
   - "...|..." for data tensors (which typically don't have input axes)
3. Fixed the binary op pattern to show the full specification needed for complete generality
4. Clarified that "... | ..d.. => ... | 0" specifically reduces output axes

This now accurately represents OCANNL's requirement that missing axis kinds are assumed absent, not arbitrary -- an important distinction for users to understand when writing maximally flexible code.
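(Again for illustration only: a hedged sketch of the reduction pattern from point 4, assuming `++` accepts the spec string directly after the tensor; the function name is hypothetical.)

```ocaml
(* Per Summary 4, this spec reduces all output axes (captured by the row
   variable ..d..) while keeping the batch axes (the plain ...) intact.
   Both rows must be spelled out, since missing axis kinds are assumed
   absent, not arbitrary. *)
let%op sum_outputs t = t ++ "... | ..d.. => ... | 0"
```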
1 parent 53331e4 commit b933682

File tree

3 files changed: +489 −3 lines changed


.github/workflows/gh-pages-docs.yml

Lines changed: 3 additions & 0 deletions
@@ -38,6 +38,9 @@ jobs:
       - name: Syntax extensions
         run: pandoc --toc -s --embed-resources --css=docs/html/style.css docs/syntax_extensions.md -o docs/html/syntax_extensions.html
 
+      - name: Shapes and Einsum
+        run: pandoc --toc -s --embed-resources --css=docs/html/style.css docs/slides-shapes_and_einsum.md -o docs/html/shapes_and_einsum.html
+
       - name: Deploy
         uses: peaceiris/actions-gh-pages@v4
         with:

docs/html/index.html

Lines changed: 8 additions & 3 deletions
@@ -196,15 +196,20 @@ <h1>🌐 OCANNL Directory</h1>
         description: 'Introduction to OCANNL: Supervised Learning, backpropagation, training loop, code generation'
       },
       {
-        name: 'RL-REINFORCE.html',
-        title: 'Introduction to Reinforcement Learning',
-        description: 'Introduction to Reinforcement Learning, the REINFORCE and GRPO algorithms'
+        name: 'shapes_and_einsum.html',
+        title: 'Shapes and Einsum in OCANNL: From Basics to Advanced',
+        description: 'Shapes and the generalized einsum notation in OCANNL: with explanations and examples'
       },
       {
         name: 'migration_guide.html',
         title: 'Migration Guide',
         description: 'Migration Guide: PyTorch/TensorFlow to OCANNL'
       },
+      {
+        name: 'RL-REINFORCE.html',
+        title: 'Introduction to Reinforcement Learning',
+        description: 'Introduction to Reinforcement Learning, the REINFORCE and GRPO algorithms'
+      },
       {
         name: 'syntax_extensions.html',
         title: 'Syntax Extensions',
