* Add concatenation to the einsum syntax (an axis that is a concatenation of two axes, each from another tensor); it's a generalization of stacking tensors.
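To see why concatenation generalizes stacking, here is a minimal sketch using plain OCaml arrays as stand-ins for tensors (this is an illustration of the shape algebra, not the proposed einsum syntax nor any OCANNL API): stacking k tensors of shape `[n]` along a new axis is the same as expanding each to shape `[1; n]` and concatenating along the new axis.

```ocaml
(* Plain OCaml arrays stand in for tensors; no OCANNL API is used. *)

(* Concatenate 1-D "tensors" along their only axis: [n] @ [m] -> [n+m]. *)
let concat_1d (xs : float array list) : float array = Array.concat xs

(* Stack 1-D tensors along a new leading axis: k tensors of [n] -> [k; n]. *)
let stack (xs : float array list) : float array array = Array.of_list xs

(* Stacking expressed as concatenation: expand each [n] to [1; n],
   then concatenate along the fresh leading axis. *)
let stack_via_concat (xs : float array list) : float array array =
  Array.concat (List.map (fun x -> [| x |]) xs)

let a = [| 1.; 2. |]
let b = [| 3.; 4. |]
let concatenated = concat_1d [ a; b ]     (* shape [4] *)
let stacked = stack [ a; b ]              (* shape [2; 2] *)
let stacked' = stack_via_concat [ a; b ]  (* equal to [stacked] *)
```

An einsum-level concatenation axis would make this relationship first-class: stacking becomes the special case where each concatenated segment has size 1.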
* **0.9: Optimize performance: program search.**
* Instead of dynamic scheduling as in tinygrad, we can schedule statically by program search.
* We should also reproduce the search that tinygrad is doing. Inspiration: Halide.
* Check which optimizations are missing against the implementation of [llm.c](https://github.com/karpathy/llm.c).
* Milestone phrasing: Program search with execution-based per-backend or aggregate-of-backends cost functions. Starting with augmenting the tiling and layout mechanisms from v0.8 with cost functions, progressing to a broader range of code graph rewriting rules.
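A minimal sketch of what execution-based schedule search could look like, with entirely hypothetical names (`schedule`, `measure`, `best_schedule` are not OCANNL API): candidate schedules here are single tile sizes, and `measure` is a stand-in for compiling and timing a kernel on a backend. A real search would benchmark generated code per backend (or aggregate across backends) and explore a much richer space of rewrites.

```ocaml
(* Hypothetical sketch: exhaustive search over tile sizes with a cost
   function. In practice [measure] would compile and time real kernels. *)

type schedule = { tile : int }

(* Stand-in cost model: penalize tiles that don't divide the problem
   size evenly and tiles too large for a pretend cache; otherwise cost
   scales with the number of tiles processed. *)
let measure ~size (s : schedule) : float =
  let waste = if size mod s.tile = 0 then 0.0 else 1.0 in
  let cache_penalty = if s.tile > 64 then 2.0 else 0.0 in
  waste +. cache_penalty
  +. (float_of_int size /. float_of_int s.tile) *. 0.01

(* Pick the cheapest candidate. Beam search over code-graph rewriting
   rules would replace this flat enumeration in a fuller implementation. *)
let best_schedule ~size (candidates : schedule list) : schedule * float =
  List.fold_left
    (fun (best, best_c) s ->
      let c = measure ~size s in
      if c < best_c then (s, c) else (best, best_c))
    ({ tile = 1 }, measure ~size { tile = 1 })
    candidates

let winner, _cost =
  best_schedule ~size:256
    (List.map (fun t -> { tile = t }) [ 8; 16; 32; 64; 100; 128 ])
```

With this toy cost model, the search settles on the largest evenly dividing tile that still fits the pretend cache; swapping `measure` for wall-clock timings of compiled kernels turns the same loop into an empirical autotuner.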
* **1.0: Few documentation gaps, some degree of feature completeness, ergonomics, safety.**
OCANNL follows different design choices than [OWL](https://ocaml.xyz/).
Although the project is called `ocannl`, the main package is called `neural_nets_lib`, to avoid the (opam linter's) complaint that the name can be confused with other packages. This also clarifies that `ocannl` is composed of `arrayjit` and `neural_nets_lib`.
The dependency on `cudajit` is optional, so you have to install it first to enable the CUDA backend. The dependency on `metal` is macOS-specific but automatic.
### Code Organization
The codebase is organized to separate user-facing recipes from framework internals:
- **`lib/`**: User-facing recipes and utilities
  - `train.ml` - Training utilities and optimizers
  - `nn_blocks.ml` - Neural network building blocks (transformers, attention, convolution, etc.)
  - `ocannl.ml` - Re-exports for backward compatibility