Commit e569617

Update CHANGES in preparation for 0.4.0 release
1 parent 127e804 commit e569617

1 file changed: +21 -13 lines changed

CHANGES.md

Lines changed: 21 additions & 13 deletions
@@ -1,14 +1,4 @@
-## [0.4.1] -- next
-
-### Added
-
-- TODO: API improvements for mixed precision computations.
-
-### Fixed
-
-- TODO: Proper implementation of half precision. Requires OCaml 5.2.
-
-## [0.4.0] -- 2024-07-??
+## [0.4.0] -- 2024-09-04

 ### Added

@@ -17,7 +7,14 @@
 - backends just need to support device-to-device transfers,
 - merging gets implemented in "user space".
 - CUDA streaming multiprocessor parallelism via streams <-> virtual devices.
-- TODO(#262): "term punning" for `%cd`.
+- Support for `cuda-gdb` and `compute-sanitizer` (pass the right arguments to cudajit).
+- Inline declarations for (non-differentiable) tensors in the `%cd` syntax.
+- A minimal wrapper `Sync_backend` creating CPU backends with a single device only, where all calls are synchronous. (It's a baseline and helps debugging.)
+- In progress: proper (condition variables based) scheduler. The legacy scheduler (pipes based) kept for now as baseline and to help debugging.
+- Documentation for the syntax extensions.
+- `%op` syntax: when under a `~config` parameter, refine the inline declared params' labels with `config.label`.
+- `%op` syntax: incorporate the input tensor's (if any) label in the resulting tensor's label.
+- Comments in config files using the line prefix `~~`.

 ### Changed

@@ -31,7 +28,18 @@
 - split the `device` type into virtual `device` and `physical_device`,
 - removed the direct support for `merge`, instead relying on merge buffers.
 - Updated to cudajit 0.4.
-- TODO: a template for C-syntax backends, refactoring CC and CUDA backends.
+- A template for C-syntax backends, refactoring CC and CUDA backends.
+- Improvements to handling of tensor node labels, and to the `Tnode.debug_name` function.
+- Output files generated by backends, and files generated by logging, in separate subdirectories.
+- C-syntax logging: also output the pre-assignment value when logging an assignment.
+- Migrated to ppx_minidebug 2.0 with the benefits it brings: no runtime passing, `Utils.settings.log_level` unified with ppx_minidebug's log levels.
+
+### Fixed
+
+- Allow verifying that non-embedded tensor nodes of the tensor(s) associated with a linked code are already in the context passed to `link` (resp. `link_batch`), since they won't get introduced into the context. It is the responsibility of helper functions (such as those in `Train`) to ensure the check.
+- Fixed both known and newly discovered shortcomings of the syntax extensions.
+- In particular, `%op` syntax: lift `~config` applications out of (tensor) functions.
+- Multiple other tiny fixes.

 ## [0.3.3] -- 2024-04-24
