CHANGES.md: 5 additions & 1 deletion
@@ -2,8 +2,10 @@

 ### Added

-- The previously-mocked support for half precision.
+- Implemented the previously-mocked support for half precision (FP16).
 - We work around the missing Ctypes coverage by not using `Ctypes.bigarray_start`.
+- We check FP16 constants for overflow.
+- We output half precision specific code from the CUDA backend.

 ### Changed

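The FP16 overflow check mentioned in this hunk can be illustrated with a minimal OCaml sketch. It assumes the check amounts to comparing a constant's magnitude against the largest finite half-precision value, 65504. The names `fp16_max` and `check_fp16_constant` are hypothetical and are not taken from the project's code.

```ocaml
(* Largest finite IEEE 754 binary16 (half precision) value. *)
let fp16_max = 65504.0

(* Hypothetical helper: [Ok c] when the magnitude of [c] is within the
   finite FP16 range, [Error msg] otherwise. NaN and infinities pass
   through unchanged, since FP16 can represent them. *)
let check_fp16_constant (c : float) : (float, string) result =
  if Float.is_finite c && Float.abs c > fp16_max then
    Error (Printf.sprintf "FP16 overflow: %g exceeds %g in magnitude" c fp16_max)
  else Ok c

let () =
  assert (check_fp16_constant 1.0 = Ok 1.0);
  assert (check_fp16_constant (-65504.0) = Ok (-65504.0));
  assert (Result.is_error (check_fp16_constant 70000.0))
```

This sketch is slightly conservative: values just above 65504 would still round to 65504 under round-to-nearest, but it rejects them anyway.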
@@ -18,6 +20,8 @@
 - `debug_log_from_routines` should only happen when `log_level > 1`.
 - Bugs in `Multicore_backend`: `await` was not checking queue emptiness, `worker`'s `Condition.broadcast` was non-atomically guarded (doesn't need to be), possible deadloop due to the lockfree queue -- now replaced with `saturn_lockfree`.
 - Reduced busy-waiting inside `c_compile_and_load`, propagating compilation errors now instead of infinite loop on error.
+- Fixed loss of significant digits for small numbers when outputting files.
+- Added missing mixed-precision conversions in the `C_syntax` backend builder.
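As an illustration of the significant-digits entry in this hunk, the OCaml sketch below shows one way small numbers can lose digits when written out as text. The diff does not say which format the project used before or after the fix, so the `"%f"` and `"%.17g"` formats here are assumptions, and `lossy` and `precise` are hypothetical names.

```ocaml
(* "%f" keeps six digits after the decimal point, so small values
   collapse to zero when printed. *)
let lossy (x : float) : string = Printf.sprintf "%f" x

(* "%.17g" keeps 17 significant digits, which is enough to round-trip
   any finite double through its decimal representation. *)
let precise (x : float) : string = Printf.sprintf "%.17g" x

let () =
  let x = 1.5e-9 in
  (* All significant digits are lost. *)
  assert (lossy x = "0.000000");
  (* The value survives a print/parse round trip. *)
  assert (float_of_string (precise x) = x)
```

A format with fewer significant digits would give shorter files at the cost of exact round-tripping, so the right precision depends on how the output is consumed.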