Commit 860e2f3
Yay, fix to the garbled nvrtc args bug! also defensively use no-spaces CUDA path; by Claude Opus
Summary by Claude:
The heisenbug was caused by the OCaml garbage collector prematurely collecting the options
string list while NVRTC was still using it through the FFI. The solution is to use
Sys.opaque_identity to keep the options alive until after the NVRTC call completes.
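The keep-alive idiom described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `fake_nvrtc_compile` is a hypothetical stand-in for the real NVRTC binding, and the sketch assumes the binding reads memory owned by the OCaml value during the call.

```ocaml
(* Hypothetical stand-in for the foreign NVRTC call; the real binding is not
   shown in this commit page. It only counts the options it receives. *)
let fake_nvrtc_compile (options : string array) : int = Array.length options

let compile_with_live_options (options : string list) : int =
  (* Materialize the options once and keep a name bound to them. *)
  let c_options = Array.of_list options in
  let result = fake_nvrtc_compile c_options in
  (* Use the value again after the call. Sys.opaque_identity is treated by the
     optimizer as an unknown function, so this use cannot be removed or moved
     before the call, and c_options stays reachable while the call runs. *)
  ignore (Sys.opaque_identity c_options);
  result
```

The point of the design is that the last use of the options sits after the foreign call rather than before it, so the value is still referenced for the whole duration of the call.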
The key changes:
1. Used the no-spaces junction path (%LOCALAPPDATA%/cuda_path_link) created by
ocaml-cudajit to avoid issues with spaces in the CUDA installation path
2. Added Sys.opaque_identity to prevent premature garbage collection of the options
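For the first change, a path lookup that prefers the no-spaces junction might look like the sketch below. The function names and the fallback to the `CUDA_PATH` environment variable are assumptions made for illustration; only the `%LOCALAPPDATA%/cuda_path_link` location comes from the commit message.

```ocaml
(* Build the junction location under LOCALAPPDATA, if that variable is set. *)
let junction_candidate (local_app_data : string option) : string option =
  Option.map (fun base -> Filename.concat base "cuda_path_link") local_app_data

(* Prefer the junction when it exists; otherwise fall back to the given
   CUDA path (an assumed fallback, not confirmed by the commit). *)
let choose_cuda_path ~(exists : string -> bool) ~local_app_data ~cuda_path =
  match junction_candidate local_app_data with
  | Some p when exists p -> Some p
  | _ -> cuda_path

(* Wire the pure helper to the real environment and file system. *)
let cuda_path () : string option =
  choose_cuda_path ~exists:Sys.file_exists
    ~local_app_data:(Sys.getenv_opt "LOCALAPPDATA")
    ~cuda_path:(Sys.getenv_opt "CUDA_PATH")
```

Keeping the choice in a pure helper that takes `exists` as a parameter makes the selection logic testable without a CUDA installation.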
This should resolve the Windows CUDA backend issue for the 0.6.0 release. The flambda CI
issue with missing tensor nodes (n43, n45, n56) appears to be a separate issue related to
more aggressive optimizations, which could be investigated separately if needed.
Signed-off-by: lukstafi <lukstafi@users.noreply.github.com>
1 parent 0ef10a6 commit 860e2f3
1 file changed, +33 -4 lines changed

| Hunk | Original lines removed | New lines added |
|---|---|---|
| 1 | 143-144 | 143-170 |
| 2 | 157-158 | 183-187 |