BUG: Example 08 not runnable with Cuda #451
Comments
@TimerErTim can you try re-running this with the latest main? I think one of the PRs I merged earlier fixed this.
@coreylowman I just did, but the error remains. Would I need to rebuild the kernels?
@TimerErTim nah, they will rebuild if they are changed, so you shouldn't need to. I'll take a look at this!
Ahh, this is because our sum kernels assume that the number of elements in the result is less than the number of physical elements. Here's a unit test that currently fails for Cuda:

    #[test]
    fn test_sum_reduce_to_more_than_physical_elements() {
        let dev: TestDevice = Default::default();
        let a: Tensor<_, TestDtype, _> = dev.tensor([1.0, 2.0, 3.0]);
        let b = a.broadcast::<Rank3<4, 3, 2>, _>();
        let c = b.sum::<Rank2<4, 3>, _>();
        assert_eq!(c.array(), [[2.0, 4.0, 6.0]; 4]);
    }

In sum, chunk_len ( Will require some rewriting of our reduction kernels. FYI @nkoppel
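To make the failing case concrete, here is a minimal CPU-only sketch in plain Rust that does not use dfdx. It assumes a broadcast tensor can be modelled as a strided view whose broadcast axes have stride 0; the helper names `physical_index` and `sum_last_axis` are hypothetical and are not part of the library. It only shows how a buffer with 3 physical elements can reduce to a result with 12 elements, which is the situation the unit test above exercises. It is not how the Cuda kernels are written.

```rust
/// Maps a logical (row-major) index into a strided view to an offset in the
/// underlying buffer. Broadcast axes are assumed to have stride 0.
fn physical_index(mut logical: usize, shape: &[usize], strides: &[usize]) -> usize {
    let mut idx = 0;
    // Peel off the index along each axis, starting from the innermost one.
    for (&dim, &stride) in shape.iter().zip(strides).rev() {
        idx += (logical % dim) * stride;
        logical /= dim;
    }
    idx
}

/// Reference sum over the last axis of a strided view.
/// Assumes every axis has a non-zero size.
fn sum_last_axis(data: &[f32], shape: &[usize], strides: &[usize]) -> Vec<f32> {
    let last = *shape.last().expect("shape must have at least one axis");
    let total: usize = shape.iter().product();
    // One output element per group of `last` consecutive logical elements.
    let mut out = vec![0.0; total / last];
    for logical in 0..total {
        out[logical / last] += data[physical_index(logical, shape, strides)];
    }
    out
}
```

Calling `sum_last_axis(&[1.0, 2.0, 3.0], &[4, 3, 2], &[0, 1, 0])` mirrors the test: the buffer holds 3 values, the logical shape is (4, 3, 2), and the result has 4 * 3 = 12 elements, more than the buffer itself.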
@coreylowman The example is now executable as intended, but this code:

    let b = a.broadcast::<Rank2<5, 3>, _>();
    assert_eq!(b.array(), [[1.0, 2.0, 3.0]; 5]);
    let c = b.broadcast::<Rank4<7, 5, 3, 2>, _>().to_device(&Cpu::default());
    assert_eq!(c.array(), [[[[1.0; 2], [2.0; 2], [3.0; 2]]; 5]; 7]);
    let d = c.mean::<Rank2<5, 3>, _>();
    assert_eq!(d.array(), [[1.0, 2.0, 3.0]; 5]);
    let e = dev.tensor([1.0]);
    let f = e.broadcast::<Rank2<1, 1>, Axis<1>>();

still results in the very weird assertion error inside
When running the 08-tensor-broadcast-reduce example on a Cuda device, it fails on reducing the c tensor on two axes with the following error message:
The code used is the following:
The example runs fine on the CPU.
Changing the code to this:
results in another error: