Hi, I have a rather odd problem where a tensor's values are sometimes overwritten by the result of a later calculation. It only happens with the LibTorch backend, not with the NdArray backend (the only other one I've tested).
It seems to be caused by some combination of reshaping and taking the exponent: when I take a slice of a tensor (which is cloned), the original values are sometimes replaced by the exponent of that slice. It was a gnarly bug to track down in my code, but I managed to reduce it to a minimal working example.
```rust
// Imports are assumed here to make the test self-contained; the exact paths depend
// on your setup (e.g. `burn-ndarray` and `burn-tch` as dev-dependencies), and
// `Result` is assumed to be something like `anyhow::Result`.
use burn::tensor::Tensor;
use burn_ndarray::{NdArray, NdArrayDevice};
use burn_tch::{LibTorch, LibTorchDevice};

#[test]
fn bizarre() -> Result<()> {
    let zeros = Tensor::<NdArray, 1>::zeros([2], &NdArrayDevice::default());
    zeros.clone().slice([1..2]).reshape([1]).exp(); // Works as expected
    assert_eq!(
        zeros.to_data(),
        Tensor::<NdArray, 1>::zeros([2], &NdArrayDevice::default()).to_data()
    );

    let zeros = Tensor::<LibTorch, 1>::zeros([2], &LibTorchDevice::default());
    // Works as expected thanks to the second clone after reshaping.
    zeros.clone().slice([1..2]).reshape([1]).clone().exp();
    assert_eq!(
        zeros.to_data(),
        Tensor::<LibTorch, 1>::zeros([2], &LibTorchDevice::default()).to_data()
    );

    let zeros = Tensor::<LibTorch, 1>::zeros([2], &LibTorchDevice::default());
    // Doesn't work: `zeros` ends up equal to [0.0, 1.0].
    zeros.clone().slice([1..2]).reshape([1]).exp();
    assert_eq!(
        zeros.to_data(),
        Tensor::<LibTorch, 1>::zeros([2], &LibTorchDevice::default()).to_data()
    );

    Ok(())
}
```
The first two asserts pass while the last one fails: the second value of `zeros` has been replaced by e^0 = 1. (If I try with different starting values, the overwritten number is consistently equal to e^x.)
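For concreteness, here is a sketch of the same pattern with a nonzero starting value, assuming the same setup as the test above; the observed output is inferred from the behaviour described, not re-run:

```rust
let ones = Tensor::<LibTorch, 1>::ones([2], &LibTorchDevice::default());
ones.clone().slice([1..2]).reshape([1]).exp();
// Expected: [1.0, 1.0]
// Observed (per the behaviour described above): [1.0, 2.7182817], i.e. e^1.
```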
I have no idea what could be causing this, but it happens both with the current release (0.12.1) and with the current commit on the main branch.
Thanks!
Thanks, @MichaelGoodale, for reporting the bug. We track the number of references to each tensor and reuse some of them in place for better performance and memory usage, without you having to do anything. However, there might be a bug in one of the operations that invalidates that state. With your example, it should be easy to fix. Thanks.
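For readers following along, here is a minimal sketch of the general technique being described, using a toy type rather than Burn's actual internals: an operation only mutates its buffer in place when the storage has a single owner, and the reported corruption would correspond to storage being treated as uniquely owned while another handle (here, the original tensor behind the reshaped slice) still aliases it.

```rust
use std::sync::Arc;

// Toy illustration only; not Burn's implementation.
struct FakeTensor {
    data: Arc<Vec<f32>>,
}

impl FakeTensor {
    fn exp(mut self) -> FakeTensor {
        if Arc::strong_count(&self.data) == 1 {
            // Sole owner of the storage: safe to mutate in place.
            let data = Arc::get_mut(&mut self.data).expect("unique owner");
            for x in data.iter_mut() {
                *x = x.exp();
            }
            self
        } else {
            // Storage is shared with another tensor: write into a fresh buffer.
            let data: Vec<f32> = self.data.iter().map(|x| x.exp()).collect();
            FakeTensor { data: Arc::new(data) }
        }
    }
}
```

If a view operation handed out a new handle without the in-place check seeing the extra reference, the "sole owner" branch would fire even though the original tensor still points at the same buffer, which would produce exactly the kind of overwrite seen in the test above.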