Error: The size of tensor a (10) must match the size of tensor b (8) #70

Closed
NJuntunen opened this issue Dec 9, 2021 · 3 comments

Comments

@NJuntunen

Hi,

I am trying to fit a tabnet model to a large dataset using unsupervised and then supervised training, as in the article. I get this error message every now and then after I have trained the unsupervised model and use it in supervised mode. Sometimes it helps to just run the exact same code again, but sometimes the error pops up every time. I don't really know how to approach this error. Sorry for not having a reproducible example; the data I am using is a bit sensitive.

Error in (function (self, src, non_blocking) :
The size of tensor a (10) must match the size of tensor b (8) at non-singleton dimension 0
Exception raised from infer_size_impl at ....\aten\src\ATen\ExpandUtils.cpp:28 (most recent call first):
00007FF92EF010D200007FF92EF01070 c10.dll!c10::Error::Error [ @ ]
00007FF92EF00BAE00007FF92EF00B60 c10.dll!c10::detail::torchCheckFail [ @ ]
00007FF8BA78201500007FF8BA781DD0 torch_cpu.dll!at::DynamicLibrary::sym [ @ ]
00007FF8BA78344900007FF8BA783420 torch_cpu.dll!at::infer_size_dimvector [ @ ]
00007FF8BA799F5500007FF8BA799E00 torch_cpu.dll!at::TensorIteratorBase::compute_shape [ @ ]
00007FF8BA79843200007FF8BA7983D0 torch_cpu.dll!at::TensorIteratorBase::build [ @ ]
00007FF8BA74CDE200007FF8BA74CDA0 torch_cpu.dll!at::TensorIteratorConfig::build [ @ ]
00007FF8BA8E744A00007FF8BA8E6C60 torch_cpu.dll!at::native::copy_ [ @ ]
00007FF8BA8E6CB700007FF8BA8E6C60 torch_cpu.dll!at::native::copy_ [ @ ]
00007FF8BB040D9E00007FF8BB040CF0 torch_cpu.dll!at::redispatch::copy_ [ @ ]
00007FF8BCD03C3900007FF8BCD029E0 torch_cpu.dll!torch::autograd::VariableType::allCUDATypes [ @ ]
00007FF8BB040D9E00007FF8BB040CF0 torch_cpu.dll!at::redispatch::copy_ [ @ ]
00007FF8BCD0387B00007FF8BCD029E0 torch_cpu.dll!torch::autograd::VariableType::allCUDATypes [ @ ]
00007FF8BB45C18200007FF8BB45C050 torch_cpu.dll!at::Tensor::copy_ [ @ ]
00007FF9279A2EEF00007FF9279A2E00 lantern.dll!lantern_Tensor_copy__tensor_tensor_bool [ @ ]
0000000065FEE0180000000065FEDFF0 torchpkg.dll!Z45cpp_torch_method_copy__self_Tensor_src_Tensor15XPtrTorchTensorS_b [ @ ]
0000000065EEDC7D0000000065EEDBE0 torchpkg.dll!torch_cpp_torch_method_copy__self_Tensor_src_Tensor [ @ ]
000000006C7A7BAE000000006C79F730 R.dll!Rf_NewFrameConfirm [ @ ]
000000006C7A886D000000006C79F730 R.dll!Rf_NewFrameConfirm [ @ ]
000000006C7ED189000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7FCD9C000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7740AD000000006C76D110 R.dll!Rf_coerceVector [ @ ]
000000006C7ED189000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7F4F54000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7F4F54000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7FCD9C000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C8008B7000000006C7FFDD0 R.dll!R_execMethod [ @ ]
000000006C7FCFE5000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7FCD9C000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C8008B7000000006C7FFDD0 R.dll!R_execMethod [ @ ]
000000006C7FCFE5000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7FCD9C000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C8008B7000000006C7FFDD0 R.dll!R_execMethod [ @ ]
000000006C7FCFE5000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FD4B9000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FD938000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7F2602000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FD4B9000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FCF04000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FD4B9000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FD938000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7F2602000000006C7E5EC0 R.dll!R_initAssignSymbols [ @ ]
000000006C7FCBF1000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C7FE907000000006C7FE460 R.dll!R_cmpfun1 [ @ ]
000000006C7FFB6A000000006C7FF9B0 R.dll!Rf_applyClosure [ @ ]
000000006C7FCD9C000000006C7FCA80 R.dll!Rf_eval [ @ ]
000000006C8008B7000000006C7FFDD0 R.dll!R_execMethod [ @ ]

@cregouby
Collaborator

cregouby commented Dec 10, 2021

Hello @NJuntunen,

Without leaking any sensitive data, are you able to share your data preparation code (i.e. the recipe or the formula, or just the str() of your outcome after preprocessing), the tabnet_pretrain() and tabnet_fit() config parameters, as well as the infrastructure stack (i.e. tabnet version, torch version, CPU or GPU...), so that we get a sense of which part of the code to look at (i.e. regression vs classification, batch and minibatch size, ...)?

Your issue is tough, but I think we all need to shape a kind of troubleshooting pattern for such cases...

@NJuntunen
Author

Hi @cregouby, and thanks for offering help! The issue was indeed in the data preprocessing. More precisely, I used step_nzv() in my recipe to exclude NA values from the dataset. So, as you might guess, in some cases step_nzv() happened to eliminate a column from one of the datasets (unsupervised or supervised) but not the other, which caused this error. At first glance the error message looked a bit challenging, but luckily the problem wasn't bigger than that.
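
For anyone landing here with the same error, a quick way to confirm this kind of mismatch is to compare the predictor columns of the two preprocessed datasets before calling tabnet_pretrain() and tabnet_fit(). The sketch below uses base R only; `check_same_columns()` and the toy data frames are hypothetical names made up for illustration, not part of tabnet or recipes. Note that step_nzv() in recipes removes near-zero-variance predictors, so which columns it drops depends on the data it is estimated on.

```r
# Hypothetical helper (not part of tabnet or recipes): compares the predictor
# columns of the data used for pretraining with the data used for fitting.
check_same_columns <- function(pretrain_df, fit_df, outcome = NULL) {
  # Drop the outcome column(s), if any, so only predictors are compared
  pretrain_cols <- setdiff(names(pretrain_df), outcome)
  fit_cols <- setdiff(names(fit_df), outcome)
  list(
    only_in_pretrain = setdiff(pretrain_cols, fit_cols),
    only_in_fit = setdiff(fit_cols, pretrain_cols),
    identical = identical(pretrain_cols, fit_cols)
  )
}

# Toy data standing in for two separately preprocessed datasets,
# where column "b" was dropped from only one of them
pretrain_df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
fit_df <- data.frame(a = 1:3, c = 7:9, y = c(0, 1, 0))

result <- check_same_columns(pretrain_df, fit_df, outcome = "y")
print(result)
```

If the check reports a difference, one option (an assumption about the setup, since the original recipe was not shared) is to estimate the recipe once with prep() and apply that same prepared recipe to both datasets with bake(), so that both end up with the same columns.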

@cregouby
Collaborator

Thanks for the feedback.

This scenario will become even trickier now that we support missing values in the unsupervised step. It gives me food for thought about the best way to warn users about such potential issues in the future...
