Describe the bug
I am using TabNetClassifier for classification tasks in Google Colab. Something goes wrong when I switch from CPU to GPU. The code runs without errors, and when executing with GPU the program reports that CUDA is detected, so I assume the code is running on the GPU. However, the intermediate output is completely different from the CPU run. I did not modify anything; I only switched the device from CPU to GPU and ran it again. I am attaching the code here, but I can't paste everything since I am working on a project.
import torch
from pytorch_tabnet.pretraining import TabNetPretrainer
from pytorch_tabnet.tab_model import TabNetClassifier


def train(five_fold_data, X_pretrain):
    unsupervised_model = TabNetPretrainer(
        optimizer_fn=torch.optim.Adam,
        optimizer_params=dict(lr=2e-2),
        mask_type='entmax'  # "sparsemax"
    )
    boundary = int(0.8 * len(X_pretrain))
    unsupervised_model.fit(
        X_train=X_pretrain[:boundary],
        eval_set=[X_pretrain[boundary:]],
        pretraining_ratio=0.8,
    )
    clf = TabNetClassifier(
        optimizer_fn=torch.optim.Adam,
        optimizer_params=dict(lr=2e-2),
        scheduler_params={"step_size": 10,  # how to use learning rate scheduler
                          "gamma": 0.9},
        scheduler_fn=torch.optim.lr_scheduler.StepLR,
        mask_type='entmax'  # This will be overwritten if using pretrain model
    )
    model_sets = list()
    for i, [(x_train, y_train), (x_eval, y_eval)] in enumerate(five_fold_data):
        clf.fit(
            X_train=x_train, y_train=y_train,
            eval_set=[(x_eval, y_eval)],
            eval_name=['valid'],
            eval_metric=['auc', 'accuracy'],
            max_epochs=120,
            num_workers=2,
            loss_fn=None,
            from_unsupervised=unsupervised_model
        )
        model_sets.append(clf)
    return model_sets
What is the current behavior?
The CPU version returns satisfactory results, whereas the GPU version does not.
If the current behavior is a bug, please provide the steps to reproduce.
I am not sure if it is a bug. But I am willing to share with you the entire code privately if you want.
Expected behavior
I expected both versions to return exactly identical results.
Screenshots
Other relevant information:
poetry version: NA
python version: 3.7
Operating System: Linux
Additional tools: NA
Additional context
About CPU/GPU differences: pytorch-tabnet performs reproducible training, i.e. with the same configuration and random seed you should end up with the same results. However, switching from CPU to GPU changes the configuration, so you can't reproduce exactly the same run on both CPU and GPU. This is not a limitation of pytorch-tabnet but of PyTorch, and more generally of GPU computing. Nothing can be done about this, but your results should be SIMILAR. If you see a huge difference, it means your problem is very sensitive to the random seed; try changing the seed on CPU and see whether you observe the same behavior.
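A minimal sketch of the same-device reproducibility point above (plain PyTorch, no tabnet-specific code; the tensor shape is arbitrary):

```python
import torch

# With the same seed on the SAME device, the random stream reproduces exactly.
torch.manual_seed(42)
a = torch.randn(4)
torch.manual_seed(42)
b = torch.randn(4)
print(torch.equal(a, b))  # True: identical seed, identical device, identical tensors

# Across devices the story differs: CPU and CUDA use separate random
# generators and separate kernel implementations, so even with the same
# seed, bitwise-identical CPU and GPU results are not guaranteed.
```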
The same CPU/GPU seed issue exists for the pretrainer, and a poor pretrained model can cause a large difference in the end. For this reason, you should also try your pipeline without pretraining on both CPU and GPU and see whether the results still differ greatly.
Currently your for loop is leaky: the current latest version of pytorch-tabnet does not follow the warm-start policy of scikit-learn (the fix is here: feat: add warm_start matching scikit-learn #340, coming in the next release). Since the definition of clf is outside your for loop, you are effectively fine-tuning the same model on different splits, which means the model has previously been trained on data that later serves as a validation fold. This leads to overconfident CV results; please define the model inside the for loop to avoid this.
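To illustrate the leak with a toy stand-in (this is not TabNet code; StandInModel is a hypothetical class whose fit() accumulates state, mimicking the warm-start behavior described above):

```python
class StandInModel:
    """Toy model: fit() keeps accumulating state across calls,
    like a model whose weights persist between fit() calls."""
    def __init__(self):
        self.seen_folds = []  # state that persists across fit() calls

    def fit(self, fold_id):
        self.seen_folds.append(fold_id)

# Leaky pattern (model defined ONCE, outside the loop):
leaky = StandInModel()
for fold in range(3):
    leaky.fit(fold)
print(leaky.seen_folds)  # the single model has seen every fold: [0, 1, 2]

# Safe pattern (fresh model per fold, as recommended):
safe_histories = []
for fold in range(3):
    model = StandInModel()  # re-instantiate inside the loop
    model.fit(fold)
    safe_histories.append(model.seen_folds)
print(safe_histories)  # each model saw only its own fold: [[0], [1], [2]]
```

With the leaky pattern, the model evaluated on fold 2's validation set has already trained on folds 0 and 1, so the cross-validation score is optimistic.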
I'm temporarily closing this as there's no way to reproduce your reported problem and we did not hear back from you.
Please feel free to reopen once you have more information to share.