When I run wide and deeptab,I got stuck #68

zhang-HZAU · 2021-12-10T07:24:12Z

Hi pytorch_widedeep team,
First of all, thank you for your contributions to the wide and deep field. I am currently working on a classification task that is similar to the adult salary prediction example in the project README. All of my input data has been converted to numerical values: some columns are continuous and others are discrete indicators (e.g. 0, 1, 2, ...). The data also contains missing values. My task is a five-class problem.
Question one:
I took the adult salary prediction code from the README, changed the data input part, and changed "binary" to "multiclass" to fit my task. When I execute the following code:

wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
X_wide = wide_preprocessor.fit_transform(df_train)
wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)

tab_preprocessor = TabPreprocessor(embed_cols=embed_cols, continuous_cols=cont_cols)
X_tab = tab_preprocessor.fit_transform(df_train)
deeptabular = TabMlp(
    mlp_hidden_dims=[64, 32],
    column_idx=tab_preprocessor.column_idx,
    embed_input=tab_preprocessor.embeddings_input,
    continuous_cols=cont_cols,
)

model = WideDeep(wide=wide, deeptabular=deeptabular)

I got the following warning: "RuntimeWarning: invalid value encountered in true_divide". I understand that it is caused by 0 being divided by 0, but the message is not enough for me to locate the offending code. How can I solve this problem?
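
One general way to find where such a warning originates is to ask NumPy to raise an exception instead of printing a warning, so that the traceback points at the offending line. The sketch below is only an illustration: the `scale` function and the constant column are hypothetical stand-ins, not code from pytorch_widedeep, and the actual source of the warning in this issue is not identified here.

```python
import numpy as np


def scale(column):
    """Standardise a column; produces 0/0 when the column is constant."""
    centered = column - column.mean()
    return centered / column.std()


# Hypothetical column with zero variance, so scaling divides 0 by 0.
constant = np.array([3.0, 3.0, 3.0])

# By default NumPy prints a RuntimeWarning and carries on with NaN values.
# Inside np.errstate(invalid="raise") the same operation raises
# FloatingPointError, and the traceback shows the line that triggered it.
located = None
try:
    with np.errstate(invalid="raise", divide="raise"):
        scale(constant)
except FloatingPointError as exc:
    located = str(exc)

print(located)
```

Wrapping the preprocessing calls in the same context manager should surface the failing step in the same way, assuming the warning comes from a NumPy floating-point operation.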
Question two:
When I let the trainer split off part of the training set as a validation set:

trainer = Trainer(model, objective="multiclass", metrics=[Accuracy])
trainer.fit(
    X_wide=X_wide,
    X_tab=X_tab,
    target=target,
    n_epochs=15,
    batch_size=16,
    val_split=0.1,
)

I got the error "IndexError: index 648 is out of bounds for axis 0 with size 530", i.e. an index beyond the range of the training set. I have not found a solution.
I have attached the source code; see the results in the ipynb file for the detailed error information. Looking forward to your answer~
liver_predict.zip

jrzaurin · 2021-12-10T08:57:24Z

Hey @zhang-HZAU

Thanks for the issue.

You are not the first to find this unintuitive, so we will add a warning as soon as possible.

The "issue" is that when you have a multiclass classification problem, you need to specify the number of classes via the pred_dim param of the WideDeep class, as the model has no information about the target.

see here:
https://pytorch-widedeep.readthedocs.io/en/latest/model_components.html#pytorch_widedeep.models.wide_deep.WideDeep

model = WideDeep(wide=wide, deeptabular=deeptabular, pred_dim=5)

And the class labels need to start from 0.
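
A minimal sketch of how labels could be remapped so that they start from 0, and how the number of classes for pred_dim could be derived from them. The `raw_target` array is made-up example data, and using `np.unique` for the remapping is just one option, not something the library requires.

```python
import numpy as np

# Hypothetical raw labels that start at 1 rather than 0.
raw_target = np.array([1, 3, 5, 2, 4, 1, 5])

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the position of every original label in that sorted array (0 .. K-1).
classes, target = np.unique(raw_target, return_inverse=True)
n_classes = len(classes)

print(target)     # remapped labels, starting from 0
print(n_classes)  # value to pass as pred_dim

# With the objects from the snippet above one would then write, for example:
# model = WideDeep(wide=wide, deeptabular=deeptabular, pred_dim=n_classes)
```

Keeping `classes` around makes it possible to map predictions back to the original labels with `classes[predicted_index]`.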

I am going to leave this issue open until we add a warning for when pred_dim is not passed. We will also alias it as num_classes.

Thanks again

Regarding the second question, I will have a look at the code.

Could you please open a separate issue?

Cheers!
J.

jrzaurin · 2021-12-10T09:35:44Z

@zhang-HZAU

If you could send me the data or point me towards where I can get it, that would be great.

(Also, the library is not AutoML :) so you would have to impute the NaNs before passing the data to the preprocessor... just saying 🙂)
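
A minimal sketch of imputing missing values with pandas before the data reaches the preprocessors. The DataFrame and its column names are hypothetical, and filling with the median (continuous columns) or the most frequent value (discrete columns) is only one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one continuous and one discrete column.
df = pd.DataFrame({
    "age": [34.0, np.nan, 51.0, 47.0],
    "grade": [0.0, 2.0, np.nan, 2.0],
})
cont_cols = ["age"]
embed_cols = ["grade"]

imputed = df.copy()

# Continuous columns: fill NaN with the column median.
imputed[cont_cols] = imputed[cont_cols].fillna(imputed[cont_cols].median())

# Discrete columns: fill NaN with the most frequent value, then cast back
# to int so the categories are not stored as floats.
for col in embed_cols:
    most_frequent = imputed[col].mode().iloc[0]
    imputed[col] = imputed[col].fillna(most_frequent).astype(int)

print(imputed)
```

If the data is split into training and test sets, the fill values would normally be computed on the training set only and then reused for the test set.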

zhang-HZAU · 2021-12-11T01:47:24Z

Thank you for your answers. As you suggested, I have opened a separate issue for the second problem: #70

jrzaurin self-assigned this Dec 10, 2021

jrzaurin added the enhancement label Dec 10, 2021

zhang-HZAU mentioned this issue Dec 11, 2021

To be continue:When I run Wide and DeepTab,I got stuck #70

Closed

jrzaurin closed this as completed Jan 4, 2022
