INVALID_ARGUMENT: No defined default loss for this combination of label type and task #100
Comments
@AlirezaSadeghi can you specify your label type?
@Cheril311 If I'm understanding you correctly, I've already done it in the text, it's the 2nd entry in the It's an So the Did I answer your question? If not please elaborate if possible.
@AlirezaSadeghi my bad
Hi AlirezaSadeghi, if the loss argument of the gradient boosted trees model is not specified, it is selected automatically from the label type, the label values, and the task. The error you reported indicates that no default loss matches your label. Looking at your example, a likely explanation is that your int64 label only contains zeros. Can you check it? Alternatively, you can set the loss to "BINOMIAL_LOG_LIKELIHOOD", i.e. the binary classification loss. On my side, I'll improve the error message for this particular situation.
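For readers following along, the suggestion above could be sketched roughly as follows. This is a minimal sketch, not code from the thread: the helper name `build_gbt_model` and the constant `BINARY_LOSS` are hypothetical, and it assumes the model is created through `tfdf.keras.GradientBoostedTreesModel` with a classification task.

```python
# Loss name quoted in the comment above: the binary classification loss.
BINARY_LOSS = "BINOMIAL_LOG_LIKELIHOOD"


def build_gbt_model():
    """Create a gradient boosted trees model with an explicit loss.

    Hypothetical helper for illustration; not part of the original issue.
    """
    # Imported lazily so this module still loads where TF-DF is not installed.
    import tensorflow_decision_forests as tfdf

    return tfdf.keras.GradientBoostedTreesModel(
        task=tfdf.keras.Task.CLASSIFICATION,
        loss=BINARY_LOSS,  # set explicitly instead of relying on the default
    )
```

Setting the loss only skips the automatic selection; it does not change what the labels in the training data look like.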
Hi @achoum, yup, your assumption is actually right. I'm just testing the pipeline and running the model on a part of the training set, which includes all zeros for starters. Didn't know that might become an issue. I'll try with
Okay doing that, it tells me this:
It's somehow assuming the task is not "binary classification"?
@achoum just an fyi, have you seen my last comment? Wondering if you've got any further insights.
If your task is not a binary classification task, you can try setting the loss to MULTINOMIAL_LOG_LIKELIHOOD
My task "is" binary classification, and the labels are all 0s, don't know how it's assuming the task is not "binary classification". (as I've already mentioned before)
Oh, apologies, I overlooked that part in your first message
@achoum No new updates/insights on this? 😔
If all your labels are 0, the framework detects that this is not a binary classification task and fails. While training on a dataset where all the labels have the same value could make sense for unit testing, this failure helps to catch errors in datasets.
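One way to surface this situation before training is to count the label values in the slice of data being used. The snippet below is only a sketch under the assumption that the labels can be materialized as a plain iterable of integers; `check_binary_labels` is a hypothetical helper, not something provided by the library.

```python
from collections import Counter
from typing import Iterable


def check_binary_labels(labels: Iterable[int]) -> Counter:
    """Count label values and fail early if they do not look binary.

    Hypothetical pre-training check for illustration only.
    """
    counts = Counter(int(value) for value in labels)

    # Anything other than 0 or 1 means this is not a 0/1 binary label.
    unexpected = set(counts) - {0, 1}
    if unexpected:
        raise ValueError(f"Unexpected label values: {sorted(unexpected)}")

    # A single distinct value (e.g. all zeros) is the case discussed above.
    if len(counts) < 2:
        raise ValueError(
            f"Labels contain only the value(s) {sorted(counts)}; "
            "both 0 and 1 are needed for binary classification."
        )
    return counts
```

Calling `check_binary_labels([0, 0, 0])` raises a `ValueError`, while a slice that contains both classes returns the per-class counts.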
I'm trying to use GradientBoostedTreesModel in a TFX pipeline, the code is roughly as follows:
This unfortunately gives me an "INVALID_ARGUMENT: No defined default loss for this combination of label type and task" exception and fails the model training.
Definition of _input_fn is as follows:
Which basically parses the schema into feature specs, parses the batch of TF-examples and finally maps them to a tuple of (Dict[feature_name, Tensor], Tensor). The result looks like this:
Labels can be 0 or 1 and the task is a binary classification task.
Any idea what I might be doing wrong here?
Mac OS Monterey, tfdv 0.2.4, python 3.8, tfx 1.7