[ASK] MIND test dataset doesn't work for run_eval #2105

Closed
ubergonmx opened this issue May 28, 2024 · 3 comments
Labels: help wanted (Need help from developers)

Comments

@ubergonmx (Contributor) commented May 28, 2024

Description

The following code:

label = [0 for i in impr.split()]

It essentially marks every news ID in the impression list as non-clicked.
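
For context, a quick sketch of what that means for the MIND impressions format (the `NewsID-label` suffix convention below is the standard MIND format, not code from the library):

```python
# Labelled (train/valid) impressions look like "N712-0 N231-1", while the
# test-set behaviors file ships them without the "-0"/"-1" suffix.
impr = "N712 N231"                 # test-set style entry, no labels

label = [0 for i in impr.split()]  # the line quoted above
print(label)                       # [0, 0] -> every candidate treated as non-clicked
```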

Instead of modifying the code, I modified the test behaviors file and added -0 to each news ID in the impression list (e.g., N712-0 N231-0).
Now I get the following error after running run_eval:

model.run_eval(test_news_file, test_behaviors_file)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <timed exec>:1

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/newsrec/models/base_model.py:335, in BaseModel.run_eval(self, news_filename, behaviors_file)
    331 else:
    332     _, group_labels, group_preds = self.run_slow_eval(
    333         news_filename, behaviors_file
    334     )
--> 335 res = cal_metric(group_labels, group_preds, self.hparams.metrics)
    336 return res

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/deeprec/deeprec_utils.py:594, in cal_metric(labels, preds, metrics)
    591         res["hit@{0}".format(k)] = round(hit_temp, 4)
    592 elif metric == "group_auc":
    593     group_auc = np.mean(
--> 594         [
    595             roc_auc_score(each_labels, each_preds)
    596             for each_labels, each_preds in zip(labels, preds)
    597         ]
    598     )
    599     res["group_auc"] = round(group_auc, 4)
    600 else:

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/deeprec/deeprec_utils.py:595, in <listcomp>(.0)
    591         res["hit@{0}".format(k)] = round(hit_temp, 4)
    592 elif metric == "group_auc":
    593     group_auc = np.mean(
    594         [
--> 595             roc_auc_score(each_labels, each_preds)
    596             for each_labels, each_preds in zip(labels, preds)
    597         ]
    598     )
    599     res["group_auc"] = round(group_auc, 4)
    600 else:

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_ranking.py:567, in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
    565     labels = np.unique(y_true)
    566     y_true = label_binarize(y_true, classes=labels)[:, 0]
--> 567     return _average_binary_score(
    568         partial(_binary_roc_auc_score, max_fpr=max_fpr),
    569         y_true,
    570         y_score,
    571         average,
    572         sample_weight=sample_weight,
    573     )
    574 else:  # multilabel-indicator
    575     return _average_binary_score(
    576         partial(_binary_roc_auc_score, max_fpr=max_fpr),
    577         y_true,
   (...)
    580         sample_weight=sample_weight,
    581     )

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_base.py:75, in _average_binary_score(binary_metric, y_true, y_score, average, sample_weight)
     72     raise ValueError("{0} format is not supported".format(y_type))
     74 if y_type == "binary":
---> 75     return binary_metric(y_true, y_score, sample_weight=sample_weight)
     77 check_consistent_length(y_true, y_score, sample_weight)
     78 y_true = check_array(y_true)

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_ranking.py:337, in _binary_roc_auc_score(y_true, y_score, sample_weight, max_fpr)
    335 """Binary roc auc score."""
    336 if len(np.unique(y_true)) != 2:
--> 337     raise ValueError(
    338         "Only one class present in y_true. ROC AUC score "
    339         "is not defined in that case."
    340     )
    342 fpr, tpr, _ = roc_curve(y_true, y_score, sample_weight=sample_weight)
    343 if max_fpr is None or max_fpr == 1:

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Is there any fix or workaround for this? How do I get the scores?
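
One thing I could try, based only on the `run_slow_eval` call visible in the traceback above, is to skip `cal_metric` and keep the raw prediction scores (a sketch; the ranking/output format below is an assumption mirroring the MIND submission format):

```python
import numpy as np

# run_eval fails inside cal_metric, but the traceback shows it first calls
# run_slow_eval(news_filename, behaviors_file) and unpacks three values.
group_impr_indexes, group_labels, group_preds = model.run_slow_eval(
    test_news_file, test_behaviors_file
)

# Write one line per impression: the impression index and the 1-based rank of
# each candidate, in the same order as the behaviors file.
with open("prediction.txt", "w") as f:
    for impr_index, preds in zip(group_impr_indexes, group_preds):
        order = np.argsort(preds)[::-1]   # candidate indices, best score first
        rank = np.argsort(order) + 1      # rank of each candidate in original order
        f.write(f"{impr_index} [{','.join(map(str, rank.tolist()))}]\n")
```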

Other Comments

Originally posted by @ubergonmx in #1673 (comment)

I am trying to train the NAML model with the valid + test set.

ubergonmx added the help wanted label on May 28, 2024
@miguelgfierro (Collaborator) commented:

It seems this is an error with AUC because there is just one class, i.e. all your labels belong to a single class.

I would look into the data and make sure you have both positive and negative classes.
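
For example, something along these lines could flag impressions that contain only one class (a sketch assuming the standard MIND `behaviors.tsv` layout, where the last tab-separated column holds space-separated `NewsID-label` entries, and that `test_behaviors_file` is the same path passed to `run_eval`):

```python
from collections import Counter

single_class = 0
with open(test_behaviors_file) as f:
    for line in f:
        impressions = line.rstrip("\n").split("\t")[-1]
        labels = Counter(entry.split("-")[-1] for entry in impressions.split())
        if len(labels) < 2:          # only "0"s or only "1"s in this impression
            single_class += 1

print(f"Impressions with a single class: {single_class}")
```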

@ubergonmx (Contributor, Author) commented:

> It seems this is an error with AUC because there is just one class, i.e. all your labels belong to a single class.
>
> I would look into the data and make sure you have both positive and negative classes.

Thank you. I think this was raised as an issue before, but it was closed since there is no labeled test set.

@miguelgfierro (Collaborator) commented:

Sounds good
