Precision/recall scores not applicable to dropped samples #137

Status: Open
bfhealy opened this issue Oct 25, 2022 · 0 comments
Labels: bug (Something isn't working), invalid (This doesn't seem right)

bfhealy (Collaborator) commented Oct 25, 2022

The Dataset class in scope/utils.py allows for samples to be dropped to balance positive and negative examples for a class. In the training process, the code attempts to evaluate an f1 score for the dropped samples (full code below):
https://github.com/ZwickyTransientFacility/scope/blob/ac22f211a188eb1736e35eb9c6c552a902735fb5/scope.py#L665

if datasets["dropped_samples"] is not None:
    # log model performance on the dropped samples
    if verbose:
        print("Evaluating on samples dropped from the training set:")
    stats = classifier.evaluate(datasets["dropped_samples"], verbose=verbose)
    if verbose:
        print(stats)

    if not kwargs.get("test", False):
        for param, value in zip(param_names, stats):
            wandb.run.summary[f"dropped_samples_{param}"] = value
        p, r = (
            wandb.run.summary["dropped_samples_precision"],
            wandb.run.summary["dropped_samples_recall"],
        )
        wandb.run.summary["dropped_samples_f1"] = 2 * p * r / (p + r)

Since the dropped samples only contain labels from the underrepresented class, the precision p and recall r will sometimes both be zero, leading to a divide-by-zero error and a 'failed' run in wandb. A quick solution is to remove the f1 score from the metrics computed for the dropped samples; a sketch of an alternative guard is shown below.
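For illustration, here is a minimal sketch of one possible guard (not a merged fix, and it assumes the same wandb.run.summary keys as the snippet above): only log the f1 score when p + r is nonzero, otherwise skip it, which matches the suggestion to drop the f1 metric for the dropped samples.

    # Hypothetical guard around the f1 computation for dropped samples.
    # Assumes dropped_samples_precision/recall have already been logged as above.
    p = wandb.run.summary["dropped_samples_precision"]
    r = wandb.run.summary["dropped_samples_recall"]

    if p + r > 0:
        # f1 is only well defined when at least one of precision/recall is nonzero
        wandb.run.summary["dropped_samples_f1"] = 2 * p * r / (p + r)
    # else: skip logging an f1 score for the dropped samples entirely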

bfhealy added the bug and invalid labels on Oct 25, 2022