
Add mrr calculation #886

Merged: 38 commits merged into catalyst-team:master on Oct 11, 2020

Conversation

@zkid18 (Contributor) commented Jul 15, 2020

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contribution guide?
  • Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle).
  • Did you make sure to update the docs? We use Google format for all the methods and classes.
  • Did you check the docs with make check-docs?
  • Did you write any new necessary tests?
  • Did you add your new functionality to the docs?
  • Did you update the CHANGELOG?
  • You can use 'Login as guest' to see Teamcity build logs.

Description

Related Issue

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@pep8speaks commented Jul 15, 2020

Hello @zkid18! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-10-11 10:49:27 UTC

@mergify (bot) commented Jul 15, 2020

This pull request is now in conflicts. @zkid18, could you fix it? 🙏

import torch


def mrr(outputs: torch.Tensor, targets: torch.Tensor):
@zkid18 (Contributor Author) replied Jul 16, 2020:
Thanks, I'll have a look at callbacks.

@zkid18 (Contributor Author) commented Jul 16, 2020

I'm still considering the proper design for MRR@K. Computing _metric@k is probably a common task, so rather than implementing this functionality inside every metric, we could extend _metric with, say, a topKMixin.
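A rough sketch of what such a shared @k helper could look like (illustrative only, not code from this PR; topk_truncate is a hypothetical name): any _metric@k could first truncate outputs and targets to the top-k predictions and then delegate to the plain metric.

import torch


def topk_truncate(outputs: torch.Tensor, targets: torch.Tensor, k: int) -> torch.Tensor:
    """Keep only the k highest-scored columns per row and reorder the
    targets to match the ranking induced by the outputs."""
    k = min(outputs.size(1), k)
    _, top_indices = torch.topk(outputs, k, dim=-1)
    return torch.gather(targets, dim=-1, index=top_indices)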

mrr (float): the mrr score
"""
outputs = outputs.clone()
targets = targets.clone()
Member: why do you need clone?

@zkid18 (Contributor Author): I tried to follow the 'shared clones' pattern, which is common in torch, but it seems it's not really necessary here.
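For illustration only (not code from this PR): clone is needed when a function mutates its arguments in place, while out-of-place operations such as sort and gather return new tensors and leave the caller's data untouched.

import torch

outputs = torch.tensor([[0.2, 0.9, 0.1]])
# sort() is out-of-place: it returns new tensors, so the caller's tensor
# is unchanged and a defensive clone() is not required.
_, indices = outputs.sort(descending=True, dim=-1)
assert torch.equal(outputs, torch.tensor([[0.2, 0.9, 0.1]]))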

@Scitator (Member) left a comment: btw, looks like we need to fix the codestyle

import torch


def mrr(outputs: torch.Tensor,

from catalyst.utils import metrics


class MRRCallback(MetricCallback):
@zkid18 (Contributor Author): As for tests, I think it's better to return to that question once we implement at least one Learning to Rank model.

@mergify (bot) dismissed Scitator’s stale review July 27, 2020 12:11

Pull request has been modified.

@mergify (bot) commented Jul 28, 2020

This pull request is now in conflicts. @zkid18, could you fix it? 🙏

output_key (str): output key to use for auc calculation;
specifies our ``y_pred``
prefix (str): name to display for mrr when printing
activation (str): An torch.nn activation applied to the outputs.
Member: why do we need activation? The mrr metric doesn't have activation support.

@Scitator (Member) commented Jul 30, 2020

@zkid18 could you please add a sanity check test for MRRCallback? like "init and compute on one sample"
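A minimal sanity check along those lines might look like the sketch below; it exercises the functional mrr on a single sample (the test values and the exact return shape are assumptions, and a full MRRCallback test would additionally need a runner):

import torch

from catalyst.utils import metrics


def test_mrr_one_sample():
    # The relevant item is ranked first, so the reciprocal rank is 1.0.
    outputs = torch.tensor([[0.9, 0.1, 0.0]])
    targets = torch.tensor([[1.0, 0.0, 0.0]])
    result = metrics.mrr(outputs, targets)
    assert torch.allclose(result, torch.tensor(1.0))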

@zkid18 requested a review from ditwoo as a code owner September 6, 2020 07:21
@bagxi previously requested changes Sep 10, 2020
from catalyst.utils.metrics.auc import auc
from catalyst.utils.metrics.cmc_score import cmc_score_count, cmc_score
from catalyst.utils.metrics.dice import dice, calculate_dice
from catalyst.utils.metrics.f1_score import f1_score
from catalyst.utils.metrics.mrr import mrr
Member: could you please follow the alphabetical order of imports?

num_epochs=2,
verbose=True,
callbacks=[MRRCallback(), SchedulerCallback(reduced_metric="loss")]
)
Member: the newline at the end of the .py file is missing.

import torch


def mrr(outputs: torch.Tensor, targets: torch.Tensor, k=100) -> torch.Tensor:
Member: could you please define the types for all args?
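For example, the annotated signature could read (a suggestion, not the final code of this PR):

def mrr(outputs: torch.Tensor, targets: torch.Tensor, k: int = 100) -> torch.Tensor:
    ...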

The mrr score for each user.

"""
k = min(outputs.size()[1], k)
Member suggested change:
- k = min(outputs.size()[1], k)
+ k = min(outputs.size(1), k)

def mrr(outputs: torch.Tensor, targets: torch.Tensor, k=100) -> torch.Tensor:

"""
Calculate the MRR score given model outputs and targets
Member: I'd appreciate it if you could extend the documentation with an explanation of the metric, or add a link where users can read more about this score.
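For reference, the definition such documentation could state (see, e.g., the Wikipedia article on Mean reciprocal rank): the reciprocal rank of a query is the inverse of the position of the first relevant item in the ranked predictions, and MRR averages it over all queries:

\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\operatorname{rank}_i}

where |Q| is the number of queries (users) and rank_i is the rank of the first relevant item for query i; with the @k truncation, queries whose first relevant item falls outside the top k typically contribute 0.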


"""
k = min(outputs.size()[1], k)
_, indices_for_sort = outputs.sort(descending=True, dim=-1)
Member: could we use torch.topk here?

@zkid18 (Contributor Author): I guess the comment is more relevant to the next part:
true_sorted_by_pred_shrink = true_sorted_by_preds[:, :k]

Anyway, I might be missing the advantage of torch.topk over the proposed approach. We need to sort the predictions by the corresponding indices of the outputs. Is there a way in PyTorch to sort in that fashion?
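For what it's worth, torch.topk can cover both steps at once: it returns the indices of the k highest outputs, and torch.gather then reorders the targets by exactly those indices. A small sketch with illustrative values (not code from this PR; the gather step in the sort-based variant is inferred from the variable names above):

import torch

outputs = torch.tensor([[0.1, 0.9, 0.4], [0.8, 0.2, 0.5]])
targets = torch.tensor([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
k = 2

# sort-based variant: sort all columns, gather targets, then slice to k
_, indices_for_sort = outputs.sort(descending=True, dim=-1)
true_sorted_by_preds = torch.gather(targets, dim=-1, index=indices_for_sort)
true_sorted_by_pred_shrink = true_sorted_by_preds[:, :k]

# topk-based variant: take only the k best indices, then gather
_, topk_indices = torch.topk(outputs, k, dim=-1)
true_topk = torch.gather(targets, dim=-1, index=topk_indices)

assert torch.equal(true_sorted_by_pred_shrink, true_topk)

The result is identical here; the main gain from torch.topk is that it avoids sorting columns beyond k.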

@mergify (bot) dismissed bagxi’s stale review September 22, 2020 14:05

Pull request has been modified.

@Scitator previously approved these changes Sep 29, 2020
CHANGELOG.md Outdated
@@ -39,7 +39,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
## [20.08] - 2020-08-09

### Added
- Full metric learning pipeline including training and validation stages ([#886](https://github.com/catalyst-team/catalyst/pull/876))
- MRR metrics calculation ([#886](https://github.com/catalyst-team/catalyst/pull/886))
Member: looks like we need to move it up :)

@Scitator (Member) commented Oct 4, 2020

@zkid18 could you please update your branch and resolve the above questions? thank you!

@mergify (bot) dismissed Scitator’s stale review October 8, 2020 04:37

Pull request has been modified.

@zkid18 (Contributor Author) commented Oct 8, 2020

@Scitator I have fixed the typos and moved the mrr metric under the metrics folder. Could you check whether I understood the idea correctly?

@zkid18 (Contributor Author) commented Oct 8, 2020

@Scitator I'm not sure whether this problem is related to the codestyle itself or rather to the actions pipeline.
https://github.com/catalyst-team/catalyst/runs/1224097493?check_suite_focus=true


MRR
~~~~~~~~~~~~~~~~~~~~~~
.. automodule:: catalyst.utils.metrics.mrr
Member: looks like there is an error with the docs here :)

@mergify (bot) dismissed Scitator’s stale review October 11, 2020 10:50

Pull request has been modified.

@Scitator merged commit dbc94a7 into catalyst-team:master on Oct 11, 2020