Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndcg refactor #1481

Merged
merged 8 commits into from Mar 14, 2024
Merged

ndcg refactor #1481

merged 8 commits into from Mar 14, 2024

Conversation

FelipeAdachi
Copy link
Contributor

@FelipeAdachi FelipeAdachi commented Mar 13, 2024

Description

This PR:

  • Adds ndcg value expecations to the tests, not just counts
  • Changes convert_non_numeric=True to be used for string columns (integers, floats in scores and bools should use the default convert_non_numeric=False )
  • adds score_column support: when a score is available, like in this example, one should use score_column and target_column , and not prediction_column (see 2 last tests on the sklearn examples)
  • Changes the logic to generate target column values to make it compatible with all scenarios
  • Fixes prediction and ideal relevance calculation for non numeric case
  • Handles DivisionbyZero edge case when idcg=0 (ndcg set to 1 if no relevant documents exist)
  • If K is not passed, metrics will be calculated according to predictions cols's length (and metrics named accordingly - k is no longer omitted in the names when k is None)

For the Numeric case:

  • If predictions+target or scores+target columns are both provided, they need to be of same length.

  • If prediction_column is provided with target column, prediction col contains the rank of suggested items, starting with 1

  • If only prediction column is provided, it is assumed that the order encodes the ranks of recommendations: first item in the list is the first recommendation. The value in the list encodes the relevance score

  • I have reviewed the Guidelines for Contributing and the Code of Conduct.

@FelipeAdachi FelipeAdachi changed the title Update predicted and ideal relevances ndcg refactor Mar 13, 2024
Copy link
Contributor

@jamie256 jamie256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few questions and minor comments. Thanks for working on these!

python/whylogs/experimental/api/logger/__init__.py Outdated Show resolved Hide resolved
python/tests/experimental/api/test_logger.py Outdated Show resolved Hide resolved
python/tests/experimental/api/test_logger.py Show resolved Hide resolved
Copy link
Contributor

@jamie256 jamie256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better, thanks!

@jamie256 jamie256 merged commit 164985c into mainline Mar 14, 2024
19 checks passed
@jamie256 jamie256 deleted the dev/felipe/ndcg-fixes branch March 14, 2024 23:01
@jamie256 jamie256 linked an issue Mar 14, 2024 that may be closed by this pull request
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

log_batch_ranking_metrics error
3 participants