
Coverage refactor, added CoverageMetricSet and CoverageScoreCard #42

Merged
merged 16 commits from coverage_refactor into main on Jan 6, 2022

Conversation


@kajocina kajocina commented Jan 5, 2022

Summary

Please provide a high-level summary of the changes and notes for the reviewers

  • Code passes all tests
  • Unit tests provided for these changes
  • Documentation and docstrings added for these changes

Changes

  • changed the signature of the Coverage metrics: they now require the relevant user and item spaces plus a list of (user, item) tuples as the final predictions
  • added the item coverage and user coverage metrics to CoverageMetricSet
  • created a ScoreCard for coverage metrics (they need a different signature than classification and regression)
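For context, a minimal sketch of what the new signature implies for the two coverage metrics. This is illustrative only; the actual rexmex implementations and type aliases may differ.

```python
from typing import List, Set, Tuple, Union

UserId = Union[int, str]
ItemId = Union[int, str]
# A length-2 tuple: the user space first, then the item space.
UserItemSpace = Tuple[List[UserId], List[ItemId]]


def item_coverage(possible_users_items: UserItemSpace,
                  recommendations: List[Tuple[UserId, ItemId]]) -> float:
    """Fraction of all possible items that appear in the recommendations."""
    all_items: Set[ItemId] = set(possible_users_items[1])
    recommended_items = {item for _, item in recommendations}
    return len(recommended_items & all_items) / len(all_items)


def user_coverage(possible_users_items: UserItemSpace,
                  recommendations: List[Tuple[UserId, ItemId]]) -> float:
    """Fraction of all possible users that received at least one recommendation."""
    all_users: Set[UserId] = set(possible_users_items[0])
    covered_users = {user for user, _ in recommendations}
    return len(covered_users & all_users) / len(all_users)
```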


@benedekrozemberczki benedekrozemberczki left a comment


Extremely well done @kajocina. Really nice work; with this merged we will release version 0.1.0. Currently there are merge conflicts with metricset.py. Please take a look!


cthoyt commented Jan 6, 2022

Would also be good to add an annotator for coverage functions, same as for classification functions. I could do this in a later PR :)

self.all_users = all_users
self.all_items = all_items

def get_coverage_metrics(self, recommendations: List[Tuple]) -> pd.DataFrame:

@cthoyt cthoyt Jan 6, 2022


This should override get_performance_metrics instead of making a new function

Hmm but since it has a different interface (like you said, sorry I missed that!) it wouldn't work

Contributor

Since this really changes the interface of a score card, perhaps there is a need for an alternate, higher-level abstraction for a score card that only has the generate_report() function

@kajocina (author)

I was thinking that we could do something like this - keep a parent ScoreCard class and define user-called score cards as children classes of ScoreCard (for ranking, rating, coverage etc.). Should be a slightly cleaner structure. I can do this next.
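The parent/children structure proposed here could be sketched roughly as follows. The class names other than ScoreCard and the exact method signatures are assumptions for illustration, not the actual rexmex API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

import pandas as pd


class ScoreCard(ABC):
    """Parent class: only promises a report; children fix their own input signatures."""

    @abstractmethod
    def generate_report(self, *args, **kwargs) -> pd.DataFrame:
        ...


class CoverageScoreCard(ScoreCard):
    """Coverage metrics need the user/item spaces up front, then only the recommendations."""

    def __init__(self, metric_set: Dict[str, Callable], all_users: List, all_items: List):
        self.metric_set = metric_set
        self.all_users = all_users
        self.all_items = all_items

    def generate_report(self, recommendations: List[Tuple]) -> pd.DataFrame:
        space = (self.all_users, self.all_items)
        # One column per metric, one row with the computed value.
        return pd.DataFrame({name: [metric(space, recommendations)]
                             for name, metric in self.metric_set.items()})
```

A classification or ranking score card would subclass ScoreCard the same way but take (y_true, y_score)-style inputs instead.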

Contributor

I had also considered that the score card class could be merged with the metric set classes.


def item_coverage(relevant_items: List, recommendations: List[List]) -> float:

def user_coverage(possible_users_items: List[List], recommendations: List[Tuple]) -> float:
@cthoyt cthoyt Jan 6, 2022


Can you give a more specific type annotation for possible_users_items? Is this a List[List[int]]?

Same for the recommendations

@kajocina (author)

I switched it to a tuple since it always has to be of length 2, and added explicit types now. Users can use integers or strings, depending on what they're using in their system.
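A fixed-length pair annotated as a Tuple is more precise than List[List], and it unpacks naturally. A small illustration (the alias and helper are hypothetical, not part of rexmex):

```python
from typing import List, Tuple, Union

UserId = Union[int, str]
ItemId = Union[int, str]
# A length-2 tuple: the user space first, then the item space.
UserItemSpace = Tuple[List[UserId], List[ItemId]]


def describe_space(possible_users_items: UserItemSpace) -> str:
    users, items = possible_users_items  # unpacking a pair, not indexing a list
    return f"{len(users)} users x {len(items)} items"
```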

setup.py Outdated
@@ -1,6 +1,6 @@
from setuptools import find_packages, setup

- install_requires = ["numpy", "sklearn", "pandas", "scipy", "scikit-learn"]
+ install_requires = ["numpy", "sklearn", "pandas==1.3.5", "scipy", "scikit-learn"]
Contributor

I'd suggest <=1.3.5, since an exact pin is probably a bit too restrictive at the moment.

I hope the new pandas release issue gets cleared up ASAP.

@kajocina (author)

Thanks, changed to 1.3.4 now :)

@codecov-commenter

Codecov Report

Merging #42 (ee64141) into main (8fe5de1) will decrease coverage by 0.41%.
The diff coverage is 93.05%.


@@            Coverage Diff             @@
##             main      #42      +/-   ##
==========================================
- Coverage   99.77%   99.36%   -0.42%     
==========================================
  Files          16       16              
  Lines         886      942      +56     
==========================================
+ Hits          884      936      +52     
- Misses          2        6       +4     
Impacted Files Coverage Δ
rexmex/metrics/coverage.py 81.81% <81.81%> (-5.69%) ⬇️
rexmex/scorecard.py 97.82% <94.44%> (-2.18%) ⬇️
rexmex/metricset.py 98.36% <100.00%> (+0.11%) ⬆️
tests/integration/test_aggregation.py 100.00% <100.00%> (ø)
tests/unit/test_metrics.py 100.00% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@benedekrozemberczki benedekrozemberczki merged commit 6ec4993 into main Jan 6, 2022
@kajocina kajocina deleted the coverage_refactor branch January 23, 2022 14:25
4 participants