
Coverage refactor, added CoverageMetricSet and CoverageScoreCard #42

Merged
merged 16 commits from coverage_refactor into main on Jan 6, 2022

Conversation


@kajocina kajocina commented Jan 5, 2022

Summary

Please provide a high-level summary of the changes and notes for the reviewers

  • Code passes all tests
  • Unit tests provided for these changes
  • Documentation and docstrings added for these changes

Changes

  • changed the signature of the Coverage metrics: they now require the relevant user and item spaces plus a list of (user, item) tuples as the final predictions
  • added the item coverage and user coverage metrics to CoverageMetricSet
  • created a ScoreCard for coverage metrics (they need a different signature than classification and regression)
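For context, a minimal sketch of what the new signature implies for the two coverage metrics. This is illustrative only; the actual rexmex implementations and type aliases may differ.

```python
from typing import List, Set, Tuple, Union

UserId = Union[int, str]
ItemId = Union[int, str]
# A length-2 tuple: the user space first, then the item space.
UserItemSpace = Tuple[List[UserId], List[ItemId]]


def item_coverage(possible_users_items: UserItemSpace,
                  recommendations: List[Tuple[UserId, ItemId]]) -> float:
    """Fraction of all possible items that appear in the recommendations."""
    all_items: Set[ItemId] = set(possible_users_items[1])
    recommended_items = {item for _, item in recommendations}
    return len(recommended_items & all_items) / len(all_items)


def user_coverage(possible_users_items: UserItemSpace,
                  recommendations: List[Tuple[UserId, ItemId]]) -> float:
    """Fraction of all possible users that received at least one recommendation."""
    all_users: Set[UserId] = set(possible_users_items[0])
    covered_users = {user for user, _ in recommendations}
    return len(covered_users & all_users) / len(all_users)
```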


@benedekrozemberczki benedekrozemberczki left a comment


Extremely well done @kajocina. Really nice work; with this merged we will release version 0.1.0. Currently there are merge conflicts with metricset.py. Please take a look!


cthoyt commented Jan 6, 2022

Would also be good to add an annotator for coverage functions, same as for classification functions. I could do this in a later PR :)

self.all_users = all_users
self.all_items = all_items

def get_coverage_metrics(self, recommendations: List[Tuple]) -> pd.DataFrame:

@cthoyt cthoyt Jan 6, 2022


This should override get_performance_metrics instead of making a new function

Hmm but since it has a different interface (like you said, sorry I missed that!) it wouldn't work

Contributor

Since this really changes the interface of a score card, perhaps there is a need for an alternate, higher-level abstraction for a score card that only has the generate_report() function

@kajocina (author)

I was thinking that we could do something like this - keep a parent ScoreCard class and define user-called score cards as children classes of ScoreCard (for ranking, rating, coverage etc.). Should be a slightly cleaner structure. I can do this next.
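The parent/children structure proposed here could be sketched roughly as follows. The class names other than ScoreCard and the exact method signatures are assumptions for illustration, not the actual rexmex API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

import pandas as pd


class ScoreCard(ABC):
    """Parent class: only promises a report; children fix their own input signatures."""

    @abstractmethod
    def generate_report(self, *args, **kwargs) -> pd.DataFrame:
        ...


class CoverageScoreCard(ScoreCard):
    """Coverage metrics need the user/item spaces up front, then only the recommendations."""

    def __init__(self, metric_set: Dict[str, Callable], all_users: List, all_items: List):
        self.metric_set = metric_set
        self.all_users = all_users
        self.all_items = all_items

    def generate_report(self, recommendations: List[Tuple]) -> pd.DataFrame:
        space = (self.all_users, self.all_items)
        # One column per metric, one row with the computed value.
        return pd.DataFrame({name: [metric(space, recommendations)]
                             for name, metric in self.metric_set.items()})
```

A classification or ranking score card would subclass ScoreCard the same way but take (y_true, y_score)-style inputs instead.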

Contributor

I had also considered that the score card class could be merged with the metric set classes.


def item_coverage(relevant_items: List, recommendations: List[List]) -> float:

def user_coverage(possible_users_items: List[List], recommendations: List[Tuple]) -> float:
@cthoyt cthoyt Jan 6, 2022


Can you give a more specific type annotation for possible_users_items? Is this a List[List[int]]?

Same for the recommendations

@kajocina (author)

I switched it to a tuple since it always has to be of length 2, and added explicit types now. Users can use integers or strings, depending on what they're using in their system.
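A fixed-length pair annotated as a Tuple is more precise than List[List], and it unpacks naturally. A small illustration (the alias and helper are hypothetical, not part of rexmex):

```python
from typing import List, Tuple, Union

UserId = Union[int, str]
ItemId = Union[int, str]
# A length-2 tuple: the user space first, then the item space.
UserItemSpace = Tuple[List[UserId], List[ItemId]]


def describe_space(possible_users_items: UserItemSpace) -> str:
    users, items = possible_users_items  # unpacking a pair, not indexing a list
    return f"{len(users)} users x {len(items)} items"
```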

setup.py Outdated
@@ -1,6 +1,6 @@
from setuptools import find_packages, setup

- install_requires = ["numpy", "sklearn", "pandas", "scipy", "scikit-learn"]
+ install_requires = ["numpy", "sklearn", "pandas==1.3.5", "scipy", "scikit-learn"]
Contributor

I'd suggest <=1.3.5, since an exact pin is probably a bit too restrictive at the moment.

I hope the new pandas release issue gets cleared up ASAP.

@kajocina (author)

Thanks, changed to 1.3.4 now :)

@codecov-commenter

Codecov Report

Merging #42 (ee64141) into main (8fe5de1) will decrease coverage by 0.41%.
The diff coverage is 93.05%.


@@            Coverage Diff             @@
##             main      #42      +/-   ##
==========================================
- Coverage   99.77%   99.36%   -0.42%     
==========================================
  Files          16       16              
  Lines         886      942      +56     
==========================================
+ Hits          884      936      +52     
- Misses          2        6       +4     
Impacted Files Coverage Δ
rexmex/metrics/coverage.py 81.81% <81.81%> (-5.69%) ⬇️
rexmex/scorecard.py 97.82% <94.44%> (-2.18%) ⬇️
rexmex/metricset.py 98.36% <100.00%> (+0.11%) ⬆️
tests/integration/test_aggregation.py 100.00% <100.00%> (ø)
tests/unit/test_metrics.py 100.00% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@benedekrozemberczki benedekrozemberczki merged commit 6ec4993 into main Jan 6, 2022
@kajocina kajocina deleted the coverage_refactor branch January 23, 2022 14:25
4 participants