Adds mapk implementation #50
Conversation
```python
            # precision over the two sets of predictions
            self.assertEqual(mean_abs_recall_k, ((1/9) + (2/9)) / 2)

    def test_apk(self):
```
It would be great if someone could check the numbers in these tests to make sure the math is actually correct :)
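As one hedged sanity check: the asserted value is consistent with two users who each have 9 relevant items, with 1 and 2 of those items recovered in the top-k recommendations. The user data below is a hypothetical reconstruction from the expected value, not the test's actual fixtures:

```python
# Hypothetical reconstruction of the asserted value: recall @ k per user
# is (relevant items recovered in the top-k) / (total relevant items).
user_1_recall = 1 / 9  # 1 of 9 relevant items appears in the top k
user_2_recall = 2 / 9  # 2 of 9 relevant items appear in the top k

mean_abs_recall_k = (user_1_recall + user_2_recall) / 2
assert abs(mean_abs_recall_k - 1 / 6) < 1e-12  # ((1/9) + (2/9)) / 2 == 1/6
```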
Some additional questions / comments:
```
(recmetrics-py3.9) (base) matt@Matt-Kayes-MBP recmetrics % poetry install
Installing dependencies from lock file
Unable to read the lock file (Cannot declare ('package', 'dependencies') twice (at line 1115, column 22)).
```
Commits in this PR:
- test coverage
- when k is less the the number of predictions, use k
- Revert "when k is less the the number of predictions, use k" (reverts commit e24e800)
- cleanup
- adds note on k vs len(predicted)
- linting
- match Stanford slides
- edge cases
- test fixes
```python
    return score / true_positives


def mark(actual: List[list], predicted: List[list], k=10) -> float:
```
drive by: this should return a float, not an int
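To make the point concrete, here is a minimal sketch of a mean-over-users wrapper, assuming `mark` just averages the per-user `_ark` scores. The `np.mean` call and the simplified `_ark` stub are assumptions, not the PR's exact code:

```python
from typing import List

import numpy as np

def _ark(actual: list, predicted: list, k: int = 10) -> float:
    """Per-user average recall @ k (simplified stub; see the full diff)."""
    return len(set(actual) & set(predicted[:k])) / len(actual) if actual else 0.0

def mark(actual: List[list], predicted: List[list], k: int = 10) -> float:
    """Mean average recall @ k across users.

    The mean of per-user scores is fractional in general, which is why
    the annotation should be float rather than int.
    """
    return float(np.mean([_ark(a, p, k) for a, p in zip(actual, predicted)]))
```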
```diff
@@ -152,11 +214,29 @@ def mark(actual: List[list], predicted: List[list], k=10) -> int:
         example: [['X', 'Y', 'Z'], ['X', 'Y', 'Z']]
     Returns:
     -------
-        mark: int
+        mark: float
```
drive by
```diff
@@ -80,7 +80,7 @@ def test_catalog_coverage(self):

     def test_mark(self):
         """
-        Test mean absolute recall @ k (MAPK) function
+        Test mean absolute recall @ k (MARK) function
```
drive by
```diff
@@ -107,7 +107,7 @@ def catalog_coverage(predicted: List[list], catalog: list, k: int) -> float:
     catalog_coverage = round(L_predictions/(len(catalog)*1.0)*100,2)
     return catalog_coverage

-def _ark(actual: list, predicted: list, k=10) -> int:
+def _ark(actual: list, predicted: list, k=10) -> float:
```
drive by
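For context on the annotation change, a hedged sketch of the kind of body behind `_ark` — plain set-overlap recall @ k here, though the PR may use a position-weighted variant:

```python
def _ark(actual: list, predicted: list, k: int = 10) -> float:
    """Average recall @ k for one user: the fraction of the user's
    relevant items that appear in the top-k recommendations.

    Returns a float in [0, 1], hence int -> float in the annotation.
    """
    if not actual:
        return 0.0
    top_k = predicted[:k]  # only the first k recommendations count
    hits = sum(1 for item in set(actual) if item in top_k)
    return hits / len(actual)
```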
recmetrics/metrics.py (outdated diff):

```diff
@@ -139,7 +139,69 @@ def _ark(actual: list, predicted: list, k=10) -> int:

     return score / len(actual)

-def mark(actual: List[list], predicted: List[list], k=10) -> int:
+def _pk(actual: list, predicted: list, k) -> float:
```
Question: Let me know if this implementation is too different from the `_ark` / `mark` one -- I figured it made sense to just factor this out. But in broader terms, maybe it makes sense to use the `recommender_precision` method inside of `_apk`?
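For readers following the thread, precision @ k on a single list is small enough to sketch; something along these lines is presumably what `_pk` computed before it was removed (a hedged reconstruction, not the PR's exact code):

```python
def _pk(actual: list, predicted: list, k: int) -> float:
    """Precision @ k for one user: the share of the top-k recommended
    items that are relevant (hedged reconstruction)."""
    top_k = predicted[:k]
    if not top_k:
        return 0.0
    return sum(1 for item in top_k if item in actual) / len(top_k)
```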
Update: I removed `_pk` in favor of just using the pre-existing precision calc. Hopefully that refactor is okay!
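A hedged sketch of what `_apk` can look like once it leans on a shared precision helper: precision is evaluated at each rank where a relevant item is hit, then averaged over the number of hits, matching the `return score / true_positives` line visible in the diff (helper bodies here are assumptions):

```python
def _precision(predicted: list, actual: list) -> float:
    """Raw, unrounded precision of a recommendation list (assumed body)."""
    return len(set(predicted) & set(actual)) / len(predicted) if predicted else 0.0

def _apk(actual: list, predicted: list, k: int = 10) -> float:
    """Average precision @ k for one user (hedged sketch)."""
    top_k = predicted[:k]
    score = 0.0
    true_positives = 0
    for i, item in enumerate(top_k):
        if item in actual and item not in top_k[:i]:  # count each hit once
            true_positives += 1
            score += _precision(top_k[: i + 1], actual)  # precision at this cut
    if true_positives == 0:
        return 0.0
    return score / true_positives
```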
This is great! Thanks for this improvement!
```diff
@@ -310,6 +388,10 @@ def make_confusion_matrix(y: list, yhat: list) -> None:
     plt.yticks([0,1], [1,0])
     plt.show()

+def _precision(predicted, actual):
```
Let me know if this refactor is okay! I pulled `_precision` out of `recommender_precision` so that I could reuse it in the `_apk` calc. I also moved the rounding back inside `recommender_precision`, both so there are no breaking changes there and so that the raw precision scores aren't rounded. If you'd prefer I didn't do this, I can re-implement the precision calc inside of `_apk`.
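A hedged sketch of the refactor as described: the raw helper stays unrounded so `_apk` can reuse it, while the rounding stays inside the public `recommender_precision` wrapper so its existing behavior doesn't break (signatures and the rounding precision are assumptions):

```python
import numpy as np

def _precision(predicted: list, actual: list) -> float:
    """Raw, unrounded precision for one user's recommendation list."""
    if not predicted:
        return 0.0
    return len(set(predicted) & set(actual)) / len(predicted)

def recommender_precision(predicted, actual) -> float:
    """Public wrapper: mean per-user precision, rounded here (and only
    here) so the pre-existing API behavior is unchanged."""
    per_user = [_precision(p, a) for p, a in zip(predicted, actual)]
    return round(float(np.mean(per_user)), 2)  # rounding digits: assumption
```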
Closes #49
I do have some questions about the implementation, though. I'll leave them as comments.