
Adds mapk implementation #50

Merged: 7 commits into statisticianinstilettos:master, Feb 21, 2023

Conversation

mrkaye97 (Contributor):

Closes #49

I do have some questions about the implementation, though; I'll leave them as comments.

# mean recall over the two sets of predictions
self.assertEqual(mean_abs_recall_k, ((1/9) + (2/9)) / 2)

def test_apk(self):
mrkaye97 (Contributor Author):
It would be great if someone could check the numbers in these tests to make sure the math is actually correct :)
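
For anyone checking the arithmetic: with recall at k defined as |actual ∩ predicted[:k]| / |actual|, the expected value in the assertion above works out as follows. The fixtures below are hypothetical stand-ins, not the actual test data.

# hypothetical fixtures: two users, each with 9 relevant items; the top-k
# recommendations hit 1 of them for the first user and 2 for the second
actual = [[f"a{i}" for i in range(9)], [f"b{i}" for i in range(9)]]
predicted = [["a0", "x1", "x2"], ["b0", "b1", "y1"]]
k = 10

recalls = []
for act, pred in zip(actual, predicted):
    hits = len(set(act) & set(pred[:k]))   # relevant items retrieved in the top k
    recalls.append(hits / len(act))        # recall@k per user: 1/9, then 2/9

mean_abs_recall_k = sum(recalls) / len(recalls)
assert abs(mean_abs_recall_k - ((1/9) + (2/9)) / 2) < 1e-12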

@mrkaye97 (Contributor Author):

Some additional questions / comments:

  1. Is there a reason why there's no CI process running the unit tests?
  2. I was having trouble restoring the dependencies from the lockfile with Poetry. To get poetry install working, I just manually nuked the pyproject file and the lockfile and reinitialized the project from scratch. It'd be great if someone could try to repro that issue, though. Here's the error:
(recmetrics-py3.9) (base) matt@Matt-Kayes-MBP recmetrics % poetry install
Installing dependencies from lock file

Unable to read the lock file (Cannot declare ('package', 'dependencies') twice (at line 1115, column 22)).

Commits:
  - test coverage
  - when k is less the the number of predictions, use k
  - Revert "when k is less the the number of predictions, use k" (reverts commit e24e800)
  - cleanup
  - adds note on k vs len(predicted)
  - linting
  - match Stanford slides
  - edge cases
  - test fixes
return score / true_positives


def mark(actual: List[list], predicted: List[list], k=10) -> float:
mrkaye97 (Contributor Author):
Drive-by: this should return a float, not an int.
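
(A quick sketch of why float is the right annotation: mark, like the new mapk, is a mean of per-user scores. The helper body below is an illustrative stand-in, not the code in this PR.)

from typing import List

def _ark(actual: list, predicted: list, k: int = 10) -> float:
    # illustrative per-user recall@k: relevant items found in the top k, over all relevant items
    return len(set(actual) & set(predicted[:k])) / len(actual) if actual else 0.0

def mark(actual: List[list], predicted: List[list], k: int = 10) -> float:
    # mean absolute recall @ k: the average of per-user recall@k scores, hence a float
    scores = [_ark(a, p, k) for a, p in zip(actual, predicted)]
    return sum(scores) / len(scores)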

@@ -152,11 +214,29 @@ def mark(actual: List[list], predicted: List[list], k=10) -> int:
     example: [['X', 'Y', 'Z'], ['X', 'Y', 'Z']]
     Returns:
     -------
-    mark: int
+    mark: float
mrkaye97 (Contributor Author):
Drive-by.

@@ -80,7 +80,7 @@ def test_catalog_coverage(self):

 def test_mark(self):
     """
-    Test mean absolute recall @ k (MAPK) function
+    Test mean absolute recall @ k (MARK) function
mrkaye97 (Contributor Author):
Drive-by.

@@ -107,7 +107,7 @@ def catalog_coverage(predicted: List[list], catalog: list, k: int) -> float:
     catalog_coverage = round(L_predictions/(len(catalog)*1.0)*100,2)
     return catalog_coverage

-def _ark(actual: list, predicted: list, k=10) -> int:
+def _ark(actual: list, predicted: list, k=10) -> float:
mrkaye97 (Contributor Author):
Drive-by.

@@ -139,7 +139,69 @@ def _ark(actual: list, predicted: list, k=10) -> int:

     return score / len(actual)

-def mark(actual: List[list], predicted: List[list], k=10) -> int:
+def _pk(actual: list, predicted: list, k) -> float:
mrkaye97 (Contributor Author):
Question: Let me know if this implementation is too different from the _ark / mark one -- I figured it made sense to factor this out. But in broader terms, maybe it makes sense to use the recommender_precision method inside of _apk?
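
(For reference, what these helpers compute. The AP@k normalizer shown below follows the score / true_positives return visible in the diff excerpt above; other references divide by min(|actual|, k) instead.)

\[
P@i = \frac{\lvert \text{actual} \cap \text{predicted}_{1:i} \rvert}{i},
\qquad
AP@k = \frac{1}{r} \sum_{i=1}^{k} \mathrm{rel}(i)\, P@i
\]

where rel(i) = 1 if the i-th recommendation is relevant and 0 otherwise, and r is the number of relevant items retrieved in the top k.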

mrkaye97 (Contributor Author):
Update: I removed _pk in favor of just using the pre-existing precision calc. Hopefully that refactor is okay!

Reviewer:

This is great! Thanks for this improvement!

@@ -310,6 +388,10 @@ def make_confusion_matrix(y: list, yhat: list) -> None:
     plt.yticks([0,1], [1,0])
     plt.show()

+def _precision(predicted, actual):
mrkaye97 (Contributor Author):
Let me know if this refactor is okay! I pulled _precision out of recommender_precision so that I could reuse it in the _apk calc. I also moved the rounding back inside recommender_precision so as not to introduce breaking changes there, and so that the raw precision scores aren't rounded.

If you'd prefer I didn't do this, I can re-implement the precision calc inside _apk.
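
(A rough sketch of the shape of that refactor, under the assumptions above: _precision returns the raw per-list precision, recommender_precision keeps a rounded, per-user-averaged result, and _apk reuses _precision for the running precision at each hit. This is illustrative, not the exact code merged in this PR, and the rounding digits are a guess.)

from typing import List

def _precision(predicted: list, actual: list) -> float:
    # raw, unrounded precision: share of predicted items that are relevant
    if not predicted:
        return 0.0
    hits = [item for item in predicted if item in actual]
    return len(hits) / len(predicted)

def recommender_precision(predicted: List[list], actual: List[list]) -> float:
    # public wrapper: mean per-user precision, still rounded so existing callers
    # see the same kind of value as before
    scores = [_precision(p, a) for p, a in zip(predicted, actual)]
    return round(sum(scores) / len(scores), 4)

def _apk(actual: list, predicted: list, k: int = 10) -> float:
    # average precision @ k, reusing _precision for the running precision at each
    # rank where a new relevant item is retrieved; normalized by the number of hits,
    # matching the score / true_positives return shown earlier in this thread
    if len(predicted) > k:
        predicted = predicted[:k]
    score = 0.0
    true_positives = 0
    for i, item in enumerate(predicted):
        if item in actual and item not in predicted[:i]:
            true_positives += 1
            score += _precision(predicted[: i + 1], actual)
    if true_positives == 0:
        return 0.0
    return score / true_positives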

@ytang07 merged commit f9a4ca5 into statisticianinstilettos:master on Feb 21, 2023

Successfully merging this pull request may close these issues.

Implement MAP@k
3 participants