
Improvements on diversity, novelty etc. metrics #1470

Merged: 24 commits from andreas/diversity into staging on Jul 16, 2021
Conversation

@anargyri (Collaborator) commented Jul 7, 2021

Description

Made improvements to the diversity and novelty metrics for Spark in the repo:

  • Moved diversity metrics into the same file as other Spark metrics
  • Updated READMEs
  • Added info in the notebook explaining which definitions are used, referring to the relevant citations
  • Added unit test for the notebook
  • Improved the code, e.g. replaced the cross join and changed the novelty definitions to conform with Castells et al. (see the sketch below)
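
As a rough sketch of the Castells-style computation adopted here (not the exact code in this PR; the schema, column names and the choice of denominator are assumptions), item novelty can be derived from the historical interactions alone with a groupBy rather than a cross join:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Assumed schema: one row per historical (userID, itemID) interaction.
train = spark.createDataFrame(
    [(1, "a"), (1, "b"), (2, "a"), (3, "a"), (3, "c")],
    ["userID", "itemID"],
)

# p(i) = interactions with item i / all interactions, estimated on the training
# data only; novelty(i) = -log2 p(i), so rarely seen items get high novelty.
total = train.count()
item_novelty = (
    train.groupBy("itemID")
    .count()
    .withColumn("item_novelty", -F.log2(F.col("count") / F.lit(total)))
    .drop("count")
)
item_novelty.show()

In this toy example item "a" appears in 3 of 5 interactions and therefore gets the lowest novelty, while "b" and "c" get the highest.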

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@review-notebook-app commented:

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@anargyri anargyri changed the title Andreas/diversity Improvements on diversity, novelty etc. metrics Jul 7, 2021
@anargyri anargyri requested a review from YanZhangADS July 7, 2021 12:28
Resolved review threads:

  • tests/unit/reco_utils/evaluation/test_spark_evaluation.py (outdated)
  • reco_utils/evaluation/spark_evaluation.py
  • reco_utils/evaluation/spark_evaluation.py (outdated)
  • reco_utils/evaluation/spark_evaluation.py (outdated)
@YanZhangADS (Collaborator) left a comment:


The metrics below are based on the paper: Castells et al., "Novelty and diversity metrics for recommender systems: choice, discovery and relevance", ECIR 2011.

historical-item-novelty: how infrequently an item has been interacted with in the training data (Eq. 1, Eq. 6)
novelty: built on top of historical-item-novelty; it is the weighted average of historical-item-novelty over all recommended items in the whole recommender system, not with respect to one particular user (Eq. 10)
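
For readers without the paper at hand, a rough paraphrase of these quantities (my notation; the exact normalization used in the repo may differ):

p(i) = \frac{|\{u : (u,i) \in \mathcal{D}_{\textrm{train}}\}|}{|\mathcal{D}_{\textrm{train}}|}

\textrm{novelty}(i) = -\log_2 p(i)

\textrm{novelty} = \frac{\sum_{i \in R} c(i)\, \textrm{novelty}(i)}{\sum_{i \in R} c(i)}

where R is the set of all recommended items and c(i) is the number of times item i is recommended.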

return self.df_diversity

# Novelty metrics
def item_novelty(self):
@YanZhangADS (Collaborator) commented:

The formula/definition for calculating item_novelty and user_novelty has changed. I don't fully understand how the source equations (1), (6), (7) operate. Can we have a discussion about it?

My original definition is based on equation (1) in Castells's paper and the "Novelty" section of the article https://eugeneyan.com/writing/serendipity-and-accuracy-in-recommender-systems/.

Quote "A better measure of novelty is to consider the population’s interactions (e.g., purchases, clicks) instead of recommendations. This reflects how likely the user has been exposed to the product based on the population historical engagement."
@anargyri @gramhagen

@anargyri (Collaborator, Author) commented Jul 15, 2021:

Now that I have removed user novelty, equation (7) is not used anywhere in the code.

The definition of item novelty you had came from the blog post above. I changed this definition to conform with the Castells et al. paper. I suggest we stick to the paper instead of a blog post for several reasons, including that papers in the literature follow the same approach, which is based on a probabilistic model. Moreover, the papers have been peer reviewed, whereas a blog post reflects just one person's ideas.

What happened is that the blogger changed the standard definition because he misunderstands these models in the literature. He proposes his own definition, but it is incorrect and doesn't correspond to any actual probability of interest; it is justified only on rough intuition. He wrongly states that Castells et al. compute the ratio for p(i) on the recommended items (they compute it only on the historical items). Other papers do compute this ratio on the recommended items. What the blogger proposes instead is a ratio where the numerator is computed on recommended items and the denominator on historical items.

This doesn't reflect novelty well, because what it really does is estimate the probability that the item will be recommended, given that the item has not been selected, interacted with etc. in the past. But it is clear that even if 99% of users have watched a movie in the past, this ratio can be anywhere between 0 and 1, depending on the recommendations the algorithm makes to the remaining 1% of users. If the algorithm does not recommend the movie to any of them, then the ratio is 0 and the movie has maximum novelty, according to E. Yan. But this clearly doesn't make sense, since the movie is very widely known across the general population of users.
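
To make the 99% example concrete (an illustrative calculation, not taken from the thread): under the Castells-style definition,

p(i) = 0.99 \;\Rightarrow\; \textrm{novelty}(i) = -\log_2 0.99 \approx 0.0145

i.e. near-minimal novelty regardless of what the recommender does, whereas the blog's ratio for the same item can be anywhere between 0 and 1 depending only on the recommendations to the remaining 1% of users, with 0 corresponding to maximum novelty.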

@miguelgfierro (Collaborator) commented:
@YanZhangADS @anargyri feel free to merge when you consider the work is finished. After this is merged, I'll work on #1390

@anargyri (Collaborator, Author) commented:
It looks like GitHub does not render the formulas properly on the branch (this is true for other notebooks too). They display fine in VSCode, though.

@miguelgfierro (Collaborator) commented:
@anargyri I'm able to render the formulas in ReviewNB; however, this one is not rendering:

\textrm{IL} = \frac{1}{|M|} \sum_{u \in M} \frac{1}{\binom{N_r(u)}{2}} \sum_{i,j \in N_r (u),\, i<j} \textrm{sim}(i, j)

At first sight, there is a } missing.

@anargyri (Collaborator, Author) commented Jul 16, 2021:

> @anargyri I'm able to render the formulas in ReviewNB; however, this one is not rendering:
>
> \textrm{IL} = \frac{1}{|M|} \sum_{u \in M} \frac{1}{\binom{N_r(u)}{2}} \sum_{i,j \in N_r (u),\, i<j} \textrm{sim}(i, j)
>
> At first sight, there is a } missing.

This looks like a ReviewNB bug. It works for me in VSCode and also in Jupyter.

@anargyri anargyri enabled auto-merge July 16, 2021 15:47
@anargyri anargyri merged commit 99e7f0e into staging Jul 16, 2021
@miguelgfierro miguelgfierro deleted the andreas/diversity branch July 16, 2021 16:17