
Improvements on diversity, novelty etc. metrics #1470

Merged: 24 commits from andreas/diversity into staging on Jul 16, 2021
Conversation

@anargyri (Collaborator) commented Jul 7, 2021

Description

Made improvements to the diversity and novelty metrics for Spark in the repo:

  • Moved diversity metrics into the same file as other Spark metrics
  • Updated READMEs
  • Added info in the notebook explaining which definitions are used, referring to the relevant citations
  • Added unit test for the notebook
  • Improved the code, e.g. replaced the cross join and changed the novelty definitions to conform with Castells et al. (see the sketch below)
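
As a rough sketch of the Castells-style computation adopted here (not the exact code in this PR; the schema, column names and the choice of denominator are assumptions), item novelty can be derived from the historical interactions alone with a groupBy rather than a cross join:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Assumed schema: one row per historical (userID, itemID) interaction.
train = spark.createDataFrame(
    [(1, "a"), (1, "b"), (2, "a"), (3, "a"), (3, "c")],
    ["userID", "itemID"],
)

# p(i) = interactions with item i / all interactions, estimated on the training
# data only; novelty(i) = -log2 p(i), so rarely seen items get high novelty.
total = train.count()
item_novelty = (
    train.groupBy("itemID")
    .count()
    .withColumn("item_novelty", -F.log2(F.col("count") / F.lit(total)))
    .drop("count")
)
item_novelty.show()

In this toy example item "a" appears in 3 of 5 interactions and therefore gets the lowest novelty, while "b" and "c" get the highest.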

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@review-notebook-app commented:

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@anargyri anargyri changed the title Andreas/diversity Improvements on diversity, novelty etc. metrics Jul 7, 2021
@anargyri anargyri requested a review from YanZhangADS July 7, 2021 12:28
Resolved review threads:

  • tests/unit/reco_utils/evaluation/test_spark_evaluation.py (outdated)
  • reco_utils/evaluation/spark_evaluation.py
  • reco_utils/evaluation/spark_evaluation.py (outdated)
  • reco_utils/evaluation/spark_evaluation.py (outdated)
@YanZhangADS (Collaborator) left a comment:


The metrics below are based on the paper: Castells et al., "Novelty and diversity metrics for recommender systems: choice, discovery and relevance", ECIR 2011.

historical-item-novelty: how infrequently an item has been interacted with in the training data (Eq. 1, Eq. 6)
novelty: built on top of historical-item-novelty; it is the weighted average of historical-item-novelty over all recommended items in the whole recommender system, not with respect to one particular user (Eq. 10)
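
For readers without the paper at hand, a rough paraphrase of these quantities (my notation; the exact normalization used in the repo may differ):

p(i) = \frac{|\{u : (u,i) \in \mathcal{D}_{\textrm{train}}\}|}{|\mathcal{D}_{\textrm{train}}|}

\textrm{novelty}(i) = -\log_2 p(i)

\textrm{novelty} = \frac{\sum_{i \in R} c(i)\, \textrm{novelty}(i)}{\sum_{i \in R} c(i)}

where R is the set of all recommended items and c(i) is the number of times item i is recommended.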

return self.df_diversity

# Novelty metrics
def item_novelty(self):
@YanZhangADS (Collaborator) commented:

The formula/definition for calculating item_novelty and user_novelty has changed. I don't fully understand how the source equations (1), (6), (7) operate. Can we have a discussion about it?

My original definition is based on equation (1) in Castells's paper and the "Novelty" section of the article https://eugeneyan.com/writing/serendipity-and-accuracy-in-recommender-systems/.

Quote "A better measure of novelty is to consider the population’s interactions (e.g., purchases, clicks) instead of recommendations. This reflects how likely the user has been exposed to the product based on the population historical engagement."
@anargyri @gramhagen

@anargyri (Collaborator, Author) commented Jul 15, 2021:

Now that I have removed user novelty, equation (7) is not used anywhere in the code.

The definition of item novelty you had came from the blog post above. I changed this definition to conform with the Castells et al. paper. I suggest we stick to the paper instead of a blog post for several reasons, including that papers in the literature follow the same approach, which is based on a probabilistic model. Moreover, the papers have been peer reviewed, whereas a blog post reflects just one person's ideas.

What happened is that the blogger changed the standard definition because he misunderstands these models in the literature. He proposes his own definition, but it is incorrect and doesn't correspond to any actual probability of interest; it is justified only on rough intuition. He wrongly states that Castells et al. compute the ratio for p(i) on the recommended items (they compute it only on the historical items). Other papers do compute this ratio on the recommended items. What the blogger proposes instead is a ratio where the numerator is computed on recommended items and the denominator on historical items.

This doesn't reflect novelty well, because what it really does is estimate the probability that the item will be recommended, given that the item has not been selected, interacted with etc. in the past. But it is clear that even if 99% of users have watched a movie in the past, this ratio can be anywhere between 0 and 1, depending on the recommendations the algorithm makes to the remaining 1% of users. If the algorithm does not recommend the movie to any of them, then the ratio is 0 and the movie has maximum novelty, according to E. Yan. But this clearly doesn't make sense, since the movie is very widely known across the general population of users.
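
To make the 99% example concrete (an illustrative calculation, not taken from the thread): under the Castells-style definition,

p(i) = 0.99 \;\Rightarrow\; \textrm{novelty}(i) = -\log_2 0.99 \approx 0.0145

i.e. near-minimal novelty regardless of what the recommender does, whereas the blog's ratio for the same item can be anywhere between 0 and 1 depending only on the recommendations to the remaining 1% of users, with 0 corresponding to maximum novelty.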

@miguelgfierro (Collaborator) commented:
@YanZhangADS @anargyri feel free to merge when you consider the work is finished. After this is merged, I'll work on #1390

@anargyri (Collaborator, Author) commented:
It looks like GitHub does not render the formulas properly on the branch (this is true for other notebooks too). They display fine in VSCode, though.

@miguelgfierro (Collaborator) commented:
@anargyri I'm able to render the formulas in ReviewNB; however, this one is not rendering:

\textrm{IL} = \frac{1}{|M|} \sum_{u \in M} \frac{1}{\binom{N_r(u)}{2}} \sum_{i,j \in N_r (u),\, i<j} \textrm{sim}(i, j)

At first sight, there is a } missing.

@anargyri (Collaborator, Author) commented Jul 16, 2021:

> @anargyri I'm able to render the formulas in ReviewNB; however, this one is not rendering:
>
> \textrm{IL} = \frac{1}{|M|} \sum_{u \in M} \frac{1}{\binom{N_r(u)}{2}} \sum_{i,j \in N_r (u),\, i<j} \textrm{sim}(i, j)
>
> At first sight, there is a } missing.

This looks like a ReviewNB bug. It works for me in VSCode and also in Jupyter.

@anargyri anargyri enabled auto-merge July 16, 2021 15:47
@anargyri anargyri merged commit 99e7f0e into staging Jul 16, 2021
@miguelgfierro miguelgfierro deleted the andreas/diversity branch July 16, 2021 16:17