[MRG] DOC Fix error in documentation of trustworthiness #9800

wdevazelhes · 2017-09-19T14:50:27Z

Reference Issue

What does this implement/fix? Explain your changes.

It fixes an error in the docstring of the function manifold.t_sne.trustworthiness: the rank r(i, j) in the formula should be the rank in the original input space, not the rank in the embedded space.

lesteve · 2017-09-19T21:30:15Z

@tomMoral any opinion on this since you have worked on t-SNE recently?

jnothman

I think the key to get across is that any unexpected nearest neighbors in the embedded space are penalised in proportion to their rank in the original space.

jnothman · 2017-09-20T04:44:50Z

sklearn/manifold/t_sne.py

@@ -387,10 +387,10 @@ def trustworthiness(X, X_embedded, n_neighbors=5, precomputed=False):
        T(k) = 1 - \frac{2}{nk (2n - 3k - 1)} \sum^n_{i=1}
            \sum_{j \in U^{(k)}_i} (r(i, j) - k)

-    where :math:`r(i, j)` is the rank of the embedded datapoint j


I think there's some confusion in "according to", and r is not described as a function of i. Can you write it in your own words? Use multiple sentences. Perhaps describe U before you describe r. You can also say "for each sample i" to make it simpler.

I agree that U should be moved before r, and the "for each" should help define things.
Perhaps you can use "input space" instead of "original space" and insist on the fact that the data sample j is the r(i, j)-th nearest neighbor of the data sample i.
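To make the reading discussed above concrete, here is a minimal pure-Python sketch of the formula: for each sample i, every sample j that is among the k nearest neighbors of i in the embedded space but not in the input space contributes r(i, j) - k, where r(i, j) is the rank of j in the input space. This is an illustration only, not scikit-learn's implementation. The names `ranks_from` and `trustworthiness_sketch` are hypothetical, Euclidean distance is assumed, ties are broken by sample index, and the normalizer assumes 2n - 3k - 1 > 0.

```python
from math import dist  # available in Python 3.8+


def ranks_from(points, i):
    """Map each j != i to its 1-based neighbor rank from sample i (1 = nearest).

    Hypothetical helper: ties are broken by sample index (stable sort).
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return {j: rank for rank, j in enumerate(others, start=1)}


def trustworthiness_sketch(X, X_embedded, k=5):
    """Evaluate T(k) as written in the docstring formula (assumes 2n - 3k - 1 > 0)."""
    n = len(X)
    penalty = 0
    for i in range(n):
        rank_in = ranks_from(X, i)            # r(i, j): rank in the input space
        rank_emb = ranks_from(X_embedded, i)  # rank in the embedded space
        for j, r_emb in rank_emb.items():
            # j belongs to U_i^(k): an embedded neighbor that is not an input neighbor
            if r_emb <= k and rank_in[j] > k:
                penalty += rank_in[j] - k
    return 1.0 - 2.0 / (n * k * (2 * n - 3 * k - 1)) * penalty


# Two well-separated clusters of three points each.
X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 5.0), (6.0, 5.0), (7.0, 5.0)]
# An identical "embedding" has no unexpected neighbors, so this prints 1.0.
print(trustworthiness_sketch(X, X, k=2))
```

An embedding that interleaves the two clusters pulls far-away input points into each other's neighborhoods, so its score drops below 1.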

wdevazelhes · 2017-09-20T10:17:04Z

Thanks for the comments. I tried to address them in the new commit. I also added @jnothman's summary of the function to the docstring; tell me if it looks good to you.

tomMoral

LGTM. It is much clearer with this phrasing.

tomMoral · 2017-09-20T10:49:30Z

sklearn/manifold/t_sne.py

-    is the set of points that are in the k nearest neighbors in the embedded
-    space but not in the original space.
+    where for each sample i, :math:`U^{(k)}_i` are all samples j that are in
+    the k-nearest neighbor of i in the embedded space but are the :math:`r(i,


Maybe using "output space" would be clearer? Input/output seems clearer when talking about mappings.

Definitely, I will do so.

tomMoral · 2017-09-20T10:50:20Z

sklearn/manifold/t_sne.py

+    the k-nearest neighbor of i in the embedded space but are the :math:`r(i,
+    j)`-th nearest neighbor of i in the input space with r(i, j) > k. In other
+    words, any unexpected nearest neighbors in the embedded space are penalised
+    in proportion to their rank in the input space.


I really like this summary! 👍

I agree 👍

jnothman · 2017-09-24T11:30:36Z

sklearn/manifold/t_sne.py

-    :math:`U^{(k)}_i` is the set of points that are in the k nearest
-    neighbors in the embedded space but not in the original space.
+    where for each sample i, :math:`U^{(k)}_i` are all samples j that are in
+    the k-nearest neighbor of i in the output space but are the :math:`r(i,


I still find this difficult. Can we just get rid of U from above and just have max(0, r(i, j) - k) or max(k, r(i, j)) - k?
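The suggestion above can be checked with a small sketch: clamping the penalty at zero and summing over all k embedded neighbors gives the same total as summing r(i, j) - k over the set U only, and the two clamped spellings agree. The function names and the rank values below are hypothetical, chosen only for illustration.

```python
def penalty_over_U(rank_in, embedded_neighbors, k):
    """Sum r(i, j) - k over U: embedded neighbors whose input rank exceeds k."""
    return sum(rank_in[j] - k for j in embedded_neighbors if rank_in[j] > k)


def penalty_clamped(rank_in, embedded_neighbors, k):
    """Sum max(0, r(i, j) - k) over all k embedded neighbors; no set U needed."""
    return sum(max(0, rank_in[j] - k) for j in embedded_neighbors)


# Hypothetical input-space ranks r(i, j) for one sample i, keyed by j.
rank_in = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
embedded_neighbors = [1, 4, 5]  # the k = 3 nearest neighbors of i after embedding
k = 3

# Both forms give (4 - 3) + (5 - 3) = 3; neighbor 1 contributes nothing.
assert penalty_over_U(rank_in, embedded_neighbors, k) == 3
assert penalty_clamped(rank_in, embedded_neighbors, k) == 3

# The two clamped spellings agree for every rank.
assert all(max(0, r - k) == max(k, r) - k for r in range(1, 10))
```

With the clamp in place, the inner sum only needs "the k nearest neighbors of i in the embedded space", which is the single concept the review asks for.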

jnothman

Sorry to be annoying. I'm now thinking that the problem with U was that it expressed two things: being in i's embedded neighborhood and being outside of i's original neighborhood. It's now a bit weird that we don't have a function for "the k-neighborhood of i in embedded space".

Otherwise, I think this is a vast improvement. Thank you.

jnothman · 2017-09-25T10:20:06Z

sklearn/manifold/t_sne.py

-    according to the pairwise distances between the embedded datapoints,
-    :math:`U^{(k)}_i` is the set of points that are in the k nearest
-    neighbors in the embedded space but not in the original space.
+    where for each sample i, j is among its k nearest neighbors in the output


output -> embedded?

I made this change according to @tomMoral's comment:

Maybe using "output space" would be clearer? Input/output seems clearer when talking about mappings.

Both ways seem clear to me: "embedded" is more precise, but maybe "output" is clear enough in this case, and maybe simpler?

I'm not too fussed either way

wdevazelhes · 2017-09-25T11:57:28Z

No problem, I agree. I will fix that.

jnothman · 2017-09-25T23:37:46Z

wonderful

Fix error in documentation of trustworthiness

ffb0fad

jnothman reviewed Sep 20, 2017

Make documentation simpler

38de739

replace original space by input space

912f226

tomMoral reviewed Sep 20, 2017

change embedding space into output space

6192b6d

wdevazelhes changed the title ~~DOC Fix error in documentation of trustworthiness~~ [MRG] DOC Fix error in documentation of trustworthiness Sep 20, 2017

jnothman reviewed Sep 24, 2017

Changing docstring for clarity

692a7ed

jnothman reviewed Sep 25, 2017

Reformulate for better understanding.

d493d3b

jnothman merged commit 8de1844 into scikit-learn:master Sep 25, 2017

maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

DOC Fix error in documentation of trustworthiness (scikit-learn#9800)

4b4b9b8

wdevazelhes deleted the i/9799 branch November 27, 2017 15:15

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

DOC Fix error in documentation of trustworthiness (scikit-learn#9800)

b0eca64
