New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG + 1] enable metric = 'cosine' for tsne computation #9623
Changes from 5 commits
0c2dca6
a94a995
9882b5b
02fce30
5c1de71
db96562
eade3c3
185fd08
97aa6e4
8689db3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,6 +30,8 @@ | |
from scipy.spatial.distance import pdist | ||
from scipy.spatial.distance import squareform | ||
from sklearn.metrics.pairwise import pairwise_distances | ||
from sklearn.metrics.pairwise import manhattan_distances | ||
from sklearn.metrics.pairwise import cosine_distances | ||
|
||
|
||
x = np.linspace(0, 1, 10) | ||
|
@@ -764,3 +766,21 @@ def test_bh_match_exact(): | |
assert n_iter['exact'] == n_iter['barnes_hut'] | ||
assert_array_almost_equal(X_embeddeds['exact'], X_embeddeds['barnes_hut'], | ||
decimal=3) | ||
|
||
|
||
def test_tsne_with_different_distance_metrics(): | ||
"""Make sure that TSNE works for different distance metrics""" | ||
random_state = check_random_state(0) | ||
n_components_original = 3 | ||
n_components_embedding = 2 | ||
X = random_state.randn(50, n_components_original).astype(np.float32) | ||
metrics = ['manhattan', 'cosine'] | ||
dist_funcs = [manhattan_distances, cosine_distances] | ||
for metric, dist_func in zip(metrics, dist_funcs): | ||
tsne_1 = TSNE(metric=metric, n_components=n_components_embedding, | ||
random_state=0).fit_transform(X) | ||
tsne_2 = TSNE(metric='precomputed', | ||
n_components=n_components_embedding, | ||
random_state=0).fit_transform(dist_func(X)) | ||
t = trustworthiness(tsne_1, tsne_2, n_neighbors=1) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm confused about what you're testing here. I think we can just test that the outputs are identical if the second is precomputed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The trustworthiness is designed to test the validity of the results, between the input and the output space, using the euclidean distance. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
assert_greater(t, 0.9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of variable lazy naming, even less when they are confusing like this (tsne_1 makes you think this is TSNE estimator while it is a transformed X ...). You could do