
Bug: running optimize() multiple times produces different result compared to running it once #228

Closed
dkobak opened this issue Jan 27, 2023 · 3 comments · Fixed by #229
Labels
bug Something isn't working

Comments


dkobak commented Jan 27, 2023

Running optimize(n_iter=1) 100 times should give the same result as running optimize(n_iter=100) once, but it doesn't. Here is a reproducible example:

import numpy as np
import matplotlib.pyplot as plt

from openTSNE.affinity import PerplexityBasedNN
from openTSNE import TSNEEmbedding
from openTSNE.initialization import random as random_init

n = 100
np.random.seed(42)
X = np.random.randn(n, 2)

A = PerplexityBasedNN(X)
I = random_init(n, random_state=42)

E1 = TSNEEmbedding(I, A, random_state=42)
E2 = TSNEEmbedding(I, A, random_state=42)

# One call with 100 iterations
E1.optimize(n_iter=100, inplace=True)

# 100 calls with one iteration each
for i in range(100):
    E2.optimize(n_iter=1, inplace=True)

plt.figure(figsize=(4, 4), layout='constrained')
plt.scatter(E1[:, 0], E1[:, 1])
plt.scatter(E2[:, 0], E2[:, 1])

[figure tmp-tsne-bug: overlaid scatter plots of E1 and E2, showing that the two embeddings differ]

Maybe it has something to do with how the gains are saved in between the optimize() calls?


dkobak commented Jan 27, 2023

Hmm, it does not look like it's due to the gains: if I use 2 iterations instead of 100 in the snippet above, E1 and E2 are still not identical, yet E1.optimizer.gains and E2.optimizer.gains are the same.


dkobak commented Jan 30, 2023

So I figured out what is going on here: the update vectors are not saved between optimize() calls, so the momentum term has no effect if one uses optimize(n_iter=1). It's debatable which behavior is better, but I would say that if we keep the gains between optimize() calls, then we should also keep the update vectors.

I prepared a quick PR that implements this, but unfortunately it makes some tests fail, and I could not fix that yet.
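The effect described above can be sketched outside of openTSNE with plain momentum gradient descent. This is a hedged illustration, not openTSNE's actual optimizer: the `momentum_steps` helper, the test function, and the learning-rate/momentum values are all made up for the demo. It shows that chunked calls only reproduce a single long call if the update (momentum) vector is carried across calls.

```python
import numpy as np

def momentum_steps(grad, x, n_iter, lr=0.1, momentum=0.8, update=None):
    """Gradient descent with momentum; `update` carries the momentum state."""
    if update is None:
        update = np.zeros_like(x)  # a fresh call starts with zero momentum
    for _ in range(n_iter):
        update = momentum * update - lr * grad(x)
        x = x + update
    return x, update

grad = lambda x: 2 * x  # gradient of f(x) = x**2

# One call with 10 iterations.
x_once, _ = momentum_steps(grad, np.array([1.0]), 10)

# 10 calls of one iteration each, discarding the update vector between
# calls: every call restarts with zero momentum, so the trajectory differs.
x_reset = np.array([1.0])
for _ in range(10):
    x_reset, _ = momentum_steps(grad, x_reset, 1)

# 10 calls of one iteration each, carrying the update vector across calls:
# this reproduces the single 10-iteration call exactly.
x_kept, u = np.array([1.0]), None
for _ in range(10):
    x_kept, u = momentum_steps(grad, x_kept, 1, update=u)

assert np.allclose(x_once, x_kept)       # state carried -> identical
assert not np.allclose(x_once, x_reset)  # state dropped -> different
```

With the state dropped, each one-iteration call degenerates to plain gradient descent, which is exactly why optimize(n_iter=1) in a loop diverges from a single optimize(n_iter=100) call.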

pavlin-policar (Owner) commented

Thanks for tracking this down. This is definitely a bug. Calling optimize once with n_iter=100 should definitely be the same as calling optimize 100 times with n_iter=1.

@pavlin-policar pavlin-policar added the bug Something isn't working label Feb 2, 2023