How to get stable results? #85
Hi there. There are several sources of stochastic behaviour in Ivis (see issue #31). Here's an example script that should provide reproducible results between Ivis runs.
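import os
# Note: PYTHONHASHSEED is read when the interpreter starts, so setting it here
# only affects child processes; export it in the shell to cover this process too.
os.environ["PYTHONHASHSEED"] = "0"
import random
import numpy as np
import tensorflow as tf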
# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(123)
# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
random.seed(123)
# The below set_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(1234)
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import NearestNeighbors
from ivis import Ivis
iris = load_iris()
data = iris.data
target = iris.target
X = MinMaxScaler().fit_transform(data)
# Here we're creating a fixed NN matrix. For large out-of-memory datasets, you can achieve the same
# with Ivis' Annoy functionality (https://bering-ivis.readthedocs.io/en/latest/api.html#neighbour-retrieval),
# i.e. build the index separately and then pass it into the Ivis constructor.
nbrs = NearestNeighbors(n_neighbors=5).fit(X)
distances, indices = nbrs.kneighbors(X)
model = Ivis(embedding_dims=2, k=5, batch_size=X.shape[0],
             neighbour_matrix=indices,
             n_epochs_without_progress=5, verbose=0)
model.fit(X)
embeddings = model.transform(X)
plt.scatter(embeddings[:, 0], embeddings[:, 1], c=target)
You should get this result:
Interesting. You should be getting identical results between runs, as long as all seeds are set before the Ivis module is imported. Are you using Ivis' built-in nearest neighbour search, or are you pre-building the nearest neighbour matrix? Other contributing factors may be how different versions of Python, TensorFlow, and NumPy handle RNG...
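One way to check "identical results between runs" is to launch the script in a fresh interpreter twice and compare what it prints. The sketch below is only an illustration under assumptions: `run_fresh` is a hypothetical helper name, and the one-line `snippet` stands in for the real script. It also passes `PYTHONHASHSEED` through the child's environment, so the value is in place before that interpreter starts.

```python
import os
import subprocess
import sys


def run_fresh(code: str, hash_seed: int = 0) -> str:
    """Run `code` in a new interpreter with PYTHONHASHSEED fixed; return its stdout."""
    # Copy the current environment and pin the hash seed for the child process.
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Stand-in for the real script: string hashing is one thing PYTHONHASHSEED controls.
snippet = "print(hash('ivis'))"

first = run_fresh(snippet)
second = run_fresh(snippet)
print("identical output:", first == second)
```

To apply this to a real run, the script could print a digest of the embeddings, and the two captured outputs can be compared in the same way.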
Since you mentioned that the results should be identical, I checked some library versions. I was using ivis
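When comparing library versions between two environments, it can help to print them all in one go. This is a minimal sketch using only the standard library. `report_versions` is a hypothetical helper, and the distribution names in its default argument are assumptions about how the packages were installed (for example, TensorFlow may be installed under a different distribution name).

```python
import platform
from importlib import metadata


def report_versions(packages=("ivis", "tensorflow", "numpy", "scikit-learn")):
    """Return a dict of installed versions; missing packages map to None."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed under this distribution name in the current environment.
            versions[name] = None
    return versions


for name, version in report_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside a report makes it easier to see whether two setups differ.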
Hello Folks,
thank you for all the work on this lib. I have a question about reproducibility: Is there a way to set a random seed or random state and get stable results?
I'm trying to achieve this with:
I'm aware that these are not thread-safe, so this may be the reason for the non-reproducible results. In any case, is there a way to enforce this?
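On the thread-safety point, here is a small standard-library illustration of the general idea. It does not control Ivis internals. A generator that is private to each task produces the same draws regardless of how threads are scheduled, whereas a shared module-level generator is consumed in whatever order the threads happen to run. The names `draw` and `run_once` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def draw(seed: int, n: int = 3) -> list:
    """Draw n floats from a generator private to this call."""
    rng = random.Random(seed)  # local state, not the shared module-level generator
    return [rng.random() for _ in range(n)]


def run_once() -> list:
    # Each task gets its own seed, so thread scheduling cannot change the draws.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(draw, range(4)))


print("identical across runs:", run_once() == run_once())
```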