Bug with index.build(ntrees) #48
Comments
Hi - could you check a couple of things:
Can you import annoy from the environment where ivis was installed?
import ivis
ivis.__version__ |
Annoy imports without issues (and pip freeze shows that annoy is at 1.16.0) |
Ok, cool. I'm still not able to reproduce the error, and our build tests are passing on both Mac and Ubuntu with Python 3.7. Could you provide a bit more info:
Copying in @Szubie to see if we can get to the bottom of this. |
It looks like Annoy is running into issues when building the KNN index. The error is similar to the one mentioned in this open issue on the Annoy repo: spotify/annoy#404. It might be worth installing an older version of Annoy (such as 1.15.2) to see if that resolves the issue. |
@idroz @Szubie ok so I downgraded Annoy to 1.15.2 and it went past the first part without erroring, BUT then it dies here. And to answer your question @idroz, I am using the Iris example verbatim from the README.
Building KNN index
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 150/150 [00:00<00:00, 165173.43it/s]
Extracting KNN from index
0%| | 0/150 [00:00<?, ?it/s]
WARNING:tensorflow:From /opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Training neural network
Epoch 1/1000
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-8-ddab2956ed6f> in <module>
----> 1 embeddings = model.fit_transform(X_scaled)
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in fit_transform(self, X, Y, shuffle_mode)
289 """
290
--> 291 self.fit(X, Y, shuffle_mode)
292 return self.transform(X)
293
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in fit(self, X, Y, shuffle_mode)
269 """
270
--> 271 self._fit(X, Y, shuffle_mode)
272 return self
273
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in _fit(self, X, Y, shuffle_mode)
251 shuffle=shuffle_mode,
252 workers=multiprocessing.cpu_count(),
--> 253 verbose=self.verbose)
254 self.loss_history_ += hist.history['loss']
255
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1295 shuffle=shuffle,
1296 initial_epoch=initial_epoch,
-> 1297 steps_name='steps_per_epoch')
1298
1299 def evaluate_generator(self,
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py in model_iteration(model, data, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch, mode, batch_size, steps_name, **kwargs)
219 step = 0
220 while step < target_steps:
--> 221 batch_data = _get_next_batch(generator)
222 if batch_data is None:
223 if is_dataset:
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py in _get_next_batch(generator)
361 """Retrieves the next batch of input data."""
362 try:
--> 363 generator_output = next(generator)
364 except (StopIteration, errors.OutOfRangeError):
365 return None
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get(self)
783 except Exception: # pylint: disable=broad-except
784 self.stop()
--> 785 six.reraise(*sys.exc_info())
786
787
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
691 if value.__traceback__ is not tb:
692 raise value.with_traceback(tb)
--> 693 raise value
694 finally:
695 value = None
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get(self)
777 try:
778 while self.is_running():
--> 779 inputs = self.queue.get(block=True).get()
780 self.queue.task_done()
781 if inputs is not None:
/opt/conda/envs/ivisumap/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
668 return self._value
669 else:
--> 670 raise self._value
671
672 def _set(self, i, obj):
/opt/conda/envs/ivisumap/lib/python3.6/multiprocessing/pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
117 job, i, func, args, kwds = task
118 try:
--> 119 result = (True, func(*args, **kwds))
120 except Exception as e:
121 if wrap_exception and func is not _helper_reraises_exception:
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get_index(uid, i)
569 The value at index `i`.
570 """
--> 571 return _SHARED_SEQUENCES[uid][i]
572
573
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/data/triplet_generators.py in __getitem__(self, idx)
127 placeholder_labels = self.placeholder_labels[:len(batch_indices)]
128 triplet_batch = [self.knn_triplet_from_neighbour_list(row_index, self.neighbour_matrix[row_index])
--> 129 for row_index in batch_indices]
130
131 if (issparse(self.X)):
/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/data/triplet_generators.py in <listcomp>(.0)
127 placeholder_labels = self.placeholder_labels[:len(batch_indices)]
128 triplet_batch = [self.knn_triplet_from_neighbour_list(row_index, self.neighbour_matrix[row_index])
--> 129 for row_index in batch_indices]
130
131 if (issparse(self.X)):
IndexError: index 0 is out of bounds for axis 0 with size 0 |
It still looks like Annoy is not working properly and returns a 0-length array when queried for KNNs. Could you test Annoy in isolation? What's the output of the following script:
from annoy import AnnoyIndex
import random
f = 40
t = AnnoyIndex(f, 'angular')
for i in range(1000):
v = [random.gauss(0, 1) for z in range(f)]
t.add_item(i, v)
t.build(10)
t.save('test.ann')
u = AnnoyIndex(f, 'angular')
u.load('test.ann')
print(u.get_nns_by_item(0, 1000)) |
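A note on the failure mode discussed above: the IndexError in the traceback is raised only once training starts, well after the neighbour lookup has silently come back empty. A minimal sketch of an earlier check follows. It uses plain Python lists and a hypothetical helper name (find_short_neighbour_rows); it is not part of ivis or Annoy, and the sample data is made up for illustration.

```python
def find_short_neighbour_rows(neighbour_matrix, k):
    """Return the indices of rows that hold fewer than k neighbours."""
    return [i for i, row in enumerate(neighbour_matrix) if len(row) < k]


# Made-up data: row 1 mimics an index query that returned nothing.
neighbours = [[0, 2, 1], [], [2, 0, 1]]

bad_rows = find_short_neighbour_rows(neighbours, k=3)
if bad_rows:
    # Reporting here is easier to diagnose than an IndexError raised
    # later inside a multiprocessing worker.
    print(f"{len(bad_rows)} row(s) have fewer than 3 neighbours: {bad_rows}")
```

Running a check like this right after the "Extracting KNN from index" step would point at the KNN index rather than at the training loop.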
It actually prints the output of print(u.get_nns_by_item(0, 1000)):
[0, 655, 105, 412, 24, 604, 927, 147, 697, 131, 314, 554, 756, 102, 38, 184, 239, 87, 755, 932, 973, 505, 208, 597, 212, 103, 155, 950, 881, 133, 654, 52, 742, 134, 601, 419, 323, 386, 687, 582, 75, 179, 542, 191, 299, 66, 593, 494, 606, 433, 348, 631, 904, 279, 736, 235, 903, 449, 693, 234, 730, ...... |
Hi - unfortunately we can't reproduce the issue. We ran a Debian VM with conda environments on a Unix machine with Python 3.6 and 3.7 - ivis works as expected. This does sound similar to an earlier issue #25, which is Windows related: parallel neighbour retrieval fails there due to differences in multiprocessing approaches between Linux and Windows. Are you by any chance using the Windows Subsystem for Linux (WSL)? |
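When reporting an environment for an issue like this one, a small script can answer the WSL question directly. The sketch below uses only the standard library. The check for "microsoft" in the kernel release string is a common heuristic for WSL, not an official API, and looks_like_wsl and describe_environment are hypothetical helper names.

```python
import multiprocessing
import platform


def looks_like_wsl(release=None):
    """Heuristic: WSL kernels usually report 'microsoft' in their release string."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def describe_environment():
    """Collect the details that matter for multiprocessing-related bugs."""
    return {
        "system": platform.system(),
        "release": platform.uname().release,
        "wsl": looks_like_wsl(),
        # The start method used for worker processes on this platform.
        "start_method": multiprocessing.get_start_method(),
    }


print(describe_environment())
```

Pasting the printed dictionary into a bug report makes it clear whether the code runs on native Linux, native Windows, or WSL.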
Thanks for this. This makes me suspect that the issue is caused by the Annoy index being built directly on disk. I've pushed a new branch which allows the user to specify whether or not to build the Annoy index directly on disk when constructing the Ivis object here: https://github.com/beringresearch/ivis/tree/on-disk-build-config Hopefully disabling the on_disk_build by passing in |
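To compare the two build modes outside ivis, here is a minimal sketch that uses Annoy directly. It assumes the annoy package is installed. make_vectors, build_index and compare_builds are hypothetical helper names, not part of ivis or Annoy. on_disk_build is Annoy's own method and has to be called before any items are added.

```python
import os
import random
import tempfile


def make_vectors(n, f, seed=0):
    """Generate n random Gaussian vectors of dimension f."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(f)] for _ in range(n)]


def build_index(vectors, n_trees=10, on_disk_path=None):
    """Build an Annoy index in memory, or on disk if a path is given."""
    from annoy import AnnoyIndex  # imported lazily; assumes annoy is installed

    index = AnnoyIndex(len(vectors[0]), "angular")
    if on_disk_path is not None:
        # Must happen before add_item; no save() is needed afterwards.
        index.on_disk_build(on_disk_path)
    for i, v in enumerate(vectors):
        index.add_item(i, v)
    index.build(n_trees)
    return index


def compare_builds(n=150, f=4, k=10):
    """Return how many neighbours each build mode yields for item 0."""
    vectors = make_vectors(n, f)
    in_memory = build_index(vectors)
    memory_hits = len(in_memory.get_nns_by_item(0, k))
    with tempfile.TemporaryDirectory() as tmp:
        on_disk = build_index(vectors, on_disk_path=os.path.join(tmp, "test.ann"))
        disk_hits = len(on_disk.get_nns_by_item(0, k))
        on_disk.unload()  # release the file before the directory is removed
    return memory_hits, disk_hits
```

Calling compare_builds() on a working setup should give the same count for both modes. If the on-disk count comes back as 0 while the in-memory one does not, that would support the suspicion that the on-disk build is the culprit.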
You're indeed correct, @idroz: I am using Debian through the Windows Subsystem for Linux |
Interesting use-case. I think that a branch submitted by @Szubie should address that issue. Let us know if it works and we’ll merge to master. |
Thank you so much!! I can confirm that creating the Ivis model with |
Thanks for confirming. We have merged the fixes addressing this issue into the master branch here: |
I too got the error "IndexError: index 0 is out of bounds for axis 0 with size 0" while attempting to run Ivis on Windows. I added the fixes in f5707df, but the error remained. |
Thanks, @pwolff . In which module did you add |
Hello,
I'm trying to run the ivis examples (both the simple iris one and the mnist one), and I keep getting this error whenever the model fitting is called (running this on Debian). Any thoughts?