Bug with index.build(ntrees) #48

Closed
sadatnfs opened this issue Oct 8, 2019 · 15 comments

@sadatnfs

sadatnfs commented Oct 8, 2019

Hello,

I'm trying to run the ivis examples (both the simple iris one and the MNIST one), and I keep getting this error whenever model fitting is called (running this on Debian). Any thoughts?

In [7]: embeddings = ivis.fit_transform(mnist.data)

Error truncating file: Invalid argument
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-7-d5f1692c2b85> in <module>
----> 1 embeddings = ivis.fit_transform(mnist.data)

/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in fit_transform(self, X, Y, shuffle_mode)
    289         """
    290
--> 291         self.fit(X, Y, shuffle_mode)
    292         return self.transform(X)
    293

/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in fit(self, X, Y, shuffle_mode)
    269         """
    270
--> 271         self._fit(X, Y, shuffle_mode)
    272         return self
    273

/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in _fit(self, X, Y, shuffle_mode)
    146                 print('Building KNN index')
    147             build_annoy_index(X, self.annoy_index_path,
--> 148                               ntrees=self.ntrees, verbose=self.verbose)
    149
    150         datagen = generator_from_index(X, Y,

/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/data/knn.py in build_annoy_index(X, path, ntrees, verbose)
     28
     29     # Build n trees
---> 30     index.build(ntrees)
     31     if platform.system() == 'Windows':
     32         index.save(path)

Exception: Invalid argument
@idroz
Collaborator

idroz commented Oct 8, 2019

Hi - could you check a couple of things:

  1. Annoy library installation - what do you get if you run:
import annoy

from the environment where ivis was installed?

  2. What ivis version is used?
import ivis
ivis.__version__
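
Both checks can also be done in one go without importing either package. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and that the packages are registered under the distribution names `annoy` and `ivis`; `installed_version` is a hypothetical helper, not part of either library.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; conda may register Annoy under another name.
    for name in ("annoy", "ivis"):
        print(name, installed_version(name))
```

A result of `None` for a package that imports fine would hint that it was installed under a different distribution name.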

@sadatnfs
Author

sadatnfs commented Oct 8, 2019

Annoy imports without issues (and pip freeze shows that annoy is at 1.16.0)
Ivis version installed is 1.5.3

@idroz
Collaborator

idroz commented Oct 8, 2019

Ok, cool.

I'm still not able to reproduce the error and our build tests are passing ok both on a Mac and Ubuntu with python 3.7. Could you provide a bit more info:

  1. Did you install ivis from git or directly from pip?
  2. How is ivis object constructed - are you copying the example verbatim or are you passing additional parameters into the constructor?
  3. I noticed that you're using a conda environment - would you mind trying ivis inside a standard virtual environment? We haven't had any issues with conda in the past, but conda annoy ships under a different name (python-annoy) and maybe there is a conflict somewhere... It's a long shot, but worth checking.

Copying in @Szubie to see if we can get to the bottom of this.

idroz assigned idroz and Szubie Oct 8, 2019
@Szubie
Collaborator

Szubie commented Oct 8, 2019

It looks like Annoy is running into issues when building the KNN index. The error given is similar to the one mentioned in this open issue on the Annoy repo: spotify/annoy#404

It might be worth reinstalling an older version of Annoy (such as 1.15.2) to see if that resolves the issue.

@sadatnfs
Author

sadatnfs commented Oct 8, 2019

@idroz @Szubie ok so I downgraded to 1.15.2 for Annoy and it went past the first part without erroring, BUT then it dies here.

And to answer your question @idroz, I am using the Iris example verbatim from the README.

Building KNN index
100%|██████████| 150/150 [00:00<00:00, 165173.43it/s]
Extracting KNN from index
  0%|          | 0/150 [00:00<?, ?it/s]
WARNING:tensorflow:From /opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Training neural network
Epoch 1/1000
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-ddab2956ed6f> in <module>
----> 1 embeddings = model.fit_transform(X_scaled)

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in fit_transform(self, X, Y, shuffle_mode)
    289         """
    290
--> 291         self.fit(X, Y, shuffle_mode)
    292         return self.transform(X)
    293

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in fit(self, X, Y, shuffle_mode)
    269         """
    270
--> 271         self._fit(X, Y, shuffle_mode)
    272         return self
    273

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/ivis.py in _fit(self, X, Y, shuffle_mode)
    251             shuffle=shuffle_mode,
    252             workers=multiprocessing.cpu_count(),
--> 253             verbose=self.verbose)
    254         self.loss_history_ += hist.history['loss']
    255

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1295         shuffle=shuffle,
   1296         initial_epoch=initial_epoch,
-> 1297         steps_name='steps_per_epoch')
   1298
   1299   def evaluate_generator(self,

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py in model_iteration(model, data, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch, mode, batch_size, steps_name, **kwargs)
    219     step = 0
    220     while step < target_steps:
--> 221       batch_data = _get_next_batch(generator)
    222       if batch_data is None:
    223         if is_dataset:

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py in _get_next_batch(generator)
    361   """Retrieves the next batch of input data."""
    362   try:
--> 363     generator_output = next(generator)
    364   except (StopIteration, errors.OutOfRangeError):
    365     return None

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get(self)
    783     except Exception:  # pylint: disable=broad-except
    784       self.stop()
--> 785       six.reraise(*sys.exc_info())
    786
    787

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get(self)
    777     try:
    778       while self.is_running():
--> 779         inputs = self.queue.get(block=True).get()
    780         self.queue.task_done()
    781         if inputs is not None:

/opt/conda/envs/ivisumap/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
    668             return self._value
    669         else:
--> 670             raise self._value
    671
    672     def _set(self, i, obj):

/opt/conda/envs/ivisumap/lib/python3.6/multiprocessing/pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
    117         job, i, func, args, kwds = task
    118         try:
--> 119             result = (True, func(*args, **kwds))
    120         except Exception as e:
    121             if wrap_exception and func is not _helper_reraises_exception:

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/data_utils.py in get_index(uid, i)
    569       The value at index `i`.
    570   """
--> 571   return _SHARED_SEQUENCES[uid][i]
    572
    573

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/data/triplet_generators.py in __getitem__(self, idx)
    127         placeholder_labels = self.placeholder_labels[:len(batch_indices)]
    128         triplet_batch = [self.knn_triplet_from_neighbour_list(row_index, self.neighbour_matrix[row_index])
--> 129                          for row_index in batch_indices]
    130
    131         if (issparse(self.X)):

/opt/conda/envs/ivisumap/lib/python3.6/site-packages/ivis/data/triplet_generators.py in <listcomp>(.0)
    127         placeholder_labels = self.placeholder_labels[:len(batch_indices)]
    128         triplet_batch = [self.knn_triplet_from_neighbour_list(row_index, self.neighbour_matrix[row_index])
--> 129                          for row_index in batch_indices]
    130
    131         if (issparse(self.X)):

IndexError: index 0 is out of bounds for axis 0 with size 0

@idroz
Collaborator

idroz commented Oct 9, 2019

It still looks like Annoy is not working properly and returns a 0-length array when queried for KNNs. Could you test Annoy in isolation - what's the output of the following script:

from annoy import AnnoyIndex
import random

f = 40
t = AnnoyIndex(f, 'angular') 
for i in range(1000):
    v = [random.gauss(0, 1) for z in range(f)]
    t.add_item(i, v)

t.build(10)
t.save('test.ann')

u = AnnoyIndex(f, 'angular')
u.load('test.ann')
print(u.get_nns_by_item(0, 1000))

@sadatnfs
Author

sadatnfs commented Oct 9, 2019

It actually prints the u object:

print(u.get_nns_by_item(0, 1000))
[0, 655, 105, 412, 24, 604, 927, 147, 697, 131, 314, 554, 756, 102, 38, 184, 239, 87, 755, 932, 973, 505, 208, 597, 212, 103, 155, 950, 881, 133, 654, 52, 742, 134, 601, 419, 323, 386, 687, 582, 75, 179, 542, 191, 299, 66, 593, 494, 606, 433, 348, 631, 904, 279, 736, 235, 903, 449, 693, 234, 730,  ......

@idroz
Collaborator

idroz commented Oct 10, 2019

Hi - unfortunately we can't reproduce the issue. We ran a Debian VM with conda environments on a Unix machine and python 3.6 and 3.7 - ivis works as expected.

This does sound similar to an earlier issue, #25, which is Windows related: parallel neighbour retrieval fails there due to differences in multiprocessing approaches between Linux and Windows. Are you by any chance using the Windows Subsystem for Linux?
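
One quick way to check from inside Python is to look at the kernel release string. This is only a heuristic sketch: it assumes that kernels running under the Windows subsystem report "microsoft" somewhere in their release string, and `looks_like_wsl` is a hypothetical helper name.

```python
import platform


def looks_like_wsl(kernel_release):
    """Heuristic: kernels under the Windows subsystem usually mention 'microsoft'."""
    return "microsoft" in kernel_release.lower()


if __name__ == "__main__":
    # platform.uname().release is the kernel release reported by the OS.
    print(looks_like_wsl(platform.uname().release))
```

A `False` result does not rule anything out; it only means the release string carries no such marker.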

@Szubie
Collaborator

Szubie commented Oct 10, 2019

> It actually prints the u object:

Thanks for this. This makes me suspect that the issue is caused by the Annoy index being built directly on disk.

I've pushed a new branch which allows the user to specify whether or not to build the Annoy index directly on disk when constructing the Ivis object here: https://github.com/beringresearch/ivis/tree/on-disk-build-config

Hopefully disabling the on_disk_build by passing in build_index_on_disk=False when constructing the Ivis object will help solve the issue.
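
As a rough sketch of what that looks like in user code: `build_index_on_disk` is the option added in the branch above, while `embedding_dims=2` and `k=15` are assumed here to mirror the README example, and `build_model` is a hypothetical helper used only for illustration.

```python
# Constructor arguments for Ivis. Only `build_index_on_disk` comes from this
# thread; the other two values are assumptions taken to match the README.
IVIS_KWARGS = {"embedding_dims": 2, "k": 15, "build_index_on_disk": False}


def build_model(kwargs=IVIS_KWARGS):
    """Construct an Ivis model that builds its Annoy index in memory."""
    from ivis import Ivis  # imported lazily so the snippet loads without ivis

    return Ivis(**kwargs)
```

Calling `build_model().fit_transform(X_scaled)` would then run the iris example with the index built in memory rather than directly on disk.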

@sadatnfs
Author

You're indeed correct @idroz: I am using Debian through the Windows Subsystem for Linux.

@idroz
Copy link
Collaborator

idroz commented Oct 10, 2019

Interesting use-case. I think that a branch submitted by @Szubie should address that issue. Let us know if it works and we’ll merge to master.

@sadatnfs
Author

Thank you so much!! I can confirm that creating the Ivis model with build_index_on_disk=False works!

@Szubie
Collaborator

Szubie commented Oct 15, 2019

> Thank you so much!! I can confirm that creating the Ivis model with build_index_on_disk=False works!

Thanks for confirming. We have merged the fixes addressing this issue into the master branch here:
f5707df

Szubie closed this as completed Oct 15, 2019
@pwolff

pwolff commented Oct 19, 2019

I too got the error "IndexError: index 0 is out of bounds for axis 0 with size 0" while attempting to run Ivis on Windows. I added the fixes in f5707df, but the error remained.
The solution was quite simple, though.
Ivis launches child processes, and Windows has no fork, so each child is spawned as a fresh interpreter that re-imports the main module.
Just add the "if __name__ == '__main__':" guard to the main module.
This is easy to forget when you simply drop code into an IDE.
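
For anyone unsure where the guard goes, here is a minimal sketch using only the standard library. It does not use Ivis at all; `square` and `main` are placeholder names, and the pool simply stands in for whatever child processes the library starts.

```python
import multiprocessing


def square(x):
    """Work done in a child process; must live at module top level to be picklable."""
    return x * x


def main():
    # Starting child processes belongs inside a function called from the guard.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Without this guard, a spawned child that re-imports this module would
    # try to start its own pool again.
    print(main())
```

In a script that uses Ivis, the model construction and the `fit_transform` call would sit where `main()` is called here.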

@idroz
Copy link
Collaborator

idroz commented Oct 21, 2019

Thanks, @pwolff. In which module did you add "if __name__ == '__main__':"? We can't seem to reproduce the error on a Windows machine - would you be able to submit a pull request?
