Describe the bug
In Google Colab, `!pip install faiss-cpu` succeeds and `import faiss` raises no error, but

```python
embeddings_dataset.add_faiss_index(column='embeddings')
```

raises:
```
/usr/local/lib/python3.11/dist-packages/datasets/search.py in __init__(self, device, string_factory, metric_type, custom_index)
    247         self.faiss_index = custom_index
    248         if not _has_faiss:
--> 249             raise ImportError(
    250                 "You must install Faiss to use FaissIndex. To do so you can run conda install -c pytorch faiss-cpu or conda install -c pytorch faiss-gpu. "
    251                 "A community supported package is also available on pypi: pip install faiss-cpu or pip install faiss-gpu. "
```
This happens because

```python
_has_faiss = importlib.util.find_spec("faiss") is not None
```

at the top of `datasets/search.py` evaluates to `False`, even though the same `find_spec("faiss")` call run in a Colab notebook cell returns

```
ModuleSpec(name='faiss', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7b7851449f50>, origin='/usr/local/lib/python3.11/dist-packages/faiss/__init__.py', submodule_search_locations=['/usr/local/lib/python3.11/dist-packages/faiss'])
```
However,

```python
import datasets
datasets.search._has_faiss
```

in the same Colab notebook also returns `False`. The same happens with `_has_elasticsearch`.
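For reference, the check that `datasets/search.py` performs is plain `importlib.util.find_spec`, which returns a `ModuleSpec` for any importable module and `None` otherwise. A minimal stdlib-only illustration of that pattern (the missing package name is made up):

```python
import importlib.util

# The availability-flag pattern used in datasets/search.py:
# find_spec() returns a ModuleSpec if the module can be imported, else None.
_has_json = importlib.util.find_spec("json") is not None        # True: stdlib
_has_missing = importlib.util.find_spec("no_such_pkg_q12345") is not None  # False

print(_has_json, _has_missing)
```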
Steps to reproduce the bug
- Follow https://huggingface.co/learn/nlp-course/chapter5/6?fw=pt in Google Colab
- Run the cells up to `embeddings_dataset.add_faiss_index(column='embeddings')` and `embeddings_dataset.add_elasticsearch_index(column='embeddings')`: https://colab.research.google.com/drive/1h2cjuiClblqzbNQgrcoLYOC8zBqTLLcv#scrollTo=3ddzRp72auOF
Expected behavior
I've only just started the tutorial, so I can't say for certain, but `embeddings_dataset.add_faiss_index(column='embeddings')` should work without raising an `ImportError`.
Environment info
Google Colab notebook with the default configuration