
Memory requirement for TorchIndex #8

Closed
oussaidene opened this issue Feb 7, 2023 · 1 comment

oussaidene commented Feb 7, 2023

Hi,

NumpyIndex does not require the full index to be in memory, but can the same be said of TorchIndex?
I'm trying to run an experiment on the MS MARCO dev set with TorchIndex and I get a CUDA out-of-memory error.

Thanks for any help!

Here's the code:

```python
import pyterrier as pt
import pyterrier_dr

# Training
model = pyterrier_dr.TctColBert(model_name='distilbert-base-uncased')
dataset = pt.get_dataset('irds:msmarco-passage/train/judged')
model.fit(dataset=dataset, steps=1000)

# Indexing
dataset = pt.get_dataset('irds:msmarco-passage/dev/judged')
index = pyterrier_dr.TorchIndex('index.torch')
idx_pipeline = model >> index
idx_pipeline.index(dataset.get_corpus_iter())

# Evaluation
retr_pipeline = model >> index
pt.Experiment(
    [retr_pipeline],
    dataset.get_topics(),
    dataset.get_qrels(),
    eval_metrics=["map", "recip_rank", "ndcg", "ndcg_cut_10"]
)
```

And I get this error:

```
Traceback (most recent call last):
  File "/XXX/DR_cuda/DR_cuda.py", line 22, in <module>
    pt.Experiment(
  File "/XXX/dr_env/lib/python3.8/site-packages/pyterrier/pipelines.py", line 450, in Experiment
    time, evalMeasuresDict = _run_and_evaluate(
  File "/XXX/dr_env/lib/python3.8/site-packages/pyterrier/pipelines.py", line 170, in _run_and_evaluate
    res = system.transform(topics)
  File "/XXX/dr_env/lib/python3.8/site-packages/pyterrier/ops.py", line 335, in transform
    topics = m.transform(topics)
  File "/XXX/dr_env/lib/python3.8/site-packages/pyterrier_dr/indexes.py", line 676, in transform
    scores = query_vecs @ self._cuda_data[:bsize].T
RuntimeError: CUDA out of memory. Tried to allocate 33.70 GiB (GPU 0; 10.92 GiB total capacity; 1.63 GiB already allocated; 8.27 GiB free; 2.00 GiB reserved in total by PyTorch)

srun: error: gpu-nc06: task 0: Exited with exit code 1
```
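For scale, an allocation of this size is consistent with trying to score against the whole dev corpus on the GPU at once: msmarco-passage has roughly 8.8M passages, and one 768-dimensional float32 vector per passage already comes to about 25 GiB (a back-of-the-envelope sketch; the passage count, dimensionality, and dtype are assumptions, not taken from the traceback):

```python
# Rough size of a dense msmarco-passage index held in GPU memory at once.
# Assumptions: ~8.8M passages, 768-dim vectors (DistilBERT hidden size), float32.
num_passages = 8_841_823
dim = 768
bytes_per_value = 4  # float32

total_bytes = num_passages * dim * bytes_per_value
print(f"{total_bytes / 2**30:.1f} GiB")  # 25.3 GiB
```

That is far more than the 10.92 GiB the card has, so the full index cannot sit on the GPU in one piece.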

@oussaidene oussaidene changed the title Memory requerement for TorchIndex Memory requirement for TorchIndex Feb 7, 2023
@seanmacavaney
Collaborator

Hi @oussaidene,

TorchIndex works by moving chunks of data to GPU memory. The size of these chunks can be controlled with the idx_mem parameter, which specifies an approximation of the memory it will use (in bytes). By default it's 500000000 (~500MB), but you can adjust that as needed. Since your encoder model is probably also on the GPU, this size will need to be lower than your total GPU memory.
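For a card like the one in the traceback (~11 GiB total), picking a value might look like this (a sketch; the 4 GiB model headroom is a rough assumption, and passing idx_mem at construction time follows the parameter description above):

```python
# Choose idx_mem (in bytes) so one index chunk plus the encoder model fit on the GPU.
gpu_total = 11 * 2**30       # ~10.92 GiB total capacity, per the traceback
model_headroom = 4 * 2**30   # rough allowance for the encoder model + activations
idx_mem = gpu_total - model_headroom
print(idx_mem)  # 7516192768 (~7 GiB)
```

The resulting value would then be supplied when constructing the index, e.g. `pyterrier_dr.TorchIndex('index.torch', idx_mem=idx_mem)`.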
