Cuda Running out of memory during evaluate_predictions #7

Closed
utkarsh-bhalode opened this issue Jun 24, 2021 · 2 comments

Comments

utkarsh-bhalode commented Jun 24, 2021

I am trying to replicate the results of CompCos on the MIT-States dataset, and I am getting the following error:

  File "train.py", line 228, in <module>
    main()
  File "train.py", line 108, in main
    test(epoch, image_extractor, model, testloader, evaluator_val, writer, args, logpath)
  File "train.py", line 193, in test
    stats = evaluator.evaluate_predictions(results, all_attr_gt, all_obj_gt, all_pair_gt, all_pred_dict, topk=args.topk)
  File "/nfs4/krissna/OWCZL/czsl-main/czsl-main/models/common.py", line 489, in evaluate_predictions
    results = self.score_fast_model(scores, obj_truth, bias = bias, topk = topk)
  File "/nfs4/krissna/OWCZL/czsl-main/czsl-main/models/common.py", line 363, in score_fast_model
    scores[~mask] += bias # Add bias to test pairs
RuntimeError: CUDA out of memory. Tried to allocate 4.18 GiB (GPU 0; 10.92 GiB total capacity; 4.68 GiB already allocated; 2.44 GiB free; 7.85 GiB reserved in total by PyTorch)

(I am currently using CUDA 10.2 on a GPU with roughly 10 GiB of memory.)
Could you suggest a solution for this error?
Thanks
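
Editor's note: the error message reports a single 4.18 GiB request against 2.44 GiB of free GPU memory. The sketch below is only a back-of-the-envelope check of that kind of arithmetic, not code from the repository. The helper names `dense_tensor_gib` and `fits` are made up for illustration, and the image and pair counts are hypothetical placeholders, not MIT-States figures.

```python
def dense_tensor_gib(rows: int, cols: int, bytes_per_element: int = 4) -> float:
    """Size in GiB of a dense rows x cols tensor (4 bytes per element for float32)."""
    return rows * cols * bytes_per_element / 2**30


def fits(requested_gib: float, free_gib: float) -> bool:
    """True if a single allocation of requested_gib fits in the free memory."""
    return requested_gib <= free_gib


# Figures quoted in the error message above.
requested_gib = 4.18
free_gib = 2.44
print(fits(requested_gib, free_gib))  # the request is larger than the free memory

# Hypothetical sizes, for illustration only (not taken from the dataset).
num_test_images = 10_000
num_candidate_pairs = 30_000
print(f"{dense_tensor_gib(num_test_images, num_candidate_pairs):.2f} GiB")
```

A score matrix with one column per candidate pair grows linearly with the size of the open-world pair set, and indexing a tensor with a boolean mask produces a copy of the selected elements, so an update such as `scores[~mask] += bias` needs extra memory on top of the matrix itself.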

@mancinimassimiliano (Collaborator)

Hi @utkarsh-bhalode.

Yes, open-world evaluation is pretty costly. I have added a new flag, --cpu_eval. If you append it to the training/test commands, the script uses CPU memory during evaluation. Note that this makes the evaluation even slower, though.

We will try to speed it up in the future, but for now, please let me know if this simple fix works for you.
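
Editor's note: a minimal sketch of how a flag like this can switch the evaluation device. Only the flag name `--cpu_eval` comes from this thread. The parser, the `pick_eval_device` helper, and the device strings are assumptions made for illustration and may not match how the repository implements it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the --cpu_eval flag name comes from this thread."""
    parser = argparse.ArgumentParser(description="Evaluation device selection sketch")
    parser.add_argument(
        "--cpu_eval",
        action="store_true",
        help="keep evaluation tensors in CPU memory instead of GPU memory",
    )
    return parser


def pick_eval_device(cpu_eval: bool, cuda_available: bool) -> str:
    """Return the device name that evaluation tensors would be moved to."""
    if cpu_eval or not cuda_available:
        return "cpu"
    return "cuda"


# Without the flag, evaluation stays on the GPU when one is available.
args = build_parser().parse_args([])
default_device = pick_eval_device(args.cpu_eval, cuda_available=True)

# With the flag, evaluation falls back to CPU memory.
args = build_parser().parse_args(["--cpu_eval"])
cpu_device = pick_eval_device(args.cpu_eval, cuda_available=True)

print(default_device, cpu_device)
```

In a PyTorch script the returned name would typically be passed to `tensor.to(device)` before the scoring step, trading GPU memory pressure for slower CPU arithmetic.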

@utkarsh-bhalode (Author)

Hey! @mancinimassimiliano
Yes, it works now using CPU memory. Thanks a lot!
Thanks for your insightful work!
