Parameters that affect GPU RAM usage #5
Hello @ChengYuChuan, thanks for the question! Unfortunately, none of the MM-SHAP script parameters reduce VRAM utilization.
Proposed solution: So the only thing I can imagine is that you should reduce the batch size of the shap.Explainer, which is defined here. I think all you need to do to affect this parameter is to change this line (see the sketch below). Hope this helps!
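Roughly something like this (just a rough sketch from my side; the wrapper, masker, and input names stand in for whatever the script actually builds):

```python
import shap

# Hypothetical names standing in for the objects built in mm-shap_albef_dataset.py.
explainer = shap.Explainer(model_wrapper, custom_masker)

# A smaller batch_size limits how many masked inputs are pushed through the
# model at once, which lowers peak VRAM during the SHAP evaluation.
shap_values = explainer(inputs, batch_size=8)  # the default "auto" may be too large for your GPU
```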
Hello @LetiP, thanks for your quick reply. I am trying to reduce the batch size as you said, but the weird thing is... May I ask what your computer specs are? I am considering trying the whole program on a Mac M1 chip with 16 GB RAM, or deploying it on Google Colab, but the CUDA part seems to only support NVIDIA GPUs. However, this project is based on Python 3.6, and I am worried that Python 3.8 would not work with the environment setup (since if I run it on the Mac M1, I need to use the latest PyTorch 1.12 with Python 3.8). Have you ever heard of someone running this project on macOS or Colab? (Sorry, I know this question may not be within your experience. :/)
Where exactly does the code throw OOM errors? From what you just said, it might even be at the level of model loading, in which case it never gets into the SHAP analysis where decreasing the batch size might help. I used a GPU with 24 GB of VRAM. I see no reason why the code would not work with Python 3.8; definitely worth a try. I did use Linux on the local cluster, so sorry, I have no experience with Mac or Colab for this project. Just thinking that maybe you could also try to:
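Separately, to pin down where exactly the memory goes, you could add a small helper like this at a few points in the script (a rough sketch from my side; the tag strings are just examples):

```python
import torch

def report_vram(tag):
    # Print current and peak GPU memory so you can see how far the script gets
    # before it runs out of memory.
    if torch.cuda.is_available():
        used = torch.cuda.memory_allocated() / 1024**3
        peak = torch.cuda.max_memory_allocated() / 1024**3
        print(f"[{tag}] allocated: {used:.2f} GiB, peak: {peak:.2f} GiB")

# For example, call report_vram("after model load") right after loading ALBEF
# and report_vram("before shap") right before the explainer is called.
```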
Sorry, I should have described the problem more specifically in the beginning. Yes, I think you are right: the problem is narrowed down to the model loading part. In the beginning, I thought it was only an evaluation between different metrics, so I thought a normal PC could handle it. Here is my error message.
When I want to activate torch.no_grad(), how should I manage this?
Ok, so it is clear that your problem is not the shap (interpretability) code, but plain model inference. Given your hardware, you cannot load the ALBEF model; the code does not even get to the interpretation. I think you need to edit this to:

```python
with torch.no_grad():
    checkpoint = torch.load(model_path, map_location='cpu')
    msg = model.load_state_dict(checkpoint, strict=False)
```

Also, it is best to put that around the line that throws the error message for you, just in case:

```python
model_prediction = model(image, text_input)
```
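Concretely, that last part would look something like this (same idea, just a sketch using the variable names from the snippet above):

```python
with torch.no_grad():
    model_prediction = model(image, text_input)
```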
@LetiP Thank you so much. After I talked to my lecturer, I switched the task to a cluster and will implement the project there, so I won't encounter this issue anymore. Thank you again. :D
Great! Best of luck and may the cluster be with you. 🤞
Thanks for the work. I tried to reproduce the results of this paper; however, I encountered the problem of insufficient GPU RAM.
The Python file I am trying to run: mm-shap_albef_dataset.py
Data I used: existence
Number of samples: all
The basic settings I used:
The problem I encountered:
I am wondering, if I cannot scale up my GPU RAM right now, which parameters I should pay attention to. The following few options are on my mind now:
num_samples
patch_size (?)
I tried this today, but it affects the tensor shapes (torch.Size); I only changed it from 16 to 32.
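(I guess the shape change comes from the patching itself; here is a rough back-of-the-envelope sketch, where the image size is just a hypothetical example and I am assuming the pretrained checkpoint expects 16-pixel patches:)

```python
# A ViT splits the image into (image_size // patch_size) ** 2 patches, so a
# checkpoint trained with patch_size = 16 expects a different number of patch
# tokens than a model built with patch_size = 32.
image_size = 384  # hypothetical example value; use whatever the ALBEF config specifies
for patch_size in (16, 32):
    num_patches = (image_size // patch_size) ** 2
    print(f"patch_size={patch_size} -> {num_patches} patches")
# patch_size=16 -> 576 patches
# patch_size=32 -> 144 patches
```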
Thanks in advance for your time and reading.