
Grad CAM for multiple input arguments #501

Closed
IshitaB28 opened this issue May 6, 2024 · 6 comments

Comments
@IshitaB28

I am trying to use GradCAM on my model, which takes more than one input argument. Since I have a list of 4 inputs, I tried passing *input_tensor instead of input_tensor and modifying the source code accordingly, but I am running into a series of errors. This is where I am at now:
Traceback (most recent call last):
  File "/home/ishita-wicon/Documents/QA/ISIQA/UNET/expl_exp.py", line 449, in <module>
    grayscale_cam = cam(input_tensor=[left_patches, right_patches, left_image_patches, right_image_patches], targets=targets)
  File "/home/ishita-wicon/.local/lib/python3.10/site-packages/pytorch_grad_cam/base_cam.py", line 192, in __call__
    return self.forward(input_tensor,
  File "/home/ishita-wicon/.local/lib/python3.10/site-packages/pytorch_grad_cam/base_cam.py", line 105, in forward
    cam_per_layer = self.compute_cam_per_layer(*input_tensor,  # changes to * for list
TypeError: BaseCAM.compute_cam_per_layer() takes 4 positional arguments but 7 were given

Is there any other way?

@Studentpengyu

Hi there, I am facing a similar issue. My model requires an input that consists of a list containing two tensors. How did you handle this? Could you share your solution?

@IshitaB28
Author

IshitaB28 commented May 28, 2024

Hi, so I handled it by combining the inputs into one tensor and then separating them out in the forward function of my model.

For example, if you need to pass 4 inputs into your model, you do:

inp = torch.cat((a, b, c, d), dim=0)
grayscale_cam = cam(input_tensor=inp, targets=None)

And then, in the forward function of your model:

def forward(self, inp):
    a, b, c, d = inp[0], inp[1], inp[2], inp[3]

and then proceed with the rest of your forward pass.
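For reference, here is a minimal, self-contained sketch of that pattern, assuming all four inputs share the same shape and each has a batch dimension of size 1; the MultiInputModel stand-in and its layers are purely illustrative, not part of pytorch-grad-cam:

```python
import torch
import torch.nn as nn
from pytorch_grad_cam import GradCAM


class MultiInputModel(nn.Module):
    """Stand-in for a model that originally took four separate tensors."""

    def __init__(self):
        super().__init__()
        self.features = nn.Conv2d(12, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, 10)

    def forward(self, inp):
        # Recover the four original inputs from the single tensor that
        # GradCAM passes in, assuming each had batch size 1 and they were
        # concatenated along dim 0.
        a, b, c, d = torch.chunk(inp, 4, dim=0)
        # Fuse them and run the hooked backbone once; a real model would
        # apply its own multi-branch logic here instead.
        fused = self.features(torch.cat((a, b, c, d), dim=1))
        return self.head(fused.mean(dim=(2, 3)))


model = MultiInputModel()
cam = GradCAM(model=model, target_layers=[model.features])

# Four inputs with identical shapes, each with batch size 1.
a = b = c = d = torch.randn(1, 3, 64, 64)
inp = torch.cat((a, b, c, d), dim=0)
grayscale_cam = cam(input_tensor=inp, targets=None)
```

The key point is that cam() only ever sees a single input_tensor; the forward function is responsible for splitting it back into the original pieces.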

@Studentpengyu

Thank you for your prompt response. I will try it!

@Studentpengyu

Hi, I have modified the input format by first combining the inputs into one tensor and then separating them out in the forward function of my model.

In my case, my two inputs are an image tensor and a text tensor. Since these two have different sizes, direct concatenation is not possible, so I flatten both and concatenate along the first dimension, then record the lengths and shapes for later separation. The model works well this way. However, there is still an issue.

Here is a snippet of my code:

inp = torch.cat((text_flat, image_flat), dim=0)
grayscale_cam = cam(input_tensor=inp, targets=None)

The grayscale_cam comes back flattened too, and its size is not what I expected: it is the combined size of the text and image tensors, while I expected it to match the image tensor, because I need to display grayscale_cam on the image.

Therefore, I extracted the image part, but the resulting image was completely incorrect.
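To make the setup above concrete, here is a small sketch of the flatten-and-record-lengths pattern described in this comment (the tensor names and shapes are illustrative, not from the actual model):

```python
import torch

# Illustrative shapes: a text tensor and an image tensor that cannot be
# concatenated directly because their shapes differ.
text = torch.randn(1, 77, 512)
image = torch.randn(1, 3, 224, 224)

# Flatten both, remember their lengths and original shapes, and
# concatenate so that cam() receives a single tensor.
text_flat = text.flatten()
image_flat = image.flatten()
lengths = (text_flat.numel(), image_flat.numel())
shapes = (text.shape, image.shape)

inp = torch.cat((text_flat, image_flat), dim=0)

# Inside the model's forward, the two inputs are recovered again:
text_rec = inp[:lengths[0]].reshape(shapes[0])
image_rec = inp[lengths[0]:].reshape(shapes[1])
```

Since the heat map is derived from whatever is passed as input_tensor, a flattened text-plus-image tensor gives back a map sized for the combined input rather than for the image alone, which seems to be the behaviour described above.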

@Studentpengyu

Hi there, I wanted to let you know that my issue has been resolved.

Here is my solution:
I moved the text features as fixed features into the forward function. Since my text features are extracted using a large language model and do not need updating, this approach works well.
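A minimal sketch of that idea, assuming the text features can be precomputed once and stored on a small wrapper module so that input_tensor is only the image; the class names and toy layers below are illustrative, not from any specific library:

```python
import torch
import torch.nn as nn
from pytorch_grad_cam import GradCAM


class ToyMultimodalModel(nn.Module):
    """Stand-in for a model that takes an image tensor plus text features."""

    def __init__(self):
        super().__init__()
        self.visual = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.head = nn.Linear(16 + 8, 10)

    def forward(self, image, text_features):
        img_feat = self.visual(image).mean(dim=(2, 3))          # [N, 16]
        txt_feat = text_features.expand(img_feat.size(0), -1)   # [N, 8]
        return self.head(torch.cat((img_feat, txt_feat), dim=1))


class ImageOnlyWrapper(nn.Module):
    """Holds precomputed text features so GradCAM only sees the image."""

    def __init__(self, model, text_features):
        super().__init__()
        self.model = model
        # Fixed features: stored as a buffer so they move with .to(device)
        # but are not part of the input being explained.
        self.register_buffer("text_features", text_features)

    def forward(self, image):
        return self.model(image, self.text_features)


text_features = torch.randn(1, 8)   # precomputed once, e.g. by a text encoder
wrapped = ImageOnlyWrapper(ToyMultimodalModel(), text_features)

cam = GradCAM(model=wrapped, target_layers=[wrapped.model.visual])
image_tensor = torch.randn(1, 3, 224, 224)
grayscale_cam = cam(input_tensor=image_tensor, targets=None)
```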

Thank you for your help!

@IshitaB28
Author

Hello, good to know that it's solved now. Thanks for sharing the issue and the solution!
