TCAV: cannot run compute_cavs() in cuda #719

Closed
pdpino opened this issue Jul 14, 2021 · 1 comment

pdpino commented Jul 14, 2021

🐛 Bug

When the model is on the GPU, no CAV vectors are precomputed (i.e. the ./cav folder has not been created), and the default classifier is used for TCAV() (i.e. classifier=None is passed), running tcav.compute_cavs() throws a wrong-device error.

To Reproduce

Steps to reproduce the behavior:

  1. Remove precomputed cav vectors (if any), i.e. the ./cav folder
  2. Run the following code
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from captum.concept import TCAV, Concept

DEVICE = 'cuda'

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 10, 10)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.flatten = nn.Flatten()
        self.classifier = nn.Linear(10, 1)
    def forward(self, images):
        # images shape: batch_size, 3, height, width
        x = self.conv(images) # shape: batch_size, 10, features-height, features-width
        x = self.pool(x) # shape: batch_size, 10, 1, 1
        x = self.flatten(x) # shape: batch_size, 10
        x = self.classifier(x) # shape: batch_size, 1
        return x

class DummyDataset(Dataset):
    def __init__(self, device='cpu'):
        super().__init__()
        self.device = device
    def __getitem__(self, idx):
        image = torch.zeros(3, 256, 256)
        return image.to(self.device)
    def __len__(self):
        return 10

model = MyModel().to(DEVICE)

concept0 = Concept(0, 'concept0', DataLoader(DummyDataset(device=DEVICE), batch_size=10))
concept1 = Concept(1, 'concept1', DataLoader(DummyDataset(device=DEVICE), batch_size=10))

tcav = TCAV(model, layers='conv')

cavs = tcav.compute_cavs([[concept0, concept1]])
  3. Running the tcav.compute_cavs(...) line throws:
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)

See the full stack here

Expected behavior

The method compute_cavs() should run without errors on either CPU or GPU (or the docs should state that only CPU is supported).

Environment

Describe the environment used for Captum

  • Captum / PyTorch Version: captum 0.4.0, torch 1.7.1+cu110
  • OS (e.g., Linux): Ubuntu 18.04.5
  • How you installed Captum / PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): pip install -e ~/software/captum
  • Python version: 3.6
  • CUDA/cuDNN version: cu110
  • GPU models and configuration: using a GPU RTX 3090
  • Any other relevant information: I'm running captum in the master branch, latest commit is f658185

Additional context

  • If the device is the CPU (i.e. DEVICE = 'cpu' in the code above), the code runs without error and saves the CAV vectors to the ./cav folder
  • If the CAV vectors are precomputed, the code runs without error (on both GPU and CPU)

Possible culprit

        # ...
        self.lm.fit(DataLoader(TensorDataset(x_train, y_train)))

        self.lm.linear.to(x_test.device) # Add this

        predict = self.lm(x_test)

        predict = self.lm.classes()[torch.argmax(predict, dim=1)]
        predict = predict.to(x_test.device) # Add this
        # ...
  • This is probably not an optimal solution; I'm not familiar enough with the codebase to know better or to inspect it further 😁
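
The two added lines follow a general pattern: move the fitted linear layer and the class labels to the device of the test tensor before predicting. A minimal sketch of that pattern in plain PyTorch, independent of Captum's internals, might look like the following. The helper name predict_on_input_device and the toy tensors are hypothetical, and the script falls back to CPU when CUDA is unavailable.

```python
import torch
from torch import nn


def predict_on_input_device(
    linear: nn.Linear, classes: torch.Tensor, x_test: torch.Tensor
) -> torch.Tensor:
    """Hypothetical helper: run a CPU-fitted linear layer on x_test's device."""
    # nn.Module.to moves the layer's parameters in place
    linear.to(x_test.device)
    logits = linear(x_test)
    # Keep the class labels and the argmax indices on the same device
    return classes.to(x_test.device)[torch.argmax(logits, dim=1)]


# Use the GPU when present; otherwise the same code runs on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

linear = nn.Linear(10, 2)        # created on CPU, as if it had been fitted there
classes = torch.tensor([0, 1])   # class labels, also on CPU
x_test = torch.randn(4, 10, device=device)

predicted = predict_on_input_device(linear, classes, x_test)
print(predicted.device, predicted.shape)
```

This only illustrates the device handling; whether the fix in #725 takes the same approach is not stated in this issue.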

Btw, TCAV is an awesome feature! 🚀

facebook-github-bot pushed a commit that referenced this issue Aug 17, 2021
Summary:
Addresses the issues: #721 #719 #720

Pull Request resolved: #725

Reviewed By: bilalsal

Differential Revision: D30356015

Pulled By: NarineK

fbshipit-source-id: 010a5263bdfc33e8c4d3f9de523d9d3ba3969f49

NarineK commented Aug 17, 2021

fixed with #725

NarineK closed this as completed Aug 17, 2021