TCAV: cannot run compute_cavs() in cuda #719

pdpino · 2021-07-14T17:29:53Z

🐛 Bug

When the model is on the GPU, no CAV vectors are precomputed (i.e. the ./cav folder has not been created), and TCAV() uses the default classifier (i.e. classifier=None is passed), calling tcav.compute_cavs() throws a wrong-device error.

To Reproduce

Steps to reproduce the behavior:

1. Remove precomputed CAV vectors (if any), i.e. the ./cav folder.
2. Run the following code:

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from captum.concept import TCAV, Concept

DEVICE = 'cuda'

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 10, 10)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.flatten = nn.Flatten()
        self.classifier = nn.Linear(10, 1)
    def forward(self, images):
        # images shape: batch_size, 3, height, width
        x = self.conv(images) # shape: batch_size, 10, features-height, features-width
        x = self.pool(x) # shape: batch_size, 10, 1, 1
        x = self.flatten(x) # shape: batch_size, 10
        x = self.classifier(x) # shape: batch_size, 1
        return x

class DummyDataset(Dataset):
    def __init__(self, device='cpu'):
        super().__init__()
        self.device = device
    def __getitem__(self, idx):
        image = torch.zeros(3, 256, 256)
        return image.to(self.device)
    def __len__(self):
        return 10

model = MyModel().to(DEVICE)

concept0 = Concept(0, 'concept0', DataLoader(DummyDataset(device=DEVICE), batch_size=10))
concept1 = Concept(1, 'concept1', DataLoader(DummyDataset(device=DEVICE), batch_size=10))

tcav = TCAV(model, layers='conv')

cavs = tcav.compute_cavs([[concept0, concept1]])

Running the tcav.compute_cavs(...) line throws:

RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)

See the full stack here

Expected behavior

The method compute_cavs() should run without errors on both CPU and GPU (or the docs should state that only CPU is supported).

Environment


Captum / PyTorch Version: captum 0.4.0, torch 1.7.1+cu110
OS (e.g., Linux): Ubuntu 18.04.5
How you installed Captum / PyTorch (conda, pip, source): source
Build command you used (if compiling from source): pip install -e ~/software/captum
Python version: 3.6
CUDA/cuDNN version: cu110
GPU models and configuration: using a GPU RTX 3090
Any other relevant information: I'm running captum in the master branch, latest commit is f658185

Additional context

If the device is cpu (i.e. DEVICE = 'cpu' in the code above), the code runs without error and saves the CAV vectors to the ./cav folder.
If the CAV vectors are precomputed, the code runs without error (on both GPU and CPU).

Possible culprit

It seems to me that this could be an issue with how the SkLearnSGDClassifier model handles devices.
I was able to hot-fix it by adding these two lines to the DefaultClassifier().train_and_eval() method:

        # ...
        self.lm.fit(DataLoader(TensorDataset(x_train, y_train)))

        self.lm.linear.to(x_test.device) # Add this

        predict = self.lm(x_test)

        predict = self.lm.classes()[torch.argmax(predict, dim=1)]
        predict = predict.to(x_test.device) # Add this
        # ...

This is probably not an optimal solution; I'm not familiar enough with the codebase to know better or to inspect this further 😁
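To show the idea behind the hot-fix outside of Captum, here is a minimal sketch in plain PyTorch. It assumes a linear model that was created (or fitted) on the CPU and test inputs that may live on the GPU. The helper name predict_on_input_device and the toy shapes are hypothetical and are not part of Captum.

```python
import torch
from torch import nn


def predict_on_input_device(linear, classes, x):
    """Hypothetical helper: predict class labels on the device that x lives on."""
    # nn.Module.to moves the parameters in place and returns the module itself.
    linear = linear.to(x.device)
    # Tensor.to returns a tensor on the target device (a copy if the device differs).
    classes = classes.to(x.device)
    with torch.no_grad():
        scores = linear(x)  # shape: n_samples, n_classes
    return classes[torch.argmax(scores, dim=1)]


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

linear = nn.Linear(10, 2)          # lives on the CPU, like a freshly fitted model
classes = torch.tensor([0, 1])     # class labels, also on the CPU
x_test = torch.randn(4, 10, device=device)

predict = predict_on_input_device(linear, classes, x_test)
print(predict.device, predict.shape)
```

Without the two .to(x.device) calls, the forward pass would mix CPU parameters with GPU inputs, which is the kind of mismatch reported above.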

Btw, TCAV is an awesome feature! 🚀


Summary: Addresses the issues: #721 #719 #720 Pull Request resolved: #725 Reviewed By: bilalsal Differential Revision: D30356015 Pulled By: NarineK fbshipit-source-id: 010a5263bdfc33e8c4d3f9de523d9d3ba3969f49

NarineK · 2021-08-17T06:20:05Z

fixed with #725

pdpino mentioned this issue Jul 14, 2021

TCAV: cannot run interpret() in cuda #721

Closed

NarineK mentioned this issue Jul 21, 2021

Support TCAV on cuda #725

Closed

NarineK closed this as completed Aug 17, 2021
