AssertionError: Torch not compiled with CUDA enabled - DETECTRON CPU/LINUX TRAINING ERROR 

Hello,

I recently created a venv and installed PyTorch (CPU only) as follows:

> pip install torch==1.5.1+cpu torchvision==0.6.1+cpu -f https://download.pytorch.org/whl/torch_stable.html

Then I installed the pre-built detectron2 wheel for Linux and CPU with the following command (all other prerequisites are also installed):

>python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.5/index.html
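
As a quick sanity check on the install steps above, a minimal sketch like the following can confirm that the wheel in the venv really is a CPU-only build. The helper name `describe_torch_build` is hypothetical; the attributes it reads (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`) are standard PyTorch ones.

```python
import importlib


def describe_torch_build():
    """Report whether the installed PyTorch wheel can use CUDA.

    Returns a dict so the result is easy to paste into a bug report.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this interpreter at all.
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,      # CPU-only wheels carry a "+cpu" suffix
        "cuda_build": torch.version.cuda,  # None when built without CUDA
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

On a CPU-only install, `cuda_build` should be `None` and `cuda_available` should be `False`, which is consistent with the environment dump further down.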

## Instructions To Reproduce the Issue:

I am training on a custom dataset, and the trainer.train() call raises the following error:

>AssertionError: Torch not compiled with CUDA enabled

1. Here is my code to get there:
```
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import cv2
import os
import random
from matplotlib import pyplot as plt

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.structures import BoxMode
from detectron2.data.datasets import register_coco_instances
from detectron2.data.catalog import DatasetCatalog
from detectron2.engine import HookBase

register_coco_instances("boat_train", {}, "/home/Documents/train/instances.json", "/home/Documents/train")
register_coco_instances("boat_val", {}, "/home/Documents/val/instances.json", "/home/Documents/val")

from detectron2.engine import DefaultTrainer
from detectron2.engine import TrainerBase

#Specify Model yaml & weights to grab
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_C4_1x.yaml"))
cfg.MODEL.WEIGHTS = "/home/svidelock/source/Detectron/model_final_721ade.pkl" # Let training initialize from model zoo 

#Specify DIR for output; if not specified, an "output" DIR is created
# cfg.OUTPUT_DIR = '/home/svidelock/source/Detectron/HyperParamDetectron/output2/'

#Specify Datasets
cfg.DATASETS.TRAIN = ("boat_train",) #names of the registered training datasets
cfg.DATASETS.TEST = ("boat_val",) #validation set

#Hyperparams
cfg.SOLVER.IMS_PER_BATCH = 2 #means that in 1 iteration the model sees 2 images 
cfg.SOLVER.BASE_LR = 0.02 #learning rate

#Some other configurable items
cfg.DATALOADER.NUM_WORKERS = 2 # depends on hardware config ...
# cfg.SOLVER.WARMUP_ITERS = 1000 #number of warmup iterations
# cfg.SOLVER.STEPS = (1000, 1500) #iterations at which the learning rate is decayed
# cfg.SOLVER.GAMMA = 0.001 #factor the learning rate is multiplied by at each step
cfg.SOLVER.MAX_ITER = 500 # Model will stop after this many iterations
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 #look into
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (boat)

#specify if CPU Training
cfg.MODEL.DEVICE='cpu'#cpu training

#Checkpoint/ValidationSet Params
cfg.TEST.EVAL_PERIOD = 20 # Tests validation set every 20 iterations
cfg.SOLVER.CHECKPOINT_PERIOD = cfg.TEST.EVAL_PERIOD #saves a checkpoint model each time we validate 

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

```
2. __full logs__ you observed:
```
[07/17 13:16:51 d2.engine.train_loop]: Starting training from iteration 0
ERROR [07/17 13:17:05 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 130, in train
    self.run_step()
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 227, in run_step
    with torch.cuda.stream(torch.cuda.Stream()):
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/streams.py", line 21, in __new__
    with torch.cuda.device(device):
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py", line 201, in __init__
    self.idx = _get_device_index(device, optional=True)
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/_utils.py", line 31, in _get_device_index
    return torch.cuda.current_device()
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py", line 330, in current_device
    _lazy_init()
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py", line 149, in _lazy_init
    _check_driver()
  File "/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py", line 47, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
[07/17 13:17:05 d2.engine.hooks]: Total training time: 0:00:13 (0:00:00 on hooks)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-7-7c9d1293789c> in <module>
      4 # trainer.register_hooks([early_stoping])
      5 trainer.resume_or_load(resume=False)
----> 6 trainer.train()

~/unineunet-test2/lib/python3.6/site-packages/detectron2/engine/defaults.py in train(self)
    396             OrderedDict of results, if evaluation is enabled. Otherwise None.
    397         """
--> 398         super().train(self.start_iter, self.max_iter)
    399         if len(self.cfg.TEST.EXPECTED_RESULTS) and comm.is_main_process():
    400             assert hasattr(

~/unineunet-test2/lib/python3.6/site-packages/detectron2/engine/train_loop.py in train(self, start_iter, max_iter)
    128                 for self.iter in range(start_iter, max_iter):
    129                     self.before_step()
--> 130                     self.run_step()
    131                     self.after_step()
    132             except Exception:

~/unineunet-test2/lib/python3.6/site-packages/detectron2/engine/train_loop.py in run_step(self)
    225 
    226         # use a new stream so the ops don't wait for DDP
--> 227         with torch.cuda.stream(torch.cuda.Stream()):
    228             metrics_dict = loss_dict
    229             metrics_dict["data_time"] = data_time

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/streams.py in __new__(cls, device, priority, **kwargs)
     19 
     20     def __new__(cls, device=None, priority=0, **kwargs):
---> 21         with torch.cuda.device(device):
     22             return super(Stream, cls).__new__(cls, priority=priority, **kwargs)
     23 

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py in __init__(self, device)
    199 
    200     def __init__(self, device):
--> 201         self.idx = _get_device_index(device, optional=True)
    202         self.prev_idx = -1
    203 

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/_utils.py in _get_device_index(device, optional)
     29         if optional:
     30             # default cuda device index
---> 31             return torch.cuda.current_device()
     32         else:
     33             raise ValueError('Expected a cuda device with a specified index '

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py in current_device()
    328 def current_device():
    329     r"""Returns the index of a currently selected device."""
--> 330     _lazy_init()
    331     return torch._C._cuda_getDevice()
    332 

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py in _lazy_init()
    147             raise RuntimeError(
    148                 "Cannot re-initialize CUDA in forked subprocess. " + msg)
--> 149         _check_driver()
    150         if _cudart is None:
    151             raise AssertionError(

~/unineunet-test2/lib/python3.6/site-packages/torch/cuda/__init__.py in _check_driver()
     45 def _check_driver():
     46     if not hasattr(torch._C, '_cuda_isDriverSufficient'):
---> 47         raise AssertionError("Torch not compiled with CUDA enabled")
     48     if not torch._C._cuda_isDriverSufficient():
     49         if torch._C._cuda_getDriverVersion() == 0:

AssertionError: Torch not compiled with CUDA enabled
```

## Expected behavior:

The model should run. I created a virtual environment in the same way about a week ago and had no issues, but when I create a new virtual environment (with all CPU installs, and specifying cpu in the config), I receive the above error.
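
The traceback points at `run_step` in `train_loop.py` (line 227), which opens a `torch.cuda.Stream()` regardless of `cfg.MODEL.DEVICE`. As an illustration only, here is a minimal sketch of how such a call could be guarded so that the CPU path never touches `torch.cuda`. The helper names `metrics_stream` and `_null_context` are hypothetical and not part of detectron2; this is an assumption about one possible guard, not the library's actual code.

```python
import contextlib


@contextlib.contextmanager
def _null_context():
    # contextlib.nullcontext needs Python 3.7+; the environment here is 3.6.
    yield


def metrics_stream(device_type):
    """Return a context manager for the metrics-logging step.

    torch.cuda is only used when the model actually runs on a CUDA device.
    """
    if device_type == "cuda":
        import torch  # imported lazily so the CPU path never needs CUDA
        return torch.cuda.stream(torch.cuda.Stream())
    return _null_context()


# CPU path: no torch.cuda call is made, so the AssertionError cannot occur here.
with metrics_stream("cpu"):
    metrics_dict = {"data_time": 0.0}
```

The device type is passed in as a plain string so that the decision depends on the configured device rather than on whether CUDA happens to be importable.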

## Environment:
```
sys.platform           linux
Python                 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0]
numpy                  1.19.0
detectron2             0.2 @/home/svidelock/unineunet-test2/lib/python3.6/site-packages/detectron2
Compiler               GCC 7.3
CUDA compiler          not available
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.5.1+cpu @/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torch
PyTorch debug build    False
GPU available          False
Pillow                 7.2.0
torchvision            0.6.1+cpu @/home/svidelock/unineunet-test2/lib/python3.6/site-packages/torchvision
fvcore                 0.1.1.post20200716
cv2                    4.3.0
---------------------  ----------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=0, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
```
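
The `Build settings` line above already reports `USE_CUDA=0`, which matches the CPU-only install. A small sketch for pulling such flags out of the dump programmatically, assuming the text keeps the `KEY=VALUE,` shape shown above; `parse_build_flags` is a hypothetical helper and `sample` is a trimmed excerpt of the environment output:

```python
import re


def parse_build_flags(config_text):
    """Extract KEY=VALUE pairs whose key is upper-case (e.g. USE_CUDA=0)."""
    return dict(re.findall(r"\b([A-Z][A-Z0-9_]*)=([^,\s]+)", config_text))


# Trimmed excerpt of the dump above; in a live environment the full text
# could come from torch.__config__.show() instead.
sample = (
    "Build settings: BLAS=MKL, BUILD_TYPE=Release, PERF_WITH_AVX=1, "
    "USE_CUDA=0, USE_MKL=ON, USE_NCCL=OFF,"
)

flags = parse_build_flags(sample)
print(flags["USE_CUDA"])  # prints 0
```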


AssertionError: Torch not compiled with CUDA enabled - DETECTRON CPU/LINUX TRAINING ERROR #41598
