## Installation

TVM must be installed from source; there is currently no easier way to install it. The installation path is documented on the ["Install from source"](https://tvm.apache.org/docs/install/from_source.html#python-package-installation) page in the TVM documentation.

To start, launch a Spell workspace with the following `conda-file` configuration:

```yaml
name: spell
channels:
  - conda-forge
dependencies:
  - numpy
  - pandas
  - xgboost
  - tornado
  - pip:
     - torch
     - cloudpickle
     - psutil
```

You can then use `scripts/install_tvm.sh` to install TVM in a Spell environment:

```bash
$ chmod +x /spell/scripts/install_tvm.sh
$ sudo /spell/scripts/install_tvm.sh
```

This script installs LLVM 6.0 (TVM requires LLVM>=4.0), installs TVM's other prerequisites, and builds a version of TVM with CUDA and Relay graph debugging enabled.
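Once the script finishes, it can be worth confirming which options the build actually picked up. The sketch below assumes the installed TVM exposes `tvm.support.libinfo()`; treat that as an assumption for the build used here. `tvm_build_flags` is a hypothetical helper name, and the `USE_CUDA` and `USE_LLVM` keys are assumed to mirror TVM's CMake options. The helper returns `None` when TVM is not importable.

```python
import importlib.util


def tvm_build_flags():
    """Return TVM's compile-time options as a dict, or None if TVM is absent."""
    if importlib.util.find_spec("tvm") is None:
        return None
    import tvm.support  # assumption: this TVM build ships tvm.support.libinfo
    return dict(tvm.support.libinfo())


flags = tvm_build_flags()
if flags is None:
    print("TVM is not importable in this environment")
else:
    # Key names are assumptions that follow TVM's CMake option names.
    for key in ("USE_CUDA", "USE_LLVM"):
        print(key, "=", flags.get(key, "<not reported>"))
```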

## WIP

https://tvm.apache.org/docs/tutorials/index.html, https://tvm.apache.org/docs/tutorials/get_started/relay_quick_start.html#sphx-glr-tutorials-get-started-relay-quick-start-py

In [3]:
!chmod +x /spell/scripts/install_tvm.sh

In [4]:
!sudo /spell/scripts/install_tvm.sh

+ [[ ! -d /tmp/tvm ]]
+ git clone --recursive https://github.com/apache/tvm /tmp/tvm
Cloning into '/tmp/tvm'...
remote: Enumerating objects: 21, done.
remote: Counting objects: 100% (21/21), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 95501 (delta 6), reused 4 (delta 4), pack-reused 95480
Receiving objects: 100% (95501/95501), 36.67 MiB | 33.38 MiB/s, done.
Resolving deltas: 100% (69908/69908), done.
Submodule 'dlpack' (https://github.com/dmlc/dlpack) registered for path '3rdparty/dlpack'
Submodule 'dmlc-core' (https://github.com/dmlc/dmlc-core) registered for path '3rdparty/dmlc-core'
Submodule '3rdparty/rang' (https://github.com/agauniyal/rang) registered for path '3rdparty/rang'
Submodule '3rdparty/vta-hw' (https://github.com/apache/incubator-tvm-vta) registered for path '3rdparty/vta-hw'
Cloning into '/tmp/tvm/3rdparty/dlpack'...
remote: Enumerating objects: 22, done.        
remote: Counting objects: 100% (22/22), done.        
remote: Compress

In [1]:
import numpy as np

from tvm import relay
from tvm.relay import testing
import tvm
from tvm import te
from tvm.contrib import graph_runtime

In [2]:
batch_size = 1
num_class = 1000
image_shape = (3, 224, 224)
data_shape = (batch_size,) + image_shape
out_shape = (batch_size, num_class)

mod, params = relay.testing.resnet.get_workload(
    num_layers=18, batch_size=batch_size, image_shape=image_shape
)

# print(mod.astext(show_meta_data=False))

In [3]:
opt_level = 3
target = tvm.target.cuda()
with tvm.transform.PassContext(opt_level=opt_level):
    lib = relay.build(mod, target, params=params)

...100%, 0.47 MB, 5147 KB/s, 0 seconds passed


Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32, workload=('dense_small_batch.cuda', ('TENSOR', (1, 512), 'float32'), ('TENSOR', (1000, 512), 'float32'), None, 'float32'). A fallback configuration is used, which may bring great performance regression.


ValueError: Traceback (most recent call last):
  [bt] (8) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x65) [0x7fec2395b985]
  [bt] (7) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x3a0) [0x7fec237a90b0]
  [bt] (6) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::IRModule, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::NDArray, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, tvm::runtime::NDArray> > > const&)+0x1d0e) [0x7fec237a7fae]
  [bt] (5) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::build(tvm::Map<tvm::runtime::String, tvm::IRModule, void, void> const&, tvm::Target const&)+0xdf) [0x7fec2325002f]
  [bt] (4) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::build(tvm::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)+0x584) [0x7fec2324f704]
  [bt] (3) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::codegen::Build(tvm::IRModule, tvm::Target)+0x62f) [0x7fec232eb6df]
  [bt] (2) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::runtime::Module (*)(tvm::IRModule, tvm::Target)>(tvm::runtime::Module (*)(tvm::IRModule, tvm::Target))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x677) [0x7fec232f2597]
  [bt] (1) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::codegen::BuildCUDA(tvm::IRModule, tvm::Target)+0x2be) [0x7fec238e09be]
  [bt] (0) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(+0x121040b) [0x7fec2395840b]
  File "/opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
    rv = local_pyfunc(*pyargs)
  File "/opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/autotvm/measure/measure_methods.py", line 722, in tvm_callback_cuda_compile
    ptx = nvcc.compile_cuda(code, target=target, arch=AutotvmGlobalScope.current.cuda_target_arch)
  File "/opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/contrib/nvcc.py", line 71, in compile_cuda
    raise ValueError("arch(sm_xy) is not passed, and we cannot detect it from env")
ValueError: arch(sm_xy) is not passed, and we cannot detect it from env
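The `sm_xy` in this message is an NVIDIA architecture name, conventionally formed from the GPU's compute capability (a Tesla T4 has compute capability 7.5, hence `sm_75`). The helper below is an illustrative sketch of that naming convention only; `arch_from_compute_version` is a hypothetical name and not part of TVM's API.

```python
def arch_from_compute_version(compute_version):
    """Turn a compute capability such as "7.5" into an arch name such as "sm_75"."""
    major, _, minor = compute_version.strip().partition(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"unexpected compute capability: {compute_version!r}")
    return f"sm_{major}{minor}"


print(arch_from_compute_version("7.5"))  # sm_75
```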

Googling this error message brought me to [this TVM discuss thread](https://discuss.tvm.apache.org/t/solved-compile-error-related-to-autotvm/804), which states that the likely root cause is that the install is borked. Running the suggested code:

In [5]:
import tvm
print(tvm.gpu(0).exist)
print(tvm.gpu(0).compute_version)

False


TVMError: Traceback (most recent call last):
  [bt] (3) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x65) [0x7fec2395b985]
  [bt] (2) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(+0x1211fa9) [0x7fec23959fa9]
  [bt] (1) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(tvm::runtime::CUDADeviceAPI::GetAttr(DLContext, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue*)+0x9fd) [0x7fec23a03c2d]
  [bt] (0) /opt/conda/envs/spell/lib/python3.9/site-packages/tvm-0.8.dev392+gb8ac8d94d-py3.9-linux-x86_64.egg/tvm/libtvm.so(+0x12bada2) [0x7fec23a02da2]
  File "/tmp/tvm/src/runtime/cuda/cuda_device_api.cc", line 62
TVMError: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: CUDA driver version is insufficient for CUDA runtime version
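This error compares two numbers: the CUDA version the driver library reports and the version of the CUDA runtime the program was linked against. As a rough way to inspect both, the sketch below calls `cudaDriverGetVersion` and `cudaRuntimeGetVersion` from the CUDA runtime through `ctypes`. The library name `libcudart.so` and both helper names are assumptions; `query_cuda_versions` returns `None` if the library cannot be loaded or a call fails.

```python
import ctypes


def decode_cuda_version(value):
    """CUDA encodes versions as 1000 * major + 10 * minor (10000 -> 10.0)."""
    return value // 1000, (value % 1000) // 10


def query_cuda_versions(libname="libcudart.so"):
    """Return ((driver major, minor), (runtime major, minor)), or None on failure."""
    try:
        cudart = ctypes.CDLL(libname)  # assumption: library is on the loader path
    except OSError:
        return None
    driver, runtime = ctypes.c_int(0), ctypes.c_int(0)
    # Both calls return a cudaError_t status; 0 means success.
    if cudart.cudaDriverGetVersion(ctypes.byref(driver)) != 0:
        return None
    if cudart.cudaRuntimeGetVersion(ctypes.byref(runtime)) != 0:
        return None
    return decode_cuda_version(driver.value), decode_cuda_version(runtime.value)


print(decode_cuda_version(10000))  # (10, 0)
print(query_cuda_versions())
```

A driver version lower than the runtime version, or a driver version of 0.0 (which the runtime reports when it finds no driver), would be consistent with the error above.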

In [10]:
!which nvcc

/usr/local/cuda/bin/nvcc


In [15]:
!echo $PATH

/opt/conda/envs/spell/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin


In [13]:
%ls /usr/local/cuda/

bin/     doc/     include@  LICENSE  nvvm/   share/  targets/
compat/  extras/  lib64@    nvml/    README  src/    version.txt


In [8]:
!cat /usr/local/cuda/version.txt

CUDA Version 10.0.130


In [9]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130


In [6]:
!nvidia-smi

Mon Dec 28 22:55:55 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   20C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

SO says to first check that the environment CUDA is not borked. To test this, I'll run a self-contained demo script in PyTorch ([this one](https://github.com/spellml/cnn-cifar10/blob/master/models/train_basic.py)).

In [6]:
import torchvision
from torch.utils.data import DataLoader
import torch
from torch import nn
from torch import optim
import numpy as np
from spell.metrics import send_metric

import os
if not os.path.exists("/spell/checkpoints/"):
    os.mkdir("/spell/checkpoints/")

transform_train = torchvision.transforms.Compose([
    torchvision.transforms.RandomHorizontalFlip(),
    # torchvision.transforms.Lambda(lambda x: torch.tensor(np.array(x).reshape((3, 32, 32)) / 255, dtype=torch.float)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])
train_dataset = torchvision.datasets.CIFAR10("/mnt/cifar10/", train=True, transform=transform_train, download=True)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=False)

class CIFAR10Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn_block_1 = nn.Sequential(*[
            nn.Conv2d(3, 32, 3),
            nn.ReLU(),
            nn.Conv2d(32, 32, 3),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(0.25)
        ])
        self.cnn_block_2 = nn.Sequential(*[
            nn.Conv2d(32, 32, 3),
            nn.ReLU(),
            nn.Conv2d(32, 32, 3),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(0.25)
        ])
        self.flatten = lambda inp: torch.flatten(inp, 1)
        self.head = nn.Sequential(*[
            nn.Linear(800, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, 10)
        ])
    
    def forward(self, X):
        X = self.cnn_block_1(X)
        X = self.cnn_block_2(X)
        X = self.flatten(X)
        X = self.head(X)
        return X

clf = CIFAR10Model()
clf.cuda()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(clf.parameters())

def train():
    NUM_EPOCHS = 10
    for epoch in range(1, NUM_EPOCHS + 1):
        losses = []

        for i, (X_batch, y_cls) in enumerate(train_dataloader):
            optimizer.zero_grad()

            y = y_cls.cuda()
            X_batch = X_batch.cuda()

            y_pred = clf(X_batch)
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()

            curr_loss = loss.item()
            if i % 200 == 0:
                print(
                    f'Finished epoch {epoch}/{NUM_EPOCHS}, batch {i}. Loss: {curr_loss:.3f}.'
                )
                send_metric("loss", curr_loss)

            losses.append(curr_loss)

        print(
            f'Finished epoch {epoch}. '
            f'avg loss: {np.mean(losses)}; median loss: {np.median(losses)}'
        )
        
        torch.save(clf.state_dict(), f"/spell/checkpoints/epoch_{epoch}.pth")
    torch.save(clf.state_dict(), f"/spell/checkpoints/model_final.pth")

# if __name__ == "__main__":
train()

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to /mnt/cifar10/cifar-10-python.tar.gz


98.9%

Extracting /mnt/cifar10/cifar-10-python.tar.gz to /mnt/cifar10/
Finished epoch 1/10, batch 0. Loss: 2.301.
Finished epoch 1/10, batch 200. Loss: 1.531.
Finished epoch 1/10, batch 400. Loss: 1.420.
Finished epoch 1/10, batch 600. Loss: 1.516.
Finished epoch 1/10, batch 800. Loss: 1.298.
Finished epoch 1/10, batch 1000. Loss: 1.360.
Finished epoch 1/10, batch 1200. Loss: 1.267.
Finished epoch 1/10, batch 1400. Loss: 1.289.
Finished epoch 1. avg loss: 1.5152859805641614; median loss: 1.4874082803726196
Finished epoch 2/10, batch 0. Loss: 1.237.
Finished epoch 2/10, batch 200. Loss: 1.473.
Finished epoch 2/10, batch 400. Loss: 0.979.
Finished epoch 2/10, batch 600. Loss: 1.253.
Finished epoch 2/10, batch 800. Loss: 1.190.


KeyboardInterrupt: 
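A much shorter check than a full training run is to ask PyTorch directly whether it sees a GPU and which CUDA version it was built against. One caveat worth keeping in mind: PyTorch wheels installed through `pip` generally bundle their own CUDA runtime libraries, so a working PyTorch does not by itself prove that the system toolkit under `/usr/local/cuda` is healthy. The helper name `torch_cuda_summary` is hypothetical.

```python
import importlib.util


def torch_cuda_summary():
    """Summarize PyTorch's view of CUDA, or return None if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    summary = {
        "available": torch.cuda.is_available(),
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
    }
    if summary["available"]:
        summary["device"] = torch.cuda.get_device_name(0)
        summary["capability"] = torch.cuda.get_device_capability(0)
    return summary


print(torch_cuda_summary())
```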

In [3]:
# %pip install torchvision
# %pip install spell

In [7]:
!echo $PATH

/opt/conda/envs/spell/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin


In [43]:
import os  # collect the name of every file in every directory on PATH
executables = [ex for path in os.environ['PATH'].split(":") if os.path.isdir(path) for ex in os.listdir(path)]

In [14]:
[ex for ex in executables if 'nvcc' in ex]

['nvcc', 'nvcc.profile']

In [44]:
!find / -path **/nvcc -type f

/usr/local/cuda-10.0/bin/nvcc


In [58]:
!which nvcc

/usr/local/cuda/bin/nvcc


In [59]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130


In [22]:
!find / -path **/libcuda.so -type f

/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcuda.so


In [52]:
!cp /usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcuda.so /usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so

cp: '/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcuda.so' and '/usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so' are the same file


In [56]:
%ls /usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcuda.so

/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcuda.so


In [16]:
%ls /usr/local/cuda/bin

bin2c*     cuda-gdbserver*  gpu-library-advisor*  nvlink*
crt/       cuda-memcheck*   nvcc*                 nvprof*
cudafe++*  cuobjdump*       nvcc.profile          nvprune*
cuda-gdb*  fatbinary*       nvdisasm*             ptxas*


We have two `*cuda*` folders?

In [23]:
%ls /usr/local/

bin/   cuda-10.0/  games/    lib/  sbin/   src/
cuda@  etc/        include/  man@  share/


In [25]:
%ls /usr/local/cuda/

bin/     doc/     include@  LICENSE  nvvm/   share/  targets/
compat/  extras/  lib64@    nvml/    README  src/    version.txt


In [27]:
%ls /usr/local/cuda-10.0/

bin/     doc/     include@  LICENSE  nvvm/   share/  targets/
compat/  extras/  lib64@    nvml/    README  src/    version.txt


In [60]:
!nvidia-smi

Tue Dec 29 19:19:08 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   33C    P0    26W /  70W |   1060MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

At first glance the machine appears to have CUDA installed twice, once at `/usr/local/cuda/` and once at `/usr/local/cuda-10.0`. However, `ls` marks `cuda` with a trailing `@`, and `cp` reported the two `libcuda.so` paths as the same file, which suggests that `/usr/local/cuda` is a symlink to `/usr/local/cuda-10.0` rather than a second installation. Either way, the `nvcc` in `/usr/local/cuda/` is the one on `PATH`, and no other `nvcc` is present on `PATH`, which leads me to believe that this is the CUDA installation that TVM is finding and linking to.
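Whether these are two installations or one directory plus a symlink can be checked directly by resolving each path. The sketch below lists every `cuda*` entry under a root directory together with the real path it resolves to; `describe_cuda_dirs` is a hypothetical helper, and the default root `/usr/local` is taken from the listings above.

```python
import os


def describe_cuda_dirs(root="/usr/local"):
    """Map each cuda* entry under root to the real path it resolves to."""
    if not os.path.isdir(root):
        return {}
    resolved = {}
    for name in sorted(os.listdir(root)):
        if name.startswith("cuda"):
            path = os.path.join(root, name)
            resolved[path] = os.path.realpath(path)
    return resolved


for path, real in describe_cuda_dirs().items():
    kind = "symlink" if os.path.islink(path) else "directory"
    print(f"{path} ({kind}) -> {real}")
```

If both entries resolve to the same real path, there is only one toolkit on disk.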

In [2]:
!ldd /tmp/tvm/build/libtvm.so

	linux-vdso.so.1 (0x00007fffe32d6000)
	libnvrtc.so.10.0 => /usr/local/cuda/lib64/libnvrtc.so.10.0 (0x00007ff75f21d000)
	libLLVM-6.0.so.1 => /usr/lib/llvm-6.0/lib/libLLVM-6.0.so.1 (0x00007ff75b781000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff75b57d000)
	libcudart.so.10.0 => /usr/local/cuda/lib64/libcudart.so.10.0 (0x00007ff75b303000)
	libcuda.so.1 => /usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so.1 (0x00007ff75b0f7000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff75aed8000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007ff75ab4f000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff75a7b1000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ff75a599000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff75a1a8000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff7620f1000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff759fa0000)
	libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007
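One detail in this output may matter: `libcuda.so.1` resolves to a file under a `stubs/` directory. The CUDA toolkit ships stub libraries there so that programs can be linked on machines without a driver; they are not the real driver library. Whether that explains the "driver version is insufficient" error here is a guess, not something confirmed in this notebook. The sketch below parses `ldd` text and lists any library resolved from a `stubs/` path; `find_stub_libs` is a hypothetical helper, and the sample lines are copied from the output above.

```python
def find_stub_libs(ldd_output):
    """Return (library name, resolved path) pairs whose path contains /stubs/."""
    hits = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # skip entries without a resolved path, e.g. linux-vdso
        name, _, rest = line.partition("=>")
        parts = rest.split()
        if parts and "/stubs/" in parts[0]:
            hits.append((name.strip(), parts[0]))
    return hits


sample = """
	libcudart.so.10.0 => /usr/local/cuda/lib64/libcudart.so.10.0 (0x00007ff75b303000)
	libcuda.so.1 => /usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so.1 (0x00007ff75b0f7000)
"""
for name, path in find_stub_libs(sample):
    print(f"{name} is resolved from a stub: {path}")
```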

In [5]:
!cat /usr/local/cuda/version.txt

CUDA Version 10.0.130


In [6]:
%%bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update && sudo apt-get -y install cuda

Executing: /tmp/apt-key-gpghome.oBLuuuxAiN/gpg.1.sh --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
Hit:1 https://deb.nodesource.com/node_12.x bionic InRelease
Ign:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
Ign:3 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Hit:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release
Hit:5 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release
Hit:8 http://security.ubuntu.com/ubuntu bionic-security InRelease
Hit:9 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic InRelease
Hit:10 http://archive.ubuntu.com/ubuntu bionic InRelease
Hit:11 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:12 http://archive.ubuntu.com/ubuntu bionic-backports InRelease
Reading package lists...
Reading package lists...
Building dependency tree

--2020-12-29 20:45:50--  https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 190 [application/octet-stream]
Saving to: ‘cuda-ubuntu1804.pin’

     0K                                                       100% 5.16M=0s

2020-12-29 20:45:50 (5.16 MB/s) - ‘cuda-ubuntu1804.pin’ saved [190/190]

gpg: requesting key from 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub'
gpg: key F60F4B3D7FA2AF80: "cudatools <cudatools@nvidia.com>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 12, in <module>
    from softwareproperties.SoftwareProperties import SoftwarePro

CalledProcessError: Command 'b'wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin\nsudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600\nsudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub\nsudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"\nsudo apt-get update && sudo apt-get -y install cuda\n'' returned non-zero exit status 100.