<a href="https://colab.research.google.com/github/ML-HW-SYS/a3-WDaugherty/blob/main/2_auot_conv1d_gpu.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1D Convolution on GPU

## 1. Set-up 

In [1]:
# Mount google drive 
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [2]:
# Make sure your token is stored in a txt file at the location below.
# This way there is no risk that you will push it to your repo
# Never share your token with anyone, it is basically your github password!
with open('/content/gdrive/MyDrive/ece5545/token.txt') as f:
    token = f.readline().strip()
# Use another file to store your github username    
# with open('/content/gdrive/MyDrive/ece5545/git_username.txt') as f:
#     handle = f.readline().strip()

In [35]:
# Clone your github repo
YOUR_TOKEN = token
YOUR_HANDLE = 'WDaugherty'
BRANCH = "main"

%mkdir /content/gdrive/MyDrive/ece5545
%cd /content/gdrive/MyDrive/ece5545
!git clone https://{YOUR_TOKEN}@github.com/ML-HW-SYS/a3-{YOUR_HANDLE}.git
%cd /content/gdrive/MyDrive/ece5545/a3-{YOUR_HANDLE}
!git checkout {BRANCH}
!git pull
%cd /content/gdrive/MyDrive/ece5545

PROJECT_ROOT = f"/content/gdrive/MyDrive/ece5545/a3-{YOUR_HANDLE}"

mkdir: cannot create directory ‘/content/gdrive/MyDrive/ece5545’: File exists
/content/gdrive/MyDrive/ece5545
fatal: destination path 'a3-WDaugherty' already exists and is not an empty directory.
/content/gdrive/MyDrive/ece5545/a3-WDaugherty
Already on 'main'
Your branch is up to date with 'origin/main'.
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 4 (delta 2), reused 4 (delta 2), pack-reused 0
Unpacking objects: 100% (4/4), 544 bytes | 11.00 KiB/s, done.
From https://github.com/ML-HW-SYS/a3-WDaugherty
   1a024fc..4d2ce17  main       -> origin/main
Updating 1a024fc..4d2ce17
Fast-forward
 src/ops.py | 78 +++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 39 insertions(+), 39 deletions(-)
/content/gdrive/MyDrive/ece5545


In [4]:
# This extension reloads all imports before running each cell
%load_ext autoreload
%autoreload 2

In [5]:
!ls {PROJECT_ROOT}

1_auto_conv1d_cpu.ipynb  4_gemm_gpu.ipynb	  space-time-dwsp.ipynb
1_conv1d_cpu.ipynb	 5-conv2d_dw_gpu.ipynb	  space-time-GEMM.ipynb
2-conv1d_gpu.ipynb	 leaderboard_id.txt	  src
2_conv1d_gpu.ipynb	 README.md		  tests
3-conv1d_fpga.ipynb	 space-time-1D_CPU.ipynb
4-gemm_gpu.ipynb	 space-time-1D_GPU.ipynb


## 2. Install TVM

In [6]:
!pip install tlcpack-nightly-cu102 -f https://tlcpack.ai/wheels

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://tlcpack.ai/wheels
Collecting tlcpack-nightly-cu102
  Downloading https://github.com/tlc-pack/tlcpack/releases/download/v0.12.dev/tlcpack_nightly_cu102-0.13.dev47%2Bg608d35717-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (408.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 408.0/408.0 MB 3.4 MB/s eta 0:00:00
Installing collected packages: tlcpack-nightly-cu102
Successfully installed tlcpack-nightly-cu102-0.13.dev47+g608d35717


## 3. Implement the `make_conv1d_gpu_scheduler` function in `src.ops`

In that function, you are required to implement 1D convolution and use TVM to optimize it.
Let $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, then 
$$
\operatorname{conv1d}(x, y)_i = \sum_{j=-\infty}^{\infty} x[j]y[i-j], \forall i \in \{0, 1, \dots, m + n - 2\}
$$

Please use zero padding and unit stride, so that the output has $m + n - 1$ entries. This matches the default `full` mode of the numpy convolution function; see its documentation for more details: [link](https://numpy.org/doc/stable/reference/generated/numpy.convolve.html).
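
Because $x$ and $y$ are treated as zero outside their index ranges, the infinite sum reduces to a finite one. As a restatement of the definition above (assuming 0-based indexing, as in numpy), the explicit bounds are:
$$
\operatorname{conv1d}(x, y)_i = \sum_{j=\max(0,\, i-n+1)}^{\min(i,\, m-1)} x[j]\,y[i-j]
$$
The bounds follow from requiring $0 \le j \le m-1$ and $0 \le i-j \le n-1$ at the same time, so each output entry needs at most $\min(m, n)$ multiply-adds.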

The `make_conv1d_gpu_scheduler` function takes $m$ and $n$, which are the sizes of the two 1D input arrays. 
You should return both the TVM scheduler and the TVM operators for 
1. Input $x$
2. Input $y$
3. Output $out$

The scheduler should be usable to build a function with signature $func(x, y, out)$. 
Please see the following cells for usage.
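
For checking results on small inputs, the definition can also be evaluated directly in plain Python. The sketch below is only a reference for the expected values, not the TVM implementation; the name `conv1d_reference` is hypothetical and does not exist in `src.ops`. It skips the zero-padded terms by restricting $j$ to the range where both $x[j]$ and $y[i-j]$ are inside their arrays.

```python
def conv1d_reference(x, y):
    """Full 1D convolution with zero padding and unit stride (hypothetical helper)."""
    m, n = len(x), len(y)
    out = [0.0] * (m + n - 1)
    for i in range(m + n - 1):
        # Only indices j with 0 <= j < m and 0 <= i - j < n contribute;
        # every other term is multiplied by a padded zero.
        for j in range(max(0, i - n + 1), min(i, m - 1) + 1):
            out[i] += x[j] * y[i - j]
    return out


result = conv1d_reference([1.0, 2.0, 3.0], [0.0, 1.0, 0.5])
print(result)  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

The example input is the one used in the numpy `convolve` documentation, so the printed values can be compared against `np.convolve` in its default mode.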

In [36]:
import tvm
import numpy as np
import sys
import logging
from tvm import te
from tvm import autotvm
# Adding assignment 3 to the system path
# Make sure this matches your git directory
sys.path.insert(0, PROJECT_ROOT)
from src.ops import make_conv1d_gpu_scheduler

M = 16384
N = 32
dtype = 'float32'
task = autotvm.task.create("make_conv1d_gpu_scheduler", args=(M, N), target='llvm')

# Set the search space
n_trial = 1000
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=20, repeat=3, min_repeat_ms=100, timeout=4)
)

tuner = autotvm.tuner.RandomTuner(task)
tuner.tune(n_trial=n_trial,
           measure_option=measure_option,
           callbacks=[autotvm.callback.log_to_file("conv1d_gpu.log")])

# Load the best configuration found by AutoTVM
dispatch_context = autotvm.apply_history_best("conv1d_gpu.log")
best_config = dispatch_context.query(task.target, task.workload)

a_np = np.random.rand(M).astype(dtype)
w_np = np.random.rand(N).astype(dtype)
b_np = np.convolve(a_np, w_np)

# Build the function using the best configuration
with tvm.target.Target('llvm'):
    with autotvm.apply_history_best("conv1d_gpu.log"):
        s, [A, W, B] = make_conv1d_gpu_scheduler(M, N)
        func = tvm.build(s, [A, W, B], "llvm")

dev = tvm.cpu()
a = tvm.nd.array(a_np, dev)
w = tvm.nd.array(w_np, dev)
b = tvm.nd.array(np.zeros((M+N-1), dtype), dev)
func(a, w, b)
evaluator = func.time_evaluator(func.entry_name, dev, number=1, repeat=1)

print("Answer:", b_np)
print("Output:", b)
print("1D conv TVM runtime: %f ms" % (evaluator(a, w, b).mean * 1e3))

[autoreload of src.ops failed: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/IPython/extensions/autoreload.py", line 245, in check
    superreload(m, reload, self.old_objects)
  File "/usr/local/lib/python3.9/dist-packages/IPython/extensions/autoreload.py", line 394, in superreload
    module = reload(module)
  File "/usr/lib/python3.9/imp.py", line 314, in reload
    return importlib.reload(module)
  File "/usr/lib/python3.9/importlib/__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 613, in _exec
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/content/gdrive/MyDrive/ece5545/a3-WDaugherty/src/ops.py", line 212, in <module>
    def make_conv1d_gpu_scheduler(M, N):
  File "/usr/local/lib/python3.9/dist-packages/tvm/autotvm/task/task.py", line 442, in _decorate
    _register_cu

TypeError: ignored

In [None]:
print(tvm.lower(s, [A, W, B], simple_mode=True))

In [None]:
%cd {PROJECT_ROOT}
!python -m pytest tests/test_1dconv_gpu.py