<a href="https://colab.research.google.com/github/Tensor-Reloaded/AI-Learning-Hub/blob/main/resources/advanced_pytorch/UsingCppModules.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using C++ modules



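This notebook compares two implementations of the macro-averaged F1 score from the `optimized_f1_score` package: `f1_macro_cpp`, which is backed by a C++ module, and `f1_macro_py`, its Python counterpart. Both are timed on CPU and CUDA tensors.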

## Setup

In [5]:
!pip install timed-decorator



In [None]:
import os
import shutil
import subprocess

# Fetch the `optimized_f1_score` package from the repository unless it is already present.
if not os.path.isdir("optimized_f1_score"):
    subprocess.run(["git", "clone", "https://github.com/Tensor-Reloaded/AI-Learning-Hub"], check=True)
    shutil.copytree("AI-Learning-Hub/resources/advanced_pytorch/optimized_f1_score", "optimized_f1_score")
    # Only the package directory is needed; remove the rest of the clone.
    shutil.rmtree("AI-Learning-Hub", ignore_errors=True)


In [7]:
import torch
from timed_decorator.simple_timed import timed
from optimized_f1_score import f1_macro_cpp, f1_macro_py
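
The two imported `f1_macro` functions are not shown in this notebook. As a rough reference for what a macro-averaged F1 score computes, here is a minimal sketch in plain PyTorch. The name `f1_macro_reference` and its argument order are assumptions made for illustration; this is not the repository's implementation, and it adopts the convention that a class absent from both tensors contributes an F1 of 0.

```python
import torch


def f1_macro_reference(pred: torch.Tensor, target: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Hypothetical reference: unweighted mean of the per-class F1 scores."""
    # True positives per class: samples where prediction and target agree.
    tp = torch.bincount(target[pred == target], minlength=num_classes).float()
    # False positives / negatives follow from the per-class totals.
    fp = torch.bincount(pred, minlength=num_classes).float() - tp
    fn = torch.bincount(target, minlength=num_classes).float() - tp
    # F1 = 2*TP / (2*TP + FP + FN); classes with a zero denominator get 0.
    denom = 2 * tp + fp + fn
    f1 = torch.where(denom > 0, 2 * tp / denom.clamp(min=1), torch.zeros_like(denom))
    return f1.mean()


pred = torch.tensor([0, 1, 1, 2])
target = torch.tensor([0, 1, 2, 2])
print(f1_macro_reference(pred, target, 3))
```

In the small example, the per-class F1 scores are 1, 2/3 and 2/3, so the macro average is 7/9.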

In [8]:
# Timing wrappers: the return value is discarded, only the elapsed time is printed.
@timed(use_seconds=True, show_args=True)
def test_f1_cpp(x: torch.Tensor, y: torch.Tensor, classes: int):
    f1_macro_cpp.f1_macro(x, y, classes)


@timed(use_seconds=True, show_args=True)
def test_f1_py(x: torch.Tensor, y: torch.Tensor, classes: int):
    f1_macro_py.f1_macro(x, y, classes)


In [13]:
num_classes = 100
size = 10000

In [14]:
torch.random.manual_seed(3)
# Two random label tensors with values in [0, num_classes).
x = torch.randint(0, num_classes, (size,))
y = torch.randint(0, num_classes, (size,))
test_f1_cpp(x, y, num_classes)
test_f1_py(x, y, num_classes)
# Repeat on the GPU. This part requires a CUDA-enabled runtime.
x_cuda = x.cuda()
y_cuda = y.cuda()
test_f1_cpp(x_cuda, y_cuda, num_classes)
test_f1_py(x_cuda, y_cuda, num_classes)


test_f1_cpp(CpuTensor[10000], CpuTensor[10000], 100) -> total time: 0.010340000s
test_f1_py(CpuTensor[10000], CpuTensor[10000], 100) -> total time: 0.014629100s
test_f1_cpp(CudaTensor[10000], CudaTensor[10000], 100) -> total time: 0.060475500s
test_f1_py(CudaTensor[10000], CudaTensor[10000], 100) -> total time: 0.060972100s
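
CUDA kernels are launched asynchronously, so a wall-clock timer may stop before the GPU has finished unless the measurement synchronizes first. Whether `timed` does this is not shown here. The sketch below is one way to time a call with warm-up runs and explicit synchronization. `time_call` and its parameters are hypothetical names, not part of `timed_decorator` or of the repository.

```python
import time

import torch


def time_call(fn, *args, warmup: int = 3, repeats: int = 10) -> float:
    """Hypothetical helper: mean seconds per call of fn(*args)."""
    # Synchronize only when at least one tensor argument lives on the GPU.
    use_cuda = any(isinstance(a, torch.Tensor) and a.is_cuda for a in args)
    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn(*args)
    if use_cuda:
        torch.cuda.synchronize()  # wait for pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    if use_cuda:
        torch.cuda.synchronize()  # wait for the timed GPU work to finish
    return (time.perf_counter() - start) / repeats


# Falls back to the CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randint(0, 100, (10_000,), device=device)
b = torch.randint(0, 100, (10_000,), device=device)
print(f"{time_call(torch.eq, a, b):.6f}s per call on {device}")
```

In this notebook, the same helper could be applied as `time_call(f1_macro_cpp.f1_macro, x_cuda, y_cuda, num_classes)`.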


In [15]:
%%timeit
f1_macro_cpp.f1_macro(x, y, num_classes)

8.23 ms ± 443 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [16]:
%%timeit
f1_macro_py.f1_macro(x, y, num_classes)

12 ms ± 2.15 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [17]:
%%timeit
f1_macro_cpp.f1_macro(x_cuda, y_cuda, num_classes)

64.2 ms ± 7.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [18]:
%%timeit
f1_macro_py.f1_macro(x_cuda, y_cuda, num_classes)

58.6 ms ± 2.61 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
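
The timings are only comparable if the two implementations return the same values. The sketch below checks that on a few random inputs. `check_agreement` is a hypothetical helper, and the two stand-in functions only illustrate the calling convention `fn(x, y, num_classes)` assumed from the cells above.

```python
import torch


def check_agreement(fn_a, fn_b, num_classes=100, size=10_000, trials=5, seed=3, atol=1e-6):
    """Hypothetical helper: True if fn_a and fn_b agree on `trials` random inputs."""
    gen = torch.Generator().manual_seed(seed)
    for _ in range(trials):
        x = torch.randint(0, num_classes, (size,), generator=gen)
        y = torch.randint(0, num_classes, (size,), generator=gen)
        # Accept either tensors or Python floats as return values.
        a = torch.as_tensor(fn_a(x, y, num_classes), dtype=torch.float64)
        b = torch.as_tensor(fn_b(x, y, num_classes), dtype=torch.float64)
        if not torch.allclose(a, b, atol=atol):
            return False
    return True


# Stand-ins: two ways of computing plain accuracy (not F1).
def accuracy_mean(x, y, num_classes):
    return (x == y).float().mean()


def accuracy_ratio(x, y, num_classes):
    return (x == y).sum().item() / x.numel()


print(check_agreement(accuracy_mean, accuracy_ratio))
```

With the notebook's modules, the call would be `check_agreement(f1_macro_cpp.f1_macro, f1_macro_py.f1_macro)`.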


Exercises:
1. TODO

---

| All     | [advanced_pytorch/](https://github.com/Tensor-Reloaded/AI-Learning-Hub/blob/main/resources/advanced_pytorch) |
|---------|-- |
| Current | [Using Cpp Modules](https://github.com/Tensor-Reloaded/AI-Learning-Hub/blob/main/resources/advanced_pytorch/UsingCppModules.ipynb) |