<a href="https://colab.research.google.com/github/Tensor-Reloaded/AI-Learning-Hub/blob/main/resources/advanced_pytorch/UsingCppModules.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using C++ modules



In [12]:
# Benchmarks a C++ implementation of macro F1 (f1_macro_cpp) against a Python one (f1_macro_py) on CPU and CUDA tensors.

## Setup

In [2]:
!pip install timed-decorator

Collecting timed-decorator
  Downloading timed_decorator-1.6.1-py3-none-any.whl.metadata (18 kB)
Downloading timed_decorator-1.6.1-py3-none-any.whl (12 kB)
Installing collected packages: timed-decorator
Successfully installed timed-decorator-1.6.1


In [3]:
import shutil
import os
import subprocess

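# Fetch the optimized_f1_score package from the AI-Learning-Hub repository, but only once.
if not os.path.isdir("optimized_f1_score"):
    # A shallow clone is enough because only the current files are needed.
    subprocess.run(["git", "clone", "--depth", "1", "https://github.com/Tensor-Reloaded/AI-Learning-Hub"], check=True)
    shutil.copytree("AI-Learning-Hub/resources/advanced_pytorch/optimized_f1_score", "optimized_f1_score")
    # Remove the clone after copying the package out of it.
    shutil.rmtree("AI-Learning-Hub", ignore_errors=True)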


In [4]:
import torch
from timed_decorator.simple_timed import timed
from optimized_f1_score import f1_macro_cpp, f1_macro_py

Building Optimized F1 Score
Done building


In [5]:
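# Thin wrappers so that @timed reports the wall-clock time of a single f1_macro call.
# The result is discarded on purpose; only the elapsed time is of interest here.
@timed(use_seconds=True, show_args=True)
def test_f1_cpp(x: torch.Tensor, y: torch.Tensor, classes: int):
    f1_macro_cpp.f1_macro(x, y, classes)


@timed(use_seconds=True, show_args=True)
def test_f1_py(x: torch.Tensor, y: torch.Tensor, classes: int):
    f1_macro_py.f1_macro(x, y, classes)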
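
For reference, macro F1 is the unweighted mean of the per-class F1 scores, where each class score is `2*TP / (2*TP + FP + FN)`. The sketch below is a plain-Python illustration of that definition, not the code shipped in `optimized_f1_score`. The name `f1_macro_reference` is hypothetical, and scoring a class that never appears as 0 is an assumption; the bundled implementations may treat that case differently. Because the per-class formula is symmetric in FP and FN, swapping the two inputs does not change the result.

```python
from typing import Sequence


def f1_macro_reference(predicted: Sequence[int], target: Sequence[int], classes: int) -> float:
    """Macro F1 over integer class labels in [0, classes). Hypothetical helper."""
    tp = [0] * classes  # true positives per class
    fp = [0] * classes  # false positives per class
    fn = [0] * classes  # false negatives per class

    for p, t in zip(predicted, target):
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[t] += 1  # class t was the label but was missed

    total = 0.0
    for c in range(classes):
        denominator = 2 * tp[c] + fp[c] + fn[c]
        # Assumption: a class with no predictions and no labels contributes 0.
        if denominator > 0:
            total += 2 * tp[c] / denominator
    return total / classes


if __name__ == "__main__":
    print(f1_macro_reference([0, 1, 1], [0, 1, 0], classes=2))
```

A small reference like this is mainly useful for checking on tiny inputs that the C++ and Python versions agree with the textbook definition.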


In [6]:
num_classes = 100
size = 10000

In [7]:
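# Fix the seed so both implementations see the same inputs on every run.
torch.random.manual_seed(3)
# Two random tensors of class indices in [0, num_classes).
x = torch.randint(0, num_classes, (size,))
y = torch.randint(0, num_classes, (size,))
# Time one call of each implementation on the CPU.
test_f1_cpp(x, y, num_classes)
test_f1_py(x, y, num_classes)
# Repeat on the GPU (requires a CUDA-capable runtime).
x_cuda = x.cuda()
y_cuda = y.cuda()
test_f1_cpp(x_cuda, y_cuda, num_classes)
test_f1_py(x_cuda, y_cuda, num_classes)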


test_f1_cpp(CpuTensor[10000], CpuTensor[10000], 100) -> total time: 0.052254405s
test_f1_py(CpuTensor[10000], CpuTensor[10000], 100) -> total time: 0.025944236s
test_f1_cpp(CudaTensor[10000], CudaTensor[10000], 100) -> total time: 0.145290371s
test_f1_py(CudaTensor[10000], CudaTensor[10000], 100) -> total time: 0.109727034s
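
These single-call timings are several times larger than the `%%timeit` averages reported later in this notebook. That is consistent with one-time costs being included in a first call, although the notebook does not state the cause. In addition, CUDA kernels are launched asynchronously, so a timer that stops before the device has finished can under-report; whether `f1_macro` synchronizes internally is not shown here. A minimal sketch of a benchmark helper that separates warm-up from measurement and accepts an optional synchronization hook follows. The name `benchmark` and its parameters are hypothetical.

```python
import statistics
import time
from typing import Callable, Optional, Tuple


def benchmark(
    fn: Callable,
    *args,
    warmup: int = 3,
    repeats: int = 20,
    sync: Optional[Callable[[], None]] = None,
) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the call time in seconds.

    `sync` is an optional zero-argument callable invoked before the timer
    stops, for example a device synchronization function.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Warm-up calls are executed but not measured.
    for _ in range(warmup):
        fn(*args)
    if sync is not None:
        sync()  # make sure warm-up work has finished before timing starts

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # wait for asynchronous work to complete
        samples.append(time.perf_counter() - start)

    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread
```

With the tensors defined above, a CUDA measurement could be written as `benchmark(f1_macro_cpp.f1_macro, x_cuda, y_cuda, num_classes, sync=torch.cuda.synchronize)`; for CPU tensors the `sync` argument can be omitted.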


In [8]:
%%timeit
f1_macro_cpp.f1_macro(x, y, num_classes)

5.08 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [9]:
%%timeit
f1_macro_py.f1_macro(x, y, num_classes)

7.68 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [10]:
%%timeit
f1_macro_cpp.f1_macro(x_cuda, y_cuda, num_classes)

13.6 ms ± 1.03 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [11]:
%%timeit
f1_macro_py.f1_macro(x_cuda, y_cuda, num_classes)

18.6 ms ± 2.13 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
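
Taking the four `%%timeit` means above at face value, the relative speed of the two implementations can be worked out directly. The snippet below only does arithmetic on the numbers printed in this notebook; the dictionary layout and the `speedup` helper are illustrative. The reported standard deviations (up to about 2 ms) are large relative to the gaps, so the ratios should be read as rough.

```python
# Mean time per call in milliseconds, copied from the %%timeit outputs above.
timings_ms = {
    ("cpu", "cpp"): 5.08,
    ("cpu", "py"): 7.68,
    ("cuda", "cpp"): 13.6,
    ("cuda", "py"): 18.6,
}


def speedup(device: str) -> float:
    """Ratio of the Python mean time to the C++ mean time on one device."""
    return timings_ms[(device, "py")] / timings_ms[(device, "cpp")]


for device in ("cpu", "cuda"):
    print(f"{device}: C++ is about {speedup(device):.2f}x faster than Python")
# cpu: C++ is about 1.51x faster than Python
# cuda: C++ is about 1.37x faster than Python
```

In these measurements both CUDA runs are also slower than their CPU counterparts for 10,000 elements and 100 classes.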


Exercises:
1. TODO
