In [73]:
from importlib.metadata import version

import torch

torch.manual_seed(123)
print("TORCH VERSION :", version("torch"))
device = (
    "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
)
print("GPU  : ", device)

TORCH VERSION : 2.2.1
GPU  :  cuda


In [74]:
import sys

In [75]:
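# Claim under test: integer multiplication is faster than floating-point multiplication
x_int = torch.randint(0, 128, (1000, 100, 100), dtype=torch.int16)
# Note: sys.getsizeof measures only the Python wrapper object, not the tensor's data buffer
print("shape : ", x_int.shape, "size : ", sys.getsizeof(x_int), "bytes")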

shape :  torch.Size([1000, 100, 100]) size :  88 bytes


In [76]:
x_float = torch.rand((1000, 100, 100), dtype=torch.float16)
print("shape : ", x_float.shape, "size : ", sys.getsizeof(x_float), "bytes")

shape :  torch.Size([1000, 100, 100]) size :  88 bytes
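
The 88 bytes reported above is the size of the Python wrapper object, not of the tensor data. A minimal sketch of measuring the data buffer instead is shown here; `tensor_nbytes` is a hypothetical helper name, not part of PyTorch, and the float tensor is created in float32 and then cast, which is an assumption made so the snippet also runs on builds that cannot sample float16 directly on CPU.

```python
import sys

import torch


def tensor_nbytes(t: torch.Tensor) -> int:
    """Bytes occupied by the tensor's elements (hypothetical helper, not a PyTorch API)."""
    return t.element_size() * t.nelement()


x_int = torch.randint(0, 128, (1000, 100, 100), dtype=torch.int16)
# Sample in float32, then cast to float16 (assumption: keeps the sketch portable).
x_float = torch.rand((1000, 100, 100)).to(torch.float16)

print("wrapper object :", sys.getsizeof(x_int), "bytes")
print("int16 data     :", tensor_nbytes(x_int), "bytes")
print("float16 data   :", tensor_nbytes(x_float), "bytes")
```

Both tensors hold 10,000,000 elements at 2 bytes each, so each data buffer is 20,000,000 bytes. An element-wise multiply therefore moves the same amount of memory for either dtype.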


In [77]:
# Comparison on CPU
%timeit x_int * x_int
%timeit x_float * x_float

160 µs ± 9.18 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
157 µs ± 5 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [78]:
x_int = x_int.to(device)
x_float = x_float.to(device)

In [81]:
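# Comparison on GPU
# Note: CUDA kernels launch asynchronously; without torch.cuda.synchronize(),
# these timings may partly reflect launch overhead rather than kernel run time.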
%timeit x_int * x_int
%timeit x_float * x_float

16.4 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
16.6 µs ± 331 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
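
Because CUDA kernels run asynchronously, a GPU timing is more trustworthy when the timer waits for the device to finish. One possible way to do that is `torch.utils.benchmark.Timer`, which synchronizes CUDA for you. The sketch below is an illustration under assumptions, not the method used for the numbers above: `median_seconds` is a hypothetical helper, and the device selection falls back to CPU and ignores MPS.

```python
import torch
from torch.utils import benchmark

# Assumption: only CUDA or CPU is considered in this sketch.
device = "cuda" if torch.cuda.is_available() else "cpu"


def median_seconds(t: torch.Tensor) -> float:
    """Median time of one element-wise `t * t` (hypothetical helper)."""
    timer = benchmark.Timer(stmt="t * t", globals={"t": t})
    # blocked_autorange repeats the statement until enough samples are collected.
    return timer.blocked_autorange(min_run_time=0.2).median


x_int = torch.randint(0, 128, (1000, 100, 100), dtype=torch.int16, device=device)
x_float = torch.rand((1000, 100, 100), device=device).to(torch.float16)

print("device  :", device)
print(f"int16   : {median_seconds(x_int) * 1e6:.1f} µs")
print(f"float16 : {median_seconds(x_float) * 1e6:.1f} µs")
```

If the synchronized timings differ noticeably from the `%timeit` figures, the earlier numbers were probably dominated by launch overhead.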


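The statement **"Integer multiplication is faster than floating-point operation"** often holds for scalar arithmetic on simple hardware, but it depends on the hardware architecture, the specific operation, and how well the compiler or library optimizes it. In the measurements above it does not hold: int16 and float16 element-wise multiplication take effectively the same time on both the CPU (160 µs vs 157 µs) and the GPU (16.4 µs vs 16.6 µs).

One likely reason is that modern CPUs and GPUs have dedicated floating-point units (FPUs) and vector instructions, so floating-point multiplication is rarely much slower than integer multiplication. In addition, both dtypes here use 2 bytes per element, so the multiply moves the same amount of data either way, and element-wise operations on large tensors are typically limited by memory bandwidth rather than by arithmetic.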

**Here are some reasons why integer multiplication might be faster than floating-point multiplication in certain scenarios:**

Hardware Optimization: An integer multiplier is a simpler circuit than a floating-point multiplier, which must also add the exponents, normalize the result, and round it. On hardware without a fast floating-point unit, such as some microcontrollers, this can make integer arithmetic much faster.

Data Representation: A floating-point value is split into sign, exponent, and mantissa fields, while an integer is a plain two's-complement bit pattern. At equal bit width the storage cost is identical (float16 and int16 are both 2 bytes). Memory is saved only when wider floats such as float32 are replaced with narrower integers such as int8.

Instruction-Level Parallelism: On some processors, integer instructions have lower latency or more execution units available than floating-point instructions, which can raise throughput. This varies by microarchitecture and is not guaranteed.

Compiler Optimization: Integer arithmetic is associative (wraparound aside), so compilers can reorder and combine integer operations freely. Floating-point arithmetic is not associative because of rounding, so compilers must keep the written evaluation order unless told otherwise (for example, with GCC's `-ffast-math`).
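
To make the data-representation point concrete, the sketch below splits a half-precision value into its bit fields and prints the bit pattern of a 16-bit integer, using only the standard library. The helper names `float16_fields` and `int16_bits` are hypothetical, chosen for illustration.

```python
import struct


def float16_fields(value: float) -> tuple:
    """Return (sign, exponent, mantissa) bit fields of an IEEE 754 half-precision value."""
    # Pack as float16 ("e"), then reinterpret the same 2 bytes as an unsigned 16-bit int.
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    sign = bits >> 15               # 1 bit
    exponent = (bits >> 10) & 0x1F  # 5 bits, stored with a bias of 15
    mantissa = bits & 0x3FF         # 10 bits of fraction
    return sign, exponent, mantissa


def int16_bits(value: int) -> str:
    """Return the 16-bit two's-complement pattern of a signed integer."""
    (bits,) = struct.unpack("<H", struct.pack("<h", value))
    return f"{bits:016b}"


print(float16_fields(1.5))   # (0, 15, 512)
print(float16_fields(-2.0))  # (1, 16, 0)
print(int16_bits(3))         # 0000000000000011
print(int16_bits(-1))        # 1111111111111111
```

Both formats occupy 16 bits. The difference is in how the hardware has to interpret them: the float needs its three fields handled separately, while the integer is used as-is.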



In [82]:
## TODO: Explore and understand quantization better.