- FLOPs vs. MACs
    - FLOPs: floating-point operations; multiplications and additions are each counted.
    - MACs: multiply–accumulate operations; one multiplication plus one addition counts as one MAC.
        - FLOPs = 2 × MACs
    - Most modern hardware architectures use FMA (fused multiply-add) instructions for tensor operations. An FMA computes $a*x+b$ as a single operation, so roughly GMACs = 0.5 * GFLOPs.
- FLOPs vs. FLOPS
    - FLOPs is a count: the trailing "s" marks the plural.
    - FLOPS is a rate: the trailing "S" stands for "per second".
        - Floating point operations per second
- `thop`

In [2]:
# !pip install thop
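
To make the FLOPs / MACs / FLOPS distinction concrete, here is a minimal sketch that converts a MAC count into FLOPs and then into an idealized runtime. The helper names `macs_to_flops` and `ideal_seconds`, the 1 GMAC workload, and the 1 TFLOPS device are all hypothetical values chosen for illustration; real hardware rarely sustains its peak rate, so the result is only a lower bound.

```python
def macs_to_flops(macs: float) -> float:
    """One MAC is one multiply plus one add, i.e. 2 FLOPs."""
    return 2 * macs


def ideal_seconds(flops: float, flops_per_second: float) -> float:
    """Idealized runtime: amount of work (FLOPs) divided by rate (FLOPS)."""
    return flops / flops_per_second


macs = 1e9                    # illustrative workload: 1 GMAC
flops = macs_to_flops(macs)   # 2 GFLOPs of work
peak = 1e12                   # hypothetical device rate: 1 TFLOPS
print(flops, ideal_seconds(flops, peak))
```

Under these assumptions, 1 GMAC is 2 GFLOPs of work, which an ideal 1 TFLOPS device would finish in about 2 ms.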

### MLP & CNN

#### MLP

- For a fully connected layer with $N$ input nodes and $M$ output nodes, the FLOPs for a single input sample (ignoring the bias term) are:

    $$
    \text{FLOPs}=2\times N\times M
    $$
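
As a worked instance of the formula, the sketch below evaluates it for an illustrative 128-to-64 layer. `linear_flops` is a hypothetical helper (it is not part of `thop`); it ignores the bias term and scales linearly with the batch size.

```python
def linear_flops(n_in: int, n_out: int, batch_size: int = 1) -> int:
    """FLOPs of a fully connected layer: 2 * N * M per sample, bias ignored."""
    macs = n_in * n_out * batch_size  # one MAC per (input, output) pair per sample
    return 2 * macs                   # each MAC is one multiply and one add


print(linear_flops(128, 64))                 # 16384
print(linear_flops(128, 64, batch_size=32))  # 524288
```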

In [3]:
import torch
import torch.nn as nn
from thop import profile

# Define a simple MLP model
class SimpleMLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleMLP, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.fc2(x)
        return x

In [9]:
# Define the input parameters
input_size = 128  # input feature dimension
hidden_size = 64  # hidden layer size
output_size = 10  # output feature dimension
batch_size = 32   # batch size

# Create the MLP model and a random input tensor
model = SimpleMLP(input_size, hidden_size, output_size)
input_data = torch.randn(batch_size, input_size)

# Use thop to count MACs and parameters
MACs, params = profile(model, inputs=(input_data,))

# Show the results
print(f"MACs: {MACs}")
print(f"Parameters: {params}")

[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
MACs: 282624.0
Parameters: 8906.0


In [10]:
# MACs by hand: per-sample weight MACs of fc1 and fc2 (bias ignored), times the batch size
(128 * 64 + 64 * 10) * 32

282624

In [12]:
# Parameters by hand: weights + biases of fc1, then of fc2
(128 * 64 + 64) + (64 * 10 + 10)

8906
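
The two hand calculations above can be generalized to an MLP of any depth. The following is a minimal sketch: `mlp_macs_and_params` is a hypothetical helper, and it assumes every layer is a plain linear layer with a bias, with activations contributing no MACs (consistent with the numbers `thop` reported above).

```python
def mlp_macs_and_params(layer_sizes, batch_size=1):
    """Count MACs (bias ignored) and parameters (weights + biases) of an MLP.

    layer_sizes lists the widths in order, e.g. [input, hidden, output].
    """
    macs = 0
    params = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        macs += n_in * n_out * batch_size  # one MAC per weight per sample
        params += n_in * n_out + n_out     # weight matrix plus bias vector
    return macs, params


print(mlp_macs_and_params([128, 64, 10], batch_size=32))  # (282624, 8906)
```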

#### CNN

$$
\text{FLOPs}=2\times(\text{out}_h\times \text{out}_w)\times \text{in}_{\text{channels}}\times k_{\text{size}}^2\times \text{out}_{\text{channels}}
$$
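
The formula can be sketched in plain Python as follows. Both helper names are hypothetical, and the sketch assumes a square kernel, dilation 1, groups 1, and no bias term; the output size uses the standard relation `(size + 2 * padding - kernel_size) // stride + 1`.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def conv2d_macs(in_h, in_w, c_in, c_out, kernel_size, stride=1, padding=0):
    """MACs of a 2D convolution for one input sample, bias ignored."""
    out_h = conv2d_output_size(in_h, kernel_size, stride, padding)
    out_w = conv2d_output_size(in_w, kernel_size, stride, padding)
    # Each output element needs c_in * k^2 MACs, for each of c_out channels.
    return out_h * out_w * c_in * kernel_size ** 2 * c_out


macs = conv2d_macs(128, 128, c_in=3, c_out=32, kernel_size=3)
print(macs, 2 * macs)  # 13716864 27433728  (MACs, FLOPs)
```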

In [13]:
model = nn.Conv2d(3, 32, kernel_size=3)
inputs = torch.randn(1, 3, 128, 128)
MACs, params = profile(model, inputs=(inputs, ), verbose=False)

In [15]:
MACs, params

(13716864.0, 0)

In [18]:
# 3x3 kernel, stride 1, no padding: out width = out height = 128 - 2 = 126
((128 - 2)*(128 - 2)) * 3 * 3**2 * 32

13716864
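
Note that `thop` reported 0 parameters for this bare `nn.Conv2d` in the run above. A plausible but unverified explanation is that the profiled model is itself a single leaf module, so there are no child modules to aggregate parameters over. As a cross-check, the sketch below counts the parameters and the output shape directly with PyTorch, without `thop`.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 32, kernel_size=3)

# Weights: out_channels * in_channels * k * k; biases: out_channels
n_params = sum(p.numel() for p in conv.parameters())

# Run a dummy input to confirm the 126 x 126 output size used above
out_shape = tuple(conv(torch.randn(1, 3, 128, 128)).shape)

print(n_params)   # 896
print(out_shape)  # (1, 32, 126, 126)
```

The count agrees with the hand calculation `32 * 3 * 3 * 3 + 32 = 896`.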