# 𝚝𝚘𝚛𝚌𝚑𝚖𝚎𝚝𝚎𝚛 🚀

<details>
<summary>👋 𝙸𝚗𝚝𝚛𝚘𝚍𝚞𝚌𝚝𝚒𝚘𝚗</summary>

An `all-in-one` tool for `PyTorch` model analysis, providing end-to-end measurement capabilities, including:
- parameter statistics
- computational cost analysis
- memory usage tracking
- inference time
- throughput analysis

</details>

<details>
<summary>✨ 𝙲𝚘𝚛𝚎 𝙵𝚞𝚗𝚌𝚝𝚒𝚘𝚗𝚊𝚕𝚒𝚝𝚢</summary>

1. **Parameter Analysis**
    - Total/trainable parameter quantification
    - Layer-wise parameter distribution analysis
    - Gradient state tracking (requires_grad flags)

2. **Computational Profiling**
    - Precise FLOPs/MACs calculation
    - Operation-wise calculation distribution analysis
    - Dynamic input/output detection (number, type, shape, ...)

3. **Memory Diagnostics**
    - Input/output tensor memory awareness
    - Hierarchical memory consumption analysis

4. **Performance Benchmarking**
    - Auto warm-up phase execution (eliminates cold-start bias)
    - Device-specific high-precision timing
    - Inference latency & throughput benchmarking

5. **Visualization Engine**
    - Centralized configuration management
    - Programmable tabular report
        1. Style customization and real-time rendering
        2. Dynamic table structure adjustment
        3. Real-time data analysis in a programmable way
        4. Multi-format data export
    - Rich-text hierarchical structure tree rendering
        1. Style customization and real-time rendering
        2. Smart module folding based on structural equivalence detection

6. **Cross-Platform Support**
    - Automatic model-data co-location
    - Seamless device transition (CPU/CUDA)

</details>

---

- 📜 𝐋𝐢𝐜𝐞𝐧𝐬𝐞: [AGPL-3.0](https://github.com/TorchMeter/torchmeter/blob/master/LICENSE)
- 👨‍🎨 𝐀𝐮𝐭𝐡𝐨𝐫: [Ahzyuan](https://github.com/Ahzyuan)
- 🎯 𝐑𝐞𝐩𝐨 𝐇𝐨𝐦𝐞: https://github.com/TorchMeter/torchmeter
- 📦 𝐏𝐲𝐏𝐈: https://pypi.org/project/torchmeter/

---
<font size=3>

1. Feel free to report bugs and suggestions!
   - [𝖨𝗌𝗌𝗎𝖾𝗌](https://github.com/TorchMeter/torchmeter/issues)
   - [𝖣𝗂𝗌𝖼𝗎𝗌𝗌𝗂𝗈𝗇𝗌](https://github.com/TorchMeter/torchmeter/discussions)
   - [𝖯𝗎𝗅𝗅 𝖱𝖾𝗊𝗎𝖾𝗌𝗍𝗌](https://github.com/TorchMeter/torchmeter/pulls)

2. Looking forward to your star ⭐️ if `torchmeter` is helpful to you!

</font>

In [31]:
# install 
# %pip install torchmeter

In [32]:
import torch
from rich import print
from torchvision import models
from torchmeter import Meter, get_config

# 1. Model Preparation and Initialization
model = models.vgg19_bn()
metered_model = Meter(model)
cfg = get_config()

# 2. Zero-intrusion proxy: use it just like the original model
metered_model.features.requires_grad_(False)

# 3. Input Processing and Inference (Implicit automatic device synchronization between input data and model)
input = torch.randn(1, 3, 224, 224)
if torch.cuda.is_available():
    metered_model.to('cuda')

# 4. Standard forward propagation
output = metered_model(input)  

In [33]:
# 5. Model Structure Analysis
# --------------------------------------------
## Enable repeated block folding
print("="*10, " enable smart folding of repeat blocks", "="*10)
metered_model.tree_fold_repeat = True
print(metered_model.structure)

## Disable
print("="*10, " disable smart folding of repeat blocks", "="*10)
metered_model.tree_fold_repeat = False
print(metered_model.structure)
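
The folding toggled above can be pictured as collapsing consecutive, structurally identical windows of child modules. The following is a minimal pure-Python sketch of that idea, not torchmeter's actual algorithm: `fold_repeats` and the plain string signatures are hypothetical, introduced here only for illustration.

```python
def fold_repeats(signatures):
    """Collapse consecutive repeated windows of layer signatures.

    Returns a list of (window, repeat_time) tuples. Hypothetical helper:
    a greedy sketch, not torchmeter's implementation.
    """
    folded, i, n = [], 0, len(signatures)
    while i < n:
        best_win, best_rep = 1, 1
        # Try every window size that could repeat at least twice from position i.
        for win in range(1, (n - i) // 2 + 1):
            window = signatures[i : i + win]
            rep = 1
            while signatures[i + rep * win : i + (rep + 1) * win] == window:
                rep += 1
            # Prefer the repeated window that covers the most layers.
            if rep > 1 and rep * win > best_rep * best_win:
                best_win, best_rep = win, rep
        folded.append((tuple(signatures[i : i + best_win]), best_rep))
        i += best_win * best_rep
    return folded


layers = ["Conv", "BN", "ReLU", "Conv", "BN", "ReLU", "MaxPool", "Linear"]
for window, times in fold_repeats(layers):
    print(f"{' -> '.join(window)}  x{times}")
```

A real implementation would compare module types and hyperparameters rather than bare names, but the window-matching step would look similar.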

In [34]:
## Disable the render interval so that output displays properly in Jupyter Notebook.
cfg.render_interval = 0

In [35]:
# 6. Full-Stack Model Analytics
# --------------------------------------------

# 6.1 Parameter Analysis
print("="*10, " Parameter Analysis ", "="*10)

## Total/trainable parameter quantification
print(metered_model.param)  

## Layer-wise parameter distribution analysis
tb, data = metered_model.profile('param', no_tree=True)
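
As a sanity check on the numbers such a report shows, the parameter counts of standard layers can be derived by hand. The sketch below uses plain Python; `conv2d_params` and `linear_params` are hypothetical helpers, and the two shapes are the first convolution and the first fully connected layer of torchvision's VGG-19 layout. The convolution is marked as frozen to mirror the `requires_grad_(False)` call on `features` earlier.

```python
def conv2d_params(c_in, c_out, kernel, bias=True):
    """Weights (c_out * c_in * k * k) plus one bias per output channel."""
    return c_out * c_in * kernel * kernel + (c_out if bias else 0)


def linear_params(f_in, f_out, bias=True):
    """Weights (f_in * f_out) plus one bias per output feature."""
    return f_in * f_out + (f_out if bias else 0)


# (name, parameter count, trainable?)
layers = [
    ("features.0   Conv2d 3->64, 3x3", conv2d_params(3, 64, 3), False),
    ("classifier.0 Linear 25088->4096", linear_params(25088, 4096), True),
]

total = sum(count for _, count, _ in layers)
trainable = sum(count for _, count, is_trainable in layers if is_trainable)

for name, count, is_trainable in layers:
    print(f"{name:<34} {count:>12,}  trainable={is_trainable}")
print(f"total={total:,}  trainable={trainable:,}")
```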

In [36]:
# 6.2 Computational Profiling
print("="*10, " Computational Profiling ", "="*10)

## Precise FLOPs/MACs calculation
print(metered_model.cal)

## Operation-wise calculation distribution analysis
tb, data = metered_model.profile('cal', no_tree=True)
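
To make the reported quantities concrete, here is a hand calculation for two layer types. It assumes the common convention that one MAC is one multiply plus one add, so FLOPs are approximated as 2 x MACs; whether torchmeter follows exactly this convention is an assumption here. The helper names are hypothetical.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Each output element needs (c_in / groups) * k * k multiply-accumulates."""
    return c_out * h_out * w_out * (c_in // groups) * kernel * kernel


def linear_macs(f_in, f_out, batch=1):
    """Each output feature is a dot product over f_in inputs."""
    return batch * f_in * f_out


# First VGG-19 convolution: 3->64 channels, 3x3 kernel, padding 1,
# so a 224x224 input keeps its spatial size.
macs = conv2d_macs(3, 64, 3, 224, 224)
flops = 2 * macs  # assumed convention: 1 MAC = 2 FLOPs

print(f"MACs:  {macs:,}")
print(f"FLOPs: {flops:,}")
```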

In [37]:
# 6.3 Memory Diagnostics
print("="*10, " Memory Diagnostics ", "="*10)

## Input/output tensor memory awareness
print(metered_model.mem)

## Hierarchical memory consumption analysis
tb, data = metered_model.profile('mem', no_tree=True)
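
The tensor-level costs in a memory report come down to element count times element size. A minimal sketch, assuming dense tensors with no allocator overhead; `tensor_bytes` and the dtype table are hypothetical names used only for this illustration.

```python
from math import prod

# Bytes per element for a few common dtypes.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "int64": 8}


def tensor_bytes(shape, dtype="float32"):
    """Memory footprint of a dense tensor: number of elements * element size."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype]


input_cost = tensor_bytes((1, 3, 224, 224))    # the random input used above
output_cost = tensor_bytes((1, 64, 224, 224))  # output of a 3->64 conv with padding 1

print(f"input:  {input_cost / 1024:.1f} KiB")
print(f"output: {output_cost / 1024 ** 2:.2f} MiB")
```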

In [38]:
# 6.4 Performance Benchmarking
print("="*10, " Inference latency & Throughput benchmarking ", "="*10)

## Customize the warm-up and benchmark settings
metered_model.ittp_warmup = 10  
metered_model.ittp_benchmark_time = 20

## Inference latency & throughput benchmarking
print(metered_model.ittp)
tb, data = metered_model.profile('ittp', no_tree=True)

Warming Up: 100%|██████████| 10/10 [00:00<00:00, 402.66it/s]
Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:01<00:00, 674.62module/s]


Warming Up: 100%|██████████| 10/10 [00:00<00:00, 445.18it/s]
Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:01<00:00, 650.63module/s]
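
The warm-up-then-measure pattern behind the progress bars above can be sketched in plain Python. This is an illustration under stated assumptions, not torchmeter's implementation: `benchmark` and its parameters are hypothetical, and the workload is a stand-in for a forward pass. On a GPU, a call such as `torch.cuda.synchronize()` would be needed before reading the clock, because kernels run asynchronously.

```python
import time
from statistics import median


def benchmark(fn, warmup=10, repeats=20, batch_size=1):
    """Return (median latency in seconds, throughput in samples per second)."""
    # Warm-up runs are discarded so one-off setup costs do not bias the timing.
    for _ in range(warmup):
        fn()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)

    latency = median(timings)
    # Guard against a zero reading on very coarse clocks.
    throughput = batch_size / latency if latency > 0 else float("inf")
    return latency, throughput


latency, throughput = benchmark(lambda: sum(range(100_000)))
print(f"latency: {latency * 1e3:.3f} ms, throughput: {throughput:.1f} samples/s")
```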


In [39]:
# 7. Rich Visualization
# --------------------------------------------

# 7.1 Rich-text hierarchical structure tree rendering
from rich.box import ROUNDED

print("="*10, " Tree rendering customization ", "="*10)
metered_model.tree_fold_repeat = True
metered_model.tree_levels_args = {
    "default": {"label": "[b gray35](<node_id>) [green]<name>[/green] [cyan]<module_repr>[/]"},
    "1": {"guide_style": "cornflower_blue"}
}
metered_model.tree_repeat_block_args = {
    "title": "[[b]<repeat_time>[/b]] [i]Times Repeated[/]",
    "box": ROUNDED
}
print(metered_model.structure)

In [40]:
# 7.2 Customization in displaying programmable tabular report
print("="*10, " Table rendering customization ", "="*10)
metered_model.table_column_args.justify = "left"
metered_model.table_display_args = {
    "style": "#af8700", # or rgb(175,135,0)
    "show_lines": True,
    "show_edge": False    
}

tb, data = metered_model.profile("param", no_tree=True)

In [41]:
## discard above settings
cfg.restore()
cfg.render_interval = 0

In [42]:
# 7.3 Custom rendering of table content
## 7.3.1 structure tree + tabular report
print("="*10, " Table report with tree ", "="*10)
tb, data = metered_model.profile("param", no_tree=False)

In [43]:
## 7.3.2 tabular report with raw data
print("="*10, " Table report with raw data ", "="*10)
tb, data = metered_model.profile("param", no_tree=True, raw_data=True)

In [44]:
# 7.3.3 Column customization of the tabular report
print("="*10, " Table structure customization ", "="*10)
tb, data = metered_model.profile(
    "mem", 
    no_tree=True, 
    pick_cols=["Operation_Id", "Operation_Name", "Param_Cost", "Buffer_Cost", "Output_Cost", "Total"], 
    exclude_cols=["Operation_Name"],
    custom_cols={"Operation_Id": "ID", 
                 "Param_Cost": "Param Cost", 
                 "Buffer_Cost": "Buffer Cost", 
                 "Output_Cost": "Output Cost"},
    keep_custom_name = True,
    newcol_name="Index",
    newcol_func=lambda df: list(range(len(df))),
    newcol_type=int,
    newcol_idx=0,
    keep_new_col=True
)

## check the new columns are kept
print(metered_model.table_cols("mem"))

In [45]:
## discard above settings
cfg.restore()
cfg.render_interval = 0

In [46]:
# 7.4 Programmable tabular report: Real-time data analysis in a programmable way
def newcol_logic(df):
    num_col = df['Number']
    return num_col.map_elements(
        lambda x: f"{100 * x / metered_model.param.TotalNum:.4f} %",
        return_dtype=str
    )

print("="*10, " Programmable tabular report ", "="*10)
origin_col = metered_model.table_cols('param')
print(f"origin cols: {origin_col}")

tb, data = metered_model.profile(
    'param', 
    no_tree = True,
    exclude_cols=["Operation_Name"],
    custom_cols={"Operation_Id": 'ID', 
                 "Param_Name": 'Param Name', 
                 "Requires_Grad": 'Trainable',
                 "Numeric_Num": "Number"},
    newcol_name='Percentage',
    newcol_func=newcol_logic,
    newcol_type=str
)

In [47]:
# 7.5 Multi-format export of tabular report
print("="*10, " Tabular report export ", "="*10)
tb, data = metered_model.profile(
    'param', 
    show=False,
    no_tree = True,
    exclude_cols=["Operation_Name"],
    custom_cols={"Operation_Id": 'ID', 
                 "Param_Name": 'Param Name', 
                 "Requires_Grad": 'Trainable',
                 "Numeric_Num": "Number"},
    newcol_name='Percentage',
    newcol_func=newcol_logic,
    newcol_type=str,
    save_to='./param_report.xlsx' # or csv
)

In [48]:
# 8. Cross-Platform Support
print("="*10, " Cross-Platform Support ", "="*10)

metered_model.to("cpu")
print(metered_model.device)

if torch.cuda.is_available():
    metered_model.device = "cuda:0"
    print(metered_model.device)

In [49]:
# 9. Model Summary

print("="*10, " Model Information ", "="*10)
print(metered_model.model_info)

In [50]:
# 10. Statistics Overview
print("="*10, " Statistics Overview ", "="*10)
print(metered_model.overview())

print("="*10, " Statistics Overview (no warnings) ", "="*10)
print(metered_model.overview(show_warning=False))

print("="*10, " Statistics Overview (custom) ", "="*10)
print(metered_model.overview("param", "mem"))

Warming Up: 100%|██████████| 10/10 [00:00<00:00, 376.86it/s]
Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:02<00:00, 607.88module/s]


Warming Up: 100%|██████████| 10/10 [00:00<00:00, 431.45it/s]
Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:02<00:00, 633.55module/s]


In [51]:
# 11. Advanced Usage

## 11.1 Post export of tabular report
print("="*10, " Custom export ", "="*10)
metered_model.table_renderer.export(
    df=data,
    save_path=".",
    format="csv",
    file_suffix="custom_export",
    raw_data=True
)

In [52]:
## 11.2 Customization of the repeat footer
import torch.nn as nn
from random import sample

print("="*10, " Custom repeat block footer ", "="*10)

class RepeatModel(nn.Module):
    def __init__(self, repeat_winsz:int=1, repeat_time:int=2):
        super(RepeatModel, self).__init__()
        
        layer_candidates = [nn.Linear(10, 10), 
                            nn.ReLU(),
                            nn.Identity()]

        pick_modules = sample(layer_candidates, repeat_winsz)
        all_modules = pick_modules * repeat_time

        self.layers = nn.ModuleList(all_modules)

metered_model = Meter(RepeatModel(repeat_winsz=2, repeat_time=3), 
                      device="cpu")

### 11.2.1 change to a hard-coded string
print("-"*10, " Using hard-coded string ", "-"*10)
metered_model.tree_renderer.repeat_footer = "My custom footer"
print(metered_model.structure)

In [53]:
### 11.2.2 change to a string with operation node attributes
print("-"*10, " Using dynamic string with attributes resolved ", "-"*10)
metered_model.tree_renderer.repeat_footer = "The type of first module is <type>"
print(metered_model.structure)

In [54]:
### 11.2.3 change to a function that accepts an attribute dictionary and returns a string
print("-"*10, " Using function ", "-"*10)
def my_footer(attr_dict):
    repeat_win_size = attr_dict["repeat_winsz"]
    if repeat_win_size > 1:
        return f"There are {repeat_win_size} modules in a repeat window"
    else:
        return "The repeat window only contains one module"
metered_model.tree_renderer.repeat_footer = my_footer
print(metered_model.structure)
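
The three footer forms shown above (fixed string, string with `<attribute>` placeholders, and function) can all be handled by one small resolver. The sketch below shows one way this might work; `resolve_footer` and the example attribute dictionary are hypothetical and do not describe torchmeter's internals.

```python
import re


def resolve_footer(footer, attrs):
    """Turn a footer spec into text.

    - callable: called with the attribute dictionary
    - str: every <key> placeholder is replaced by attrs[key];
      unknown placeholders are left untouched
    """
    if callable(footer):
        return footer(attrs)
    return re.sub(
        r"<(\w+)>",
        lambda match: str(attrs.get(match.group(1), match.group(0))),
        footer,
    )


# Hypothetical attributes of the first node in a repeat window.
attrs = {"type": "Linear", "repeat_winsz": 2, "repeat_time": 3}

print(resolve_footer("My custom footer", attrs))
print(resolve_footer("The type of first module is <type>", attrs))
print(resolve_footer(lambda a: f"{a['repeat_winsz']} modules per window", attrs))
```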

In [55]:
## 11.3 Centralized configuration management
print("="*10, " Efficient config management ", "="*10)

### 11.3.1 show config
print("-"*10, " config display ", "-"*10)
print(cfg)

In [56]:
### 11.3.2 retrieve config setting
print("-"*10, " config settings retrieval ", "-"*10)
print(
    f"config_file: {cfg.config_file}\n",
    f"render time interval: {cfg.render_interval}\n",
    f"tree default guide line style: {cfg.tree_levels_args.default.guide_style}\n",
    f"table col justify: {cfg.table_column_args.justify}\n",
    f"gap between tree and table in profiling: {cfg.combine.horizon_gap}"
)

In [57]:
### 11.3.3 change config settings
print("-"*10, " change config settings ", "-"*10)
cfg.render_interval = 0.5
cfg.tree_levels_args.default.guide_style = "bold"
cfg.table_display_args = {
    "show_header": False,
    "show_lines": True
}
print(cfg.render_interval)
print(cfg.tree_levels_args.default.guide_style)
print(cfg.table_display_args.show_header)
print(cfg.table_display_args.show_lines)

In [58]:
### 11.3.4 save config
import os
print("-"*10, " dump config settings ", "-"*10)
des = "./my_config.yaml"
cfg.dump(save_path=des)
abs_des = os.path.abspath(des)
if os.path.exists(abs_des):
    print(f"config dumped successfully to {abs_des}")

In [59]:
### 11.3.5 restore config
print("-"*10, " restore all config settings ", "-"*10)
cfg.restore()
print(cfg)
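
The change, dump, and restore cycle used above can be modelled with a small attribute-style settings object. This is a sketch only: `SketchConfig`, its `settings` attribute, and the default values are hypothetical, and JSON stands in for YAML so that the example needs nothing beyond the standard library.

```python
import copy
import json
from types import SimpleNamespace


def to_namespace(value):
    """Recursively turn nested dicts into attribute-accessible namespaces."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    return value


def to_dict(value):
    """Inverse of to_namespace, used for serialization."""
    if isinstance(value, SimpleNamespace):
        return {k: to_dict(v) for k, v in vars(value).items()}
    return value


class SketchConfig:
    def __init__(self, defaults):
        # Keep a private copy so later edits cannot alter the defaults.
        self._defaults = copy.deepcopy(defaults)
        self.settings = to_namespace(copy.deepcopy(defaults))

    def restore(self):
        """Discard all changes and return to the initial settings."""
        self.settings = to_namespace(copy.deepcopy(self._defaults))

    def dumps(self):
        return json.dumps(to_dict(self.settings), indent=2)


sketch = SketchConfig({"render_interval": 0.1, "tree": {"guide_style": "dim"}})
sketch.settings.render_interval = 0.5
sketch.settings.tree.guide_style = "bold"

snapshot = sketch.dumps()                      # save the modified settings
sketch.restore()                               # back to the defaults
reloaded = SketchConfig(json.loads(snapshot))  # a snapshot can seed a fresh config

print(sketch.settings.render_interval, reloaded.settings.render_interval)
```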

In [60]:
### 11.3.6 reuse config
print("-"*10, " config reuse ", "-"*10)
new_cfg = get_config(config_path=abs_des)
print(f"reuse: {new_cfg.config_file}")
print(new_cfg)
print("You can compare this with the restored settings in the previous cell",
      "to check that the settings from before the restore are reused.")