# torch.utils.tensorboard
`torch.utils.tensorboard` is PyTorch's official interface to `TensorBoard`. It is used to visualize and monitor the training process.

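## 1. What is TensorBoard?

`TensorBoard` started out as the visualization tool for `TensorFlow`; PyTorch can now write to it directly.

It can visualize:

- Loss / accuracy curves 📈

- Learning-rate changes

- Model structure (Graph)

- Weight distributions (histograms)

- Input samples and predictions (images / text / audio)

- Embeddings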

---
## 2. Main interface of `torch.utils.tensorboard`

Core class: `SummaryWriter`

In [None]:
from torch.utils.tensorboard import SummaryWriter


It writes log files (event files) to a directory you specify; you then run the `tensorboard` command to visualize them.
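To make the "writes log files to a directory" point concrete, here is a minimal sketch that logs one scalar and lists what ends up on disk. The temporary directory and the tag `demo/value` are illustrative choices, not part of the text above.

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Use a throwaway directory so the sketch does not touch a real "runs" folder.
log_dir = tempfile.mkdtemp(prefix="tb_demo_")

writer = SummaryWriter(log_dir)          # event files are created inside log_dir
writer.add_scalar("demo/value", 1.0, 0)  # tag, scalar value, global step
writer.close()                           # flush everything to disk

# TensorBoard event files start with "events.out.tfevents".
event_files = [f for f in os.listdir(log_dir) if f.startswith("events.out.tfevents")]
print(log_dir, event_files)
```

Pointing `tensorboard --logdir` at that directory (or at its parent) would then pick up the file.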

---

## 3. Common methods

| Method                                                                               | Purpose                                                      |
| ------------------------------------------------------------------------------------ | ------------------------------------------------------------ |
| `add_scalar(tag, scalar_value, global_step)`                                         | Log a scalar (e.g. loss, accuracy)                           |
| `add_scalars(main_tag, {tag1: value1, tag2: value2}, global_step)`                   | Plot several scalars together (e.g. train vs. val loss)      |
| `add_image(tag, img_tensor, global_step, dataformats='CHW')`                         | Log a single image                                           |
| `add_images(tag, img_tensor, global_step, dataformats='NCHW')`                       | Log a batch of images                                        |
| `add_histogram(tag, values, global_step)`                                            | Log a histogram (e.g. weight distribution)                   |
| `add_graph(model, input_to_model)`                                                   | Visualize the model structure                                |
| `add_embedding(mat, metadata=None, label_img=None, global_step=None, tag='default')` | Visualize high-dimensional vectors (e.g. word embeddings)    |
| `flush()`                                                                            | Flush pending logs to disk (so nothing is lost on early exit) |
| `close()`                                                                            | Close the writer                                             |
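Several of the methods in the table can be combined in one short sketch. It assumes random stand-in data and a temporary log directory (both illustrative). It also relies on `SummaryWriter` being usable as a context manager, which calls `close()` on exit.

```python
import os
import tempfile

import torch
from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp(prefix="tb_methods_")  # illustrative location

# The with-block closes the writer automatically when it exits.
with SummaryWriter(log_dir) as writer:
    for step in range(3):
        writer.add_scalar("demo/loss", 1.0 / (step + 1), step)

    # add_histogram accepts a tensor (or NumPy array) of any shape.
    writer.add_histogram("demo/values", torch.randn(1000), 0)

    # flush() forces pending events to disk without closing the writer.
    writer.flush()
    files_after_flush = os.listdir(log_dir)

print(files_after_flush)
```

Calling `flush()` explicitly is mainly useful in long-running jobs where you want recent points to appear before the writer's periodic flush.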

---
## 4. Usage examples
(1) Basic usage: logging loss / accuracy

In [None]:
from torch.utils.tensorboard import SummaryWriter
import numpy as np

writer = SummaryWriter("runs/exp1")  # directory where the logs are saved

for epoch in range(100):
    # Random placeholders standing in for real training metrics
    loss = np.random.random()
    acc = np.random.random()

    writer.add_scalar("Loss/train", loss, epoch)
    writer.add_scalar("Accuracy/train", acc, epoch)

writer.close()


Launch TensorBoard:

- tensorboard --logdir=runs

Then open `http://localhost:6006` in a browser to see the curves.
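If the curves do not show up in the browser, it can help to check what was actually written to disk. The sketch below reads the event files back with `EventAccumulator` from the `tensorboard` package. The temporary directory and the logged values are illustrative only.

```python
import tempfile

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp(prefix="tb_readback_")  # illustrative location

writer = SummaryWriter(log_dir)
for epoch in range(5):
    writer.add_scalar("Loss/train", 1.0 / (epoch + 1), epoch)
writer.close()

# Load the event files from disk and inspect the scalar tags and data points.
accumulator = EventAccumulator(log_dir)
accumulator.Reload()

scalar_tags = accumulator.Tags()["scalars"]
points = [(event.step, event.value) for event in accumulator.Scalars("Loss/train")]
print(scalar_tags)
print(points)
```

An empty tag list usually means the writer was never flushed or closed, or that `--logdir` points at a different directory than the one the writer used.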

---
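(2) Logging training & validation loss side by side

In [None]:
writer = SummaryWriter("runs/exp1")  # example (1) closed its writer, so create one explicitly

for epoch in range(100):
    train_loss = np.random.random()
    val_loss = np.random.random()

    # Both curves are drawn on one chart under the main tag "Loss"
    writer.add_scalars("Loss", {"train": train_loss, "val": val_loss}, epoch)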
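One detail worth knowing: in current PyTorch versions, `add_scalars` writes each entry of the dict to its own sub-run directory under the log directory, which is why TensorBoard lists them as separate runs. A small sketch with a temporary directory and made-up values shows the resulting layout.

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp(prefix="tb_scalars_")  # illustrative location

writer = SummaryWriter(log_dir)
for epoch in range(3):
    # Made-up values; real code would pass the measured losses.
    writer.add_scalars("Loss", {"train": 1.0 / (epoch + 1), "val": 1.5 / (epoch + 1)}, epoch)
writer.close()

# Each dict key gets a sub-directory named "<main_tag>_<key>".
sub_runs = sorted(
    d for d in os.listdir(log_dir) if os.path.isdir(os.path.join(log_dir, d))
)
print(sub_runs)
```

If you would rather keep a single run per experiment, two `add_scalar` calls with tags such as `Loss/train` and `Loss/val` are an alternative; they appear as separate charts grouped under `Loss`.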


(3) Logging images

In [None]:
import torch
import torchvision

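images = torch.rand(16, 3, 64, 64)  # a batch of images with values in [0, 1), the range add_image expects for floats
grid = torchvision.utils.make_grid(images)  # tile the batch into a single CHW image

writer.add_image("images", grid, 0)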

(4) Logging histograms (weight distributions)

In [None]:
for epoch in range(10):
    weights = torch.randn(1000)
    writer.add_histogram("weights", weights, epoch)


(5) Logging the model structure

In [None]:
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 2)
)

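x = torch.randn(1, 10)  # dummy input used to trace the model
writer.add_graph(model, x)
writer.close()  # flush pending events now that the examples are done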


## 5. Recommended integration into a training loop

In [None]:
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/exp1")

for epoch in range(epochs):
    # train(...) and validate(...) stand for your own training / validation routines
    train_loss, train_acc = train(...)
    val_loss, val_acc = validate(...)

    # Log scalars
    writer.add_scalars("Loss", {"train": train_loss, "val": val_loss}, epoch)
    writer.add_scalars("Accuracy", {"train": train_acc, "val": val_acc}, epoch)

    # Log weight distributions
    for name, param in model.named_parameters():
        writer.add_histogram(name, param.detach().cpu(), epoch)

writer.close()
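The learning rate and gradient distributions can be logged in the same loop. The sketch below is one way to do it, assuming a tiny stand-in model, random data, and a `StepLR` schedule chosen purely for illustration. The current learning rate is read from `optimizer.param_groups`.

```python
import tempfile

import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp(prefix="tb_lr_")  # illustrative location
writer = SummaryWriter(log_dir)

model = nn.Linear(10, 2)  # tiny stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

logged_lrs = []
for epoch in range(5):
    # One dummy optimization step on random data.
    loss = model(torch.randn(4, 10)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # The learning rate used in this epoch lives in the parameter groups.
    lr = optimizer.param_groups[0]["lr"]
    writer.add_scalar("LearningRate", lr, epoch)
    logged_lrs.append(lr)

    # Gradients can be logged as histograms just like weights.
    for name, param in model.named_parameters():
        writer.add_histogram(f"grad/{name}", param.grad.detach().cpu(), epoch)

    scheduler.step()  # update the learning rate for the next epoch

writer.close()
print(logged_lrs)
```

With several parameter groups, one scalar per group (for example `LearningRate/group0`) keeps the curves apart.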


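## 6. Summary

✅ What `torch.utils.tensorboard` gives you:

- Live visualization of training (loss, accuracy, learning rate)

- Visualization of model structure and weight distributions

- Logging of samples (images, audio, text)

- Embedding projection (dimensionality reduction)

👉 It is well suited to debugging and monitoring training, and is easier to read than plain `print` output.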

# 🔥 PyTorch + TensorBoard training template

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.tensorboard import SummaryWriter

# ===============================
# 1. Data preparation
# ===============================
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # maps pixel values from [0, 1] to [-1, 1]
])

trainset = torchvision.datasets.MNIST(
    root='./data', train=True, download=True, transform=transform
)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

testset = torchvision.datasets.MNIST(
    root='./data', train=False, download=True, transform=transform
)
testloader = torch.utils.data.DataLoader(testset, batch_size=1000, shuffle=False)

# ===============================
# 2. Define the model
# ===============================
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 256)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleNet().to(device)

# ===============================
# 3. Loss function & optimizer
# ===============================
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# ===============================
# 4. TensorBoard setup
# ===============================
writer = SummaryWriter("runs/mnist_experiment")

# Visualize the model structure (the dummy input only needs the right shape)
sample_input = torch.randn(1, 1, 28, 28).to(device)
writer.add_graph(model, sample_input)

# ===============================
# 5. Training & validation functions
# ===============================
def train(epoch):
    model.train()
    running_loss, correct, total = 0.0, 0, 0
    for batch_idx, (inputs, targets) in enumerate(trainloader):
        inputs, targets = inputs.to(device), targets.to(device)

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += targets.size(0)
        correct += predicted.eq(targets).sum().item()

    epoch_loss = running_loss / len(trainloader)
    epoch_acc = 100. * correct / total
    return epoch_loss, epoch_acc


def validate(epoch):
    model.eval()
    running_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for inputs, targets in testloader:
            inputs, targets = inputs.to(device), targets.to(device)

            outputs = model(inputs)
            loss = criterion(outputs, targets)

            running_loss += loss.item()
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()

    epoch_loss = running_loss / len(testloader)
    epoch_acc = 100. * correct / total
    return epoch_loss, epoch_acc

# ===============================
# 6. Main training loop
# ===============================
num_epochs = 5
for epoch in range(num_epochs):
    train_loss, train_acc = train(epoch)
    val_loss, val_acc = validate(epoch)

    print(f"Epoch [{epoch+1}/{num_epochs}] "
          f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}% "
          f"| Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%")

    # ---- Write to TensorBoard ----
    writer.add_scalars("Loss", {"train": train_loss, "val": val_loss}, epoch)
    writer.add_scalars("Accuracy", {"train": train_acc, "val": val_acc}, epoch)

    # Weight distributions
    for name, param in model.named_parameters():
        writer.add_histogram(name, param.detach().cpu(), epoch)

writer.close()
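The template logs once per epoch. For finer-grained curves, a common pattern is to log every few batches inside `train()` and use a single increasing step for the x-axis. The sketch below only shows the step arithmetic. The helper name `global_step`, the logging interval, and the tag in the comment are hypothetical.

```python
def global_step(epoch: int, batch_idx: int, batches_per_epoch: int) -> int:
    """Map (epoch, batch index) to one monotonically increasing step."""
    return epoch * batches_per_epoch + batch_idx


batches_per_epoch = 938  # len(trainloader) for MNIST with batch_size=64 (60000 / 64, rounded up)
log_every = 100          # hypothetical logging interval, in batches

logged_steps = []
for epoch in range(2):
    for batch_idx in range(batches_per_epoch):
        if batch_idx % log_every == 0:
            step = global_step(epoch, batch_idx, batches_per_epoch)
            # Inside the template's train(), this is where one could call:
            # writer.add_scalar("Loss/train_batch", loss.item(), step)
            logged_steps.append(step)

print(logged_steps[:3], logged_steps[-1])
```

Using `epoch * len(trainloader) + batch_idx` keeps points from later epochs to the right of earlier ones instead of overwriting the same step values.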


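## 🚀 How to use

1. Run the training script (it writes log files under runs/mnist_experiment/).

2. Launch TensorBoard:

- tensorboard --logdir=runs

3. Open http://localhost:6006 in a browser.
- You will see:

    - Loss curves (train vs. validation)
    
    - Accuracy curves
    
    - The model structure (Graph)
    
    - Histograms of the parameter distributions

✅ This gives you a PyTorch + TensorBoard template that works for both research and engineering projects.
To reuse it in another project, swap in your own dataset and model.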
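When the template is reused for several experiments, writing every launch into the same `runs/mnist_experiment` directory makes TensorBoard treat them as one run, so their curves get mixed together. One common way to keep launches apart is a timestamped directory per run. The helper `make_run_dir` below is a hypothetical name, and the naming scheme is only one reasonable choice.

```python
import os
from datetime import datetime


def make_run_dir(root: str = "runs", name: str = "mnist_experiment") -> str:
    """Build a log directory path that differs per launch (hypothetical helper)."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return os.path.join(root, f"{name}_{stamp}")


run_dir = make_run_dir()
print(run_dir)
# Pass it to the writer in place of the fixed path, e.g.:
# writer = SummaryWriter(run_dir)
```

Calling `SummaryWriter()` with no arguments has a similar effect: its default log directory sits under `runs/` and includes the current date, time, and host name.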