Add MFU (Model FLOPs Utilization) logging support

**Describe the feature**
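Please describe the feature requested here

We would like MFU (Model FLOPs Utilization) to be logged during training and inference.
Currently, when monitoring training performance, there is no intuitive metric for hardware utilization. MFU is a standard metric for how effectively a model uses the available compute; it would help us quickly judge whether compute is fully used and spot potential performance bottlenecks.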
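
For reference, MFU is usually defined as achieved model FLOPs per second divided by the hardware's theoretical peak FLOPs per second. The sketch below is only an illustration, not an existing API in this project: the function name `estimate_mfu` and all of its parameters are hypothetical, and it assumes a dense transformer whose training cost is approximated by the common `6 * parameters * tokens` rule (forward plus backward, attention FLOPs ignored). A per-architecture counter such as the linked `flops_counter.py` would presumably be more accurate.

```python
def estimate_mfu(
    num_params: float,
    tokens_per_step: float,
    step_time_s: float,
    peak_flops_per_gpu: float,
    num_gpus: int = 1,
) -> float:
    """Rough MFU estimate for one training step (hypothetical helper).

    Assumption: training FLOPs ~= 6 * num_params * tokens, which ignores
    attention FLOPs and any activation recomputation.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    # FLOPs per second the model actually achieved during this step.
    achieved_flops_per_s = 6.0 * num_params * tokens_per_step / step_time_s
    # Theoretical peak across all GPUs, taken from the vendor spec sheet
    # for the numeric precision in use.
    peak_flops_per_s = peak_flops_per_gpu * num_gpus
    return achieved_flops_per_s / peak_flops_per_s
```

The peak value is an input rather than something the function looks up, because it depends on the GPU model and on the precision used for training.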

**Paste any useful information**
Paste any useful information, including papers, GitHub links, etc.

https://github.com/volcengine/verl/blob/main/verl/utils/flops_counter.py

**Additional context**
Add any other context or information here

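- Desired log format:
  - Output the MFU value once per epoch or every N steps
  - Also record batch size, throughput, GPU utilization, and similar information, to make problems easier to locate
- Use cases:
  - Performance tuning: decide whether the batch size or parallelism strategy needs adjusting
  - Cluster monitoring: quickly compare compute utilization across experiments and configurations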
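
To make the request concrete, here is a minimal sketch of what "log MFU every N steps" could look like. Everything in it is an assumption for illustration: the class `MFULogger`, its fields, and the log line format are hypothetical and not part of any existing trainer, and FLOPs are again approximated as `6 * parameters * tokens`. GPU utilization is left out because it would need a device query (for example through NVML) that this sketch does not attempt.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("mfu")


@dataclass
class MFULogger:
    """Hypothetical helper: aggregates step stats and logs MFU every N steps."""

    num_params: float
    peak_flops_per_gpu: float  # vendor peak for the precision in use
    num_gpus: int = 1
    log_every: int = 10
    _step: int = 0
    _window_tokens: float = 0.0
    _window_seconds: float = 0.0

    def step(self, batch_size: int, num_tokens: int, step_time_s: float) -> Optional[dict]:
        """Record one step; return a metrics dict on logging steps, else None."""
        self._step += 1
        self._window_tokens += num_tokens
        self._window_seconds += step_time_s
        if self._step % self.log_every != 0:
            return None

        tokens_per_s = self._window_tokens / self._window_seconds
        # Assumption: 6 * params FLOPs per token (dense model, attention ignored).
        achieved = 6.0 * self.num_params * tokens_per_s
        mfu = achieved / (self.peak_flops_per_gpu * self.num_gpus)
        record = {
            "step": self._step,
            "batch_size": batch_size,
            "tokens_per_s": tokens_per_s,
            "mfu": mfu,
        }
        logger.info(
            "step=%d batch_size=%d tokens/s=%.1f MFU=%.2f%%",
            self._step, batch_size, tokens_per_s, 100.0 * mfu,
        )
        # Start a fresh averaging window for the next N steps.
        self._window_tokens = 0.0
        self._window_seconds = 0.0
        return record
```

Averaging over the window rather than reporting a single step smooths out per-step jitter, and returning the record as a dict would let the same values be forwarded to whatever experiment tracker is in use.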

Add MFU (Model FLOPs Utilization) logging support #5791
