
BatchNorm training=True in some Timm classes #1338


Open
yuanyao-nv opened this issue Apr 3, 2024 · 5 comments

Comments

@yuanyao-nv
Contributor

Previously we resolved the issue (#1262) where instance norm in PyTorch was decomposing into batch norm with training mode set to True.

I've also encountered issues with BatchNorm's training flag set to True in certain timm classes. If you look at https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/std_conv.py, all calls to F.batch_norm() pass training=True. Any thoughts on what the best solution would be here? I'm wondering if there's a way to always set training to False during tracing.
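
For context, the pattern in question looks roughly like the condensed sketch below (paraphrased from std_conv.py, not a verbatim copy). The layer standardizes its convolution weights by running them through F.batch_norm with running_mean=None and running_var=None; with no running statistics available, F.batch_norm has to compute batch statistics, i.e. training=True, which is why training-mode BN shows up in the traced graph.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StdConv2d(nn.Conv2d):
    """Conv2d with weight standardization, paraphrased from timm's std_conv.py."""

    def __init__(self, *args, eps: float = 1e-6, **kwargs):
        super().__init__(*args, **kwargs)
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # batch_norm is used here purely to standardize the weight tensor:
        # with running_mean/running_var of None it must compute batch statistics
        # (training=True), so the exported graph contains training-mode BN nodes
        # even though the module itself is in eval() mode.
        weight = F.batch_norm(
            self.weight.reshape(1, self.out_channels, -1),
            None, None, training=True, momentum=0.0, eps=self.eps,
        ).reshape_as(self.weight)
        return F.conv2d(
            x, weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
```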

@justinchuby
Collaborator

I wonder if it would make sense to replace them with False after we obtain the model?

@justinchuby
Collaborator

@thiagocrepaldi

@yuanyao-nv
Contributor Author

yuanyao-nv commented Apr 3, 2024

I wonder if it would make sense to replace them with False after we obtain the model?

Yeah, I think someone brought up this point before: unless we anticipate use cases for training ops, the onnx-rewriter could potentially just set all the training attributes it sees to False. But then that's assuming the user will always run the onnx-rewriter a posteriori.

@justinchuby
Collaborator

I wonder why training=True is used in the first place. Was PyTorch trying to create some kind of training behavior even when eval() was called on the model?

@BowenBao
Contributor

BowenBao commented Apr 4, 2024

But then that's assuming the user will always run the onnx-rewriter a posteriori.

This can be integrated into the exporter, since it operates on standard ONNX domains.

In ONNX, training_mode only affects whether BatchNormalization emits the extra outputs (running mean and variance). cc @gramalingam, I think that was our conclusion last time. So if those outputs are unused, it is safe to remove them and flip the flag on the node.
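
A minimal sketch of what such a pass could look like, using the onnx Python API. The function name freeze_batchnorm_training is hypothetical (not an existing exporter or rewriter API), and it assumes BatchNormalization opset 14+ where training_mode is an attribute:

```python
import onnx


def freeze_batchnorm_training(model: onnx.ModelProto) -> None:
    """Drop the training-only outputs of BatchNormalization nodes that nobody
    consumes, and set training_mode=0 on those nodes.

    Sketch only; name and placement are assumptions, not an existing API.
    """
    graph = model.graph
    # Every tensor name that is consumed by some node or exposed as a graph output.
    used = {name for node in graph.node for name in node.input if name}
    used.update(out.name for out in graph.output)

    for node in graph.node:
        if node.op_type != "BatchNormalization":
            continue
        # Outputs beyond the first (running mean/var) only exist in training mode.
        extra = [o for o in node.output[1:] if o]
        if any(o in used for o in extra):
            continue  # the training outputs are consumed; leave the node alone
        # Remove the extra outputs and flip the flag.
        del node.output[1:]
        for attr in node.attribute:
            if attr.name == "training_mode":
                attr.i = 0
```

With a helper like this, post-processing an exported model would be roughly `model = onnx.load("model.onnx"); freeze_batchnorm_training(model); onnx.save(model, "model.onnx")` (file names are placeholders); the same logic could run inside the exporter on the in-memory graph before serialization, as suggested above.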
