Low precision bug for batch-norm models #1

Closed
zoharli opened this issue Nov 23, 2021 · 4 comments

zoharli commented Nov 23, 2021

When I verify a CROWN-IBP pre-trained model using alpha-beta-CROWN, it consistently shows "Result: image x prediction is incorrect, skipped". As a result, the overall verified accuracy is very low, despite the CROWN-verified accuracy being relatively normal.

I suspect this is a batch-norm support problem, because the printed log contains two new warnings:

/home/zhangheng/anaconda3/envs/alpha-beta-crown/lib/python3.7/site-packages/torch/nn/functional.py:2113: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

  if size_prods == 1:

/home/zhangheng/anaconda3/envs/alpha-beta-crown/lib/python3.7/site-packages/torch/onnx/symbolic_helper.py:680: UserWarning: ONNX export mode is set to inference mode, but operator batch_norm is set to training  mode. The model will be exported in inference, as specified by the export mode.

  training_mode + ", as specified by the export mode.")

I tried some scripts in the exp_configs folder that do not include batchnorm, and these two warnings do not appear there.

Here is the log file; please have a look, thank you!

CROWN-ibp-model-verify.txt

@huanzhang12 (Member)

@zoharli Thanks for reporting. I will try to reproduce this on my side and keep you posted.

zoharli commented Dec 1, 2021

@huanzhang12 Thank you! Below are my model definition and checkpoint file; you can try them directly. I hope this helps.
BTW, when testing with CROWN, I find that the clean accuracy sometimes comes out even lower than the CROWN accuracy. Is that normal? Thank you!

import torch.nn as nn

class Flatten(nn.Module):
    # Flattens all dimensions except the batch dimension.
    def forward(self, x):
        return x.view(x.size(0), -1)

def cifar_model_deep(num_classes=10, bn=True, channels=None):
    # Deep convolutional model for CIFAR-10 (3x32x32 inputs).
    if channels is None:
        channels = [32, 64, 64, 128, 128]
    assert len(channels) == 5
    module_list = [
        nn.Conv2d(3, channels[0], 3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(channels[0], channels[1], 4, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv2d(channels[1], channels[2], 3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(channels[2], channels[3], 4, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv2d(channels[3], channels[4], 4, stride=2, padding=1),
        nn.ReLU(),
        # nn.Conv2d(128, 256, 4, stride=2, padding=1),
        # nn.ReLU(),
        Flatten(),
        # 32x32 input downsampled by three stride-2 convolutions -> 4x4 feature maps.
        nn.Linear(channels[4] * 4 * 4, 100),
        nn.ReLU(),
        nn.Linear(100, num_classes)
    ]

    # Optionally insert a BatchNorm2d layer after every convolution (before its ReLU).
    new_module_list = []
    for m in module_list:
        new_module_list.append(m)
        if bn and isinstance(m, nn.Conv2d):
            new_module_list.append(nn.BatchNorm2d(m.out_channels))

    model = nn.Sequential(*new_module_list)
    print(model)
    return model
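
For reference, a quick sanity check on a random CIFAR-sized batch (just an illustrative snippet, not part of the checkpoint):

import torch

model = cifar_model_deep()              # defaults: 10 classes, with batch norm
model.eval()                            # use the stored batch-norm statistics
out = model(torch.randn(2, 3, 32, 32))  # random CIFAR-shaped input
print(out.shape)                        # torch.Size([2, 10])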

cifar_model_deep_cibp.zip

@huanzhang12 (Member)

@zoharli We've found the cause of the issue for this model. The problem is in the model loader, which did not put the model into eval() mode, so the batch-norm running mean and variance were unexpectedly updated during verification. This is easy to fix in utils.py:

https://github.com/huanzhang12/alpha-beta-CROWN/blob/590412434734bf80863cf37da8421f422e39ef37/complete_verifier/utils.py#L178-L180

Just add model_ori.eval() after the above lines to put the model in evaluation mode. We will fix this in our repository, but until then you can simply add model_ori.eval() yourself and it should solve this problem.
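
To see why the missing eval() call matters, here is a small self-contained demonstration (illustrative, not code from the repository): in training mode, a BatchNorm2d layer silently updates its running statistics on every forward pass, while after eval() the stored statistics are used and left unchanged:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 32, 32)

# A freshly constructed module is in training mode, so every forward
# pass updates running_mean/running_var via the momentum rule.
print(bn.running_mean)                       # tensor([0., 0., 0.])
bn(x)
print(bn.running_mean)                       # drifted away from zero

# After eval(), forward passes use the stored statistics and do not change them.
bn.eval()
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # True

This is exactly what happened here: each forward pass during verification shifted the loaded statistics away from the values in the checkpoint.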

Clean accuracy should not be lower than CROWN accuracy, since CROWN only certifies images that are already classified correctly, so the verified accuracy is a lower bound on the clean accuracy. If that still happens after adding model_ori.eval(), please let me know, thanks!

zoharli commented Dec 1, 2021

Oh, I get it. Thank you, I truly appreciate your timely help!
