
Conversation

@driazati (Contributor) commented Oct 8, 2019

This copies some logic from test_jit.py to check that a TorchScript'ed
model's outputs are the same as outputs from the model in eager mode.

To support differences in TorchScript / eager mode outputs, an
unwrapper function can be provided per-model.

These tests take a while (e.g., inception_v3 goes from 4 seconds to 3 minutes), so PYTORCH_TEST_WITH_SLOW=1 must be set for the TorchScript tests to run (there are CI changes to enable this flag). With the module caching changes coming after 1.3, this time should be dramatically reduced.

This is ready to go but needs #1407 to land first
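The check described above can be sketched without torch. The names here (`check_outputs`, the lambda stand-ins) are illustrative, not the PR's actual API; the real `checkModule` scripts the module with `torch.jit.script` and compares against eager outputs, with an optional per-model `unwrapper` to adapt output shapes:

```python
import math

def check_outputs(eager_fn, scripted_fn, args, unwrapper=None, tol=1e-5):
    """Run the same inputs through the eager and scripted callables and
    assert the outputs match. `unwrapper` adapts the scripted output
    (e.g. a scripted model returning a tuple where eager returns a value)."""
    expected = eager_fn(*args)
    actual = scripted_fn(*args)
    if unwrapper is not None:
        actual = unwrapper(actual)
    assert math.isclose(actual, expected, rel_tol=tol), (actual, expected)
    return actual

# Stand-ins: an "eager model" and a "scripted" form that wraps its output in a tuple.
eager = lambda x: x * 2.0
scripted = lambda x: (x * 2.0,)
result = check_outputs(eager, scripted, (3.0,), unwrapper=lambda out: out[0])
```

The per-model `unwrapper` is the key hook: it lets a single comparison harness handle models whose TorchScript output structure differs from eager mode.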

        else:
            self.assertNestedTensorObjectsEqual(output, expected, rtol=rtol, atol=atol)

    def assertEqual(self, x, y, prec=None, message='', allow_inf=False):
Contributor Author (@driazati):
This is all copied from PyTorch's common_utils

Contributor:

This is maybe more general than it needs to be. Would assertNestedTensorObjectsEqual suffice? This function does, for example, coercion between numbers and tensor results that you wouldn't want to allow when testing model equality. It's also a good amount of code.

Contributor Author:

assertEqual has been pretty stable and widely used in PyTorch. I think it's better to copy the entire thing than to have to move it over piece by piece later on if some functionality turns out to be missing from assertNestedTensorObjectsEqual.

Contributor:

Okay, but we shouldn't have both in this file; we should only have one.
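For context, the core of both helpers under discussion is a recursive comparison over nested containers with a tolerance at the leaves. A minimal torch-free sketch (the name `assert_nested_close` is illustrative, not either helper's real implementation):

```python
import math

def assert_nested_close(a, b, rtol=1e-5, atol=1e-8):
    """Recursively compare nested lists/tuples/dicts of numbers.
    Unlike PyTorch's assertEqual, this does no coercion between
    scalars and tensor-like results: the structure must match exactly."""
    assert type(a) == type(b), (type(a), type(b))
    if isinstance(a, dict):
        assert a.keys() == b.keys()
        for k in a:
            assert_nested_close(a[k], b[k], rtol, atol)
    elif isinstance(a, (list, tuple)):
        assert len(a) == len(b)
        for x, y in zip(a, b):
            assert_nested_close(x, y, rtol, atol)
    else:
        assert math.isclose(a, b, rel_tol=rtol, abs_tol=atol), (a, b)

# Matching nested structures within tolerance pass silently.
assert_nested_close({'logits': [1.0, 2.0]}, {'logits': [1.0, 2.0000001]})
```

The strict type check at the top is what makes this narrower than assertEqual, which is the reviewer's point: model-equality tests don't want scalar/tensor coercion.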

        else:
            self.assertEqual(a, b)

    def checkModule(self, nn_module, args, unwrapper=None, skip=False):
Contributor Author:

This is copied from pytorch/test/jit_utils.py

@eellison (Contributor) left a comment:

looks good


# as possible
SCRIPT_MODELS_TO_FIX = [
    'test_inception_v3',
    'test_fcn_resnet101',
Member:

Does this mean that the models don't give the same results in eager and torchscript?

Contributor Author (@driazati), Oct 8, 2019:

There is a bug in test_deeplabv3_resnet101: it compiles, but executing it hits pytorch/pytorch#27549. The others that were here previously were just unimplemented, but they're good to go now.

SCRIPT_MODELS_TO_FIX = [
    'test_inception_v3',
    'test_fcn_resnet101',
    'test_deeplabv3_resnet101',
Contributor:

I went to #1352 and the model results stayed the same before and after. Also, IntermediateLayerGetter is used in other models that return the same eager and script results, so it's unclear why these two models are failing.

When I ran one in script mode I had to Ctrl-C it because it was taking so long. I would guess it's a bug somewhere in the JIT runtime.

Contributor Author:

These tests do add a lot of time to the test execution time (@suo's module changes should help with this), so this PR uses PYTORCH_TEST_WITH_SLOW and @slowTest (on by default in the CI) to get around that.

@codecov-io commented Oct 8, 2019

Codecov Report

Merging #1430 into master will not change coverage.
The diff coverage is 100%.


@@           Coverage Diff           @@
##           master    #1430   +/-   ##
=======================================
  Coverage   65.97%   65.97%           
=======================================
  Files          91       91           
  Lines        7229     7229           
  Branches     1095     1095           
=======================================
  Hits         4769     4769           
  Misses       2153     2153           
  Partials      307      307
Impacted Files Coverage Δ
torchvision/models/resnet.py 88.62% <100%> (+0.06%) ⬆️
torchvision/models/mobilenet.py 87.17% <100%> (+0.16%) ⬆️
torchvision/models/detection/rpn.py 81.05% <100%> (-0.25%) ⬇️
torchvision/models/quantization/resnet.py 93.25% <100%> (ø) ⬆️
torchvision/models/quantization/shufflenetv2.py 90.16% <100%> (ø) ⬆️
torchvision/models/quantization/mobilenet.py 87.75% <100%> (ø) ⬆️
torchvision/models/shufflenetv2.py 86.04% <100%> (+0.16%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2e8bcf8...6e07772.

@fmassa (Member) left a comment:

Thanks for the PR!

The change in config.yml can't get in without a corresponding change in config.yml.in.

What are the tests that are taking too long now? Can we maybe enable them in a separate PR? We might need a separate CI job for this.

export DOCKER_IMAGE=soumith/conda-cuda
export VARS_TO_PASS="-e PYTHON_VERSION -e BUILD_VERSION -e PYTORCH_VERSION -e UNICODE_ABI -e CU_VERSION"
export PYTORCH_TEST_WITH_SLOW=1
Member:

Those changes should be done in config.yml.in; then we should call python regenerate.py, which will create the config.yml for you.

Also, there is a single CI test that runs with this configuration (gpu.medium), which doesn't get tested on Python2 for example.

I think that it might be better to separate this change from this PR, because it might require changing how we handle CI builds.

Contributor Author:

Most of the tests have a pretty significant bump in runtime

Top 10 with PYTORCH_TEST_WITH_SLOW=0:

63.04s call     test/test_models.py::ModelTester::test_memory_efficient_densenet   
60.13s call     test/test_models.py::ModelTester::test_fasterrcnn_double           
48.19s call     test/test_models.py::ModelTester::test_keypointrcnn_resnet50_fpn   
41.29s call     test/test_models.py::ModelTester::test_maskrcnn_resnet50_fpn       
36.69s call     test/test_models.py::ModelTester::test_fasterrcnn_resnet50_fpn
24.86s call     test/test_models.py::ModelTester::test_vgg19_bn
24.33s call     test/test_models.py::ModelTester::test_vgg16_bn
23.98s call     test/test_models.py::ModelTester::test_vgg13
23.49s call     test/test_models.py::ModelTester::test_vgg11_bn
23.13s call     test/test_models.py::ModelTester::test_deeplabv3_resnet101

Top 10 after PYTORCH_TEST_WITH_SLOW=1:

575.13s call     test/test_models.py::ModelTester::test_densenet121                
276.74s call     test/test_models.py::ModelTester::test_fcn_resnet101              
218.32s call     test/test_models.py::ModelTester::test_inception_v3               
182.89s call     test/test_models.py::ModelTester::test_shufflenet_v2_x1_0         
179.08s call     test/test_models.py::ModelTester::test_mobilenet_v2
171.40s call     test/test_models.py::ModelTester::test_deeplabv3_resnet101
127.75s call     test/test_models.py::ModelTester::test_googlenet
93.93s call     test/test_models.py::ModelTester::test_resnext50_32x4d
70.70s call     test/test_models.py::ModelTester::test_memory_efficient_densenet

PyTorch does have a CI job specifically for slow tests, maybe something like that would be better here too. I can separate the CI changes to a different PR, leaving this one to rely on manually setting PYTORCH_TEST_WITH_SLOW=1 to run the TorchScript tests.

Member:

Yeah, I agree. Let me add a specific CI test for this in a single configuration for now.

Member:

@driazati tests are failing for Faster R-CNN with precision issues. Can you have a look? (FYI, I force-pushed to your branch to rebase.) I'm looking into configuring CI to run the slow tests.

Your Name added 8 commits October 18, 2019 15:19
This copies some logic from `test_jit.py` to check that a TorchScript'ed
model's outputs are the same as outputs from the model in eager mode.

To support differences in TorchScript / eager mode outputs, an
`unwrapper` function can be provided per-model.
@fmassa force-pushed the driazati/checkmodule branch from 6df1d48 to b3471f9 on October 18, 2019 13:19
@driazati (Contributor Author) left a comment:

There was some jitter in the test results for a couple of models, but since there was no change to the model code, there are some new --accept results.


ACCEPT = os.getenv('EXPECTTEST_ACCEPT')
TEST_WITH_SLOW = os.getenv('PYTORCH_TEST_WITH_SLOW', '0') == '1'
TEST_WITH_SLOW = True # TODO: Delete this line once there is a PYTORCH_TEST_WITH_SLOW aware CI job
Contributor Author:

This should be deleted once the aforementioned CI job is up and running
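The gating pattern discussed here amounts to a decorator that skips a test unless the environment variable is set. A hedged, stand-alone sketch (the `slowTest` name mirrors PyTorch's decorator, but this reimplementation is illustrative, not PyTorch's actual code):

```python
import os
import unittest

# Mirrors the snippet above: the flag is read once at import time.
TEST_WITH_SLOW = os.getenv('PYTORCH_TEST_WITH_SLOW', '0') == '1'

def slowTest(fn):
    """Skip the decorated test unless PYTORCH_TEST_WITH_SLOW=1 is set."""
    return unittest.skipUnless(TEST_WITH_SLOW, 'skipping slow test')(fn)

class ModelTester(unittest.TestCase):
    @slowTest
    def test_scripted_model(self):
        # Placeholder body; the real tests run checkModule on each model.
        self.assertEqual(2 + 2, 4)
```

With the flag unset, the test is reported as skipped rather than failed, which is why the `TEST_WITH_SLOW = True` override above had to be hardcoded until a flag-aware CI job existed.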

@driazati driazati requested a review from fmassa October 31, 2019 01:37
@eellison (Contributor) left a comment:

This seems good to me to land; we probably need to rebase one more time to make sure it works on master. What do you think, @fmassa?

@fmassa (Member) commented Oct 31, 2019

There seems to be a problem with latest PyTorch binaries

AttributeError: 'RecursiveScriptModule' object has no attribute 'forward'

CI on master seems to be working OK, #1542

I'm ok merging this as soon as CI is green (that's, by the way, why I didn't merge it last week when I first rebased it).

@fmassa fmassa merged commit 227027d into master Nov 30, 2019
@fmassa fmassa deleted the driazati/checkmodule branch November 30, 2019 15:23
fmassa pushed a commit to fmassa/vision-1 that referenced this pull request Jun 9, 2020
* Add tests for results in script vs eager mode

This copies some logic from `test_jit.py` to check that a TorchScript'ed
model's outputs are the same as outputs from the model in eager mode.

To support differences in TorchScript / eager mode outputs, an
`unwrapper` function can be provided per-model.

* Fix inception, use PYTORCH_TEST_WITH_SLOW

* Update

* Remove assertNestedTensorObjectsEqual

* Add PYTORCH_TEST_WITH_SLOW to CircleCI config

* Add MaskRCNN unwrapper

* fix prec args

* Remove CI changes

* update

* Update

* remove expect changes

* Fix tolerance bug

* Fix breakages

* Fix quantized resnet

* Fix merge errors and simplify code

* DeepLabV3 has been fixed

* Temporarily disable jit compilation