Conversation

@ZolotukhinM ZolotukhinM commented Feb 13, 2020

Stack from ghstack:

It was requested in #33114.

Differential Revision: D19910600

@ZolotukhinM ZolotukhinM requested a review from apaszke as a code owner February 13, 2020 00:56
ZolotukhinM pushed a commit that referenced this pull request Feb 13, 2020
It was requested in #33114.

ghstack-source-id: b4e037b
Pull Request resolved: #33261
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Feb 13, 2020
dr-ci bot commented Feb 13, 2020

💊 CircleCI build failures summary and remediations

As of commit 07e5925:

None of the build failures appear to be your fault.

  • 1/1 recognized as flaky ❄️

Detailed failure analysis

One may explore the probable reasons each build failed interactively on the Dr. CI website.

❄️ 1 failure recognized as flaky

The following build failures have been detected as flaky and may not be your fault:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (1/1)

Step: "Test" (full log | pattern match details) ❄️

Feb 14 23:05:27 AssertionError: 6 not less than or equal to 1e-05 :
Feb 14 23:05:27 ---------------------------------------------------------------------- 
Feb 14 23:05:27 Traceback (most recent call last): 
Feb 14 23:05:27   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 175, in wrapper 
Feb 14 23:05:27     self._join_processes(fn) 
Feb 14 23:05:27   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 256, in _join_processes 
Feb 14 23:05:27     self._check_return_codes(elapsed_time) 
Feb 14 23:05:27   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 280, in _check_return_codes 
Feb 14 23:05:27     self.assertEqual(first_process.exitcode, 0) 
Feb 14 23:05:27   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 893, in assertEqual 
Feb 14 23:05:27     super(TestCase, self).assertLessEqual(abs(x - y), prec, message) 
Feb 14 23:05:27 AssertionError: 6 not less than or equal to 1e-05 :  
Feb 14 23:05:27  
Feb 14 23:05:27 ---------------------------------------------------------------------- 
Feb 14 23:05:27 Ran 88 tests in 99.362s 
Feb 14 23:05:27  
Feb 14 23:05:27 FAILED (failures=1, skipped=1) 
Feb 14 23:05:27  
Feb 14 23:05:27 Generating XML reports... 
Feb 14 23:05:27 Traceback (most recent call last): 
Feb 14 23:05:27   File "test/run_test.py", line 486, in <module> 
Feb 14 23:05:27     main() 

This comment was automatically generated by Dr. CI.

This comment has been revised 3 times.

@ZolotukhinM ZolotukhinM requested a review from suo February 13, 2020 17:43
Contributor

@eellison eellison left a comment

Code looks good. Before we go forward with this, I would like to figure out whether we even need a getCustomPostFusionPasses run. The original justification was "This is done last to give internal optimization passes priority." Do we have an actual user who would want to do this (or an idea of such a user)? If we can avoid bifurcating the system, we should.

@ZolotukhinM
Author

Thanks for taking a look! I'm now also thinking of moving pre-fusion passes closer to the fusion - probably right before it. That would allow us to use this mechanism for registering non-standard fusers if we need it. I'll check with the issue author if that would work for them.

@bwasti, do you remember the motivation to move custom passes after the fusion?

@ZolotukhinM
Author

From an offline discussion with @bwasti: the reason for the move was that we had passes that needed to run after BMM/QuantFusion/etc.

IMHO it seems a bit arbitrary that we have these two places for registering a pass, and in the future we might want to add a pass at some other location. At that point I think we would need a more flexible mechanism for specifying when to run the pass. But for now I think having these two places should be good enough: it supports the users that we know about, and it doesn't require too much complexity in the code. @eellison, @bwasti, does that sound reasonable?
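To make the "two places" concrete, here is a minimal sketch of a pipeline with one custom hook before the built-in fusion step and one after it. It is only an illustration under stated assumptions: `Graph`, `PassPipeline`, and `runPipelineExample` are hypothetical names invented for this sketch, not the actual PyTorch pass manager code.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the real graph type; it only records what ran.
struct Graph {
  std::vector<std::string> log;
};

using Pass = std::function<void(Graph&)>;

// Hypothetical pipeline with exactly two custom hook points.
struct PassPipeline {
  std::vector<Pass> customPreFusion;
  std::vector<Pass> customPostFusion;

  void run(Graph& graph) const {
    // Custom passes registered to run before the built-in fusion step.
    for (const Pass& pass : customPreFusion) {
      pass(graph);
    }
    // Placeholder for the built-in passes (fusion and friends).
    graph.log.push_back("builtin fusion");
    // Custom passes registered to run after the built-in fusion step.
    for (const Pass& pass : customPostFusion) {
      pass(graph);
    }
  }
};

// Registration order does not matter; the hook point decides when a pass runs.
std::vector<std::string> runPipelineExample() {
  PassPipeline pipeline;
  pipeline.customPostFusion.push_back(
      [](Graph& g) { g.log.push_back("custom post"); });
  pipeline.customPreFusion.push_back(
      [](Graph& g) { g.log.push_back("custom pre"); });
  Graph graph;
  pipeline.run(graph);
  return graph.log;
}
```

Adding a third location in this design means adding a third vector and a third loop, which is the inflexibility the comment above is pointing at.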

@seanprime7

seanprime7 commented Feb 14, 2020

One suggestion I can make here is to have the registration take both the pass and a priority, and then run the passes in priority order.

You can give built-in passes large enough priority gaps, e.g., DecomposeOps -> 400 and LowerSimpleTuples -> 600. Custom passes can then be registered to run before them with 0 <= priority < 400, after them with priority > 600, or between them with 400 < priority < 600.
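A rough sketch of what priority-based registration could look like, assuming passes run in ascending priority order. `PassRegistry`, `registerPass`, `Graph`, and the two priority constants are hypothetical names made up for this illustration; only the values 400 and 600 come from the suggestion above.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real graph type; it only records what ran.
struct Graph {
  std::vector<std::string> log;
};

using Pass = std::function<void(Graph&)>;

// Hypothetical registry: passes run in ascending priority order.
// std::multimap keeps keys sorted and keeps insertion order for equal keys.
class PassRegistry {
 public:
  void registerPass(int priority, Pass pass) {
    passes_.emplace(priority, std::move(pass));
  }

  void run(Graph& graph) const {
    for (const auto& entry : passes_) {
      entry.second(graph);
    }
  }

 private:
  std::multimap<int, Pass> passes_;
};

// Illustrative priorities for the built-in passes named in the suggestion.
constexpr int kDecomposeOpsPriority = 400;
constexpr int kLowerSimpleTuplesPriority = 600;

// Registers built-in and custom passes out of order and returns the run order.
std::vector<std::string> runPriorityExample() {
  auto named = [](std::string name) {
    return [name](Graph& g) { g.log.push_back(name); };
  };
  PassRegistry registry;
  registry.registerPass(kLowerSimpleTuplesPriority, named("LowerSimpleTuples"));
  registry.registerPass(500, named("CustomBetween"));
  registry.registerPass(kDecomposeOpsPriority, named("DecomposeOps"));
  registry.registerPass(100, named("CustomBefore"));
  registry.registerPass(700, named("CustomAfter"));
  Graph graph;
  registry.run(graph);
  return graph.log;
}
```

The trade-off is that the numeric priorities of built-in passes become part of the public contract, so changing them later could silently reorder custom passes.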

ZolotukhinM pushed a commit that referenced this pull request Feb 14, 2020
It was requested in #33114.

ghstack-source-id: 3654e0f
Pull Request resolved: #33261
RegisterPostFusionPass(Pass p);
};

using RegisterPass = RegisterPostFusionPass;
Contributor

I know we have like a total of 5 (five) users of this API but I wonder if it would be better to have a compiler error than a user error as we had previously.
To enumerate, we could:

  • RegisterPass = RegisterPostFusionPass
  • RegisterPass = RegisterPreFusionPass
  • Either of the above two, and add a deprecation message to RegisterPass
  • Not set an alias for RegisterPass, so that users get a compile time error.

Any thoughts here? It doesn't matter that much, especially if this is a temporary pass manager solution (although I wouldn't count on it being temporary).
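For reference, the third option could look roughly like the sketch below, which keeps the old name as an alias and attaches the standard C++14 `[[deprecated]]` attribute to it. The registry functions and the struct bodies are hypothetical placeholders written for this illustration, not the code in this PR; only the names RegisterPass, RegisterPreFusionPass, and RegisterPostFusionPass come from the discussion.

```cpp
#include <functional>
#include <utility>
#include <vector>

struct Graph {};  // hypothetical placeholder for the real graph type
using Pass = std::function<void(Graph&)>;

// Hypothetical storage. Function-local statics avoid static-initialization
// order problems when registration objects live at namespace scope.
std::vector<Pass>& preFusionPasses() {
  static std::vector<Pass> passes;
  return passes;
}

std::vector<Pass>& postFusionPasses() {
  static std::vector<Pass> passes;
  return passes;
}

struct RegisterPreFusionPass {
  explicit RegisterPreFusionPass(Pass p) {
    preFusionPasses().push_back(std::move(p));
  }
};

struct RegisterPostFusionPass {
  explicit RegisterPostFusionPass(Pass p) {
    postFusionPasses().push_back(std::move(p));
  }
};

// Option 3: existing code keeps compiling and keeps its post-fusion behavior,
// but every use of the old name produces a compiler warning.
using RegisterPass
    [[deprecated("use RegisterPreFusionPass or RegisterPostFusionPass")]] =
        RegisterPostFusionPass;
```

Dropping the alias entirely (the fourth option) turns that warning into a hard compile error, which forces each caller to pick a phase explicitly.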

Author

I don't think there is a need for a compile error here, since we can preserve the existing behavior exactly. Adding a deprecation warning is also a bit sketchy, as we don't have any specific plans to actually deprecate it.

Contributor

What we did last time is silently change semantics. I think compile time errors are generally preferable.

@facebook-github-bot
Contributor

@ZolotukhinM merged this pull request in e1a8958.

Contributor

@apaszke apaszke left a comment

There are many things happening between the pre- and post- phases in here, so maybe we shouldn't say that it's the fusion that delimits those? Presumably we will be putting even more things in there in the future. Why not call them CustomPrePasses and CustomPostPasses?

@ZolotukhinM
Author

There are many things happening between the pre- and post- phases in here, so maybe we shouldn't say that it's the fusion that delimits those? Presumably we will be putting even more things in there in the future. Why not call them CustomPrePasses and CustomPostPasses?

Fair point, I can change the names.

@facebook-github-bot facebook-github-bot deleted the gh/ZolotukhinM/155/head branch February 18, 2020 15:18
@ZolotukhinM
Author

There are many things happening between the pre- and post- phases in here, so maybe we shouldn't say that it's the fusion that delimits those? Presumably we will be putting even more things in there in the future. Why not call them CustomPrePasses and CustomPostPasses?

@apaszke, please take a look at #33674.

ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020
…h#33261)

Summary:
Pull Request resolved: pytorch#33261

It was requested in pytorch#33114.

Test Plan: Imported from OSS

Differential Revision: D19910600

Pulled By: ZolotukhinM

fbshipit-source-id: 827f1744b97f386065a21d1ba5d82c1f90edbe46

Labels

Merged oncall: jit Add this issue/PR to JIT oncall triage queue


6 participants