
Remove tophub from being the default configuration when compiling #59

Open · wants to merge 3 commits into main
Conversation

tkonolige (Contributor)

Prompted by apache/tvm#10474, this RFC proposes removing tophub from the default compilation workflow. If you rely on tophub being the implicit default, please comment on this RFC.

@masahi @comaniac
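
For background, the "implicit default" being discussed is roughly that `relay.build` wraps compilation in a TopHub context whenever no AutoTVM dispatch context has been installed by the user. A paraphrased sketch of that behavior, not the exact TVM source (`contextlib.nullcontext` is a stand-in for "do nothing"):

```python
import contextlib
from tvm import autotvm

def _implicit_tophub_context(target):
    # If the user has not installed any AutoTVM dispatch context, TopHub
    # fallback configurations are applied implicitly; otherwise the user's
    # own context wins.
    if isinstance(autotvm.DispatchContext.current, autotvm.FallbackContext):
        return autotvm.tophub.context(target)
    return contextlib.nullcontext()
```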

# Unresolved questions
[unresolved-questions]: #unresolved-questions

Will there be any actual performance impact in removing tophub from being the default?
Contributor

Based on the CI, we at least know that the MobileNet workloads on GPU require TopHub schedules to compile. The default schedules are invalid for some GPUs (e.g., the one used in CI) and result in compilation failures. Given this, we cannot guarantee that default schedules are valid for all workloads on GPU, and TopHub more or less helps cover this gap.

Thus, if we go with this RFC, my concern is how we would port the TopHub schedules to become the defaults. I could imagine we end up with decision-tree-like logic for each op to dispatch the desired default schedule, which would make TOPI even more ad hoc and complicated. I would definitely like to hear others' feedback on this.

Contributor Author

Do the default schedules not work for the specific GPU on CI, or do they not work for any GPU? My guess is they don't work for any GPU because they have never been tested, and they haven't been tested because using tophub by default means the default schedules never get exercised. To me this seems like a good reason not to use tophub by default.

Branching could be difficult. How many different targets are used in the tophub schedules?

Contributor

I might be wrong, but my impression is that a default schedule must have worked for at least one GPU at the time it was committed, because there is no point in adding an invalid schedule (though a once-valid schedule may no longer be valid...).

I also feel that branching is difficult. IIRC, TopHub CUDA covers 3-4 GPUs, and once we rely on default schedules, any future "default" schedules have to live in TOPI as well.

In short, I feel the easiest path for now is 1) removing the tophub context as you suggested, 2) making sure the default schedules can pass CI, and meanwhile 3) adding a deprecation warning.

cc @tqchen @masahi @junrushao1994 @areusch

Contributor

yeah a couple thoughts:

  • right now we have no way to assert that a particular schedule is actually selected or used by TVM. we ran into this with microTVM's original prototype where we had to hack the tuning log lookup to always return one schedule. if we had such a thing, we could write tests that assert that a particular TopHub schedule was applied and then we could document which GPU it's for.
  • it does seem like, lacking this way of testing (and correct me if i'm missing something--perhaps we do have a way to test this), we will be forced to make a hard cutover to a world where we document which hardware each schedule should compile on via some type of test
  • whatever we do here, there should be a very easy way to get back the old behavior since it's likely some folks will need to use TopHub
  • the root problem @tkonolige set out to address was differences in compiler output between different entry points. it seems like whatever we do above, we should absolutely make sure we shore up that discrepancy.

Contributor Author

  • right now we have no way to assert that a particular schedule is actually selected or used by TVM. we ran into this with microTVM's original prototype where we had to hack the tuning log lookup to always return one schedule. if we had such a thing, we could write tests that assert that a particular TopHub schedule was applied and then we could document which GPU it's for.
    That would be very nice to have. It'll be a little bit of work to pipe through though.
  • it does seem like, lacking this way of testing (and correct me if i'm missing something--perhaps we do have a way to test this), we will be forced to make a hard cutover to a world where we document which hardware each schedule should compile on via some type of test
  • whatever we do here, there should be a very easy way to get back the old behavior since it's likely some folks will need to use TopHub
    `with autotvm.tophub.context(target)` is the way to do that (added this to the RFC); see the sketch below.
  • the root problem @tkonolige set out to address was differences in compiler output between different entry points. it seems like whatever we do above, we should absolutely make sure we shore up that discrepancy.
    This is one part of the issue I've been trying to solve. The other is behavior that depends on content external to the TVM repository.
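
For concreteness, a minimal sketch of that explicit opt-in path, assuming the existing `autotvm.tophub.context` API (the workload and target below are illustrative):

```python
from tvm import autotvm, relay
from tvm.relay import testing

# Illustrative workload and target.
mod, params = testing.resnet.get_workload(batch_size=1)
target = "cuda"

# Explicitly apply TopHub fallback configurations instead of relying on an
# implicit default inside relay.build.
with autotvm.tophub.context(target):
    lib = relay.build(mod, target=target, params=params)
```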

Contributor

i think @mehrdadh is working on a test framework that selects a specific schedule for an operator. perhaps that may help?
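
For reference, one rough approximation of the "hack the tuning log lookup" workaround mentioned above is to apply a single-record AutoTVM log so that only that record can be consulted. A hedged sketch, with an illustrative log path and workload; note this still does not assert that the record was actually applied, which is the gap such a test framework would close:

```python
from tvm import autotvm, relay
from tvm.relay import testing

# Illustrative workload.
mod, params = testing.mobilenet.get_workload(batch_size=1)

# "single_record.log" is an illustrative path to an AutoTVM log that contains
# exactly one tuning record for the operator under test.
with autotvm.apply_history_best("single_record.log"):
    lib = relay.build(mod, target="cuda", params=params)
```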


Will there be any actual performance impact in removing tophub from being the default?

Should we provide a deprecation notice, or a change notice, or no notice at all?
Contributor

If we go with this RFC, I vote for a WARNING in the graph executor saying that TopHub is no longer applied automatically, please xxx (provide a link or describe how to bring TopHub back if users want it). We could further discuss how long the warning should stay. Formally, it should be there for at least one release cycle (0.9 -> 1.0), but we could make it shorter if everyone agrees.
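
Purely as an illustration of the shape such a notice could take (the wording, placement, and function name below are placeholders, not part of the RFC):

```python
import warnings

def warn_tophub_not_applied():
    # Placeholder wording; the actual message, placement (e.g. the graph
    # executor build path), and lifetime (at least one release cycle) are
    # decisions for this RFC.
    warnings.warn(
        "TopHub fallback configurations are no longer applied automatically. "
        "Wrap compilation in `with autotvm.tophub.context(target):` to "
        "restore the previous behavior.",
        DeprecationWarning,
        stacklevel=2,
    )
```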

tkonolige (Contributor Author)

I ran with and without tophub on a selection of models in tvm.relay.testing:

Execution time summary (all times in ms):

| Model        | Config         | Mean    | Median  | Max     | Min    | Std     |
|--------------|----------------|---------|---------|---------|--------|---------|
| resnet       | with TopHub    | 0.5029  | 0.1609  | 1.1974  | 0.1431 | 0.4449  |
| resnet       | without TopHub | 0.0885  | 0.0842  | 0.1020  | 0.0823 | 0.0072  |
| densenet     | with TopHub    | 1.7610  | 1.9728  | 1.9733  | 0.9131 | 0.4239  |
| densenet     | without TopHub | 4.3721  | 5.5795  | 5.5805  | 0.5085 | 1.9678  |
| inception_v3 | with TopHub    | 4.5442  | 6.2537  | 6.2550  | 0.3917 | 2.3228  |
| inception_v3 | without TopHub | 43.6982 | 64.7263 | 64.9195 | 0.3586 | 26.9297 |
| mlp          | with TopHub    | 0.0184  | 0.0185  | 0.0197  | 0.0172 | 0.0010  |
| mlp          | without TopHub | 0.0154  | 0.0152  | 0.0167  | 0.0148 | 0.0007  |
| mobilenet    | with TopHub    | 0.0872  | 0.0832  | 0.1042  | 0.0816 | 0.0086  |
| mobilenet    | without TopHub | 0.0844  | 0.0840  | 0.0896  | 0.0816 | 0.0028  |
| squeezenet   | with TopHub    | 0.1058  | 0.1051  | 0.1103  | 0.1033 | 0.0024  |
| squeezenet   | without TopHub | 0.1056  | 0.1051  | 0.1084  | 0.1040 | 0.0016  |
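
For context, a minimal sketch of how timings like these can be collected with the graph executor's `benchmark()` helper (the model, batch size, and target are illustrative; the exact script and the mechanism used to disable TopHub in the runs above are not reproduced here):

```python
import tvm
from tvm import relay
from tvm.contrib import graph_executor
from tvm.relay import testing

# Illustrative workload; the numbers above cover several tvm.relay.testing models.
mod, params = testing.resnet.get_workload(batch_size=1)
target = "cuda"
dev = tvm.cuda(0)

lib = relay.build(mod, target=target, params=params)
module = graph_executor.GraphModule(lib["default"](dev))

# Prints an "Execution time summary" with mean/median/max/min/std columns
# like the table above.
print(module.benchmark(dev))
```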

comaniac (Contributor) commented Mar 8, 2022

I ran with and without tophub on a selection of models in tvm.relay.testing:

Thanks for the experiments. What's your target device?

tkonolige (Contributor Author)

@comaniac rtx-3070

areusch (Contributor) left a comment

it would be helpful to lay out:

  1. what's the API being added here to let people specify TopHub? How should people explicitly enable it after this RFC lands?
  2. how are we shoring up the config differences between VMCompiler.optimize and VMCompiler.lower? i assume we answer this with (1), but then we need to explain the impact in the RFC PR itself.
