[Ansor] Support multiple output ops and fix Python API printing #6584
Conversation
Current issue:

Two more bug fixes:

All auto_scheduler tests now pass.
p_dag->access_analyzer = AccessAnalyzer(p_dag->tensors);
p_dag->ops = p_dag->access_analyzer->ops_topo_order;
p_dag->flop_ct = FlopEstimator().EstimateFlop(p_dag->ops);
Why did it work before without the fix?
I wish I knew the answer... For `ApplySteps` in the compute DAG, we previously just put all non-placeholder ops into the list and took the last op (assuming it is the only output) to create a new schedule. After switching to `access_analyzer.is_output` to check which ops should be put into the list, `is_output` throws exceptions due to the out-of-date `is_output` map. That's how I found this issue. Maybe most use cases so far don't enable layout rewrite?
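The stale-map failure mode described above can be sketched in a few lines. This is a hypothetical illustration, not TVM's actual API: the class and method names (`AccessAnalyzer`, `is_output`) mirror the discussion, but the internals are invented to show why querying an analyzer built before a DAG mutation (such as a layout rewrite) throws on newly created ops.

```python
# Hypothetical sketch (illustrative names, not TVM's real implementation):
# an analyzer records which ops are outputs at construction time, so any op
# created by a later DAG transform is missing from its map.
class AccessAnalyzer:
    def __init__(self, ops):
        # Assume the last op in topological order is the single output.
        self._is_output = {op: (op == ops[-1]) for op in ops}

    def is_output(self, op):
        # Raises KeyError for ops created after this analyzer was built,
        # mirroring the out-of-date is_output map described above.
        return self._is_output[op]

analyzer = AccessAnalyzer(["placeholder", "conv"])
assert analyzer.is_output("conv")

# A transform (e.g. a layout rewrite) replaces "conv" with a new op, but the
# analyzer was not rebuilt:
try:
    analyzer.is_output("conv.rewritten")
except KeyError:
    print("stale analyzer: rebuild it after mutating the DAG")
```

This is why the fix rebuilds `access_analyzer` (and the fields derived from it) from the current tensors after the DAG changes, rather than reusing the old state.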
Thanks @comaniac @merrymercy
As I'm trying to use Ansor to tune the operators generated from `te.gradient`, I fixed some issues in this PR so that Ansor can now tune the backward ops (regardless of performance). Detailed change list:

- Change `tvm.thread_axis` to `te.thread_axis`.
- Differentiate iterator names (e.g. `ax0`) across stages. For example, the `ax0` in `split` should refer to the `ax0` from `pad_temp_data_grad`, but it has been overridden by `pad_temp_shared` after `cache_read`. This PR improves `CleanName` by providing an optional prefix so that we can differentiate those iterators by their stages.

cc @merrymercy @jcf94