Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ansor] Support multiple output ops and fix Python API printing #6584

Merged
merged 6 commits into from Oct 2, 2020

Conversation

comaniac
Copy link
Contributor

As I'm trying to use Ansor to tune the operators generated from te.gradient, I fixed some issues in this PR so that now Ansor can tune the backward ops (regardless the performance). Detail change list:

  • [Multiple Output] Create a schedule with all outputs instead of the first output.
  • [Python API Printing] Change tvm.thread_axis to te.thread_axis.
  • [Python API Printing] Add prefix to all iterators to avoid name conflict; otherwise we may refer to the wrong iterator with the same name hint (e.g., ax0). For example:
ax0, ... = tuple(pad_temp_data_grad.op.axis) + ...
...
ax0, ... = tuple(pad_temp_shared.op.axis) + ...
s[pad_temp_data_grad].split(ax0, factor=4)

where ax0 in split should refer to the ax0 from pad_temp_data_grad, but it has been overridden by pad_temp_shared after cache_read. This PR improves CleanName by providing an optional prefix so that we can differentiate those iterators by their stages.

cc @merrymercy @jcf94

@comaniac comaniac marked this pull request as draft September 29, 2020 00:30
@comaniac
Copy link
Contributor Author

Current issue: te.InferBound failed for schedules with multiple outputs. Investigating.

@comaniac
Copy link
Contributor Author

Two more bug fixing:

  1. Only specify output tensors when creating a new schedule; otherwise InferBound may crash.

  2. Update compute DAG attributes such as access analyzer and ops after layout rewriting (cc @minminsun ).

All auto_scheduler tests are now passed.

@comaniac comaniac marked this pull request as ready for review September 29, 2020 01:28
@merrymercy merrymercy self-assigned this Sep 29, 2020
Comment on lines +972 to +974
p_dag->access_analyzer = AccessAnalyzer(p_dag->tensors);
p_dag->ops = p_dag->access_analyzer->ops_topo_order;
p_dag->flop_ct = FlopEstimator().EstimateFlop(p_dag->ops);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did it work before without the fix?

Copy link
Contributor Author

@comaniac comaniac Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I know the answer... For the ApplySteps in compute DAG, previously we simply put all non-placeholder ops to the list and take the last op (assuming it is the only output) to create a new schedule. After using access_analyzer.is_output to check what ops should be put to the list, is_output throws exceptions due to the out-of-date is_output map. That's why I found this issue. Maybe most use cases so far don't enable layout rewrite?

@tqchen tqchen merged commit 72969b2 into apache:master Oct 2, 2020
@tqchen
Copy link
Member

tqchen commented Oct 2, 2020

Thanks @comaniac @merrymercy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants