[AutoScheduler] Tutorial on auto-scheduling a network for GPU #6882
Conversation
force-pushed from 59496c9 to a0577cb
force-pushed from 3430105 to 018c4f4
I'm interested in running some of these tutorials to get an idea of the capabilities of the auto-scheduler. However, I think this PR contains a lot of changes/clean-up that aren't related to the tutorial. Would it be possible to break this up?
Will take another look once the items in the TODOs are addressed.
@mbaret you actually need everything in this PR to make the tutorial work as it describes. Since auto_scheduler is still an experimental feature upstream and we don't need to backport other changes to v0.7.1, I think it's fine to keep it as is.
# After auto-tuning, we can compile the network with the best schedules we found.
# All measurement records are dumped into the log file during auto-tuning,
# so we can read the log file and load the best schedules.
Better to also mention what happens and what messages you will see if there are no valid schedules in the log file.
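For readers following along, here is a minimal sketch of the compile step the quoted lines describe, using the API from the published tutorial (auto_scheduler.ApplyHistoryBest). The tiny stand-in network, the log path, and the fallback behavior noted in the comments are illustrative assumptions, not part of this PR:

import numpy as np
import tvm
from tvm import relay, auto_scheduler

# A tiny stand-in network; any Relay module is handled the same way.
data = relay.var("data", shape=(1, 3, 224, 224))
weight = relay.var("weight", shape=(16, 3, 3, 3))
out = relay.nn.conv2d(data, weight, padding=(1, 1))
mod = tvm.IRModule.from_expr(relay.Function([data, weight], out))
params = {"weight": tvm.nd.array(np.zeros((16, 3, 3, 3), dtype="float32"))}
target = tvm.target.Target("cuda")

log_file = "network_tuning.json"  # hypothetical path to the tuning log

# Replay the best schedule recorded for each task. If the log holds no
# valid schedule for a task, auto_scheduler is expected to warn and fall
# back to a default schedule (the exact message may vary by version).
with auto_scheduler.ApplyHistoryBest(log_file):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)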
Personally, the issue I find is that when commits don't describe the changes they make, it can be hard to determine which patch in history led to a certain change in behaviour (I've been burnt by this a few times now). Generally, I assume 'add tutorial' type commits won't change behaviour unless it's explicitly flagged. If all these changes are required, perhaps a middle ground would be to flag them explicitly in the commit message?
I agree with you that separating PRs for different functions is important for long-term maintenance, and we should do that for every released feature. However, my point is that since we haven't fully released auto_scheduler and the total number of auto_scheduler PRs is just a few, it should be easy to identify the PR that changed a certain behavior (in fact, since only a few people are currently using the upstream auto_scheduler, I don't think this will be an issue for now). IMHO, it's fine to follow this principle after auto_scheduler is able to perform end-to-end tuning on all three platforms (x86, ARM, NVIDIA GPU). Meanwhile, my primary concern with separating changes into many small PRs is that it will slow down the auto_scheduler upstreaming process due to the high CI traffic.
def _print_table_info(self, next_task_idx):
    # table header
    _ffi_api.PrintTitle("Task Scheduler")
    print("| ID | Latency (ms) | Speed (GFLOPS) | Trials |")
Could we extract more information, like the operator name (Conv2D, softmax, ...) and its shape (1x3x224x224)? With only the ID, we have to look up the detailed information again.
It is not easy to extract this information by parsing the compute DAG. One way to achieve this is to attach the information using the attrs in te.compute when defining ops in TOPI compute functions. I leave this to future PRs.
Ok
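For illustration, a hypothetical sketch of the attrs idea suggested above. The attribute keys are invented for this demo and are not an existing TOPI convention:

from tvm import te

# Hypothetical: attach human-readable metadata through the attrs argument
# of te.compute, so the task scheduler table could print an op name and
# shape instead of a bare task ID. The keys below are made up.
data = te.placeholder((1, 3, 224, 224), name="data")
out = te.compute(
    data.shape,
    lambda *i: data(*i) + 1.0,
    name="add_one",
    attrs={"op_name": "add_one", "workload_shape": "1x3x224x224"},
)
print(out.op.attrs)  # the metadata travels with the op into the compute DAG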
Would you agree then that a more explicit commit message would be valuable?
tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)

# Define the objective as the end-to-end execution time of the network
objective = lambda costs: sum(c * w for c, w in zip(costs, task_weights))
As someone not very familiar with the auto-scheduler, it seems a bit strange to me that this is exposed here. Could this not be a default objective?
Very good point. I changed the interface to hide this from users.
I'm not sure how helpful it is, but I'll let @merrymercy decide.
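For context, a sketch of how the interface looks after this change, assuming it matches the published tutorial (mod, params, and target as in the earlier sketch):

from tvm import auto_scheduler

# The weighted sum of task latencies is now the default objective, so the
# user only passes the tasks and their weights.
tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=200,  # a small number, just for demonstration
    measure_callbacks=[auto_scheduler.RecordToFile("network_tuning.json")],
)
tuner.tune(tune_option)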
force-pushed from e91eff0 to 3e2640b
force-pushed from 3e2640b to 4b8db6b
LGTM. Just some nits.
force-pushed from 3f91bd1 to 912e990
force-pushed from 912e990 to 9fe0a25
Let me merge this PR first because it fixed multiple bugs.
@mbaret The tutorial is online now: https://tvm.apache.org/docs/tutorials/auto_scheduler/tune_network_cuda.html
…#6882)
* add a tutorial on auto-scheduling a network for cuda
* fix typo
* fix training time printing
* fix lint
* fix
* upload logs
* fix
* use weighted sum as the default objective function
* update ci logs
* fix the bug in kill_child_processes
* fix test
* address comments
* add early stopping in task scheduler & fix a stuck issue in measurement
* fix lint
* trigger CI
* fix early stopping