
[AutoScheduler] Tutorial on auto-scheduling a network for GPU #6882

Merged — 17 commits merged into apache:main on Nov 13, 2020

Conversation

@merrymercy (Member) commented Nov 8, 2020

  • Add a tutorial on auto-scheduling a network for GPU.
  • Fix a bug in kill_child_processes.
  • Improve the interface, output messages, and early stopping of the task scheduler.
  • Improve `call_func_with_timeout`.

Todo

  • Upload pre-tuned logs into CI
  • Add a section on how to register new operators
  • Improve fallback messages and add a section to explain them

@merrymercy merrymercy force-pushed the pr-cuda-tutorial branch 2 times, most recently from 59496c9 to a0577cb Compare November 8, 2020 06:07
@merrymercy merrymercy force-pushed the pr-cuda-tutorial branch 5 times, most recently from 3430105 to 018c4f4 Compare November 8, 2020 17:54
@mbaret (Contributor) left a comment:

I'm interested in running some of these tutorials to get an idea of the capabilities of the auto-scheduler. However, I think this PR contains a lot of changes/clean-up that aren't related to the tutorial. Would it be possible to break this up?

@comaniac (Contributor) left a comment:

Will take another look once the items in the TODO list are addressed.
@mbaret you actually need everything in this PR to make the tutorial work as it describes. Since auto_scheduler is still an experimental feature in upstream and we don't need to backport other changes to v0.7.1, I think it's fine to keep it as it is.

src/auto_scheduler/feature.cc — review thread (resolved)
tutorials/auto_scheduler/tune_network_cuda.py — 8 outdated review threads (resolved)
Comment on lines 266 to 270
# After auto-tuning, we can compile the network with the best schedules we found.
# All measurement records are dumped into the log file during auto-tuning,
# so we can read the log file and load the best schedules.
Contributor commented:

Better to also mention what happens and what messages you will see if there are no valid schedules in the log file.
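For readers following along, the compile step this thread refers to looks roughly like the sketch below. It is a minimal illustration assuming the public auto_scheduler API of this period; log_file, mod, params, and target are assumed to be defined by the earlier tuning steps of the tutorial.

import tvm
from tvm import relay, auto_scheduler

# Read the tuning log and apply the best schedule found for each task while
# building the network. log_file, mod, params, and target come from earlier
# steps in the tutorial; this is only an illustrative sketch.
with auto_scheduler.ApplyHistoryBest(log_file):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)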

@mbaret (Contributor) commented Nov 8, 2020

@mbaret you actually need everything in this PR to make the tutorial work as it describes. Since auto_scheduler is still an experimental feature in upstream and we don't need to backport other changes to v0.7.1, I think it's fine to keep as it is.

Personally, the issue I find is that when commits don't describe the changes they make, it can be hard to determine what patch in history led to a certain change in behaviour (I've been burnt by this a few times now). Generally, I assume 'add tutorial' type commits won't change behaviour unless it's explicitly flagged. If all these changes are required, perhaps a middle-ground would be to flag them explicitly in the commit message?

@comaniac (Contributor) commented Nov 9, 2020

Personally, the issue I find is that when commits don't describe the changes they make, it can be hard to determine what patch in history led to a certain change in behaviour (I've been burnt by this a few times now). Generally, I assume 'add tutorial' type commits won't change behaviour unless it's explicitly flagged. If all these changes are required, perhaps a middle-ground would be to flag them explicitly in the commit message?

I agree with you that separating PRs for different functions is important for long-term maintenance, and we should do that for every released feature. However, my point is that since we haven't fully released auto_scheduler and the total number of auto_scheduler PRs is small, it should be easy to identify the PR that changes a certain behavior (in fact, since only a few people are currently using the upstream auto_scheduler, I don't think this will be an issue for now). IMHO, it's fine to follow this principle after auto_scheduler is able to perform end-to-end tuning on all three platforms (x86, ARM, NVIDIA GPU). Meanwhile, my primary concern with splitting the changes into many small PRs is that it would slow down the auto_scheduler upstreaming process due to the high CI traffic.

def _print_table_info(self, next_task_idx):
    # table header
    _ffi_api.PrintTitle("Task Scheduler")
    print("| ID | Latency (ms) | Speed (GFLOPS) | Trials |")
Member commented:

Could we extract more information, like the operator name (Conv2D, softmax, ...) and its shape (1x3x224x224)? With only the ID, we have to look up the task's detailed information again.

merrymercy (Member, Author) replied:

It is not easy to extract this information by parsing the compute DAG.
One way to achieve this is to attach the information via the attrs argument of te.compute when defining ops in TOPI compute functions.
I'll leave this to future PRs.
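To make the suggestion concrete, here is a minimal sketch (not part of this PR) of attaching metadata through the attrs argument of te.compute. The attribute keys "op_name" and "shape" are hypothetical illustrations, not an established TOPI convention.

from tvm import te

# Hypothetical illustration: attach human-readable metadata to an op so that a
# task scheduler could later print it next to the task ID.
n = 1024
A = te.placeholder((n,), name="A")
B = te.compute(
    (n,),
    lambda i: A[i] + 1.0,
    name="B",
    attrs={"op_name": "add_one", "shape": str((n,))},
)
print(B.op.attrs["op_name"])  # the metadata is then available on the compute op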

Member replied:

Ok

@mbaret (Contributor) commented Nov 9, 2020

Meanwhile, my primary concern with splitting the changes into many small PRs is that it would slow down the auto_scheduler upstreaming process due to the high CI traffic.

Would you agree then that a more explicit commit message would be valuable?

@mbaret previously requested changes Nov 9, 2020
tutorials/auto_scheduler/tune_network_cuda.py — 4 outdated review threads (resolved)
tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)

# Define the objective as the end-to-end execution time of the network
objective = lambda costs: sum(c * w for c, w in zip(costs, task_weights))
Contributor commented:

As someone not very familiar with the auto-scheduler, it seems a bit strange to me that this is exposed here. Could this not be a default objective?

merrymercy (Member, Author) replied:

Very good point. I changed the interface to hide this from users.
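For context, after this change the tutorial no longer defines the objective explicitly; the task scheduler uses the weighted sum of task latencies by default. A rough sketch of the resulting usage, assuming the auto_scheduler API of this era (mod, params, and target come from earlier steps; the log file name is arbitrary):

from tvm import auto_scheduler

tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)
# The weighted-sum objective is now the default, so users only pass the tasks
# and their weights.
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=200,  # a small number just for illustration
    measure_callbacks=[auto_scheduler.RecordToFile("network_tuning.json")],
)
tuner.tune(tune_option)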

tutorials/auto_scheduler/tune_network_cuda.py — 4 outdated review threads (resolved)
python/tvm/auto_scheduler/task_scheduler.py — outdated review thread (resolved)
@comaniac (Contributor) commented Nov 9, 2020

Would you agree then that a more explicit commit message would be valuable?

I'm not sure how helpful it is, but I'll let @merrymercy decide.

@comaniac (Contributor) left a comment:
LGTM. Just some nits.

python/tvm/auto_scheduler/measure_record.py — outdated review thread (resolved)
python/tvm/auto_scheduler/task_scheduler.py — 2 review threads (resolved)
@merrymercy merrymercy force-pushed the pr-cuda-tutorial branch 2 times, most recently from 3f91bd1 to 912e990 Compare November 13, 2020 02:25
@merrymercy merrymercy dismissed mbaret’s stale review November 13, 2020 08:36

comments are addressed

@merrymercy merrymercy merged commit 050a836 into apache:main Nov 13, 2020
@merrymercy merrymercy deleted the pr-cuda-tutorial branch November 13, 2020 08:36
@merrymercy (Member, Author) commented Nov 13, 2020

Let me merge this PR first because it fixes multiple bugs.
I will send follow-up PRs to improve the fallback mechanism in relay.build when there is no valid schedule in the log file.

@merrymercy (Member, Author) commented Nov 13, 2020

@mbaret The tutorial is online now: https://tvm.apache.org/docs/tutorials/auto_scheduler/tune_network_cuda.html
You can try it out; any feedback is welcome!

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Dec 2, 2020
…#6882)

* add a tutorial on auto-scheduling a network for cuda

* fix typo

* fix training time printing

* fix lint

* fix

* upload logs

* fix

* use weighted sum as the default objective function

* update ci logs

* fix the bug in kill_child_processes

* fix test

* address comments

* add early stopping in task scheduler & fix a stuck issue in measurement

* fix lint

* trigger CI

* fix early stopping
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Dec 4, 2020 (…#6882, same commit list as above)
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Dec 4, 2020 (…#6882, same commit list as above)