
[RFC] FamilySeer: A new search method for Auto-scheduler #57

Open · wants to merge 5 commits into base: main

Conversation

noobotdj

@comaniac (Contributor)

Thanks for the RFC. Will review next week.

@comaniac left a comment:

Thanks again for the RFC. Please see my comments.
Moreover, it would be good to have a tracking issue with a PR plan listing the PRs to be filed. In addition to the major change and unit tests, you should also have PRs for tutorials, documentation, etc.


```python
# start tuning
# tuner.tune(tune_option)  # add new parameter to the tune function
tuner.tune(tune_option, search_policy="sketch.xgb.family_op")
```
@comaniac (Contributor):

The policy name could be more informative; `family_op` is not a common term after all.



The `tuner` loads the `tune_option` into the `tune` function. There are several parameters in the `tune` function (refer to class [TaskScheduler](https://tvm.apache.org/docs/reference/api/python/auto_scheduler.html?highlight=taskscheduler#tvm.auto_scheduler.TaskScheduler)). Users can enable our method by setting the `search_policy` parameter to `sketch.xgb.family_<family_algorithm>`. We currently provide two family algorithms as options: `op` classifies subgraphs based on their core operation, and `hash` classifies subgraphs based on their operation sequence. We recommend using `op` to achieve better performance.
@comaniac (Contributor):

By this description, `op` seems not an accurate term. IIUC, both `op` and `hash` target subgraphs; the only difference is the way they determine similarity, so again, it would be good to have better naming.

In addition, since you recommend using `op` for better performance, could you elaborate on the need for `hash`?
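For reference, the two grouping strategies can be sketched in a few lines of plain Python. This is a hedged illustration with made-up task descriptors, not the RFC's actual implementation (which operates on `auto_scheduler` tasks):

```python
import hashlib

def group_tasks(task_descs, class_type="op"):
    """Group task indices into families, keyed either by the anchor op
    or by a hash of the whole operation sequence."""
    family_group = {}
    for idx, desc in enumerate(task_descs):
        if class_type == "op":
            # Assumes the second '_'-separated token is the core op.
            key = desc.split("_")[1]
        else:  # "hash": only identical op sequences share a family
            key = hashlib.md5(desc.encode()).hexdigest()
        family_group.setdefault(key, []).append(idx)
    return family_group

tasks = ["vm_conv2d_add_relu", "vm_conv2d_add", "vm_dense_add"]
print(group_tasks(tasks, "op"))         # {'conv2d': [0, 1], 'dense': [2]}
print(len(group_tasks(tasks, "hash")))  # 3: each distinct sequence is its own family
```

Under `op`, the two `conv2d` subgraphs share one family (and hence one cost model), while under `hash` they are separated because their op sequences differ.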

```python
        else:
            family_group[task_hash].append(idx)

elif class_type == "ind":
```
@comaniac (Contributor):

Is `ind` a third version? I didn't find it in the previous section.

```python
if class_type == "op":
    for idx, task in enumerate(tasks):
        task_layers = task.desc.split('_')
        if task_layers[1] not in family_group:
```
@comaniac (Contributor):

It seems to me that this is a bit too ad hoc:

  1. It relies on `task.desc`, which is supposed to be used only for user reference, and its format isn't guaranteed.
  2. It identifies the second op (i.e., `task_layers[1]`) as the anchor op, but this may not hold 100% of the time.


The accuracy of the cost model determines the search quality, but Ansor uses a monolithic cost model to predict different computation graphs (subgraphs), resulting in an accuracy loss during tuning.

The task scheduler allocates most of the time budget to the subgraphs with the most improvement potential (i.e., those with the highest latency). This approach works well at the beginning of autotuning. However, once a promising subgraph reaches its peak performance given an adequate time budget, the other subgraphs have little time budget left to reach their own peak performance.
@comaniac (Contributor):

IMHO, the solution to this problem is to improve the task scheduler strategy. Since the task scheduler predicts the peak performance of each task, as long as this prediction is accurate enough, it shouldn't spend more time on a task that has already achieved its peak performance.
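The allocation behavior under discussion can be sketched as follows. This is a simplification for illustration only: Ansor's real task scheduler uses a gradient-based strategy, and the latencies and weights here are made up:

```python
def pick_next_task(latencies, weights):
    """Give the next tuning round to the task with the largest
    weighted latency, i.e. the biggest remaining potential gain."""
    scores = [lat * w for lat, w in zip(latencies, weights)]
    return max(range(len(scores)), key=scores.__getitem__)

# Task 0 dominates end-to-end latency, so it receives the budget first ...
print(pick_next_task([5.0, 1.0, 0.5], [1, 1, 1]))  # -> 0
# ... and keeps receiving it until its latency no longer dominates, at
# which point the remaining tasks finally get time budget.
print(pick_next_task([0.4, 1.0, 0.5], [1, 1, 1]))  # -> 1
```

If the scheduler's latency estimates are accurate, a task at its peak stops dominating the scores and budget flows elsewhere; the problem described above arises when the estimates lag behind the task's true potential.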


The foresee tuning takes `task_idx_groups` (a list of subgraph families) and `skip_measures_per_round` as inputs and tunes all the subgraphs in the list.
@comaniac (Contributor):

What is `skip_measures_per_round`?

# Drawbacks
[drawbacks]: #drawbacks

When searching a larger search space (such as a larger batch size), FamilySeer performs similarly to, or sometimes worse than, Auto-scheduler. This is because a larger search space requires more time before the cost model can provide accurate predictions. Deploying an inaccurate cost model during foresee tuning may result in spending the time budget on non-improving code transformations.
@comaniac (Contributor):

It seems to me that this can be improved somehow. For example, at the warmup stage (e.g., the first N trials), all cost models could share the same training data from all tasks, so that you get the same behavior as the current auto-scheduler. Afterward, you apply different data to each cost model to benefit from the task groups.
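This warmup idea can be sketched roughly as below. All names are hypothetical (this is not FamilySeer's or TVM's API), and `CountingModel` stands in for any trainable cost model:

```python
class CountingModel:
    """Stand-in cost model that just records how many samples it trained on."""
    def __init__(self):
        self.n_seen = 0

    def fit(self, records):
        self.n_seen = len(records)

def update_cost_models(models, records_by_family, trials_done, warmup_trials):
    """Warmup phase: every family's model trains on the shared pool of all
    measurements (matching the current auto-scheduler's behavior).
    Afterward: each model trains only on its own family's records."""
    all_records = [r for recs in records_by_family.values() for r in recs]
    for family, model in models.items():
        if trials_done < warmup_trials:
            model.fit(all_records)
        else:
            model.fit(records_by_family[family])

models = {"conv": CountingModel(), "dense": CountingModel()}
records = {"conv": [0.10, 0.20], "dense": [0.30]}
update_cost_models(models, records, trials_done=10, warmup_trials=64)
print(models["conv"].n_seen, models["dense"].n_seen)  # 3 3  (shared warmup data)
update_cost_models(models, records, trials_done=100, warmup_trials=64)
print(models["conv"].n_seen, models["dense"].n_seen)  # 2 1  (per-family data)
```

The warmup threshold trades off early prediction accuracy (more shared data) against later specialization (per-family data), which is exactly the tension the drawback above describes.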

# Prior art
[prior-art]: #prior-art

Please refer to [this paper](https://arxiv.org/abs/2201.00194).
@comaniac (Contributor):

This is not "prior" art lol.
You should just put this link in the summary section. For this section, you should list other related work (e.g., Ansor, FlexTensor, AutoTVM, etc.).

# Unresolved questions
[unresolved-questions]: #unresolved-questions

Our search method is up for [discussion](https://discuss.tvm.apache.org/t/rfc-familyseer-a-new-search-method-for-auto-scheduler/11877).
@comaniac (Contributor):

This is not an unresolved question. Unresolved questions should be the technical difficulties or drawbacks of the proposed approach.

- Feature Name: FamilySeer: A new search method for Auto-scheduler
- Start Date: 2021-01-07
- RFC PR: [apache/tvm-rfcs#57](https://github.com/apache/tvm-rfcs/pull/57)
- GitHub Issue: [apache/tvm#9875](https://github.com/apache/tvm/pull/9875)
@comaniac (Contributor):

The link you put is the pull request but not the tracking issue. Please open a new issue to track the PR progress and put the link here.

@comaniac comaniac self-assigned this Feb 22, 2022