[RFC] [AutoTVM] Implementing an auto-tuning library/cache #4150
Comments
Thank you for this proposal. This is helpful for managing local log files. One question about 'with config_library: relay.build(...)': what is the relationship between config_library and the autotvm dispatch context? It seems that this design replaces the dispatch context with config_library. How are different dispatch contexts managed in this case?
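For context, a minimal sketch of how the existing dispatch context is used today versus how the proposed sugar might look; mod, params and config_library are assumed to be defined elsewhere, and the 'with config_library' form is only the proposal, not an existing API.

```python
from tvm import autotvm, relay

# Existing usage: optimal configs are applied through an explicit dispatch
# context built from a log file (mod/params assumed to be defined already).
with autotvm.apply_history_best("tuning_records.log"):
    lib = relay.build(mod, target="llvm", params=params)

# Proposed sugar (hypothetical): the config library would construct an
# equivalent dispatch context from the optimal configs it stores.
with config_library:
    lib = relay.build(mod, target="llvm", params=params)
```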
Thanks for the RFC. I like the idea of the config library concept. Some concerns/questions:
In this way, we can implement the resume logic you proposed in the constructor of tuners.
I agree with you that when loading the history, in addition to checking whether a historical config matches the current task in terms of target, op name, shapes and attributes, we also need to check the device as you mentioned. We could retrieve the target device info via a system call and add it to every record when dumping to file/database. In my opinion, we also need to invalidate the history/records when TVM has been updated, because the same config may give different performance on the same device after an update. The simplest way is checking the timestamp. For example, we could let users configure an expiration time when creating a tuner:
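A minimal sketch of the expiration idea, assuming records carry a timestamp field; the `expire_after` argument and the record layout are hypothetical, not existing AutoTVM API.

```python
import time

EXPIRE_AFTER = 30 * 24 * 3600  # e.g. treat records older than 30 days as stale

def fresh_records(records, expire_after=EXPIRE_AFTER):
    """Keep only records whose timestamp falls within the expiration window."""
    now = time.time()
    return [r for r in records if now - r["timestamp"] <= expire_after]

# A tuner could then be seeded only with non-expired history, e.g.:
# tuner = SomeTuner(task, expire_after=EXPIRE_AFTER)   # hypothetical argument
```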
In this example, a user wants the tuner to load only the records that have not expired. Again, thanks for the proposal; any of my suggestions can be adjusted/discussed if they are too overwhelming or unnecessary.
Thanks @kevinthesun and @comaniac for the responses!
I'm not intending to replace the existing dispatch context, only to provide some syntactic sugar. We could just override the
Does it make sense to generalise here? As far as I can tell, TopHub doesn't store tuning history, just optimal configs, so there's no way to 'resume' a TopHub tuning session. In some way we have to determine whether the existing 'tuning effort' used to produce a particular config is sufficient, and the number of trials is the only obvious way I can think of to characterise this. I'd be happy to look at any alternative implementation idea though.
This would be a good start, but I think this also needs to be something a user can fully specify. For instance, we might be interested in driver versions, memory clock speeds or even physical parameters such as board cooling. Which system calls were you considering using to determine the platform? Perhaps have a default method that relies on these calls, with the ability to pass additional arbitrary info to
I agree with this, but maybe it can be included as part of the previous point on board configuration? In a general sense we need an idea of whether a particular config is 'compatible' with our current platform, and I think it's reasonable to include the TVM version as part of this.
Thanks for the responses; I think they are valuable. I have embedded my opinions with yours and will leave the dispatch context question for @kevinthesun. Also cc @tqchen and @icemelon9 for their input.
I agree with you that TopHub serves a different purpose if we consider the trial number in the resume logic, but they can still share the same implementation and history format in the way I suggested. My concern with using the trial number is that it limits the use case of this RFC to resuming interrupted tuning and nothing else, such as transferring the tuning process to others, or reusing the configs of a 2000-trial random search to launch a new grid search, etc. Alternatively, we could decouple the history from a specific tuning process. Specifically, we do not add any tuning-process-specific information to the config library; we just let the tuner determine whether it can reuse a result from the config library when it needs to measure that config. For example, if the tuning process was interrupted at the 50th trial, we have 50 configs in the library. When resuming the tuning, the tuner still starts from scratch, but it could save the time of measuring those 50 configs if it follows the same tuning process. One advantage is that this scenario applies to different tuners or even different models with the same task. One drawback of my alternative compared to yours is that if the tuning process is non-deterministic (e.g., random search) then we might spend time tuning different configs, but this can be worked around by either exposing an optional random seed argument in the tuner (such as
I have the same question actually. This part is relatively vague and probably needs others' input.
Your response reminded me that the current config history already includes version information, although it is always 0.1. Not sure if we can make use of it and save some effort.
Thanks for the helpful discussion. Some of the common themes that I see:
It would be great if we could dissect the discussion, e.g. reach a consensus on the metadata format we prefer, and then talk about possible config library behaviours and the possibility of implementing different variants of libraries.
@comaniac I think I understand where our different approaches are coming from. I was proposing that only the optimal configurations be permanently saved to the config library (as with TopHub), with a temporary log file of tuning configs maintained only during a tuning job. Storing all of the tuning history would rapidly result in huge files, which I think would be fine for a database but seems unwise for text files (in terms of search performance). From my experience using AutoTVM, interrupted tuning sessions most often occur while tuning a large multi-layer network. In this case, I mostly care about skipping the layers that have already been fully tuned; restarting the partially tuned layer from scratch is often not a significant time penalty in comparison. I see that this approach is not nearly as good in a workflow that involves iteratively tuning a network more and more, in which case you would save a significant amount of time by being able to resume using the tuning history. A compromise between the two options might be, as you said, making the tuners deterministic. That way, just by knowing the number of trials we can determine which configs can be skipped without needing to store the entire history. I don't think this can be made to work with the xgb tuner though (maybe just treat that as a special case?).
@mbarrett97 I see your point. If the problem is narrowed down to "skip some tasks in a model when resuming tuning that was accidentally interrupted", then your proposal is a lightweight working solution. Maybe we can file another RFC focusing on more general history reuse support. Coming back to your proposal, the current solution is using
@comaniac Having given this some thought, I think it's reasonable to support both approaches. I didn't want to include full logs because I was hoping to also be able to use the config library to distribute tuned configs; however, it should be fine to just 'export' a config library with only the optimal configs. In that case, I propose the following. Have each auto-tuning session create a new 'job'. This job will have an entry in a JSON file (the 'job index') containing at least the target string, the start/finish time of the job and a path to the history file generated. Optionally, we permit some arbitrary JSON to describe the platform in more detail. By default, we delete the history file when a job completes (but keep the job entry in the index); however, a flag can be passed to retain the history. Now if a task needs to be resumed, first a simple check can be done to see if the existing optimal config has already been tuned with sufficiently many trials (and with the right tuner/platform). If so, skip it; otherwise, search the job index to see if any history files qualify to restart the tuning. In that case, we can use your proposal.
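To make the proposal concrete, here is a hypothetical example of what a single entry in the 'job index' might look like; all field names and values are illustrative, not a fixed format.

```python
import json

# Hypothetical job index entry; field names are only examples.
job_entry = {
    "target": "opencl -device=mali -model=hikey960",
    "start_time": "2019-10-21T09:30:00Z",
    "finish_time": "2019-10-21T14:05:00Z",
    "history_file": "jobs/hikey960/job_0001.log",  # deleted on completion unless retained
    "keep_history": False,
    "platform": {                                  # arbitrary user-supplied description
        "gpu_driver": "mali-r19p0",
        "gpu_clock_mhz": 807,
    },
}
print(json.dumps(job_entry, indent=2))
```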
For local log file management, how about we store the best K schedules for each workload? Users can choose how many schedules they would like to keep.
Got your point, although I think you can always pick the best config before distribution, as in the current AutoTVM use case. Currently AutoTVM logs all configs to a JSON file, and if a user only wants to keep the best one, she uses
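As a point of reference, one existing way to keep only the best record per workload from a full log is AutoTVM's pick_best helper; a rough usage sketch, with placeholder file names:

```python
from tvm import autotvm

# Extract the best record per workload from a full tuning log before
# distributing it (input/output paths are placeholders).
autotvm.record.pick_best("full_tuning.log", "best_configs.log")
```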
If I understand correctly, you are going to add tuning process metadata to the JSON file in addition to the configs, like the example code snippet you proposed at the very beginning of this RFC. Since you propose to use the config library as the "database" to log all configs (the argument of ...). Another suggestion is naming.
Yeah, I think this part is relatively clear.
@mbarrett97 I wonder why not just use the transfer learning support in AutoTVM. After using transfer learning, AutoTVM will skip the tasks that have been tried before. See the example at
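For reference, the transfer-learning pattern from the existing tuning tutorials looks roughly like this; `task` and `tmp_log_file` are assumed to come from the surrounding tuning script.

```python
import os
from tvm import autotvm

# Seed the tuner with previously collected results so that configs covered by
# the existing log are not measured again (pattern from the tuning tutorials).
tuner = autotvm.tuner.XGBTuner(task, loss_type="rank")
if os.path.isfile(tmp_log_file):
    tuner.load_history(autotvm.record.load_from_file(tmp_log_file))
```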
@icemelon9 This suggestion is more about infrastructure, so that we're not required to keep track of individual log files and how they were produced. We need this to decide whether or not we can skip a task based on existing results. @comaniac @kevinthesun I've updated the PR to include more concretely the ideas being discussed. I think an auto-tuning 'job' is distinct from a task, as I am using it to refer to a series of tasks tuned sequentially (eg. tuning a network would be a 'job'). A JSON file containing all of the jobs is produced, which contains information such as the start/finish time of the job, target/platform parameters and, importantly, the optimal configs for each task in the job. In principle this would allow you to 'revert' an auto-tuning job from the config library if you discovered you'd done something invalid during the job (I've done this a few times...). Whether the entire history of a job is kept can be controlled by a flag. I'm hacking one of the tutorial scripts to use the config library mechanism instead,
Some comments after reading the example and the current PR.
@comaniac I've done some refactoring to disentangle 'TuningJob' from the ConfigLibrary. The tuning loop now looks like this:
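A hypothetical reconstruction of that loop, assuming the TuningJob and ConfigLibrary names discussed in this thread; constructor arguments and method names are illustrative and may differ from the PR.

```python
from tvm import autotvm

# ConfigLibrary and TuningJob are assumed to come from the PR's autotvm
# additions; the arguments shown here are a sketch, not the exact interface.
library = ConfigLibrary("configs/")
job = TuningJob(library=library, log_file="job.log", keep_history=False)

with job:                           # enters the global tuning scope
    for task in tasks:              # `tasks` extracted from the model beforehand
        tuner = autotvm.tuner.XGBTuner(task)
        tuner.tune(
            n_trial=1000,
            measure_option=measure_option,  # assumed to be defined elsewhere
        )
```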
Using 'with job' puts the job into the global tuning scope. The job will then automatically register its own callback with the tuners, along with a new tuner method. If you don't specify a ConfigLibrary with a job, it will just log all the results to the specified log file. Config files are indexed within the library by target, so to use configs from the library you can simply look them up by target. I've updated my PR (#4151) accordingly. Note that the PR does not include every feature discussed here, but is intended as initial infrastructure on top of which more advanced features can be developed.
I went through the new proposal and the PR. This looks much better to me from the perspective of functionality. One concern in my mind is long-term maintenance. It seems like we will have more and more new features dealing with a set of tasks. As @tqchen mentioned in another RFC, it might be better to make a task pass manager to manage such processes, but we should be able to integrate this one with the task pass manager later on once we have it ready. @eqy @kevinthesun do you guys have any other concerns?
Auto-tuning currently relies on manually keeping track of various log files. This can quickly become quite unwieldy when tuning for many different devices, trying to do partial tuning or restarting a tuning session.
Proposals
Create an offline library of auto-tune configurations into which you can feed auto-tuning logs and have the optimal configurations saved. The library should store not just the configuration, but also the tuning conditions (eg. tuner + no. of trials). This way, it is possible to check whether or not 'sufficient' tuning has already been done on a particular task, and if so, that task can be skipped. I propose an interface to the library which would make a typical auto-tuning loop look something like the following:
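A hypothetical sketch of that loop under stated assumptions: ConfigLibrary, is_tuned and log_results are illustrative names, not a fixed interface.

```python
from tvm import autotvm

# Hypothetical sketch: the library decides whether a task already has a
# 'sufficiently tuned' config and skips it if so.
library = ConfigLibrary("configs/")

for task in tasks:                              # `tasks` extracted beforehand
    if library.is_tuned(task, n_trial=1000):    # enough trials already recorded?
        continue
    tuner = autotvm.tuner.XGBTuner(task)
    tuner.tune(
        n_trial=1000,
        measure_option=measure_option,          # assumed to be defined elsewhere
        callbacks=[library.log_results(task)],  # stream results into the library
    )
```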
You would then use the library with something as simple as:
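For example, something along these lines; the load-by-target method name is hypothetical, and mod, params and target are assumed to be defined elsewhere.

```python
from tvm import relay

# Hypothetical usage at compile time: the library supplies the optimal configs
# for the given target, much like apply_history_best does for a log file.
with config_library.load(target):               # method name is an assumption
    lib = relay.build(mod, target=target, params=params)
```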
Additional Thoughts
In order to reliably interact with existing records in the library, you need to be able to determine the exact platform/device that the tuning was performed on. I currently use the '-model' parameter to store this information (eg. -model=hikey960), but it would be better to be able to store some arbitrary JSON object here so that additional platform configuration options can be specified (eg. clock speeds, driver versions, etc.).
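For illustration, the difference between encoding the device in the target string and attaching a richer, arbitrary description; the field names below are only examples.

```python
# Today: the device is identified only via the target string's -model attribute.
target = "opencl -device=mali -model=hikey960"

# Proposed: an arbitrary JSON-like description stored alongside the records.
platform_info = {
    "model": "hikey960",
    "gpu_driver": "mali-r19p0",
    "gpu_clock_mhz": 807,
    "cooling": "passive",
}
```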
The current logging system is also heavily reliant on writing essentially flat text files. A config library would probably be better suited to a NoSQL/JSON database; however, for now I've stuck to keeping it flat.
My WIP PR is here #4151.
Comments/suggestions are welcome!
Implementation