This repository contains the data and code for the following paper:
Accuracy can Lie: On the Impact of Surrogate Model in Configuration Tuning
- Table of Contents
- Introduction
- Code and quick start
- Datasets
- Raw experiments results
- RQ_supplementary
To ease the expensive measurements during configuration tuning, it is natural to build a surrogate model as a replacement for the system, so that configuration performance can be evaluated cheaply. Yet a stereotype therein is that the higher the model accuracy, the better the tuning result, and vice versa. This 'accuracy is all' belief drives our research community to build more and more accurate models and to criticize a tuner for the inaccuracy of the model it uses. However, this practice raises some previously unaddressed questions, e.g., are the model and its accuracy really that important for the tuning result? Do the somewhat small accuracy improvements reported in existing work (e.g., a few % error reduction) really matter much to the tuners? What role does model accuracy play in tuning quality? To answer these questions, in this paper we conduct one of the largest-scale empirical studies to date, running over a period of 13 months.
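To make the idea concrete, below is a minimal, self-contained sketch of such a surrogate-assisted loop. The toy "system", the random sampling, and the choice of RandomForestRegressor are illustrative assumptions only, not the exact setup used in the paper:

```python
# Minimal sketch of surrogate-assisted tuning (illustrative assumptions only).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(101)

def measure(config):
    """Stand-in for an expensive real measurement of the system."""
    return np.sum((config - 0.3) ** 2) + rng.normal(scale=0.01)

# 1. Measure a small training sample of configurations (the expensive part).
X = rng.uniform(0, 1, size=(30, 5))
y = np.array([measure(x) for x in X])

# 2. Fit a surrogate model on the measured sample.
surrogate = RandomForestRegressor(n_estimators=100, random_state=101).fit(X, y)

# 3. Use the surrogate to cheaply screen many candidate configurations...
candidates = rng.uniform(0, 1, size=(10_000, 5))
best = candidates[np.argmin(surrogate.predict(candidates))]

# 4. ...and only measure the most promising one on the real system.
print("predicted best:", best, "measured:", measure(best))
```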
- code
-- Data (Datasets; each target performance column needs to start with "$<", as illustrated in the sketch after the quick start below)
-- batch (Batch model-based tuners)
-- models (Surrogate models)
-- sequential (Sequential model-based tuners)
-- util (Util for tuners)
-- utils (Utils for models)
-- requirements.txt (Essential requirements to be installed)
-- run.py (A simple run on the system "7z"; the working path is "./model-impact/code")
The code requires Python 3.8+. To run it, set "./model-impact/code" as the working path and install the essential requirements:
pip install -r requirements.txt
Then run the following for a quick start:
python3 run.py
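For reference, here is a hedged sketch of how a dataset under code/Data could be split into options and targets, relying only on the convention stated above that target (performance) columns start with "$<"; the file name "7z.csv" is assumed for illustration:

```python
# Split a dataset into configuration options and performance targets,
# using the stated "$<" prefix convention (file name is an assumption).
import pandas as pd

df = pd.read_csv("Data/7z.csv")

target_cols = [c for c in df.columns if c.startswith("$<")]
option_cols = [c for c in df.columns if not c.startswith("$<")]

X = df[option_cols]  # configuration options
y = df[target_cols]  # performance objective(s) to be tuned
print(option_cols, target_cols)
```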
The datasets are originally from the following sources:
https://zenodo.org/records/7544891#.ZDQzsMLMLN8:
- Brotli
- XGBoost
- DConvert
- 7z
- ExaStencils
- Kanzi
- Jump3r
- Spark
https://github.com/DeepPerf/DeepPerf:
- LLVM
- BDBC
- HSQLDB
- Polly
- JavaGC
https://zenodo.org/record/7504284#.ZDQ66sLMLN8:
- Lrzip
https://github.com/FlashRepo/Flash-MultiConfig:
- noc-CM-log
- SaC
https://github.com/pooyanjamshidi/deeparch-xplorer:
- DeepArch
https://github.com/anonymous12138/multiobj:
- MariaDB
https://drive.google.com/drive/folders/1qxYzd5Om0HE1rK0syYQsTPhTQEBjghLh:
- Polly
https://github.com/xdbdilab/CM-CASL:
- Spark
- Redis
- Hadoop
- Tomcat
https://github.com/ai-se/BEETLE:
- Storm
We thank the original authors for their efforts. Details of the datasets are given in our paper.
The experiment data reported in this work can be found at: https://zenodo.org/records/11172102. The result files are named as follows:
Result: PickleLocker_[tuner]_results/[Data_big or Data_small]/[system]/[model]_seed[number].csv
- e.g. "./PickleLocker_atconf_results/Data_big/7z/RF_seed101.csv"
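Following that naming rule, the result files for one tuner/system/model combination could be gathered like this (a hypothetical sketch; the schema of the CSVs themselves is not described here):

```python
# Collect all Random Forest runs of the "atconf" tuner on 7z, across seeds.
# Paths follow the naming rule above; the column contents are not assumed.
import glob
import pandas as pd

paths = glob.glob("PickleLocker_atconf_results/Data_big/7z/RF_seed*.csv")
runs = {p: pd.read_csv(p) for p in paths}
for path, df in runs.items():
    print(path, df.shape)
```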
RQ_supplementary contains the supplementary files for our research questions.