

Code associated with the ICLR 2023 paper AANG: Automating Auxiliary Learning


  title={AANG: Automating Auxiliary Learning},
  author={Dery, Lucio M and Michel, Paul and Khodak, Mikhail and Neubig, Graham and Talwalkar, Ameet},
  journal={arXiv preprint arXiv:2205.14082},

Notes on Installation

First, set up the original environment associated with the Dont-Stop-Pretraining paper.

conda env create -f environment.yml
conda activate domains

Note that extra packages can be found in this file:


If you run into package-not-found errors when running after the installation above, you can look up the appropriate versions in the above file.

Data Formatting

See datasets/* for an example dataset.

The code expects the data to be formatted as {train, test, dev}.jsonl files.

The code for creating auxiliary objectives expects train.jsonl to be converted to a .txt file by simply concatenating the text (without the labels) from train.jsonl.
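
The conversion described above could be sketched as follows. This is a minimal illustration, not the repository's own script: the function name jsonl_to_txt, the "text" field name, and the choice to write one example per line are all assumptions, so adjust them to match your records.

```python
import json


def jsonl_to_txt(jsonl_path, txt_path, text_key="text"):
    """Write the text field of every JSONL record to txt_path, dropping labels.

    text_key is an assumed field name; change it if your records differ.
    Returns the number of records written.
    """
    count = 0
    with open(jsonl_path, encoding="utf-8") as src, \
            open(txt_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)
            # Collapse internal whitespace so each example stays on one line.
            dst.write(" ".join(record[text_key].split()) + "\n")
            count += 1
    return count
```

Calling jsonl_to_txt on a train.jsonl path and a train.txt path would then produce the text-only file; only the chosen text field is copied, so labels never reach the output.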

To add a new dataset, create a dict like the one at the end of the file that holds all the information about the task data. Also update the get_task_info method with the new task details.


Command flags are formatted as {example}{description}:

-task              {citation_intent}{name of task. These are listed in the function -- get_task_info()}
-base-spconfig     {citation.supervised}{name of the search space -- list of search space names in AutoSearchSpace/ in the get_config()}       
-patience          {20}{How long to keep running after validation set performance has plateaued before ending training}
-grad-accum-steps  {4}{Number of gradient accumulation steps. This already takes into account the total batch size, so there is no need to update the batch size if this is changed}
-exp-name          {SUPERVISED}{Name given to the experiment}
-gpu-list          {"[0, 1]"}{string array of the list of gpus to use. The script will automatically split hyper-parameters runs amongst these gpus} 
-hyperconfig       {partial_big}{Name of the hyper-parameter config to explore. The list is present in get_hyper_config().}
-runthreads        {}{this is a flag. Turn this off if experiments have already been run and you just want to re-aggregate results}
-pure-transform    {}{this is a flag. This determines whether the corruption transforms are pure transforms (replace only, mask only) versus mixed transforms as with BERT}
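
As an illustration of the -gpu-list format, the sketch below parses a string such as "[0, 1]" and divides a list of hyper-parameter runs among the listed GPUs. The helper names parse_gpu_list and split_runs are hypothetical, and the round-robin assignment is an assumption; the script's actual splitting strategy may differ.

```python
import json


def parse_gpu_list(gpu_list):
    """Parse a string such as "[0, 1]" into a list of integer GPU ids."""
    gpus = json.loads(gpu_list)
    if not isinstance(gpus, list) or not all(isinstance(g, int) for g in gpus):
        raise ValueError(f"expected a list of integers, got {gpu_list!r}")
    return gpus


def split_runs(configs, gpus):
    """Assign configs to GPUs round-robin (an assumed strategy).

    Returns a dict mapping each GPU id to the configs it would run.
    """
    assignment = {gpu: [] for gpu in gpus}
    for i, config in enumerate(configs):
        assignment[gpus[i % len(gpus)]].append(config)
    return assignment
```

With two GPUs and three runs, GPU 0 would receive the first and third run and GPU 1 the second.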

To run a single hyper-parameter configuration, inspect the get_base_runstring function and populate it with your hand-designed hyper-parameters.

Important hyper-parameters

soptlr : Learning rate for weighting between primary and auxiliary objectives
aux-lr : Learning rate for weighting amongst auxiliary objectives. 
classflr : Overall learning rate for task.

Hyper-parameters for fitting the dev-head as in META-TARTAN can be found in the function AutoSearchSpace/ - add_modelling_options(). They are set to the defaults used in the META-TARTAN paper.

To remove meta-learning and use static multitasking instead, set soptlr = aux-lr = 0 and the default equalized weightings will be used. Note that the current implementation of static multitasking is not faster than the meta-learning approach, because it only sets the weighting learning rates to 0 (all the overhead from computing meta-gradients is still incurred). Users are free to re-implement multitasking efficiently.
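
The effect of a zero weighting learning rate can be illustrated with a small sketch. It is not the repository's implementation: it assumes, for illustration only, that the objective weights are a softmax over logits updated by plain gradient descent, and it collapses the two weighting levels (soptlr and aux-lr) into a single weight vector. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_weight_logits(logits, meta_grads, lr):
    """One gradient-descent step on the weighting logits."""
    return [x - lr * g for x, g in zip(logits, meta_grads)]


def combined_loss(losses, logits):
    """Weighted sum of per-objective losses using softmax weights."""
    weights = softmax(logits)
    return sum(w * loss for w, loss in zip(weights, losses))


# With lr = 0 the logits never move, so equal initial logits keep
# equal weights no matter what the meta-gradients are.
logits = step_weight_logits([0.0, 0.0, 0.0], [0.5, -1.0, 2.0], lr=0.0)
static_weights = softmax(logits)  # each weight stays at 1/3
```

The meta-gradients are still passed in and multiplied by zero here, which mirrors the note above: the cost of computing them is paid even though they no longer change the weights.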

Addendums on Running

The run commands are in

If you run with the appropriate settings, results will be saved as a csv in resultsSheet, which you can analyze. Data for one dataset, citation_intent/ACL-ARC, has been provided. Data for other tasks can be obtained by following the instructions listed here:

Experiments were run on A100 or A6000 GPUs. Large-memory devices are preferred because of the meta-learning approach. If you have memory issues you can increase


which will accumulate gradients over more steps with smaller batches
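
The trade-off between accumulation steps and per-step batch size can be sketched as below. This is a generic illustration of gradient accumulation with a fixed total batch size, not the repository's training loop; the function names and the total batch size of 32 in the usage note are hypothetical.

```python
def micro_batch_size(total_batch_size, grad_accum_steps):
    """Per-step batch size when the total batch size is held fixed."""
    if grad_accum_steps < 1:
        raise ValueError("grad_accum_steps must be at least 1")
    if total_batch_size % grad_accum_steps != 0:
        raise ValueError("total batch size must be divisible by the step count")
    return total_batch_size // grad_accum_steps


def accumulated_mean(per_example_grads, grad_accum_steps):
    """Average gradient obtained by accumulating over equal micro-batches.

    Each micro-batch mean is scaled by 1 / grad_accum_steps, so the sum
    equals the mean over the full batch.
    """
    size = micro_batch_size(len(per_example_grads), grad_accum_steps)
    total = 0.0
    for step in range(grad_accum_steps):
        chunk = per_example_grads[step * size:(step + 1) * size]
        total += (sum(chunk) / len(chunk)) / grad_accum_steps
    return total
```

For example, micro_batch_size(32, 4) gives 8 examples per step and micro_batch_size(32, 8) gives 4, so doubling the accumulation steps halves the number of examples held in memory at once while the accumulated gradient matches the full-batch average.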

Results are checkpointed into a folder called autoaux_outputs, which can get big. You can either clear it out regularly or reduce the checkpoint frequency in the code.


The best checkpoints for ACL-ARC and HYPERPARTISAN tasks are linked here.

Evaluation Procedure for Paper Results

We run each method and associated hyperparameter configuration for 3 seeds. For each seed, we early stop on the validation set and take the checkpoint with the best validation performance -- we then evaluate this checkpoint on the test set. The score of a hyper-parameter configuration is the average over the test performances for the different seeds.

For all methods, we rank the hyper-parameter configurations and take the one with the best test performance.
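
The procedure above can be summarized in a short sketch. It assumes higher scores are better and represents each seed's run as a list of (validation score, test score) pairs, one per checkpoint; this data layout and the function names are hypothetical and are not taken from the repository's results format.

```python
from statistics import mean


def best_checkpoint_test_score(checkpoints):
    """Test score of the checkpoint with the best validation score.

    checkpoints: list of (val_score, test_score) pairs for one seed.
    """
    _, test_score = max(checkpoints, key=lambda pair: pair[0])
    return test_score


def score_config(seed_runs):
    """Average, over seeds, of the test score at the best-validation checkpoint."""
    return mean(best_checkpoint_test_score(run) for run in seed_runs)


def select_best_config(results):
    """Rank configurations by score and return (name, score) of the best.

    results: dict mapping a configuration name to its list of seed runs.
    """
    scores = {name: score_config(runs) for name, runs in results.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Note that checkpoint selection within a seed uses only validation scores, while the final ranking across configurations uses the seed-averaged test scores, as described above.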


