VITA-Group/Junk_DNA_Hypothesis
Lu Yin*, Ajay Jaiswal*, Shiwei Liu, Souvik Kundu, Zhangyang Wang

University of Texas at Austin, Eindhoven University of Technology, University of Oxford, Intel Labs

The authors can be contacted at l.yin@tue.nl.


Installation

Please check INSTALL.md for installation instructions.

Varying Task Difficulty

We provide a quick overview of the arguments:

  • --model_name_or_path: the model identifier on the Hugging Face model hub.
  • --task_name: the name of the fine-tuning task.
  • --sparsity: the fraction of weights to be pruned.
  • --sparse_init: the type of sparsity [sparse_nm, sparse_unstructured].
  • --method: a label used to name the output_dir.
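As a rough illustration only (not the repository's actual code), the flags above could be declared with Python's argparse as follows; the defaults and help strings here are assumptions based on the descriptions above.

```python
# Hypothetical argument declaration mirroring the flags documented above.
# Defaults are illustrative; the real script may differ.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="One-shot pruning and fine-tuning (illustrative sketch)"
    )
    parser.add_argument("--model_name_or_path", type=str, default="roberta-large",
                        help="Model identifier on the Hugging Face model hub")
    parser.add_argument("--task_name", type=str, required=True,
                        help="Name of the fine-tuning task (e.g. QNLI)")
    parser.add_argument("--sparsity", type=float, default=0.5,
                        help="Fraction of weights to be pruned (0.0 to 1.0)")
    parser.add_argument("--sparse_init", type=str, default="sparse_unstructured",
                        choices=["sparse_nm", "sparse_unstructured"],
                        help="Type of sparsity pattern")
    parser.add_argument("--method", type=str, default="baseline",
                        help="Label used to name the output_dir")
    return parser
```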

TASK DIFFICULTY SETTING 1: Varying the Adequacy of Target Domain Data


Example scripts

TO BE RELEASED SOON

TASK DIFFICULTY SETTING 2: Varying the Option Count in Multiple-choice QA Setting


Example scripts

TO BE RELEASED SOON

TASK DIFFICULTY SETTING 3: Varying context length for Retrieval-Augmented QA


TO BE RELEASED SOON

TASK DIFFICULTY SETTING 4: Varying number of k-shot examples for in-context learning

TO BE RELEASED SOON

TASK DIFFICULTY SETTING 5: Estimating LLM-facing Task Difficulty by Normalized Human-LLM Performance Gap

TO BE RELEASED SOON

TASK DIFFICULTY SETTING 6: Factoid-based vs. Multiple-choice QA

TO BE RELEASED SOON

Example scripts

TO BE RELEASED SOON

Are Pre-trained Magnitude Values Indeed the True Gem?

Example scripts

Vary --sparse_method with:

  • --freeze_weights: Sparse Transfer
  • --freeze_weights_frompretrain: Dense Transfer with Freezing
  • or leave it empty: Sparse-to-Dense Transfer
cd ./GLUE_tasks 

for seed in 41 42 43
do
  for TASK_NAME in QNLI 
  do 
    for sparsity in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
    do
      for validation_split_percentage in 100
      do
      # sparse_method options: freeze_weights, freeze_weights_frompretrain,
      # or omit the flag entirely for Sparse-to-Dense Transfer
      python Glue_prune_oneshot.py \
        --method Glue_noembed_freeze_weights \
        --validation_split_percentage $validation_split_percentage \
        --sparse_method freeze_weights \
        --noembed \
        --sparsity $sparsity \
        --model_name_or_path roberta-large \
        --task_name $TASK_NAME \
        --max_length 512 \
        --per_device_train_batch_size 16 \
        --learning_rate 2e-5 \
        --num_train_epochs 3 \
        --seed $seed \
        --output_dir ./roberta/Glue_noembed_freeze_weights/$TASK_NAME/$sparsity/$validation_split_percentage/$seed/
      done
    done
  done
done
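The --sparsity flag above controls what fraction of weights is removed in one shot. A minimal sketch of unstructured magnitude pruning (not the repository's implementation, just the general technique of zeroing the smallest-magnitude weights) looks like this:

```python
# Illustrative one-shot unstructured magnitude pruning: zero out the
# `sparsity` fraction of weights with the smallest absolute value.
# This is a generic sketch, not code from Glue_prune_oneshot.py.
import numpy as np


def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed."""
    magnitudes = np.abs(weights).ravel()
    k = int(sparsity * magnitudes.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

For example, pruning a layer at sparsity 0.5 keeps only the larger half of the weights by magnitude; sweeping sparsity from 0.1 to 0.9, as the loop above does, traces out how task performance degrades as more small pre-trained weights are removed.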


Results

Citation

If you find this repo helpful, please cite:

@article{yin2024junk,
  title={Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs},
  author={Yin, Lu and Jaiswal, Ajay and Liu, Shiwei  and Kundu, Souvik and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2310.02277v2},
  year={2024}
}

About

"Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity" Lu Yin, Shiwei Liu, Ajay Jaiswal, Souvik Kundu, Zhangyang Wang
