ChunkFT

ChunkFT is a memory-efficient fine-tuning codebase for large language models. It extends Hugging Face Trainer workflows with chunk-wise parameter updates, optional PEFT modules, optimizer-state offloading/prefetching, and DeepSpeed integration.

Highlights

Chunk-wise training: update a subset of parameter chunks at a time.
PEFT integration through peft_function, including LoRA-style methods.
Trainer variants for classification, sequence-to-sequence, generation, QA, NER, pretraining, instruction tuning, and general task-collection experiments.
Optional DeepSpeed mixed-precision optimizer monkey patches.
Chunk optimizer-state offloading and prefetching controlled by CHUNKFT_* environment variables.
One universal launch script: scripts/run_chunkft.sh.
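
As a rough illustration of what "a subset of parameter chunks" can mean under a row-based strategy, the sketch below splits the row indices of a weight matrix into contiguous chunks. The helper name `row_chunks` and the near-equal contiguous split are assumptions made for this example; the real partitioning logic lives in chnk/ and may differ.

```python
from typing import List


def row_chunks(num_rows: int, chunk_num: int) -> List[range]:
    """Split row indices 0..num_rows-1 into chunk_num contiguous ranges.

    Hypothetical helper: earlier chunks receive one extra row when
    num_rows is not divisible by chunk_num.
    """
    if chunk_num <= 0:
        raise ValueError("chunk_num must be positive")
    base, extra = divmod(num_rows, chunk_num)
    chunks, start = [], 0
    for i in range(chunk_num):
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks
```

Only the rows in the currently active chunk would receive updates; the other rows stay frozen until their chunk is selected.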

Repository Layout

chnk/                         # Core ChunkFT package
chnk/trainer.py               # ChunkTrainer and chunk state management
chnk/seqtrainer.py            # ChunkSeq2SeqTrainer
chnk/qatrainer.py             # ChunkQuestionAnsweringTrainer
chnk/registerCallBack.py      # Model-family callback and parameter selection rules
chnk/optimizers/              # Optimizer wrappers and DeepSpeed patching
chnk/utils/                   # PEFT helpers, chunk utilities, checkpoint helpers
examples/                     # Python training entry points
scripts/run_chunkft.sh        # Universal launch script
dsconfig/                     # DeepSpeed configs
glue/                         # GLUE data/resources
models/                       # Local model customizations
metrics/                      # Local metric implementations

Installation

Create an environment with PyTorch, Transformers, Datasets, PEFT, Accelerate, DeepSpeed, and Evaluate:

pip install torch transformers datasets peft accelerate deepspeed evaluate

Install task-specific metric dependencies from metrics/*/requirements.txt only when needed.

Universal Script

All old task-specific scripts under scripts/ have been removed. Use the single launcher:

bash scripts/run_chunkft.sh

The script is configured with environment variables and supports these task modes:

TASK_MODE=glue
TASK_MODE=generation
TASK_MODE=qa
TASK_MODE=ner
TASK_MODE=pretrain
TASK_MODE=instruct
TASK_MODE=tasks

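The first two positional arguments are optional shortcuts for the chunk count and the chunk update interval (see CHUNK_NUM and CHUNK_UPDATE_INTERVAL under Common Script Variables). The chunk strategy defaults to row: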

bash scripts/run_chunkft.sh <chunk_num> <chunk_update_interval>

Extra arguments after -- are passed directly to the selected Python entry point.
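
The pass-through convention can be pictured with the following sketch, which splits an argument list at the first `--`. The function name `split_passthrough` is hypothetical, and the launcher itself is a bash script, so this only mirrors the described behavior rather than reproducing its code.

```python
def split_passthrough(argv):
    """Split argv into (launcher_args, entry_point_args) at the first "--".

    Hypothetical helper: everything after "--" is forwarded untouched.
    """
    if "--" in argv:
        i = argv.index("--")
        return list(argv[:i]), list(argv[i + 1:])
    return list(argv), []
```

For example, `split_passthrough(["4", "1", "--", "--seed", "42"])` keeps `4` and `1` for the launcher and forwards `--seed 42`.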

Quick Start

Run GLUE SST-2 with ChunkFT, using 4 chunks and a chunk update interval of 1:

TASK_MODE=glue \
TASK_NAME=sst2 \
MODEL_NAME_OR_PATH=/path/to/model \
OUTPUT_DIR=outputs/sst2_chunkft \
bash scripts/run_chunkft.sh 4 1

Common Script Variables

Model and task:

TASK_MODE: one of glue, generation, qa, ner, pretrain, instruct, tasks.
MODEL_NAME_OR_PATH: base model path or Hugging Face model name.
TASK_TYPE: PEFT task type, for example SEQ_CLS, CAUSAL_LM, SEQ_2_SEQ_LM, TOKEN_CLS, or QUESTION_ANS.
TASK_NAME: task name for GLUE or task-collection runs.
DATASET_NAME: dataset name for generation, QA, or NER entry points.
DATASET_DIR: local dataset directory for pretraining or instruction tuning.
MODEL_TYPE: model family hint for some entry points, such as llama, opt, or gpt2.

Training:

NUM_GPUS: number of processes for torchrun.
CUDA_VISIBLE_DEVICES: visible GPU ids.
OUTPUT_DIR: output directory prefix.
LEARNING_RATE: learning rate.
NUM_TRAIN_EPOCHS: number of epochs when MAX_STEPS is unset.
MAX_STEPS: optional step-based training limit.
PER_DEVICE_TRAIN_BATCH_SIZE: train batch size per device.
PER_DEVICE_EVAL_BATCH_SIZE: eval batch size per device.
FP16: set to 1 to add --fp16.
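
The sketch below shows one way these variables could map onto Hugging Face Trainer command-line flags. The function `training_flags` is hypothetical, and the precedence of MAX_STEPS over NUM_TRAIN_EPOCHS is inferred from the descriptions above, not taken from the script.

```python
def training_flags(env):
    """Translate launcher-style variables into Trainer CLI flags.

    Hypothetical helper; env is a dict of strings such as os.environ.
    """
    flags = []
    if env.get("LEARNING_RATE"):
        flags += ["--learning_rate", env["LEARNING_RATE"]]
    # Assumption: a step limit, when set, replaces the epoch count.
    if env.get("MAX_STEPS"):
        flags += ["--max_steps", env["MAX_STEPS"]]
    elif env.get("NUM_TRAIN_EPOCHS"):
        flags += ["--num_train_epochs", env["NUM_TRAIN_EPOCHS"]]
    if env.get("PER_DEVICE_TRAIN_BATCH_SIZE"):
        flags += ["--per_device_train_batch_size", env["PER_DEVICE_TRAIN_BATCH_SIZE"]]
    if env.get("PER_DEVICE_EVAL_BATCH_SIZE"):
        flags += ["--per_device_eval_batch_size", env["PER_DEVICE_EVAL_BATCH_SIZE"]]
    if env.get("FP16") == "1":
        flags.append("--fp16")
    return flags
```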

ChunkFT:

ENABLE_CHUNKFT: set to 1 to add --chunk_tuning; set to 0 for normal training.
CHUNK_NUM: number of parameter chunks.
CHUNK_UPDATE_INTERVAL: number of optimizer updates before switching chunks.
ENABLE_CHUNK_PREFETCH: controls --enable_chunk_prefetch for entry points that support it.
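
To make the interaction between CHUNK_NUM and CHUNK_UPDATE_INTERVAL concrete, here is a minimal sketch of a chunk schedule. It assumes chunks are visited in round-robin order, which this README does not state; `active_chunk` is a hypothetical name.

```python
def active_chunk(update_step, chunk_num, chunk_update_interval):
    """Return the 0-based chunk index trained at a given optimizer update.

    Assumption: chunks are visited in order and wrap around after the last.
    """
    if chunk_num <= 0 or chunk_update_interval <= 0:
        raise ValueError("chunk_num and chunk_update_interval must be positive")
    return (update_step // chunk_update_interval) % chunk_num
```

With 4 chunks and an interval of 2, each chunk would be trained for two consecutive optimizer updates before the next chunk takes over.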

Direct Python Usage

You can still call entry points directly:

python examples/run_glue.py \
  --model_name_or_path /path/to/model \
  --task_name sst2 \
  --do_train \
  --do_eval \
  --output_dir outputs/sst2_chunkft/model \
  --TaskType SEQ_CLS \
  --chunk_tuning \
  --chunk_num 4 \
  --chunk_update_interval 4

Runtime Flags

CHUNKFT_ENABLE_MONKEY_PATCHES=0: disable DeepSpeed monkey patches.
CHUNKFT_ENABLE_PREFETCH=0: disable chunk optimizer-state prefetching.
CHUNKFT_ASYNC_OFFLOAD=0: disable async offload behavior.
CHUNKFT_PIN_MEMORY=0: disable pinned-memory transfers.
CHUNKFT_CUDA_SYNC=1: force CUDA synchronization for debugging.
CHUNKFT_EMPTY_CACHE=1: empty CUDA cache at selected runtime points.
CHUNKFT_DEBUG_GPU_USAGE=1: enable GPU usage debug logging.
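
A sketch of how such flags might be read at runtime follows. The defaults are assumptions inferred from the wording above (flags documented with `=0` are treated as on by default, flags documented with `=1` as off), and the rule that only the string "1" counts as enabled is also an assumption; `read_runtime_flags` is a hypothetical helper.

```python
import os

# Assumed defaults, inferred from the flag descriptions; not confirmed by the code.
ASSUMED_DEFAULTS = {
    "CHUNKFT_ENABLE_MONKEY_PATCHES": "1",
    "CHUNKFT_ENABLE_PREFETCH": "1",
    "CHUNKFT_ASYNC_OFFLOAD": "1",
    "CHUNKFT_PIN_MEMORY": "1",
    "CHUNKFT_CUDA_SYNC": "0",
    "CHUNKFT_EMPTY_CACHE": "0",
    "CHUNKFT_DEBUG_GPU_USAGE": "0",
}


def read_runtime_flags(environ=None):
    """Return a dict mapping each CHUNKFT_* flag name to a bool."""
    environ = os.environ if environ is None else environ
    return {
        name: environ.get(name, default) == "1"
        for name, default in ASSUMED_DEFAULTS.items()
    }
```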
