Currently supports modular GPTQ for LLaVA and VQAv2 evaluation. The code is built on top of the QuIP repository.
Requirements: see `requirements.txt`
Quantization of `Conv2d` and `Conv1d` layers is not implemented by the authors for QuIP (see the original implementation).
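Since only linear layers are quantized in that case, below is a minimal sketch of a layer filter, assuming a PyTorch model; the `find_quantizable_layers` helper is illustrative and not the repository's actual API.

```python
import torch.nn as nn

def find_quantizable_layers(module, prefix="", exclude_conv=True):
    """Hypothetical helper: collect layers eligible for quantization.

    Only nn.Linear layers are returned when exclude_conv is True,
    mirroring the fact that Conv2d/Conv1d quantization is not
    implemented for QuIP.
    """
    allowed = (nn.Linear,) if exclude_conv else (nn.Linear, nn.Conv1d, nn.Conv2d)
    layers = {}
    for name, child in module.named_children():
        full_name = f"{prefix}.{name}" if prefix else name
        if isinstance(child, allowed):
            layers[full_name] = child
        else:
            layers.update(find_quantizable_layers(child, full_name, exclude_conv))
    return layers
```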
The same data is used for both LLaVA and BLIP-2 calibration, sourced from the `llava_instruct_150k` dataset.
GPTQ:

LLaVA 1.5:

```
python llava.py llava-hf/llava-1.5-7b-hf llava_instruct_150k --wbits 4 --nsamples 128 [--save quantized.safetensors] --quant gptq --pre_gptqH [--eval vqav2 seed1] [--skip-last-{vision,proj,language}]
```

BLIP-2:

```
python blip2.py Salesforce/blip2-opt-2.7b llava_instruct_150k --wbits 8 --nsamples 128 --quant gptq --pre_gptqH --eval
```

QuIP / LDLQ:

LLaVA 1.5:

```
python llava.py llava-hf/llava-1.5-7b-hf llava_instruct_150k --wbits 4 --nsamples 128 [--save quantized.safetensors] --quant ldlq --incoh_processing [--eval vqav2 seed1] [--skip-last-{vision,proj,language}]
```

BLIP-2:

```
python blip2.py Salesforce/blip2-opt-2.7b llava_instruct_150k --wbits 8 --nsamples 128 --quant ldlq --incoh_processing --eval
```

Parameters:
| Parameter | Description |
|---|---|
| `--wbits` | Number of bits for each block of the VLM, given as `vision,proj,llm`. For example, `--wbits 2,4,8` quantizes the vision block to 2 bits, the projector to 4 bits, and the language model to 8 bits (see the sketch after this table). |
| `--skip_last_{language,proj,vision}` | If set, skips the last layer of the specified block. |
| `--exclude-conv` | Whether to exclude `nn.Conv2d` layers from the quantization process (i.e. only quantize linear layers). |
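A minimal sketch of how a comma-separated `--wbits` value could be expanded into per-block bit widths, as referenced in the table above; the `parse_wbits` helper is an illustrative assumption, not the exact code in `llava.py`.

```python
def parse_wbits(wbits: str):
    """Expand a --wbits value into (vision, proj, llm) bit widths.

    Illustrative assumption: "4" means 4 bits everywhere, while
    "2,4,8" assigns 2/4/8 bits to vision/projector/language model.
    """
    parts = [int(b) for b in wbits.split(",")]
    if len(parts) == 1:
        parts = parts * 3
    if len(parts) != 3:
        raise ValueError("--wbits expects one value or three: vision,proj,llm")
    vision_bits, proj_bits, llm_bits = parts
    return vision_bits, proj_bits, llm_bits

# Example: --wbits 2,4,8
print(parse_wbits("2,4,8"))  # (2, 4, 8)
```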
Notes:
- `--skip_last_vision` skips `model.vision_tower.vision_model.encoder.layers.23.mlp.fc2`
- `--skip_last_proj` skips the `linear_2` layer of the projector
- `--skip_last_language` skips the `lm_head` layer
- `--incoh_processing` (necessary argument when running QuIP / LDLQ) is a "meta argument which sets the following flags `--pre_gptqH --pre_rescale --pre_proj --qfn b`" (from the original QuIP README); see the sketch after this list
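As referenced above, a minimal sketch of how such a meta argument can be implemented with `argparse`; the individual flag names come from the QuIP README quote, but the parser structure and defaults here are illustrative assumptions.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--quant", choices=["gptq", "ldlq"], default="gptq")
parser.add_argument("--incoh_processing", action="store_true",
                    help="Meta argument enabling incoherence processing for QuIP/LDLQ.")
parser.add_argument("--pre_gptqH", action="store_true")
parser.add_argument("--pre_rescale", action="store_true")
parser.add_argument("--pre_proj", action="store_true")
parser.add_argument("--qfn", type=str, default="a")  # default is an assumption
args = parser.parse_args()

# The meta argument simply switches on the individual preprocessing flags.
if args.incoh_processing:
    args.pre_gptqH = True
    args.pre_rescale = True
    args.pre_proj = True
    args.qfn = "b"
```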
Pass the `--eval` flag to evaluate the model on a benchmark. Accepted values are `vqav2` and `seed1`.

VQAv2:

The evaluation requires the image dataset to be downloaded locally in a `vqav2/Images/` directory created in this repo. The `vqav2` directory should also contain the question file and annotation file, as downloaded from the website:
```
quip
├── README.md
├── vqav2/
│   ├── Images/mscoco/{dataset, ex. 'val2014'}
│   ├── Questions/{question file, ex. 'v2_OpenEnded_mscoco_val2014_questions.json'}
│   └── Annotations/{Annotation file for the questions}
```
The results of the evaluation are written to the `vqav2/Results` directory, first as a `.jsonl` file, so that if the evaluation is interrupted it can be resumed without recomputing all answers from the beginning. Once all questions have been answered, a valid `[...]_results.json` file is generated and the accuracy evaluation is performed. Results are printed to standard output and saved in a `Results/[...]_accuracy.json` file.
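A minimal sketch of the resume-from-`.jsonl` pattern described above; the file names, record fields, and the `generate_answer` callable are illustrative assumptions, not the repository's actual interface.

```python
import json, os

def run_vqav2_eval(questions, generate_answer, jsonl_path="vqav2/Results/answers.jsonl"):
    """Answer VQAv2 questions, skipping any already recorded in the .jsonl file."""
    os.makedirs(os.path.dirname(jsonl_path), exist_ok=True)
    done = set()
    if os.path.exists(jsonl_path):
        with open(jsonl_path) as f:
            done = {json.loads(line)["question_id"] for line in f}

    with open(jsonl_path, "a") as f:
        for q in questions:
            if q["question_id"] in done:
                continue  # resume: this question was already answered
            answer = generate_answer(q)  # hypothetical model call
            f.write(json.dumps({"question_id": q["question_id"], "answer": answer}) + "\n")

    # Once every question is answered, convert the .jsonl into a valid results file.
    with open(jsonl_path) as f:
        results = [json.loads(line) for line in f]
    with open(jsonl_path.replace(".jsonl", "_results.json"), "w") as f:
        json.dump(results, f)
```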
SEED-Bench-1:

The benchmark requires the image dataset to be downloaded locally in `seed1/SEED-Bench-image` (from here), and the question file to be placed alongside it in the `seed1` directory:
```
quip
├── README.md
├── seed1/
│   ├── SEED-Bench-image
│   └── SEED-Bench.json
```
The results of the evaluation are written to a `.jsonl` file within the `seed1` directory. Once all questions have been answered, the script computes the accuracy of the model and writes the results to a `.txt` file.
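A minimal sketch of the final accuracy computation from the per-question `.jsonl` records; the field names (`prediction`, `answer`) and file names are illustrative assumptions.

```python
import json

def seed1_accuracy(jsonl_path="seed1/answers.jsonl", out_path="seed1/accuracy.txt"):
    """Compute overall multiple-choice accuracy from a .jsonl of answered questions."""
    correct = total = 0
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            total += 1
            # Illustrative fields: the model's chosen option vs. the ground-truth answer.
            correct += record["prediction"] == record["answer"]
    accuracy = correct / total if total else 0.0
    with open(out_path, "w") as f:
        f.write(f"Accuracy: {accuracy:.4f} ({correct}/{total})\n")
    return accuracy
```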
GQA:

The benchmark requires the images to be downloaded locally to `gqa/images`, and the question, choices, and scene graph files to be placed in `gqa/questions`. The files can be downloaded here. For the `val` split, the directory structure should be the following:
```
quip
├── README.md
├── gqa/
│   ├── eval.py
│   ├── images
│   └── questions/
│       ├── val_all_questions.json
│       ├── val_choices.json
│       └── val_sceneGraphs.json
```
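A minimal sketch of how the `val` question and choice files laid out above might be loaded and scored; the JSON keys and the accuracy computation are assumptions about the standard GQA format, not the exact logic of `gqa/eval.py`.

```python
import json

# Load the val split files laid out as in the tree above.
with open("gqa/questions/val_all_questions.json") as f:
    questions = json.load(f)   # assumed: question_id -> question record
with open("gqa/questions/val_choices.json") as f:
    choices = json.load(f)     # assumed: question_id -> candidate answers

def gqa_accuracy(predictions):
    """predictions: dict mapping question_id -> predicted answer string."""
    correct = sum(
        predictions.get(qid, "").lower() == q["answer"].lower()
        for qid, q in questions.items()
    )
    return correct / len(questions)
```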