feat: add quantize_model() fn & a MidnightRose70B #79

sambarnes · 2024-03-15T15:25:56Z

Details

without a calibration dataset:
https://huggingface.co/sambarnes/Midnight-Rose-70B-v2.0.3-GPTQ-naive

with a calibration dataset (VMware/open-instruct):
https://huggingface.co/sambarnes/Midnight-Rose-70B-v2.0.3-GPTQ

upon further testing: i couldn't actually notice a difference between the two, so i'm just going to use the calibrated (non-naive) one

Code of Conduct

I agree to follow this project's Code of Conduct
I agree to license this contribution under the MIT LICENSE
I checked the current PR for duplication.

sambarnes · 2024-03-15T20:51:06Z

modal/runner/shared/quantize.py

+    logger.info(f"Volume now contains {quantized_model_path}")
+
+
+def load_open_instruct(tokenizer, n_samples=128):


this calibration dataset loading is mostly forked from the examples here:
https://github.com/AutoGPTQ/AutoGPTQ/blob/main/examples/quantization/quant_with_alpaca.py

where instead of reading from a json file, we sample from this dataset:
https://huggingface.co/datasets/VMware/open-instruct

the only adjustments to the code were limiting the number of samples and changing the column labels to match the dataset
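For readers without the diff open, here is a minimal sketch of the sampling step described above. It is not the code from this PR: the function name `build_calibration_examples`, the column labels `alpaca_prompt` and `response`, the `max_tokens` cap, and the `toy_tokenize` stand-in are all assumptions made for illustration. It only mirrors the general shape of the AutoGPTQ example (sample up to `n_samples` rows, tokenize, skip overly long rows, return `input_ids` plus `attention_mask`).

```python
import random
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]
Example = Dict[str, List[int]]


def build_calibration_examples(
    rows: Iterable[Row],
    tokenize: Callable[[str], List[int]],
    n_samples: int = 128,
    prompt_column: str = "alpaca_prompt",  # assumed column label
    response_column: str = "response",     # assumed column label
    max_tokens: int = 4096,                # assumed length cap
    seed: int = 0,
) -> List[Example]:
    """Sample up to n_samples rows and turn them into calibration examples."""
    rows = list(rows)
    rng = random.Random(seed)
    # Limit the number of samples, never asking for more rows than exist.
    sampled = rng.sample(rows, k=min(n_samples, len(rows)))

    examples: List[Example] = []
    for row in sampled:
        # Calibrate on the full text: prompt followed by its response.
        input_ids = tokenize(row[prompt_column] + row[response_column])
        if len(input_ids) > max_tokens:
            continue  # skip rows that would not fit in the context window
        examples.append(
            {"input_ids": input_ids, "attention_mask": [1] * len(input_ids)}
        )
    return examples


def toy_tokenize(text: str) -> List[int]:
    """Stand-in tokenizer for the demo: one integer per whitespace-separated word."""
    return [len(word) for word in text.split()]


demo_rows = [
    {"alpaca_prompt": f"Question {i}: ", "response": "an answer"} for i in range(10)
]
demo_examples = build_calibration_examples(demo_rows, toy_tokenize, n_samples=4)
print(len(demo_examples))
```

In a real run, `rows` would presumably come from the VMware/open-instruct dataset and `tokenize` would wrap the model's own tokenizer; both are left as parameters here so the sketch runs without any third-party packages.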

modal/runner/shared/quantize.py

louisgv

LGTM 👍

sambarnes added 7 commits March 15, 2024 09:25

feat: add quantize_model() fn & a MidnightRose70B

1a170da

deps: revert deps change

f492274

perf: bump up to H100 for MidnightRose70B

bb51ac1

fix: add guard for model path existence in quantizer

e72cc3e

feat: use the VMWare/open-instruct dataset that TheBloke uses for calibration

5e489e8

chore: remove unused line

9292bcf

refactor: lil golf

dae0357

sambarnes commented Mar 15, 2024

sambarnes and others added 5 commits March 15, 2024 14:53

feat: host both naive & calibrated quantizations

8f6e23e

fix: downsize to A100 40G

5f233c7

Merge branch 'main' into quantizer

d313c8e

fix: delete the naive version and do keep_warm=1

7849a28

chore: add todo for tokenizer files

c5671d3

sambarnes marked this pull request as ready for review March 22, 2024 23:01

louisgv reviewed Mar 22, 2024

modal/runner/shared/quantize.py (outdated)

louisgv approved these changes Mar 22, 2024

deps: bump transformers on quantizer to 4.39.1

1892215

sambarnes merged commit 1d737c4 into main Mar 23, 2024
3 checks passed

sambarnes deleted the quantizer branch March 23, 2024 00:21
