
Releases: huggingface/optimum-intel

v1.9.3: Patch release

30 Jun 16:24

Full Changelog: v1.9.2...v1.9.3

v1.9.2: Patch release

26 Jun 22:31
  • Fix INC distillation to handle the neural-compressor v2.2.0 breaking changes by @echarlaix in #338

v1.9.1: Patch release

15 Jun 15:47
  • Fix inference for OpenVINO-exported causal language models by @echarlaix in #351

v1.9.0: OpenVINO models improvements, TorchScript export, INC quantized SD pipeline

12 Jun 09:27

OpenVINO and NNCF

  • Ensure compatibility for OpenVINO v2023.0 by @jiwaszki in #265
  • Add Stable Diffusion quantization example by @AlexKoff88 in #294 #304 #326
  • Enable quantized decoder model export to leverage the cache by @echarlaix in #303
  • Set height and width during inference for static Stable Diffusion models by @echarlaix in #308 (see the sketch after this list)
  • Set batch size to 1 by default for Wav2Vec2 for NNCF v2.5.0 compatibility by @ljaljushkin in #312
  • Ensure compatibility for NNCF v2.5 by @ljaljushkin in #314
  • Fix OVModel for BLOOM architecture by @echarlaix in #340
  • Add SD OV model height and width attribute and fix export for torch>=v2.0.0 by @eaidova in #342
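
For static Stable Diffusion models (see #308 and #342 above), the input shapes can be fixed before compilation. A minimal sketch, assuming an illustrative model id:

from optimum.intel.openvino import OVStableDiffusionPipeline

# The model id below is illustrative.
pipeline = OVStableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", export=True)
# Fix static input shapes, then recompile for faster inference.
pipeline.reshape(batch_size=1, height=512, width=512)
pipeline.compile()
image = pipeline("sailing ship in storm by Rembrandt", height=512, width=512).images[0]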

Intel Neural Compressor

  • Add TSModelForCausalLM to enable TorchScript export, loading and inference for causal language models by @echarlaix in #283
  • Remove INC deprecated classes by @echarlaix in #293
  • Enable IPEX model inference for text generation task by @jiqing-feng in #227 #300
  • Add INCStableDiffusionPipeline to enable INC quantized Stable Diffusion model loading by @echarlaix in #305 (see the loading sketch after this list)
  • Enable providing a quantization function instead of a calibration dataset for INC static PTQ by @PenghuiCheng in #309
  • Fix INCSeq2SeqTrainer evaluation step by @AbhishekSalian in #335
  • Fix INCSeq2SeqTrainer padding step by @echarlaix in #336
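
As a rough sketch of loading an INC-quantized Stable Diffusion model with the new INCStableDiffusionPipeline (the checkpoint path and the top-level import path are assumptions):

from optimum.intel import INCStableDiffusionPipeline

# "path/to/quantized-sd" is a placeholder for an INC-quantized checkpoint.
pipeline = INCStableDiffusionPipeline.from_pretrained("path/to/quantized-sd")
images = pipeline("a photo of an astronaut riding a horse").images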

Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.9.0

v1.8.1: Patch release

01 Jun 17:33
  • Fix OpenVINO Trainer for transformers >= v4.29.0 by @echarlaix in #328

Full Changelog: v1.8.0...v1.8.1

v1.8.0: Optimum INC CLI, past key values for OpenVINO decoder models

17 Apr 16:30

Optimum INC CLI

Integration of Intel Neural Compressor dynamic quantization into the Optimum command-line interface. Example commands:

optimum-cli inc --help
optimum-cli inc quantize --help
optimum-cli inc quantize --model distilbert-base-cased-distilled-squad --output int8_distilbert/
  • Add Optimum INC CLI to apply dynamic quantization by @echarlaix in #280
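
The quantized model saved by the command above can then be loaded back for inference. A minimal sketch, assuming the INCModelForQuestionAnswering class covers this task and that the tokenizer is saved alongside the model:

from transformers import AutoTokenizer, pipeline
from optimum.intel import INCModelForQuestionAnswering

# Load the int8 model produced by the CLI command above.
model = INCModelForQuestionAnswering.from_pretrained("int8_distilbert/")
tokenizer = AutoTokenizer.from_pretrained("int8_distilbert/")
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
print(qa(question="Where do I live?", context="I live in Paris."))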

Leverage past key values for OpenVINO decoder models

Enable the use of pre-computed key/values to speed up inference. This is enabled by default when exporting the model.

model = OVModelForCausalLM.from_pretrained(model_id, export=True)

To disable it, use_cache can be set to False when loading the model:

model = OVModelForCausalLM.from_pretrained(model_id, export=True, use_cache=False)
  • Enable the use of pre-computed key/values for OpenVINO decoder models by @echarlaix in #274
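
A short end-to-end generation sketch with the cache enabled ("gpt2" is an illustrative model id):

from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model = OVModelForCausalLM.from_pretrained("gpt2", export=True)  # use_cache=True by default
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("The quick brown fox", return_tensors="pt")
# The pre-computed key/values are reused across decoding steps.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))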

INC config summarizing optimization details

Fixes

  • Remove dynamic shapes restriction for GPU devices by @helena-intel in #262
  • Enable OpenVINO model caching for CPU devices by @helena-intel in #281
  • Fix the .to() method for causal language models by @helena-intel in #284
  • Fix PyTorch model saving for transformers>=4.28.0 when optimized with OVTrainer by @echarlaix in #285
  • Update task names for ONNX and OpenVINO export for optimum>=1.8.0 by @echarlaix in #286

v1.7.3: Patch release

29 Mar 13:12
  • Fix INC distillation to be compatible with neural-compressor v2.1 by @echarlaix in #260

v1.7.2: Patch release

24 Mar 01:09
  • Fix OpenVINO Seq2Seq model export for optimum v1.7.3 by @echarlaix in #253

v1.7.1: Patch release

24 Mar 00:46
  • Fix IPEX quantization model output by @sywangyi in #218
  • Fix INC pruning and QAT combination by @xin3he in #241
  • Fix loading of Stable Diffusion models when the model config is not adapted by @echarlaix in #237
  • Enable VAE encoder OpenVINO export by @echarlaix in #224
  • Fix OpenVINO fp16 conversion for seq2seq models by @echarlaix in #238
  • Disable scheduler, tokenizer, feature extractor loading when provided by @echarlaix in #245
  • Fix OVTrainer openvino export for structurally pruned model by @yujiepan-work in #236

v1.7.0: OpenVINO pruning, knowledge distillation, Stable Diffusion models inference

02 Mar 16:42

NNCF joint pruning, quantization and distillation

Enable joint pruning, quantization and distillation through the OVTrainer by @vuiseng9 in #150
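
A minimal quantization-aware training sketch with the OVTrainer; the joint pruning and distillation schedule is configured through the OVConfig NNCF compression section (omitted here), and the model and dataset choices are illustrative:

from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
from optimum.intel.openvino import OVConfig, OVTrainer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

dataset = load_dataset("glue", "sst2")
dataset = dataset.map(
    lambda ex: tokenizer(ex["sentence"], padding="max_length", max_length=128, truncation=True),
    batched=True,
)

trainer = OVTrainer(
    model=model,
    ov_config=OVConfig(),  # default config applies 8-bit quantization-aware training
    task="text-classification",
    args=TrainingArguments(output_dir="ov_model", num_train_epochs=1),
    train_dataset=dataset["train"].select(range(300)),
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
trainer.save_model()  # saves the compressed model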

Stable Diffusion models OpenVINO export and inference

Add stable diffusion OpenVINO pipeline by @echarlaix in #195

from optimum.intel.openvino import OVStableDiffusionPipeline

model_id = "stabilityai/stable-diffusion-2-1"
stable_diffusion = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Rembrandt"
images = stable_diffusion(prompt).images
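
The exported pipeline can also be saved locally so later runs can skip the export step:

stable_diffusion.save_pretrained("stable-diffusion-2-1-openvino")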