Intel® Extension for Transformers v1.3.1 Release

@kevinintel kevinintel released this 19 Jan 16:38
· 372 commits to main since this release
81d4c56

  • Highlights
  • Improvements
  • Examples
  • Bug Fixing
  • Validated Configurations

Highlights

  • Support experimental INT4 inference on Intel GPU (ARC and PVC) with Intel Extension for PyTorch as backend
  • Enhance LangChain to support new vectorstore (e.g., Qdrant)
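INT4 weight-only quantization stores each weight in 4 bits plus a per-block floating-point scale. As a rough illustration of the numerics only (a minimal sketch of symmetric 4-bit block quantization, not the extension's or IPEX's actual kernels), the scheme can look like:

```python
import numpy as np

def quantize_q4(weights, block_size=32):
    """Symmetric 4-bit block quantization: one fp scale per block,
    weights mapped to integer codes in [0, 15] (offset by 8)."""
    blocks = weights.reshape(-1, block_size)
    # Per-block scale chosen so the largest-magnitude value maps to +/-7.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scales) + 8, 0, 15).astype(np.uint8)
    return q, scales

def dequantize_q4(q, scales):
    """Recover approximate fp32 weights from 4-bit codes and scales."""
    return (q.astype(np.float32) - 8) * scales

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
q, s = quantize_q4(w)
w_hat = dequantize_q4(q, s).reshape(-1)
# Rounding error per element is bounded by half a quantization step.
print(float(np.abs(w - w_hat).max()))
```

Each 32-weight block thus costs 4 bits per weight plus one shared scale, which is the storage trade-off that makes INT4 inference attractive on memory-bound GPUs.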

Improvements

  • Improve error-code handling coverage (dd6dcb4)
  • Refine NeuralChat documentation (aabb2fc)
  • Improve text-generation API (a4aba8)
  • Refactor transformers-like API to adapt to the latest transformers version (4e6834a)
  • Integrate GGML INT4 into NeuralChat (29bbd8)
  • Enable Qdrant vectorstore (f6b9e32)
  • Support LLaMA-series models for LLaVA fine-tuning (d753cb)

Examples

  • Support GGUF Q4_0, Q5_0 and Q8_0 models from Hugging Face (1383c7)
  • Support GPTQ model inference on CPU (f4c58d0)
  • Support SOLAR-10.7B-Instruct-v1.0 model (77fb81)
  • Support Magicoder model and refine model loading (f29c1e)
  • Support Mixtral-8x7B model (9729b6)
  • Support Phi-2 model (04f5ef6c)
  • Evaluate perplexity of NeuralSpeed (b0b381)
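Perplexity is the exponential of the average negative log-likelihood the model assigns to each token. A minimal sketch of the metric itself (the `token_logprobs` list is a hypothetical stand-in for whatever per-token log-probabilities the evaluated model returns; this is not NeuralSpeed's evaluation harness):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token natural-log probabilities from a model.
token_logprobs = [-2.1, -0.3, -1.7, -0.9]
print(perplexity(token_logprobs))
```

Lower is better: a model that assigned every token probability 1 would score a perplexity of exactly 1.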

Bug Fixing

  • Fix GPTQ model loading issue (226e08)
  • Fix TTS crash with messy retrieval input and enhance the normalizer (4d8d9a)
  • Support compatible stats format (c0a89c5a)
  • Fix RAG example for retrieval plugin parameter change (c35d2b)
  • Fix Magicoder tokenizer issue and redundant end format in streaming (2758d4)

Validated Configurations

  • Python 3.10
  • CentOS 8.4 & Ubuntu 22.04
  • Intel® Extension for TensorFlow 2.13.0
  • PyTorch 2.1.0+cpu
  • Intel® Extension for PyTorch 2.1.0+cpu