Intel® Extension for Transformers v1.3.1 Release

@kevinintel kevinintel released this 19 Jan 16:38
· 372 commits to main since this release
81d4c56

  • Highlights
  • Improvements
  • Examples
  • Bug Fixing
  • Validated Configurations

Highlights

  • Support experimental INT4 inference on Intel GPU (ARC and PVC) with Intel Extension for PyTorch as backend
  • Enhance LangChain to support new vectorstore (e.g., Qdrant)
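INT4 weight-only quantization stores each weight in 4 bits plus a per-block floating-point scale. As a rough illustration of the numerics only (a minimal sketch of symmetric 4-bit block quantization, not the extension's or IPEX's actual kernels), the scheme can look like:

```python
import numpy as np

def quantize_q4(weights, block_size=32):
    """Symmetric 4-bit block quantization: one fp scale per block,
    weights mapped to integer codes in [0, 15] (offset by 8)."""
    blocks = weights.reshape(-1, block_size)
    # Per-block scale chosen so the largest-magnitude value maps to +/-7.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scales) + 8, 0, 15).astype(np.uint8)
    return q, scales

def dequantize_q4(q, scales):
    """Recover approximate fp32 weights from 4-bit codes and scales."""
    return (q.astype(np.float32) - 8) * scales

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
q, s = quantize_q4(w)
w_hat = dequantize_q4(q, s).reshape(-1)
# Rounding error per element is bounded by half a quantization step.
print(float(np.abs(w - w_hat).max()))
```

Each 32-weight block thus costs 4 bits per weight plus one shared scale, which is the storage trade-off that makes INT4 inference attractive on memory-bound GPUs.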

Improvements

  • Improve error-code handling coverage (dd6dcb4)
  • Refine NeuralChat documentation (aabb2fc)
  • Improve text-generation API (a4aba8)
  • Refactor transformers-like API to adapt to the latest transformers version (4e6834a)
  • Integrate GGML INT4 into NeuralChat (29bbd8)
  • Enable Qdrant vectorstore (f6b9e32)
  • Support LLaMA-series models for LLaVA fine-tuning (d753cb)

Examples

  • Support GGUF Q4_0, Q5_0 and Q8_0 models from Hugging Face (1383c7)
  • Support GPTQ model inference on CPU (f4c58d0)
  • Support SOLAR-10.7B-Instruct-v1.0 model (77fb81)
  • Support Magicoder model and refine model loading (f29c1e)
  • Support Mixtral-8x7B model (9729b6)
  • Support Phi-2 model (04f5ef6c)
  • Evaluate perplexity of NeuralSpeed (b0b381)
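Perplexity is the exponential of the average negative log-likelihood the model assigns to each token. A minimal sketch of the metric itself (the `token_logprobs` list is a hypothetical stand-in for whatever per-token log-probabilities the evaluated model returns; this is not NeuralSpeed's evaluation harness):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token natural-log probabilities from a model.
token_logprobs = [-2.1, -0.3, -1.7, -0.9]
print(perplexity(token_logprobs))
```

Lower is better: a model that assigned every token probability 1 would score a perplexity of exactly 1.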

Bug Fixing

  • Fix GPTQ model loading issue (226e08)
  • Fix TTS crash with messy retrieval input and enhance the normalizer (4d8d9a)
  • Support compatible stats format (c0a89c5a)
  • Fix RAG example for retrieval plugin parameter change (c35d2b)
  • Fix Magicoder tokenizer issue and redundant end format in streaming (2758d4)

Validated Configurations

  • Python 3.10
  • CentOS 8.4 & Ubuntu 22.04
  • Intel® Extension for TensorFlow 2.13.0
  • PyTorch 2.1.0+cpu
  • Intel® Extension for PyTorch 2.1.0+cpu