
Intel® Extension for Transformers v1.1.1 Release

@kevinintel kevinintel released this 06 Sep 16:37
· 1143 commits to main since this release
49336d3
  • Highlights
  • Bug Fixing & Improvements
  • Tests & Tutorials

Highlights
In this release, we improved NeuralChat, a customizable chatbot framework under Intel® Extension for Transformers. NeuralChat is now available for you to create your own chatbot within minutes on multiple architectures.
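Getting a chatbot running takes only a few lines. The following is a minimal sketch based on the NeuralChat examples around this release; it assumes `intel_extension_for_transformers` v1.1.1 is installed and that the default model can be downloaded, and the exact API names may differ in other versions:

```python
# Minimal NeuralChat sketch. Hedged: build_chatbot and predict follow
# the v1.1.x NeuralChat examples; verify against your installed version.
from intel_extension_for_transformers.neural_chat import build_chatbot

# With no config argument, build_chatbot() uses the framework defaults.
chatbot = build_chatbot()
response = chatbot.predict("Tell me about Intel Xeon Scalable Processors.")
print(response)
```

A custom model or plugins (e.g. retrieval) can be selected by passing a pipeline configuration to `build_chatbot` instead of relying on the defaults.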

Bug Fixing & Improvements

  • Fix the NeuralChat code structure and plugins (commit 486e9e)
  • Fix bug in retrieval chat (commit d2cee0)
  • NeuralChat inference now returns the correct input length to the user, without padding (commit 18be4c)
  • Fix issue where MPT does not support left padding (commit 24ae58)
  • Fix duplicate removal of dataset columns during concatenation (commit 67ce6e)
  • Fix DeepSpeed and use_cache issue (commit 4675d4)
  • Fix bugs in predict_stream (commit e1da7e)
  • Fix docker CPU issues (commit 8fa0dc)
  • Fix issue reading the HuggingFaceH4/oasst1_en dataset (commit 76ee68)
  • Modify Dockerfile for finetuning (commit 797aa2)
  • Fix LLaMA2 performance via static_shape in Optimum Habana (commit 481f38)
  • Remove redundant code and hardcoded values from NeuralChat (commit 0e1e4d, 037ce8, 10af3c)
  • Refine the NeuralChat finetuning config (commit e372cf)

Tests & Tutorials

  • Add inference test for LLaMA2 and MPT on Habana HPU (commit 5c4f5e)
  • Add inference test for LLaMA2 and MPT on Intel CPUs (commit ad4bec, 2f6188)
  • Add finetuning test for MPT (commit 72d81e, 423242)
  • Add GHA Unit Tests (commit 49336d)
  • Add NeuralChat finetuning tutorial for LLaMA2 and MPT (commit d156e9)
  • Add NeuralChat deployment tutorial for Intel CPU, Habana HPU, and Nvidia hardware (commit b36711)

Validated Configurations

  • CentOS 8.4 & Ubuntu 22.04
  • Python 3.9
  • PyTorch 2.0.0
  • TensorFlow 2.12.0

Acknowledgements
Thanks to sywangyi, jiafuzha, and itayariel for their contributions, and to everyone who participated in Intel® Extension for Transformers.