Enable NeuralChat Unit Test process #195

Merged: VincyZhang merged 71 commits into main from lvl/neuralchat_ut on Sep 6, 2023

Conversation

lvliang-intel
Collaborator

Type of Change

feature
API not changed

Description

Enable NeuralChat Unit Test process, add NeuralChat Unit Test to CI.

Expected Behavior & Potential Risk

NeuralChat Unit Test can be launched by CI automatically.

How has this PR been tested?

Pre-CI test.

Dependency Change?

No.

lvliang-intel and others added 30 commits August 30, 2023 00:09
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* initial commit of n_head_kv in MQA

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

* add attn ln

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

* reorder QKV weight when convert

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

* fix typo

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

* cherry-pick ggml MQA

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

* fix kv cache and reduce handmade mem buffer size

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>

---------

Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
no need to maintain mpt model any more in itrex (contained in transformers 4.32.0)

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* Update README.md

Update the readme

* Update README.md

* Update README.md

* Update README.md
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* Update README.md

* Refine the collaboration

Signed-off-by: hshen14 <haihao.shen@intel.com>

---------

Signed-off-by: hshen14 <haihao.shen@intel.com>
* refine code-generation example

Signed-off-by: changwangss <chang1.wang@intel.com>

* remove code

Signed-off-by: changwangss <chang1.wang@intel.com>

* remove invalid code

* improve readme and line length

Signed-off-by: changwangss <chang1.wang@intel.com>

---------

Signed-off-by: changwangss <chang1.wang@intel.com>
Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* add gptq examples

Signed-off-by: YIYANGCAI <yiyang.cai@intel.com>

---------

Signed-off-by: YIYANGCAI <yiyang.cai@intel.com>
Co-authored-by: xinhe <xin3.he@intel.com>
* add OPTIMIZATION_ONLY for setup

Signed-off-by: Xin He <xin3.he@intel.com>

* change name: backends to runtime

Signed-off-by: Xin He <xin3.he@intel.com>

---------

Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Refine Inference Workflow Readme

---------

Signed-off-by: hshen14 <haihao.shen@intel.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Wang, Chang <chang1.wang@intel.com>

Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
* add finetuning test for mpt-7b-chat with hpu

Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>


---------

Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>
* add s8 perchannel quant and kernel.

* add QKV, add fusion support for s8 PerN

* add amx_int8 pern gelu fusion

* add gelu add fusion for vnni

* split jblas file. add compute type fp32.

* add comp_type fp32 for ffn fusion

* add bf16 for s4 and s4 ffn fusion

* add workspace for jblas functions

* keep one jblas code

* disable mmap as default. change arg --no_mmap to --use_mmap.
* add OPTIMIZATION_ONLY for setup

Signed-off-by: Xin He <xin3.he@intel.com>

* change name: backends to runtime

Signed-off-by: Xin He <xin3.he@intel.com>

* fix bug

Signed-off-by: Xin He <xin3.he@intel.com>

---------

Signed-off-by: Xin He <xin3.he@intel.com>
* Update generate.py

* limit autocast

Signed-off-by: changwangss <chang1.wang@intel.com>

* update readme

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* update readme

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* Unify the BKC settings

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Unify the BKC settings

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Simplify docker file readme

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Format the readme

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Add short description

Signed-off-by: hshen14 <haihao.shen@intel.com>

---------

Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: hshen14 <haihao.shen@intel.com>
Co-authored-by: Lv, Liang1 <liang1.lv@intel.com>
Co-authored-by: hshen14 <haihao.shen@intel.com>
* refine readme

* refine readme

* refine table

* Refine LLM Runtime readme

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Continue updating the readme

Signed-off-by: hshen14 <haihao.shen@intel.com>

* Simplify the readme

Signed-off-by: hshen14 <haihao.shen@intel.com>

* add back run_llm.py

* change script arg name

* rename arg

* fix

* add description

* add another way to convert model

* remove additional line

* refine readme

* refine readme, but we need to modify convert script later

* fix model_maps

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

* fix convert_gptj

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

* refine readme

* refine

---------

Signed-off-by: hshen14 <haihao.shen@intel.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Co-authored-by: hshen14 <haihao.shen@intel.com>
Co-authored-by: zhenwei-intel <zhenwei.liu@intel.com>
* Update README.md

* Update README.md

* Update README.md

---------

Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* refined finetuning config.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* updated readme for new finetuning config.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* simplified code.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

---------

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
* support bloom

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
lvliang-intel and others added 20 commits September 6, 2023 11:47
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…tension-for-transformers into lvl/neuralchat_ut

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…ransformers into lvl/neuralchat_ut

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…ransformers into lvl/neuralchat_ut

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…tension-for-transformers into lvl/neuralchat_ut

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…ransformers into lvl/neuralchat_ut

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
@VincyZhang
Contributor

All checks passed in earlier runs, and the NeuralChat UT passed in the final run.
Cleaning up redundant code and merging.

@VincyZhang VincyZhang merged commit 49336d3 into main Sep 6, 2023
22 of 26 checks passed
@VincyZhang VincyZhang deleted the lvl/neuralchat_ut branch September 6, 2023 15:43