Enable NeuralChat Unit Test process #195
Merged
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* initial commit of n_head_kv in MQA
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
* add attn ln
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
* reorder QKV weight when convert
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
* fix typo
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
* cherry-pick ggml MQA
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
* fix kv cache and reduce handmade mem buffer size
  Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
---------
Signed-off-by: Yu, Zhentao <zhentao.yu@intel.com>
no need to maintain mpt model any more in itrex (contained in transformers 4.32.0) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* Update README.md Update the readme * Update README.md * Update README.md * Update README.md
* Update README.md * Refine the collaboration Signed-off-by: hshen14 <haihao.shen@intel.com> --------- Signed-off-by: hshen14 <haihao.shen@intel.com>
* refine code-generation example Signed-off-by: changwangss <chang1.wang@intel.com> * remove code Signed-off-by: changwangss <chang1.wang@intel.com> * remove invalid code * improve readme and line length Signed-off-by: changwangss <chang1.wang@intel.com> --------- Signed-off-by: changwangss <chang1.wang@intel.com> Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* add gptq examples Signed-off-by: YIYANGCAI <yiyang.cai@intel.com> --------- Signed-off-by: YIYANGCAI <yiyang.cai@intel.com> Co-authored-by: xinhe <xin3.he@intel.com>
* add OPTIMIZATION_ONLY for setup Signed-off-by: Xin He <xin3.he@intel.com> * change name: backends to runtime Signed-off-by: Xin He <xin3.he@intel.com> --------- Signed-off-by: Xin He <xin3.he@intel.com>
This reverts commit 120e233.
Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Refine Inference Workflow Readme --------- Signed-off-by: hshen14 <haihao.shen@intel.com> Co-authored-by: lvliang-intel <liang1.lv@intel.com> Co-authored-by: Wang, Chang <chang1.wang@intel.com>
* add finetuning test for mpt-7b-chat with hpu Signed-off-by: jiafu zhang <jiafu.zhang@intel.com> --------- Signed-off-by: jiafu zhang <jiafu.zhang@intel.com>
* add s8 perchannel quant and kernel.
* add QKV, add fusion support for s8 PerN
* add amx_int8 pern gelu fusion
* add gelu add fusion for vnni
* split jblas file. add compute type fp32.
* add comp_type fp32 for ffn fusion
* add bf16 for s4 and s4 ffn fusion
* add workspace for jblas functions
* keep one jblas code
* disable mmap as default. change arg --no_mmap to --use_mmap.
* add OPTIMIZATION_ONLY for setup Signed-off-by: Xin He <xin3.he@intel.com> * change name: backends to runtime Signed-off-by: Xin He <xin3.he@intel.com> * fix bug Signed-off-by: Xin He <xin3.he@intel.com> --------- Signed-off-by: Xin He <xin3.he@intel.com>
* Update generate.py
* limit autocast
  Signed-off-by: changwangss <chang1.wang@intel.com>
* update readme
  Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
* update readme
  Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
* Unify the BKC settings
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Unify the BKC settings
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Simplify docker file readme
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Format the readme
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Add short description
  Signed-off-by: hshen14 <haihao.shen@intel.com>
---------
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: hshen14 <haihao.shen@intel.com>
Co-authored-by: Lv, Liang1 <liang1.lv@intel.com>
Co-authored-by: hshen14 <haihao.shen@intel.com>
* refine reademe
* refine reademe
* refine table
* Refine LLM Runtime readme
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Continue updating the readme
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* Simplify the readme
  Signed-off-by: hshen14 <haihao.shen@intel.com>
* add back run_llm.py
* change script arg name
* rename arg
* fix
* add description
* add another way to convert model
* remove additional line
* refine readme
* refine readme, but we need to modify convert script later
* fix model_maps
  Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
* fix convert_gptj
  Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
* refine readme
* refine
---------
Signed-off-by: hshen14 <haihao.shen@intel.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Co-authored-by: hshen14 <haihao.shen@intel.com>
Co-authored-by: zhenwei-intel <zhenwei.liu@intel.com>
* Update README.md * Update README.md * Update README.md --------- Co-authored-by: Haihao Shen <haihao.shen@intel.com>
* refined finetuning config. Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com> * updated readme for new finetuning config. Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com> * simplified code. Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com> --------- Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
* support bloom Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…tension-for-transformers into lvl/neuralchat_ut Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…ransformers into lvl/neuralchat_ut Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
…tension-for-transformers into lvl/neuralchat_ut
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…ransformers into lvl/neuralchat_ut Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…tension-for-transformers into lvl/neuralchat_ut Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
…tension-for-transformers into lvl/neuralchat_ut
…ransformers into lvl/neuralchat_ut Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
hshen14 approved these changes on Sep 6, 2023
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
VincyZhang approved these changes on Sep 6, 2023
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
All checks passed before. In the final run, NeuralChat UT passed.
Type of Change
feature
API not changed
Description
Enable NeuralChat Unit Test process, add NeuralChat Unit Test to CI.
Expected Behavior & Potential Risk
NeuralChat Unit Test can be launched by CI automatically.
How has this PR been tested?
Pre-CI test.
Dependency Change?
No.