Add kvint4/8 ete testcase #1448

zhulinJulia24 · 2024-04-18T01:15:38Z

benchmark.yaml change

add kvint4/8 benchmark
change CUDA device allocation to use the GPUs more efficiently

daily_ete_test.yaml change

remove kvint8 quantization
remove the timeout setting and reorder the test steps, since the test cases are now much more stable

evaluate.yaml change

change model path

pr_test_case change

remove -s -v to make the results simpler and clearer

autotest/chat_prompt_case.yaml and autotest/prompt_case.yaml

rework test cases to reduce erroneous results

autotest/config.yaml and autotest/utils/config_utils.py

split models into chat models and base models, and add more models
refactor kvint setting
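A minimal sketch of how a chat/base split and a per-model kvint setting could be read from a config. The key names (`chat_model`, `base_model`, `kvint_quantization`), the model names, and both helper functions are hypothetical and do not reflect the actual contents of autotest/config.yaml or autotest/utils/config_utils.py. The integer policy values (0, 4, 8) are also an assumed convention.

```python
# Hypothetical stand-in for a parsed config file; the key names and
# model names are illustrative assumptions, not the real config.yaml.
CONFIG = {
    "chat_model": ["example/chat-model-a", "example/chat-model-b"],
    "base_model": ["example/base-model-a"],
    "kvint_quantization": {
        "kvint4": ["example/chat-model-a"],
        "kvint8": ["example/chat-model-a", "example/base-model-a"],
    },
}

# Assumed convention: 4 = int4 kv cache, 8 = int8 kv cache, 0 = no kv quantization.
QUANT_POLICY = {"kvint4": 4, "kvint8": 8}


def get_models(config, model_type):
    """Return the models listed under 'chat_model' or 'base_model'."""
    if model_type not in ("chat_model", "base_model"):
        raise ValueError(f"unknown model type: {model_type}")
    return list(config.get(model_type, []))


def get_kvint_cases(config, model_type):
    """Return (model, quant_policy) pairs for one model type.

    Every model gets a baseline case with policy 0, plus one case for
    each kvint setting that lists the model.
    """
    cases = []
    for model in get_models(config, model_type):
        cases.append((model, 0))
        for name, models in config.get("kvint_quantization", {}).items():
            if model in models:
                cases.append((model, QUANT_POLICY[name]))
    return cases
```

With a layout like this, the chat and base test suites can each request only their own models, and the kvint cases are derived in one place instead of being repeated per test file.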

autotest/interface/restful/test_restful_interface_func_common.py

add more test cases for the restful interface

autotest/tools/chat/*

streamline test cases so that one test case covers several previous ones
split test cases into chat models and base models

autotest/tools/pipeline/*

add kvint4/8 testcases

autotest/tools/restful/*

add kvint4/8 testcases
refactor start and stop restful server process
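One common way to keep server start and stop in a single place is a context manager around `subprocess.Popen`. This is only a sketch of that pattern, not the code in this PR: `managed_server` is a hypothetical helper, and `DUMMY_SERVER_CMD` is a stand-in for whatever command actually launches the restful server.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def managed_server(cmd, stop_timeout=10):
    """Start `cmd` as a child process and always stop it on exit."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        # Ask the process to exit, then force-kill it if it does not
        # stop within `stop_timeout` seconds.
        proc.terminate()
        try:
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


# Stand-in for the real server command (an assumption for illustration):
# a Python child process that just sleeps.
DUMMY_SERVER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

A test would then wrap its requests in `with managed_server(cmd) as proc:` so the server is stopped even when an assertion fails.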

other changes:

rename -inner-w4a16 models to -inner-4bits
change temperature=0.01 to top_k=1 to make responses more stable
remove code made obsolete by the kvint4/8 quantization refactor
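The reason top_k=1 gives more stable responses than a low temperature can be shown with a toy sampler. This is an illustrative sketch, not the sampler used by lmdeploy; `sample_token` and its parameters are hypothetical. With top_k=1 only the highest logit survives the filter, so the result no longer depends on the random draw, whereas a small temperature still leaves a nonzero chance of picking another token when two logits are close.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, rng=random):
    """Sample a token index from `logits` (toy top-k + temperature sampler)."""
    # Sort candidates by logit, highest first; top_k=0 keeps all of them.
    candidates = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)
    if top_k > 0:
        candidates = candidates[:top_k]
    # Softmax weights with temperature, shifted by the max for stability.
    max_logit = candidates[0][1]
    weights = [math.exp((logit - max_logit) / temperature)
               for _, logit in candidates]
    token_ids = [idx for idx, _ in candidates]
    return rng.choices(token_ids, weights=weights, k=1)[0]


# With top_k=1 the single remaining candidate is always the argmax,
# whatever the random seed is.
logits = [1.0, 3.0, 2.5]
picks = {sample_token(logits, top_k=1, rng=random.Random(seed))
         for seed in range(20)}
```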

zhulinJulia24 · 2024-04-18T01:22:56Z

test result

daily ete testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8720843216
Some responses from some models are still unstable. I will keep following up and improving them.

benchmark testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8724161035

evaluation testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8720159669

and
https://github.com/zhulinJulia24/lmdeploy/actions/runs/8724058314

zhulinJulia24 · 2024-04-18T03:15:02Z

pr_test now takes about 30 minutes, because the CLI chat test cases were streamlined so that one test case covers several previous ones. 4-bit quantization is still kept for now.

.github/scripts/set_benchmark_param.sh

.github/workflows/benchmark.yml

.github/workflows/daily_ete_test.yml

lvhan028 · 2024-04-18T04:03:50Z

> pr_test now takes about 30 minutes, because the CLI chat test cases were streamlined so that one test case covers several previous ones. 4-bit quantization is still kept for now.

How long did the pr_test workflow take before this PR?

zhulinJulia24 · 2024-04-18T05:07:28Z

> > pr_test now takes about 30 minutes, because the CLI chat test cases were streamlined so that one test case covers several previous ones. 4-bit quantization is still kept for now.
>
> How long did the pr_test workflow take before this PR?

Currently about 1 hour.

zhulinJulia24 · 2024-04-18T06:54:23Z

--no-deps fixed

zhulinJulia24 · 2024-04-18T07:09:15Z

The benchmark step condition change tested OK.

RunningLeon

LGTM

zhulin1 and others added 14 commits April 15, 2024 19:21

d10d056 add kv case
fdbe5ec update
b329803 update
741c668 update
0aaf819 update
c91b250 Merge branch 'InternLM:main' into add_kv_testcase
985e8ed update
c2741bd merge main
3ab8762 update
fa5ff45 update
cd99462 update
8659876 update
bf07060 evaluation update
7000dd3 update

zhulinJulia24 requested review from lvhan028 and RunningLeon April 18, 2024 01:15