
Add kvint4/8 ete testcase #1448

Merged (19 commits) on Apr 18, 2024


zhulinJulia24
Collaborator

  1. benchmark.yaml changes
  • add kvint4/8 benchmarks
  • change the CUDA allocation to be more efficient
  2. daily_ete_test.yaml changes
  • remove kvint8 quantization
  • remove the timeout setting and reorder the test steps, because the test cases are now much more stable
  3. evaluate.yaml changes
  • change the model path
  4. pr_test_case changes
  • remove -s -v to make the results simpler and clearer
  5. autotest/chat_prompt_case.yaml and autotest/prompt_case.yaml
  • rework the test cases to reduce erroneous results
  6. autotest/config.yaml and autotest/utils/config_utils.py
  • split models into chat models and base models, and add more models
  • refactor the kvint setting
  7. autotest/interface/restful/test_restful_interface_func_common.py
  • add more test cases for the restful interface
  8. autotest/tools/chat/*
  • streamline the test cases: one test case now covers multiple previous test cases
  • split the test cases into chat-model and base-model groups
  9. autotest/tools/pipeline/*
  • add kvint4/8 test cases
  10. autotest/tools/restful/*
  • add kvint4/8 test cases
  • refactor how the restful server process is started and stopped
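
For readers unfamiliar with the pattern, the "start and stop restful server process" refactor in the last item could look roughly like the sketch below. It is not the code from this PR: `managed_server` and `DUMMY_SERVER_CMD` are hypothetical names, and the dummy command is a stand-in for the real server launch command, which is not shown in this conversation.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def managed_server(cmd, stop_timeout=10.0):
    """Start `cmd` as a child process and always stop it on exit.

    Hypothetical helper for illustration; not taken from the PR.
    """
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        # Ask the process to exit; escalate to kill if it does not stop in time.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=stop_timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()


# Stand-in for the real server command, which this sketch does not know.
DUMMY_SERVER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

A test would wrap its requests in `with managed_server(cmd) as proc:` so the server is cleaned up even when an assertion fails.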

other changes:

  • rename -inner-w4a16 models to -inner-4bits
  • change temperature=0.01 to top_k=1 to make responses more stable
  • remove code left unused after the kvint4/8 quantization refactor
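
As a rough illustration of why top_k=1 is more stable than a very low temperature, the sketch below implements generic top-k and temperature sampling over a list of logits. It is a toy model of sampling, not lmdeploy's implementation; `sample_token` and the example logits are assumptions made up for this illustration.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, rng=random):
    """Sample an index from `logits` with optional top-k filtering.

    Toy sampler for illustration only; names and defaults are hypothetical.
    """
    # Sort candidates by logit, highest first (ties keep their original order).
    candidates = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)
    if top_k > 0:
        candidates = candidates[:top_k]
    # Softmax weights over the kept candidates, scaled by temperature.
    max_logit = candidates[0][1]
    weights = [math.exp((logit - max_logit) / temperature)
               for _, logit in candidates]
    indices = [index for index, _ in candidates]
    return rng.choices(indices, weights=weights, k=1)[0]


# Two nearly tied logits: a low temperature can still pick either one.
NEAR_TIE_LOGITS = [2.000, 1.999, 0.0]
```

With top_k=1 only the highest-scoring candidate survives filtering, so the result no longer depends on the random draw. With temperature=0.01 the two nearly tied candidates keep comparable weights, so repeated runs can still differ.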

@zhulinJulia24
Collaborator Author

test result

daily ete testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8720843216
Some responses from some models are still unstable. I will keep following up and improving this.

benchmark testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8724161035

evaluation testcase:

https://github.com/zhulinJulia24/lmdeploy/actions/runs/8720159669

and
https://github.com/zhulinJulia24/lmdeploy/actions/runs/8724058314

@zhulinJulia24
Collaborator Author

The pr_test run time is reduced to about 30 minutes, because the CLI chat test cases were streamlined: one test case now covers multiple previous test cases. 4-bit quantization is still kept for now.

@lvhan028
Collaborator

How long did the pr_test workflow take before this PR?

@zhulinJulia24
Collaborator Author

zhulinJulia24 commented Apr 18, 2024

Currently about 1 hour.

@zhulinJulia24
Collaborator Author

--no-deps fixed


@zhulinJulia24
Collaborator Author

zhulinJulia24 commented Apr 18, 2024

Tested the benchmark step condition change; it works.

Collaborator

@RunningLeon RunningLeon left a comment


LGTM

@lvhan028 lvhan028 merged commit e8a0c9a into InternLM:main Apr 18, 2024
4 checks passed
@zhulinJulia24 zhulinJulia24 deleted the add_kv_testcase branch April 26, 2024 05:31