
Add new chat cli with auto backend feature #1276

Merged: 10 commits into InternLM:main on Mar 26, 2024

Conversation

@RunningLeon (Collaborator) commented on Mar 12, 2024

Motivation

Support the auto backend feature in the chat CLI.

Modification

  • Combine lmdeploy chat torch and lmdeploy chat turbomind into a single lmdeploy chat command. Note that the old commands still work.
  • Support the auto backend feature in the CLI: lmdeploy chat.

BC-breaking (Optional)

No BC-breaking changes.

Use cases (Optional)

pytorch backend

lmdeploy chat internlm/internlm-chat-7b --backend pytorch

turbomind backend (default)

lmdeploy chat internlm/internlm-chat-7b
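
For illustration, here is a minimal argparse sketch of how a combined chat command with an optional --backend flag might be wired up. This is a hypothetical outline, not the actual lmdeploy CLI code:

import argparse

# Hypothetical sketch: one `chat` subcommand with an optional --backend
# flag; when the flag is omitted, auto backend picks the engine.
parser = argparse.ArgumentParser(prog='lmdeploy')
subparsers = parser.add_subparsers(dest='command')

chat = subparsers.add_parser('chat', help='chat with a model')
chat.add_argument('model_path', help='model name or local path')
chat.add_argument('--backend', choices=['pytorch', 'turbomind'],
                  default=None,
                  help='force an engine; omit to auto-detect')

args = parser.parse_args(['chat', 'internlm/internlm-chat-7b'])
print(args.model_path, args.backend)  # -> internlm/internlm-chat-7b None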

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@RunningLeon marked this pull request as ready for review on March 12, 2024 07:14
@zhyncs (Contributor) commented on Mar 20, 2024

Hi @RunningLeon After the recent auto backend feature, there have been a few mishaps. For example, I compiled TurboMind using Python 3.9, so the turbomind.so is for Python 3.9. If I try to import lmdeploy with a version other than Python 3.9, TurboMind won't work. Previously, without auto backend, it would directly throw an error. Now it doesn't show an error but falls back to PyTorch instead. In most cases, this isn't a problem. However, if a user actually wants to use TurboMind and is unaware of the automatic fallback, unexpected results may occur. Therefore, I am thinking that making auto backend an option might be more appropriate so users can explicitly force it off.

@RunningLeon (Collaborator, Author) replied

@zhyncs hi, thanks for your feedback. There is a warning for this case:

try:
    from lmdeploy.turbomind.supported_models import \
        is_supported as is_supported_turbomind
    turbomind_has = is_supported_turbomind(model_path)
except ImportError:
    logger.warning(
        'Lmdeploy with turbomind engine is not installed correctly. '
        'You may need to install lmdeploy from pypi or build from source '
        'for turbomind engine.')

@zhyncs (Contributor) commented on Mar 20, 2024

Hi @RunningLeon Yes, I did see this warning in the source code, but in actual usage the warning did not attract attention. Should we consider raising the log level to error or, as mentioned above, adding an option to force auto backend off?

@lvhan028 (Collaborator) commented
Hi, @zhyncs
Seamlessly falling back to the PyTorch engine when a model isn't supported by TurboMind is a key feature of LMDeploy.
Our goal is to offer LMDeploy as a comprehensive suite rather than just an individual engine, ensuring users a hassle-free experience when deploying models without having to worry about engine-specific issues.
We appreciate your suggestion regarding the log level and the potential for an additional optional argument. However, after careful consideration, we have decided to maintain our current approach.
Could you try to set log_level='warning' when you use lmdeploy?
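
For anyone unsure how to surface the warning, a minimal sketch, assuming lmdeploy's loggers live under the standard 'lmdeploy' name in Python's logging hierarchy:

import logging

# Make sure warnings (including the fallback warning quoted above) are
# printed to the console; assumes lmdeploy uses the standard logging
# hierarchy under the 'lmdeploy' logger name.
logging.basicConfig()
logging.getLogger('lmdeploy').setLevel(logging.WARNING)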

@lvhan028 self-requested a review on March 20, 2024 11:53
@lvhan028 (Collaborator) commented
@RunningLeon please add logs when falling back to the pytorch engine.

try:
    from lmdeploy.turbomind.supported_models import \
        is_supported as is_supported_turbomind
    turbomind_has = is_supported_turbomind(model_path)
except ImportError:
    logger.warning(
        'Lmdeploy with turbomind engine is not installed correctly. '
        'You may need to install lmdeploy from pypi or build from source '
        'for turbomind engine.')

If turbomind is installed but the model is not supported, there is no log showing that the engine falls back to the pytorch engine.
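
A hedged sketch of the kind of log being requested here; the function name and message wording are illustrative, not the merged implementation:

import logging

logger = logging.getLogger('lmdeploy')

def warn_pytorch_fallback(model_path: str, is_turbomind_installed: bool,
                          turbomind_has: bool) -> None:
    # Emit the missing log: turbomind is installed but cannot serve the
    # model, so the silent fallback to pytorch becomes visible.
    if is_turbomind_installed and not turbomind_has:
        logger.warning(
            f'{model_path} is not explicitly supported by the turbomind '
            'engine. Falling back to the pytorch engine.')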

@lvhan028 (Collaborator) commented
This case failed

CUDA_VISIBLE_DEVICES=1 lmdeploy chat /workspace/models-140/Qwen/Qwen-7B-Chat/ --backend pytorch

lmdeploy/archs.py (review comment on outdated code):
    f' Try to run with lmdeploy pytorch engine.')
try_run_msg = (f'Try to run with pytorch engine because `{model_path}`'
               f' is not explicitly supported by lmdeploy. ')
if is_turbomind_installed:
Collaborator:

I think these warnings should only appear when the user intends to use turbomind (for example, by sending a turbomind config).

@RunningLeon (Collaborator, Author) replied:

The default is to use turbomind. Auto backend works only when the backend config is None or a TurbomindConfig.
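
To make that rule concrete, a minimal sketch of the dispatch described above; the class and function names are assumptions for illustration, not the merged lmdeploy code:

class TurbomindConfig:
    pass

class PytorchEngineConfig:
    pass

def is_supported_by_turbomind(model_path: str) -> bool:
    # Placeholder for the real capability check (see the
    # is_supported_turbomind snippet quoted earlier in this thread).
    return False

def choose_backend(model_path: str, backend_config=None) -> str:
    if isinstance(backend_config, PytorchEngineConfig):
        return 'pytorch'  # explicit request: auto backend is skipped
    # backend_config is None or a TurbomindConfig: auto-detect, falling
    # back to pytorch when turbomind cannot serve the model.
    if is_supported_by_turbomind(model_path):
        return 'turbomind'
    return 'pytorch'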

lmdeploy/cli/utils.py (review comment on outdated code, resolved)
@lvhan028 merged commit 893a574 into InternLM:main on Mar 26, 2024
4 of 5 checks passed